Kevin Deldycke - parsing

How-to fix ruby’s FeedTools latin-1 parsing

Kevin Deldycke — Thu, 31 Jul 2008 00:00:00 +0200

While playing with FeedTools, a Ruby library to parse RSS (or other) feeds, I spotted some strange behavior that at first looked like a typical Unicode parsing issue. So I started by checking that the original feed was encoded in the right format, and that its charset was clearly …

How-to add proxy support to Feedalizer ruby library

Kevin Deldycke — Wed, 16 Jul 2008 00:00:00 +0200

Here is a little code snippet which monkey-patches Feedalizer to let it grab web content through an HTTP proxy:

# HTTP proxy settings
HTTP_PROXY_HOST = "123.456.78.90"
HTTP_PROXY_PORT = 8080

# Calculate proxy URL
HTTP_PROXY_URL = "http://#{HTTP_PROXY_HOST}:#{HTTP_PROXY_PORT}"

# Monkey patch feedalizer to support page grabbing through a proxy
require 'feedalizer'
class Feedalizer …
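
The excerpt above is cut off before the patch itself, so the following is not the author's monkey patch. It is only a minimal sketch of the underlying idea, using Ruby's standard `Net::HTTP`, whose constructor accepts a proxy host and port. The helper name `http_client_via_proxy` and the example URLs are hypothetical placeholders.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper (not part of Feedalizer): build a Net::HTTP client
# for target_url that will route its requests through the given HTTP proxy.
def http_client_via_proxy(target_url, proxy_url)
  target = URI.parse(target_url)
  proxy  = URI.parse(proxy_url)
  # The third and fourth arguments of Net::HTTP.new are the proxy address and port.
  http = Net::HTTP.new(target.host, target.port, proxy.host, proxy.port)
  http.use_ssl = (target.scheme == 'https')
  http
end

# Placeholder values; no connection is opened until a request is actually sent.
client = http_client_via_proxy('http://example.com/feed.xml',
                               'http://proxy.example.com:8080')
```

Calling something like `client.get('/feed.xml')` would then send the request via the proxy. How this would be wired into Feedalizer's own page-grabbing code depends on the part of the snippet that is not shown here.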

Ultimate Regular Expression for HTML tag parsing with PHP

Kevin Deldycke — Fri, 23 Mar 2007 00:00:00 +0100

Disclaimer: This is a dirty hack!

To parse HTML or XML, use a dedicated library.

Tonight I found the ultimate regex to get HTML tags out of a string. It was written a year ago by Phil Haack on his blog. His regex is quite bullet-proof: it’s …