<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Ultimate Regular Expression for HTML tag parsing with PHP</title>
	<atom:link href="http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/feed/" rel="self" type="application/rss+xml" />
	<link>http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/</link>
	<description>Free Softwares, Computers &#38; Linux</description>
	<lastBuildDate>Sun, 21 Mar 2010 05:04:07 +0100</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: bertelli.name &#187; Blog Archive &#187; HTML hates Regexp</title>
		<link>http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/comment-page-1/#comment-6553</link>
		<dc:creator>bertelli.name &#187; Blog Archive &#187; HTML hates Regexp</dc:creator>
		<pubDate>Wed, 02 Dec 2009 13:17:36 +0000</pubDate>
		<guid isPermaLink="false">http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/#comment-6553</guid>
		<description>[...] Therefore&#8230; don&#039;t do this. [...]</description>
		<content:encoded><![CDATA[<p>[...] Therefore&#8230; don&#8217;t do this. [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Romeo</title>
		<link>http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/comment-page-1/#comment-6481</link>
		<dc:creator>Romeo</dc:creator>
		<pubDate>Mon, 05 Oct 2009 06:19:39 +0000</pubDate>
		<guid isPermaLink="false">http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/#comment-6481</guid>
		<description>Please provide a regular expression to extract any tag and its attributes, along with any JavaScript present in the attributes.

Thanks</description>
		<content:encoded><![CDATA[<p>Please provide a regular expression to extract any tag and its attributes, along with any JavaScript present in the attributes.</p>
<p>Thanks</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Nayana Adassuriya</title>
		<link>http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/comment-page-1/#comment-5337</link>
		<dc:creator>Nayana Adassuriya</dc:creator>
		<pubDate>Wed, 10 Jun 2009 04:08:36 +0000</pubDate>
		<guid isPermaLink="false">http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/#comment-5337</guid>
		<description>I want to get all the &lt;code&gt;&lt;a&gt;&lt;/code&gt; tags that include an &lt;code&gt;&lt;img&gt;&lt;/code&gt; tag within them.
e.g.:
[code lang=&quot;html&quot;]
&lt;a href=&quot;www.google.com&quot;&gt;&lt;img src=&quot;google.jpg&quot;&gt;&lt;/a&gt;
[/code]
Here I want to get &lt;code&gt;www.google.com&lt;/code&gt; and &lt;code&gt;google.jpg&lt;/code&gt;.
Please help me do that with an example.

Simply put, I want to get the &quot;image URL&quot; and &quot;link URL&quot; when an image is included inside an anchor tag.

How can I do it?

Thanks,
Nayana Adassuriya</description>
		<content:encoded><![CDATA[<p>I want to get all the <code>&lt;a&gt;</code> tags that include an <code>&lt;img&gt;</code> tag within them.<br />
e.g.:</p>
<pre class="brush: xml;">
&lt;a href="www.google.com"&gt;&lt;img src="google.jpg"&gt;&lt;/a&gt;
</pre>
<p>Here I want to get <code><a  href="http://www.google.com" rel="nofollow">http://www.google.com</a></code> and <code>google.jpg</code>.<br />
Please help me do that with an example.</p>
<p>Simply put, I want to get the &#8220;image URL&#8221; and &#8220;link URL&#8221; when an image is included inside an anchor tag.</p>
<p>How can I do it?</p>
<p>Thanks,<br />
Nayana Adassuriya</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: saucy</title>
		<link>http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/comment-page-1/#comment-4765</link>
		<dc:creator>saucy</dc:creator>
		<pubDate>Thu, 26 Feb 2009 19:29:15 +0000</pubDate>
		<guid isPermaLink="false">http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/#comment-4765</guid>
		<description>Which part allows hyphens? I just need that part.</description>
		<content:encoded><![CDATA[<p>Which part allows hyphens? I just need that part.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Fred-Eric Lafaille</title>
		<link>http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/comment-page-1/#comment-4761</link>
		<dc:creator>Fred-Eric Lafaille</dc:creator>
		<pubDate>Sat, 21 Feb 2009 11:59:03 +0000</pubDate>
		<guid isPermaLink="false">http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/#comment-4761</guid>
		<description>[code lang=&quot;text&quot;]
Warning: ereg_replace() [function.ereg-replace]: REG_BADRPT
[/code]</description>
		<content:encoded><![CDATA[<pre class="brush: plain;">
Warning: ereg_replace() [function.ereg-replace]: REG_BADRPT
</pre>
]]></content:encoded>
	</item>
	<item>
		<title>By: Jonathan Worent</title>
		<link>http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/comment-page-1/#comment-4747</link>
		<dc:creator>Jonathan Worent</dc:creator>
		<pubDate>Mon, 26 Jan 2009 22:57:27 +0000</pubDate>
		<guid isPermaLink="false">http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/#comment-4747</guid>
		<description>You should try &lt;a href=&quot;http://us3.php.net/manual/en/book.tidy.php&quot; rel=&quot;nofollow&quot;&gt;Tidy&lt;/a&gt;. It&#039;s very forgiving of tag soup, and it will allow you to parse the tag soup as an actual DOM (of sorts).</description>
		<content:encoded><![CDATA[<p>You should try <a  href="http://us3.php.net/manual/en/book.tidy.php" rel="nofollow">Tidy</a>. It&#8217;s very forgiving of tag soup, and it will allow you to parse the tag soup as an actual DOM (of sorts).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: kev</title>
		<link>http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/comment-page-1/#comment-4740</link>
		<dc:creator>kev</dc:creator>
		<pubDate>Tue, 13 Jan 2009 17:40:31 +0000</pubDate>
		<guid isPermaLink="false">http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/#comment-4740</guid>
		<description>&lt;blockquote&gt;advising people to parse it with regular expressions is.. not smart.&lt;/blockquote&gt;

I agree.

To clarify: this code is far from good practice. It&#039;s just a hack intended to get rid of HTML tag soup.

Now, a little bit of context: PHP is not my language of choice, and at the time I wrote this article I didn&#039;t find any PHP library that was tolerant of tag soup. Hence the hack.

For Python, the language I use every day, I recommend using &lt;a href=&quot;http://codespeak.net/lxml/&quot; rel=&quot;nofollow&quot;&gt;lxml&lt;/a&gt;, especially its &lt;code&gt;lxml.html&lt;/code&gt; module. And on that subject, don&#039;t miss &lt;a href=&quot;http://blog.ianbicking.org/2008/12/10/lxml-an-underappreciated-web-scraping-library/&quot; rel=&quot;nofollow&quot;&gt;Ian Bicking&#039;s post: &quot;lxml: an underappreciated web scraping library&quot;&lt;/a&gt;.</description>
		<content:encoded><![CDATA[<blockquote><p>advising people to parse it with regular expressions is.. not smart.</p></blockquote>
<p>I agree.</p>
<p>To clarify: this code is far from good practice. It&#8217;s just a hack intended to get rid of HTML tag soup.</p>
<p>Now, a little bit of context: PHP is not my language of choice, and at the time I wrote this article I didn&#8217;t find any PHP library that was tolerant of tag soup. Hence the hack.</p>
<p>For Python, the language I use every day, I recommend using <a  href="http://codespeak.net/lxml/" rel="nofollow">lxml</a>, especially its <code>lxml.html</code> module. And on that subject, don&#8217;t miss <a  href="http://blog.ianbicking.org/2008/12/10/lxml-an-underappreciated-web-scraping-library/" rel="nofollow">Ian Bicking&#8217;s post: &#8220;lxml: an underappreciated web scraping library&#8221;</a>.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Johnny B</title>
		<link>http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/comment-page-1/#comment-4739</link>
		<dc:creator>Johnny B</dc:creator>
		<pubDate>Sun, 11 Jan 2009 02:42:48 +0000</pubDate>
		<guid isPermaLink="false">http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/#comment-4739</guid>
		<description>HTML isn&#039;t a regular language, so advising people to parse it with regular expressions is.. not smart.

See http://htmlparsing.icenine.ca for more information.</description>
		<content:encoded><![CDATA[<p>HTML isn&#8217;t a regular language, so advising people to parse it with regular expressions is.. not smart.</p>
<p>See <a  href="http://htmlparsing.icenine.ca" rel="nofollow">http://htmlparsing.icenine.ca</a> for more information.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: kev</title>
		<link>http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/comment-page-1/#comment-4732</link>
		<dc:creator>kev</dc:creator>
		<pubDate>Thu, 01 Jan 2009 19:15:41 +0000</pubDate>
		<guid isPermaLink="false">http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/#comment-4732</guid>
		<description>@Hamza: take a look &lt;a href=&quot;http://kevin.deldycke.com/2008/07/python-ultimate-regular-expression-to-catch-html-tags/&quot; rel=&quot;nofollow&quot;&gt;here&lt;/a&gt;.</description>
		<content:encoded><![CDATA[<p>@Hamza: take a look <a  href="http://kevin.deldycke.com/2008/07/python-ultimate-regular-expression-to-catch-html-tags/" rel="nofollow">here</a>.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Hamza</title>
		<link>http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/comment-page-1/#comment-4729</link>
		<dc:creator>Hamza</dc:creator>
		<pubDate>Wed, 31 Dec 2008 15:40:02 +0000</pubDate>
		<guid isPermaLink="false">http://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/#comment-4729</guid>
		<description>Hi there, 

Can anyone write a good regular expression like this one for Python, to remove all HTML tags? I really need one.

Thanks in advance</description>
		<content:encoded><![CDATA[<p>Hi there, </p>
<p>Can anyone write a good regular expression like this one for Python, to remove all HTML tags? I really need one.</p>
<p>Thanks in advance</p>
]]></content:encoded>
	</item>
</channel>
</rss>
