Kevin Deldycke - BeautifoulSoup
https://kevin.deldycke.com/
Professional Yak Shaver
Tue, 08 Jul 2008 00:24:26 +0200

Python Ultimate Regular Expression to Catch HTML Tags
https://kevin.deldycke.com/2008/07/python-ultimate-regular-expression-to-catch-html-tags/

<p><strong>Disclaimer</strong>: this is a dirty hack! To parse <span class="caps">HTML</span> or <span class="caps">XML</span>, use a dedicated library like the good old <a href="https://pypi.python.org/pypi/beautifulsoup4"><code>BeautifulSoup</code></a> or <a href="https://lxml.de/lxmlhtml.html"><code>lxml.html</code></a>.</p>
<p>One year and three months ago I came up with a <a href="https://kevin.deldycke.com/2007/03/ultimate-regular-expression-for-html-tag-parsing-with-php/"><span class="caps">PHP</span> regexp to parse <span class="caps">HTML</span> tag soup</a>.</p>
<p>Here is an improved version, in Python (my …</p>

Kevin Deldycke
HTML, programming, Python, Regular expression, BeautifoulSoup, lxml