I’ve just finished migrating all the comments of this blog from WordPress to Disqus. Why use an external comment platform? Because I plan to ditch WordPress entirely and switch to a static site generator in the near future. Here are some details on my migration to Disqus.

Disqus has everything you need to easily import WordPress comments. But first, I had to massage some data.

Articles on this blog feature a lot of code. Comments are no exception and embed snippets too. Code blocks are rendered by the SyntaxHighlighter Evolved WordPress plugin. This extension uses square brackets to enclose code, whereas Disqus uses standard HTML tags.

Let’s update this notation directly in the WordPress database:

$ mysqldump --opt kevblog wp_comments > ./comments.sql
$ perl -pe 's/\[code lang=(.*?)\]/<pre><code class=\1>/g' ./comments.sql > comments-fixed.sql
$ sed -i 's/\[\/code\]/<\/code><\/pre>/g' ./comments-fixed.sql
$ mysql kevblog < ./comments-fixed.sql
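The same substitution can be tried on a single comment body before touching the database. The following is a minimal Python sketch of the two replacements above; it assumes only the `[code lang=...]` form shown in the commands and ignores any other shortcode variant the plugin may accept. The function name and sample text are made up for illustration.

```python
import re

# Patterns mirror the perl/sed expressions above; other shortcode
# variants of the plugin are not handled (assumption).
OPEN_TAG = re.compile(r'\[code lang=(.*?)\]')
CLOSE_TAG = re.compile(r'\[/code\]')


def shortcodes_to_html(text):
    """Replace [code lang=X] ... [/code] with <pre><code class=X> ... </code></pre>."""
    text = OPEN_TAG.sub(r'<pre><code class=\1>', text)
    return CLOSE_TAG.sub('</code></pre>', text)


# Hypothetical comment body used only to exercise the function.
sample = 'Try this: [code lang=python]print("hi")[/code]'
print(shortcodes_to_html(sample))
```

Running it on a handful of real comment bodies first is a cheap way to spot shortcodes the patterns miss.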

Disqus doesn’t support HTML lists. So I manually updated WordPress comments to remove occurrences of <ul> and <ol>, and replaced <li> with a UTF-8 bullet.
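On a blog with many more comments, the manual edit could be scripted. Here is a rough Python sketch of that idea, not what was actually run: the function name is hypothetical, the choice of "•" as the bullet is an assumption, and nested lists are flattened without indentation.

```python
import re

BULLET = '\u2022'  # "•"; the exact bullet character is an assumption


def flatten_lists(html):
    """Drop <ul>/<ol> wrappers and turn each <li> into a bullet-prefixed line."""
    # Remove opening and closing list containers.
    html = re.sub(r'</?(?:ul|ol)\b[^>]*>', '', html, flags=re.IGNORECASE)
    # Each list item starts with a bullet and ends with a newline.
    html = re.sub(r'<li\b[^>]*>', BULLET + ' ', html, flags=re.IGNORECASE)
    html = re.sub(r'</li>', '\n', html, flags=re.IGNORECASE)
    return html.strip()


print(flatten_lists('<ul><li>first</li><li>second</li></ul>'))
```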

Another issue: although Disqus supports images in comments, images in imported comments are left as HTML tags and therefore not rendered. I was the only one posting images in comments on my blog, so I simply moved them to the corresponding parent article.

I then had to fix the comment threading. Early versions of WordPress did not support threaded replies. I addressed this by rebuilding the threading with a series of MySQL queries:

UPDATE `wp_comments` SET `comment_parent` = 234   WHERE `comment_ID` = 342;
UPDATE `wp_comments` SET `comment_parent` = 4987  WHERE `comment_ID` = 5667;
UPDATE `wp_comments` SET `comment_parent` = 10915 WHERE `comment_ID` = 10916;
(...)
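When there are many such pairs, the statements can be generated from a child-to-parent mapping instead of being typed one by one. This is only a sketch: the helper name is hypothetical, and working out which comment replies to which still has to be done by reading the threads.

```python
def threading_updates(parents, table='wp_comments'):
    """Yield one UPDATE statement per (child ID -> parent ID) pair."""
    for child_id, parent_id in sorted(parents.items()):
        # int() ensures only numeric IDs end up in the generated SQL.
        yield (
            f'UPDATE `{table}` SET `comment_parent` = {int(parent_id)} '
            f'WHERE `comment_ID` = {int(child_id)};'
        )


# Child -> parent pairs taken from the queries above.
parents = {342: 234, 5667: 4987, 10916: 10915}
for statement in threading_updates(parents):
    print(statement)
```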

After all these updates, my comments were ready to be exported to Disqus.

Although most of my comments were imported successfully, some were left out. The importer was not able to find their parents:

Disqus comment import: unable to find parent post

But the reported error was misleading: the parent IDs were valid and each referenced an existing comment. Besides, the comments Disqus was not able to import were correctly placed in their threads on my original WordPress blog.

For a moment I thought Disqus’ import code was subject to a race condition. But there were no <wp:comment_parent /> tags in my XML file.

I also checked that no comments were awaiting moderation:

$ grep -c "<wp:comment_approved>0</wp:comment_approved>" ./kevindeldycke.wordpress.2013-01-15-fixed.xml
0
$ grep -c "<wp:comment_approved>1</wp:comment_approved>" ./kevindeldycke.wordpress.2013-01-15-fixed.xml
892
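The two grep calls only count the values 0 and 1. A tally of every value found in the tag gives the same check in one pass and would also surface any other status present in the export. The sketch below assumes the values appear as plain text inside the tag, as in the grep patterns above; the function name and sample data are illustrative.

```python
import re
from collections import Counter

APPROVED = re.compile(r'<wp:comment_approved>(.*?)</wp:comment_approved>')


def approval_counts(xml_text):
    """Count comments per <wp:comment_approved> value."""
    return Counter(APPROVED.findall(xml_text))


# Hypothetical fragment standing in for the real export file.
sample = (
    '<wp:comment_approved>1</wp:comment_approved>\n'
    '<wp:comment_approved>1</wp:comment_approved>\n'
    '<wp:comment_approved>0</wp:comment_approved>\n'
)
print(approval_counts(sample))
```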

I decided to check the comments carefully. Following each comment’s chain of parents, I found the root cause: all the unimported comments were descendants of a comment left by an anonymous commenter. These anonymous comments shared the following empty properties:

<wp:comment_author><![CDATA[]]></wp:comment_author>
<wp:comment_author_email></wp:comment_author_email>
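Walking the parent chains by hand gets tedious, so here is a small Python sketch of the same check. It assumes the comments have already been loaded into a dictionary mapping each comment ID to a `(parent ID, author name)` pair, with parent ID 0 meaning a top-level comment. That structure, the function name, and all sample IDs except 1883 are hypothetical.

```python
def blocked_by_anonymous(comments):
    """Return IDs of comments that have an empty-author ancestor.

    `comments` maps comment ID -> (parent ID, author name); parent ID 0
    marks a top-level comment. This shape is an assumption of the sketch.
    """
    blocked = set()
    for comment_id in comments:
        seen = set()
        parent_id = comments[comment_id][0]
        # Walk up until the root, an unknown parent, or a cycle.
        while parent_id in comments and parent_id not in seen:
            seen.add(parent_id)
            grandparent_id, author = comments[parent_id]
            if not author:
                blocked.add(comment_id)
                break
            parent_id = grandparent_id
    return blocked


# Hypothetical thread: 1883 is anonymous, 1884 replies to it,
# 1885 replies to 1884, and 1900 is an unrelated top-level comment.
comments = {
    1883: (0, ''),
    1884: (1883, 'alice'),
    1885: (1884, 'bob'),
    1900: (0, 'carol'),
}
print(sorted(blocked_by_anonymous(comments)))
```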

I then decided to force anonymous comments to bear a generic author name:

$ perl -0p -e 's/(<wp:comment_author><!\[CDATA\[)(\]\]><\/wp:comment_author>\s*<wp:comment_author_email><\/wp:comment_author_email>)/\1Anonymous\2/sg' ./kevindeldycke.wordpress.2013-01-15-fixed.xml > test.xml

This resulted in the following changes:

--- ./kevindeldycke.wordpress.2013-01-15-disqus-import-fixed.xml        2013-01-15 11:24:06.929837283 +0100
+++ ./test.xml  2013-01-27 16:19:00.062626017 +0100
@@ -6595,7 +6595,7 @@
    </wp:postmeta>
    <wp:comment>
      <wp:comment_id>1883</wp:comment_id>
-     <wp:comment_author><![CDATA[]]></wp:comment_author>
+     <wp:comment_author><![CDATA[Anonymous]]></wp:comment_author>
      <wp:comment_author_email></wp:comment_author_email>
      <wp:comment_author_url></wp:comment_author_url>
      <wp:comment_author_IP>123.45.67.89</wp:comment_author_IP>
@@ -8376,7 +8376,7 @@
    </wp:comment>
    <wp:comment>
      <wp:comment_id>2382</wp:comment_id>
-     <wp:comment_author><![CDATA[]]></wp:comment_author>
+     <wp:comment_author><![CDATA[Anonymous]]></wp:comment_author>
      <wp:comment_author_email></wp:comment_author_email>
      <wp:comment_author_url></wp:comment_author_url>
      <wp:comment_author_IP>123.45.67.89</wp:comment_author_IP>
(...)
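For readers more at ease with Python than Perl, the substitution can be approximated as below. It is a sketch of the same regular expression, not a tested replacement for the command above: it assumes the export uses exactly the empty-author and empty-email pattern shown in the diff, and the function name is made up.

```python
import re

# Same two capture groups as the Perl expression: the opening of an empty
# CDATA author, then its closing followed by an empty email tag.
EMPTY_AUTHOR = re.compile(
    r'(<wp:comment_author><!\[CDATA\[)'
    r'(\]\]></wp:comment_author>\s*'
    r'<wp:comment_author_email></wp:comment_author_email>)'
)


def name_anonymous_authors(xml_text, name='Anonymous'):
    """Fill in empty author names that are followed by an empty email."""
    # A function replacement avoids any escaping issue with `name`.
    return EMPTY_AUTHOR.sub(lambda m: m.group(1) + name + m.group(2), xml_text)


# Fragment shaped like the export lines in the diff above.
sample = (
    '<wp:comment_author><![CDATA[]]></wp:comment_author>\n'
    '      <wp:comment_author_email></wp:comment_author_email>\n'
)
print(name_anonymous_authors(sample))
```

Comments that already carry an author name are left untouched, since the pattern only matches an empty CDATA section.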

I then sent the fixed WordPress XML export to Disqus as-is, which imported my 24 missing comments:

Disqus comment import: missing comments imported