Tag Archive for 'patch'

Got “unsized object” errors with Debian’s Mailman? Try this patch!

Last week I came across a showstopper bug in Mailman 2.1.9-7, the current version of the Mailman package distributed with Debian Etch.

Here is the Python traceback (from the /var/log/mailman/error log file) I get each time I send a mail to my brand new mailing list:

Dec 20 01:20:04 2008 (14275) Uncaught runner exception: len() of unsized object
Dec 20 01:20:04 2008 (14275) Traceback (most recent call last):
  File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 112, in _oneloop
    self._onefile(msg, msgdata)
  File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 170, in _onefile
    keepqueued = self._dispose(mlist, msg, msgdata)
  File "/usr/lib/mailman/Mailman/Queue/IncomingRunner.py", line 130, in _dispose
    more = self._dopipeline(mlist, msg, msgdata, pipeline)
  File "/usr/lib/mailman/Mailman/Queue/IncomingRunner.py", line 153, in _dopipeline
    sys.modules[modname].process(mlist, msg, msgdata)
  File "/usr/lib/mailman/Mailman/Handlers/ToDigest.py", line 81, in process
    mbox.AppendMessage(msg)
  File "/usr/lib/mailman/Mailman/Mailbox.py", line 69, in AppendMessage
    g.flatten(msg, unixfrom=True)
  File "/usr/lib/mailman/pythonlib/email/Generator.py", line 101, in flatten
    self._write(msg)
  File "/usr/lib/mailman/pythonlib/email/Generator.py", line 136, in _write
    self._write_headers(msg)
  File "/usr/lib/mailman/pythonlib/email/Generator.py", line 182, in _write_headers
    header_name=h, continuation_ws='\t').encode()
  File "/usr/lib/mailman/pythonlib/email/Header.py", line 412, in encode
    newchunks += self._split(s, charset, targetlen, splitchars)
  File "/usr/lib/mailman/pythonlib/email/Header.py", line 297, in _split
    elen = charset.encoded_header_len(encoded)
  File "/usr/lib/mailman/pythonlib/email/Charset.py", line 354, in encoded_header_len
    raise repr(s)
TypeError: len() of unsized object

Dec 20 01:20:04 2008 (14275) SHUNTING: 1229732404.1069181+dcd89a08bf7911dac2db804b76cd42d20564c71c

Here is the corresponding (anonymized) mail sent to the mailing list from a Gmail account:

Received: by 10.180.244.13 with HTTP; Fri, 19 Dec 2008 16:32:22 -0800 (PST)
Message-ID: <1f7b086f0812192632x7427c0f7u2048609ddd50673@mail.gmail.com>
Date: Sat, 20 Dec 2008 01:32:22 +0100
From: "Kevin" <kevin@my-domain.com>
To: my-ml@lists.my-domain.com
Subject: sqdfqsdfqsfd
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: base64
Content-Disposition: inline
Delivered-To: kevin@my-domain.com

LS0KS2V2LgogIOKAoiBiYW5kOiBodHRwOi8vY29vbGNhdmVtZW4uY29tCiAg4oCiIGJsb2c6IGh0
dHA6Ly9rZXZpbi5kZWxkeWNrZS5jb20K

And now my hackish tale. Based on a quick look at Mailman’s source code, I made an educated guess that this error is just a side effect of the wrong assumption that the s variable in the Charset.encoded_header_len() method is always a string. So I came up with the following evil patch to handle (gracefully, I hope) the case of s being None.

Here is the resulting patch of my python-fu:

--- /usr/lib/mailman/pythonlib/email/Charset.py.orig   2008-12-28 19:46:23.000000000 +0100
+++ /usr/lib/mailman/pythonlib/email/Charset.py        2008-12-20 01:42:37.000000000 +0100
@@ -351,6 +351,7 @@
             lenqp = email.quopriMIME.header_quopri_len(s)
             return min(lenb64, lenqp) + len(cset) + MISC_LEN
         else:
+            return s is not None and len(str(s)) or 0
             return len(s)

     def header_encode(self, s, convert=False):

And it does the trick! Of course, I can’t guarantee that this patch is the right way to fix the bug for good, and it may corrupt data. So use it only if you’re as crazy as me! :D
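
For readers who want to see the guard in isolation, here is a minimal Python sketch of what the patched line does. The function names `header_len` and `header_len_idiom` are hypothetical and are not part of Mailman or the email package; the sketch also assumes, like the educated guess above, that `None` is the value that reached `encoded_header_len()`.

```python
def header_len(s):
    """Return the length of a header chunk, treating None as an empty chunk."""
    if s is None:
        return 0
    return len(str(s))


def header_len_idiom(s):
    # Same expression as the line added by the patch (old-style and/or idiom).
    return s is not None and len(str(s)) or 0


# Without the guard, len() on None raises a TypeError, which is the kind of
# failure reported at the bottom of the traceback.
try:
    len(None)
    unguarded_failed = False
except TypeError:
    unguarded_failed = True
```

Both variants return 0 for `None` and for an empty string, and the real length otherwise. The explicit `if` is simply easier to read than the and/or idiom.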

But I know, I know… As a responsible and serious hacker (sigh), I should report this bug to the Debian or Mailman project. But I’m still not familiar with Debian’s way of reporting bugs (and to be honest, I feel lazy these days :p). Maybe, one day…

How to add Google Analytics tracking to Zenphoto

This is the patch I apply to every Zenphoto instance I install or upgrade. This little hack adds Google Analytics tracking for all users except administrators.

Why? As you can see in ticket #441 of the Zenphoto bug tracker, there is no intention of adding GA support to Zenphoto, even as an optional plugin. Hence my tiny hack. As for excluding administrators, I like having unbiased statistics: on low-audience websites, administrators can generate more traffic than regular visitors (if not all of it…).

Here is the downloadable patch file, and its content:

diff -ru ./zenphoto-orig/zp-core/template-functions.php ./zenphoto/zp-core/template-functions.php
--- ./zenphoto-orig/zp-core/template-functions.php  2008-08-15 07:43:05.000000000 +0200
+++ ./zenphoto/zp-core/template-functions.php 2008-08-16 17:08:03.000000000 +0200
@@ -147,7 +147,16 @@

    echo "<li><a href=\"".$zf."/admin.php?logout$redirect\">".gettext("Logout")."</a></li>\n";
    echo "</ul></div>\n";
- }
+ } else {
+    echo "<script type=\"text/javascript\">
+var gaJsHost = ((\"https:\" == document.location.protocol) ? \"https://ssl.\" : \"http://www.\");
+document.write(unescape(\"%3Cscript src='\" + gaJsHost + \"google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E\"));
+</script>
+<script type=\"text/javascript\">
+var pageTracker = _gat._getTracker(\"UA-XXXXXX-Y\");
+pageTracker._trackPageview();
+</script>";
+  }
 }

 /**

This patch was generated against Zenphoto v1.2 and will likely not apply cleanly to any other version.

Do not forget to replace the dummy Google Analytics account ID above (UA-XXXXXX-Y) with your own.

And finally, to apply the patch, invoke the classic patch command:

patch -p0 < ./google-analytics-tracking-for-non-admin-users.patch

How to fix Ruby’s FeedTools Latin-1 parsing

While playing with FeedTools, a Ruby library for parsing RSS (and other) feeds, I spotted a strange behavior that at first looked like a typical Unicode parsing issue. So I checked that the original feed was encoded in the right format and that its charset was clearly set to the right value. But I found nothing wrong… So I dug into the FeedTools source code, and what I found was particularly disappointing…

FeedTools does a really nice job of detecting the charset and handling the feed’s data. When it encounters HTML entities, it decodes them to plain text, which is good because you end up with ready-to-use strings. Unfortunately, the method it uses, CGI::unescapeHTML, sticks too closely to the W3C specification, which states that some of the HTML entities (if not all) are the expression of Latin-1 characters. Hence the presence of Latin-1 characters in pure UTF-8 RSS feeds…
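
To make the mismatch concrete, here is a small Python sketch (not FeedTools code). It assumes, as described above, that the entity `&#233;` ends up as a single Latin-1 byte inside text that is otherwise UTF-8. The helper `is_valid_utf8` is a hypothetical name used only for this illustration.

```python
codepoint = 233  # the character behind &#233;, i.e. "é"

as_latin1 = bytes([codepoint])           # one byte: 0xE9
as_utf8 = chr(codepoint).encode("utf-8")  # two bytes: 0xC3 0xA9

# The rest of the feed text is already UTF-8.
prefix = "Téléchargements et Multim".encode("utf-8")

broken = prefix + as_latin1 + b"dia"  # Latin-1 byte dropped into UTF-8 data
fixed = prefix + as_utf8 + b"dia"     # entity converted to UTF-8 instead


def is_valid_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

Here `broken` is no longer valid UTF-8, while `fixed` decodes to the expected string. This is the kind of corruption that shows up when both encodings are mixed in one document.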

To fix that, I rewrote the FeedTools::HtmlHelper.unescape_entities() method so that it converts each HTML entity it encounters to pure Unicode. Here is the monkey patch I load by default from the environment.rb file of all my Ruby on Rails projects:

require 'feed_tools'

# Monkey patch for FeedTools.
# Use case: mixed UTF-8 characters and HTML entities, e.g. <description>Téléchargements et Multim&#233;dia</description>
module FeedTools::HtmlHelper
  class << self

    # Force UTF-8 conversion of numeric HTML entities with a value lower than 256.
    # Based on the CGI::unescapeHTML method.
    def convert_html_entities_to_unicode(string)
      string.gsub(/&(.*?);/n) do
        $KCODE = "UTF8"
        match = $1.dup
        case match
        when /\A#0*(\d+)\z/n       then
          if Integer($1) < 256
            [Integer($1)].pack("U")
          else
            "&##{$1};"
          end
        when /\A#x([0-9a-f]+)\z/ni then
          if $1.hex < 256
            [$1.hex].pack("U")
          else
            "&#x#{$1};"
          end
        else
          "&#{match};"
        end
      end
    end

    # Patch unescape_entities() method
    alias_method :unescape_entities_orig, :unescape_entities
    def unescape_entities(html)
      return unescape_entities_orig(convert_html_entities_to_unicode(html))
    end

  end
end

OK, so this fixes the issue.
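
For experimenting outside Ruby, the conversion step of the monkey patch can be sketched in Python as follows. This is an illustrative port, not part of FeedTools: the name `convert_low_entities` is hypothetical, and unlike the Ruby version it returns entities it does not convert exactly as they were written, instead of rebuilding them.

```python
import re

# Anything shaped like an entity: "&", a non-greedy body, then ";".
ENTITY_RE = re.compile(r"&(.*?);")
DECIMAL_RE = re.compile(r"#0*([0-9]+)")
HEX_RE = re.compile(r"#x([0-9a-f]+)", re.IGNORECASE)


def convert_low_entities(text):
    """Replace numeric entities below 256 with the matching Unicode character."""

    def replace(match):
        body = match.group(1)
        decimal = DECIMAL_RE.fullmatch(body)
        if decimal and int(decimal.group(1)) < 256:
            return chr(int(decimal.group(1)))
        hexadecimal = HEX_RE.fullmatch(body)
        if hexadecimal and int(hexadecimal.group(1), 16) < 256:
            return chr(int(hexadecimal.group(1), 16))
        # Named entities and code points >= 256 are left for the regular
        # unescaping step to handle.
        return match.group(0)

    return ENTITY_RE.sub(replace, text)
```

With the use case from the patch’s comment, `convert_low_entities("Téléchargements et Multim&#233;dia")` yields a string in which the entity has become a real “é”, while `&amp;` and `&#8364;` pass through unchanged.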

But I’m not comfortable leaving this problem without a clean solution. I still don’t know which component should fix it for good, but I have some ideas… Here are my proposals:

  1. Submit my monkey patch to the FeedTools project for integration, or
  2. Merge my monkey patch upstream into Ruby’s legacy CGI library, or
  3. Do not allow the use of HTML entities in feeds.

My First WordPress Patch!

Last week I submitted a patch to the WordPress open source project. This tiny patch fixes a little bug in the default Kubrick theme, which didn’t display the list of comments associated with a page. I spotted that bug some months ago while working on my e107 to WordPress import script. As you can see in the trac ticket, my patch was committed to the trunk of the project, and you can expect to see it in the next release, version 2.2.