WebPing open-sourced!

I’ve just released WebPing under a GPL license. It’s available right now on a GitHub repository.

WebPing is a script I started to work on in 2009 while working at EDF. Back then, I needed a monitoring tool to keep an eye on the 80+ Plone instances that my team managed. For several corporate reasons, I wasn’t allowed to use a proper monitoring tool like Munin or Nagios. So I created a small script to fill this need. That’s how WebPing came to be.

WebPing is just a stupid Python script designed to be run regularly by a cron job. It tries to fetch a list of URLs and stores the response times in an SQLite database. Then it creates a static HTML report that you’re free to serve with any HTTP server (an example Apache configuration is provided). WebPing’s configuration and the list of URLs it monitors are stored in a YAML file.
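
To make the fetch-and-store idea concrete, here is a minimal sketch written for Python 3. It is not WebPing’s actual code: the checks table, its columns and the function names (init_db, fetch, record_check) are all hypothetical and only illustrate how a response time could be measured and stored in SQLite.

```python
import sqlite3
import time
import urllib.request


def init_db(path=":memory:"):
    """Open the database and create the (hypothetical) checks table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS checks ("
        "url TEXT, checked_at REAL, status INTEGER, response_time REAL)"
    )
    return conn


def fetch(url, timeout=10):
    """Fetch a URL and return its HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def record_check(conn, url, fetcher=fetch):
    """Time one request and store the outcome as a new row."""
    checked_at = time.time()
    started = time.perf_counter()
    try:
        status = fetcher(url)
    except Exception:
        # Any failure (DNS, timeout, HTTP error) is stored as a NULL status.
        status = None
    elapsed = time.perf_counter() - started
    conn.execute(
        "INSERT INTO checks (url, checked_at, status, response_time) "
        "VALUES (?, ?, ?, ?)",
        (url, checked_at, status, elapsed),
    )
    conn.commit()
    return status, elapsed
```

A cron-driven script would then call record_check() once per monitored URL. The fetcher argument is only there so the function can be exercised without touching the network.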

The generated HTML report uses the Flot jQuery plugin to render graphs. Here is what the dashboard looks like:

Finally, WebPing is able to send reports and alerts by email. Here is what a mail alert looks like:

Since I created WebPing, I have found several other projects built more or less around the same idea. See Kong, which is based on Django and Twill, a web-oriented DSL. Another project I spotted after the fact is multi-mechanize. Like Kong, it’s written in Python. I have not tried either of them, though.

Feed Tracking Tool released under an Open-Source license

I’ve just open-sourced the Feed Tracking Tool project (aka “FTT”), my first (and only) Ruby on Rails experience.

This tool was developed within Uperto, the company I currently work for, for its internal needs. The project had an ancestor written in 2006 that was based on Pylons. It was a prototype and was barely working. Iterating over the abandoned Python code base was considered a waste of time. So in summer 2007, it was decided to rewrite this application from scratch.

As my co-worker was available and had already played with Ruby on Rails, he was tasked with creating the initial code base. I joined the project early on, as it was a great opportunity to play with the (then really trendy) Ruby on Rails framework.

In the end, FTT was essentially a test project to explore Ruby on Rails. It was never deployed on a production server and was never used.

After rotting for more than 3 years, the project represents absolutely no business value in itself, so I decided to release it under a GPLv2 license (with Uperto’s approval, of course). My intention with this open-source release is to share knowledge and code back with the community.

FTT was living in a private Subversion repository at Uperto, but we unfortunately lost it. During the last few weeks I tried to rebuild the code history from old and partial backups. I then used my Git-based reconstruction method to consolidate everything in a Git repository. The code is now available on GitHub.

I don’t plan to maintain this project, but I may reboot it in the future if I need feed-related features, or an excuse to play with Ruby on Rails again. For now, beware: the code is quite outdated and only runs on the old Rails 1.2.x series. This project should be considered an ugly legacy code base. So please be indulgent while looking at FTT’s code: it was the work of inexperienced RoR developers! ;)

Python commands

  • Add a Python debugger breakpoint:
    import pdb; pdb.set_trace()
    
  • Replace accented characters with their ASCII equivalent in a unicode string:
    import unicodedata
    unicodedata.normalize('NFKD', u"éèàçÇÉȲ³¼ÀÁÂÃÄÅËÍÑÒÖÜÝåïš™").encode('ascii', 'ignore')
    
  • Lambda function to transform a string to a URL-friendly ID:
    getSafeURL = lambda s: '-'.join([w for w in ''.join([c.isalnum() and c or '-' for c in s]).split('-') if w])
    
  • Sort a list of dicts by dict-key (source):
    import operator
    sorted([dict(a=1, b=2, c=3), dict(a=2, b=2, c=2), dict(a=3, b=2, c=1)], key=operator.itemgetter('c'))
    
  • Set urllib2 timeout (source):
    import socket
    socket.setdefaulttimeout(10)
    
  • Start a dumb HTTP server on port 8000 (source):
    python -m SimpleHTTPServer 8000
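
The accent-stripping and URL-friendly ID snippets above target Python 2. As a hedged sketch of how the two could be combined on Python 3, the block below uses two hypothetical helper names (strip_accents and safe_url). The lower-casing step is an assumption of this sketch and is not part of the original one-liner.

```python
import unicodedata


def strip_accents(text):
    """Drop accents by decomposing characters and discarding non-ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def safe_url(text):
    """Turn a string into a lowercase, dash-separated, URL-friendly ID."""
    ascii_text = strip_accents(text).lower()
    # Replace every non-alphanumeric character with a dash...
    dashed = "".join(c if c.isalnum() else "-" for c in ascii_text)
    # ...then collapse runs of dashes and trim them from both ends.
    return "-".join(word for word in dashed.split("-") if word)
```

For instance, safe_url("Éléphant à Paris !") yields "elephant-a-paris".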
    

Date manipulation

  • Add a month to the current date:
    import datetime
    from dateutil.relativedelta import relativedelta
    datetime.date.today() + relativedelta(months=1)
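
If the third-party dateutil package is not available, the same month arithmetic can be sketched with the standard library only. The add_months helper below is a hypothetical name. It clamps the day to the last day of the target month, which is my understanding of how relativedelta behaves too.

```python
import calendar
import datetime


def add_months(day, months):
    """Return `day` shifted by `months`, clamped to the month's last day."""
    # Work with a zero-based month index so the year rolls over cleanly.
    month_index = day.month - 1 + months
    year = day.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return day.replace(year=year, month=month, day=min(day.day, last_day))
```

Adding a month to the current date then reads add_months(datetime.date.today(), 1).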
    

Package management

  • Generate a source distribution of the current package:
    python ./setup.py sdist
    
  • Register, generate and upload to PyPI the current package as a source package, an egg and a dumb binary:
    python ./setup.py register sdist bdist_egg bdist_dumb upload
    
  • Here is what my ~/.pypirc looks like:
    [pypirc]
    servers = pypi
    [server-login]
    username:kdeldycke
    password:XXXXXXX
    

Automate Trac instance deployment with Buildout

Recently, I started to contribute to pbp.recipe.trac, a Buildout recipe aimed at simplifying the management and configuration of Trac instances.

I took an interest in this piece of code the day I realized the Trac instance we used at work was still running on the old 0.10.x series. Even though we spend the majority of our time there, nobody had taken care of our little Trac: it had not been updated for 3 years. Add to this a sudden need for multi-repository support (as our team is adopting other internal projects), and you have enough incentives to upgrade our Trac and automate its maintenance.

So here is how I migrated our legacy Trac 0.10 instance to a brand new 0.12 thanks to Buildout and pbp.recipe.trac.

First, let’s install all system dependencies using your distribution’s package management tool. My target server is running RHEL 5.4, so I’ll invoke Yum:

$ sudo yum install subversion subversion-python sqlite-devel cyrus-sasl-lib cyrus-sasl-md5 mercurial

On Debian/Ubuntu, equivalent packages should be installed with apt-get:

$ sudo apt-get install subversion python-subversion libsqlite-dev cyrus-sasl-lib cyrus-sasl-md5 mercurial

Now we create an empty structure that will host our Trac instance:

$ mkdir ~/trac-home
$ cd ~/trac-home
$ touch ./buildout.cfg

It’s time to edit the file at the core of the process: buildout.cfg. Here is my version:

[buildout]
extensions = buildout.bootstrap
parts = my-trac
deploy-server = trac.example.net

[my-trac]
recipe = pbp.recipe.trac
project-name = My Trac instance
project-description = This is my stand-alone Trac instance hosting my development activities.
project-url = http://${buildout:deploy-server}:8000/my-trac
repos = my-repo-1 | svn | ${buildout:directory}/repos/my-repo-1 | svn://${buildout:deploy-server}:3690/my-repo-1
        my-repo-2 | svn | ${buildout:directory}/repos/my-repo-2 | svn://${buildout:deploy-server}:3690/my-repo-2
        my-repo-3 | svn | ${buildout:directory}/repos/my-repo-3 | svn://${buildout:deploy-server}:3690/my-repo-3
default-repo = my-repo-1
force-instance-upgrade = True
force-repos-resync = True
wiki-doc-upgrade = True
stats-plugin = enabled
permissions = anonymous | STATS_VIEW
header-logo = ${buildout:directory}/my_trac_logo.png
smtp-enabled = true
smtp-server = localhost
smtp-port = 25
smtp-from = trac@example.net
smtp-replyto = no-reply@example.net
smtp-always-cc = kevin@example.net bob@example.net
additional-menu-items = Buildbot | http://${buildout:deploy-server}:9080/console
trac-ini-additional = attachment   | max_size               | 26214400
                      browser      | downloadable_paths     | /*/trunk, /*/branches/*, /*/tags/*
                      notification | always_notify_owner    | true
                      notification | always_notify_reporter | true
                      timeline     | ticket_show_details    | true
                      wiki         | ignore_missing_pages   | true
                      svn          | branches               | /*/trunk, /*/branches/*
                      svn          | tags                   | /*/tags/*

I encourage you to use my buildout.cfg above as a template and customize it to your needs. Please read the pbp.recipe.trac documentation carefully to set the recipe options to values that suit you.
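
Options such as repos hold one pipe-separated record per line. To illustrate that layout, here is a small Python 3 sketch that reads such a value with the standard configparser module. This is not how pbp.recipe.trac parses its options: the parse_repos helper and the trimmed-down SAMPLE configuration are assumptions made for the example, and the fields are treated as opaque strings.

```python
import configparser

# Trimmed-down stand-in for the buildout.cfg shown above.
SAMPLE = """\
[my-trac]
recipe = pbp.recipe.trac
repos = my-repo-1 | svn | ${buildout:directory}/repos/my-repo-1 | svn://trac.example.net:3690/my-repo-1
        my-repo-2 | svn | ${buildout:directory}/repos/my-repo-2 | svn://trac.example.net:3690/my-repo-2
default-repo = my-repo-1
"""


def parse_repos(value):
    """Split a multi-line, pipe-separated option into one tuple per line."""
    rows = []
    for line in value.splitlines():
        if line.strip():
            rows.append(tuple(field.strip() for field in line.split("|")))
    return rows


# Interpolation is disabled so values are returned raw; the
# ${section:option} references are left for Buildout to resolve.
config = configparser.ConfigParser(interpolation=None)
config.read_string(SAMPLE)
repos = parse_repos(config["my-trac"]["repos"])
names = [row[0] for row in repos]
# Sanity check: the default repository must be one of the declared ones.
assert config["my-trac"]["default-repo"] in names
```

A check like the final assertion can catch a mistyped default-repo before Buildout is run.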

Before going further, we need a bootstrap.py script. This script will take care of all stuff required by a bare Python interpreter to handle a Buildout project from scratch. Let’s download the latest version:

$ wget http://svn.zope.org/repos/main/zc.buildout/trunk/bootstrap/bootstrap.py

Now we can initialize our Buildout environment. The --distribute option here is necessary to get something more modern than the abandoned setuptools:

$ python ./bootstrap.py --distribute

And then we can ask Buildout to construct our instance:

$ ./bin/buildout

Now that we have an empty Trac 0.12 instance, we will migrate our legacy Subversion repositories to it:

$ svnadmin create ./repos/my-repo-1
$ svnadmin create ./repos/my-repo-2
$ svnadmin create ./repos/my-repo-3
$ ssh -C root@legacy.example.net "svnadmin dump /software/svn/repo1" | svnadmin load ./repos/my-repo-1
$ ssh -C root@legacy.example.net "svnadmin dump /software/svn/repo2" | svnadmin load ./repos/my-repo-2
$ svnadmin load ./repos/my-repo-3 < ~/svn_repo3_20100612.dmp

Note that in this case my first two Subversion repositories are still running on my legacy server, and I already have a local dump of the third.

Let’s copy the data from our legacy Trac instance. By studying the differences between a default Trac instance and the legacy one I was working on, I came to the conclusion that I only needed to move the attachments and the main database. Of course this is my personal case and yours may be a little bit different:

$ scp -rC root@legacy.example.net:/software/trac/project/attachments ./parts/my-trac/
$ scp -rC root@legacy.example.net:/software/trac/project/db/trac.db  ./parts/my-trac/db/

We need to call Buildout a second time to update our project with all the data we’ve just migrated:

$ ./bin/buildout

Now we’ll activate and configure SASL-based authentication in all Subversion repositories:

$ sed -i 's/# use-sasl = true/use-sasl = true/' ./repos/my-repo-1/conf/svnserve.conf
$ sed -i 's/# use-sasl = true/use-sasl = true/' ./repos/my-repo-2/conf/svnserve.conf
$ sed -i 's/# use-sasl = true/use-sasl = true/' ./repos/my-repo-3/conf/svnserve.conf
$ sed -i 's/# realm = My First Repository/realm = svn/' ./repos/my-repo-1/conf/svnserve.conf
$ sed -i 's/# realm = My First Repository/realm = svn/' ./repos/my-repo-2/conf/svnserve.conf
$ sed -i 's/# realm = My First Repository/realm = svn/' ./repos/my-repo-3/conf/svnserve.conf
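
With more than a handful of repositories, repeating the sed commands by hand gets tedious. The Python 3 sketch below applies the same two substitutions to every repository under a root folder. The function names are hypothetical, and unlike sed it only rewrites lines that match the commented-out defaults exactly.

```python
from pathlib import Path

# Commented-out defaults and the values that replace them.
REPLACEMENTS = {
    "# use-sasl = true": "use-sasl = true",
    "# realm = My First Repository": "realm = svn",
}


def enable_sasl(conf_path):
    """Uncomment use-sasl and set the realm in one svnserve.conf file."""
    path = Path(conf_path)
    lines = path.read_text().splitlines()
    # Lines that are not in REPLACEMENTS are kept untouched.
    updated = [REPLACEMENTS.get(line.strip(), line) for line in lines]
    path.write_text("\n".join(updated) + "\n")


def enable_sasl_everywhere(repos_root):
    """Apply enable_sasl() to every repository found under repos_root."""
    for conf in sorted(Path(repos_root).glob("*/conf/svnserve.conf")):
        enable_sasl(conf)
```

Calling enable_sasl_everywhere("./repos") from the trac-home folder would cover the three repositories created earlier.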

Create a password database with our users:

$ saslpasswd2 -f sasl.db -u svn kevin
$ saslpasswd2 -f sasl.db -u svn bob
$ ...

Set up SASL authentication on the system (please change the sasl.conf location below according to your file structure):

$ touch ./sasl.conf
$ sudo ln -s /home/kevin/trac-home/sasl.conf /etc/sasl2/svn.conf

And put the following content in the sasl.conf file we just created above (don’t forget to update the sasl.db location):

pwcheck_method: auxprop
auxprop_plugin: sasldb
sasldb_path: /home/kevin/trac-home/sasl.db
mech_list: ANONYMOUS CRAM-MD5 DIGEST-MD5

It’s time to create and populate the password file used by Trac, with all the users we created 3 steps above:

$ touch ./htdigest
$ htdigest ./htdigest trac kevin
$ htdigest ./htdigest trac bob
$ ...

And now we can start the Subversion server in the background:

$ svnserve --daemon --listen-port 3690 --root ./repos/

Last step, we launch Trac’s standalone webserver:

$ ./bin/tracd --port 8000 --single-env --auth="*,htdigest,trac" ./parts/my-trac

You can now reach Trac from your browser, on the following URL:


http://trac.example.net:8000/my-trac

A final test consists of getting some code from Subversion:

$ svn co svn://trac.example.net:3690/my-repo-1

From now on, and that’s where the fun begins, each time a new Trac version is released on PyPI, I just have to:

  1. stop both Trac and Subversion standalone servers,
  2. run ./bin/buildout, and
  3. restart both Subversion and Trac servers.

That’s enough to upgrade my instance.

Now you can clearly see how it’s important to invest time in automation to save on maintenance costs and prevent code rotting… :)

Convert Lotus Notes’ nsf files to mbox with nlconverter

There is a great piece of software called nlconverter. It’s a tool designed to convert Lotus Notes’ .nsf files to mbox. It relies on win32’s COM/DDE API, so it can only be used on Windows.

If you want to extract mails out of your .nsf database, this might be the tool you’re looking for. Bonus point: it’s written in Python! ;)

Installing nlconverter and its dependencies

Here is how I installed nlconverter on a Windows 2000 (SP4) machine:

  1. First I downloaded and installed the official Python build for Windows (2.6.6 precisely).
  2. Then I installed the Python for Windows extensions (build 214 for Python 2.6 in my case).
  3. I then had to download the latest icalendar archive and extract the \iCalendar-1.2\src\icalendar folder to C:\Python26\Lib\site-packages\.
  4. The last step is to download nlconverter itself and extract it.

nlconverter GUI

The first thing you have to do is to create an export of your mails as a .nsf database. Follow the previous link to get the instructions.

Now let’s convert this .nsf to an mbox. nlconverter’s FAQ tells you to run the gui.exe program to perform the conversion.

Unfortunately it didn’t work for me:

So I tried the alternative approach by using the command line.

nlconverter command line

Again, most of the things I’m writing here are based on nlconverter’s FAQ:

  1. First, we have to download the notes2mbox.py script from nlconverter’s Mercurial repository, as this file is not distributed in the winnlc-alpha-1.zip archive I unzipped previously. Let’s put notes2mbox.py in C:\winnlc-alpha-1\.
  2. Now we’ll modify the notes2mbox.py script to set the password (via the notesPasswd variable) and location (notesNsfPath variable) of the .nsf file. Here are the modifications I applied:
    --- notes2mbox.py.orig	2010-09-02 13:49:58.000000000 +0200
    +++ notes2mbox.py	2010-09-02 13:51:24.000000000 +0200
    @@ -14,8 +14,8 @@
     import NlconverterLib
    
     #Constantes
    -notesPasswd = "foobar"
    -notesNsfPath = "C:\\archive.nsf"
    +notesPasswd = "XXXXXXXXXXXXX"
    +notesNsfPath = "C:\\winnlc-alpha-1\\kevin-notes-big-backup-part-1.nsf"
    
     #Connection à Notes
     db = NlconverterLib.getNotesDb(notesNsfPath, notesPasswd)
    
  3. Before running the script, we have to register a Notes DLL used by nlconverter:
    regsvr32 "C:\Program Files\Notes\nlsxbe.dll"
    


    And make the Python interpreter available system-wide:

    C:\winnlc-alpha-1>SET Path=%Path%;C:\Python26
    
  4. Now we can run the notes2mbox.py script:
    C:\winnlc-alpha-1>C:\Python26\python.exe notes2mbox.py
    

If you’re lucky, you’ll get a nice mbox at the end of the process.

But I was not, and notes2mbox.py ended with the following error:

Traceback (most recent call last):
  File "notes2mbox.py", line 21, in <module>
    db = NlconverterLib.getNotesDb(notesNsfPath, notesPasswd)
  File "C:\winnlc-alpha-1\NlconverterLib.py", line 43, in getNotesDb
    session = win32com.client.Dispatch(r'Lotus.NotesSession')
  File "C:\Python26\lib\site-packages\win32com\client\__init__.py", line 95, in Dispatch
    dispatch, userName = dynamic._GetGoodDispatchAndUserName(dispatch,userName,clsctx)
  File "C:\Python26\lib\site-packages\win32com\client\dynamic.py", line 104, in _GetGoodDispatchAndUserName
    return (_GetGoodDispatch(IDispatch, clsctx), userName)
  File "C:\Python26\lib\site-packages\win32com\client\dynamic.py", line 84, in _GetGoodDispatch
    IDispatch = pythoncom.CoCreateInstance(IDispatch, None, clsctx, pythoncom.IID_IDispatch)
pywintypes.com_error: (-2147221231, 'ClassFactory ne peut pas fournir la classe demand\xe9e', None, None)

As you can see, I tried hard to get nlconverter working, without any success. The COM error message above is the French rendering of “ClassFactory cannot supply requested class”. But this should not stop you from trying. In fact, I suspect the Lotus Notes installation on my machine is crippled or corrupted (I can’t really tell), so you may be luckier than I was. In any case, feel free to report any success or failure in the comment section below!

Maildir deduplication script in Python

Some months ago I wrote a tiny Python script which scans all folders and sub-folders of a Maildir, then removes duplicate mails.

You can give the script a list of email headers to ignore while it compares mails with each other. This is particularly helpful to find duplicate mails having the exact same content but different headers or metadata.

I created this script to clean up a Maildir folder I messed up after repeatedly moving tons of mails from a Lotus Notes database. As you can see below, the same mail imported twice contains a variable header based on the date and time the import was performed:

This variable header makes the mails look different from the script’s point of view. That explains why I implemented the HEADERS_TO_IGNORE parameter, with X-MIMETrack as its default value.
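
The comparison idea can be sketched in a few lines of Python 3 with the standard mailbox module. This is not the published script: apart from HEADERS_TO_IGNORE, the names below (fingerprint, find_duplicates) are hypothetical. The sketch only looks at a single Maildir folder and only reports duplicates instead of deleting them.

```python
import hashlib
import mailbox

HEADERS_TO_IGNORE = ["X-MIMETrack"]


def fingerprint(message, ignored=HEADERS_TO_IGNORE):
    """Hash a message once the ignored headers have been dropped."""
    for name in ignored:
        # Only the in-memory copy is modified; the file on disk is untouched.
        del message[name]
    return hashlib.sha256(message.as_bytes()).hexdigest()


def find_duplicates(maildir_path):
    """Return the keys of messages whose fingerprint was already seen."""
    box = mailbox.Maildir(maildir_path, create=False)
    seen = set()
    duplicates = []
    for key, message in box.iteritems():
        mark = fingerprint(message)
        if mark in seen:
            duplicates.append(key)
        else:
            seen.add(mark)
    return duplicates
```

Which copy of a duplicated mail is kept depends on the iteration order of the mailbox, so the keys returned are best reviewed before anything is removed.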

The script is available on my GitHub repository. It was tested on Mac OS X 10.6 with Python 2.6.2 but should work on other systems and versions, as the code is really simple (and stupid).