WebPing Open-sourced !

I’ve just released WebPing under a GPL license. It’s available right now on a GitHub repository.

WebPing is a script I started to work on in 2009 while working at EDF. Back then, I needed a monitoring tool to keep an eye on the 80+ Plone instances that my team managed. For several corporate reasons, I wasn’t allowed to use a proper monitoring tool like Munin or Nagios. So I created a small script to fill this need. That’s how WebPing came to be.

WebPing is just a stupid Python script that is designed to be ticked regularly by a cron job. It try to fetch a list of URLs and store response times in an SQLite database. Then it create a static HTML report you’re free to serve with any HTTP server (an example Apache configuration is provided). The configuration of WebPing and the list of URLs it monitor is stored in a YAML file.

The produced HTML report use the Flot jQuery plugin to render graphs. Here is how the dashboard looks like:

Finally, WebPing is able to send reports and alerts by emails. Here is how a mail alert looks like:

Since I created WebPing, I found several other projects more or less developed around the same idea. See Kong, which is based on Django and Twill, a web-oriented DSL. Another project I spotted after the facts was multi-mechanize. Like Kong, it’s written in Python. But I never played with one or the other.

Apache commands

  • Hide Subversion and Git directories content (source):
    RedirectMatch 404 /\.(svn|git)(/|$)
    
  • Disable rendering of PHP files coming from imported third party Javascript submodules (context):
    RedirectMatch 404 js-(.*)\.php$
    
  • Redirect any request to current year sub-directory (I used this for a yearly-updated static web page):
    RewriteEngine on
    RewriteRule !^/2010/ /2010/ [R=301,L]
    
  • Here is my template for domain-based virtual host routing:
    # Setup the main website access
    <VirtualHost *:80>
      ServerName example.com
      DocumentRoot /var/www/example
      # Add extra capabilities to let CMS like WordPress manage redirections
      <Directory /var/www/example>
        Options +FollowSymLinks +SymLinksIfOwnerMatch
      </Directory>
    </VirtualHost>
    # Redirect all other access to the website from different domains to the canonical URL
    <VirtualHost *:80>
      ServerName www.example.com
      ServerAlias *.example.com
      ServerAlias example.net *.example.net
      ServerAlias example.org *.example.org
      RedirectMatch permanent (.*) http://example.com$1
    </VirtualHost>
    
  • Insert dynamic headers in HTTP responses depending on the browser:
    BrowserMatchNoCase ".*MSIE\s[1-6].*" IS_DISGUSTING_BROWSER
    Header add X-advice-of-the-day "Save a kitten: use Firefox !" env=IS_DISGUSTING_BROWSER
    
  • Prevent WebDAV connexions (thanks Guillaume!):
    <Location />
      <Limit PROPFIND PROPPATCH MKCOL COPY MOVE LOCK UNLOCK PATCH>
        # Leaves GET (and HEAD), POST, PUT, DELETE, CONNECT, OPTIONS and TRACE alone
        Order allow,deny
        Deny from all
      </Limit>
    </Location>
    SetEnvIf Request_Method "OPTIONS" CLIENT_PROBE
    Header set Allow "GET, HEAD, POST, PUT, DELETE, CONNECT, OPTIONS, TRACE" env=CLIENT_PROBE
    
  • At work, we had to engineer a convoluted software architecture for our intranet to fit the network security policy of our customer. This had a bad side effect of letting the web statistic collector delete all cookies but its own, thus breaking intranet’s authentication. So we (thanks Matthieu!) came up with this unmaintainable hack on Apache side to hide our intranet’s cookies to NedStat’s Javascript embedded code:
    <LocationMatch "/(.*)">
      LoadModule headers_module modules/mod_headers.so
      RequestHeader edit Cookie "(app_cookie_001=[^;]*(; )*)" ""
      RequestHeader edit Cookie "(app_cookie_002=[^;]*(; )*)" ""
      RequestHeader edit Cookie "(app_cookie_003=[^;]*(; )*)" ""
    </LocationMatch>
    
  • Kill all apache processes and restart the service:
    /etc/init.d/apache2 stop ; pkill -9 -u www-data ; /etc/init.d/apache2 restart
    
  • Restart Apache service if no process found:
    [ `ps axu | grep -v "grep" | grep --count "www-data"` -le 0 ] && /etc/init.d/apache2 restart
    

How-to add proxy support to Feedalizer ruby library

Here is a little code snippet that monkey-patch Feedalizer to let it grab web content through a HTTP proxy:

# HTTP proxy settings
HTTP_PROXY_HOST = "123.456.78.90"
HTTP_PROXY_PORT = 8080

# Calculate proxy URL
HTTP_PROXY_URL = "http://#{HTTP_PROXY_HOST}:#{HTTP_PROXY_PORT}"

# Monkey patch feedalizer to support page grabbing through a proxy
require 'feedalizer'
class Feedalizer
  # Backup original grab_page method
  alias_method :grab_page_orig, :grab_page
  # Define new grab_page() method with proxy support
  def grab_page(url)
    open(url, :proxy => HTTP_PROXY_URL) { |io| Hpricot(io) }
  end
end

This fix, written for a Ruby on Rails-based project, lay in the environment.rb file, but I wonder if this is the right place and the right way of doing it… Anyway, it works for me ! :)

Update: A post from Matthew Higgins’ blog that answer my question above has just shown up in my feed aggregator. What’s he telling us ? That I’m a naughty programmer :

Previous to 2.0, naughty developers pasted code at the bottom of environment.rb, and the config/initializer folder was a welcome convention to help organize this madness.

For your instance, the code in this post is extracted from an “old” (prior to RoR 2.0) project, thus explaining my naughtyness… ;)