GPG commands

  • Generate a key (interactive mode):
    gpg --gen-key
    
  • You can use the key generator in an unattended mode. Values in the example below are the same as the defaults proposed in the interactive mode above. Parameters in comments are there for reference:
    gpg --gen-key --batch <<EOF
    Key-Type: RSA
    Key-Length: 2048
    Subkey-Type: RSA
    Subkey-Length: 2048
    Expire-Date: 0
    Name-Real: Kevin
    # Name-Email: kevin@deldycke.com
    # Name-Comment: My auto-generated key
    # Passphrase: my_secret_passphrase
    EOF
    
  • List available keys for the current user:
    gpg --list-keys
    
  • Decrypt a file:
    gpg --decrypt archive.001.tar.gpg --output archive.001.tar
    
  • Same as above but for a collection of files:
    gpg --multifile --decrypt archive.*.tar.gpg
    

Bazaar commands

  • Check-out in the local openerp-server folder the 6.0 branch of the OpenERP server project from Launchpad:
    bzr branch lp:openobject-server/6.0 openerp-server
    
  • Same command as above, but fetch a particular revision:
    bzr branch lp:openobject-server/6.0 -r 3425 openerp-server
    
  • Get revision number of the local copy we sit in:
    bzr revno ./
    
  • Remove lock file on the current repository:
    bzr break-lock
    

PostgreSQL commands

  • Update the default configuration to allow direct authentication from the local machine:
    sed -i 's/^local\s\+all\s\+all\s\+ident/local all all trust/g' /etc/postgresql/8.4/main/pg_hba.conf
    
  • Same as above but for local IPv4 and IPv6 connexions:
    sed -i 's/^host\s\+all\s\+all\s\+\(.*\)\s\+md5/host all all \1 trust/g' /etc/postgresql/8.4/main/pg_hba.conf
    
  • List databases:
    psql --list -U kevin
    
  • Create a new kevin_db database with the kevin user:
    createdb -U kevin kevin_db
    
  • Remove the database we created above:
    dropdb kevin_db -U kevin
    
  • To connect to a particular database:
    psql -d database_id
    
  • Show us how a table of a specific database can be recreated:
    pg_dump my_database --schema-only --table=my_table
    
  • Dump a database in a compressed format:
    pg_dump my_database -v --format=c --file=/var/lib/postgresql/my_database-db-2011-12-19.dump
    
  • Restore a compressed dump:
    pg_restore -U my_user -d my_database /var/lib/postgresql/my_database-db-2011-12-19.dump
    
  • Import an SQL file to a database:
    psql --username kevin -d kevin_db < ./database_dump.sql
    
  • Search if kevin is a PostgreSQL user:
    sudo -u postgres psql --tuples-only --no-align --command "SELECT usename FROM pg_user;" | grep --quiet 'kevin' && echo 'User found !' || echo 'User not found !'
    
  • Set the owner of a database:
    ALTER DATABASE db_id OWNER TO user_id;
    
  • Set the owner of all tables from the MY_DB_ID database to MY_DB_USER (source):
    for tbl in `psql -qAt -c "SELECT tablename FROM pg_tables WHERE schemaname = 'public';" MY_DB_ID` ; do psql -c "ALTER TABLE $tbl OWNER TO MY_DB_USER" MY_DB_ID ; done
    
  • Same as above but for sequences and views:
    for tbl in `psql -qAt -c "SELECT sequence_name FROM information_schema.sequences WHERE sequence_schema = 'public';" MY_DB_ID` ; do psql -c "ALTER TABLE $tbl OWNER TO MY_DB_USER" MY_DB_ID ; done
    for tbl in `psql -qAt -c "SELECT table_name FROM information_schema.views WHERE table_schema = 'public';" MY_DB_ID` ; do psql -c "ALTER TABLE $tbl OWNER TO MY_DB_USER" MY_DB_ID ; done
    
  • Remove from a table all rows older than a month:
    sudo -u postgres psql -d database_id  --command "DELETE FROM smile_log WHERE log_date > current_date - interval '1 month';"
    
  • Monitor queries being run in real time:
    watch -n 1 'sudo -u postgres psql --tuples-only --command "SELECT datname, procpid, date_trunc(\$\$second\$\$, age(current_timestamp, xact_start)), current_query FROM pg_stat_activity;"'
    
  • Disable all triggers of a table, excluding triggers that are used to implement foreign key constraints:
    ALTER TABLE table_id DISABLE TRIGGER ALL;
    
  • List all constraints of your database (source):
    SELECT tc.constraint_name,
    tc.constraint_type,
    tc.table_name,
    kcu.column_name,
    tc.is_deferrable,
    tc.initially_deferred,
    rc.match_option AS match_type,
    rc.update_rule AS on_update,
    rc.delete_rule AS on_delete,
    ccu.table_name AS references_table,
    ccu.column_name AS references_field
    FROM information_schema.table_constraints tc
    
    LEFT JOIN information_schema.key_column_usage kcu
    ON tc.constraint_catalog = kcu.constraint_catalog
    AND tc.constraint_schema = kcu.constraint_schema
    AND tc.constraint_name = kcu.constraint_name
    
    LEFT JOIN information_schema.referential_constraints rc
    ON tc.constraint_catalog = rc.constraint_catalog
    AND tc.constraint_schema = rc.constraint_schema
    AND tc.constraint_name = rc.constraint_name
    
    LEFT JOIN information_schema.constraint_column_usage ccu
    ON rc.unique_constraint_catalog = ccu.constraint_catalog
    AND rc.unique_constraint_schema = ccu.constraint_schema
    AND rc.unique_constraint_name = ccu.constraint_name
    
    WHERE lower(tc.constraint_type) in ('foreign key');
    
  • And finally, here is a list of great monitoring one-liners.

How I Open-Sourced an Internal Corporate Project (WebPing)

2 weeks ago I released WebPing. This article is more or less the same I wrote 4 months ago when I released the FTT project and needed to move it from SVN to Git. But this time I added more details on how I removed all sensible informations that were hard-coded in the project files.

Subversion to Git migration

Everything starts out of a local copy of the Subversion repository that was hosting the WebPing project since its inception:

$ rm -rf ./svn-repository-copy
$ tar xvzf ./svn-repository-copy.tar.gz
$ kill `ps -ef | grep svnserve | awk '{print $2}'`
$ svnserve --daemon --listen-port 3690 --root ./svn-repository-copy

Let’s initialize a Git repository:

$ rm -rf ./webping-git
$ mkdir ./webping-git
$ cd ./webping-git
$ git init
$ git commit --allow-empty -m 'Initial commit'
$ git tag "init"

We now migrate the code from Subversion to Git:

$ git svn init --no-metadata --username deldycke svn://localhost:3690
$ git svn fetch
$ git rebase --onto git-svn master
$ rm -rf ./.git/svn/
$ rm -rf ./.git/refs/original/
$ git reflog expire --all
$ git gc --aggressive --prune

Removing unrelated files and folders

As WebPing was not alone in the original Subversion repository, we need to clean up the latter and only keep code of the former. Worse, WebPing didn’t started its life in a dedicated subfolder, but as a tool of another project, and jumped from folders to folders. After identifying in the history all places were WebPing lived once, I came up with this big, convoluted command line to do the cleaning:

$ git filter-branch --force --prune-empty --tree-filter 'find ./ -not -ipath "*webping*" -and -not -path "./other-project/trunk/tools/web-ping*" -and -not -path "./other-project/trunk/tools" -and -not -path "./other-project/trunk" -and -not -path "./other-project" -and -not -path "./.git*" -and -not -path "./" | xargs rm -rf' -- --all

Strangely enough, my init tag went of after the command above. So I had to rebased it to get it in line:

$ git rebase init master

We can now remove SVN tags and branches, get rid of the imported git-svn branch, and clean up our Git repository:

$ git filter-branch --force --prune-empty --tree-filter 'find -path "./WebPing/tags*" | xargs rm -rf' -- --all
$ git filter-branch --force --prune-empty --tree-filter 'find -path "./WebPing/branches*" | xargs rm -rf' -- --all
$ git branch -r -D git-svn
$ rm -rf ./.git/svn/
$ rm -rf ./.git/refs/original/
$ git reflog expire --all
$ git gc --aggressive --prune

If I now only have WebPing code in the repository, it still jumps through the history between these following locations:

  • other-project/trunk/tools/web-ping.py
  • other-project/trunk/tools/web-ping/
  • WebPing/trunk/

Using a series of git filter-branch invocations, I managed to move everything to the root of the repository:

$ git filter-branch --force --prune-empty --tree-filter 'test -d ./other-project/trunk/tools && cp -axv ./other-project/trunk/tools/* ./ && rm -rf ./other-project/trunk/tools || echo "No tools folder found"' -- --all
$ git filter-branch --force --prune-empty --tree-filter 'test -d ./other-project/trunk/tools/web-ping && cp -axv ./other-project/trunk/tools/web-ping/* ./ && rm -rf ./other-project/trunk/tools/web-ping || echo "No web-ping folder found"' -- --all
$ git filter-branch --force --prune-empty --tree-filter 'test -d ./WebPing/trunk && cp -axv ./WebPing/trunk/* ./ && rm -rf ./WebPing/trunk || echo "No trunk folder found"' -- --all

Hide and obfuscate hard-coded content

As WebPing was created for internal needs in my previous job, its original code base contains lots of references to the former infrastructure it lives in. My professional standards requires me to remove all these sensible informations before making WebPing available to the public.

For example, here is the commands which allowed me to remove all references to hostnames of our intranets:

$ git filter-branch --force --prune-empty --tree-filter 'find . -type f -exec perl -i -pe "s/([\w-.]*?)\.(company(-intranet|-extention)?)\.(fr|com|net|org)/intranet\.example\.com/g" "{}" \;' -- --all

The Perl one-liner embedded in the command above will only apply the regular expression on a line-by-line basis. If you want to have the regexp applied on the whole content of each file, you have to use Perl’s slurp mode (source of that tip):

$ git filter-branch --force --prune-empty --tree-filter 'perl -0777 -i -pe "s/MAILING_LIST\s*=\s*\[(.*?)\]/MAILING_LIST = \[\]/gs" ./web-ping.py' -- --all

The specific example above helped me removed the content of the MAILING_LIST Python list found in web-ping.py, in order to protect from spam the email addresses of my former co-workers that were unfortunately hard-coded in that variable.

Another place to hunt for sensible information is commit messages. These can be easily modified thanks to the --msg-filter option. Here is how I removed references to our internal Trac tickets:

$ git filter-branch --force --msg-filter 'sed "s/ (see ticket:666)//g"' -- --all

I also had to remove line returns introduced by abusive usage of Windows text editors (remember, WebPing was born in a corporate environment):

$ git filter-branch --force --prune-empty --tree-filter 'perl -i -pe "s/\r//" ./*' -- --all

The last useful command I use was the following, to fix author’s name and email:

$ git filter-branch --force --env-filter '
    if [ "$GIT_AUTHOR_NAME" = "deldycke" ]
      then
        export GIT_AUTHOR_NAME="Kevin Deldycke"
        export GIT_AUTHOR_EMAIL="kevin@deldycke.com"
    fi
    if [ "$GIT_AUTHOR_NAME" = "diehr" ]
      then
        export GIT_AUTHOR_NAME="Matthieu Diehr"
        export GIT_AUTHOR_EMAIL="matthieu.diehr@gmail.com"
    fi
  ' -- --all

By using a dozen variations of the commands above, and carefully reviewing the code, I was able to engineer a clean code history.

But I certainly have been a little too blunt with these regular expressions. Some of them were able to act on binary content. As a result, I had to restore static images to their original copy.

Final steps

Now that your code is clean, all you need is to recreate you tag and fix the init tag date before committing everything to GitHub:

$ git tag -f "0.0" bad4ff7fc48b8b34f6f661d75c782c7fc0d098c5
$ git tag -f "0.1" 590ac9953df0e3bc76fd02615471e36a9796a065
$ git tag -f "0.2" 33f731054042b02c6d2600e7aead5bb7c4991b12
$ git filter-branch --env-filter '
      if [ $GIT_COMMIT = 361224542bc73bba747c7ca382e992e2cdd0c356 ]
      then
          export GIT_AUTHOR_DATE="Thu, 01 Jan 1970 00:00:00 +0000"
          export GIT_COMMITTER_DATE="Thu, 01 Jan 1970 00:00:00 +0000"
      fi' -- --all
$ git remote add origin git@github.com:kdeldycke/webping.git
$ git push -u origin master
$ git push --tags

Apache commands

  • Hide Subversion and Git directories content (source):
    RedirectMatch 404 /\.(svn|git)(/|$)
    
  • Disable rendering of PHP files coming from imported third party Javascript submodules (context):
    RedirectMatch 404 js-(.*)\.php$
    
  • Redirect any request to current year sub-directory (I used this for a yearly-updated static web page):
    RewriteEngine on
    RewriteRule !^/2010/ /2010/ [R=301,L]
    
  • Here is my template for domain-based virtual host routing:
    # Setup the main website access
    <VirtualHost *:80>
      ServerName example.com
      DocumentRoot /var/www/example
      # Add extra capabilities to let CMS like WordPress manage redirections
      <Directory /var/www/example>
        Options +FollowSymLinks +SymLinksIfOwnerMatch
      </Directory>
    </VirtualHost>
    # Redirect all other access to the website from different domains to the canonical URL
    <VirtualHost *:80>
      ServerName www.example.com
      ServerAlias *.example.com
      ServerAlias example.net *.example.net
      ServerAlias example.org *.example.org
      RedirectMatch permanent (.*) http://example.com$1
    </VirtualHost>
    
  • Insert dynamic headers in HTTP responses depending on the browser:
    BrowserMatchNoCase ".*MSIE\s[1-6].*" IS_DISGUSTING_BROWSER
    Header add X-advice-of-the-day "Save a kitten: use Firefox !" env=IS_DISGUSTING_BROWSER
    
  • Prevent WebDAV connexions (thanks Guillaume!):
    <Location />
      <Limit PROPFIND PROPPATCH MKCOL COPY MOVE LOCK UNLOCK PATCH>
        # Leaves GET (and HEAD), POST, PUT, DELETE, CONNECT, OPTIONS and TRACE alone
        Order allow,deny
        Deny from all
      </Limit>
    </Location>
    SetEnvIf Request_Method "OPTIONS" CLIENT_PROBE
    Header set Allow "GET, HEAD, POST, PUT, DELETE, CONNECT, OPTIONS, TRACE" env=CLIENT_PROBE
    
  • At work, we had to engineer a convoluted software architecture for our intranet to fit the network security policy of our customer. This had a bad side effect of letting the web statistic collector delete all cookies but its own, thus breaking intranet’s authentication. So we (thanks Matthieu!) came up with this unmaintainable hack on Apache side to hide our intranet’s cookies to NedStat’s Javascript embedded code:
    <LocationMatch "/(.*)">
      LoadModule headers_module modules/mod_headers.so
      RequestHeader edit Cookie "(app_cookie_001=[^;]*(; )*)" ""
      RequestHeader edit Cookie "(app_cookie_002=[^;]*(; )*)" ""
      RequestHeader edit Cookie "(app_cookie_003=[^;]*(; )*)" ""
    </LocationMatch>
    
  • Kill all apache processes and restart the service:
    /etc/init.d/apache2 stop ; pkill -9 -u www-data ; /etc/init.d/apache2 restart
    
  • Restart Apache service if no process found:
    [ `ps axu | grep -v "grep" | grep --count "www-data"` -le 0 ] && /etc/init.d/apache2 restart
    

Mercurial commands

  • Checkout a distant repository:
    hg clone http://bitbucket.org/mirror/django
    
  • Commit all changes locally:
    hg commit -m "Here is my commit fixing bugs."
    
  • Push local commits to the remote repository:
    hg push
    
  • Apply latest changes of the remote repository to our local working copy:
    hg pull
    
  • Align the current repository to a specific revision:
    hg update -r 502
    
  • Restore all changes and files to the state they are in the distant repository:
    hg update -C
    
  • My minimal ~/.hgrc config file:
    [ui]
    username = Kevin Deldycke <kevin@deldycke.com>
    verbose = True
    
    [auth]
    # BitBucket creds
    bb.prefix = bitbucket.org
    bb.username = kdeldycke
    bb.password = XXXXXXXXXXX
    bb.schemes = https
    
  • Display the last 5 commits:
    hg log --limit 5
    
  • Display the local changes since last commit:
    hg diff
    
  • Undo the last local commit:
    hg rollback
    
  • Create a tag on a particular revision:
    hg tag -r 432 component-2.6.1
    
  • Create a bundle file containing all changes committed locally:
    hg bundle fix-bug.bundle