Keep a Debian fresh thanks to cron-apt

As I mentioned in an old comment, I use cron-apt to keep my Debian servers fresh.

This post is just a quick reminder to my future self about how I set up cron-apt on my machines.

First we install the package:

aptitude install cron-apt

Then we configure it:

sed -i 's/# MAILON="error"/MAILON="always"/g' /etc/cron-apt/config
sed -i 's/# MAILTO="root"/MAILTO="kevin@deldycke.com"/g' /etc/cron-apt/config
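
To preview what these two substitutions do without touching the real file, here is a small sketch that runs them on a throwaway copy. It assumes the stock config ships both settings commented out exactly as `# MAILON="error"` and `# MAILTO="root"`; if your version words them differently, the patterns will silently match nothing. The address `admin@example.com` is a placeholder.

```shell
# Work on a scratch file instead of /etc/cron-apt/config.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# MAILON="error"
# MAILTO="root"
EOF

# Same substitutions as above, pointed at the scratch file.
# admin@example.com is a placeholder address.
sed -i 's/# MAILON="error"/MAILON="always"/g' "$conf"
sed -i 's/# MAILTO="root"/MAILTO="admin@example.com"/g' "$conf"

# Show the active (uncommented) mail settings.
result=$(grep -E '^MAIL(ON|TO)=' "$conf")
printf '%s\n' "$result"
rm -f "$conf"
```

If the final `grep` prints nothing, the patterns did not match and the config was left unchanged.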

That’s it!

Fixing messed-up encoding in MySQL

Currently working on my e107 Importer plugin, I was confronted today with badly-encoded data coming from my databases.

e107 migrated to full UTF-8 years ago, but I must have botched the upgrade at the time. That was my conclusion after taking a close look at my tables: all of them seem to be set to Latin-1 but contain UTF-8 data. A look at the tables in SQLBuddy (a great lightweight MySQL manager) showed just that.

To fix this, I first tried to use the following command I found on the web:

mysql --database=e107db -B -N -e "SHOW TABLES"  | awk '{print "ALTER TABLE", $1, "CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;"}' | mysql --database=e107db

But this doesn’t work: it not only changes the declared encoding of each table, it also transcodes the data inside. Since the data is already UTF-8 (merely labeled as Latin-1), transcoding it again corrupts it.
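
To see why transcoding makes things worse, here is a small sketch using `iconv` (my own illustration; the original workflow only used MySQL). It takes the two UTF-8 bytes of `é`, treats them as Latin-1 the way a mislabeled table would, and converts them to UTF-8 a second time.

```shell
# The UTF-8 encoding of "é" is the two bytes 0xC3 0xA9 (octal 303 251).
utf8_e=$(printf '\303\251')

# Treat those bytes as Latin-1 and transcode to UTF-8, which is roughly
# what a charset conversion does to a column wrongly labeled latin1.
double_encoded=$(printf '%s' "$utf8_e" | iconv -f ISO-8859-1 -t UTF-8)

# Each original byte became its own two-byte character: 4 bytes in total,
# which a UTF-8 terminal displays as "Ã©" instead of "é".
byte_count=$(printf '%s' "$double_encoded" | wc -c | tr -d ' ')
printf '%s -> %s (%s bytes)\n' "$utf8_e" "$double_encoded" "$byte_count"
```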

Let’s try something else. First, we’ll export the database to a dump file, of which the encoding is forced to Latin-1:

mysqldump -a -c -e --no-create-db --add-drop-table --default-character-set=latin1 --databases 'e107db' > ./e107-data.sql

Now the trick is to change the CHARSET parameter of all CREATE TABLE directives to UTF-8:

sed -i 's/CHARSET=latin1/CHARSET=utf8/g' ./e107-data.sql

We’ll also change the NAMES directive to force MySQL to handle imported data as UTF-8:

sed -i 's/SET NAMES latin1/SET NAMES utf8/g' ./e107-data.sql

Then we’re free to import the result into a new UTF-8 database (the USE directive is commented out first, so the dump doesn’t switch back to the old database):

sed -i 's/USE `e107db`;/#USE `e107db`;/g' ./e107-data.sql
mysql --execute="CREATE DATABASE e107db_new CHARACTER SET=utf8"
mysql --database=e107db_new < ./e107-data.sql
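
Before importing, it can be worth checking that the rewrites hit every spot. The sketch below runs the same three substitutions on a tiny hand-written stand-in for a dump (the `news` table and its column are made up) and then looks for leftover `latin1` markers and active `USE` statements.

```shell
# A made-up, minimal stand-in for the real dump file.
dump=$(mktemp)
cat > "$dump" <<'EOF'
/*!40101 SET NAMES latin1 */;
USE `e107db`;
CREATE TABLE `news` (
  `title` varchar(200) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
EOF

# The three rewrites described above.
sed -i 's/CHARSET=latin1/CHARSET=utf8/g' "$dump"
sed -i 's/SET NAMES latin1/SET NAMES utf8/g' "$dump"
sed -i 's/USE `e107db`;/#USE `e107db`;/g' "$dump"

# Count lines still mentioning latin1, and USE statements left active.
leftover=$(grep -c 'latin1' "$dump")
active_use=$(grep -c '^USE ' "$dump")
echo "latin1 lines left: $leftover, active USE statements: $active_use"
rm -f "$dump"
```

On a real dump, a non-zero count would point at a directive the `sed` patterns missed, for example a column-level `CHARACTER SET latin1`.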

And now accented characters appear as they should in our database, meaning we’ve fixed the whole mess! :)


PS: I found an alternative method (look at the end of the linked page) which consists of temporarily handling TEXT fields as BLOB, so that MySQL treats them as binary content (thus skipping character transcoding). I haven’t tested it, but it sounds tricky.

Subversion commands

Native commands

  • Revert current local folder to revision 666:
    svn merge -rHEAD:666 ./
    
  • Create an empty repository:
    svnadmin create ./my-repo
    
  • Dump a repository (a sure way to migrate a Subversion repository from one version to another):
    svnadmin dump ./my-repo > ./my-repo.dmp
    
  • Migrate a remote Subversion repository without creating an intermediate dump file:
    ssh -C user@myserver.com "svnadmin dump /home/user/my-repo" | svnadmin load /home/user2/my-new-repo
    
  • Launch a standalone Subversion server listening on port 3690 and serving all repositories located in ./repos/:
    svnserve --daemon --listen-port 3690 --root ./repos/
    

Local working copy hacking

  • Recursive and case insensitive content search on non-binary files from the current folder, while ignoring .svn folders and their content:
    find ./ -type f -not -regex ".*\/.svn\/.*" -exec grep -Iil "string to search" {} \;
    
  • Same thing as above but with an alternative approach (that doesn’t work on large folders, as the whole file list must fit on the command line):
    grep -Ii "string to search" $(find . | grep -v .svn)
    

    Other alternative: use ack.

  • Use sed to replace text in all files except Subversion metadata:
    find ./ -type f -not -regex ".*\/.svn\/.*" -print -exec sed -i 's/str1/str2/g' "{}" \;
    
  • Use svn delete to remove all files containing a tilde in their name without touching local Subversion metadata:
    find ./ -type f -not -regex ".*\/.svn\/.*" -name "*~*" -print -exec svn delete "{}" \;
    
  • In a repository structure containing sub-projects (think of Plone’s collective repository as an example), get the list of all folders in all trunks, while ignoring Subversion metadata folders:
    find ./ -type d -regex ".*\/trunk\/?.*" -not -regex ".*\/.svn\/?.*" -print
    
  • Similarly to the command above, replace all occurrences of the string @coolcavemen.fr by @coolcavemen.com in all trunk subfolders while ignoring .svn content:
    find ./ -type f -regex ".*\/trunk\/.*" -not -regex ".*\/.svn\/.*" -print -exec sed -i 's/@coolcavemen\.fr/@coolcavemen\.com/g' "{}" \;
    
  • Set a svn property to ignore all .mo files during commit in every folder of our local working copy containing .po files:
    find ./ -type f -name "*.po" -regex ".*\/trunk\/.*" -not -regex ".*\/.svn\/.*" -printf "%h\n" | sort -u | xargs svn propset "svn:ignore" "*.mo"
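
Most of the commands in this list rely on the same `find` filter to skip Subversion metadata. Here is a minimal sketch on a throwaway tree (the folder and file names are made up) to check that the filter behaves as expected before pointing `sed -i` at a real working copy. In this sketch the dot in `\.svn` is escaped so that it only matches a literal dot.

```shell
# Build a fake working copy with one regular file and one metadata file.
tree=$(mktemp -d)
mkdir -p "$tree/project/trunk/.svn" "$tree/project/trunk/src"
echo "hello str1" > "$tree/project/trunk/src/readme.txt"
echo "hello str1" > "$tree/project/trunk/.svn/entries"

# Replace str1 by str2 everywhere except under .svn folders.
find "$tree" -type f -not -regex ".*/\.svn/.*" -exec sed -i 's/str1/str2/g' "{}" \;

# The regular file should be rewritten, the metadata file left alone.
working_file=$(cat "$tree/project/trunk/src/readme.txt")
metadata_file=$(cat "$tree/project/trunk/.svn/entries")
echo "working copy: $working_file"
echo "metadata:     $metadata_file"
rm -rf "$tree"
```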
    

Moving a WordPress blog to another domain

I provide free hosting to some of my friends. One of them, QPX, had a side project called Lich’ti. But the latter is no longer active, so he decided not to renew the lich-ti.fr domain.

If Lich’ti’s domain name is dead, QPX’s personal blog is not. His website is powered by WordPress and was available at http://qpx.lich-ti.fr. My job is now to move it to http://qpx.coolcavemen.com. In this post, I’ll tell you how I’ve done it.

Before going further, back up everything, and be ready to revert to your original situation at any moment! What works for me will not necessarily work for you…

To play nice with your visitors, you can set up a temporary maintenance page while the migration is in progress.

Let’s start the migration by replacing, in the files served by Apache, all occurrences of the old domain name with the new one:

find /var/www/qpx-blog -mount -type f -print -exec sed -i 's/qpx\.lich-ti\.fr/qpx.coolcavemen.com/g' "{}" \;

If you have doubts about the effectiveness of the command above, you can check whether the string we’re replacing is still present with this command:

grep -RIi "qpx.lich-ti.fr" ./*
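
One detail worth keeping in mind with these patterns: in a `sed` or `grep` regular expression, an unescaped dot matches any character. The sketch below (the sample lines are made up) compares a loose pattern with an escaped one, then uses `grep -F` for a literal check of what is left.

```shell
# Two sample lines: a real URL, and a look-alike with dashes instead of dots.
site=$(mktemp -d)
printf '%s\n' 'http://qpx.lich-ti.fr/about' 'qpx-lich-ti-fr.css' > "$site/page.html"

# Unescaped dots match any character, so the look-alike is rewritten too.
loose=$(sed 's/qpx.lich-ti.fr/qpx.coolcavemen.com/g' "$site/page.html")

# Escaped dots only match the literal domain.
strict=$(sed 's/qpx\.lich-ti\.fr/qpx.coolcavemen.com/g' "$site/page.html")

# Literal (-F), case-insensitive count of lines still holding the old domain.
remaining=$(printf '%s\n' "$strict" | grep -ciF 'qpx.lich-ti.fr')

printf 'loose:\n%s\nstrict:\n%s\nremaining: %s\n' "$loose" "$strict" "$remaining"
rm -rf "$site"
```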

Then, we dump the database containing all WordPress content and config to a local file (the command will prompt for password):

mysqldump -p --host=localhost --port=3306 --user=root --opt --databases "qpx_blog" > qpx_dump.sql

And we replace all occurrences of the old domain with the new one:

sed 's/qpx\.lich-ti\.fr/qpx.coolcavemen.com/g' qpx_dump.sql > new_qpx.sql
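
Since the next step drops the original database, it may be reassuring to compare occurrence counts before and after the rewrite. This is a sketch on a made-up two-line stand-in for the dump; on a real migration you would run the three `grep` lines against `qpx_dump.sql` and `new_qpx.sql` instead.

```shell
# A simplified, made-up stand-in for the real dump.
dump=$(mktemp)
new_dump=$(mktemp)
cat > "$dump" <<'EOF'
INSERT INTO `wp_options` VALUES (1,'siteurl','http://qpx.lich-ti.fr');
INSERT INTO `wp_options` VALUES (2,'home','http://qpx.lich-ti.fr');
EOF

sed 's/qpx\.lich-ti\.fr/qpx.coolcavemen.com/g' "$dump" > "$new_dump"

# grep -o prints one match per line, so wc -l counts occurrences.
old_before=$(grep -oF 'qpx.lich-ti.fr' "$dump" | wc -l | tr -d ' ')
old_after=$(grep -oF 'qpx.lich-ti.fr' "$new_dump" | wc -l | tr -d ' ')
new_after=$(grep -oF 'qpx.coolcavemen.com' "$new_dump" | wc -l | tr -d ' ')
echo "old domain: $old_before before, $old_after after; new domain: $new_after"
rm -f "$dump" "$new_dump"
```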

Finally, we re-inject the modified database content after clearing the original:

mysql -p --host=localhost --port=3306 --user=root --execute='DROP DATABASE `qpx_blog`;'
mysql -p --host=localhost --port=3306 --user=root < new_qpx.sql

Now you can disable the maintenance page and test the blog to check nothing’s broken.

Again, to play nice with your visitors (and search engines), you can redirect old URLs to the new domain with Apache directives similar to these:

<VirtualHost *:80>
  ServerName qpx.lich-ti.fr
  RedirectMatch permanent (.*) http://qpx.coolcavemen.com$1
</VirtualHost>

Text, Date & Document processing commands

  • Text replacement:
    sed 's/string to replace/replacement string/g' original-file.txt > new-file.txt
    
  • Replace all occurrences of str1 by str2 in all files below the /folder path:
    find /folder -xdev -type f -print -exec sed -i 's/str1/str2/g' "{}" \;
    
  • Same as above but ignore all content of .svn folders and .zip files:
    find /folder -xdev -type f -not -regex ".*\/\.svn\/.*" -not -iname "*\.zip" -print -exec sed -i 's/str1/str2/g' "{}" \;
    
  • In place charset transcoding:
    recode utf-8..latin-1 utf8text.txt
    
  • Remove all accented characters in a string (thanks to Matthieu for the tip):
    echo "éÈça-$" | iconv -t ASCII//translit
    
  • Get the date of last week:
    date +"%Y-%m-%d" -d last-week
    
  • Get the current date in English:
    env LC_TIME=en date +"%a %b %d %Y"
    
  • Get the number of seconds since epoch:
    date +%s
    
  • Convert back epoch time to human-readable date:
    date --date=@1234567890
    
  • Merge 2 PDF documents:
    pdftk doc1.pdf doc2.pdf cat output newdoc.pdf
    
  • Same as above, but for all PDFs of the current folder. This also has the nice side effect of removing DRM :) :
    gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=bigfile.pdf ./*.pdf
    
  • VIM: no autoindent on paste.
  • a list of sed one-liners.
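
The two epoch commands in the list above are inverses of each other. Here is a short round-trip sketch, assuming GNU date (as the rest of this list does). It forces UTC with `-u` so the result doesn’t depend on the machine’s timezone.

```shell
# Fixed timestamp so the result is reproducible.
epoch=1234567890

# Epoch seconds to a human-readable UTC date.
human=$(date -u --date="@$epoch" +"%Y-%m-%d %H:%M:%S")
echo "$human"    # 2009-02-13 23:31:30

# And back: parse the formatted date as UTC, print seconds since epoch.
back=$(date -u --date="$human UTC" +%s)
echo "$back"     # 1234567890
```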