Website Backup Script: MySQL dumps and SSH supported.

Three months after the last version, here is a big update of my backup scripts for websites. The script was greatly improved and among new features, the most important is the support of backups over SSH and backups of MySQL databases.

Change log:

  • Each item of the user’s backup_list must specify the type property (FTP, FTPs, SSH, MySQLdump or MySQLdump+ssh).
  • The property previously known as site is now host.
  • File system structure changed: /ftp-mirror folders renamed to /mirror.
  • Add SSH backups.
  • The script is able to detect if a SSH connexion can be initiated without a password. This was designed for people who don’t like the idea of storing clear password in the script. Thanks to this feature, you can benefit public key authentication from OpenSSH.
  • Use of rsync whenever it’s possible for bandwidth efficiency.
  • FTP and FTPs (aka FTP over SSL) are now handled separately: this suppress the default fall-back to FTP if FTPs is not supported by the remote server. This is safer as it doesn’t let lftp make the decision for you to send your clear password on the net.
  • All ports are optionnal, no need to specify it you use default ports.
  • Add MySQL backups thanks to mysqldump.
  • Two mode of MySQL backups: through SSH or direct connection to server.
  • A particular database to backup can be specified. Else, all databases are backed up.
  • Much more detailed logs that include external command’s output.
  • Auto-detect the existence of required external tools and commands at launch.
  • Use pexpect lib to simulate user password input.
  • Run all external commands in english for consistency.
  • Check that the script is running in a posix environnement.
  • Fix bug related to directory creation.

If you were using a previous version of my backup script and want to use this updated version, take care of changes, especially the ones describes in the first 3 items of the change log above.

Ultimate Regular Expression for HTML tag parsing with PHP

Disclaimer: this is a dirty hack ! To parse HTML or XML, use a dedicated library.

Tonight I found the ultimate regex to get HTML tags out of a string. It was written a year ago by Phil Haack on his blog. His regex is quite bullet-proof: it’s able to parse HTML tags written on multiple lines which contain any sort of attributes (with or without a value, with single or double quotes).

Unfortunately his regular expression was designed for Microsoft .NET, so I’ve spend some time to convert it to PHP. Here is the result:

$regex = "/<\/?\w+((\s+\w+(\s*=\s*(?:\".*?\"|'.*?'|[^'\">\s]+))?)+\s*|\s*)\/?>/i";

And finally, my version based on the one above:

$regex = "/<\/?\w+((\s+(\w|\w[\w-]*\w)(\s*=\s*(?:\".*?\"|'.*?'|[^'\">\s]+))?)+\s*|\s*)\/?>/i";

The latter include the following enhancement:

  • accept hyphens as attribute’s middle characters (thanks Ged)

Sapphire style for K2 WordPress theme

Sapphire custom style for K2 WordPress Theme

Yesterday I’ve build a new K2 style based on the legacy Sapphire 1.0 WordPress theme by Michael Martine. This is the result of my love to the blue bend of Sapphire and the versatility of K2.

As you can see in the footer of that blog, I’m using my Sapphire 0.1 style right now. So if you like the look and feel of that blog, don’t ask yourself and download Sapphire 0.1 for for K2 !

To install it, unzip the archive to your /wp-content/themes/k2/styles/ folder.

e107 to WordPress v0.7: support Categories and Private Pages

e107 to WordPress import plugin v0.7: News and Categories Imported screenshot I’ve released a new version of my e107 migration script for WordPress. This release is numbered v0.7.

Here is the change-log:

  • Import e107 news categories.
  • Mails can be sent to each user to warn them about their new password. This is the only solution I found to fix the issue about reseted passwords.
  • Thanks to the 2.1 branch of WordPress, static pages can be set as private. This version of my script use this feature.
  • Private pages also deprecate the questions asked to the user when importing pages. So I deleted those parameters (“Warning 2” on the screenshot) to make the import process easier.
  • Embedded e107 code updated from v0.7.8.
  • Tested with WordPress 2.1.2.
  • Some little UI imporvements (to match admin UI consistency).

Because of its de-facto standard status, my import script is now packaged as a .zip file. It can be downloaded from my Linux Scripts section.

How-to Recover a RAID array after having Zero-ized Superblocks

Today mdadm send me a mail to warn that one of my hard drive (/dev/hdd1) was ejected from my RAID-5 array. After some manipulations (no writes, just reads on the file system to get informations) and reboots, I ended up with a file system in a strange state: the folder structure was totally messed up and lots of files disappeared.

Assuming that this situation was about an inconsistent file index, I decided to reset the superblocks of the remaining physical disks:

mdadm --zero-superblock /dev/hdc1
mdadm --zero-superblock /dev/hdb1

I don’t know why I decided to do so, but it was the stupidest idea of the week. After such a violent treatment, my array refused to start:

[root@localhost ~]$ mdadm --assemble /dev/md0 --auto --scan --update=summaries --verbose
mdadm: looking for devices for /dev/md0
mdadm: no RAID superblock on /dev/hdc1
mdadm: /dev/hdc1 has wrong raid level.
mdadm: no RAID superblock on /dev/hdb1
mdadm: /dev/hdb1 has wrong raid level.
mdadm: no devices found for /dev/md0

At this moment I was sure that all my data assets were lost. I was desperate. My only alternative was to ask Google. So I did.

I spend several minutes browsing the web without hope. I finally found someone in the same situation as mine (sorry, in french) on debian-user-french mailing list.

The solution was to recreate the RAID array. This sound counter-intuitive: if we recreate a raid array over an existing one, it will be erased ! Right ? Wrong ! As it is said on debian-user-french, mdadm is smart enough to “see” that HDD of the new array were elements of a previous one. Knowing that, mdadm will try to do its best (i.e. if parameters match the previous array configuration) and rebuild the new array upon the previous one in a non-destructive way, by keeping HDD content.

So, here is how I finally recovered my RAID array:

[root@localhost ~]$ mdadm --create /dev/md0 --verbose --level=5 --raid-devices=3 /dev/hdc1 missing /dev/hdb1
mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 64K
mdadm: size set to 312568576K
mdadm: array /dev/md0 started.

Of course this doesn’t solve my initial problem about the /dev/md0 file system: it is still in an altered state. Maybe it’s too late to recover data. But at least I reverted all my today’s mistakes, and the situation will not deteriorate until I power up my RAID ! :)