Website Backup script: Incremental Backup feature added.

I’ve changed my backup strategy today, so I updated my website-backup.py script. You can find the latest version of the script on my script page.

I now use rdiff-backup in this script to keep 32 days of incremental backups. Besides this, the script makes a monthly full archive of the website in bzip2 format. This new strategy has reduced the total size of my backups from 64 GB to 6.7 GB: roughly 90% less disk space, thanks to rdiff-backup! rdiff-backup is so efficient in my case because my websites contain large files that are rarely modified (MP3s, FLACs, RPMs, images, etc.).
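
To make the strategy concrete, here is a minimal Python sketch of the two steps (32 days of rdiff-backup increments plus one bzip2 archive per month). It is not the actual code of website-backup.py: the function names, the directory layout and the archive naming scheme are assumptions made for illustration, and the rdiff-backup calls use the classic command-line form (source and destination, then --remove-older-than).

```python
import datetime
import tarfile
from pathlib import Path


def rdiff_backup_commands(mirror_dir, backup_dir, keep_days=32):
    """Return the two rdiff-backup invocations: update, then prune old increments."""
    return [
        # Update the backup repository with the current state of the local mirror.
        ["rdiff-backup", str(mirror_dir), str(backup_dir)],
        # Drop increments older than keep_days; --force allows removing several at once.
        ["rdiff-backup", "--force", "--remove-older-than",
         f"{keep_days}D", str(backup_dir)],
    ]


def monthly_archive(mirror_dir, archive_dir, today=None):
    """Create one .tar.bz2 per month; return its path, or None if it already exists."""
    today = today or datetime.date.today()
    name = Path(mirror_dir).name
    # Hypothetical naming scheme: <site>-<year>-<month>.tar.bz2
    target = Path(archive_dir) / f"{name}-{today:%Y-%m}.tar.bz2"
    if target.exists():
        return None
    target.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(target, "w:bz2") as tar:
        tar.add(str(mirror_dir), arcname=name)
    return target
```

Each command list can be passed to subprocess.run(cmd, check=True) on a machine where rdiff-backup is installed.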

Website Backup script: New Version Saves You Disk Space.

I’ve updated my website-backup.py script. I added a little optimization that deletes yesterday’s backup if nothing has changed on the remote website. This lets me save some megabytes on the hard drive for daily backups of near-static websites. The optimization is simply based on checksum comparison. This is the context for the previous script I wrote today: it was a tool to help me debug and experiment with this new feature.
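
The checksum comparison could look roughly like the sketch below. The post does not say which hash the script uses or exactly what it hashes, so SHA-256 over the two backup files is an assumption, and both function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_checksum(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def drop_yesterday_if_unchanged(yesterday, today):
    """Delete yesterday's backup when it is identical to today's. Return True if deleted."""
    yesterday, today = Path(yesterday), Path(today)
    if not (yesterday.is_file() and today.is_file()):
        return False
    if file_checksum(yesterday) != file_checksum(today):
        return False
    yesterday.unlink()
    return True
```

Only the older file is removed, so there is always at least one copy of the unchanged content on disk.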

You can find the latest version of the website-backup.py script on my Linux script page. Here is the direct link to today’s version.

Archive commands

  • Extract .tar.gz file:
    tar xvzf ./file.tar.gz
    
  • Extract only one subdirectory and all its sub-content:
    tar -xvzf my-archive.tar.gz --wildcards "./directory/subdirectory*"
    
  • Create a .tar.gz file:
    tar cvzf file.tar.gz ./subfolder
    
  • Extract the ./path/in/archive* subfolder content from all .tar.bz2 archives available in the current folder. Place the extracted content of each archive in a folder prefixed with the content- string:
    for ARCHIVE in *.tar.bz2; do DEST_FOLDER="content-${ARCHIVE%.tar.bz2}"; mkdir -p "$DEST_FOLDER"; tar -C "$DEST_FOLDER" -xvjf "$ARCHIVE" --wildcards "./path/in/archive*"; done
    
  • Extract all .gz files in the current folder:
    gunzip ./*.gz
    
  • Extract .tar.bz2 file:
    tar xvjf ./file.tar.bz2
    
  • Check a .bz2 file integrity:
    bzip2 --test ./file.bz2
    
  • Create a .zip archive of the current directory, including all subdirectories (the ./* glob skips hidden files at the top level):
    zip -r archive.zip ./*
    
  • Create a 7-Zip archive (thanks to p7zip) of a folder, including all sub-directories:
    7za a archive.7z ./folder
    
  • Do the same as above, but split the archive into 50 MiB volumes:
    7za a -v50m archive.7z ./folder
    
  • Convert .tar.gz file to .tar.bz2 file:
    gzip -dc archive.tar.gz | bzip2 > archive.tar.bz2
    
  • Extract content from self-extracting shell archives:
    unshar archive.sh
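
For use inside a Python script, the batch extraction loop from the list above can be approximated with the standard tarfile module. This is a sketch under assumptions: the function names are made up for illustration, and the prefix match ignores a leading ./ so it works whether or not the archive stores one.

```python
import tarfile
from pathlib import Path


def _strip_dot_slash(name):
    """Treat './path' and 'path' the same way."""
    return name[2:] if name.startswith("./") else name


def extract_subfolder(archive, prefix, dest_root="."):
    """Extract members under prefix into <dest_root>/content-<archive base name>."""
    archive = Path(archive)
    dest = Path(dest_root) / ("content-" + archive.name[: -len(".tar.bz2")])
    dest.mkdir(parents=True, exist_ok=True)
    wanted_prefix = _strip_dot_slash(prefix)
    with tarfile.open(archive, "r:bz2") as tar:
        wanted = [m for m in tar.getmembers()
                  if _strip_dot_slash(m.name).startswith(wanted_prefix)]
        # Use the safer "data" extraction filter where this Python version offers it.
        extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest, members=wanted, **extra)
    return dest, [m.name for m in wanted]


def extract_from_all(folder, prefix):
    """Apply extract_subfolder to every .tar.bz2 archive found in folder."""
    archives = sorted(Path(folder).glob("*.tar.bz2"))
    return [extract_subfolder(a, prefix, folder) for a in archives]
```

Calling extract_from_all(".", "./path/in/archive") mirrors the shell loop: one content- folder per archive, holding only the requested subfolder.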
    

Script to Automate FTP site Backup.

Based on yesterday’s experiments, I coded a little script today to automate the backup of several of my websites. This script uses lftp to mirror files from a remote host to your local machine. Then it creates a bzip2 archive.

You can download the script here. To make it work, you need Python on your system. To configure it, edit the ftpsite_list Python list at the beginning of the file.
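
As an illustration only, the configuration and the mirroring step might be sketched as follows. The real structure of ftpsite_list is not shown in this post, so the dictionary keys, the example host and the helper function below are all hypothetical. The lftp options used (-c, open -u, mirror --delete) are standard lftp features.

```python
# Hypothetical layout; the real ftpsite_list in the script may use different keys.
ftpsite_list = [
    {
        "host": "ftp.example.com",
        "user": "alice",
        "password": "secret",
        "remote_dir": "/www",
        "local_dir": "backups/example.com",
    },
]


def lftp_mirror_command(site):
    """Build the lftp invocation that mirrors one remote directory locally.

    Assumes plain values without spaces, commas, semicolons or quotes,
    since nothing is escaped here.
    """
    script = "open -u {user},{password} {host}; mirror --delete {remote} {local}".format(
        user=site["user"],
        password=site["password"],
        host=site["host"],
        remote=site["remote_dir"],
        local=site["local_dir"],
    )
    # -c makes lftp run the given commands and then exit.
    return ["lftp", "-c", script]
```

Each resulting list can be handed to subprocess.run(cmd, check=True). Once the mirror is up to date, the local directory can be packed with tarfile.open(path, "w:bz2").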