Easy Mirroring Without RAID: the Poor Man’s Disk Array

This howto explains how to use rsync to build a data mirroring mechanism on a local machine with two hard drives, à la RAID 1, but without RAID 1 (!).

I had planned to set up a RAID 5 array using three 120 GB hard drives in USB enclosures. Unfortunately the project stalled due to instability in early 2.6.x kernels (I have heard that 2.6.12 and later are now usable for “RAID over USB”).

Because I urgently needed reliable storage (and because I didn’t want to waste time compiling and fine-tuning kernels), I decided to do it with a traditional IDE host. So I plugged two 120 GB hard drives into my machine as master devices, one on each IDE channel.

OpenBrick NG and RAID-1-like setup

Then I made one big XFS partition on each drive and updated my /etc/fstab:

/dev/sda1 /                auto  noatime   1 1
/dev/hda1 /mnt/hd1         xfs   defaults  1 2
/dev/hdc1 /mnt/hd1_mirror  xfs   defaults  1 2

At this point I should explain that my machine is an OpenBrick NG, and that my whole Linux system is installed on a 512 MB USB 2.0 thumb drive (/dev/sda1 in the fstab). That explains why my two IDE channels are free.

The idea is now to use /mnt/hd1 to store and work on my data, then rsync that drive to its alter ego (/mnt/hd1_mirror) every night. To do that, I just added the following command to a cron entry:

rsync -a --delete --delete-excluded --delete-after /mnt/hd1/ /mnt/hd1_mirror/

And voilà!
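One risk with --delete is worth guarding against: if /mnt/hd1 fails to mount, its mount point is just an empty directory, and the nightly run would empty the mirror as well. Below is a minimal sketch of a wrapper that refuses to run in that case. The function name mirror_drive and the RSYNC variable are placeholders of mine, not part of rsync, and the emptiness test is only a heuristic.

```shell
#!/bin/sh
# Sketch of a guarded nightly mirror (names are illustrative).
# RSYNC can be overridden, e.g. to test the script without touching disks.
RSYNC="${RSYNC:-rsync}"

# mirror_drive SRC DST: mirror SRC onto DST, refusing to run if either
# directory is missing or if SRC is empty (a likely sign that the source
# disk is not mounted).
mirror_drive() {
    src="$1"
    dst="$2"
    if [ ! -d "$src" ] || [ ! -d "$dst" ]; then
        echo "mirror_drive: missing directory, nothing done" >&2
        return 1
    fi
    # An empty source combined with --delete would wipe the mirror.
    if [ -z "$(ls -A "$src")" ]; then
        echo "mirror_drive: $src is empty, refusing to mirror" >&2
        return 1
    fi
    # Same options as the command above.
    "$RSYNC" -a --delete --delete-excluded --delete-after "$src/" "$dst/"
}
```

In the script called by cron, the last line would then be: mirror_drive /mnt/hd1 /mnt/hd1_mirror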

As you can guess, this solution is far from perfect and has major drawbacks compared to RAID 1:

  • No immediate backup: the mirrored data can be up to one day old;
  • Seek time is not reduced by half;
  • Transfer rate is not doubled.

Oh, and by the way, be careful not to write files to /mnt/hd1_mirror/, because they will be deleted each night during the mirroring process.
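To see in advance which files on the mirror would be removed, a dry run can help. This is only a sketch: the function name preview_deletions and the RSYNC variable are placeholders of mine. It relies on rsync's -n (--dry-run) and -i (--itemize-changes) options, the latter marking each removal with the string *deleting. It uses plain --delete, which should behave the same here since no exclude patterns are involved.

```shell
#!/bin/sh
# Sketch: list what the nightly mirror would delete, without changing anything.
RSYNC="${RSYNC:-rsync}"

# preview_deletions SRC DST: print the paths that exist only in DST and
# would therefore be removed by the real run.
preview_deletions() {
    # -n makes no changes; -i prefixes each removal with "*deleting".
    "$RSYNC" -a -n -i --delete "$1/" "$2/" \
        | grep '^\*deleting' \
        | sed 's/^\*deleting *//'
}
```

For example: preview_deletions /mnt/hd1 /mnt/hd1_mirror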

Remote Backup with rsync

This little article describes how to set up an automatic backup procedure to a remote machine with the rsync tool.

Prerequisites

  • A distant server where the backup will be stored (homeserver.com in this case),
  • A user account on this server (mine was kevin),
  • An SSH daemon running on the server that allows this user to log in.

Setup rsync

First, install rsync on both the client and the server. I used urpmi (Mandriva’s package manager; use your distribution’s equivalent):

urpmi rsync

Synchronization

Then, to synchronize from the local machine to the distant server, run:

rsync -avz -e ssh /home/client_user/Documents kevin@homeserver.com:/mnt/raid2/

where:

  • /home/client_user/Documents is the local folder we want to save (located in the home folder of the client user client_user),
  • homeserver.com is the distant server name (could be an IP address),
  • kevin is the distant user,
  • /mnt/raid2/ is the distant folder where we want to save the local one.
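Note that the source path has no trailing slash, so rsync copies the Documents folder itself and the files end up under /mnt/raid2/Documents/. With a trailing slash, only the contents would be copied, directly into /mnt/raid2/. The sketch below illustrates the difference using throwaway local directories instead of the remote server; the function name demo_trailing_slash and the directory names are made up for the demonstration.

```shell
#!/bin/sh
# Sketch: show how a trailing slash on the source changes the result.
demo_trailing_slash() {
    tmp=$(mktemp -d) || return 1
    mkdir -p "$tmp/Documents" "$tmp/without_slash" "$tmp/with_slash"
    echo "hello" > "$tmp/Documents/note.txt"

    # No trailing slash: the Documents directory itself is copied.
    rsync -a "$tmp/Documents" "$tmp/without_slash/"
    # Trailing slash: only the contents of Documents are copied.
    rsync -a "$tmp/Documents/" "$tmp/with_slash/"

    # List the resulting files, relative to the temporary directory.
    (cd "$tmp" && find without_slash with_slash -type f | sort)
    rm -rf "$tmp"
}

# Run the demonstration only if rsync is installed.
if command -v rsync >/dev/null 2>&1; then
    demo_trailing_slash
fi
```

The first copy lands at without_slash/Documents/note.txt, the second at with_slash/note.txt.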

Scheduled synchronization with cron

First, create a pair of cryptographic keys (public, private):

ssh-keygen -t rsa

Then, from the local machine as user client_user, register your public key on the distant server:

ssh-copy-id -i ~/.ssh/id_rsa.pub kevin@homeserver.com

If the distant machine’s SSH server listens on a port other than the default 22, let’s say 222, the following command emulates ssh-copy-id (older versions of ssh-copy-id have no port parameter; recent OpenSSH versions accept -p):

cat ~/.ssh/id_rsa.pub | ssh -p 222 kevin@homeserver.com "cat >> ~/.ssh/authorized_keys"
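Before relying on cron, it may be worth checking that the key login really works without any prompt, since cron cannot answer one. A minimal sketch, assuming the same user, host and port as above; the function name check_key_login and the SSH variable are placeholders of mine. BatchMode=yes is a standard ssh option that makes ssh fail instead of asking for a password or passphrase.

```shell
#!/bin/sh
# Sketch: verify that key-based login works non-interactively.
SSH="${SSH:-ssh}"

# check_key_login USER@HOST [PORT]
check_key_login() {
    target="$1"
    port="${2:-22}"
    # Run the harmless remote command "true"; fail rather than prompt.
    if "$SSH" -o BatchMode=yes -p "$port" "$target" true; then
        echo "key login to $target works"
    else
        echo "key login to $target failed" >&2
        return 1
    fi
}
```

For example: check_key_login kevin@homeserver.com 222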

Create a script named rsync_data_backup.sh that contains the command you used previously to synchronize your data (if the server uses a non-default SSH port such as 222, use -e "ssh -p 222" instead of -e ssh):

rsync -avz -e ssh /home/client_user/Documents kevin@homeserver.com:/mnt/raid2/

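To run this script from cron, the (insecure) solution I found is to create the key without a passphrase, because cron cannot type one in. The entry below uses the /etc/crontab format, which includes a user field (client_user here); it could be something like: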

15 13 * * 1-5 client_user /home/client_user/rsync_data_backup.sh > /home/client_user/rsync_data_backup.log

This crontab entry will automatically synchronize our data from Monday to Friday, at 13:15.
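The redirection in that entry overwrites the log on every run. If you would rather keep a history, the script itself can append to the log with timestamps. Here is a rough sketch of such a variant of rsync_data_backup.sh; the function name run_backup and the overridable variables RSYNC, SRC, DEST and LOG are my own additions, with defaults taken from the example above.

```shell
#!/bin/sh
# Sketch of rsync_data_backup.sh with a timestamped, appended log.
RSYNC="${RSYNC:-rsync}"
SRC="${SRC:-/home/client_user/Documents}"
DEST="${DEST:-kevin@homeserver.com:/mnt/raid2/}"
LOG="${LOG:-/home/client_user/rsync_data_backup.log}"

run_backup() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') backup started" >> "$LOG"
    status=0
    # Append both normal output and error messages to the log.
    "$RSYNC" -avz -e ssh "$SRC" "$DEST" >> "$LOG" 2>&1 || status=$?
    echo "$(date '+%Y-%m-%d %H:%M:%S') backup finished with status $status" >> "$LOG"
    return "$status"
}
```

The real script would end with a call to run_backup, and the cron entry would then no longer need its own redirection.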