Using Munin to monitor a Debian Squeeze server

Once again, here is a tutorial describing the recipe I use to cook up Munin on Debian Squeeze.

As usual, let’s start by installing the main Munin package:

$ aptitude install munin

FYI, the version that aptitude chose to install was Munin 1.4.5. With its default configuration, Munin writes its graphs and HTML pages to /var/cache/munin/www. Now we need to serve these pages via a web server.

As I have wanted to play with nginx for a long time, I will take this opportunity to use it to serve Munin’s content. The default version shipped with Squeeze is quite old, so we’ll get the latest version from the Dotdeb repository:

$ echo "deb http://packages.dotdeb.org squeeze all" > /etc/apt/sources.list.d/squeeze-dotdeb.list
$ aptitude update
$ aptitude install nginx

And if you don’t want to get those error messages about untrusted packages, don’t forget to add Dotdeb’s keys to your keyring.

We can now test that nginx is working by starting it up and then fetching the default page:

$ /etc/init.d/nginx start
$ wget --output-document=- http://localhost | grep "Welcome to nginx"

Then we’ll disable the default nginx config and create a new one for Munin:

$ rm /etc/nginx/sites-enabled/default
$ touch /etc/nginx/sites-available/munin

In the latter, we put this minimal configuration:

server {
  server_name munin.example.com;
  root /var/cache/munin/www/;
  location / {
    index index.html;
    access_log off;
  }
}

Now we have to activate it before restarting nginx:

$ ln -s  /etc/nginx/sites-available/munin /etc/nginx/sites-enabled/munin
$ /etc/init.d/nginx restart

Now we are free to point our browser to the http://munin.example.com URL to get our graphs.

You’ll see that by default, Munin refers to your machine as localhost.localdomain. It’s time to tweak Munin a little to get nice reports:

$ sed -i 's/\[localhost\.localdomain\]/\[munin\.example\.com\]/g' /etc/munin/munin.conf

By default Munin activates a lot of great graphs. But I always find that some crucial ones are missing. Let’s add some more monitoring scripts:

$ aptitude install munin-plugins-extra

Here is a collection of general purpose graphs I automatically add to Munin:

$ ln -s /usr/share/munin/plugins/df_abs  /etc/munin/plugins/
$ ln -s /usr/share/munin/plugins/netstat /etc/munin/plugins/
$ echo "[netstat]
user root
" > /etc/munin/plugin-conf.d/netstat

It’s also good to have a clue about your connectivity to the rest of the world:

$ ln -s /usr/share/munin/plugins/ping_  /etc/munin/plugins/ping_google.com
$ ln -s /usr/share/munin/plugins/ping_  /etc/munin/plugins/ping_ovh.fr
$ ln -s /usr/share/munin/plugins/ping_  /etc/munin/plugins/ping_example.com
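
All the plugin activations in this article follow the same pattern: a symlink from /usr/share/munin/plugins to /etc/munin/plugins, with a suffixed link name for wildcard plugins such as ping_. Here is a minimal sketch of a helper wrapping that pattern. The enable_plugin name and the PLUGIN_SRC and PLUGIN_DST variables are my own inventions for illustration, not part of Munin.

```shell
#!/bin/sh
# Hypothetical helper: link a stock plugin into the active plugin directory.
# PLUGIN_SRC and PLUGIN_DST default to the Debian paths used in this article.
PLUGIN_SRC="${PLUGIN_SRC:-/usr/share/munin/plugins}"
PLUGIN_DST="${PLUGIN_DST:-/etc/munin/plugins}"

enable_plugin() {
    # $1: plugin file name in PLUGIN_SRC (e.g. "df_abs" or "ping_")
    # $2: optional link name, needed for wildcard plugins (e.g. "ping_ovh.fr")
    plugin="$1"
    link="${2:-$1}"
    if [ ! -f "$PLUGIN_SRC/$plugin" ]; then
        echo "no such plugin: $plugin" >&2
        return 1
    fi
    # -f replaces an existing link, -n avoids following a link to a directory
    ln -sfn "$PLUGIN_SRC/$plugin" "$PLUGIN_DST/$link"
}
```

With it, the first ping link above becomes enable_plugin ping_ ping_google.com, and a typo in the plugin name is reported instead of silently creating a dangling link.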

I also like to have insight about my automated backups:

$ ln -s /usr/share/munin/plugins/ps_ /etc/munin/plugins/ps_duplicity
$ ln -s /usr/share/munin/plugins/ps_ /etc/munin/plugins/ps_sshd

Monitoring temperatures, voltages and other hardware metrics is a must, unless your machine is a virtual server :) :

$ ln -s /usr/share/munin/plugins/cpuspeed         /etc/munin/plugins/
$ ln -s /usr/share/munin/plugins/acpi             /etc/munin/plugins/
$ ln -s /usr/share/munin/plugins/hddtemp_smartctl /etc/munin/plugins/
$ aptitude install i2c-tools lm-sensors
$ sensors-detect
$ ln -s /usr/share/munin/plugins/sensors_ /etc/munin/plugins/sensors_temp
$ ln -s /usr/share/munin/plugins/sensors_ /etc/munin/plugins/sensors_volt

I sometimes have a Fail2Ban daemon running on a server, so it’s a good idea to monitor it:

$ ln -s /usr/share/munin/plugins/fail2ban /etc/munin/plugins/
$ echo "[fail2ban*]
user root
" > /etc/munin/plugin-conf.d/fail2ban

If you have a UPS, it’s good to monitor it too. Here is the setup for a UPS attached to the local system with the MGE-Ellipse750 ID (as defined in your /etc/nut/ups.conf file):

$ ln -s /usr/share/munin/plugins/nutups_   /etc/munin/plugins/nutups_MGE-Ellipse750_voltages
$ ln -s /usr/share/munin/plugins/nutups_   /etc/munin/plugins/nutups_MGE-Ellipse750_charge
$ ln -s /usr/share/munin/plugins/nutups_   /etc/munin/plugins/nutups_MGE-Ellipse750_freq
$ ln -s /usr/share/munin/plugins/nutups_   /etc/munin/plugins/nutups_MGE-Ellipse750_current
$ ln -s /usr/share/munin/plugins/nut_misc  /etc/munin/plugins/
$ ln -s /usr/share/munin/plugins/nut_volts /etc/munin/plugins/
$ echo "[nut*]
user root

[nut_*]
env.upsname MGE-Ellipse750@localhost
" > /etc/munin/plugin-conf.d/nut

And if you have a MySQL server running on the machine, it’s a good idea to collect stats:

$ ln -s /usr/share/munin/plugins/mysql_threads     /etc/munin/plugins/
$ ln -s /usr/share/munin/plugins/mysql_slowqueries /etc/munin/plugins/
$ ln -s /usr/share/munin/plugins/mysql_queries     /etc/munin/plugins/
$ ln -s /usr/share/munin/plugins/mysql_bytes       /etc/munin/plugins/

I also use some other Munin plugins coming from Munin exchange:

$ wget http://exchange.munin-monitoring.org/plugins/mysql_size_all/version/1/download --output-document=/usr/share/munin/plugins/mysql_size_all
$ ln -s /usr/share/munin/plugins/mysql_size_all /etc/munin/plugins/

And here is how I monitor my RAID array:

$ wget http://exchange.munin-monitoring.org/plugins/raid/version/3/download --output-document=/usr/share/munin/plugins/raid
$ ln -s /usr/share/munin/plugins/raid /etc/munin/plugins/
$ echo "[raid]
user root
" > /etc/munin/plugin-conf.d/raid

Finally, it’s time to monitor nginx itself:

$ ln -s /usr/share/munin/plugins/nginx_status  /etc/munin/plugins/
$ ln -s /usr/share/munin/plugins/nginx_request /etc/munin/plugins/
$ echo "[nginx_*]
env.url http://localhost/nginx_status
" > /etc/munin/plugin-conf.d/nginx

The two scripts above have some Perl module dependencies:

$ aptitude install libio-all-lwp-perl

If you don’t install the libraries above, you’ll get this kind of error in /var/log/munin/munin-node.log:

2011/05/03-17:50:10 [2009] Error output from nginx_request:
2011/05/03-17:50:10 [2009]      Can't locate object method "new" via package "LWP::UserAgent" at /etc/munin/plugins/nginx_request line 106.
2011/05/03-17:50:10 [2009] Service 'nginx_request' exited with status 9/0.
2011/05/03-17:50:10 [2009] Error output from nginx_status:
2011/05/03-17:50:10 [2009]      Can't locate object method "new" via package "LWP::UserAgent" at /etc/munin/plugins/nginx_status line 109.
2011/05/03-17:50:10 [2009] Service 'nginx_status' exited with status 2/0.

But for this to work, we have to update the /etc/nginx/sites-enabled/munin file. Now it looks like this:

server {
  server_name munin.example.com;
  root /var/cache/munin/www/;
  # Restrict Munin access
  auth_basic "Restricted";
  auth_basic_user_file /etc/nginx/htpasswd;
  location / {
    index index.html;
    access_log off;
  }
}
server {
  allow 127.0.0.1;
  deny all;
  location /nginx_status {
    stub_status on;
    access_log off;
  }
}

Note that I’ve added simple HTTP authentication to the Munin web pages and restricted access to the nginx statistics to the local machine only.
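
Once nginx has been restarted with this configuration, it can help to check by hand that the status page answers before blaming the plugins. As a rough sketch, the helper below extracts the number of active connections from the text that stub_status returns. active_connections is a hypothetical name of mine, and it assumes the usual output whose first line reads "Active connections: N".

```shell
#!/bin/sh
# Extract the "Active connections" counter from stub_status output on stdin.
# Assumed input format (first line): "Active connections: 291"
active_connections() {
    awk '/^Active connections:/ { print $3 }'
}
```

On the server itself, something like wget --quiet --output-document=- http://localhost/nginx_status | active_connections should then print a single number.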

Finally, before restarting Munin and nginx, make sure all downloaded plugins are executable. This is important and always forgotten:

$ chmod -R 755 /usr/share/munin/plugins/
$ /etc/init.d/nginx restart
$ /etc/init.d/munin-node restart
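
Note that chmod -R 755 touches every file in the directory. If you would rather see which plugins are actually affected first, here is a small sketch that lists the files their owner cannot execute. The non_executable_plugins name is hypothetical, and the default path is simply the Debian one used throughout this article.

```shell
#!/bin/sh
# List regular files under a directory that lack the owner-execute bit.
# $1: directory to scan (defaults to the stock Munin plugin directory)
non_executable_plugins() {
    # -perm -100 matches files with the owner-execute bit set; "!" negates it
    find "${1:-/usr/share/munin/plugins}" -type f ! -perm -100
}
```

Any path it prints is a plugin that munin-node would fail to run until it gets a chmod 755.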

Heroic journey to RAID-5 data recovery

Last week there was a power grid failure which broke my server’s RAID array. I have no UPS (as I’m a skinflint) and no automatic email alerts (because I’m too lazy to set them up). As a result, for 5 days, my 3-disk RAID-5 array was running on only 2 disks until I noticed the issue…

Using a combination of the following commands, I soon became aware of the gravity of the situation:

cat /proc/mdstat
mdadm --examine /dev/sda1
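
In hindsight, the five days of degraded operation could have been caught by a small check run from cron. mdadm has a proper monitor mode with mail support for that, but as an illustration, here is a minimal sketch that reads /proc/mdstat and prints the arrays with a missing member. The degraded_arrays name is mine, and the parsing assumes the usual mdstat layout in which the member map looks like [UU_].

```shell
#!/bin/sh
# Print the name of every md array whose member map contains an underscore.
# $1: file to read (defaults to /proc/mdstat)
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }   # remember which array is being described
        # The member map looks like [UU_]; an underscore marks a missing member.
        name != "" && /\[[U_]*_[U_]*\]/ { print name; name = "" }
    ' "${1:-/proc/mdstat}"
}
```

Run from cron, any output means an array needs attention; cron mails whatever a job prints, provided the machine is able to deliver mail.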

My /dev/sda1 disk had been kicked out of the array, so I did the right thing, which was to rebuild the array:

mdadm /dev/md0 -a /dev/sda1

Then, in an unlucky combination of cosmic ray bombardment, spooky action at a distance and astrological misalignment, halfway through the rebuild (which can take up to 5 hours), another disk failed! It was late, I was tired and utterly worried about losing 1.5 TB of precious data. In such bad shape, I was afraid of making things worse. So I decided to shut down the server and sleep on the problem.

The next day I tried to boot my server, only to find it (surprise!) stuck in the middle of the boot process, with the famous message:

hit control-D to continue or give root password to fix manually

This is “normal”, as my server tried to mount the ext3 filesystem from the /dev/md0 device that mdadm had just assembled. Of course md0, although assembled and visible to the system, was not running because only one disk out of three was in a clean state.

I will skip the epic subplot in which I wasted days searching for a working keyboard, but I’ll let you imagine how such adventures made my week…

Eventually, I was able to analyze the situation in detail. My first reflex? Check that the disks were not physically dead:

fdisk -l /dev/sda
fdisk -l /dev/sdb
fdisk -l /dev/sdc

The “Linux raid” partitions (type code “fd”) were still there. Good. I assumed from this that the disks were not physically damaged. Maybe I should have looked at the S.M.A.R.T. data and statistics (via smartmontools). But remember, I’m lazy (and a bit crazy).

The next step was to get information about the RAID array itself using:

mdadm --detail /dev/md0

which printed the status table below (probably inaccurate, as I reconstructed it afterwards):

Number   Major   Minor   RaidDevice State
   0       0        0        0      removed
   1       0        0        1      faulty removed
   2       8       33        2      active sync   /dev/sdc1
   3       8       17        3      spare

What does this table tell us?

  • The array is up, but not running. One of its devices (sdc1) is clean and active, but that’s not enough to get a working RAID-5.
  • My first attempt to rebuild the array led to an unexpected result: it added sda1 as a spare device (in slot #3).
  • It confirms that sdb1 unexpectedly failed and is now in a bad state (“faulty removed”).

Then I stopped the array and fearlessly tried to (re)assemble it using 3 different methods:

mdadm -S /dev/md0
mdadm -A /dev/md0
mdadm --assemble /dev/md0 --verbose /dev/sd[abc]1
mdadm --assemble --force --scan /dev/md0 --verbose

It always failed with messages like:

mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
mdadm: /dev/md0 assembled from 1 drives and 1 spare - not enough to start the array.

So I examined each drive from mdadm‘s point of view:

mdadm -E /dev/sda1
mdadm -E /dev/sdb1
mdadm -E /dev/sdc1
mdadm -E /dev/sd[abc]1 | grep Event

The last command compares the “Events” counter of all devices. It prints something like:

Events : 0.53120
Events : 0.53108
Events : 0.53120

which indicates that sda1 and sdc1 are more or less in sync (they share the same number) while sdb1 is “late” (lower number).
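
The same comparison can be scripted. The sketch below pairs each device with its counter and lists the devices that lag behind. event_counts and stale_devices are names I made up, and the parsing assumes that mdadm -E prints a "/dev/xxx:" header line before each device’s fields and that the part before the dot (0 in the output above) is identical on every device.

```shell
#!/bin/sh
# Print "device events" pairs from `mdadm -E` output read on stdin.
event_counts() {
    awk '
        /^\/dev\/.*:$/ { dev = $1; sub(/:$/, "", dev) }   # header, e.g. "/dev/sda1:"
        $1 == "Events" && $2 == ":" { print dev, $3 }
    '
}

# Print the devices whose event counter is lower than the highest one seen.
stale_devices() {
    event_counts | awk '
        {
            n = $2
            sub(/^.*\./, "", n)          # "0.53120" becomes "53120"
            count[$1] = n + 0
            if (n + 0 > max) max = n + 0
        }
        END { for (d in count) if (count[d] < max) print d }
    '
}
```

With the counters shown above, mdadm -E /dev/sd[abc]1 | stale_devices would list /dev/sdb1 only.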

That gave me the idea of recreating the RAID array without sdb1, relying only on sda1 and sdc1, using the “magic” (hence dangerous) --assume-clean option. It doesn’t rebuild, erase or initialize the data on the members: it writes fresh superblocks and takes the existing content “as is”, which only works if the RAID level, device order and chunk size match the original array. Here is the command:

mdadm --create /dev/md0 --assume-clean --level=5 --verbose --raid-devices=3 /dev/sda1 missing /dev/sdc1

And it worked! :D

I checked the filesystem on md0 and mounted it:

fsck.ext3 -v /dev/md0
mount /dev/md0

I updated my mdadm configuration before rebooting my server:

mdadm --detail --scan >> /etc/mdadm/mdadm.conf
vi /etc/mdadm/mdadm.conf
reboot

But history repeats itself, and again the system hung during boot. Except this time I knew what was happening: the boot process detected the remaining sdb1 device as part of the old array (the one before the recreation I did above) and tried to run it. Remembering my post from last year, I zeroed the superblock of sdb1:

mdadm -S /dev/md0
mdadm --zero-superblock /dev/sdb1

A reboot proved me right: my md0 array was automagically assembled and mounted, in a degraded state:

localhost:~# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdb1[3] sda1[0] sdc1[2]
      1465143808 blocks level 5, 64k chunk, algorithm 2 [3/2] [U_U]

unused devices: <none>

I just had to re-add sdb1 to fill the available slot and update the mdadm configuration to get my array back to its initial state:

mdadm --manage /dev/md0 --add /dev/sdb1
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
vi /etc/mdadm/mdadm.conf

How-to Recover a RAID array after having Zero-ized Superblocks

Today mdadm sent me a mail warning that one of my hard drives (/dev/hdd1) had been ejected from my RAID-5 array. After some manipulations (no writes, just reads on the file system to get information) and reboots, I ended up with a file system in a strange state: the folder structure was totally messed up and lots of files had disappeared.

Assuming that this was caused by an inconsistent file index, I decided to reset the superblocks of the remaining physical disks:

mdadm --zero-superblock /dev/hdc1
mdadm --zero-superblock /dev/hdb1

I don’t know why I decided to do so, but it was the stupidest idea of the week. After such a violent treatment, my array refused to start:

[root@localhost ~]$ mdadm --assemble /dev/md0 --auto --scan --update=summaries --verbose
mdadm: looking for devices for /dev/md0
mdadm: no RAID superblock on /dev/hdc1
mdadm: /dev/hdc1 has wrong raid level.
mdadm: no RAID superblock on /dev/hdb1
mdadm: /dev/hdb1 has wrong raid level.
mdadm: no devices found for /dev/md0

At this moment I was sure that all my data assets were lost. I was desperate. My only alternative was to ask Google. So I did.

I spent several minutes browsing the web without much hope. I finally found someone in the same situation as mine (sorry, it’s in French) on the debian-user-french mailing list.

The solution was to recreate the RAID array. This sounds counter-intuitive: if we recreate a RAID array over an existing one, it will be erased! Right? Wrong! As explained on debian-user-french, mdadm is smart enough to “see” that the disks of the new array were members of a previous one. Knowing that, mdadm will do its best (i.e. if the parameters match the previous array configuration) to build the new array on top of the previous one in a non-destructive way, keeping the disks’ content.

So, here is how I finally recovered my RAID array:

[root@localhost ~]$ mdadm --create /dev/md0 --verbose --level=5 --raid-devices=3 /dev/hdc1 missing /dev/hdb1
mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 64K
mdadm: size set to 312568576K
mdadm: array /dev/md0 started.

Of course this doesn’t solve my initial problem with the /dev/md0 file system: it is still in an altered state. Maybe it’s too late to recover the data. But at least I have undone all of today’s mistakes, and the situation will not deteriorate until I power up my RAID! :)

Easy Mirroring Without RAID: the Poor Man’s Disk Array

This howto explains how to use rsync to build a data mirroring mechanism on a local machine with two hard drives, à la RAID 1, but without RAID 1 (!).

I planned to set up a RAID 5 array using 3 x 120 GB hard drives in USB enclosures. Unfortunately the project stalled due to instability in early 2.6.x kernels (I heard that 2.6.12 and later are now usable for “RAID over USB”).

Because I urgently needed reliable storage (and because I didn’t want to waste time compiling and fine-tuning kernels), I decided to do it with a traditional IDE host. So I plugged two 120 GB hard drives into my machine as master devices, one on each IDE channel.

Open Brick NG and RAID-1-like setup

Then I made a big XFS partition on each, and updated my /etc/fstab:

/dev/sda1 /                auto  noatime   1 1
/dev/hda1 /mnt/hd1         xfs   defaults  1 2
/dev/hdc1 /mnt/hd1_mirror  xfs   defaults  1 2

At this point I should explain that my machine is an OpenBrick NG, with a 512 MB USB 2.0 thumb drive (/dev/sda1 in the fstab) on which my whole Linux system is installed. That explains why my two IDE channels are free for use.

The idea is now to use /mnt/hd1 to store and manipulate my data, then rsync that drive to its alter ego (/mnt/hd1_mirror) every night. To do that, I’ve just added the following command to a cron entry:

rsync -a --delete --delete-excluded --delete-after /mnt/hd1/ /mnt/hd1_mirror/
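
One trap with this cron entry: if /mnt/hd1 fails to mount, rsync sees an empty source directory and --delete wipes the mirror. Here is a hedged sketch of a guard. is_mounted and mirror_if_mounted are hypothetical helper names, and the check assumes a Linux-style /proc/mounts table.

```shell
#!/bin/sh
# Succeed only if $1 appears as a mount point in the mounts table.
# $2: mounts table to read (defaults to /proc/mounts)
is_mounted() {
    awk -v target="$1" '$2 == target { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Run the nightly mirror only when both drives are really mounted.
mirror_if_mounted() {
    is_mounted /mnt/hd1 && is_mounted /mnt/hd1_mirror || return 1
    rsync -a --delete --delete-excluded --delete-after /mnt/hd1/ /mnt/hd1_mirror/
}
```

The cron entry would then call a script ending with mirror_if_mounted instead of calling rsync directly.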

And voilà!

As you might guess, this solution is far from perfect, and has major drawbacks compared to RAID 1:

  • No immediate backup: the backed-up data is up to one day old;
  • Seek time is not halved;
  • Transfer rate is not doubled.

Oh, and by the way, be careful not to write files to /mnt/hd1_mirror/ because they will be deleted each night during the mirroring process.

Building Reliable Storage with RAID 5 and LVM on Linux

This article explains how to build redundant, reliable storage from cheap consumer hardware. This is made possible by combining Linux, software RAID and the LVM logical volume manager.

Update: The initial goal was to use external USB enclosures to build a RAID array. In practice I never got convincing results, because USB disks are not as reliable as classic IDE hard drives. Let me explain: only a subset of IDE commands have an equivalent in the USB protocol, which limits the Linux kernel’s low-level access to external hard drives. That is why this article remains unfinished and why some parts below may seem disjointed.

To begin with, an explanation of RAID technology and its benefits would not go amiss.

I have two 120 GB hard drives, each of which I put in an external USB 2 enclosure. I then plug them into my OpenBrick NG, which has a 160 GB hard drive hosting the OS on its IDE0 port. The OS is Mandrakelinux 10.0, installed on the first 40 GB of the IDE disk, in ordinary partitions that will not be protected by RAID. I chose Mandrakelinux 10.0 because at the time Mandriva 2005 was not yet available, and Mandrake 10.1 has a buggy udev that does not create the RAID devices (so the RAID cannot be activated automatically at boot).

From now on, let’s assume the OS is installed, so that we can focus solely on configuring and starting the RAID.

Step 1: Format the partitions

So I have the following devices:

  • /dev/hda -> 160 GB hard drive (40 GB for the OS and 120 GB free)
  • /dev/sda -> external 120 GB hard drive #1
  • /dev/sdb -> external 120 GB hard drive #2

We want to create a RAID 5 array from 3 x 120 GB. Rather than making a single big 120 GB partition per disk, we will create three 40 GB partitions on each disk (3 x 3 x 40 GB = 3 x 120 GB). We will then build 3 RAID 5 units of 3 x 40 GB each and assemble them with LVM. The point of splitting our big partitions into smaller ones is to considerably reduce (by a factor of 3 in our case) the rebuild time of a RAID unit if one partition gets corrupted.
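
To make the arithmetic explicit: RAID 5 spends the equivalent of one member on parity, so an array of n members of size s offers (n - 1) x s of usable space. Here is a tiny sketch for the layout above; raid5_usable is a name of my own, chosen for illustration.

```shell
#!/bin/sh
# Usable capacity of one RAID 5 array: one member's worth of space holds parity.
# $1: number of member partitions, $2: size of each member (in GB)
raid5_usable() {
    echo $(( ($1 - 1) * $2 ))
}
```

raid5_usable 3 40 prints 80, so the three units aggregated together should provide about 240 GB out of the 360 GB of raw space.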

The partitions to create are of type Linux RAID. This can optionally be done with drakconf.

Step 2: Configuring the RAID array

We will use mdadm to manage our RAID.

Note: Starting with 10.1, the version of webmin shipped with Mandrake supports mdadm. To achieve the same result that way, you can draw on an article about setting up RAID via webmin.

Installing mdadm:

urpmi mdadm

Creating the arrays:

mdadm --create --verbose /dev/md0 --level=5 --raid-devices=3 /dev/hda2 /dev/sda2 /dev/sdb2
mdadm --create --verbose /dev/md1 --level=5 --raid-devices=3 /dev/hda3 /dev/sda1 /dev/sdb1
mdadm --create --verbose /dev/md2 --level=5 --raid-devices=3 /dev/hda4 /dev/sda3 /dev/sdb3

The default parameters are sufficient at creation time. For reference, the optimal parameters are:

  • Parity: left-symmetric
  • Persistent superblock
  • Chunk size: 32 KB or 64 KB (for our 40 GB partitions)

Let’s edit the /etc/mdadm.conf configuration file:

DEVICE          /dev/sda*
DEVICE          /dev/sdb*
DEVICE          /dev/hda2
DEVICE          /dev/hda3
DEVICE          /dev/hda4
ARRAY           /dev/md0 devices=/dev/hda2,/dev/sda2,/dev/sdb2
ARRAY           /dev/md1 devices=/dev/hda3,/dev/sda1,/dev/sdb1
ARRAY           /dev/md2 devices=/dev/hda4,/dev/sda3,/dev/sdb3

Before going any further, we have to wait for the arrays to finish building:

watch -n1 'cat /proc/mdstat'

In my case, this took between two and three hours for each RAID unit.
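
If you prefer a one-line summary to watching the whole file, the percentage can be pulled out of /proc/mdstat. This is only a sketch: rebuild_progress is a hypothetical name, and it assumes the kernel’s usual progress line of the form "recovery =  8.5% (...)".

```shell
#!/bin/sh
# Print "array percentage" for every array that is currently rebuilding.
# $1: file to read (defaults to /proc/mdstat)
rebuild_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }     # remember which array is being described
        /(recovery|resync) *=/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) {
                    value = $i
                    sub(/^=/, "", value)   # "=100.0%" has no space after "="
                    print name, value
                }
        }
    ' "${1:-/proc/mdstat}"
}
```

It prints nothing once all arrays are built, which makes it easy to use in a waiting loop.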

Note: with Mandrake 10.1, creating the RAID arrays would have triggered errors such as raidstart failed : /dev/md1: No such file or directory, which can be fixed by creating the devices manually:

mknod /dev/md0 b 9 0
mknod /dev/md1 b 9 1
mknod /dev/md2 b 9 2

These commands create the device nodes for the RAID units we need. Unfortunately they are not autodetected at boot, so with mdk 10.1 this would have to be redone every time the machine starts. That is a good reason not to use version 10.1.

Step 3: Aggregating the RAID arrays with LVM

I chose LVM to aggregate the RAID units so that I can resize my disk space flexibly and cope with whatever scenarios I may face in the future. It is entirely possible to do the same thing with linear RAID (see step “3-bis” below), but you then lose some flexibility at the partition level.

Installing LVM:

urpmi lvm2

You can list the disks present on the system with lvmdiskscan.

Next, create a physical volume (PV) for each RAID unit:

pvcreate /dev/md0
pvcreate /dev/md1
pvcreate /dev/md2

Now let’s create a volume group containing our three physical volumes:

vgcreate vg01 /dev/md0 /dev/md1 /dev/md2

(doesn’t work???)

Step 3-bis: Aggregating the arrays with linear RAID instead of LVM

If you want to use linear RAID rather than LVM, create a new RAID unit on top of the first three:

mdadm --create --verbose /dev/md3 --level=linear --raid-devices=3 /dev/md0 /dev/md1 /dev/md2

Then remember to update /etc/mdadm.conf:

DEVICE          /dev/sda*
DEVICE          /dev/sdb*
DEVICE          /dev/hda2
DEVICE          /dev/hda3
DEVICE          /dev/hda4
DEVICE          /dev/md0
DEVICE          /dev/md1
DEVICE          /dev/md2
ARRAY           /dev/md0 devices=/dev/hda2,/dev/sda2,/dev/sdb2
ARRAY           /dev/md1 devices=/dev/hda3,/dev/sda1,/dev/sdb1
ARRAY           /dev/md2 devices=/dev/hda4,/dev/sda3,/dev/sdb3
ARRAY           /dev/md3 devices=/dev/md0,/dev/md1,/dev/md2

Step 4: Creating the filesystem

Format as XFS:

mkfs.xfs -f /dev/md3

I chose XFS as the filesystem because it can be grown online, while the partition is mounted.

To mount everything:

mkdir -p /mnt/data
mount /dev/md3 /mnt/data

Finally, to mount it automatically when the server boots, add the following line to /etc/fstab:

/dev/md3 /mnt/data xfs defaults 0 0

System maintenance

  • Re-adding a partition to the array.

    If a partition is kicked out of a RAID unit (for example sda1 from md1), run:

    cat /proc/mdstat
    mdadm --examine /dev/sda1
    mdadm /dev/md1 -a /dev/sda1
    cat /proc/mdstat
    

    The first command shows that the RAID is degraded. The second examines the status of the disk that was kicked out of the array. The third hot-adds the partition back into the array. Finally, the last command shows the progress of the array rebuild (which can take quite a while).

  • Reassembling an array.

    The command looks like this:

    mdadm --stop /dev/md0
    mdadm --assemble /dev/md0
    

    Note that --assemble relies on the /etc/mdadm.conf file.

  • Creating a degraded RAID unit.

    The following command creates a RAID 5 unit over 3 hard drives, marking the first one as absent with the missing keyword:

    mdadm --create /dev/md0 --level=5 --raid-devices=3 missing /dev/hda1 /dev/sda1
    

Further reading on RAID 5 and LVM