How-to Recover a RAID array after having Zero-ized Superblocks

Today mdadm sent me an email warning that one of my hard drives (/dev/hdd1) had been ejected from my RAID-5 array. After some manipulations (no writes, just reads on the file system to gather information) and a few reboots, I ended up with a file system in a strange state: the folder structure was totally messed up and lots of files had disappeared.

Assuming that the problem was an inconsistent file index, I decided to reset the superblocks of the remaining physical disks:

mdadm --zero-superblock /dev/hdc1
mdadm --zero-superblock /dev/hdb1
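In hindsight, it would have been worth saving what the superblocks said before wiping them, since they hold the level, chunk size, layout and device order needed to rebuild the array later. Here is a minimal sketch of that step. The `run` helper and the `DRY_RUN` variable are illustrative names of my own, not mdadm features, and by default the script only prints the commands instead of running them:

```shell
# Sketch: record RAID metadata before any destructive step.
# DRY_RUN defaults to 1, so commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

save_superblock_info() {
    for dev in "$@"; do
        # --examine prints the metadata stored in the member's superblock
        # (RAID level, chunk size, layout, role of the device in the array).
        run mdadm --examine "$dev"
    done
}

save_superblock_info /dev/hdc1 /dev/hdb1
```

Saved as a script (the file name is up to you), it could be run for real with `DRY_RUN=0` and its output redirected to a file kept somewhere off the array.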

I don’t know why I decided to do so, but it was the stupidest idea of the week. After such a violent treatment, my array refused to start:

[root@localhost ~]$ mdadm --assemble /dev/md0 --auto --scan --update=summaries --verbose
mdadm: looking for devices for /dev/md0
mdadm: no RAID superblock on /dev/hdc1
mdadm: /dev/hdc1 has wrong raid level.
mdadm: no RAID superblock on /dev/hdb1
mdadm: /dev/hdb1 has wrong raid level.
mdadm: no devices found for /dev/md0

At this moment I was sure that all my data assets were lost. I was desperate. My only alternative was to ask Google. So I did.

I spent several minutes browsing the web without hope. I finally found someone in the same situation as mine (sorry, it is in French) on the debian-user-french mailing list.

The solution was to recreate the RAID array. This sounds counter-intuitive: if we create a RAID array on top of an existing one, the old one will be erased! Right? Wrong! As explained on debian-user-french, mdadm can tell when the disks of the new array were members of a previous one, and in any case --create only writes fresh superblocks; it does not wipe the data area. So if the parameters match the previous configuration (RAID level, number and order of devices, chunk size, layout), the new array sits on top of the old one in a non-destructive way and the disk contents are kept.

So, here is how I finally recovered my RAID array:

[root@localhost ~]$ mdadm --create /dev/md0 --verbose --level=5 --raid-devices=3 /dev/hdc1 missing /dev/hdb1
mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 64K
mdadm: size set to 312568576K
mdadm: array /dev/md0 started.
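The command above worked because the defaults mdadm picked happened to match my old array. Recent mdadm releases default to different values (for example 1.2 metadata and a 512K chunk), so it may be safer to spell every parameter out. The sketch below only prints such a command. `build_create_cmd` is a hypothetical helper name, and the 0.90 metadata version is my assumption for an array of that era; the other values come from the output above:

```shell
# Sketch: print an mdadm --create command with the geometry made explicit.
# Nothing is executed; the function only echoes the command line.
LEVEL=5
CHUNK_KB=64              # "chunk size defaults to 64K" in the output above
LAYOUT=left-symmetric    # "layout defaults to left-symmetric" in the output above
METADATA=0.90            # assumption: old default superblock format

build_create_cmd() {
    md=$1
    shift
    # After the shift, $# counts the member slots, "missing" included.
    echo "mdadm --create $md --verbose --level=$LEVEL --raid-devices=$#" \
         "--chunk=$CHUNK_KB --layout=$LAYOUT --metadata=$METADATA $*"
}

build_create_cmd /dev/md0 /dev/hdc1 missing /dev/hdb1
```

Keeping the word `missing` in the slot of the ejected disk matters: the array starts degraded, so mdadm has nothing to resync and does not write to the data area of the remaining disks.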

Of course this doesn’t solve my initial problem with the /dev/md0 file system: it is still in an altered state. Maybe it’s too late to recover the data. But at least I reverted all of today’s mistakes, and the situation should not deteriorate further until I power up my RAID again! :)
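For the next step, the idea is to look at the file system without writing anything to it. Here is a rough sketch, assuming an ext2/ext3-style file system on /dev/md0 (the post does not say which one I use) and a hypothetical mount point. As before, `run` and `DRY_RUN` are illustrative names and the commands are only printed by default:

```shell
# Sketch: inspect the recreated array without modifying it.
# DRY_RUN defaults to 1, so commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
MOUNT_POINT="${MOUNT_POINT:-/mnt/recovery}"   # hypothetical mount point

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

inspect_readonly() {
    md=$1
    run fsck -n "$md"                      # report problems, repair nothing
    run mount -o ro "$md" "$MOUNT_POINT"   # read-only mount to browse the files
}

inspect_readonly /dev/md0
```

If the check reports nonsense or the mount fails, the create parameters (device order, chunk size, layout) are probably wrong, and it is better to stop the array and try again than to let any repair tool write to it.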

34 thoughts on “How-to Recover a RAID array after having Zero-ized Superblocks”

  1. I have zeroed the superblocks and rebuilt a RAID 5 of four 2 TB drives twice now, replacing different drives in a different order each time. Interestingly, the original superblocks were recognized briefly whenever a drive with a ‘missing’ superblock was hot-unplugged and plugged back in: the superblock would then show perfect ‘clean’ md info! But the superblocks have now been zeroed twice: the first time the array was rebuilt out of its original order (by mistake), and the second time it was rebuilt in the order as I remembered it. To no avail.

    No recognized fs before the array rebuild, or after. All drives have always been reported as ‘clean’, even before the superblocks were zeroed.

    Any ideas would be great. Approx. 5 TB and 20 years of data are lost at the moment. Thank you for the great help so far!

  2. Thanks for posting your notes. I found my Fedora 16 x86_64 system unresponsive this morning and had to hard boot it.

    During bootup dracut reported that /dev/md1 couldn’t be found.

    After booting into rescue mode, the auto-discovery process for Linux partitions couldn’t locate the RAID 5 with its Logical Volumes, either.

    mdadm --create /dev/md1 --verbose --level=5 --raid-devices=3 /dev/sda1 /dev/sdb1 /dev/sdc1
    

    mdadm found that each of the devices was “part of a raid array”. I answered yes to the “Continue creating array” prompt, after which cat /proc/mdstat showed my RAID 5 array in recovery.

    Whoop.

    This has happened before on this system, usually over a weekend when the desktop environment sits idle for a couple of days. I suspect some kind of power saving is kicking in on the Fedora 16 desktop. Perhaps it’s trying to put the hard disks to sleep and something’s getting mucked up.

    I also noticed I had C-states enabled in the BIOS; I am disabling that for future testing.

    Thanks again for posting your notes.

  3. Pingback: Recover mdadm raid1 data after recreate new array over old

  4. Thanks for this post. I zeroed my superblocks to shrink my RAID-1 array in two steps, one hard drive after the other.
    After a reboot in rescue mode, I couldn’t assemble the array. But creating it with --create worked perfectly :) Now it is synchronizing :)
