Yesterday I assembled a new ZFS array: 7x 1TB SATA drives in a RAID-Z1 configuration. After having to replace a failed drive within the first hour of the array’s life, I decided adding a hot spare would be a good idea. After hot-plugging it into the server, I checked dmesg and saw it had been detected as /dev/sdi. I added it with zpool add tank spare sdi, ran zpool status tank, and everything looked right as rain.
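
For reference, the exchange looked something like this (the pool name tank matches my setup; the status output is abridged and illustrative):

    $ zpool add tank spare sdi
    $ zpool status tank
      pool: tank
     state: ONLINE
    config:
            NAME        STATE     READ WRITE CKSUM
            tank        ONLINE       0     0     0
              raidz1-0  ONLINE       0     0     0
                sdb     ONLINE       0     0     0
                ...
            spares
              sdi       AVAIL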

After rebooting the machine, I once again checked the output of zpool status tank and was told that the entire array was unavailable! One drive, /dev/sdb, was unavailable, and the other seven had “corrupt metadata”. I was shocked by what I was seeing, but I was pretty sure of the cause. To test my hunch I shut down the system, removed the spare from the machine, started it back up, and removed the spare from the pool by running zpool remove tank /dev/sdi. (You can’t use the shorthand zpool remove tank sdi here; because /dev/sdi no longer exists, you have to be explicit about which drive to remove.) After removing the spare, everything worked as before and no data errors were reported.
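
For anyone hitting the same wall, the recovery boiled down to this (after shutting down and physically pulling the spare):

    $ zpool remove tank sdi          # fails: sdi no longer exists under /dev
    $ zpool remove tank /dev/sdi     # works: matches the path stored in the pool config
    $ zpool status tank              # pool back online, no data errors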

The problem arose from the fact that the new drive was being detected by the kernel before the others. That made the spare become sdb, what used to be sdb became sdc, and so on down the line. Since every drive came up under the wrong letter, the device paths recorded in the pool’s configuration no longer matched the disks they pointed at, so ZFS read each one as having corrupt metadata. The question becomes: how do we fix this and avoid it in the future?
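
Laid out, the shift looks like this (letters illustrative, assuming the pool originally sat on sdb through sdh):

    Before reboot:  sdb sdc sdd sde sdf sdg sdh  (pool)   sdi  (spare)
    After reboot:   sdc sdd sde sdf sdg sdh sdi  (pool)   sdb  (spare, detected first)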

The answer is quite simple. Anyone who has poked around a Linux system will know that udev maintains folders of symlinks under /dev/disk/ which are named using properties of the disks themselves. Of note is /dev/disk/by-id/, which uses the drive’s model and serial number. The trick is to specify those names when creating the array and adding spares. That doesn’t help my situation, though: I already have an array that uses the standard /dev/sdX nomenclature. This is where the wonder of ZFS comes in.
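
If you’ve never looked, the links are easy to inspect (the model and serial strings below are made up, but the naming pattern is real, and the output is trimmed to just the link names):

    $ ls -l /dev/disk/by-id/ | grep -v part
    ata-SAMSUNG_HD103UJ_S13PJ9XXXXXXX -> ../../sdb
    ata-SAMSUNG_HD103UJ_S13PJ9YYYYYYY -> ../../sdc
    ...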

ZFS has the ability to export and import storage pools so that they can be moved from server to server, and drive names are not expected to remain the same across such a move. Aha! We just need to export the pool with zpool export tank and then reimport it with zpool import -d /dev/disk/by-id tank. The -d flag tells ZFS to look for drives in the specified directory instead of /dev/. This rebuilds the pool configuration using the new drive identifiers, which you can confirm by running zpool status tank. Now no matter what order the drives are detected in, those links never change. Try it yourself: unplug and rearrange all your SATA cables and nothing will break.

To get back to installing my hot spare, I just ran zpool add tank spare /dev/disk/by-id/some-long-name-i-dont-remember. After that I added an hdparm -S 150 /dev/disk/by-id/some-long-name-i-dont-remember line to /etc/rc.local to spin down the idle disk, and that was it.
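
Put together, the whole fix is only a handful of commands (the pool name and the by-id link are placeholders for your own):

    $ zpool export tank
    $ zpool import -d /dev/disk/by-id tank
    $ zpool status tank      # drives now listed by model and serial
    $ zpool add tank spare /dev/disk/by-id/ata-MODEL_SERIAL

    # in /etc/rc.local, spin the spare down after 750 seconds (150 x 5s) of idle:
    hdparm -S 150 /dev/disk/by-id/ata-MODEL_SERIAL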