Migrating existing RAID1 volumes to bigger drives

Every once in a while hardware needs to be replaced. Sometimes boards, sometimes CPUs and sometimes drives. Drives are the most difficult to replace, as they hold all that precious data one does not want to lose.

I personally hate reinstalling stuff, so I try as much as possible to “migrate” my data around with the least effort (on my part) and the least downtime.

Preparations

For clarity, the system details are:

  • /dev/sda, /dev/sdb – are the two new 3TB drives
  • /dev/sdc, /dev/sdd – are another raid set which is not affected by this procedure
  • /dev/sde, /dev/sdf – are the two old 500GB drives, with 2 raid-1 arrays on them (md0 and md1)

Raid volumes:

  • /dev/md0 (raid 1) made up of sde1 and sdf1, to be migrated to sda1 and sdb1
  • /dev/md1 (raid 1) made up of sde2 and sdf2, to be migrated to sda2 and sdb2
  • the other arrays are not affected by the transition

You can get the drive designations to line up like this by re-arranging the cables/drives/ports before starting the procedure (Linux RAID is smart enough to recognise its drives no matter which ports/connectors they’re on).

This requires only one reboot, so downtime should be minimal. The rest of the procedure is done on the live system without further downtime (not even when removing the old drives, if your board/controller supports hotplug).

When making changes to important data and/or on a live system, ALWAYS TRIPLE CHECK targets before performing destructive operations. And ALWAYS HAVE BACKUPS. Otherwise you can lose everything within seconds.

Create the new partition table on one of the new bigger drives. You can use either fdisk or gparted or any other tool you like.

Since my drives are larger than 2TB (and I’ll be using a partition that’s also larger than 2TB) I had to use the new GPT partition table format.
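
If you prefer doing it from the command line, a sketch along these lines creates a GPT label and two RAID partitions with parted (the sizes and the two-partition split are only an illustration and an assumption on my part; make each new partition at least as large as the one it replaces):
parted /dev/sda mklabel gpt
parted -a optimal /dev/sda mkpart primary 1MiB 100GiB
parted -a optimal /dev/sda mkpart primary 100GiB 100%
parted /dev/sda set 1 raid on
parted /dev/sda set 2 raid on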

[screenshot: partitions-3tb]

To quickly clone the partition table to the second new drive, with GPT partition tables use
sgdisk --backup=table /dev/sda
sgdisk --load-backup=table /dev/sdb
sgdisk -G /dev/sdb

where sda is the source and sdb is the destination. The last command randomizes the disk and partition GUIDs on the second disk so they don’t collide with the originals. sgdisk is not yet available in most repositories but can be installed straight from source. (Edit: As of 2015.07.13, the download links on the official site are non-functional; you can find the correct repository here).
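
To double-check that the clone took, you can print both partition tables and compare them:
sgdisk -p /dev/sda
sgdisk -p /dev/sdb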

For MS-DOS (MBR) partition tables,
sfdisk -d /dev/sda | sfdisk /dev/sdb
will work just fine (sda is source, sdb is destination).

Changing the drives in the arrays

First off, disable the write-intent bitmap on both arrays (if enabled).
mdadm --grow /dev/md0 --bitmap=none
mdadm --grow /dev/md1 --bitmap=none
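
You can confirm the bitmaps are gone by checking /proc/mdstat; the “bitmap:” lines for md0 and md1 should have disappeared:
grep bitmap /proc/mdstat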

Now it’s time to add the partitions on the new drives to the arrays:
mdadm --add /dev/md0 /dev/sda1
mdadm --add /dev/md0 /dev/sdb1
mdadm --grow --raid-devices=4 /dev/md0

Then the second array
mdadm --add /dev/md1 /dev/sda2
mdadm --add /dev/md1 /dev/sdb2
mdadm --grow --raid-devices=4 /dev/md1
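
At this point each array should list 4 raid devices (the two old partitions plus the two new ones, with the new ones rebuilding). To confirm:
mdadm --detail /dev/md0
mdadm --detail /dev/md1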

Now we wait for the data to sync (this will take minutes to hours depending on size). The sync status can be monitored with
watch cat /proc/mdstat

If you notice the sync speed is too low (and you expect the disks to be capable of more speed), you can check the sync limits that are in place with:
sysctl dev.raid.speed_limit_min
sysctl dev.raid.speed_limit_max

You could, for example, increase the minimum limit with
sysctl -w dev.raid.speed_limit_min=value
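
The limits are expressed in KB/s per device; for example, to ask for at least roughly 50 MB/s (an arbitrary figure, tune it to what your disks can sustain):
sysctl -w dev.raid.speed_limit_min=50000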

A couple of hours later, when the sync is complete, it’s time to part with the old disks and remove them from the arrays (but don’t be too hasty in throwing them away; they hold a backup of all your data, after all).

Continue by making the new disks bootable.
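
How exactly depends on your bootloader and firmware, so treat this as a sketch: with GRUB 2 on a BIOS machine it usually amounts to installing GRUB onto both new drives (note that BIOS booting from GPT disks also needs a small bios_grub partition on each drive):
grub-install /dev/sda
grub-install /dev/sdb
update-grub
(update-grub is the Debian/Ubuntu wrapper; elsewhere use grub-mkconfig -o /boot/grub/grub.cfg.)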

Then soft-fail and remove the old disks from the arrays:
mdadm --fail /dev/md0 /dev/sde1
mdadm --remove /dev/md0 /dev/sde1
mdadm --fail /dev/md0 /dev/sdf1
mdadm --remove /dev/md0 /dev/sdf1
mdadm --grow --raid-devices=2 /dev/md0

mdadm --fail /dev/md1 /dev/sde2
mdadm --remove /dev/md1 /dev/sde2
mdadm --fail /dev/md1 /dev/sdf2
mdadm --remove /dev/md1 /dev/sdf2
mdadm --grow --raid-devices=2 /dev/md1

Check /proc/mdstat again to make sure you removed the right disks and everything is correct and synced.

You can now disconnect the old drives (don’t erase them just yet, they still hold all your data from a few hours ago).

Enlarging the arrays to use up all the new available space

This step is a simplified version of the procedure from my previous enlarging a RAID1 array tutorial.

Start by checking that you can live-resize the arrays. You can do this with mdadm --examine /dev/sda1 (and /dev/sdb1).

If the available space is the same as the used space, you’ll need to use the offline procedure described in the tutorial linked above.
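
For example, to pull out just the relevant fields (these field names are what 1.x metadata superblocks print; older 0.90 metadata looks slightly different):
mdadm --examine /dev/sda1 | grep -E 'Avail Dev Size|Used Dev Size'
mdadm --examine /dev/sdb1 | grep -E 'Avail Dev Size|Used Dev Size'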

Enlarge the arrays using
mdadm --grow /dev/md0 --size=max
mdadm --grow /dev/md1 --size=max

Check that the arrays have been resized with --examine. The arrays will sync once more (wait for them to finish).

Then grow the filesystems to the available space
resize2fs /dev/md0
resize2fs /dev/md1
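
df should now report the larger filesystems (the grep is just a convenience):
df -h | grep /dev/md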

Re-enable the write-intent bitmap (it makes resyncs after an unclean shutdown or a re-add much faster)
mdadm --grow /dev/md0 --bitmap=internal
mdadm --grow /dev/md1 --bitmap=internal

Update mdadm.conf (if the file already contains ARRAY lines for these arrays, remove them first so the appended entries don’t end up as duplicates)
mdadm --examine --scan >> /etc/mdadm/mdadm.conf

Schedule a filesystem check at the next reboot just to keep things clean
touch /forcefsck

To reboot right away and perform a filesystem check
shutdown -rF now
