How I Do Encrypted, Mirrored ZFS Root on Linux

Update: No more keyfiles!

I am done with Solaris. A quick look through this blog should be enough to see how much I like Solaris. So when I say “I’m done,” I want to be perfectly clear as to why: Oracle. As long as Oracle continues to keep Solaris’ development and code under wraps, I cannot feel comfortable using it or advocating for it, and that includes at work, where we are paying customers. I stuck with it up until now, waiting for a better alternative to come about, and now that ZFS is stable on Linux, I’m out.

I’ve returned to my first love, Gentoo. Well, more specifically, I’ve landed in Funtoo, a variant of Gentoo. I learned almost everything I know about Linux on Gentoo, and being back in its ecosystem feels like coming back home. Funtoo, in particular, addresses a lot of the annoyances that made me leave Gentoo in the first place by offering more stable packages of core software like the kernel and GCC. Its Git-based Portage tree is also a very nice addition. But it was Funtoo’s ZFS documentation and community that really got my attention.

The Funtoo ZFS install guide is a nice starting point, but my requirements were a bit beyond the scope of the document. I wanted:

  • redundancy handled by ZFS (that is, not by another layer like md),
  • encryption using a passphrase, not a keyfile,
  • and to be prompted for the passphrase once, not for each encrypted device.

My solution is depicted below:

[Figure: ZFS Root Diagram]

A small block device is encrypted using a passphrase. The randomly initialized contents of that device are then used as a keyfile for unlocking the devices that make up the mirrored ZFS rpool. Not pictured is Dracut, the tool that generates the initramfs, which takes care of assembling the md RAID devices, unlocking the encrypted devices, and mounting the ZFS root at boot time.

Here is a rough guide for doing it yourself:

  1. Partition the disks.
    Without going into all the commands, use gdisk to make your first disk look something like this:

    # gdisk -l /dev/sda
    ...
    Number  Start (sector)    End (sector)  Size       Code  Name
       1            2048         1026047   500.0 MiB   FD00  Linux RAID
       2         1026048         1091583   32.0 MiB    EF02  BIOS boot partition
       3         1091584         1099775   4.0 MiB     FD00  Linux RAID
       4         1099776       781422734   372.1 GiB   8300  Linux filesystem
    

    Then copy the partition table to the second disk:

    # sgdisk --backup=/tmp/table /dev/sda
    # sgdisk --load-backup=/tmp/table /dev/sdb
    # sgdisk --randomize-guids /dev/sdb
    

    If your system uses EFI rather than BIOS, you won’t need a BIOS boot partition, so adjust your partition numbers accordingly.

  2. Create the md RAID devices for /boot and the keyfile.
    # mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
    # mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3
    
  3. Set up the crypt devices.
    First, the keyfile:

    # cryptsetup -c aes-xts-plain64 luksFormat /dev/md1
    # cryptsetup luksOpen /dev/md1 keyfile
    # dd if=/dev/urandom of=/dev/mapper/keyfile
    

    Then, the ZFS vdevs:

    # cryptsetup -c aes-xts-plain64 luksFormat /dev/sda4 /dev/mapper/keyfile
    # cryptsetup -c aes-xts-plain64 luksFormat /dev/sdb4 /dev/mapper/keyfile
    # cryptsetup -d /dev/mapper/keyfile luksOpen /dev/sda4 rpool-crypt0
    # cryptsetup -d /dev/mapper/keyfile luksOpen /dev/sdb4 rpool-crypt1
    
  4. Format and mount everything up.
    # zpool create -O compression=on -m none -R /mnt/funtoo rpool mirror rpool-crypt0 rpool-crypt1
    # zfs create rpool/ROOT
    # zfs create -o mountpoint=/ rpool/ROOT/funtoo
    # zpool set bootfs=rpool/ROOT/funtoo rpool
    # zfs create -o mountpoint=/home rpool/home
    # zfs create -o volblocksize=4K -V 2G rpool/swap
    # mkswap -f /dev/zvol/rpool/swap
    # mkfs.ext2 /dev/md0
    # mkdir /mnt/funtoo/boot && mount /dev/md0 /mnt/funtoo/boot
    

    Now you can chroot and install Funtoo as you normally would.
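
If your disks are a different size than mine, the sizes in the gdisk listing from step 1 can be re-derived from the sector ranges. This is a small sketch of that arithmetic, assuming 512-byte logical sectors (which is consistent with the listing, but check your own disks). The helper names part_mib and end_sector are made up for illustration and are not part of gdisk.

```shell
#!/bin/sh
# Assumption: 512-byte logical sectors, consistent with the sizes in the
# gdisk listing above. Change this for 4K-native disks.
SECTOR_BYTES=512
MIB=1048576

# part_mib START END: size in MiB of a partition spanning sectors
# START..END inclusive (hypothetical helper).
part_mib() {
    echo $(( ($2 - $1 + 1) * SECTOR_BYTES / MIB ))
}

# end_sector START SIZE_MIB: last sector of a partition of SIZE_MIB MiB
# that begins at sector START (hypothetical helper).
end_sector() {
    echo $(( $1 + $2 * MIB / SECTOR_BYTES - 1 ))
}
```

For example, part_mib 2048 1026047 gives the 500 MiB of the /boot partition, and end_sector 1091584 4 gives the end sector of the 4 MiB keyfile partition.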

When it comes time to finish the installation and set up the Dracut initramfs, there are a number of things that need to be in place. First, the ZFS package must be installed with the Dracut module. The current ebuild strips it out for some reason. I have a bug report open to fix that.

Second, /etc/mdadm.conf must be populated so that Dracut knows how to reassemble the md RAID devices. That can be done with the command mdadm --detail --scan > /etc/mdadm.conf.
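
Since this layout has two arrays (one for /boot, one for the keyfile), a quick sanity check on the generated file is to count its ARRAY lines. The sketch below assumes that mdadm --detail --scan writes one line beginning with ARRAY per array; count_arrays is a hypothetical helper name.

```shell
#!/bin/sh
# count_arrays FILE: print the number of ARRAY definitions in FILE.
# Assumption: each array appears as one line starting with "ARRAY ".
count_arrays() {
    grep -c '^ARRAY ' "$1"
}
```

Running count_arrays /etc/mdadm.conf inside the chroot should report 2 for the setup described here.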

Third, /etc/crypttab must be created so that Dracut knows how to unlock the encrypted devices:

keyfile /dev/md1 none luks
rpool-crypt0 /dev/sda4 /dev/mapper/keyfile luks
rpool-crypt1 /dev/sdb4 /dev/mapper/keyfile luks
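
With more than two disks in the mirror, the file follows the same pattern. As a rough sketch, the lines above could be generated like this; gen_crypttab is a hypothetical helper, and the mapping names simply follow the rpool-cryptN convention used in this post.

```shell
#!/bin/sh
# gen_crypttab KEYDEV DEV...: print crypttab lines in the same layout as
# the file above. KEYDEV is the passphrase-protected key device; each DEV
# is unlocked with /dev/mapper/keyfile.
gen_crypttab() {
    keydev=$1
    shift
    # Key device first, with "none" so that a passphrase is prompted for.
    printf 'keyfile %s none luks\n' "$keydev"
    i=0
    for dev in "$@"; do
        printf 'rpool-crypt%d %s /dev/mapper/keyfile luks\n' "$i" "$dev"
        i=$((i + 1))
    done
}
```

Calling gen_crypttab /dev/md1 /dev/sda4 /dev/sdb4 reproduces the three lines shown above.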

Finally, you must tell Dracut about the encrypted devices required for boot. Create a file, /etc/dracut.conf.d/devices.conf, containing:

add_device+=" /dev/md1 /dev/sda4 /dev/sdb4 "

Once all that is done, you can build the initramfs using the command dracut --hostonly. To tell Dracut to use the ZFS root, add the kernel boot parameter root=zfs. The actual filesystem it chooses to mount is determined from the zpool’s bootfs property, which was set above.
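
After the first reboot, it can be handy to confirm that the kernel really was started with the ZFS root parameter. The following is a minimal sketch; has_zfs_root is a hypothetical helper, and accepting the root=zfs:... form in addition to the bare root=zfs used here is an assumption about what your bootloader config might contain.

```shell
#!/bin/sh
# has_zfs_root CMDLINE: succeed if the kernel command line in CMDLINE
# contains root=zfs as its own parameter.
# Assumption: parameters are separated by single spaces, and a
# root=zfs:<something> form is also treated as a match.
has_zfs_root() {
    case " $1 " in
        *" root=zfs "*|*" root=zfs:"*) return 0 ;;
    esac
    return 1
}
```

A typical use would be has_zfs_root "$(cat /proc/cmdline)" && echo "booted with ZFS root".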

And that’s it!

Now, I went a little further and created a set of Puppet modules to do the whole thing for me. Actually, I can practically install a whole system from scratch with one command thanks to Puppet.

I also have a script that runs after boot to close the keyfile device. You’ve got to protect that thing.

# cat /etc/local.d/keyfile.start
#!/bin/sh
/sbin/cryptsetup luksClose keyfile
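
A slightly more defensive variant only calls cryptsetup when the mapping is actually open, so the script stays quiet if it runs twice. This is a sketch, not the script I run: close_mapping, MAPPER_DIR, and CRYPTSETUP are hypothetical names, and the two variables exist only so the logic can be tried without touching real devices.

```shell
#!/bin/sh
# Assumed overridable defaults (hypothetical variables for this sketch).
MAPPER_DIR=${MAPPER_DIR:-/dev/mapper}
CRYPTSETUP=${CRYPTSETUP:-/sbin/cryptsetup}

# close_mapping NAME: close the dm-crypt mapping NAME, but only if an
# entry for it exists under MAPPER_DIR.
close_mapping() {
    if [ -e "$MAPPER_DIR/$1" ]; then
        "$CRYPTSETUP" luksClose "$1"
    fi
}
```

In /etc/local.d/keyfile.start, the last line would then be close_mapping keyfile.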

One criticism that could be leveled against this setup is that all the data on the ZFS pool gets encrypted and decrypted twice. That is because the redundancy comes in at a layer higher than the crypt layer. A way around that would be to set it up the other way around: encrypt an md RAID device and build the ZFS pool on top of that. Unfortunately, that comes at the cost of ZFS’s self-healing capabilities, because ZFS would then see only a single device and have no redundant copy of its own to repair from. Until encryption support comes to ZFS directly, that’s the trade-off we have to make. In practice, though, the double encryption of this setup doesn’t have a noticeable performance impact.

UPDATE

I should mention that Dracut turns out to be much smarter than I would have guessed: it lets you enter a passphrase once and then tries it on all of the encrypted devices. This eliminates the need for the keyfile in my case, so I’ve updated all of my systems to simply use the same passphrase on all of the encrypted devices. I have found it to be a simpler and more reliable setup.
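
For a system already built with the keyfile, one possible way to make that switch is to add the shared passphrase to each LUKS device while the keyfile device is still open. The sketch below only prints the commands rather than running them. print_addkey_plan is a hypothetical helper, and this route is an assumption, not a record of the exact steps taken on my systems. cryptsetup luksAddKey with -d supplying the existing key then prompts for the new passphrase.

```shell
#!/bin/sh
# print_addkey_plan KEYDEV DEV...: print (do not run) one luksAddKey
# command per device. Each command would authenticate with the existing
# key read from KEYDEV and prompt for the new passphrase to add.
print_addkey_plan() {
    keydev=$1
    shift
    for dev in "$@"; do
        echo "cryptsetup -d $keydev luksAddKey $dev"
    done
}
```

Running print_addkey_plan /dev/mapper/keyfile /dev/sda4 /dev/sdb4 lists the two commands to review before executing them by hand. Presumably the key field for those devices in /etc/crypttab would then change from /dev/mapper/keyfile to none.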