Mounting Encrypted ZFS Datasets at Boot

When ZFS encryption was released in Solaris 11 Express, I went out and bought four 2 TB drives and moved all of my data to a fresh, fully-encrypted zpool. I don’t keep a lot of sensitive data, but it brings me peace of mind to know that, in the event of theft or worse, my data is secure.

I chose to protect the data keys using a passphrase as opposed to using a raw key on disk. In my opinion, the only safe key is one that’s inside your head (though the US v. Fricosu case has me reevaluating that). The downside is that Solaris will ignore passphrase-encrypted datasets at boot.

The thing is, I run several services that depend on the data stored in my encrypted ZFS datasets. When Solaris doesn’t mount those filesystems at boot, those services fail to start or come up in very weird states that I must recover from manually. I would rather pause the boot process to wait for me to supply the passphrase so those services come up properly. Fortunately this is possible with SMF!

All of the services I am concerned about depend on, in one way or another, the svc:/system/filesystem/local:default service, which is responsible for mounting all of the filesystems. That service, in turn, depends on the single-user milestone. So I just need to inject my own service between the single-user milestone and the system/filesystem/local service that fails when it doesn’t have the keys. That failure will pause the boot process until it is cleared.

I wrote a simple manifest that expresses the dependencies between single-user and system/filesystem/local:

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">

<service_bundle type='manifest' name='nest'>

<service
   name='system/filesystem/nest'
   type='service'
   version='1'>

    <create_default_instance enabled='true' />
    <single_instance />

    <dependency
       name='single-user'
       grouping='require_all'
       restart_on='none'
       type='service'>
        <service_fmri value='svc:/milestone/single-user' />
    </dependency>

    <dependent
       name='nest-local'
       grouping='require_all'
       restart_on='none'>
        <service_fmri value='svc:/system/filesystem/local' />
    </dependent>

    <exec_method
       type='method'
       name='start'
       exec='/lib/svc/method/nest start'
       timeout_seconds='60' />

    <exec_method
       type='method'
       name='stop'
       exec=':true'
       timeout_seconds='60' />

    <property_group name='startd' type='framework'>
        <propval name='duration' type='astring' value='transient' />
    </property_group>

    <stability value='Unstable' />

    <template>
        <common_name>
            <loctext xml:lang='C'>Load key for 'nest' zpool</loctext>
        </common_name>
    </template>
</service>

</service_bundle>

and a script at /lib/svc/method/nest that gets called by SMF:

#!/sbin/sh

. /lib/svc/share/smf_include.sh

case "$1" in
    'start')
        if [ $(zfs get -H -o value keystatus nest) != "available" ]; then
            echo "Run '/usr/sbin/zfs key -lr nest && /usr/sbin/svcadm clear $SMF_FMRI'" | smf_console
            exit $SMF_EXIT_ERR_FATAL
        fi
        ;;

    *)
        echo "Usage: $0 start"
        exit $SMF_EXIT_ERR_CONFIG
        ;;
esac

exit $SMF_EXIT_OK

The script checks whether the keys are available, and if not, prints a helpful hint to the console. The whole thing looks something like this at boot:

SunOS Release 5.11 Version 11.0 64-bit
Copyright (c) 1983, 2011, Oracle and/or its affiliates. All rights reserved.
Hostname: falcon

Run '/usr/sbin/zfs key -lr nest && /usr/sbin/svcadm clear svc:/system/filesystem/nest:default'
May 30 14:31:06 svc.startd[11]: svc:/system/filesystem/nest:default: Method "/lib/svc/method/nest start" failed with exit status 95.
May 30 14:31:06 svc.startd[11]: system/filesystem/nest:default failed fatally: transitioned to maintenance (see 'svcs -xv' for details)

falcon console login: jlee
Password: 
falcon% sudo -s
falcon# /usr/sbin/zfs key -lr nest && /usr/sbin/svcadm clear svc:/system/filesystem/nest:default
Enter passphrase for 'nest': 
falcon#

When I get to the console shell, I can just copy and paste the command printed by the script. Once the service failure is cleared, SMF continues the boot process normally and all of my other services come up exactly as I’d expect.

No, it’s not very pretty, but I’d rather have a little bit of manual intervention during the boot process for as infrequently as I do it, than to have to clean up after services that come up without the correct dependencies. And with my new homemade LOM, it’s not too much trouble to run commands at the console, even remotely.

One thought on “Mounting Encrypted ZFS Datasets at Boot

Leave a Reply