Mounting Encrypted ZFS Datasets at Boot

When ZFS encryption was released in Solaris 11 Express, I went out and bought four 2 TB drives and moved all of my data to a fresh, fully-encrypted zpool. I don’t keep a lot of sensitive data, but it brings me peace of mind to know that, in the event of theft or worse, my data is secure.

I chose to protect the data keys using a passphrase as opposed to using a raw key on disk. In my opinion, the only safe key is one that’s inside your head (though the US v. Fricosu case has me reevaluating that). The downside is that Solaris will ignore passphrase-encrypted datasets at boot.

The thing is, I run several services that depend on the data stored in my encrypted ZFS datasets. When Solaris doesn’t mount those filesystems at boot, those services fail to start or come up in very weird states that I must recover from manually. I would rather pause the boot process to wait for me to supply the passphrase so those services come up properly. Fortunately this is possible with SMF!

All of the services I am concerned about depend, in one way or another, on the svc:/system/filesystem/local:default service, which is responsible for mounting all of the filesystems. That service, in turn, depends on the single-user milestone. So I just need to inject my own service between the single-user milestone and system/filesystem/local, one that fails when the keys aren’t loaded. That failure will pause the boot process until it is cleared.
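The resulting start order can be sketched with the standard tsort utility, which sorts "A must come before B" pairs. This is only an illustration: the short names are labels rather than FMRIs that SMF reads, and boot_order is a made-up function name.

```shell
#!/bin/sh
# Each input line is "A B", meaning A has to come up before B.
# The first two pairs are the edges added by the new service; the
# third is the dependency that filesystem/local already has.
boot_order() {
    tsort <<'EOF'
single-user nest
nest filesystem/local
single-user filesystem/local
EOF
}

boot_order
```

This prints single-user, nest, and filesystem/local, one per line, since that is the only order that satisfies all three pairs.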

I wrote a simple manifest that expresses the dependencies between single-user and system/filesystem/local:

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">

<service_bundle type='manifest' name='nest'>

<service
   name='system/filesystem/nest'
   type='service'
   version='1'>

    <create_default_instance enabled='true' />
    <single_instance />

    <dependency
       name='single-user'
       grouping='require_all'
       restart_on='none'
       type='service'>
        <service_fmri value='svc:/milestone/single-user' />
    </dependency>

    <dependent
       name='nest-local'
       grouping='require_all'
       restart_on='none'>
        <service_fmri value='svc:/system/filesystem/local' />
    </dependent>

    <exec_method
       type='method'
       name='start'
       exec='/lib/svc/method/nest start'
       timeout_seconds='60' />

    <exec_method
       type='method'
       name='stop'
       exec=':true'
       timeout_seconds='60' />

    <property_group name='startd' type='framework'>
        <propval name='duration' type='astring' value='transient' />
    </property_group>

    <stability value='Unstable' />

    <template>
        <common_name>
            <loctext xml:lang='C'>Load key for 'nest' zpool</loctext>
        </common_name>
    </template>
</service>

</service_bundle>

and a script at /lib/svc/method/nest that gets called by SMF:

#!/sbin/sh

. /lib/svc/share/smf_include.sh

case "$1" in
    'start')
        if [ "$(zfs get -H -o value keystatus nest)" != "available" ]; then
            echo "Run '/usr/sbin/zfs key -lr nest && /usr/sbin/svcadm clear $SMF_FMRI'" | smf_console
            exit $SMF_EXIT_ERR_FATAL
        fi
        ;;

    *)
        echo "Usage: $0 start"
        exit $SMF_EXIT_ERR_CONFIG
        ;;
esac

exit $SMF_EXIT_OK

The script checks whether the keys are available, and if not, prints a helpful hint to the console. The whole thing looks something like this at boot:

SunOS Release 5.11 Version 11.0 64-bit
Copyright (c) 1983, 2011, Oracle and/or its affiliates. All rights reserved.
Hostname: falcon

Run '/usr/sbin/zfs key -lr nest && /usr/sbin/svcadm clear svc:/system/filesystem/nest:default'
May 30 14:31:06 svc.startd[11]: svc:/system/filesystem/nest:default: Method "/lib/svc/method/nest start" failed with exit status 95.
May 30 14:31:06 svc.startd[11]: system/filesystem/nest:default failed fatally: transitioned to maintenance (see 'svcs -xv' for details)

falcon console login: jlee
Password: 
falcon% sudo -s
falcon# /usr/sbin/zfs key -lr nest && /usr/sbin/svcadm clear svc:/system/filesystem/nest:default
Enter passphrase for 'nest': 
falcon#

When I get to the console shell, I can just copy and paste the command printed by the script. Once the service failure is cleared, SMF continues the boot process normally and all of my other services come up exactly as I’d expect.
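The availability test is easy to try in isolation. Here is a minimal sketch, assuming a hypothetical helper named key_hint that takes the keystatus value, the dataset name, and the service FMRI as arguments. Quoting the status means an empty value (say, when zfs get itself fails) still counts as "not available" instead of breaking the test.

```shell
#!/bin/sh
# key_hint STATUS DATASET FMRI
# Hypothetical helper: print the recovery command and return 1 unless
# STATUS is exactly "available"; print nothing and return 0 otherwise.
key_hint() {
    if [ "$1" != "available" ]; then
        echo "Run '/usr/sbin/zfs key -lr $2 && /usr/sbin/svcadm clear $3'"
        return 1
    fi
    return 0
}
```

In a method script the first argument would come from "$(zfs get -H -o value keystatus nest)" and the third from $SMF_FMRI.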

No, it’s not very pretty, but given how rarely I reboot, I’d rather put up with a little manual intervention during boot than clean up after services that come up without the correct dependencies. And with my new homemade LOM, it’s not too much trouble to run commands at the console, even remotely.

Persistent Search Domains With NWAM and DHCP

What I Want

I want to be able to refer to systems on both my home and work networks by their hostnames rather than their fully-qualified domain names, so, ‘prey’ instead of ‘prey.thestaticvoid.com’ and ‘acad2’ instead of ‘acad2.es.gwu.edu’.

NWAM Settings

The Problem

I would typically set my home and work domains as the search setting in /etc/resolv.conf. Unfortunately, either NWAM or the Solaris DHCP client (I haven’t decided which) overwrites resolv.conf on every new connection. DHCP on Linux does the same thing, but there I can configure it by editing dhclient.conf (or whatever is used these days; it’s been a while, and I think I just set my domains in the NetworkManager GUI and forget about it).

The Solaris DHCP client configuration is not nearly as flexible, and neither is NWAM, which lets you replace resolv.conf with information supplied by the DHCP server or with information provided by you, but not a mix of both. I do like having the nameservers set by the DHCP server, so supplying a manual configuration is not an option.

What I Tried

The first thing I tried was setting the LOCALDOMAIN environment variable in /etc/profile. From the resolv.conf man page:

You can override the search keyword of the system
resolv.conf file on a per-process basis by setting the
environment variable LOCALDOMAIN to a space-separated list
of search domains.

I thought, great, a way to manage domain search settings without worrying about what’s doing what to resolv.conf. It didn’t work as advertised:

% LOCALDOMAIN=thestaticvoid.com ping prey
ping: unknown host prey
% s touch /etc/resolv.conf
% LOCALDOMAIN=thestaticvoid.com ping prey
prey is alive
% LOCALDOMAIN=thestaticvoid.com ping prey
ping: unknown host prey

Next, I considered adding an NWAM Network Modifier to set my search string in resolv.conf after a new connection is established. This worked reasonably well, but didn’t handle switching from one network to another, for example, from wireless to wired. The only NWAM events that can trigger a script when the network connection changes fire before DHCP messes up resolv.conf.

Finally, in the course of my testing, I discovered that the svc:/network/dns/client service was restarting with every network connection change. I looked into its manifest and saw that it was designed to wait for changes to resolv.conf:

<!--
 Wait for potential DHCP modification of resolv.conf.
-->
<dependency
   name='net'
   grouping='require_all'
   restart_on='none'
   type='service'>
    <service_fmri value='svc:/network/service' />
</dependency>

So I could write another service which depends on dns/client and restarts whenever dns/client does and I would have the last word about what goes into my configuration file!

My Solution

I wrote a service, svc:/network/dns/resolv-conf, with the following manifest:

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">

<service_bundle type="manifest" name="dns-resolv-conf">
    <service name="network/dns/resolv-conf"
        type="service"
        version="1">
        <create_default_instance enabled="false" />
        <single_instance />

        <dependency name="dns-client"
            grouping="require_all"
            restart_on="restart"
            type="service">
            <service_fmri value="svc:/network/dns/client" />
        </dependency>

        <dependent name="resolv-conf"
            grouping="optional_all"
            restart_on="restart">
            <service_fmri value="svc:/milestone/name-services" />
        </dependent>

        <exec_method type="method"
            name="start"
            exec="/lib/svc/method/dns-resolv-conf start"
            timeout_seconds="60" />

        <exec_method type="method"
            name="stop"
            exec="/lib/svc/method/dns-resolv-conf stop"
            timeout_seconds="60" />

        <property_group name="options" type="application">
            <propval name="search" type="astring" value="" />
        </property_group>

        <property_group name="startd" type="framework">
            <propval name="duration" type="astring" value="transient" />
        </property_group>

        <stability value="Unstable" />

        <template>
            <common_name>
                <loctext xml:lang="C">resolv.conf Settings</loctext>
            </common_name>
            <documentation>
                <manpage title="resolv.conf" section="4"
                    manpath="/usr/share/man" />
            </documentation>
        </template>
    </service>
</service_bundle>

which calls the script, /lib/svc/method/dns-resolv-conf containing:

#!/sbin/sh

. /lib/svc/share/smf_include.sh

search=$(svcprop -p options/search $SMF_FMRI)

case "$1" in
    "start")
        # Don't do anything if search option not provided
        # (svcprop prints an empty astring as a literal "").
        [ "$search" = '""' ] && exit $SMF_EXIT_OK

        # Reverse the lines because we either want to:
        #   add the search line after the *last* domain line or
        #   add it to the very top of the file if there is no domain line
        tac /etc/resolv.conf | grep -v "^search" | gawk '
            /^domain/ {
                if (!isset) {
                    print "search", $2, search
                    isset=1
                }
            }

            END {
                if (!isset) {
                    print "search", search
                }
            }

            1
        ' search="$search" | tac > /etc/resolv.conf.new && mv -f /etc/resolv.conf.new /etc/resolv.conf
        ;;

    "stop")
        # Just get rid of any search lines, I guess.
        grep -v "^search" /etc/resolv.conf > /etc/resolv.conf.new && mv -f /etc/resolv.conf.new /etc/resolv.conf
        ;;

    *)
        echo "Usage: $0 { start | stop }"
        exit $SMF_EXIT_ERR_CONFIG
        ;;
esac

exit $SMF_EXIT_OK
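To see what the start method does without touching /etc/resolv.conf, the same edit can be sketched as a stdin-to-stdout filter. This is an illustrative variant rather than the script above: it uses plain awk and buffers the lines instead of reversing them with tac, the function name add_search is hypothetical, and the domain and address in the sample call are placeholders.

```shell
#!/bin/sh
# add_search DOMAINS
# Read a resolv.conf on stdin and write the edited copy to stdout.
# Old search lines are dropped; the new search line goes right after
# the last domain line, or at the very top if there is no domain line.
add_search() {
    awk -v search="$1" '
        /^search/ { next }                  # discard existing search lines
        { line[++n] = $0 }                  # buffer every other line
        /^domain/ { last = n; dom = $2 }    # remember the last domain line
        END {
            if (!last) print "search", search
            for (i = 1; i <= n; i++) {
                print line[i]
                if (i == last) print "search", dom, search
            }
        }
    '
}

# Sample run on placeholder data.
printf 'domain  example.org\nnameserver  192.0.2.1\n' | add_search "example.com"
```

The sample run prints the domain line, then "search example.org example.com", then the nameserver line.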

So now I can set my search options like:

% svccfg -s resolv-conf setprop 'options/search="thestaticvoid.com es.gwu.edu"'
% svcadm refresh resolv-conf
% svcadm enable resolv-conf
% cat /etc/resolv.conf
domain  iss.gwu.edu
search iss.gwu.edu thestaticvoid.com es.gwu.edu
nameserver  161.253.152.50
nameserver  128.164.141.12

Problem solved! Or at least worked around in the least hacky way I can!

Start Virtual NICs on OpenSolaris Boot

One of the more frustrating things I deal with on OpenSolaris is that every time I reboot, I have to manually bring up each virtual network interface in order to start all of my zones. There is a bug report for this problem that says a fix will be integrated into b132, which is just a few weeks away, but in the meantime, I’ve whipped up an SMF service to handle this for me. Create a file vnic.xml:

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">

<service_bundle type='manifest' name='vnic'>

<service
    name='network/vnic'
    type='service'
    version='1'>

    <dependency
        name='network_service'
        grouping='require_all'
        restart_on='none'
        type='service'>
        <service_fmri value='svc:/network/service' />
    </dependency>

    <dependent
        name='network_vnic'
        grouping='optional_all'
        restart_on='none'>
        <service_fmri value='svc:/system/zones' />
    </dependent>

    <exec_method
        type='method'
        name='start'
        exec='/usr/sbin/dladm up-vnic ${SMF_FMRI/*:/}'
        timeout_seconds='60' />

    <exec_method
        type='method'
        name='stop'
        exec=':true'
        timeout_seconds='60' />

    <property_group name='startd' type='framework'>
        <propval name='duration' type='astring' value='transient' />
    </property_group>

    <stability value='Unstable' />

    <template>
        <common_name>
            <loctext xml:lang='C'>
            Virtual Network Interface
            </loctext>
        </common_name>
        <documentation>
            <manpage title='dladm' section='1M'
                manpath='/usr/share/man' />
        </documentation>
    </template>
</service>

</service_bundle>

This service should run sometime after the network is started but before the zones are started. Load it in with svccfg -v import vnic.xml and create an instance of the service for each of the VNICs that you want to start. For example, if you want to start vnic0 on boot:

# svccfg -s vnic add vnic0
# svcadm refresh vnic0
# svcadm enable vnic0
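The start method picks the VNIC name out of the instance part of the FMRI, which is why each instance is named after its VNIC. The ${SMF_FMRI/*:/} form in the manifest is a ksh93/bash-style substitution. The sketch below shows the same extraction with the POSIX ${var##pattern} form; the function name instance_of is made up for illustration.

```shell
#!/bin/sh
# instance_of FMRI
# Print the instance name, i.e. everything after the last colon.
instance_of() {
    echo "${1##*:}"
}

instance_of svc:/network/vnic:vnic0
```

For svc:/network/vnic:vnic0 this prints vnic0, which is the name passed to dladm up-vnic.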

UPDATE: Build 132 is out and this functionality has been integrated as the svc:/network/datalink-management:default service. The services that were added above can be removed by running svccfg delete vnic.

Mixer State in OpenSolaris

I’ve recently installed OpenSolaris on my desktop and noticed that my volume settings do not persist between reboots. A quick search revealed that this functionality hasn’t been implemented yet. The thread suggested using the mixerctl command to save and restore the mixer state, so I’ve thrown together an SMF service to do it automatically on boot and shutdown.

First, the script which should go into /lib/svc/method/sound-mixer:

#!/sbin/sh

. /lib/svc/share/smf_include.sh
smf_is_globalzone || exit $SMF_EXIT_OK

ctl_file=$(svcprop -p options/ctl_file $SMF_FMRI)

case "$1" in
'start')
        if [ ! -f $ctl_file ]; then
                echo "Mixer control file $ctl_file does not exist."
                exit $SMF_EXIT_OK
        fi

        if ! /usr/sbin/mixerctl -r $ctl_file; then
                echo "Error restoring mixer state."
                exit $SMF_EXIT_OK
        fi
        ;;

'stop')
        if ! /usr/sbin/mixerctl -f -s $ctl_file; then
                echo "Error saving mixer state."
                exit $SMF_EXIT_OK
        fi
        ;;

*)
        echo "Usage: $0 { start | stop }"
        exit $SMF_EXIT_ERR_CONFIG
        ;;
esac

exit $SMF_EXIT_OK

Second, the manifest which can be saved anywhere and loaded with svccfg -v import <manifest>:

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">

<service_bundle type='manifest' name='mixer'>

<service
       name='system/sound/mixer'
       type='service'
       version='1'>

        <create_default_instance enabled='true' />
        <single_instance />

        <dependency
           name='fs-local'
           grouping='require_all'
           restart_on='none'
           type='service'>
                <service_fmri value='svc:/system/filesystem/local' />
        </dependency>
       
        <dependency
           name='device-audio'
           grouping='require_all'
           restart_on='none'
           type='service'>
                <service_fmri value='svc:/system/device/audio' />
        </dependency>

        <exec_method
               type='method'
               name='start'
               exec='/lib/svc/method/sound-mixer start'
               timeout_seconds='60' />

        <exec_method
               type='method'
               name='stop'
               exec='/lib/svc/method/sound-mixer stop'
               timeout_seconds='60' />

        <property_group name='options' type='application'>
                <propval name='ctl_file' type='astring' value='/etc/sound/mixer.state' />
        </property_group>

        <property_group name='startd' type='framework'>
                <propval name='duration' type='astring' value='transient' />
        </property_group>

        <stability value='Unstable' />

        <template>
                <common_name>
                        <loctext xml:lang='C'>Mixer State Saver</loctext>
                </common_name>
                <documentation>
                        <manpage title='mixerctl' section='1M'
                           manpath='/usr/share/man' />
                </documentation>
        </template>

</service>

</service_bundle>

UPDATE: In b130, the audioctl command replaces mixerctl. In the sound-mixer script above, change /usr/sbin/mixerctl -r $ctl_file to /usr/bin/audioctl load-controls $ctl_file and /usr/sbin/mixerctl -f -s $ctl_file to /usr/bin/audioctl save-controls -f $ctl_file.
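If one script has to serve builds on both sides of b130, one option is to choose the command line by tool name. This is a minimal sketch: the helper name mixer_command and its three arguments are hypothetical, while the command lines themselves are the ones quoted above.

```shell
#!/bin/sh
# mixer_command TOOL ACTION FILE
# Hypothetical helper: print the command line that restores or saves
# the mixer state with the given tool, or return 1 for anything else.
mixer_command() {
    case "$1:$2" in
        mixerctl:restore) echo "/usr/sbin/mixerctl -r $3" ;;
        mixerctl:save)    echo "/usr/sbin/mixerctl -f -s $3" ;;
        audioctl:restore) echo "/usr/bin/audioctl load-controls $3" ;;
        audioctl:save)    echo "/usr/bin/audioctl save-controls -f $3" ;;
        *)                return 1 ;;
    esac
}

mixer_command audioctl save /etc/sound/mixer.state
```

The method script could then test which of the two binaries is executable and run whatever mixer_command prints for it.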