The Solaris 11 Experience So Far

I have a system (a zone on which this blog is hosted) that has been running the same installation of Solaris since 11/11/2009, starting with OpenSolaris 2009.06. In the time since, it has seen every public build of OpenSolaris, then OpenIndiana, and finally Solaris 11 Express. Now, exactly two years later, I’ve updated it to Solaris 11 11/11, and I’d like to share my experience so far.

The update itself did not go smoothly. I was sitting at Solaris 11 Express SRU 8 and thought, like every update I’ve done in the past, that I could just run pkg image-update. Silly me, because when I did and then rebooted, the kernel panicked. No big deal, that’s what boot environments are for. I reverted to the previous boot environment and found some helpful documentation that told me to do exactly what I just did. It turns out that there is no way to update to SRU 13 using the support repositories because they already contain the Solaris 11 11/11 packages, and pkg tries to pull some of them in. And there is no way to update just pkg because the ips-consolidation prevents it, and trying to update the ips-consolidation pulls the entire package which breaks everything just the same. In short, Oracle bungled it. The only way to update to SRU 13 that I could see was to download the SRU 13 repository ISO from My Oracle Support and set up a local repository. Once I was on SRU 13, I could continue with the update to the 11/11 release. But there were more surprises in store for me.

First, it looks like pkg decided to start enforcing consistent attributes on files shared by multiple packages. Fine, I can understand that. As a result, I had to remove a lot of my custom packages (mostly from spec-files-extra) which I’ll have to rebuild. Second, pkg decided it doesn’t like the opensolaris.org packages anymore so I had to uninstall OpenOffice.org. Also fair enough.

Happily, after that, the updates got applied successfully and the system rebooted into the 11/11 release. Next came the zone updates. When I did the normal zoneadm -z foo detach && zoneadm -z foo attach -u deal, I was told I had to convert my zones to a new ZFS structure which more closely matches the global zone. The script /usr/lib/brand/shared/dsconvert actually worked flawlessly and the updated zones came up fine.

Unfortunately I couldn’t SSH into my zones because my DNS server didn’t know where they were. It seems that with the updated networking framework, DHCP doesn’t request a hostname anymore. (/etc/default/dhcpagent still says inet <hostname> can be put in /etc/hostname.<if> to request the hostname.) I found that you can create an addr object that requests a hostname with ipadm create-addr -T dhcp -h <hostname> <addrobj>, but NWAM pretty much won’t let you create or modify anything with ipadm, and there were no options for requesting hostnames with nwamcfg. As a result, I had to disable NWAM (netadm enable -p ncp DefaultFixed) and then I could set up the interface with ipadm. Why doesn’t Solaris request hostnames by default? Not very “cloud-like” if you ask me.

I have to say, I’m impressed by the way global zones and non-global zones are linked in the new release. Zone updates were an obvious shortcoming of previous releases. We’ll see how well it works when Solaris 11 Update 1 comes out.

What else…I lost my ability to pfexec to root. Oracle removed the “Primary Administrator” profile for security reasons so I had to install sudo. Not a big deal, I just wish they had said something a little louder about it.

Also, whatever update to pkg happened, it wiped out my repositories under /var/pkg. I had to restore them from a snapshot. Bad Oracle!

I’m also a little confused about some of the changes to the way networking settings are stored. For example, when I first booted the global zone, I found that my NFSv4 domain name was reset by NWAM. I set it to what it should be with sharectl set -p nfsmapid_domain=thestaticvoid.com nfs, but is that going to be overwritten again by NWAM? Also, the name resolver settings are now stored in the svc:/network/dns/client:default service, and according to the documentation, DHCP will set the service properties properly, but I have yet to see this work.

And the last problem I’ll mention is that the update removed my virtual consoles. I had to install the virtual-console package to restore them.

Overall, I’m happy that I was at least able to update to the latest release. Oracle could have cut off any update path from OpenSolaris. However, the update should have been a lot smoother. It doesn’t speak well of future updates when I can’t even update from one supported release (SRU 8) to another. I also wish Oracle were more open about upcoming changes (as in, having more preview releases or, dare I say it, opening development the way OpenSolaris was). Even to me, a long time pre-Solaris 11 user, the changes to zones and networking are huge in this release, and I would rather have not been so surprised by them.

Persistent Search Domains With NWAM and DHCP

What I Want

I want to be able to refer to systems on both my home and work networks by their hostnames rather than their fully-qualified domain names, so, ‘prey’ instead of ‘prey.thestaticvoid.com’ and ‘acad2’ instead of ‘acad2.es.gwu.edu’.

NWAM Settings

The Problem

I would typically set my home and work domains as the search setting in /etc/resolv.conf. Unfortunately, either NWAM or the Solaris DHCP client (I haven’t decided which) overwrites resolv.conf on every new connection. DHCP on Linux does the same thing, but I can configure it by editing dhclient.conf (or whatever is being used these days, it’s been a while. I think I just set my domains in the NetworkManager GUI and forget about it).

The Solaris DHCP client configuration is not nearly as flexible, and neither is NWAM which gives you the option of replacing resolv.conf with information supplied by the DHCP server, or provided by you, but not a mix of both. I do like having the nameservers set by the DHCP server, so supplying a manual configuration is not an option.

What I Tried

The first thing I tried was setting the LOCALDOMAIN environmental variable in /etc/profile. From the resolv.conf man page:

You can override the search keyword of the system
resolv.conf file on a per-process basis by setting the
environment variable LOCALDOMAIN to a space-separated list
of search domains.

I thought, great, a way to manage domain search settings without worrying about what’s doing what to resolv.conf. It didn’t work as advertised:

% LOCALDOMAIN=thestaticvoid.com ping prey
ping: unknown host prey
% s touch /etc/resolv.conf
% LOCALDOMAIN=thestaticvoid.com ping prey
prey is alive
% LOCALDOMAIN=thestaticvoid.com ping prey
ping: unknown host prey

Next, I considered adding an NWAM Network Modifier to set my search string in resolv.conf after a new connection is established. This worked reasonably well, but didn’t handle the case when you switch from one network to another, for example, from wireless to wired. The only events in NWAM that can trigger a script when the network connection changes happens before DHCP messes up resolv.conf.

Finally, in the course of my testing, I discovered that the svc:/network/dns/client service was restarting with every network connection change. I looked into its manifest and saw that it was designed to wait for changes to resolv.conf:

<!--
 Wait for potential DHCP modification of resolv.conf.
-->
<dependency
   name='net'
   grouping='require_all'
   restart_on='none'
   type='service'>
    <service_fmri value='svc:/network/service' />
</dependency>

So I could write another service which depends on dns/client and restarts whenever dns/client does and I would have the last word about what goes into my configuration file!

My Solution

I wrote a service, svc:/network/dns/resolv-conf, with the following manifest:

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">

<service_bundle type="manifest" name="dns-resolv-conf">
    <service name="network/dns/resolv-conf"
        type="service"
        version="1">
        <create_default_instance enabled="false" />
        <single_instance />

        <dependency name="dns-client"
            grouping="require_all"
            restart_on="restart"
            type="service">
            <service_fmri value="svc:/network/dns/client" />
        </dependency>

        <dependent name="resolv-conf"
            grouping="optional_all"
            restart_on="restart">
            <service_fmri value="svc:/milestone/name-services" />
        </dependent>

        <exec_method type="method"
            name="start"
            exec="/lib/svc/method/dns-resolv-conf start"
            timeout_seconds="60" />

        <exec_method type="method"
            name="stop"
            exec="/lib/svc/method/dns-resolv-conf stop"
            timeout_seconds="60" />

        <property_group name="options" type="application">
            <propval name="search" type="astring" value="" />
        </property_group>

        <property_group name="startd" type="framework">
            <propval name="duration" type="astring" value="transient" />
        </property_group>

        <stability value="Unstable" />

        <template>
            <common_name>
                <loctext xml:lang="C">resolv.conf Settings</loctext>
            </common_name>
            <documentation>
                <manpage title="resolv.conf" section="4"
                    manpath="/usr/share/man" />
            </documentation>
        </template>
    </service>
</service_bundle>

which calls the script, /lib/svc/method/dns-resolv-conf containing:

#!/sbin/sh

. /lib/svc/share/smf_include.sh

search=$(svcprop -p options/search $SMF_FMRI)

case "$1" in
    "start")
        # Don't do anything if search option not provided.
        [ "$search" == '""' ] && exit $SMF_EXIT_OK

        # Reverse the lines because we either want to:
        #   add the search line after the *last* domain line or
        #   add it to the very top of the file if there is no domain line
        tac /etc/resolv.conf | grep -v "^search" | gawk '
            /^domain/ {
                if (!isset) {
                    print "search", $2, search
                    isset=1
                }
            }

            END {
                if (!isset) {
                    print "search", search
                }
            }

            1
        '
search="$search" | tac > /etc/resolv.conf.new && mv -f /etc/resolv.conf.new /etc/resolv.conf
        ;;

    "stop")
        # Just get rid of any search lines, I guess.
        grep -v "^search" /etc/resolv.conf > /etc/resolv.conf.new && mv -f /etc/resolv.conf.new /etc/resolv.conf
        ;;

    *)
        echo "Usage: $0 { start | stop }"
        exit $SMF_EXIT_ERR_CONFIG
esac

exit $SMF_EXIT_OK

So now I can set my search options like:

% svccfg -s resolv-conf setprop 'options/search="thestaticvoid.com es.gwu.edu"'
% svcadm refresh resolv-conf
% svcadm enable resolv-conf
% cat /etc/resolv.conf
domain  iss.gwu.edu
search iss.gwu.edu thestaticvoid.com es.gwu.edu
nameserver  161.253.152.50
nameserver  128.164.141.12

Problem solved! Or at least worked-around in the least hacky way I can!