The Solaris 11 Experience So Far

I have a system (a zone on which this blog is hosted) that has been running the same installation of Solaris since 11/11/2009, starting with OpenSolaris 2009.06. In the time since, it has seen every public build of OpenSolaris, then OpenIndiana, and finally Solaris 11 Express. Now, exactly two years later, I’ve updated it to Solaris 11 11/11, and I’d like to share my experience so far.

The update itself did not go smoothly. I was sitting at Solaris 11 Express SRU 8 and thought, like every update I’ve done in the past, that I could just run pkg image-update. Silly me, because when I did and then rebooted, the kernel panicked. No big deal, that’s what boot environments are for. I reverted to the previous boot environment and found some helpful documentation that told me to do exactly what I just did. It turns out that there is no way to update to SRU 13 using the support repositories because they already contain the Solaris 11 11/11 packages, and pkg tries to pull some of them in. And there is no way to update just pkg because the ips-consolidation prevents it, and trying to update the ips-consolidation pulls the entire package which breaks everything just the same. In short, Oracle bungled it. The only way to update to SRU 13 that I could see was to download the SRU 13 repository ISO from My Oracle Support and set up a local repository. Once I was on SRU 13, I could continue with the update to the 11/11 release. But there were more surprises in store for me.

First, it looks like pkg decided to start enforcing consistent attributes on files shared by multiple packages. Fine, I can understand that. As a result, I had to remove a lot of my custom packages (mostly from spec-files-extra) which I’ll have to rebuild. Second, pkg decided it doesn’t like the opensolaris.org packages anymore so I had to uninstall OpenOffice.org. Also fair enough.

Happily, after that, the updates got applied successfully and the system rebooted into the 11/11 release. Next came the zone updates. When I did the normal zoneadm -z foo detach && zoneadm -z foo attach -u deal, I was told I had to convert my zones to a new ZFS structure which more closely matches the global zone. The script /usr/lib/brand/shared/dsconvert actually worked flawlessly and the updated zones came up fine.

Unfortunately I couldn’t SSH into my zones because my DNS server didn’t know where they were. It seems that with the updated networking framework, DHCP doesn’t request a hostname anymore. (/etc/default/dhcpagent still says inet <hostname> can be put in /etc/hostname.<if> to request the hostname.) I found that you can create an addr object that requests a hostname with ipadm create-addr -T dhcp -h <hostname> <addrobj>, but NWAM pretty much won’t let you create or modify anything with ipadm, and there were no options for requesting hostnames with nwamcfg. As a result, I had to disable NWAM (netadm enable -p ncp DefaultFixed) and then I could set up the interface with ipadm. Why doesn’t Solaris request hostnames by default? Not very “cloud-like” if you ask me.

I have to say, I’m impressed by the way global zones and non-global zones are linked in the new release. Zone updates were an obvious shortcoming of previous releases. We’ll see how well it works when Solaris 11 Update 1 comes out.

What else…I lost my ability to pfexec to root. Oracle removed the “Primary Administrator” profile for security reasons so I had to install sudo. Not a big deal, I just wish they had said something a little louder about it.

Also, whatever update to pkg happened, it wiped out my repositories under /var/pkg. I had to restore them from a snapshot. Bad Oracle!

I’m also a little confused about some of the changes to the way networking settings are stored. For example, when I first booted the global zone, I found that my NFSv4 domain name was reset by NWAM. I set it to what it should be with sharectl set -p nfsmapid_domain=thestaticvoid.com nfs, but is that going to be overwritten again by NWAM? Also, the name resolver settings are now stored in the svc:/network/dns/client:default service, and according to the documentation, DHCP will set the service properties properly, but I have yet to see this work.

And the last problem I’ll mention is that the update removed my virtual consoles. I had to install the virtual-console package to restore them.

Overall, I’m happy that I was at least able to update to the latest release. Oracle could have cut off any update path from OpenSolaris. However, the update should have been a lot smoother. It doesn’t speak well of future updates when I can’t even update from one supported release (SRU 8) to another. I also wish Oracle were more open about upcoming changes (as in, having more preview releases or, dare I say it, opening development the way OpenSolaris was). Even to me, a long time pre-Solaris 11 user, the changes to zones and networking are huge in this release, and I would rather have not been so surprised by them.

Wireless 802.1X Support in Solaris

The George Washington University (where I work and study) has recently implemented 802.1X to secure its wireless networks. 802.1X defines support for EAP over Ethernet (including wireless) and the WPA standards define several modes of EAP that can be used.

Solaris (I’m referring to version 11, OpenSolaris, OpenIndiana, and Illumos) supports WPA. It modified an early version of wpa_supplicant and called it “wpad“. However, they seemed to make a point of stripping out all EAP support in wpad.

So when my Network Security instructor said we had to do a term project of our choosing relating to network security, I decided I’d try to get 802.1X working in Solaris. To do this, I decided I could either add the EAP bits back into wpad, or add the Solaris-specific bits to the latest version of wpa_supplicant. wpad is based on very old code. It’s not even clear which version of wpa_supplicant it is based on, and there is no record of the massive amount of changes they made. It would be too hard for me to figure out where to plug EAP back in, and who knows how many bugs and security vulnerabilities were fixed upstream that we’d be missing out on.

Fortunately, wpa_supplicant is very modular, and reasonably well documented. I was able to graft the older Solaris code onto the newer interfaces. The result of my work is currently maintained in my own branch at GitHub. It’s not perfect, but it works (and I’ll explain how). Solaris has a very limited public API for wireless support and my goal was to get wpa_supplicant working without having to modify any system libraries or the kernel. I struggled to figure out some idiosyncrasies such as:

  • Events (association, disassociation, etc.) are only sent to wpa_supplicant when WPA is enabled in the driver.
  • Full scan results are only available when WPA is disabled in the driver.
  • Scan results don’t provide nearly as much information as their Linux counterparts do, such as access point capabilities, signal strength, noise levels, etc. I was very worried I wouldn’t be able to fill out the scan results structure fully and wpa_supplicant would refuse to work without complete information.

Here is how you can get 802.1X support working on your Solaris laptop:

  1. Install the wpa_supplicant package from my package repository:
    # pkg set-publisher -p http://pkg.thestaticvoid.com/
    # pkg install wpa_supplicant
    
  2. Add the configuration for your protected wireless networks to /etc/wpa_supplicant.conf. Here is mine:

    ctrl_interface=/var/run/wpa_supplicant
    ctrl_interface_group=0
    ap_scan=0

    network={
        ssid="prey"
        key_mgmt=WPA-PSK
        psk="<network key>"
    }

    network={
        ssid="GW1X"
        key_mgmt=WPA-EAP
        eap=TTLS
        identity="jameslee"
        anonymous_identity="anonymous"
        password="<personal password>"
        phase2="auth=PAP"
    }

    The most important thing here is ap_scan=0. This tells wpa_supplicant not to do any scanning or association of its own. Those tasks will be handled by dladm and NWAM.

  3. Backup /usr/lib/inet/wpad and replace it with this script:

    #!/bin/sh

    interface=`echo $@ | /usr/bin/sed 's/.*-i *\([a-z0-9]*\).*/\1/'`
    exec /usr/sbin/wpa_supplicant -Dsolaris -i$interface -c/etc/wpa_supplicant.conf -s &

Now connect to a wireless network with NWAM or dladm. When prompted for a network key, enter anything; it won’t be used. The actual keys will be looked up in /etc/wpa_supplicant.conf. Here is an example of me connecting to my 802.1X-secured network using dladm:

# dladm connect-wifi -e GW1X -s wpa -k nwam-GW1X iwh0
# dladm show-wifi
LINK       STATUS            ESSID               SEC    STRENGTH   MODE   SPEED
iwh0       connected         GW1X                wpa    excellent  g      54Mb

-k nwam-GW1X” refers to a dummy key setup by NWAM. dladm will complain if it’s not supplied a key.

That should be it!

Future Directions

Obviously, the integration of wpa_supplicant and NWAM/dladm leaves a lot to be desired. If there is sufficient interest, I will start looking into how to modify the dladm security framework in Illumos to include EAP related configurations (keys, certificates, identities; it’s all much more complicated than the single pre-shared key that dladm supports now). My hope, though, is that Oracle is already working on this. Do you hear that Oracle?

Fun With vpnc

I recently got a new laptop at work and I decided to put OpenSolaris on it. This meant I had to setup vpnc in order to access the server networks and wireless here. I installed my vpnc package, copied the profile from my Ubuntu workstation, and started it up. It connected, but no packets flowed. I didn’t have time to investigate, so I decided to work on it some more at home.

The strange thing is that it connected from home with the very same profile and everything worked fine. I immediately suspected something was wrong with the routing tables, like maybe some of the routes installed by vpnc-script were conflicting with the routes necessary to talk to the VPN concentrator. I endlessly compared the routing tables between work and home and my working Ubuntu workstation, removing routes, adding routes, and manually constructing the routing table until I was positive it could not be that.

Everything I pinged worked. I could ping the concentrator. I could ping the gateway. I could ping the tunnel device. I could ping the physical interface—or so I thought.

As I was preparing to write a message to the vpnc-devel mailing list requesting help, I did some pings to post the output in the email. I ran

$ ping <concentrator ip>
<concentrator ip> is alive

which looked good, but I wanted the full ping output, so I ran

$ ping -s <concentrator ip>
PING <concentrator ip>: 56 data bytes
^C
----<concentrator ip> PING Statistics----
4 packets transmitted, 1 packets received, 75% packet loss
round-trip (ms)  min/avg/max/stddev = 9223372036854776.000/0.000/0.000/-NaN

For some reason, only the first ping was getting through. The rest were getting hung up somewhere. The really strange thing was that I saw the same behavior on the local physical interface:

$ ifconfig bge0
bge0: flags=1004843 mtu 1500 index 3
        inet 161.253.143.151 netmask ffffff00 broadcast 161.253.143.255
$ ping -s 161.253.143.151
PING 161.253.143.151: 56 data bytes
^C
----161.253.143.151 PING Statistics----
5 packets transmitted, 1 packets received, 80% packet loss
round-trip (ms)  min/avg/max/stddev = 9223372036854776.000/0.000/0.000/-NaN

I have never seen a situation where you couldn’t even ping a local physical interface! I checked and double checked that IPFilter wasn’t running. Finally I started a packet capture of the physical interface to see what was happening to my pings:

# snoop -d bge0 icmp
Using device bge0 (promiscuous mode)
161.253.143.151 -> <concentrator ip> ICMP Destination unreachable (Bad protocol 50)
161.253.143.151 -> <concentrator ip> ICMP Destination unreachable (Bad protocol 50)
161.253.143.151 -> <concentrator ip> ICMP Destination unreachable (Bad protocol 50)
^C

That’s when by chance I saw messages being sent to the VPN concentrator saying “bad protocol 50.” IP protocol 50 represents “ESP”, commonly used for IPsec. Apparently Solaris eats these packets. Haven’t figured out why.

I remembered seeing something in the vpnc manpage about ESP packets:

--natt-mode <natt/none/force-natt/cisco-udp>

      Which NAT-Traversal Method to use:
      o    natt -- NAT-T as defined in RFC3947
      o    none -- disable use of any NAT-T method
      o    force-natt -- always use NAT-T encapsulation  even
           without presence of a NAT device (useful if the OS
           captures all ESP traffic)
      o    cisco-udp -- Cisco proprietary UDP  encapsulation,
           commonly over Port 10000

I enabled force-natt mode, which encapsulates the ESP packet in a UDP packet, normally to get past NAT, and it started working! In retrospect, I should have been able to figure that out much easier. First, it pretty much says it on the vpnc homepage: “Solaris (7 works, 9 only with –natt-mode forced).” I didn’t even notice that. Second, I should have realized that I was behind a NAT at home and not at work, so they would be using a different NAT-traversal mode by default. Oh well, it was a good diagnostic exercise, hence the post to share the experience.

In other vpnc related news, I’ve ported Kazuyoshi’s patch to the open_tun and solaris_close_tun functions of OpenVPN to the tun_open and tun_close functions of vpnc. His sets up the tunnel interface a little bit differently and adds TAP support. It solves the random problems vpnc had with bringing up the tunnel interface such as:

# ifconfig tun0
tun0: flags=10010008d0<POINTOPOINT,RUNNING,NOARP,MULTICAST,IPv4,FIXEDMTU> mtu 1412 index 8
        inet 128.164.xxx.yy --> 128.164.xxx.yy netmask ffffffff 
        ether f:ea:1:ff:ff:ff
# ifconfig tun0 up
ifconfig: setifflags: SIOCSLIFFLAGS: tun0: no such interface
# dmesg | grep tun0
Jul 23 14:56:05 swan ip: [ID 728316 kern.error] tun0: DL_BIND_REQ failed: DL_OUTSTATE

The changes are in the latest vpnc package available from my package repository.

VPNC for OpenSolaris

I’ve compiled VPNC and the requisite TUN/TAP driver for OpenSolaris so that I can access my work network from home. Kazuyoshi’s driver adds TAP functionality to the original TUN driver which hasn’t been updated in nine years. It’s a real testament to the stability of the OpenSolaris kernel ABI that the module still compiles, loads, and works properly.

All of the software can be installed from my repository onto build 111 or higher:

$ pfexec pkg set-publisher -O http://pkg.thestaticvoid.com/ thestaticvoid
$ pfexec pkg install vpnc

The tun driver should load automatically and create /dev/tun. Now create a VPN profile configuration in /etc/vpnc/. The configuration contains a lot of private information so I’m not going to share mine here, but /etc/vpnc/default.conf is a good start.

One thing I do like to do is make sure only certain subnets are tunneled through the VPN. That way connecting to the VPN doesn’t interrupt any connections that are already established (for example, AIM). To do that I create a script /etc/vpnc/gwu-networks-script containing

#!/bin/sh

# Only tunnel GWU networks through VPN
CISCO_SPLIT_INC=2
CISCO_SPLIT_INC_0_ADDR=161.253.0.0
CISCO_SPLIT_INC_0_MASK=255.255.0.0
CISCO_SPLIT_INC_0_MASKLEN=16
CISCO_SPLIT_INC_0_PROTOCOL=0
CISCO_SPLIT_INC_0_SPORT=0
CISCO_SPLIT_INC_0_DPORT=0
CISCO_SPLIT_INC_1_ADDR=128.164.0.0
CISCO_SPLIT_INC_1_MASK=255.255.0.0
CISCO_SPLIT_INC_1_MASKLEN=16
CISCO_SPLIT_INC_1_PROTOCOL=0
CISCO_SPLIT_INC_1_SPORT=0
CISCO_SPLIT_INC_1_DPORT=0

. /etc/vpnc/vpnc-script

then add Script /etc/vpnc/gwu-networks-script to the end of my VPN profile configuration.

Connecting to the VPN you should see messages like:

$ pfexec vpnc gwu
Enter password for jameslee@<no>: 
which: no ip in (/sbin:/usr/sbin:/usr/gnu/bin:/usr/bin:/usr/sbin:/sbin)
which: no ip in (/sbin:/usr/sbin:/usr/gnu/bin:/usr/bin:/usr/sbin:/sbin)
add net 128.164.<no>: gateway 128.164.<no>
add host 128.164.<no>: gateway 161.253.<no>
add net 161.253.0.0: gateway 128.164.<no>
add net 128.164.0.0: gateway 128.164.<no>
add net 128.164.<no>: gateway 128.164.<no>
add net 128.164.<no>: gateway 128.164.<no>
VPNC started in background (pid: 594)...

The vpnc-script will modify your /etc/resolv.conf and routing tables so be sure to run vpnc-disconnect when you are done with the connection to restore the original configuration.

Thanks to the good folks at OpenConnect for a well-maintained vpnc-script which works on Solaris. Spec files for these packages are available from my GitHub repository if you want to roll your own.

MusicBrainz Picard

MusicBrainz along with the Picard tagger is without a doubt the best way to organize and manage large collections of music. The tagger will fingerprint audio files and automatically correct their metadata and filenames.

I’ve been using MusicBrainz since 2005, and even attempted to write my own tagger for it in Java back when Picard didn’t exist. When I switched to OpenSolaris, it was one of the programs I missed the most. So I went about building a package for it.

Unfortunately, the software has a lot of complicated dependencies such as Qt and FFmpeg which aren’t included in OpenSolaris either. FFmpeg I can understand; it infringes on countless software patents <insert rant here>. But Qt? There’s no reason for that. It is easily the second most popular graphics toolkit for Unix. Sure, the Solaris KDE guys have a build of it, but it installs to a non-standard prefix and doesn’t include 64-bit libs. No thank you.

Anyway, the package and its dependencies are up on my package repository for b132 and later. You know the deal…pfexec pkg install picard. Spec files are, as always, available from my GitHub repository.

Now that I have a good start on the FFmpeg package, I’m going to keep working on it, adding support for more codecs and eventually build MPlayer so I can stop using this guy’s less-than-ideal build.

EDIT: Just FYI, in order to get nice antialiased fonts in Qt applications, I had to modify the fontconfig settings. This is not necessary for GTK+ applications because they get their settings from the gnome-appearance-properties dialog. So in ~/.fonts.conf add:

<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
<!--  Use the Antialiasing -->
  <match target="font">
    <edit name="antialias" mode="assign"><bool>true</bool></edit>
  </match>
</fontconfig>

Other Qt appearance settings can be changed from the qtconfig dialog.

Music Player Daemon on OpenSolaris

MPD is essential software for me. It’s one of the few music players out there for Unix that does gapless playback and ReplayGain. It’s also nice that, because it’s a daemon, I’m not bound to any particular interface. Fortunately, there is a really good one in the form of Sonata.

MPD is not included in OpenSolaris yet, so last weekend I built some packages for it. The build has been stable for me and I’m happy with the state of the packages so I thought I’d share them. First add my package repository:

$ pfexec pkg set-authority -O http://pkg.thestaticvoid.com/ thestaticvoid

MPD

This package and its dependencies require OpenSolaris 2009.06 or newer. Install it by typing pfexec pkg install mpd. The following formats are supported:

$ mpd -V
...
Supported decoders:
[mad] mp3 mp2
[vorbis] ogg oga
[oggflac] ogg oga
[flac] flac
[audiofile] wav au aiff aif
[faad] aac
[mp4] m4a mp4
[mpcdec] mpc
[wavpack] wv

Supported outputs:
shout null fifo ao solaris httpd 

Supported protocols:
file:// http://

I plan on adding ffmpeg support soon which will add support for even more codecs.

To run MPD, create a configuration file in your home directory like

port                    "6600"
music_directory         "~/music"
playlist_directory      "~/.mpd/playlists"
db_file                 "~/.mpd/mpd.db"
log_file                "~/.mpd/mpd.log"

Create any directories from the configuration file that don’t exist, such as ~/.mpd/playlists and start the daemon by running mpd ~/.mpdconf as your user. It will immediately build a library of your music.

Alternatively, mpd can be run system-wide, which just seems more appropriate to me for whatever reason. The only complicated part about this is that you have to give MPD permission to write to the audio device. Edit /etc/logindevperms, find the /dev/sound/* lines and change the mode to 0666 so that they look like:

/dev/console    0666    /dev/sound/*        # audio devices
/dev/vt/active  0666    /dev/sound/*        # audio devices

Logout and log back in for the settings to take effect. Then modify /etc/mpd.conf to your liking and start the daemon by typing svcadm enable mpd. You may have to svcadm refresh manifest-import for SMF to load the mpd manifest.

mpdscribble

I also built a package for mpdscribble which is a mature, well-maintained scrobbler for Last.fm. Install it by typing pfexec pkg install mpdscribble. Set your Last.fm or Libre.fm username and password in /etc/mpdscribble.conf and start the daemon with svcadm enable mpdscribble. That’s all there is to it.

Sonata

Sonata is a lightweight cilent for MPD. Looks pretty nice too:

Sonata

Because Sonata requires Python 2.5, and OpenSolaris 2009.06 only really supports Python 2.3, this package requires build 127 or newer. Install it by typing pfexec pkg install sonata. It can be launched from the Applications->Sound & Video menu.