Mounting Encrypted ZFS Datasets at Boot

When ZFS encryption was released in Solaris 11 Express, I went out and bought four 2 TB drives and moved all of my data to a fresh, fully-encrypted zpool. I don’t keep a lot of sensitive data, but it brings me peace of mind to know that, in the event of theft or worse, my data is secure.

I chose to protect the data keys using a passphrase as opposed to using a raw key on disk. In my opinion, the only safe key is one that’s inside your head (though the US v. Fricosu case has me reevaluating that). The downside is that Solaris will ignore passphrase-encrypted datasets at boot.

The thing is, I run several services that depend on the data stored in my encrypted ZFS datasets. When Solaris doesn’t mount those filesystems at boot, those services fail to start or come up in very weird states that I must recover from manually. I would rather pause the boot process to wait for me to supply the passphrase so those services come up properly. Fortunately this is possible with SMF!

All of the services I am concerned about depend, in one way or another, on the svc:/system/filesystem/local:default service, which is responsible for mounting all of the filesystems. That service, in turn, depends on the single-user milestone. So I just need to inject my own service between the single-user milestone and system/filesystem/local, one that fails when the keys aren’t loaded. That failure will pause the boot process until it is cleared.

I wrote a simple manifest that expresses the dependencies between single-user and system/filesystem/local:

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">

<service_bundle type='manifest' name='nest'>

<service
   name='system/filesystem/nest'
   type='service'
   version='1'>

    <create_default_instance enabled='true' />
    <single_instance />

    <dependency
       name='single-user'
       grouping='require_all'
       restart_on='none'
       type='service'>
        <service_fmri value='svc:/milestone/single-user' />
    </dependency>

    <dependent
       name='nest-local'
       grouping='require_all'
       restart_on='none'>
        <service_fmri value='svc:/system/filesystem/local' />
    </dependent>

    <exec_method
       type='method'
       name='start'
       exec='/lib/svc/method/nest start'
       timeout_seconds='60' />

    <exec_method
       type='method'
       name='stop'
       exec=':true'
       timeout_seconds='60' />

    <property_group name='startd' type='framework'>
        <propval name='duration' type='astring' value='transient' />
    </property_group>

    <stability value='Unstable' />

    <template>
        <common_name>
            <loctext xml:lang='C'>Load key for 'nest' zpool</loctext>
        </common_name>
    </template>
</service>

</service_bundle>

and a script at /lib/svc/method/nest that gets called by SMF:

#!/sbin/sh

. /lib/svc/share/smf_include.sh

case "$1" in
    'start')
        # Fail (and thereby pause boot) if the wrapping key for the 'nest' pool is not loaded.
        if [ "$(zfs get -H -o value keystatus nest)" != "available" ]; then
            echo "Run '/usr/sbin/zfs key -lr nest && /usr/sbin/svcadm clear $SMF_FMRI'" | smf_console
            exit $SMF_EXIT_ERR_FATAL
        fi
        ;;

    *)
        echo "Usage: $0 start"
        exit $SMF_EXIT_ERR_CONFIG
        ;;
esac

exit $SMF_EXIT_OK
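
Getting the service wired in is the usual SMF routine: make the method script executable, then import the manifest. The manifest path below is simply where I chose to save mine:

# chmod +x /lib/svc/method/nest
# svccfg import /var/svc/manifest/system/filesystem/nest.xml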

The script checks whether the keys are available, and if not, prints a helpful hint to the console. The whole thing looks something like this at boot:

SunOS Release 5.11 Version 11.0 64-bit
Copyright (c) 1983, 2011, Oracle and/or its affiliates. All rights reserved.
Hostname: falcon

Run '/usr/sbin/zfs key -lr nest && /usr/sbin/svcadm clear svc:/system/filesystem/nest:default'
May 30 14:31:06 svc.startd[11]: svc:/system/filesystem/nest:default: Method "/lib/svc/method/nest start" failed with exit status 95.
May 30 14:31:06 svc.startd[11]: system/filesystem/nest:default failed fatally: transitioned to maintenance (see 'svcs -xv' for details)

falcon console login: jlee
Password: 
falcon% sudo -s
falcon# /usr/sbin/zfs key -lr nest && /usr/sbin/svcadm clear svc:/system/filesystem/nest:default
Enter passphrase for 'nest': 
falcon#

When I get to the console shell, I can just copy and paste the command printed by the script. Once the service failure is cleared, SMF continues the boot process normally and all of my other services come up exactly as I’d expect.

No, it’s not very pretty, but given how infrequently I reboot, I’d rather put up with a little manual intervention during the boot process than have to clean up after services that come up without their dependencies satisfied. And with my new homemade LOM, it’s not too much trouble to run commands at the console, even remotely.

My Homemade LOM

In one of the final classes of my CS master’s program, Embedded Computing, we were required to complete a semester project of our choosing involving embedded systems. Like in previous semester projects, I wanted to do something that I would actually be able to use after the class ended.

This time around I chose to build a lights-out manager for my Sun Ultra 24 server (which this blog is hosted on). With a LOM I can control the system’s power and access its serial console so I will be able to perform OS updates remotely, among other things.

Since I already owned one and I didn’t want to spend a lot of money, I chose to develop the project on top of the Arduino platform. I like the size of the Arduino and the availability of different shields to minimize soldering. It can also be powered over USB, which is perfect because the Ultra 24 has an internal USB port that always supplies power.

To make things a little more challenging for myself (because the Arduino is pretty easy on its own), I chose to implement a hardware UART to communicate with the Ultra 24’s serial port. Specifically, I chose to use the Maxim MAX3110E SPI UART and RS-232 transceiver. Great little chip.

For communication with the outside world, I bought an Ethernet shield from Freetronics. It’s compatible with the official Arduino Ethernet shield, but includes a fix to allow the network module to work with other SPI devices (such as my UART) on the same bus. I started to implement the network UI using Telnet, but after realizing I would have to translate the serial console data from VT100 to NVT, I switched to Rlogin, which is like Telnet, but assumes like-to-like terminal types.

Lastly, for controlling the system’s power, I figured out how to tap into the Ultra 24’s power LED and switch. Using the LED, I can check whether the system is on or off, and using the switch circuit and a transistor, I can power the system on and off. I managed to do this without affecting the operation of the front panel buttons/LEDs.

I’ll spare you all of the implementation details (if you’re interested, you can read my report). Suffice it to say, the thing works as well as I could have imagined. Here is a screenshot of me using the serial console on my workstation:

From my research, the serial and power motherboard headers are the same on most modern Intel systems, so this LOM should work on more than just an Ultra 24. If you want to build one of your own, my code is available on GitHub and the hardware schematic is in the report I linked to above.

ITT I Try to Justify Upgrading My Camera

I’ve said it before, and I’ll say it again: I am not a photographer. I am just a guy who enjoys taking high quality photos of the places I go and the things I enjoy doing. A couple years ago I picked up a Canon EOS Rebel T2i, a sharp 15-85mm f/3.5-5.6 zoom lens, and everyone’s favorite, the 50mm f/1.4, before a two-week trip to Japan. In the time since, I’ve taken thousands of photos and really enjoyed learning the technology and different photographic techniques. The T2i is really impressive, and is still a better camera than I am a photographer. That said, I’ve always known that I would want to upgrade at some point. The T2i is too small for my large hands to hold securely or comfortably, and I am not really fond of its plasticky build.

Meanwhile, I went through a couple hundred photos from a recent trip to Belgium with my girlfriend, and though I really like what I was able to capture, and I’m satisfied with the way they turned out, I can’t help but notice a distinct point-and-shoot quality about them. Obviously they can’t all be winners, but in the spur of the moment, it’s all too easy for me to let my zoom lens do all the work at the expense of good composition.

Looking back at my photos, I noticed that I like the ones I’ve taken with my 50mm lens the most. The fixed focal length forces me to really think about how I want the photo to look, and I have to move my feet in order to get it. Its large aperture allows me to take photos at a lower ISO level in low light, which means less sensor noise. And its shallow depth-of-field potential stimulates my creativity. I think Kai from DigitalRev explains it best:

However, with my T2i’s APS-C-sized sensor, the field-of-view of the 50mm lens is more like an 80mm lens on a full-frame sensor, which is really narrow for indoor shooting. I would have to stand all the way across the room to get anything more than just a headshot. So I started to look at what equivalent “normal” lenses I could get for my T2i.

I settled on the Canon EF 28mm f/1.8, and I’m generally very happy with it. It’s sharp, fast, and well made. However, the difference in the depth-of-field between the 50mm and 28mm is very noticeable, even at f/1.8. It’s much harder to get that nicely blurred background unless you’re within a couple feet of your subject with the 28mm. That is just the nature of wider angle lenses.

Meanwhile, Canon just released the brand-new full-frame EOS 5D Mark III and prices for the three-year-old 5D Mark II are dropping. I never would have considered getting a full-frame camera before (that’s just silly—I’m an amateur and full-frame cameras are for the pros, right?), but the prices are not much more than the 7D now. The 5D Mark II is still a great camera. One of the few things people complain about is its poor auto-focus performance. Fortunately, I learned early to use center-point and back-button focus, and I don’t shoot sports, so I couldn’t care less about auto-focus performance.

The 5D Mark II is better in almost every way compared to my T2i. It’s very well built and will fit my hands, so I’ll enjoy holding and using it. And as a bonus, the full-frame sensor will enable me to get the depth-of-field I’m used to on my 50mm lens with the field-of-view similar to my 28mm, which can open a whole new world of possibilities for me.

All of this is just to say: I think I’m going to upgrade to a 5D Mark II. I think now is the time. I am serious about improving my photography and I think sticking with only prime lenses for a while will help. It takes a huge variable out of the equation (focal length) so hopefully I can concentrate on the more important things. In fact, I’ve already sold my zoom lens. Between the money from the sale of my lens, the money I should be able to get from selling my T2i, and some credit card cash back, I will be able to pick up the 5D Mark II for a good price. It will be an early graduation present to myself. And if it turns out not to be right for me, photo gear keeps its value pretty well, so I can always sell it.

I’m not crazy for wanting to upgrade, am I?

UPDATE 05/09/2012: I managed to pick up a factory refurbished 5D Mark II during Canon’s friends and family sale for a ridiculously good price ($1596 after taxes and shipping). Canon’s refurbs, if you don’t know, are like new and mine was no exception. I sold my T2i and 15-85mm lens for $1000 after eBay and Paypal took their cut, and I allowed myself to buy a refurbed EF 100mm f/2 USM lens and Speedlite 580EX II during the same Canon sale.

And the result? Well, I feel like I’m already starting to make some good improvements. Shooting with primes often forces me to think more creatively about composition and perspective to get the shot I want. Then taking the shots into Lightroom helps me do minor white balance and color corrections to really make them pop. And finally, the flash is just a lot of fun.

 

Setting Up My Retirement Investments with TIAA-CREF

This is a long post mostly so I can look back and remember what I did, but I’m posting it publicly in case anyone is in a similar position and could benefit from my research.

About a year ago I became eligible for my employer’s retirement plan, and at the time I was completely overwhelmed by this whole new world of mutual fund investments. Not only was I given a large selection of funds to choose from, but I was also given the choice of providers: Fidelity and TIAA-CREF. I did a little research, but couldn’t really decide what to do, so I just selected the defaults, Fidelity with the 2050 target fund, and let it sit.

Then, over the Thanksgiving weekend, I started reading the long-term investment thread at Something Awful where it quickly became apparent that (1.) choosing assets to invest in doesn’t have to be that hard, (2.) expense ratios (what a fund costs) matter a lot, and (3.) high-cost actively managed funds in general don’t perform better than their lower-cost index-based equivalents.

So I took another look at what was available to me in Fidelity, and I didn’t like what I saw: two stock index funds, both tracking the S&P 500; one REIT index fund; and one bond index fund. The rest of the funds available were actively managed, with expenses around 0.8% or more. Even the few Vanguard funds available to me through Fidelity seemed obscure and expensive.

Then I remembered TIAA-CREF. I pulled up a list of their offerings, and it made me a little more optimistic. Not only are their options cheaper overall, but they also have index funds available for more market segments. Really, the worst thing that could be said after an initial look is that they could have more international representation, but it would turn out that, for me, it doesn’t really matter.

I started reading TIAA-CREF’s literature and taking their asset allocation (AA) quizzes, you know the ones. Well, considering I’m only 25 and I have a good 40 years until retirement, I would consider myself more willing to take on risk than someone a little older. Their quiz would have me put 86% into equities, 9% in real estate, and 5% in bonds. Why 9% real estate? Why not ten, or eight? And can 5% of your portfolio really affect anything?

At this point, I should note that TIAA’s Real Estate Account (TREA) is a unique investment vehicle among providers in that it invests directly in commercial real estate, rather than in companies that manage real estate, as the riskier REIT funds do. There is really nothing else like it, and that made it hard to reconcile with other popular AA tips I found on the internet, such as having a simple three-fund portfolio. I wanted to know whether I should include real estate, whether I should follow TIAA-CREF’s advice for AA, why they chose the numbers they did, and also why the Bogleheads advocate slightly more conservative allocations. I may be young, but I don’t want to turn the risk up to 11 just because I can. I want managed risk that I can understand.

So I picked up a copy of The Intelligent Asset Allocator by William Bernstein, hoping it would shed some light on my questions. To be honest, I was hoping it would have The Answer in it, that it would point me to The Optimal Asset Allocation. Thankfully, it did a whole lot better than that. It stated in no uncertain terms that there is no such thing as an optimal asset allocation (except in retrospect), and anyone claiming to have one is conning you. The book started off by providing useful metrics for measuring the performance (in terms of annual return) and risk (in terms of standard deviation) of different asset classes. It was an easy, quick read that gave me a few techniques for understanding the behavior of different asset classes, and how they’ve historically interacted with each other in portfolios. Best of all, it gave me the confidence to tackle the same sort of research on my own.


The Solaris 11 Experience So Far

I have a system (a zone on which this blog is hosted) that has been running the same installation of Solaris since 11/11/2009, starting with OpenSolaris 2009.06. In the time since, it has seen every public build of OpenSolaris, then OpenIndiana, and finally Solaris 11 Express. Now, exactly two years later, I’ve updated it to Solaris 11 11/11, and I’d like to share my experience so far.

The update itself did not go smoothly. I was sitting at Solaris 11 Express SRU 8 and thought, like every update I’ve done in the past, that I could just run pkg image-update. Silly me, because when I did and then rebooted, the kernel panicked. No big deal, that’s what boot environments are for. I reverted to the previous boot environment and found some helpful documentation that told me to do exactly what I just did. It turns out that there is no way to update to SRU 13 using the support repositories because they already contain the Solaris 11 11/11 packages, and pkg tries to pull some of them in. And there is no way to update just pkg because the ips-consolidation prevents it, and trying to update the ips-consolidation pulls the entire package which breaks everything just the same. In short, Oracle bungled it. The only way to update to SRU 13 that I could see was to download the SRU 13 repository ISO from My Oracle Support and set up a local repository. Once I was on SRU 13, I could continue with the update to the 11/11 release. But there were more surprises in store for me.
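
For anyone stuck in the same spot, the local-repository detour went roughly like this. The ISO name and mount point are placeholders, and the path to the repository inside the ISO may differ depending on the image My Oracle Support hands you:

# mount -F hsfs $(lofiadm -a /export/isos/sol-11exp-sru13-repo.iso) /mnt
# pkg set-publisher -G '*' -g file:///mnt/repo solaris
# pkg image-update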

First, it looks like pkg decided to start enforcing consistent attributes on files shared by multiple packages. Fine, I can understand that. As a result, I had to remove a lot of my custom packages (mostly from spec-files-extra) which I’ll have to rebuild. Second, pkg decided it doesn’t like the opensolaris.org packages anymore so I had to uninstall OpenOffice.org. Also fair enough.

Happily, after that, the updates got applied successfully and the system rebooted into the 11/11 release. Next came the zone updates. When I did the normal zoneadm -z foo detach && zoneadm -z foo attach -u deal, I was told I had to convert my zones to a new ZFS structure which more closely matches the global zone. The script /usr/lib/brand/shared/dsconvert actually worked flawlessly and the updated zones came up fine.

Unfortunately I couldn’t SSH into my zones because my DNS server didn’t know where they were. It seems that with the updated networking framework, DHCP doesn’t request a hostname anymore. (/etc/default/dhcpagent still says inet <hostname> can be put in /etc/hostname.<if> to request the hostname.) I found that you can create an addr object that requests a hostname with ipadm create-addr -T dhcp -h <hostname> <addrobj>, but NWAM pretty much won’t let you create or modify anything with ipadm, and there were no options for requesting hostnames with nwamcfg. As a result, I had to disable NWAM (netadm enable -p ncp DefaultFixed) and then I could set up the interface with ipadm. Why doesn’t Solaris request hostnames by default? Not very “cloud-like” if you ask me.
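
For the record, the working sequence on my system boiled down to something like this, with net0 standing in for whatever your interface is called:

# netadm enable -p ncp DefaultFixed
# ipadm create-ip net0
# ipadm create-addr -T dhcp -h falcon net0/v4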

I have to say, I’m impressed by the way global zones and non-global zones are linked in the new release. Zone updates were an obvious shortcoming of previous releases. We’ll see how well it works when Solaris 11 Update 1 comes out.

What else…I lost my ability to pfexec to root. Oracle removed the “Primary Administrator” profile for security reasons so I had to install sudo. Not a big deal, I just wish they had said something a little louder about it.

Also, whatever update to pkg happened, it wiped out my repositories under /var/pkg. I had to restore them from a snapshot. Bad Oracle!

I’m also a little confused about some of the changes to the way networking settings are stored. For example, when I first booted the global zone, I found that my NFSv4 domain name was reset by NWAM. I set it to what it should be with sharectl set -p nfsmapid_domain=thestaticvoid.com nfs, but is that going to be overwritten again by NWAM? Also, the name resolver settings are now stored in the svc:/network/dns/client:default service, and according to the documentation, DHCP will set the service properties properly, but I have yet to see this work.
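
If DHCP never does fill them in, the resolver settings can at least be pinned by hand on that service. As I understand the documentation, the relevant properties are config/nameserver and config/search, so something along these lines should work (the address and domain are just examples; double-check the property names against your dns/client manifest):

# svccfg -s network/dns/client setprop config/nameserver = net_address: 192.168.1.1
# svccfg -s network/dns/client setprop config/search = astring: thestaticvoid.com
# svcadm refresh network/dns/client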

And the last problem I’ll mention is that the update removed my virtual consoles. I had to install the virtual-console package to restore them.

Overall, I’m happy that I was at least able to update to the latest release. Oracle could have cut off any update path from OpenSolaris. However, the update should have been a lot smoother. It doesn’t speak well of future updates when I can’t even update from one supported release (SRU 8) to another. I also wish Oracle were more open about upcoming changes (as in, having more preview releases or, dare I say it, opening development the way OpenSolaris was). Even to me, a long time pre-Solaris 11 user, the changes to zones and networking are huge in this release, and I would rather have not been so surprised by them.

Wireless 802.1X Support in Solaris

The George Washington University (where I work and study) has recently implemented 802.1X to secure its wireless networks. 802.1X defines support for EAP over Ethernet (including wireless) and the WPA standards define several modes of EAP that can be used.

Solaris (I’m referring to version 11, OpenSolaris, OpenIndiana, and Illumos) supports WPA. Sun modified an early version of wpa_supplicant and called it “wpad”. However, they seemed to make a point of stripping out all of the EAP support in wpad.

So when my Network Security instructor said we had to do a term project of our choosing relating to network security, I decided I’d try to get 802.1X working in Solaris. To do this, I decided I could either add the EAP bits back into wpad, or add the Solaris-specific bits to the latest version of wpa_supplicant. wpad is based on very old code. It’s not even clear which version of wpa_supplicant it is based on, and there is no record of the massive amount of changes they made. It would be too hard for me to figure out where to plug EAP back in, and who knows how many bugs and security vulnerabilities were fixed upstream that we’d be missing out on.

Fortunately, wpa_supplicant is very modular, and reasonably well documented. I was able to graft the older Solaris code onto the newer interfaces. The result of my work is currently maintained in my own branch at GitHub. It’s not perfect, but it works (and I’ll explain how). Solaris has a very limited public API for wireless support and my goal was to get wpa_supplicant working without having to modify any system libraries or the kernel. I struggled to figure out some idiosyncrasies such as:

  • Events (association, disassociation, etc.) are only sent to wpa_supplicant when WPA is enabled in the driver.
  • Full scan results are only available when WPA is disabled in the driver.
  • Scan results don’t provide nearly as much information as their Linux counterparts do, such as access point capabilities, signal strength, noise levels, etc. I was very worried I wouldn’t be able to fill out the scan results structure fully and wpa_supplicant would refuse to work without complete information.

Here is how you can get 802.1X support working on your Solaris laptop:

  1. Install the wpa_supplicant package from my package repository:
    # pkg set-publisher -p http://pkg.thestaticvoid.com/
    # pkg install wpa_supplicant
    
  2. Add the configuration for your protected wireless networks to /etc/wpa_supplicant.conf. Here is mine:

    ctrl_interface=/var/run/wpa_supplicant
    ctrl_interface_group=0
    ap_scan=0

    network={
        ssid="prey"
        key_mgmt=WPA-PSK
        psk="<network key>"
    }

    network={
        ssid="GW1X"
        key_mgmt=WPA-EAP
        eap=TTLS
        identity="jameslee"
        anonymous_identity="anonymous"
        password="<personal password>"
        phase2="auth=PAP"
    }

    The most important thing here is ap_scan=0. This tells wpa_supplicant not to do any scanning or association of its own. Those tasks will be handled by dladm and NWAM.

  3. Back up /usr/lib/inet/wpad and replace it with this script:

    #!/bin/sh

    # Pull the interface name out of the arguments that dladm/NWAM pass to wpad
    # (for example, "-i iwh0"), then hand off to the real wpa_supplicant.
    interface=`echo $@ | /usr/bin/sed 's/.*-i *\([a-z0-9]*\).*/\1/'`
    exec /usr/sbin/wpa_supplicant -Dsolaris -i$interface -c/etc/wpa_supplicant.conf -s &
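
    In practice, that swap looks something like the following. I keep the original binary around in case I ever want to revert, and wpad-wrapper.sh is just whatever you saved the script above as:

    # cp -p /usr/lib/inet/wpad /usr/lib/inet/wpad.orig
    # cp wpad-wrapper.sh /usr/lib/inet/wpad
    # chmod +x /usr/lib/inet/wpad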

Now connect to a wireless network with NWAM or dladm. When prompted for a network key, enter anything; it won’t be used. The actual keys will be looked up in /etc/wpa_supplicant.conf. Here is an example of me connecting to my 802.1X-secured network using dladm:

# dladm connect-wifi -e GW1X -s wpa -k nwam-GW1X iwh0
# dladm show-wifi
LINK       STATUS            ESSID               SEC    STRENGTH   MODE   SPEED
iwh0       connected         GW1X                wpa    excellent  g      54Mb

“-k nwam-GW1X” refers to a dummy key set up by NWAM. dladm will complain if it’s not supplied a key.

That should be it!

Future Directions

Obviously, the integration of wpa_supplicant and NWAM/dladm leaves a lot to be desired. If there is sufficient interest, I will start looking into how to modify the dladm security framework in Illumos to include EAP-related configuration (keys, certificates, identities; it’s all much more complicated than the single pre-shared key that dladm supports now). My hope, though, is that Oracle is already working on this. Do you hear that, Oracle?

Automounting NFSv4 over SSH

For the past couple of years, I’ve used SSHFS to access my fileserver remotely (mostly from work). It’s always been pretty slow and it isn’t very stable on Solaris, so I’ve switched to NFSv4 over SSH. My biggest hang-up with NFS was how to secure it over the internet. Its Kerberos support is completely overkill for my needs, and I never really wanted to deal with the complications of scripting the setup of an SSH tunnel, either. It all seemed so fragile.

Then I discovered autossh which does all the work of setting up and maintaining the tunnel for me. I coupled that with an executable autofs map to automatically start the tunnel just before trying to mount a share, like:

#!/bin/bash

export AUTOSSH_PIDFILE=/var/run/falcon-tunnel.pid
export AUTOSSH_GATETIME=0
export AUTOSSH_DEBUG=1

# If autossh is already running, poke it so it checks the tunnel right away;
# otherwise start a tunnel forwarding local port 2050 to NFS (2049) on falcon.
if [ -f $AUTOSSH_PIDFILE ]; then
    kill -HUP $(cat $AUTOSSH_PIDFILE)
else
    autossh -f -M 0 -o ServerAliveInterval=5 -NL 2050:localhost:2049 jlee@falcon
fi

# The key passed in by autofs ($1) selects which dataset to mount through the tunnel.
echo "-fstype=nfs4,port=2050 localhost:/nest/$1"

Using an executable autofs map allows me to avoid reconciling the differences between service managers like SMF and Upstart, offering a consistent way to start the tunnel exactly when it’s needed on both Solaris and Linux. When you ‘cd’ into a directory managed by autofs, autossh is started or woken up, and then the share is mounted over the tunnel. If there is a network interruption or change (from wired to wireless, for example), ssh will give up after about 15 seconds without a response from the server and autossh will restart it. NFS is smart enough to resume its operation when the tunnel is reestablished.
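
For reference, the executable map itself is hooked in with an ordinary auto_master entry; mine is along these lines (the mount point and map path are my own choices, and the map script must have its execute bit set):

# /etc/auto_master on Solaris, /etc/auto.master with autofs on Linux
/falcon    /etc/auto_falcon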

autossh has built-in support for heartbeat monitoring, but I’ve found SSH’s built-in ServerAliveInterval feature to be more reliable.

With this setup I have very simple, robust, and secure remote access to my fileserver.

Using Nitrogen as a Library Under Yaws

Motivation

I’ve been working on a project off and on for the past year which uses the Spring Framework extensively. I love Spring for how easy it makes web development, from wiring up various persistence and validation libraries, to dependency injection, and brainless security and model-view-controller functionality. However, as the project has grown, I’ve become more and more frustrated with one aspect of Spring and Java web development in general: performance and resource usage. It’s so bad, I’ve pretty much stopped working on it altogether. Between Eclipse and Tomcat, you’ve already spent over 2 GB of memory, and every time you make a source code change, Tomcat has to reload the application which takes up to 30 seconds on my system, if it doesn’t crash first. This doesn’t suit my development style of making and testing lots of small, incremental changes.

So rather than buy a whole new computer, I’ve started to look for a new lightweight web framework to convert the project to. I really like Erlang and have wanted to write something big in it for a while, so when I found the Nitrogen Web Framework, I thought this might be my opportunity to do so. Erlang is designed for performance and fault-tolerance and has a great standard library in OTP, including a distributed database, mnesia, which should eliminate my need for an object-relational mapper (it stores Erlang terms directly) and enable me to make my application highly available in the future without much fuss. Nitrogen has the added benefit of simplifying some of the fancy things I wanted to do with AJAX but found too difficult with Spring MVC.

The thing I don’t like about Nitrogen is that it is designed to deliver a complete, stand-alone application with a built-in web server of your choosing and a copy of the entire Erlang runtime. This seems to be The Erlang/OTP Way of doing things, but it seems very foreign to me. I already have Erlang installed system-wide and a web server, Yaws, that I have a lot of time invested in. I’d rather use Nitrogen as a library in my application under Yaws just like I was using Spring as a library in my application under Tomcat.

Procedures

I start my new project with Rebar:

$ mkdir test && cd test
$ wget https://bitbucket.org/basho/rebar/downloads/rebar && chmod +x rebar
$ ./rebar create-app appid=test
==> test (create-app)
Writing src/test.app.src
Writing src/test_app.erl
Writing src/test_sup.erl
$ mkdir static include templates  # These directories will be used later

Now I define my project’s dependencies in rebar.config in the same directory:

{deps, [
    {nitrogen_core, "2.1.*", {git, "git://github.com/nitrogen/nitrogen_core.git", "HEAD"}},
    {nprocreg, "0.2.*", {git, "git://github.com/nitrogen/nprocreg.git", "HEAD"}},
    {simple_bridge, "1.2.*", {git, "git://github.com/nitrogen/simple_bridge.git", "HEAD"}},
    {sync, "0.1.*", {git, "git://github.com/rklophaus/sync.git", "HEAD"}}
]}.

These dependencies are taken from Nitrogen’s rebar.config. Next I write a Makefile to simplify common tasks:

default: compile static/nitrogen

get-deps:
        ./rebar get-deps

include/basedir.hrl:
        echo '-define(BASEDIR, "$(PWD)").' > include/basedir.hrl

static/nitrogen:
        ln -sf ../deps/nitrogen_core/www static/nitrogen

compile: include/basedir.hrl get-deps
        ./rebar compile

clean:
        -rm -f static/nitrogen include/basedir.hrl
        ./rebar delete-deps
        ./rebar clean

distclean: clean
        -rm -rf deps ebin

I expect I’ll be tweaking this Makefile some more in the future, but it demonstrates the absolute minimum to compile the application. When I run make, four things happen the first time:

  1. BASEDIR is defined as the current directory in include/basedir.hrl. We’ll use this later.
  2. All of the Nitrogen dependencies are pulled from Git to the deps directory.
  3. All of the code is compiled.
  4. The static content from Nitrogen (mostly Javascript files) is symlinked into our static content directory.

Next I prepare the code for running under Yaws. First I create the Nitrogen appmod in src/test_yaws.erl:

-module(test_yaws).
-export ([out/1]).

out(Arg) ->
    RequestBridge = simple_bridge:make_request(yaws_request_bridge, Arg),
    ResponseBridge = simple_bridge:make_response(yaws_response_bridge, Arg),
    nitrogen:init_request(RequestBridge, ResponseBridge),
    nitrogen:run().

This is taken from the Nitrogen repository. I also modify the init/0 function in src/test_sup.erl to start the nprocreg application, similar to how it is done in Nitrogen proper:

init([]) ->
    application:start(nprocreg),
    {ok, { {one_for_one, 5, 10}, []} }.

Lastly, I add a function to src/test_app.erl which can be used by Yaws to start the application:

-export([start/0]).

start() ->
    application:start(test).

One other thing that I do before loading the application up in Yaws is create a sample page, src/index.erl. This is downloaded from Nitrogen:

-module (index).
-compile(export_all).
-include_lib("nitrogen_core/include/wf.hrl").
-include("basedir.hrl").

main() -> #template { file=?BASEDIR ++ "/templates/bare.html" }.

title() -> "Welcome to Nitrogen".

body() ->
    #container_12 { body=[
        #grid_8 { alpha=true, prefix=2, suffix=2, omega=true, body=inner_body() }
    ]}.

inner_body() ->
    [
        #h1 { text="Welcome to Nitrogen" },
        #p{},
        "
If you can see this page, then your Nitrogen server is up and
running. Click the button below to test postbacks.
"
,
        #p{},
        #button { id=button, text="Click me!", postback=click },
        #p{},
        "
Run <b>./bin/dev help</b> to see some useful developer commands.
"

    ].

event(click) ->
    wf:replace(button, #panel {
        body="You clicked the button!",
        actions=#effect { effect=highlight }
    }).

I make sure to include basedir.hrl (generated by the Makefile, remember?) and modify the template path to start with ?BASEDIR. Since the directory Yaws runs from is out of our control, we must reference files by absolute pathnames. Speaking of templates, I downloaded mine from the Nitrogen repository. Obviously, it can be modified however you want, or you could create one from scratch.

Before we continue, I recompile everything by typing make.

Now the fun begins: wiring it all up in Yaws. I use my package for OpenSolaris which puts the configuration file in /etc/yaws/yaws.conf. I add the following to it:

ebin_dir = /docs/test/deps/nitrogen_core/ebin
ebin_dir = /docs/test/deps/nprocreg/ebin
ebin_dir = /docs/test/deps/simple_bridge/ebin
ebin_dir = /docs/test/deps/sync/ebin
ebin_dir = /docs/test/ebin

runmod = test_app

<server test.thestaticvoid.com>
    port = 80
    listen = 0.0.0.0
    docroot = /docs/test/static
    appmods = </, test_yaws>
</server>

Obviously, your paths will probably be different. The point is to tell Yaws where all of the compiled code is, tell it to start your application (where the business logic will be contained), and tell it to use the Nitrogen appmod. Restart Yaws and it should all be working!
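
With the Yaws SMF service from my package, that restart is just:

# svcadm restart svc:/network/http:yaws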

Now for some cool stuff. If you run the svc:/network/http:yaws service from my package, or you start Yaws like yaws --run_erl svc, you can run yaws --to_erl svc (easiest to do with root privileges) and get access to Yaws’s Erlang console. From here you can hot-reload code. For example, modify the title in index.erl and recompile by running make. In the Erlang console, you can run l(index). and it will pick up your changes. But there is something even cooler. From the Erlang console, type sync:go(). and now whenever you make a change to a loaded module’s source code, it will automatically be recompiled and loaded, almost instantly! It looks something like:

# yaws --to_erl svc
Attaching to /var//run/yaws/pipe/svc/erlang.pipe.1 (^D to exit)

1> sync:go().
Starting Sync (Automatic Code Reloader)
ok
2> 
=INFO REPORT==== 17-Feb-2011::15:03:10 ===
/docs/test/src/index.erl:0: Recompiled. (Reason: Source modified.)

=INFO REPORT==== 17-Feb-2011::15:04:20 ===
/docs/test/src/index.erl:11: Error: syntax error before: body

=INFO REPORT==== 17-Feb-2011::15:04:26 ===
/docs/test/src/index.erl:0: Fixed!

2> sync:stop().

=INFO REPORT==== 17-Feb-2011::15:07:17 ===
    application: sync
    exited: stopped
    type: temporary
ok

One gotcha that may or may not apply to you is that Yaws needs permission to write to your application’s ebin directory if you want the automatically compiled code to be saved. In my case, Yaws runs as a different user than I develop as, a practice that I would highly recommend. So I use a ZFS ACL to allow the web server user read and write access:

$ /usr/bin/chmod -R A+user:webservd:rw:f:allow /docs/test/ebin
$ /usr/bin/ls -dv /docs/test/ebin
drwxr-xr-x+  2 jlee     staff          8 Feb 17 15:04 /docs/test/ebin
     0:user:webservd:read_data/write_data:file_inherit:allow
     1:owner@::deny
     2:owner@:list_directory/read_data/add_file/write_data/add_subdirectory
         /append_data/write_xattr/execute/write_attributes/write_acl
         /write_owner:allow
     3:group@:add_file/write_data/add_subdirectory/append_data:deny
     4:group@:list_directory/read_data/execute:allow
     5:everyone@:add_file/write_data/add_subdirectory/append_data/write_xattr
         /write_attributes/write_acl/write_owner:deny
     6:everyone@:list_directory/read_data/read_xattr/execute/read_attributes
         /read_acl/synchronize:allow

ACLs are pretty scary to some people, but I love ’em 🙂

Other Thoughts

You would not be able to run multiple Nitrogen projects on separate virtual hosts using this scheme. Nitrogen maps request paths to module names (for example, requesting “/admin/login” would load a module admin_login) and module names must be unique in Erlang. I think it would be possible to work around this using a Yaws rewrite module, though I haven’t tested it. I imagine if one virtual host maps “/admin/login” to “/foo/admin/login” and another maps it to “/bar/admin/login”, then Nitrogen would search for foo_admin_login and bar_admin_login, respectively, eliminating the conflicting namespace problem.

Now that I’ve gone through all the trouble of setting up Nitrogen the way I like, I should start converting my application over. Hopefully I’ll like it. It would be a shame to have done all this work for naught. I’m sure there will be posts to follow.

Persistent Search Domains With NWAM and DHCP

What I Want

I want to be able to refer to systems on both my home and work networks by their hostnames rather than their fully-qualified domain names, so, ‘prey’ instead of ‘prey.thestaticvoid.com’ and ‘acad2’ instead of ‘acad2.es.gwu.edu’.

NWAM Settings

The Problem

I would typically set my home and work domains as the search setting in /etc/resolv.conf. Unfortunately, either NWAM or the Solaris DHCP client (I haven’t decided which) overwrites resolv.conf on every new connection. DHCP on Linux does the same thing, but I can configure it by editing dhclient.conf (or whatever is being used these days, it’s been a while. I think I just set my domains in the NetworkManager GUI and forget about it).

The Solaris DHCP client configuration is not nearly as flexible, and neither is NWAM which gives you the option of replacing resolv.conf with information supplied by the DHCP server, or provided by you, but not a mix of both. I do like having the nameservers set by the DHCP server, so supplying a manual configuration is not an option.

What I Tried

The first thing I tried was setting the LOCALDOMAIN environmental variable in /etc/profile. From the resolv.conf man page:

You can override the search keyword of the system
resolv.conf file on a per-process basis by setting the
environment variable LOCALDOMAIN to a space-separated list
of search domains.

I thought, great, a way to manage domain search settings without worrying about what’s doing what to resolv.conf. It didn’t work as advertised:

% LOCALDOMAIN=thestaticvoid.com ping prey
ping: unknown host prey
% s touch /etc/resolv.conf
% LOCALDOMAIN=thestaticvoid.com ping prey
prey is alive
% LOCALDOMAIN=thestaticvoid.com ping prey
ping: unknown host prey

Next, I considered adding an NWAM Network Modifier to set my search string in resolv.conf after a new connection is established. This worked reasonably well, but it didn’t handle the case where I switch from one network to another, for example, from wireless to wired. The only NWAM events that can trigger a script when the network connection changes fire before DHCP messes up resolv.conf.

Finally, in the course of my testing, I discovered that the svc:/network/dns/client service was restarting with every network connection change. I looked into its manifest and saw that it was designed to wait for changes to resolv.conf:

<!--
  Wait for potential DHCP modification of resolv.conf.
-->
<dependency
    name='net'
    grouping='require_all'
    restart_on='none'
    type='service'>
    <service_fmri value='svc:/network/service' />
</dependency>

So I could write another service which depends on dns/client and restarts whenever dns/client does, and I would have the last word about what goes into my configuration file!

My Solution

I wrote a service, svc:/network/dns/resolv-conf, with the following manifest:

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">

<service_bundle type="manifest" name="dns-resolv-conf">
    <service name="network/dns/resolv-conf"
        type="service"
        version="1">
        <create_default_instance enabled="false" />
        <single_instance />

        <dependency name="dns-client"
            grouping="require_all"
            restart_on="restart"
            type="service">
            <service_fmri value="svc:/network/dns/client" />
        </dependency>

        <dependent name="resolv-conf"
            grouping="optional_all"
            restart_on="restart">
            <service_fmri value="svc:/milestone/name-services" />
        </dependent>

        <exec_method type="method"
            name="start"
            exec="/lib/svc/method/dns-resolv-conf start"
            timeout_seconds="60" />

        <exec_method type="method"
            name="stop"
            exec="/lib/svc/method/dns-resolv-conf stop"
            timeout_seconds="60" />

        <property_group name="options" type="application">
            <propval name="search" type="astring" value="" />
        </property_group>

        <property_group name="startd" type="framework">
            <propval name="duration" type="astring" value="transient" />
        </property_group>

        <stability value="Unstable" />

        <template>
            <common_name>
                <loctext xml:lang="C">resolv.conf Settings</loctext>
            </common_name>
            <documentation>
                <manpage title="resolv.conf" section="4"
                    manpath="/usr/share/man" />
            </documentation>
        </template>
    </service>
</service_bundle>

which calls the script /lib/svc/method/dns-resolv-conf, containing:

#!/sbin/sh

. /lib/svc/share/smf_include.sh

search=$(svcprop -p options/search $SMF_FMRI)

case "$1" in
    "start")
        # Don't do anything if search option not provided.
        [ "$search" == '""' ] && exit $SMF_EXIT_OK

        # Reverse the lines because we either want to:
        #   add the search line after the *last* domain line or
        #   add it to the very top of the file if there is no domain line
        tac /etc/resolv.conf | grep -v "^search" | gawk '
            /^domain/ {
                if (!isset) {
                    print "search", $2, search
                    isset=1
                }
            }

            END {
                if (!isset) {
                    print "search", search
                }
            }

            1
        ' search="$search" | tac > /etc/resolv.conf.new && mv -f /etc/resolv.conf.new /etc/resolv.conf
        ;;

    "stop")
        # Just get rid of any search lines, I guess.
        grep -v "^search" /etc/resolv.conf > /etc/resolv.conf.new && mv -f /etc/resolv.conf.new /etc/resolv.conf
        ;;

    *)
        echo "Usage: $0 { start | stop }"
        exit $SMF_EXIT_ERR_CONFIG
esac

exit $SMF_EXIT_OK
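
Deploying the pair is standard SMF fare: drop the method script in place, make it executable, and import the manifest (the manifest path is simply where I saved it):

# chmod +x /lib/svc/method/dns-resolv-conf
# svccfg import /var/svc/manifest/network/dns/resolv-conf.xml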

So now I can set my search options like:

% svccfg -s resolv-conf setprop 'options/search="thestaticvoid.com es.gwu.edu"'
% svcadm refresh resolv-conf
% svcadm enable resolv-conf
% cat /etc/resolv.conf
domain  iss.gwu.edu
search iss.gwu.edu thestaticvoid.com es.gwu.edu
nameserver  161.253.152.50
nameserver  128.164.141.12

Problem solved! Or at least worked-around in the least hacky way I can!

CrashPlan

I have a little storage array that I store my life on. Music, movies, photographs, projects, school work—I’d be devastated if I lost any of it. And yet, I don’t have any sort of backup for it. Last year I evaluated various online backup services but concluded that my 5 Mbps (~600 KB/s) upload bandwidth was just too slow to feasibly backup all of my data. Now I have a 25 Mbps (~3 MB/s) symmetric connection, so last week when I got a promotional email from CrashPlan announcing their new version and prices, I decided to give it another try.

CrashPlan is, as far as I know, the only online backup solution that officially supports Solaris, and it’s not half-assed either. The software is delivered as a standard SVR4 package which installs to /opt/sfw/crashplan and includes an SMF manifest. Normally I’d never trust consumer-oriented proprietary software like this, but their Solaris support instills confidence in me. I can only hope that they continue to maintain it, despite the uncertainty surrounding Solaris’s future.

Like I said, installation was a breeze. Looking back at my shell history, it was as easy as:

# cd /tmp
# wget http://download.crashplan.com/installs/solaris/install/CrashPlan/CrashPlan_3.0_Solaris.tar.gz
# tar -xvzf CrashPlan_3.0_Solaris.tar.gz
# pkgadd -d . CrashPlan
# svccfg import /opt/sfw/crashplan/bin/crashplan.xml
# svcadm enable crashplan

From there the GUI can be launched as a regular user by running /opt/sfw/crashplan/bin/CrashPlanDesktop. The user interface is clean and simple. On the first run, it walks you through setting up an account. New users get a 30-day free trial to CrashPlan+, which includes unlimited online backups. I’m still on my trial, but as long as it continues to work for me, I expect I’ll purchase a subscription for $5/month.

The first thing I did after registering was to go into the security settings and change the archive encryption key type to use a private password. This encrypts the key that encrypts my data with a separate password, so even if someone hijacks my CrashPlan account, they will not be able to restore any of my files. The other advanced option, supplying your own private data key, I would argue is less secure, since the key is stored in the clear on the local system and it cannot be changed without invalidating all of your backups. Security is very important to me, so I am happy to see that they give control over these settings to the user, though I wish the backup agent were open-source to enable more public scrutiny. At the very least, I’d like CrashPlan to provide more details about their encryption methods, similar to what SpiderOak does.

Next I directed the software to backup my storage array mounted at /nest to CrashPlan Central and off it went. I’m currently seeing speeds around 6 Mbps (750 KB/s) which is slightly disappointing on my fast connection, but not unacceptable. They claim that they do not cap or throttle connections, though from what I’ve read, speed is largely dependent on which of CrashPlan’s many datacenters you are provisioned to. They’ve been experiencing much higher volume than normal with last week’s release of CrashPlan 3, so I hope to see increased speed when that activity subsides.

I do like that the backup actually takes place in the background, so the GUI is only ever necessary for changing settings and performing restores. I tested a restore and saw much better speeds around 16 Mbps (2 MB/s), though still not even close to saturating my internet connection.

My backup should hopefully be done by the new year and then it’ll just be a matter of performing small nightly incrementals.