Fun With vpnc

I recently got a new laptop at work and I decided to put OpenSolaris on it. This meant I had to set up vpnc in order to access the server networks and wireless here. I installed my vpnc package, copied the profile from my Ubuntu workstation, and started it up. It connected, but no packets flowed. I didn’t have time to investigate, so I decided to work on it some more at home.

The strange thing is that it connected from home with the very same profile and everything worked fine. I immediately suspected something was wrong with the routing tables, like maybe some of the routes installed by vpnc-script were conflicting with the routes necessary to talk to the VPN concentrator. I endlessly compared the routing tables between work and home and my working Ubuntu workstation, removing routes, adding routes, and manually constructing the routing table until I was positive it could not be that.

Everything I pinged worked. I could ping the concentrator. I could ping the gateway. I could ping the tunnel device. I could ping the physical interface—or so I thought.

As I was preparing to write a message to the vpnc-devel mailing list requesting help, I did some pings to post the output in the email. I ran

$ ping <concentrator ip>
<concentrator ip> is alive

which looked good, but I wanted the full ping output, so I ran

$ ping -s <concentrator ip>
PING <concentrator ip>: 56 data bytes
^C
----<concentrator ip> PING Statistics----
4 packets transmitted, 1 packets received, 75% packet loss
round-trip (ms)  min/avg/max/stddev = 9223372036854776.000/0.000/0.000/-NaN

For some reason, only the first ping was getting through. The rest were getting hung up somewhere. The really strange thing was that I saw the same behavior on the local physical interface:

$ ifconfig bge0
bge0: flags=1004843 mtu 1500 index 3
        inet 161.253.143.151 netmask ffffff00 broadcast 161.253.143.255
$ ping -s 161.253.143.151
PING 161.253.143.151: 56 data bytes
^C
----161.253.143.151 PING Statistics----
5 packets transmitted, 1 packets received, 80% packet loss
round-trip (ms)  min/avg/max/stddev = 9223372036854776.000/0.000/0.000/-NaN

I have never seen a situation where you couldn’t even ping a local physical interface! I checked and double checked that IPFilter wasn’t running. Finally I started a packet capture of the physical interface to see what was happening to my pings:

# snoop -d bge0 icmp
Using device bge0 (promiscuous mode)
161.253.143.151 -> <concentrator ip> ICMP Destination unreachable (Bad protocol 50)
161.253.143.151 -> <concentrator ip> ICMP Destination unreachable (Bad protocol 50)
161.253.143.151 -> <concentrator ip> ICMP Destination unreachable (Bad protocol 50)
^C

That’s when, by chance, I saw ICMP messages being sent to the VPN concentrator saying “Bad protocol 50.” IP protocol 50 is ESP, which IPsec uses to carry the encrypted traffic. Apparently Solaris eats these packets; I haven’t figured out why.

I remembered seeing something in the vpnc manpage about ESP packets:

--natt-mode <natt/none/force-natt/cisco-udp>

      Which NAT-Traversal Method to use:
      o    natt -- NAT-T as defined in RFC3947
      o    none -- disable use of any NAT-T method
      o    force-natt -- always use NAT-T encapsulation  even
           without presence of a NAT device (useful if the OS
           captures all ESP traffic)
      o    cisco-udp -- Cisco proprietary UDP  encapsulation,
           commonly over Port 10000

I enabled force-natt mode, which encapsulates each ESP packet in a UDP packet (normally done to get past NAT), and it started working! In retrospect, I should have been able to figure that out much more easily. First, it pretty much says so on the vpnc homepage: “Solaris (7 works, 9 only with --natt-mode forced).” I didn’t even notice that. Second, I should have realized that I was behind a NAT at home and not at work, so the two connections would be using different NAT-traversal methods by default. Oh well, it was a good diagnostic exercise, hence the post to share the experience.
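The mode can be passed on the command line as --natt-mode force-natt, or kept in the profile. Here is a minimal sketch for the profile route, assuming the configuration keyword is "NAT Traversal Mode" (the spelling the vpnc man page lists for this option). The helper name set_force_natt and the profile path in the comment are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: persist force-natt in a vpnc profile.
# Assumes the profile keyword is "NAT Traversal Mode", as in the vpnc man page.
set_force_natt() {
    profile=$1
    if grep -q '^NAT Traversal Mode' "$profile"; then
        # Leave an existing setting alone rather than add a second one.
        echo "NAT Traversal Mode is already set in $profile" >&2
        return 1
    fi
    echo 'NAT Traversal Mode force-natt' >> "$profile"
}

# Example, run as root (the profile path is an assumption):
#   set_force_natt /etc/vpnc/gwu.conf
```

The helper refuses to touch a profile that already names a mode, so an existing setting is not silently overridden by a second line.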

In other vpnc related news, I’ve ported Kazuyoshi’s patch to the open_tun and solaris_close_tun functions of OpenVPN to the tun_open and tun_close functions of vpnc. His sets up the tunnel interface a little bit differently and adds TAP support. It solves the random problems vpnc had with bringing up the tunnel interface such as:

# ifconfig tun0
tun0: flags=10010008d0<POINTOPOINT,RUNNING,NOARP,MULTICAST,IPv4,FIXEDMTU> mtu 1412 index 8
        inet 128.164.xxx.yy --> 128.164.xxx.yy netmask ffffffff 
        ether f:ea:1:ff:ff:ff
# ifconfig tun0 up
ifconfig: setifflags: SIOCSLIFFLAGS: tun0: no such interface
# dmesg | grep tun0
Jul 23 14:56:05 swan ip: [ID 728316 kern.error] tun0: DL_BIND_REQ failed: DL_OUTSTATE

The changes are in the latest vpnc package available from my package repository.

A Professional Photo Workflow for OpenSolaris

I am not a professional by any means, but I like to know I can get the most out of my tools if the need arises. That means shooting in RAW alongside JPEG so I can take control of image processing settings or correct little mistakes such as under-exposure or incorrect white balance. RAW files contain raw sensor data from the camera (duh) and must be processed by special programs before they can be printed or shared. My camera came with the Canon Digital Photo Professional software which I’ve heard is pretty good. There are other (expensive) commercial options such as Adobe Lightroom. Obviously none of these work in Solaris (though they might work in Wine), so I decided to explore the open-source offerings.

Fortunately, this is a good time in the open-source world for RAW processing. Tools like UFRaw and LensFun are maturing rapidly and beginning to give their commercial counterparts a run for their money. I spent the past week porting them, and the color management software, Argyll, to OpenSolaris.

Argyll

Argyll is a suite of color management tools for Unix and Windows. It can be used to calibrate displays, cameras, scanners, and printers. When all of your equipment is properly calibrated, then colors should appear the same on all devices. So if I were to photograph a stop sign, it would appear to be the same red on my monitor as in real life.

Color Calibration Tools

Color calibration requires special equipment. For your monitor, you need a colorimeter. I already had an X-rite i1Display to calibrate my TVs, and it works just fine with Argyll and Solaris (using libusb). Following these instructions I was able to calibrate my monitors in a few minutes. It was so easy I did my work monitors and laptop too!

Camera calibration was just as easy following Pascal de Bruijn’s instructions. I picked up a very affordable IT8.7 target from Wolf Faust. It arrived from Germany in about a week.

Argyll can be installed from my software repository by typing pfexec pkg install SFEargyll.

UFRaw

UFRaw with lens correction support using LensFun can be installed from my repository by typing pfexec pkg install ufraw. I went through hell trying to port this and its dependencies. LensFun was particularly terrible with its crazy Makefiles (please use Autotools!) and non-standard C++ which Sun Studio choked on.

I don’t have much else to say about this yet; I’m still playing around with it.

VPNC for OpenSolaris

I’ve compiled VPNC and the requisite TUN/TAP driver for OpenSolaris so that I can access my work network from home. Kazuyoshi’s driver adds TAP functionality to the original TUN driver which hasn’t been updated in nine years. It’s a real testament to the stability of the OpenSolaris kernel ABI that the module still compiles, loads, and works properly.

All of the software can be installed from my repository onto build 111 or higher:

$ pfexec pkg set-publisher -O http://pkg.thestaticvoid.com/ thestaticvoid
$ pfexec pkg install vpnc

The tun driver should load automatically and create /dev/tun. Now create a VPN profile configuration in /etc/vpnc/. The configuration contains a lot of private information so I’m not going to share mine here, but /etc/vpnc/default.conf is a good start.

One thing I do like to do is make sure only certain subnets are tunneled through the VPN. That way connecting to the VPN doesn’t interrupt any connections that are already established (for example, AIM). To do that I create a script /etc/vpnc/gwu-networks-script containing

#!/bin/sh

# Only tunnel GWU networks through VPN
CISCO_SPLIT_INC=2
CISCO_SPLIT_INC_0_ADDR=161.253.0.0
CISCO_SPLIT_INC_0_MASK=255.255.0.0
CISCO_SPLIT_INC_0_MASKLEN=16
CISCO_SPLIT_INC_0_PROTOCOL=0
CISCO_SPLIT_INC_0_SPORT=0
CISCO_SPLIT_INC_0_DPORT=0
CISCO_SPLIT_INC_1_ADDR=128.164.0.0
CISCO_SPLIT_INC_1_MASK=255.255.0.0
CISCO_SPLIT_INC_1_MASKLEN=16
CISCO_SPLIT_INC_1_PROTOCOL=0
CISCO_SPLIT_INC_1_SPORT=0
CISCO_SPLIT_INC_1_DPORT=0

. /etc/vpnc/vpnc-script

then add Script /etc/vpnc/gwu-networks-script to the end of my VPN profile configuration.
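Writing those variable blocks by hand gets tedious once there are more than a couple of networks. The following is only a sketch of how they could be generated from ADDR/MASKLEN arguments. The function names masklen_to_mask and split_include_vars are invented here, and the output simply mirrors the variables set in the script above:

```shell
#!/bin/sh
# Hypothetical helper: print CISCO_SPLIT_INC_* assignments for networks given
# as ADDR/MASKLEN arguments, in the form the wrapper script sets by hand.

# Convert a prefix length (0-32) to a dotted-quad netmask.
masklen_to_mask() {
    rem=$1
    mask=""
    for position in 1 2 3 4; do
        if [ "$rem" -ge 8 ]; then
            octet=255
            rem=$((rem - 8))
        else
            # Set the top $rem bits of this octet, clear the rest.
            octet=$((256 - (1 << (8 - rem))))
            rem=0
        fi
        mask="${mask}${mask:+.}${octet}"
    done
    echo "$mask"
}

# Print one block of variables per network, then the total count.
split_include_vars() {
    n=0
    for net in "$@"; do
        addr=${net%/*}
        len=${net#*/}
        echo "CISCO_SPLIT_INC_${n}_ADDR=$addr"
        echo "CISCO_SPLIT_INC_${n}_MASK=$(masklen_to_mask "$len")"
        echo "CISCO_SPLIT_INC_${n}_MASKLEN=$len"
        echo "CISCO_SPLIT_INC_${n}_PROTOCOL=0"
        echo "CISCO_SPLIT_INC_${n}_SPORT=0"
        echo "CISCO_SPLIT_INC_${n}_DPORT=0"
        n=$((n + 1))
    done
    echo "CISCO_SPLIT_INC=$n"
}

# Example:
#   split_include_vars 161.253.0.0/16 128.164.0.0/16
```

The count is printed last rather than first. That makes no difference, because the wrapper sets all of the variables before it sources vpnc-script.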

When you connect to the VPN, you should see messages like:

$ pfexec vpnc gwu
Enter password for jameslee@<no>: 
which: no ip in (/sbin:/usr/sbin:/usr/gnu/bin:/usr/bin:/usr/sbin:/sbin)
which: no ip in (/sbin:/usr/sbin:/usr/gnu/bin:/usr/bin:/usr/sbin:/sbin)
add net 128.164.<no>: gateway 128.164.<no>
add host 128.164.<no>: gateway 161.253.<no>
add net 161.253.0.0: gateway 128.164.<no>
add net 128.164.0.0: gateway 128.164.<no>
add net 128.164.<no>: gateway 128.164.<no>
add net 128.164.<no>: gateway 128.164.<no>
VPNC started in background (pid: 594)...

The vpnc-script will modify your /etc/resolv.conf and routing tables so be sure to run vpnc-disconnect when you are done with the connection to restore the original configuration.

Thanks to the good folks at OpenConnect for a well-maintained vpnc-script which works on Solaris. Spec files for these packages are available from my GitHub repository if you want to roll your own.

MusicBrainz Picard

MusicBrainz along with the Picard tagger is without a doubt the best way to organize and manage large collections of music. The tagger will fingerprint audio files and automatically correct their metadata and filenames.

I’ve been using MusicBrainz since 2005, and even attempted to write my own tagger for it in Java back when Picard didn’t exist. When I switched to OpenSolaris, it was one of the programs I missed the most. So I went about building a package for it.

Unfortunately, the software has a lot of complicated dependencies such as Qt and FFmpeg which aren’t included in OpenSolaris either. FFmpeg I can understand; it infringes on countless software patents <insert rant here>. But Qt? There’s no reason for that. It is easily the second most popular graphics toolkit for Unix. Sure, the Solaris KDE guys have a build of it, but it installs to a non-standard prefix and doesn’t include 64-bit libs. No thank you.

Anyway, the package and its dependencies are up on my package repository for b132 and later. You know the deal…pfexec pkg install picard. Spec files are, as always, available from my GitHub repository.

Now that I have a good start on the FFmpeg package, I’m going to keep working on it, adding support for more codecs and eventually building MPlayer so I can stop using this guy’s less-than-ideal build.

EDIT: Just FYI, in order to get nice antialiased fonts in Qt applications, I had to modify the fontconfig settings. This is not necessary for GTK+ applications because they get their settings from the gnome-appearance-properties dialog. So in ~/.fonts.conf add:

<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
<!--  Use the Antialiasing -->
  <match target="font">
    <edit name="antialias" mode="assign"><bool>true</bool></edit>
  </match>
</fontconfig>

Other Qt appearance settings can be changed from the qtconfig dialog.

Start Virtual NICs on OpenSolaris Boot

One of the more frustrating things I deal with on OpenSolaris is that every time I reboot, I have to manually bring up each virtual network interface in order to start all of my zones. There is a bug report for this problem that says a fix will be integrated into b132, which is just a few weeks away, but in the meantime, I’ve whipped up an SMF service to handle this for me. Create a file vnic.xml:

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">

<service_bundle type='manifest' name='vnic'>

<service
    name='network/vnic'
    type='service'
    version='1'>

    <dependency
        name='network_service'
        grouping='require_all'
        restart_on='none'
        type='service'>
        <service_fmri value='svc:/network/service' />
    </dependency>

    <dependent
        name='network_vnic'
        grouping='optional_all'
        restart_on='none'>
        <service_fmri value='svc:/system/zones' />
    </dependent>

    <exec_method
        type='method'
        name='start'
        exec='/usr/sbin/dladm up-vnic ${SMF_FMRI/*:/}'
        timeout_seconds='60' />

    <exec_method
        type='method'
        name='stop'
        exec=':true'
        timeout_seconds='60' />

    <property_group name='startd' type='framework'>
        <propval name='duration' type='astring' value='transient' />
    </property_group>

    <stability value='Unstable' />

    <template>
        <common_name>
            <loctext xml:lang='C'>
            Virtual Network Interface
            </loctext>
        </common_name>
        <documentation>
            <manpage title='dladm' section='1M'
                manpath='/usr/share/man' />
        </documentation>
    </template>
</service>

</service_bundle>

This service should run sometime after the network is started but before the zones are started. Load it in with svccfg -v import vnic.xml and create an instance of the service for each of the VNICs that you want to start. For example, if you want to start vnic0 on boot:

# svccfg -s vnic add vnic0
# svcadm refresh vnic0
# svcadm enable vnic0
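With more than one VNIC, the same three commands repeat for each name. Here is a small sketch that prints them for every name given as an argument, so they can be reviewed before being run. The function name vnic_instance_cmds is invented for this example:

```shell
#!/bin/sh
# Hypothetical helper: print the svccfg/svcadm commands that create and
# enable one instance of the vnic service per VNIC name given.
vnic_instance_cmds() {
    for vnic in "$@"; do
        echo "svccfg -s vnic add $vnic"
        echo "svcadm refresh $vnic"
        echo "svcadm enable $vnic"
    done
}

# Example: review the output, then pipe it to a root shell to apply it.
#   vnic_instance_cmds vnic0 vnic1
#   vnic_instance_cmds vnic0 vnic1 | pfexec sh
```

Printing the commands instead of running them keeps the helper harmless on its own.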

UPDATE: Build 132 is out and this functionality has been integrated as the svc:/network/datalink-management:default service. The services that were added above can be removed by running svccfg delete vnic.

Music Player Daemon on OpenSolaris

MPD is essential software for me. It’s one of the few music players out there for Unix that does gapless playback and ReplayGain. It’s also nice that, because it’s a daemon, I’m not bound to any particular interface. Fortunately, there is a really good one in the form of Sonata.

MPD is not included in OpenSolaris yet, so last weekend I built some packages for it. The build has been stable for me and I’m happy with the state of the packages so I thought I’d share them. First add my package repository:

$ pfexec pkg set-authority -O http://pkg.thestaticvoid.com/ thestaticvoid

MPD

This package and its dependencies require OpenSolaris 2009.06 or newer. Install it by typing pfexec pkg install mpd. The following formats are supported:

$ mpd -V
...
Supported decoders:
[mad] mp3 mp2
[vorbis] ogg oga
[oggflac] ogg oga
[flac] flac
[audiofile] wav au aiff aif
[faad] aac
[mp4] m4a mp4
[mpcdec] mpc
[wavpack] wv

Supported outputs:
shout null fifo ao solaris httpd 

Supported protocols:
file:// http://

I plan on adding ffmpeg support soon which will add support for even more codecs.

To run MPD, create a configuration file in your home directory, such as ~/.mpdconf, containing something like

port                    "6600"
music_directory         "~/music"
playlist_directory      "~/.mpd/playlists"
db_file                 "~/.mpd/mpd.db"
log_file                "~/.mpd/mpd.log"

Create any directories from the configuration file that don’t exist, such as ~/.mpd/playlists, and start the daemon by running mpd ~/.mpdconf as your user. It will immediately build a library of your music.
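Creating the directories can be scripted. This is a rough sketch that reads the *_directory and *_file settings from a configuration laid out like the one above and creates whatever is missing. The function name mpd_make_dirs is made up, and it assumes one double-quoted value per line, paths without spaces, and a leading ~/ meaning the home directory:

```shell
#!/bin/sh
# Hypothetical helper: create the directories an MPD configuration refers to.
# Assumes lines of the form:  key  "value"  with no spaces inside the value.
mpd_make_dirs() {
    conf=$1
    # Keep only the *_directory and *_file settings, stripped of quotes.
    awk '$1 ~ /_(directory|file)$/ { gsub(/"/, "", $2); print $1, $2 }' "$conf" |
    while read -r key path; do
        # Expand a leading ~/ by hand, since it is quoted in the config file.
        case $path in
            "~/"*) path="$HOME/${path#??}" ;;
        esac
        case $key in
            *_directory) mkdir -p "$path" ;;
            *_file)      mkdir -p "$(dirname "$path")" ;;
        esac
    done
}

# Example:
#   mpd_make_dirs ~/.mpdconf && mpd ~/.mpdconf
```

For *_file settings only the parent directory is created, and MPD is left to create the database and log files itself.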

Alternatively, mpd can be run system-wide, which just seems more appropriate to me for whatever reason. The only complicated part about this is that you have to give MPD permission to write to the audio device. Edit /etc/logindevperms, find the /dev/sound/* lines and change the mode to 0666 so that they look like:

/dev/console    0666    /dev/sound/*        # audio devices
/dev/vt/active  0666    /dev/sound/*        # audio devices

Log out and log back in for the settings to take effect. Then modify /etc/mpd.conf to your liking and start the daemon by typing svcadm enable mpd. You may have to svcadm refresh manifest-import for SMF to load the mpd manifest.

mpdscribble

I also built a package for mpdscribble which is a mature, well-maintained scrobbler for Last.fm. Install it by typing pfexec pkg install mpdscribble. Set your Last.fm or Libre.fm username and password in /etc/mpdscribble.conf and start the daemon with svcadm enable mpdscribble. That’s all there is to it.

Sonata

Sonata is a lightweight client for MPD. Looks pretty nice too:

[Screenshot: Sonata]

Because Sonata requires Python 2.5, and OpenSolaris 2009.06 only really supports Python 2.4, this package requires build 127 or newer. Install it by typing pfexec pkg install sonata. It can be launched from the Applications->Sound & Video menu.

Mixer State in OpenSolaris

I’ve recently installed OpenSolaris on my desktop and noticed that my volume settings do not persist between reboots. A quick search revealed that that functionality hasn’t been implemented yet. The thread suggested using the mixerctl command to save and restore the mixer state so I’ve thrown together an SMF service to do it automatically on boot and shutdown.

First, the script which should go into /lib/svc/method/sound-mixer:

#!/sbin/sh

. /lib/svc/share/smf_include.sh
smf_is_globalzone || exit $SMF_EXIT_OK

ctl_file=$(svcprop -p options/ctl_file $SMF_FMRI)

case "$1" in
'start')
        if [ ! -f $ctl_file ]; then
                echo "Mixer control file $ctl_file does not exist."
                exit $SMF_EXIT_OK
        fi

        if ! /usr/sbin/mixerctl -r $ctl_file; then
                echo "Error restoring mixer state."
                exit $SMF_EXIT_OK
        fi
        ;;

'stop')
        if ! /usr/sbin/mixerctl -f -s $ctl_file; then
                echo "Error saving mixer state."
                exit $SMF_EXIT_OK
        fi
        ;;

*)
        echo "Usage: $0 { start | stop }"
        exit $SMF_EXIT_ERR_CONFIG
        ;;
esac

exit $SMF_EXIT_OK

Second, the manifest which can be saved anywhere and loaded with svccfg -v import <manifest>:

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">

<service_bundle type='manifest' name='mixer'>

<service
       name='system/sound/mixer'
       type='service'
       version='1'>

        <create_default_instance enabled='true' />
        <single_instance />

        <dependency
           name='fs-local'
           grouping='require_all'
           restart_on='none'
           type='service'>
                <service_fmri value='svc:/system/filesystem/local' />
        </dependency>
       
        <dependency
           name='device-audio'
           grouping='require_all'
           restart_on='none'
           type='service'>
                <service_fmri value='svc:/system/device/audio' />
        </dependency>

        <exec_method
               type='method'
               name='start'
               exec='/lib/svc/method/sound-mixer start'
               timeout_seconds='60' />

        <exec_method
               type='method'
               name='stop'
               exec='/lib/svc/method/sound-mixer stop'
               timeout_seconds='60' />

        <property_group name='options' type='application'>
                <propval name='ctl_file' type='astring' value='/etc/sound/mixer.state' />
        </property_group>

        <property_group name='startd' type='framework'>
                <propval name='duration' type='astring' value='transient' />
        </property_group>

        <stability value='Unstable' />

        <template>
                <common_name>
                        <loctext xml:lang='C'>Mixer State Saver</loctext>
                </common_name>
                <documentation>
                        <manpage title='mixerctl' section='1M'
                           manpath='/usr/share/man' />
                </documentation>
        </template>

</service>

</service_bundle>

UPDATE: In b130, the audioctl command replaces mixerctl. In the sound-mixer script above, change /usr/sbin/mixerctl -r $ctl_file to /usr/bin/audioctl load-controls $ctl_file and /usr/sbin/mixerctl -f -s $ctl_file to /usr/bin/audioctl save-controls -f $ctl_file.
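If the same method script has to work on builds before and after b130, the choice between the two commands can be made at run time. The sketch below only prints the command line it would use. The function name mixer_command and the AUDIOCTL and MIXERCTL override variables are inventions for this example, while the command paths and arguments are the ones quoted above:

```shell
#!/bin/sh
# Sketch: pick audioctl (b130 and later) or mixerctl (earlier builds).
# AUDIOCTL and MIXERCTL are hypothetical overrides for the default paths.
AUDIOCTL=${AUDIOCTL:-/usr/bin/audioctl}
MIXERCTL=${MIXERCTL:-/usr/sbin/mixerctl}

# Print the command line that saves or restores the given state file.
# Usage: mixer_command save|restore FILE
mixer_command() {
    action=$1
    state_file=$2
    if [ -x "$AUDIOCTL" ]; then
        case $action in
            restore) echo "$AUDIOCTL load-controls $state_file" ;;
            save)    echo "$AUDIOCTL save-controls -f $state_file" ;;
            *)       return 2 ;;
        esac
    elif [ -x "$MIXERCTL" ]; then
        case $action in
            restore) echo "$MIXERCTL -r $state_file" ;;
            save)    echo "$MIXERCTL -f -s $state_file" ;;
            *)       return 2 ;;
        esac
    else
        echo "neither audioctl nor mixerctl found" >&2
        return 1
    fi
}

# Example use inside the 'start' branch of the method script:
#   cmd=$(mixer_command restore "$ctl_file") && $cmd
```

audioctl is checked first so that a system with both commands installed uses the newer one.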

suEXEC on OpenSolaris

One nice thing about having all dynamic content being generated by CGI is that you can use suEXEC to run the scripts as a different user. This is primarily used for systems where you have multiple untrusted users who run sites in one HTTP server. Then no one can interfere with anyone else. It can also be used simply for separating the application from the server.

I’m the only user on my server so I don’t necessarily have any of these security concerns, but I have enabled suEXEC for convenience. For example, WordPress will allow you to modify the stylesheets from the admin interface as long as it can write to them. With suEXEC, the admin interface can run as my Unix user, so I can edit the files from both the web interface and the command line without having wide-open permissions or switching to root.

Same applies for Trac where I can manage the project with the web interface or trac-admin on the command line. The same effect could pretty much be obtained by using Unix groups properly:

# groupadd wordpress
# usermod -G wordpress webservd
# usermod -G wordpress jlee  # my username
# chgrp -R wordpress /docs/thestaticvoid.com  # virtualhost document root
# chmod -R g+ws /docs/thestaticvoid.com  # make directory writable and always owned by
                                         # the wordpress group

Then umask 002 would have to be set in Apache’s and my profile so any files that get created can be written to by the other users in the group. That’s all well and good, but it seems like a bit of work and I don’t like the idea of messing with the default umask.

On to suEXEC. First, let’s show the current user that PHP executes as. Create a file test.php containing <?php echo exec("id"); ?>. Accessing the script from your web browser should show something like uid=80(webservd) gid=80(webservd).

Next, in OpenSolaris, the suexec binary must be enabled:

# cd /usr/apache2/2.2/bin/  # go one directory further into the amd64 dir
                            # if you're running 64-bit
# mv suexec.disabled suexec
# chown root:webservd suexec
# chmod 4750 suexec
# ./suexec -V
 -D AP_DOC_ROOT="/var/apache2/2.2/htdocs"
 -D AP_GID_MIN=100
 -D AP_HTTPD_USER="webservd"
 -D AP_LOG_EXEC="/var/apache2/2.2/logs/suexec_log"
 -D AP_SAFE_PATH="/usr/local/bin:/usr/bin:/bin"
 -D AP_UID_MIN=100
 -D AP_USERDIR_SUFFIX="public_html"

These variables were set at compile time and cannot be changed. They ensure that certain conditions must be met in order to use the binary. That’s very important because it’s setuid root. The first thing I had to do was move everything from my old document root to the one specified above in AP_DOC_ROOT. Then you can add SuexecUserGroup jlee jlee (with whatever username and group you want the scripts to run as) to your <VirtualHost> section of the Apache configuration. At this point if you try to execute test.php you’ll probably see one of a couple errors in the suEXEC log (/var/apache2/2.2/logs/suexec_log):

  • [2009-07-27 11:08:02]: uid: (1000/jlee) gid: (1000/jlee) cmd: php-cgi
    [2009-07-27 11:08:02]: command not in docroot (/usr/php/bin/php-cgi)

    In this case, php-cgi is going to have to be moved to the document root:

    $ cp /usr/php/bin/php-cgi /var/apache2/2.2/htdocs/
    $ pfexec vi /etc/apache2/2.2/conf.d/php-cgi.conf  # modify the ScriptAlias appropriately
    $ svcadm restart http
  • [2009-07-27 11:11:07]: uid: (1000/jlee) gid: (1000/jlee) cmd: php-cgi
    [2009-07-27 11:11:07]: target uid/gid (1000/1000) mismatch with directory (0/2) or program (0/0)

    Make sure everything that suexec is to execute is owned by the same user and group as specified in the SuexecUserGroup line of your Apache configuration.

Now, running test.php should give the correct results: uid=1000(jlee) gid=1000(jlee). Done!
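The uid/gid mismatch error is easier to chase down if everything with the wrong owner is listed up front. Here is a minimal sketch using the standard find -user and -group tests. The function name suexec_mismatches is invented, and the user, group and path in the example are the ones from this post:

```shell
#!/bin/sh
# Hypothetical helper: list everything under a directory that is NOT owned by
# the user and group named in SuexecUserGroup. No output means no mismatches.
# Usage: suexec_mismatches DIR USER GROUP   (names or numeric ids)
suexec_mismatches() {
    find "$1" \( ! -user "$2" -o ! -group "$3" \) -print
}

# Example:
#   suexec_mismatches /var/apache2/2.2/htdocs jlee jlee
```

Directories are included in the listing, which matters because the log message above complains about the directory owner as well as the program owner.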

As a side note, I lose all frame of reference while I write so I can’t remember if I’m writing this for you or me, explaining what I’ve done or what you should do. Sorry 🙂

Reducing Memory Footprint of Apache Services

An interesting thing happened when I set up this blog. It first manifested itself as a heap of junk mail in my inbox. Then no mail at all. I had run out of memory. WordPress requires me to run MySQL and that extra 12M pushed me over the 256M cap in my OpenSolaris 2009.06 zone. As a result SpamAssassin could not spawn, and ultimately Postfix died. So I set out to reduce my memory footprint.

Let’s take a look at where things were when I got started:

$ prstat -s rss -Z 1 1 | cat
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP       
 13488 webservd  183M   92M sleep   59    0   0:00:28 0.0% trac.fcgi/1
 13479 webservd   59M   41M sleep   59    0   0:00:14 0.0% trac.fcgi/1
 13489 webservd   59M   41M sleep   59    0   0:00:14 0.0% trac.fcgi/1
  4463 mysql      64M   12M sleep   59    0   0:02:39 0.0% mysqld/10
 19296 root       13M 8444K sleep   59    0   0:00:25 0.0% svc.configd/16
 19619 named      11M 5824K sleep   59    0   0:03:51 0.0% named/7
 13473 root       64M 4352K sleep   59    0   0:00:00 0.0% httpd/1
 19358 root       12M 3688K sleep   59    0   0:00:54 0.0% nscd/31
 19294 root       12M 3180K sleep   59    0 244:37:22 0.0% svc.startd/13
 13476 webservd   64M 2940K sleep   59    0   0:00:00 0.0% httpd/1
 13486 webservd   64M 2924K sleep   59    0   0:00:00 0.0% httpd/1
 13745 root     6248K 2832K cpu1    59    0   0:00:00 0.0% prstat/1
 13721 root     5940K 2368K sleep   39    0   0:00:00 0.0% bash/1
 13485 webservd   64M 2252K sleep   59    0   0:00:00 0.0% httpd/1
 13482 webservd   64M 2168K sleep   59    0   0:00:00 0.0% httpd/1
ZONEID    NPROC  SWAP   RSS MEMORY      TIME  CPU ZONE                        
    39       60  494M  246M    96% 244:47:13 0.1% case                        
Total: 60 processes, 149 lwps, load averages: 0.61, 0.62, 0.52

First thing I noticed is the 174M that Trac was taking up. I was running it as a FastCGI service for speed. The problem with that is it remains resident even when it’s not processing any requests, which is most of the time. One option I tried was setting DefaultMaxClassProcessCount 1 in my /etc/apache2/2.2/conf.d/fcgid.conf file. This effectively limits Trac to only one process at a time, which greatly reduces the memory utilization, but means it can only service one request at a time. That’s not an option.
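That 174M figure is just the RSS column added up for the three trac.fcgi processes (92M + 41M + 41M). A small awk filter can do the same sum for every process name in prstat output. This is only a sketch, the name rss_by_process is made up, and it assumes the ten-column layout shown above:

```shell
#!/bin/sh
# Hypothetical helper: read prstat output on stdin and print the total RSS
# per process name in megabytes, largest first.
rss_by_process() {
    awk '
    # Process lines start with a numeric PID and have ten columns;
    # the header, the zone summary and the Total line do not.
    $1 ~ /^[0-9]+$/ && NF >= 10 {
        rss = $4
        unit = substr(rss, length(rss))
        val = rss + 0                  # numeric prefix, e.g. "92M" becomes 92
        if (unit == "K") val /= 1024
        else if (unit == "G") val *= 1024
        name = $10
        sub(/\/[0-9]+$/, "", name)     # drop the /NLWP suffix
        total[name] += val
    }
    END { for (n in total) printf "%s %.0fM\n", n, total[n] }
    ' | sort -k2,2nr
}

# Example:
#   prstat -s rss 1 1 | rss_by_process
```

Run against the listing above, this groups the many httpd children into a single line as well, which makes it easier to see which service is the real hog.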

Fortunately, my zone seems to have good, fast processors and disks, so I can put up with running it as a standard CGI service. Easy enough to make the switch, just move some things around in my Apache configuration:

ScriptAlias /trac /usr/share/trac/cgi-bin/trac.cgi
#ScriptAlias /trac /usr/share/trac/cgi-bin/trac.fcgi
#DefaultInitEnv TRAC_ENV "/trac/iriverter"

<Location "/trac">
    SetEnv TRAC_ENV "/trac/iriverter"
    Order allow,deny
    Allow from all
</Location>

So things are looking much better, but I’m still not happy with it:

$ prstat -s rss -Z 1 1 | cat
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP       
 15362 webservd   74M   31M sleep   59    0   0:00:00 0.0% httpd/1
 15388 webservd   69M   30M sleep   59    0   0:00:00 0.0% httpd/1
 15366 webservd   66M   22M sleep   59    0   0:00:00 0.0% httpd/1
...
ZONEID    NPROC  SWAP   RSS MEMORY      TIME  CPU ZONE                        
    39       58  254M  113M    44% 244:46:20 0.2% case 

Now Apache is being a hog, and that’s only a few of the httpd processes. By default on Unix, Apache uses the prefork MPM which serves each request from its own process. It likes to keep around a handful of children for performance, so it doesn’t have to spawn a new one each time. The problem is if your request involves PHP, each httpd process will load its own instance of the PHP module and it doesn’t let it go when it’s finished. I get this. It’s all for performance. My initial reaction was: wouldn’t it be nice if Apache was threaded so requests can all share the same PHP code. That’s when I was introduced to the worker MPM. It serves requests from threads so it’s efficient, but also has a couple of children for fault tolerance. This is easy to switch to in OpenSolaris:

# svcadm disable http
# svccfg -s http:apache22 setprop httpd/server_type=worker
# svcadm refresh http
# svcadm enable http

I also copied /etc/apache2/2.2/samples-conf.d/mpm.conf into /etc/apache2/2.2/conf.d/ which includes some sane defaults like only spawning two servers to start with. This was good:

$ prstat -s rss -Z 1 1 | cat
...
ZONEID    NPROC  SWAP   RSS MEMORY      TIME  CPU ZONE                        
    39       50  125M   75M    29% 244:46:23 0.3% case

75M makes me feel safe, like I could take the occasional spam bomb. What I forgot to mention is that mod_php isn’t supported with the worker MPM since any of its extensions might not be thread-safe. This is okay, because PHP can be run as a CGI program which has the additional benefit of being memory efficient (at the cost of speed) since it’s only loaded when it’s executed. All I had to do was create a file /etc/apache2/2.2/conf.d/php-cgi.conf containing:

<IfModule worker.c>
    ScriptAlias /php-cgi /usr/php/bin/php-cgi

    <Location "/php-cgi">
        Order allow,deny
        Allow from all
    </Location>
   
    Action php-cgi /php-cgi
    AddHandler php-cgi .php
    DirectoryIndex index.php
</IfModule>

I’ll be the first to admit that running Trac and WordPress as CGI has made them noticeably slower, but given how little traffic they get, I’d rather they run slower and know that my mail will get to me. If you’re faced with similar resource constraints, you may want to consider these changes. There may be other ways I can tweak Apache, such as unloading unused modules, but I’m not ready to face that yet.