DTrace to the Rescue Again!

I am really beginning to understand just how powerful and important this tool is. I ran into another problem at work where I was seeing errors, but the details weren’t very helpful. I had recently made an NFS-mounted filesystem read-only, and errors started to pop up in the system log:

# dmesg
...
Mar 30 12:09:00 ragno nfs: [ID 808668 kern.notice] NFS write error on host nasfb: Read-only file system.
Mar 30 12:09:00 ragno nfs: [ID 702911 kern.notice] (file handle: 1d010000 1000000 e10d1300 a5122c4b 1d010000 1000000 15000000 5c179c4a)
...

I wanted to know which file was being written to; the file handle by itself wasn’t much use to me. I tried lsof like this:

# lsof -N -z ragno /zones/ragno/root/home
lsof: WARNING: can't stat() 5 zone file systems; using dev= options
COMMAND     PID ZONE      USER   FD   TYPE        DEVICE             SIZE/OFF       NODE NAME
httpd      3065 ragno webservd  cwd   VDIR     256,65718                    0          0 /zones/ragno (ragno)
httpd      3065 ragno webservd  rtd   VDIR     256,65718                    0          0 /zones/ragno (ragno)
httpd      3065 ragno webservd  txt   VREG     256,65718                    0          0 /zones/ragno (ragno)
httpd      3065 ragno webservd  txt   VREG     256,65718                    0          0 /zones/ragno (ragno)
...

but it didn’t give me any useful information either. Maybe I wasn’t using it correctly?

Then I found the kernel function responsible for outputting the NFS write errors that appear in the system log:

...
void
nfs_write_error(vnode_t *vp, int error, cred_t *cr)
{
...

This function is passed a vnode_t pointer, which carries a wealth of data about the current I/O operation, including the file name in v_path. DTrace let me get at this data dynamically:

# dtrace -n 'fbt:nfs:nfs*_write_error:entry /zonename == "ragno"/ {vp = (vnode_t*) arg0; printf("%s", stringof(vp->v_path));}'
dtrace: description 'fbt:nfs:nfs*_write_error:entry ' matched 2 probes
CPU     ID                    FUNCTION:NAME
  0  56891            nfs_write_error:entry /zones/ragno/root/home/rmserver/Logs/rmaccess.log
  0  56891            nfs_write_error:entry /zones/ragno/root/home/rmserver/Logs/rmaccess.log
  0  56891            nfs_write_error:entry /zones/ragno/root/home/rmserver/Logs/rmaccess.log
  0  56891            nfs_write_error:entry /zones/ragno/root/home/rmserver/Logs/rmaccess.log
  0  56891            nfs_write_error:entry /zones/ragno/root/home/rmserver/Logs/rmaccess.log

And there’s the culprit. Now I know which file needs to be relocated to local read/write storage. Neato!
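Since the same path shows up over and over, a DTrace aggregation would keep the output tidier by counting errors per file instead of printing a line for each one. A minimal variation on the same one-liner (an untested sketch; same probes and predicate as above, with Ctrl-C printing the per-file counts):

# dtrace -n 'fbt:nfs:nfs*_write_error:entry /zonename == "ragno"/ {@[stringof(((vnode_t *)arg0)->v_path)] = count();}'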

Watching Network Connections with DTrace

DTrace is one of those magical tools that I sort of know how to use but have never really had the chance to. Usually, firing up truss is enough to figure out what’s happening in a process.

But recently, a team at GW was setting up a web application and having some problems: it would hang whenever it was accessed. The application is highly modular and pulls resources from several other servers, so I suspected some sort of network problem, probably a blocked firewall port somewhere. Unfortunately, the application produces no logs (!), so it was pretty much guesswork.

I started snoop to watch for traffic generated by the application. I knew from watching the production application that there should be communication between two servers over port 9300, but I wasn’t seeing anything. I thought of using truss to determine whether the application was at least trying to connect, but being a web application, the process was far too short-lived to attach to. Even if I had been able to attach to it, all I might have learned was that the application was calling the connect system call and possibly what its return value was.

DTrace, on the other hand, doesn’t have to attach to a particular process. It just sits in the kernel and dynamically observes what your scripts tell it to. It also has the ability to look into data structures so you can actually see what values are passed to functions and system calls. I used this ability to watch every call to the connect system call.

#!/usr/sbin/dtrace -qs

syscall::connect:entry
{
        /* copy the sockaddr (arg2 bytes at user address arg1) into kernel space */
        this->socks = (struct sockaddr *)copyin(arg1, arg2);

        /* the destination port is stored in network (big-endian) byte order */
        this->hport = (uint_t)(this->socks->sa_data[0] & 0xff);
        this->lport = (uint_t)(this->socks->sa_data[1] & 0xff);
        this->port = (this->hport << 8) + this->lport;

        /* assumes an AF_INET sockaddr: the next four bytes are the IPv4 address */
        printf("%s: %d.%d.%d.%d:%d\n", execname,
            this->socks->sa_data[2] & 0xff, this->socks->sa_data[3] & 0xff,
            this->socks->sa_data[4] & 0xff, this->socks->sa_data[5] & 0xff,
            this->port);
}

This script copies arg2 bytes (the sockaddr structure) from the user-space address pointed to by arg1 into the kernel’s address space, then uses that data to work out the destination address and port, which arrive in network (big-endian) byte order. Port 9300, for example, arrives as the bytes 0x24 and 0x54, and (0x24 << 8) + 0x54 = 9300.

When run, the script immediately revealed that the web application was trying to connect to itself rather than a database server, a simple configuration mistake made difficult to diagnose due to poor logging facilities in the application.
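It would also have been straightforward to confirm that those connection attempts were actually failing by watching the return side of the same system call. Something along these lines should do it (an untested sketch; the "httpd" process name is only a placeholder, not part of the original setup):

#!/usr/sbin/dtrace -qs

/*
 * Untested sketch: report connect(2) calls that fail, along with errno.
 * "httpd" is only a placeholder process name; substitute your own.
 */
syscall::connect:return
/execname == "httpd" && errno != 0/
{
        /* errno is a built-in D variable holding the failed call's error */
        printf("%s: connect failed, errno = %d\n", execname, errno);
}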

In the future, there will be a network provider for DTrace which will simplify the job of extracting data from network calls. It should then be possible to rewrite the script to simply:

#!/usr/sbin/dtrace -qs

tcp:::connect-request
{
        printf("%s: %s:%d\n", execname, args[2]->ip_daddr, args[4]->tcp_dport);
}

Hopefully I’ll have more opportunities to use DTrace in the future.