DTrace to the Rescue Again!

I am really beginning to understand just how powerful and important this tool is. I ran into another problem at work where I was seeing errors, but the details weren’t very helpful. I recently made a NFS mounted filesystem read-only, and some errors started to pop up in the system log:

# dmesg
...
Mar 30 12:09:00 ragno nfs: [ID 808668 kern.notice] NFS write error on host nasfb: Read-only file system.
Mar 30 12:09:00 ragno nfs: [ID 702911 kern.notice] (file handle: 1d010000 1000000 e10d1300 a5122c4b 1d010000 1000000 15000000 5c179c4a)
...

I wanted to know which file is being written to. The file handle isn’t very useful for me. I tried using lsof like:

# lsof -N -z ragno /zones/ragno/root/home
lsof: WARNING: can't stat() 5 zone file systems; using dev= options
COMMAND     PID ZONE      USER   FD   TYPE        DEVICE             SIZE/OFF       NODE NAME
httpd      3065 ragno webservd  cwd   VDIR     256,65718                    0          0 /zones/ragno (ragno)
httpd      3065 ragno webservd  rtd   VDIR     256,65718                    0          0 /zones/ragno (ragno)
httpd      3065 ragno webservd  txt   VREG     256,65718                    0          0 /zones/ragno (ragno)
httpd      3065 ragno webservd  txt   VREG     256,65718                    0          0 /zones/ragno (ragno)
...

but it didn’t give me any useful information either. Maybe I’m not using it correctly?

Then I found the function in the kernel which is responsible outputting the NFS write errors that appear in the system log:

...
void
nfs_write_error(vnode_t *vp, int error, cred_t *cr)
{
...

This function gets passed a vnode_t pointer which has tons of data on the current I/O operation, including the file name. DTrace enabled me to access this data dynamically:

# dtrace -n 'fbt:nfs:nfs*_write_error:entry /zonename == "ragno"/ {vp = (vnode_t*) arg0; printf("%s", stringof(vp->v_path));}'
dtrace: description 'fbt:nfs:nfs*_write_error:entry ' matched 2 probes
CPU     ID                    FUNCTION:NAME
  0  56891            nfs_write_error:entry /zones/ragno/root/home/rmserver/Logs/rmaccess.log
  0  56891            nfs_write_error:entry /zones/ragno/root/home/rmserver/Logs/rmaccess.log
  0  56891            nfs_write_error:entry /zones/ragno/root/home/rmserver/Logs/rmaccess.log
  0  56891            nfs_write_error:entry /zones/ragno/root/home/rmserver/Logs/rmaccess.log
  0  56891            nfs_write_error:entry /zones/ragno/root/home/rmserver/Logs/rmaccess.log

And there’s the culprit. Now I know which file needs to be relocated to local read/write storage. Neato!

Leave a Reply