I am really beginning to understand just how powerful and important this tool is. I ran into another problem at work where I was seeing errors, but the details weren’t very helpful. I had recently made an NFS-mounted filesystem read-only, and errors started to pop up in the system log:
# dmesg
...
Mar 30 12:09:00 ragno nfs: [ID 808668 kern.notice] NFS write error on host nasfb: Read-only file system.
Mar 30 12:09:00 ragno nfs: [ID 702911 kern.notice] (file handle: 1d010000 1000000 e10d1300 a5122c4b 1d010000 1000000 15000000 5c179c4a)
...
I wanted to know which file was being written to, and the raw file handle wasn’t much use to me. I tried lsof:
# lsof -N -z ragno /zones/ragno/root/home
lsof: WARNING: can't stat() 5 zone file systems; using dev= options
COMMAND  PID  ZONE  USER     FD  TYPE DEVICE    SIZE/OFF NODE NAME
httpd    3065 ragno webservd cwd VDIR 256,65718 0        0    /zones/ragno (ragno)
httpd    3065 ragno webservd rtd VDIR 256,65718 0        0    /zones/ragno (ragno)
httpd    3065 ragno webservd txt VREG 256,65718 0        0    /zones/ragno (ragno)
httpd    3065 ragno webservd txt VREG 256,65718 0        0    /zones/ragno (ragno)
...
but it didn’t give me any useful information either. Maybe I’m not using it correctly?
Then I found the kernel function responsible for printing the NFS write errors that appear in the system log:
void
nfs_write_error(vnode_t *vp, int error, cred_t *cr)
{
...
This function is passed a vnode_t pointer, which carries plenty of data about the file being written, including its path name (v_path). DTrace let me read that data from the live kernel:
# dtrace -n 'fbt:nfs:nfs*_write_error:entry /zonename == "ragno"/ {vp = (vnode_t*) arg0; printf("%s", stringof(vp->v_path));}'
dtrace: description 'fbt:nfs:nfs*_write_error:entry ' matched 2 probes
CPU     ID                    FUNCTION:NAME
  0  56891            nfs_write_error:entry /zones/ragno/root/home/rmserver/Logs/rmaccess.log
  0  56891            nfs_write_error:entry /zones/ragno/root/home/rmserver/Logs/rmaccess.log
  0  56891            nfs_write_error:entry /zones/ragno/root/home/rmserver/Logs/rmaccess.log
  0  56891            nfs_write_error:entry /zones/ragno/root/home/rmserver/Logs/rmaccess.log
  0  56891            nfs_write_error:entry /zones/ragno/root/home/rmserver/Logs/rmaccess.log
And there’s the culprit. Now I know which file needs to be relocated to local read/write storage. Neato!
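If the same file fails over and over, one line per error gets noisy. Here is a rough sketch of a variation on the one-liner above that counts errors per file instead of printing each one. It assumes the second argument of both matched functions is the errno, as in the nfs_write_error signature shown earlier (I have not checked the other matched probe), and it will raise a DTrace error on any vnode whose v_path is NULL. The aggregation name @errors is my own choice.

```shell
# Sketch only: count NFS write errors per file and errno for zone "ragno".
# Needs the same privileges as the one-liner above; press Ctrl-C to print the summary.
dtrace -q -n '
fbt:nfs:nfs*_write_error:entry
/zonename == "ragno"/
{
	/* arg0: vnode_t pointer; arg1: assumed to be the errno (int error) */
	this->vp = (vnode_t *)arg0;
	@errors[stringof(this->vp->v_path), (int)arg1] = count();
}

dtrace:::END
{
	/* one line per (path, errno) pair, followed by how often it occurred */
	printa("%-60s errno=%d count=%@d\n", @errors);
}'
```

The -q option suppresses the default per-probe output, so only the summary printed from the END clause shows up.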