But recently, a team at GW was setting up a web application and having some problems. It would hang whenever it was accessed. The application is highly modular and accesses several other servers for resources, so I suspected there must have been some sort of network problem, probably a blocked firewall port somewhere. Unfortunately, the application produces no logs (!) so it was pretty much guesswork.
I started snoop to watch for traffic generated by the application. I knew from watching the production application that there should be communication between two servers over port 9300, but I wasn’t seeing anything. I thought of using truss to determine if the application was at least trying to connect, but being a web application, the process was way too short-lived to attach to. Even if I was able to attach to it, all I might have learned was that the application was calling the ‘connect’ system call and possibly getting a return value.
DTrace, on the other hand, doesn’t have to attach to a particular process. It just sits in the kernel and dynamically observes what your scripts tell it to. It also has the ability to look into data structures so you can actually see what values are passed to functions and system calls. I used this ability to watch every call to the connect system call.
socks = (struct sockaddr*) copyin(arg1, arg2);
hport = (uint_t) socks->sa_data;
lport = (uint_t) socks->sa_data;
hport <<= 8;
port = hport + lport;
printf("%s: %d.%d.%d.%d:%d\n", execname, socks->sa_data, socks->sa_data, socks->sa_data, socks->sa_data, port);
This script copies arg2 bytes from the address pointed to by arg1 into kernel space and uses the data to determine the port (big endian order) and destination address.
When run, the script immediately revealed that the web application was trying to connect to itself rather than a database server, a simple configuration mistake made difficult to diagnose due to poor logging facilities in the application.
In the future, there will be a network provider for DTrace which will simplify the job of extracting data from network calls. It should then be possible to rewrite the script to simply:
printf("%s: %s:%d\n", execname, args->ip_daddr, args->tcp_dport);
Hopefully I’ll have more opportunities to use DTrace in the future.