[...] one process is waiting for a very long time in a send system call. It is sending on a valid fd but the question we have is that, is there any way to find who is on the other end of that fd? We want to know to which process is that message being sent to. [...]
Here is how I find the other end of a socket and the state of the socket connection, using Mozilla's Thunderbird mail client as one end of the connection:
Get the process id of the application
% prstat
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
 22385 mandalik  180M   64M sleep   49    0   0:05:15 0.1% thunderbird-bin/5
Run pfiles on the PID. It prints the list of open files, including open sockets (pfiles is the Solaris-supported equivalent of the lsof utility).
% pfiles 22385
22385:  /usr/lib/thunderbird/thunderbird-bin -UILocale C -contentLocale C
  Current rlimit: 512 file descriptors
  ...
  ...
  33: S_IFSOCK mode:0666 dev:280,0 ino:31544 uid:0 gid:0 size:0
      O_RDWR|O_NONBLOCK
        SOCK_STREAM
        SO_SNDBUF(49152),SO_RCVBUF(49640)
        sockname: AF_INET 192.168.1.2  port: 60364
        peername: AF_INET 192.18.39.10  port: 993
  ...
  ...
Locate the socket id (the file descriptor number) and the corresponding sockname/port and peername/port in the pfiles output from the previous step.
I assume here that you already know which socket id you are interested in. In the output above, 33 is the socket id. One end of the socket is bound to port 60364 on the local host 192.168.1.2, and the other end is bound to port 993 on the remote host 192.18.39.10.
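Picking the two endpoints out of a long pfiles listing by eye is error-prone, so here is a minimal sketch that does it with awk. The function name fd_endpoints is hypothetical, and the field positions assume the sockname/peername layout shown in the pfiles output above; other address families may print differently.

```shell
# fd_endpoints FD
# Reads `pfiles <pid>` output on stdin and prints the local and remote
# address/port of descriptor FD. Hypothetical helper; it assumes lines of the
# form "sockname: AF_INET <addr>  port: <port>" as in the listing above.
fd_endpoints() {
    awk -v fd="$1" '
        BEGIN { cur = "" }
        # Any "<number>:" line starts a new entry. Only entries whose second
        # field is a file type (S_IF...) are descriptors; the first such line
        # in the output is the process header and is skipped.
        $1 ~ /^[0-9]+:$/ {
            cur = ""
            if ($2 ~ /^S_IF/) { cur = $1; sub(/:$/, "", cur) }
        }
        cur == fd && $1 == "sockname:" { print "local", $3, $5 }
        cur == fd && $1 == "peername:" { print "remote", $3, $5 }
    '
}
```

For the example in this post it would be used as `pfiles 22385 | fd_endpoints 33`.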
Run netstat -a and pipe its output to egrep, filtering on the local and remote port numbers.
% netstat -a | egrep "60364|993"
solaris-devx-iprb0.60364 mail-sfbay.sun.com.993 48460 0 49640 0 ESTABLISHED
To see IP addresses instead of host names, run netstat with the -n option.
% netstat -an | egrep "60364|993"
192.168.1.2.60364 192.18.39.10.993 49559 0 49640 0 ESTABLISHED
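The egrep filter matches either number anywhere in the line, so it can also pick up unrelated connections or a window-size column that happens to contain the same digits. A stricter filter, sketched below, matches the local port only at the end of the first column and the remote port only at the end of the second. The function name conn_state is hypothetical, and the column layout (local address, remote address, ..., state last) is assumed from the netstat -an output shown above.

```shell
# conn_state LOCAL_PORT REMOTE_PORT
# Reads `netstat -an` output on stdin and prints local address, remote address
# and state for connections whose local address ends in ".LOCAL_PORT" and whose
# remote address ends in ".REMOTE_PORT". Hypothetical helper; it assumes the
# state is the last column, as in the output above.
conn_state() {
    awk -v lport="$1" -v rport="$2" '
        $1 ~ ("\\." lport "$") && $2 ~ ("\\." rport "$") { print $1, $2, $NF }
    '
}
```

For the connection above: `netstat -an | conn_state 60364 993`.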
Now that we know both ends of the socket, we can get the state of the socket connection at the other end by running netstat -an on the remote host and filtering its output with egrep in the same way.
If the state of the socket connection is CLOSE_WAIT, have a look at the following diagnosis: CPU hog with connections in CLOSE_WAIT.
Finally, to answer the "... which process is that message being sent to ..." part of the original question:
Follow the steps above to find the remote host (or IP address) and the remote port number. To find the process ID on the remote machine that owns the other half of the socket, do the following:
Log in as the root user on the remote host.
cd /proc
Run pfiles * | egrep "^[0-9]|sockname" > /tmp/pfiles.txt.
Open /tmp/pfiles.txt in vi and search for the port number. If you scroll a few lines up, you can see the process ID and the name of the process along with its argument(s).
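The manual search in the last step can also be scripted. The sketch below remembers the most recent process header line and prints it whenever a sockname line with the wanted port follows. The function name pid_for_port is hypothetical, and the parsing assumes the pfiles layout shown earlier: a "<pid>: <command line>" header, then descriptor entries whose second field starts with S_IF.

```shell
# pid_for_port PORT
# Reads pfiles output for one or more processes on stdin and prints
# "<pid> <command line>" once for every process that has a socket whose
# sockname uses PORT. Hypothetical helper, assuming the pfiles layout above.
pid_for_port() {
    awk -v port="$1" '
        # Process header: "<pid>:" followed by the command, not a file type.
        $1 ~ /^[0-9]+:$/ && $2 !~ /^S_IF/ {
            pid = $1; sub(/:$/, "", pid)
            cmd = $0; sub(/^[0-9]+:[ \t]*/, "", cmd)
            next
        }
        # Local end of a socket bound to the wanted port; report each PID once.
        $1 == "sockname:" && $4 == "port:" && $5 == port && !seen[pid]++ {
            print pid, cmd
        }
    '
}
```

On the remote host this would be run as root from /proc, for example `pfiles * | pid_for_port 993`. A busy server may list several processes for the same port, in which case the peername lines in the full pfiles output tell them apart.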