OK 'never' is a bit strong... :-)
Some background for you... A number of our mail servers running sendmail end up in a state where we start running out of available slots (based on the number of child processes). This appears to be due to door_call() not returning in a timely fashion. So far it's taking over an hour for some of these processes to exit, if they exit at all.
So, for example:
root@mailer /# ps -e -o user,pid,ppid,etime,args | grep 24866
root   913 20488    00:00 grep 24866
root 24866 20829 01:15:08 /usr/lib/sendmail -bd -q15m
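To spot these children without eyeballing the etime column, something like the following might help. It is only a rough sketch: the function names and the one-hour default are my own assumptions, it expects a POSIX shell such as ksh or bash (the legacy Solaris /bin/sh may not accept this syntax), and it assumes etime is printed as [[dd-]hh:]mm:ss.

```shell
# Sketch: flag sendmail processes whose elapsed time exceeds a threshold.
# Function names and the 3600-second default are illustrative assumptions.

# Convert a ps etime value ([[dd-]hh:]mm:ss) to seconds.
etime_to_secs() {
    echo "$1" | awk '{
        days = 0; t = $1
        if (index(t, "-") > 0) { split(t, d, "-"); days = d[1]; t = d[2] }
        n = split(t, p, ":")
        if (n == 3)      secs = p[1] * 3600 + p[2] * 60 + p[3]
        else if (n == 2) secs = p[1] * 60 + p[2]
        else             secs = p[1]
        print days * 86400 + secs
    }'
}

# Read "pid etime args..." lines on stdin and print "pid seconds" for each
# sendmail process older than the limit (first argument, default 3600).
long_running_sendmail() {
    limit=${1:-3600}
    while read -r pid etime args; do
        case "$args" in
            *sendmail*) ;;        # keep sendmail processes
            *) continue ;;        # skip the header line and everything else
        esac
        secs=$(etime_to_secs "$etime")
        if [ "$secs" -gt "$limit" ]; then
            echo "$pid $secs"
        fi
    done
}
```

It would be run as: ps -e -o pid,etime,args | long_running_sendmail 3600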
Identifying this event is proving tricky, though, which is why I was thinking of DTrace.
root@mailer /# echo "1::findstack -v" | mdb -p 24866
stack pointer for thread 1: ffbfcaa0
[ ffbfcaa0 libc.so.1`_door_call+8() ]
  ffbfcb18 libc.so.1`_nsc_trydoorcall_ext+0x1b8(ffbfcc74, ffbfcc70, ffbfcc6c, 0, 0, deadbeed)
  ffbfcc08 libc.so.1`_nsc_search+0xc0(ff1a6868, ff123dc4, 5, ffbfcd48, fecf0000, 16f4a0)
  ffbfcc78 libc.so.1`nss_search+0x34(ff1a6868, ff123dc4, 5, ffbfcd48, feef2a00, 0)
  ffbfcce8 libnsl.so.1`_switch_getipnodebyaddr_r+0x60(ffbfce78, 10, 1a, 16a8a0, 16a8b4, 2120)
  ffbfcd80 libnsl.so.1`_get_hostserv_inetnetdir_byaddr+0x2dc(123aa0, ffbfced0, ffbfce88, 123aa0, ff127658, 6)
  ffbfce18 libnsl.so.1`getipnodebyaddr+0x4b4(10a11c, 4, 2, ffbfcf4c, 165ea0, 16a8a0)
  ffbfcee8 sendmail`sm_gethostbyaddr+0x70(10a11c, 4, 2, 1, 5, 10d000)
  ffbfcf50 sendmail`hostnamebyanyaddr+0x78(10a118, 14, 5, 109b8c, 4, 2)
  ffbfcfc0 sendmail`getrequests+0x11d0(e4ac8, 10b400, 10b800, 10b7fc, 134, 0)
  ffbfdcc0 main+0x3c04(10ac00, 10b400, 384, d6000, e4800, b2800)
  ffbffca8 _start+0x108(0, 0, 0, 0, 0, 0)
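The same mdb command could be wrapped to check several PIDs for a _door_call frame. Treat this as a sketch under assumptions: in_door_call and report_door_waiters are hypothetical helper names, only thread 1 is inspected (as in the command above), and attaching mdb to a live process will probably pause it briefly.

```shell
# Sketch: report which PIDs currently show a libc _door_call frame.
# Helper names are illustrative assumptions, not existing tools.

# Succeed if the stack text on stdin contains a libc _door_call frame.
in_door_call() {
    grep 'libc\.so\.1`_door_call' >/dev/null
}

# For each PID given, dump the stack of thread 1 with mdb and report the
# process if it appears to be sitting in door_call.
report_door_waiters() {
    for pid in "$@"; do
        if echo "1::findstack -v" | mdb -p "$pid" 2>/dev/null | in_door_call; then
            echo "$pid blocked in door_call"
        fi
    done
}
```

For example: report_door_waiters 24866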
Well then, we have a lot more information than it first appeared, eh? We know the process in question.
There are a couple of potential problems here, but the most likely one, based on my experience, is here:
Are you using IPv6?
ffbfce18 libnsl.so.1`getipnodebyaddr+0x4b4(10a11c, 4, 2, ffbfcf4c, 165ea0, 16a8a0)