4 Replies Latest reply: Jun 11, 2010 8:42 AM by 807559

    Function entry but no return

    807559
      I'm trying to figure out a way to isolate the PIDs of processes that enter door_call() but never return. I'm finding myself needing more control structures than are available in D.

      Has anyone managed to write this sort of script?

      Cheers

      Steve
        • 1. Re: Function entry but no return
          807559
          How are you defining 'never'? I can imagine a few approaches that don't seem so complicated, so perhaps I am missing exactly what you want to do.
          • 2. Re: Function entry but no return
            807559
            OK 'never' is a bit strong... :-)

            Some background for you... A number of our mail servers running sendmail end up in a state where we start running out of available slots (based on the number of child processes). This appears to be due to door_call() not returning in a timely fashion. So far it's taking over an hour for some of these processes to exit, if they exit at all.

            So, for example:
            root@mailer /# ps -e -o user,pid,ppid,etime,args  | grep 24866
                root   913 20488       00:00 grep 24866
                root 24866 20829    01:15:08 /usr/lib/sendmail -bd -q15m
            root@mailer /# echo "1::findstack -v" | mdb -p 24866
            stack pointer for thread 1: ffbfcaa0
            [ ffbfcaa0 libc.so.1`_door_call+8() ]
              ffbfcb18 libc.so.1`_nsc_trydoorcall_ext+0x1b8(ffbfcc74, ffbfcc70, ffbfcc6c, 0
              , 0, deadbeed)
              ffbfcc08 libc.so.1`_nsc_search+0xc0(ff1a6868, ff123dc4, 5, ffbfcd48, fecf0000
              , 16f4a0)
              ffbfcc78 libc.so.1`nss_search+0x34(ff1a6868, ff123dc4, 5, ffbfcd48, feef2a00, 
              0)
              ffbfcce8 libnsl.so.1`_switch_getipnodebyaddr_r+0x60(ffbfce78, 10, 1a, 16a8a0, 
              16a8b4, 2120)
              ffbfcd80 libnsl.so.1`_get_hostserv_inetnetdir_byaddr+0x2dc(123aa0, ffbfced0, 
              ffbfce88, 123aa0, ff127658, 6)
              ffbfce18 libnsl.so.1`getipnodebyaddr+0x4b4(10a11c, 4, 2, ffbfcf4c, 165ea0, 
              16a8a0)
              ffbfcee8 sendmail`sm_gethostbyaddr+0x70(10a11c, 4, 2, 1, 5, 10d000)
              ffbfcf50 sendmail`hostnamebyanyaddr+0x78(10a118, 14, 5, 109b8c, 4, 2)
              ffbfcfc0 sendmail`getrequests+0x11d0(e4ac8, 10b400, 10b800, 10b7fc, 134, 0)
              ffbfdcc0 main+0x3c04(10ac00, 10b400, 384, d6000, e4800, b2800)
              ffbffca8 _start+0x108(0, 0, 0, 0, 0, 0)
            But trying to identify this event is proving tricky, which is why I was thinking of DTrace.
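
As a rough first pass before reaching for DTrace, the ps listing above can be filtered for long-lived sendmail processes. The sketch below is in Python and only assumes the `user,pid,ppid,etime,args` column order used in that command. The helper names `etime_to_seconds` and `long_running` are made up for illustration. It does not prove a process is stuck in door_call(); it only shortlists PIDs worth checking with the mdb findstack shown above.

```python
def etime_to_seconds(etime):
    """Convert a ps etime value ([[dd-]hh:]mm:ss) to seconds."""
    days = 0
    if "-" in etime:
        day_part, etime = etime.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in etime.split(":")]
    # Pad missing leading fields (hours) with zero.
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def long_running(ps_output, command, min_seconds):
    """Return [(pid, elapsed_seconds)] for processes whose args start
    with `command` and whose elapsed time is at least `min_seconds`.

    Assumes the columns are: user pid ppid etime args.
    """
    hits = []
    for line in ps_output.splitlines():
        fields = line.split(None, 4)
        # Skip the header line and anything malformed.
        if len(fields) < 5 or not fields[1].isdigit():
            continue
        _user, pid, _ppid, etime, args = fields
        if args.startswith(command):
            elapsed = etime_to_seconds(etime)
            if elapsed >= min_seconds:
                hits.append((int(pid), elapsed))
    return hits


if __name__ == "__main__":
    # The two lines from the listing above, used as sample input.
    sample = (
        "    root   913 20488       00:00 grep 24866\n"
        "    root 24866 20829    01:15:08 /usr/lib/sendmail -bd -q15m\n"
    )
    for pid, elapsed in long_running(sample, "/usr/lib/sendmail", 3600):
        print(pid, elapsed)
```

The one-hour threshold is arbitrary, taken from the "over an hour" figure mentioned above. Note that the listening parent daemon will also be long-lived, so in practice it may need to be excluded by PID or PPID.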
            • 3. Re: Function entry but no return
              807559
              Well then, we have a lot more information than it first appeared, eh? We know the process in question.

              There are a couple of potential problems here, but the most likely one to me, based on experience, is here:
              ffbfce18 libnsl.so.1`getipnodebyaddr+0x4b4(10a11c, 4, 2, ffbfcf4c, 165ea0, 16a8a0)
              Are you using IPv6?
              • 4. Re: Function entry but no return
                807559
                Hi,

                Now, we're not deliberately using IPv6, but didn't the whole host and IP address lookup mechanism get updated so that it didn't matter which version of IP you were using?

                Or have I become confused?

                Cheers

                Steve