2 Replies Latest reply: Apr 3, 2012 9:35 AM by scampsd

    SUN Studio 12 works on one process but fails on another

    scampsd
      Good afternoon,
      I am working on a Solaris 10 machine (a Sun T2000 server).

      On this machine, Sun Studio 12 is installed:
      pkginfo | grep -i dbx
      application SPROdbx                      Sun Studio 12 Debugging Tools
      application SPROdbxx                     Sun Studio 12 Debugging Tools 64-bit
      I have two processes; dbx works for one of them but not for the other:
      Process 1) (dbx is OK):
      dbx wpbx_sdp30_sep
      For information about new features see `help changes'
      To remove this message, put `dbxenv suppress_startup_message 7.6' in your .dbxrc
      Reading wpbx_sdp30_sep
      Reading ld.so.1
      Process 2) (dbx is NOK):
      dbx wpbx_sdp31_sep
      For information about new features see `help changes'
      To remove this message, put `dbxenv suppress_startup_message 7.6' in your .dbxrc
      Reading wpbx_sdp31_sep
      
      dbx: internal error: signal SIGSEGV (no mapping at the fault address)
      dbx's coredump will appear in /tmp
      Abort(coredump)
      I have already run "truss" on "dbx", but this did not reveal any useful information.
      Do you have any idea how to stop dbx from dumping core on the second process?

      Thanks
      Dominique
        • 1. Re: SUN Studio 12 works on one process but fails on another
          scampsd
          Good afternoon,
          In the meantime I have found the dbx core file and extracted a call stack from it. This is the result:
          <Machine_prompt>#cd /opt/SUNWspro/bin
          <Machine_prompt>#dbx dbx /var/core/core_<Machine_name>_dbx_10000
          For information about new features see `help changes'
          To remove this message, put `dbxenv suppress_startup_message 7.6' in your .dbxrc
          Reading dbx
          core file header read successfully
          Reading ld.so.1
          Reading libintl.so.1
          Reading libnsl.so.1
          Reading libsocket.so.1
          Reading libdl.so.1
          Reading libgen.so.1
          Reading libw.so.1
          Reading libm.so.1
          Reading libc.so.1
          Reading libc_psr.so.1
          Reading en_US.ISO8859-1.so.3
          Reading libcpc.so.1
          Reading libpctx.so.1
          Reading libnvpair.so.1
          Reading libdevinfo.so.1
          Reading libproc.so.1
          Reading libsec.so.1
          Reading librtld_db.so.1
          Reading libelf.so.1
          Reading libctf.so.1
          Reading libavl.so.1
          Reading ld.so.1
          WARNING!!
          A loadobject was found with an unexpected checksum value.
          See `help core mismatch' for details, and run `proc -map'
          to see what checksum values were expected and found.
          dbx: warning: Some symbolic information might be incorrect.
          program terminated by signal ABRT (Abort)
          0xffffffff7e8d3894: __lwp_kill+0x0008:  bcc,a,pt  %icc,__lwp_kill+0x18  ! 0xffffffff7e8d38a4
          (dbx) where -l
          =>[1] libc.so.1:__lwp_kill(0x0, 0x6, 0xffffffff7e8d2798, 0x1a1b20, 0x0, 0x0), at 0xffffffff7e8d3894
            [2] libc.so.1:raise(0x6, 0x0, 0x1000bfd0c, 0xffffffffffffffff, 0xffffffff7e9ec000, 0x0), at 0xffffffff7e870ec8
            [3] libc.so.1:abort(0x1, 0x1b8, 0xffffffff7e8d2798, 0x1a1b20, 0x0, 0x0), at 0xffffffff7e84a5ec
            [4] 0x1000bfd0c(0xb, 0x100400, 0x1, 0x1005104ec, 0x10051d110, 0x10061c000), at 0x1000bfd0c
            [5] libc.so.1:__sighndlr(0xb, 0xffffffff7fffe570, 0xffffffff7fffe290, 0x1000bfa9c, 0x0, 0xa), at 0xffffffff7e8d2798
            ---- called from signal handler with signal 11 (SIGSEGV) ------
            [6] libc.so.1:strlen(0x5369c7cb00001, 0x100639ee8, 0x5369c7cb00001, 0x1e, 0x1, 0xffffffff7fffe8a8), at 0xffffffff7e83b598
            [7] libc.so.1:_strdup(0x5369c7cb00001, 0xffffffff7cc6ff98, 0x1, 0x1ab94c, 0x100505748, 0x1ab800), at 0xffffffff7e873dc4
            [8] 0x100333098(0x100639ee0, 0x100639ee8, 0xffffffff7cb00000, 0x1e, 0x40, 0xffffffff7fffe8a8), at 0x100333098
            [9] 0x10025af38(0x10066e340, 0xffffffff7fffec10, 0xffffffff7cb00000, 0x1706d4, 0x1006726e0, 0x100671a30), at 0x10025af38
            [10] 0x1002582d0(0x1006726e0, 0xffffffff7cb00000, 0x1706d4, 0xffffffff7fffec10, 0x10066e340, 0x10056ddfc), at 0x1002582d0
            [11] 0x100257f40(0x1006726e0, 0xffffffff7fffec10, 0x1706d4, 0x100569, 0x100569c90, 0x10056dd9b), at 0x100257f40
            [12] 0x1002507f4(0x1006726e0, 0x1006985b0, 0xffffffffffffffff, 0x1006726e0, 0x100638860, 0x1006985da), at 0x1002507f4
            [13] 0x1002510dc(0x1006726e0, 0x1006985b0, 0xffffffffffffffff, 0x10056c, 0x100400, 0x10061c000), at 0x1002510dc
            [14] 0x1002538c0(0x10056d438, 0x1006985b0, 0x1, 0x10056d, 0x1, 0x1006726e0), at 0x1002538c0
            [15] 0x1001b5ec4(0x100672050, 0x0, 0x0, 0x8, 0x100400, 0x0), at 0x1001b5ec4
            [16] 0x10022fd80(0x10051d110, 0xffffffff7ffff5a8, 0x0, 0x8002, 0x100672050, 0x0), at 0x10022fd80
            [17] 0x10022f8e8(0x10051d110, 0xffffffff7ffff5a8, 0xffffffff7ffff5a8, 0x100538ef8, 0x8, 0xffffffff7ffff922), at 0x10022f8e8
            [18] 0x1000bff9c(0x10051d110, 0x10063f8b0, 0xffffffff7ffff5a8, 0x100538ef8, 0x100538000, 0x100538), at 0x1000bff9c
            [19] 0x1000c4878(0x2, 0xffffffff7ffff6b8, 0x10063f8b0, 0x1005111a8, 0x8202, 0x1000bfa9c), at 0x1000c4878
          (dbx) exit
          dbx: internal warning: td_ta_clear_event() failed -- debugger service failed
          dbx: internal warning: td_ta_sync_tracking_enable(0) failed -- debugger service failed
          I hope this helps you find an answer.
          Best regards
          Dominique

          Edited by: scampsd on Apr 3, 2012 3:56 PM
          • 2. Re: SUN Studio 12 works on one process but fails on another
            scampsd
            Good news, I have found the problem:

            As the call stack from the dbx core file shows, the crash occurs in the "strlen" function in "libc.so.1".
            I have also computed string lengths myself (using awk) and I ran into the same problem:
            <Machine_prompt># strings <process_name> | awk '{print $1 "|" length($1)}'
            ...
            awk: line 0 (NR=4774): Record too long (LIMIT: 19999 bytes)
            After some analysis I did indeed find an entry in the output of "strings <process_name>" that was over 24000 characters long.
            I have contacted the programmer; (s)he should be able to change this, which should solve the problem.
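
For anyone who hits the same awk record limit, the check can also be done without awk. The following is only a minimal sketch in Python, not what I actually ran: it scans the binary for runs of printable ASCII characters (roughly what "strings" reports, assuming a minimum run length of 4) and lists every run longer than a threshold. The function names, the script name and the default threshold of 19999 (taken from the awk error message) are my own assumptions, and unlike the awk command it measures the whole run rather than only the first whitespace-separated field.

```python
import re
import sys


def printable_runs(data: bytes, min_len: int = 4):
    """Yield (offset, run) for each run of printable ASCII (space to tilde, plus tab)."""
    # Assumption: "printable" means 0x20-0x7e or tab; real strings(1) rules may differ.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group()


def long_strings(data: bytes, threshold: int, min_len: int = 4):
    """Return (offset, length) for every printable run longer than threshold."""
    return [
        (offset, len(run))
        for offset, run in printable_runs(data, min_len)
        if len(run) > threshold
    ]


if __name__ == "__main__":
    # Hypothetical usage: python3 find_long_strings.py <process_name> [threshold]
    path = sys.argv[1]
    threshold = int(sys.argv[2]) if len(sys.argv) > 2 else 19999
    with open(path, "rb") as handle:
        contents = handle.read()
    for offset, length in long_strings(contents, threshold):
        print(f"offset {offset:#x}: {length} characters")
```

Run against the failing executable, this should print the file offset and length of each oversized string, which makes it easier to point the programmer at the exact entry.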

            Kind regards
            Dominique

            Edited by: scampsd on Apr 3, 2012 4:35 PM