1 Reply Latest reply: May 27, 2014 11:45 AM by Todd Little-Oracle

    Core Dump in Foccur32() due to Address Misalignment

    2672821

      Hi,

       

      I am working with Tuxedo 11.1_RP090 on HP-UX IA64. After a call to tpcall(), the process dumps core because of an address alignment issue. The following is an excerpt from the gdb backtrace.

      <snip>

      Program terminated with signal 10, Bus error.

      BUS_ADRALN - Invalid address alignment. Please refer to the following link that helps in handling unaligned data: http://docs.hp.com/en/7730/newhelp0610/pragmas.htm#pragma-pack-ex3

      #0  0xc000000000211ab0:0 in _lwp_kill+0x30 ()

         from /usr/lib/hpux64/libpthread.so.1

      (gdb) db

      Undefined command: "db".  Try "help".

      (gdb) bt

      #0  0xc000000000211ab0:0 in _lwp_kill+0x30 ()

         from /usr/lib/hpux64/libpthread.so.1

      #1  0xc000000000178810:0 in pthread_kill+0x9d0 ()

         from /usr/lib/hpux64/libpthread.so.1

      #2  0xc0000000003f80e0:0 in raise+0xe0 () from /usr/lib/hpux64/libc.so.1

      #3  0xc00000001e5a2d80:0 in skgesigOSCrash () at skgesig.c:376

      #4  0xc00000001f666900:0 in kpeDbgSignalHandler () at kpedbg.c:1074

      #5  0xc00000001e5a3220:0 in skgesig_sigactionHandler () at skgesig.c:799

      #6  <signal handler called>

      #7  Foccur32 () at Foccur32.c:87

      #8  0xc00000001498c020:0 in _tmaff_delallflds () at affinity.c:725

      #9  0xc00000001498b570:0 in _tmaff_acall () at affinity.c:117

      #10 0xc00000001478f7a0:0 in _tpacall_internal () at tmacall.c:588

      #11 0xc0000000147a2a30:0 in _tpcall_internal () at tmcall.c:349

      #12 0xc0000000147a0ed0:0 in _tpcall_ () at tmcall.c:157

      #13 0xc0000000147a3790:0 in tpcall () at tmcall.c:474

      #14 0xc000000002a3bc90:2 in inline mtux_flags () at my_app.c:1078

      #15 0xc000000002a3bc80:2 in mtux_sync (l_name=<not available>,

          l_service=<not available>, l_request_buf=<not available>,

          l_request_buf_len=<not available>, l_response_buf=<not available>,

          l_response_buf_len=<not available>, l_flags=<not available>)

          at my_app.c:1215

      </snip>

       

      In frame 15, there is a call to tpcall() as follows:

      tpcall((char *)l_service,

                     l_request_buf,

                     l_request_buf_len,

                     l_response_buf,

                     l_response_buf_len,

                     mtux_flags(l_flags))

       

      From there the Tuxedo code is called, and signal 10 is raised in frame 7, inside Foccur32(). Since we do not have the source for Foccur32.c, it is difficult for us to determine whether one of the data structures passed as an input parameter to tpcall() is misaligned, or whether the problem is a misaligned pointer inside the Foccur32() code.

      It would be helpful if someone could advise which structure should be examined for possible misalignment.

       

       

      Can someone please tell me how to find out whether the fault at line 87 of Foccur32() is caused by the parameters passed to tpcall()?

      I am not sure whether it is due to the parameters we pass or due to some internal data structure used by Foccur32().
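
      One way to rule the application's own pointers in or out would be to log their alignment just before the tpcall() in mtux_sync(). The following is only a minimal sketch: is_aligned() and check_buffer() are hypothetical helper names, not Tuxedo functions, and the 8-byte boundary is an assumption about what a 64-bit IA64 build needs. It checks only the addresses the application passes in and cannot inspect any of Tuxedo's internal structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if the address p is a multiple of `alignment` bytes, else 0.
 * `alignment` must be a non-zero power of two (for example 4, 8, 16). */
static int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (uintptr_t)(alignment - 1)) == 0;
}

/* Hypothetical debugging helper: reports the alignment of one pointer
 * on stderr and returns the result of the check. The pointer is never
 * dereferenced, so this is safe to call even on a suspect address. */
static int check_buffer(const char *label, const void *buf, size_t alignment)
{
    int ok = is_aligned(buf, alignment);
    fprintf(stderr, "%s=%p %s %lu-byte aligned\n",
            label, (void *)buf, ok ? "is" : "is NOT",
            (unsigned long)alignment);
    return ok;
}
```

      Calling check_buffer("l_request_buf", l_request_buf, 8) immediately before tpcall(), and the same for l_response_buf and for the pointer that l_response_buf refers to, would show whether any address handed to Tuxedo is already off an 8-byte boundary. If all of them pass, the misaligned address was most likely produced somewhere other than in the arguments themselves.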

       


        • 1. Re: Core Dump in Foccur32() due to Address Misalignment
          Todd Little-Oracle

          Hi,

           

          From a quick look at the code in affinity.c, my guess is that you have a memory corruption problem. The fault occurs while trying to access the Tuxedo META TCM, a header that Tuxedo associates with the request. If you can create a simple reproducer, I would suggest opening a support request and providing the reproducer.

           

          As always, if this is a new problem, what has changed in your environment or application? Also, what is the environment/configuration of your Tuxedo application?

           

          Regards,

          Todd Little

          Oracle Tuxedo Chief Architect