13 Replies Latest reply on Mar 17, 2011 2:21 PM by 777730

    Exception handling is not working in GCC compile shared object

    777730
      Hello,

      I am facing a very strange issue on the Solaris x86_64 platform with C++ code compiled using gcc 3.4.3.

      I have compiled a shared object that is loaded into the web server's process space during initialization. Whenever an exception is thrown in our code base, it is not caught, even though exception handlers are in place. The same code has been working fine for a long time on Solaris x86, SPARC, and Linux platforms.

      With dbx, I get the following stack trace:
      dbx: internal error: reference through NULL pointer at line 973 in file symbol.cc
      [1] 0x11335(0x1, 0x1, 0x474e5543432b2b00, 0x59cb60, 0xfffffd7fffdff2b0, 0x11335), at 0x11335
      ---- hidden frames, use 'where -h' to see them all ----
      =>[4] __cxa_throw(obj = (nil), tinfo = (nil), dest = (nil), , line 75 in "eh_throw.cc"
      [5] OBWebGate_Authent(r = 0xfffffd7fff3fb300), line 86 in "apache.cpp"
      [6] ap_run_post_config(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x444624
      [7] main(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x42c39a

      I am using the following compile and link options.

      The compile command is:

      /usr/sfw/bin/g++ -c -I/scratch/ashishas/view_storage/build/coreid1014/palantir/apache22/solaris-x86_64/include -m64 -fPIC -D_REENTRANT -Wall -g -o apache.o apache.cpp

      The link command is:
      /usr/sfw/bin/g++ -shared -m64 -o apache.so apache.o -lsocket -lnsl -ldl -lpthread -lthread


      At line 86 we just throw a simple exception, for which catch handlers are in place. We also have a catch(...) handler.

      Surprisingly, the same issue is not observed if we build the code as an executable.
      The issue only occurs when the shared object is loaded into the web server. If it is a plain shared object opened by any other executable, it works fine.


      Can someone help me out? This is a completely blocking issue for us. Using the Sun Studio compiler is not an option as of now.
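For readers who want to reproduce the setup, the pattern described above can be reduced to a sketch like the one below. The function name `init_hook` and the exception type are hypothetical stand-ins; the real code in apache.cpp is not shown in this thread. Written in C++98 style so that it also builds with gcc 3.4.3.

```cpp
#include <stdexcept>

// Hypothetical stand-in for a module entry point such as OBWebGate_Init.
// Returns 0 if the typed handler catches the exception,
// 1 if only the catch(...) handler does.
extern "C" int init_hook() {
    try {
        throw std::runtime_error("simulated failure");
    } catch (const std::exception&) {
        return 0;  // expected path when unwinding works
    } catch (...) {
        return 1;  // fallback handler
    }
}
```

Built into a shared object with the compile and link commands above and called from a host process, this should return 0 when unwinding works. In the failing scenario described in the post, neither handler is reached and the process dumps core inside __cxa_throw.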
        • 1. Re: Exception handling is not working in GCC compile shared object
          Fedor-Oracle
          shared object that load into web server process space
          ... same issue didn't observe if we make it as executable.
          When you "inject" your shared object into some other process, the well-being of your exception handling depends on that other process.

          The mechanics of x64 stack traversal (unwinding) performed when you throw an exception are quite complicated,
          and in particular involve a "nearly standardized" _Unwind interface (say, _Unwind_RaiseException).

          When we are talking about g++ on Solaris, there are two implementations of the _Unwind interface: one in libc and one in libgcc_s.so.

          When you g++-compile the executable, you get it directly linked with libgcc_s.so and the _Unwind stuff resolves into libgcc_s.

          When a g++-compiled shared object is loaded into a non-g++-compiled executable's process, the _Unwind calls are most likely already resolved into Solaris libc.

          That's why you might see the difference.
          Now, what exactly causes this difference can vary; I can only speculate.

          None of this would be a problem if the _Unwind interface were completely standardized and properly implemented.
          However, there are currently two issues:
          * gcc (libstdc++ in particular) happens to use additional non-standard _Unwind calls which are not present in Solaris libc.
          Naturally, the implementation details of _Unwind in libc differ from those in libgcc_s, so when all the standard _Unwind
          routines are resolved into the Solaris version and one non-standard _Unwind routine is resolved into the gcc version, you get a problem
          (most likely that is what is happening to you).

          * libc _Unwind is sometimes unable to decipher the code generated by gcc.
          However, that is likely to happen with modern gcc (say, 4.4+) and not that likely with 3.4.3.


          Btw, you can check your call frames to see where the _Unwind calls come from:
          where -h -l
          If you have indeed stumbled on the "mixed _Unwind" problem, then your only chance is to play with the linker
          so that it binds the _Unwind stuff from your library directly to libgcc_s.
          I have not tried it myself, though.

          regards,
          __Fedor.
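As a complement to the dbx `where -h -l` check, a small C++ sketch can ask the runtime linker which object provides a given _Unwind symbol. This assumes `dlsym` with `RTLD_DEFAULT` and `dladdr` are available (they are documented on both Solaris and Linux); the helper name `provider_of` is hypothetical. Note that `RTLD_DEFAULT` reflects the default global search order, so the answer may differ from what a directly bound reference would see.

```cpp
#include <dlfcn.h>
#include <string>

// Hypothetical helper: returns the path of the object that the default
// symbol search binds `symbol` to, or a placeholder string on failure.
std::string provider_of(const char* symbol) {
    void* addr = dlsym(RTLD_DEFAULT, symbol);
    if (addr == 0) {
        return "(not found)";
    }
    Dl_info info;
    if (dladdr(addr, &info) == 0 || info.dli_fname == 0) {
        return "(unknown object)";
    }
    return info.dli_fname;
}
```

Called from inside the loaded module as `provider_of("_Unwind_RaiseException")`, a result naming libc.so.1 rather than libgcc_s.so.1 would be consistent with the explanation above. Depending on the platform, linking may require `-ldl`.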
          • 2. Re: Exception handling is not working in GCC compile shared object
            777730
            Thanks for the reply, SFy. That is my impression as well. I confirmed the same with the -l option.

            [1] 0x1116d(0x1, 0x1, 0x474e5543432b2b00, 0x59cb60, 0xfffffd7fffdff280, 0x1116d), at 0x1116d
            [2] libc.so.1:_Unwind_RaiseException_Body(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff05b38c
            [3] libc.so.1:_SUNW_Unwind_RaiseException(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff05b579
            =>[4] libstdc++.so.6:__cxa_throw(obj = (nil), tinfo = (nil), dest = (nil), , line 75 in "eh_throw.cc"
            [5] apache.so:OBWebGate_Init(p = 0x498188, plog = 0x4ca318, ptemp = 0x4cc328, s = 0x4c43c8), line 62 in "apache.cpp"
            [6] httpd:ap_run_post_config(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x444624
            [7] httpd:main(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x42c39a

            I have already tried many things to avoid the libc dependency but did not succeed. Do you have any suggestions?
            • 3. Re: Exception handling is not working in GCC compile shared object
              Fedor-Oracle
              I confirmed same with -l option
              Well, your stack trace does not show any signs of libgcc_s unwinding, and thus no signs of mixed interface usage.
              However, the mixing might have happened before the crash.
              Do you have any suggestion for it.
              You might want to experiment with Solaris linker's direct binding (http://download.oracle.com/docs/cd/E19963-01/html/819-0690/gehwq.html).
              However, the problem is that references to _Unwind are not only in your own code but also inside the g++ standard library (libstdc++).
              Say, __cxa_throw is what throws the exception, and it is a function from libstdc++.

              You can try running your httpd with LD_PRELOAD=.../libgcc_s.so to see if it is really the culprit.
              A preloaded library takes precedence over libc.
              That, however, might break any unwinding in httpd itself (is it C++?).

              regards,
              __Fedor.
              • 4. Re: Exception handling is not working in GCC compile shared object
                777730
                Hi fedor,

                We have also tried the W,direct option, but it did not help. With LD_PRELOAD we are seeing a problem: after setting LD_PRELOAD, we get an ld error for libc.so.1.

                Can anybody help us here?
                • 5. Re: Exception handling is not working in GCC compile shared object
                  Fedor-Oracle
                  After setting LD_PRELOAD, we are getting ld error for libc.so.1
                  What kind of error?
                  • 6. Re: Exception handling is not working in GCC compile shared object
                    777730
                    export LD_PRELOAD=/usr/sfw/lib/libgcc_s.so.1

                    I started the Apache web server and got this error:

                    bash-2.05b$ bin/apachectl start
                    ld.so.1: httpd: fatal: /usr/sfw/lib/libgcc_s.so.1: wrong ELF class: ELFCLASS32
                    Killed

                    I also get the same error when executing other commands:
                    bash-2.05b$ ls
                    ld.so.1: ls: fatal: /usr/sfw/lib/libgcc_s.so.1: wrong ELF class: ELFCLASS32
                    Killed
                    • 7. Re: Exception handling is not working in GCC compile shared object
                      777730
                      I set LD_PRELOAD_32 instead, and the "ld.so.1: httpd: fatal: /usr/sfw/lib/libgcc_s.so.1: wrong ELF class: ELFCLASS32" error has gone away, but there is no change in the output.

                      I am still getting a core dump.
                      • 8. Re: Exception handling is not working in GCC compile shared object
                        Fedor-Oracle
                        I set LD_PRELOAD_32, and problem with ld.so.1: httpd: fatal: /usr/sfw/lib/libgcc_s.so.1: wrong ELF class: ELFCLASS32
                        gone away but there is no changes in output.
                        The reason for this error is that your process (httpd) is 64-bit and you are feeding it a 32-bit libgcc_s.
                        You "solved" it by no longer feeding it anything: LD_PRELOAD_32 has no effect on a 64-bit process.
                        Thus there is no change in behavior.

                        Instead you should be using:
                        LD_PRELOAD=/usr/sfw/lib/amd64/libgcc_s.so.1
                        (or LD_PRELOAD_64, which should have similar effect).

                        regards,
                        __Fedor.
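The "wrong ELF class" message can be checked by hand: the fifth byte of an ELF file (EI_CLASS in e_ident) is 1 for ELFCLASS32 and 2 for ELFCLASS64. The following is a minimal sketch that classifies a file this way; the function names are hypothetical, and on Solaris the `file` command reports the same information.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Classify an ELF header from its first bytes (e_ident).
// Bytes 0-3 are the magic 0x7f 'E' 'L' 'F'; byte 4 is EI_CLASS.
std::string elf_class_of(const unsigned char* ident, std::size_t len) {
    if (len < 5 || ident[0] != 0x7f || ident[1] != 'E' ||
        ident[2] != 'L' || ident[3] != 'F') {
        return "not ELF";
    }
    if (ident[4] == 1) return "ELFCLASS32";
    if (ident[4] == 2) return "ELFCLASS64";
    return "unknown class";
}

// Read the first five bytes of a file and classify them.
std::string elf_class_of_file(const char* path) {
    unsigned char ident[5];
    std::FILE* f = std::fopen(path, "rb");
    if (f == 0) return "cannot open";
    std::size_t n = std::fread(ident, 1, sizeof ident, f);
    std::fclose(f);
    return elf_class_of(ident, n);
}
```

Going by the error message in this thread, `elf_class_of_file("/usr/sfw/lib/libgcc_s.so.1")` would be expected to report ELFCLASS32. A library preloaded into a 64-bit httpd has to report ELFCLASS64, as the one under /usr/sfw/lib/amd64 should.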
                        • 9. Re: Exception handling is not working in GCC compile shared object
                          777730
                          Hi Fedor, it works for a small application which I created to simulate the issue. No core dump observed.

                          Now I am trying it with the original application and will let you know the results.

                          Many, many thanks for your help.
                          • 10. Re: Exception handling is not working in GCC compile shared object
                            777730
                            Hi Fedor,

                            It works for us. Thank you. Is there any other way to handle it? Maybe something needs to be added to the link options?
                            • 11. Re: Exception handling is not working in GCC compile shared object
                              Fedor-Oracle
                              Is there any other way to handle it..may be something need to add in link option?
                              Unfortunately none that I can suggest right away.
                              I will try asking folks around.

                              regards,
                              __Fedor.
                              • 12. Re: Exception handling is not working in GCC compile shared object
                                777730
                                Unfortunately, it works for the Apache web server but not for Oracle HTTP Server. Please help me here.
                                • 13. Re: Exception handling is not working in GCC compile shared object
                                  777730
                                  Any input?

                                  I have also confirmed from the truss output that libgcc_s.so.1 is loaded first. Any other suggestions?