9 Replies Latest reply: Apr 25, 2012 11:11 AM by Mfalco-Oracle

    Reg Core Dump

    926397
      We are getting a bad-address core dump while using the Coherence library.

      Please find the dbx stack trace below.



      @null (l@9) terminated by signal SEGV (no mapping at the fault address)
      0xffffffffffffffff: <bad address 0xffffffffffffffff>
      (dbx) where
      [1] 0xfa666f08(0xf71f8940, 0x28ec9c8, 0xf71f8930, 0x18, 0x8, 0xf71f8880), at 0xfa666f08
      [2] 0xfa662da8(0xf71f8ab0, 0x28ec9c8, 0xf71f89b0, 0xf71f8940, 0x154396e8, 0xfdf88a80), at 0xfa662da8
      [3] 0xfa414a6c(0xf71f8ab0, 0x28ec9d0, 0xf71f8aa0, 0xf71f91e0, 0x0, 0xf71f89b0), at 0xfa414a6c
      [4] 0xfaba8c78(0x28ec9c8, 0xf71f8c20, 0x0, 0xf71f8b80, 0xf71f8ab0, 0x1), at 0xfaba8c78
      [5] 0xfb1f6b8c(0xf71f9190, 0xf71f91e0, 0xf71f8c20, 0xf71f91a0, 0xf71f8c30, 0x28af100), at 0xfb1f6b8c
      [6] 0xfb1f2af0(0x28aaeb0, 0xf71f91a0, 0x0, 0x1, 0xf71f9190, 0xfb1df870), at 0xfb1f2af0
      [7] 0xfb1f5c48(0x28aaeb0, 0x569fb10, 0xf71f9270, 0xf71f9280, 0xf71f9280, 0xf71f9360), at 0xfb1f5c48
      [8] 0xfad25bd0(0x289f3b8, 0xf71f9360, 0xf71f9360, 0xf71f92f0, 0x1, 0xf71f9360), at 0xfad25bd0
      [9] 0xfabf44c0(0x569fb10, 0xfb1fefd4, 0xfeb88ae0, 0xf71f9360, 0xfabf309c, 0xfeaae700), at 0xfabf44c0
      [10] 0xfa8a3d70(0x289f228, 0xf71f9460, 0xf71f9420, 0xf71f9430, 0xfdf88a80, 0xf71f9460), at 0xfa8a3d70
      [11] 0xfa8a338c(0x289f228, 0xf71f9540, 0xf71f9540, 0xf71f94d0, 0x1, 0xf71f95c0), at 0xfa8a338c
      [12] 0xfa7c1dd8(0x25b6ef8, 0x7f13700, 0xf71f9540, 0xf71f95c0, 0xf71f9550, 0x289f2a0), at 0xfa7c1dd8
      [13] 0xfabf44c0(0x7f13700, 0xfa7c3a60, 0xfe37b854, 0xf71f95c0, 0xfabf309c, 0xfe37ac90), at 0xfabf44c0
      [14] 0xfb2c9184(0xf71f9be0, 0xfdd965d4, 0x0, 0xffffffff, 0x1, 0x0), at 0xfb2c9184
      [15] 0xfa8866b4(0x28df150, 0xf71f9d14, 0xf71fa390, 0xfdf88a80, 0x0, 0xf71fa134), at 0xfa8866b4
      [16] 0xfa90c320(0x41f30d0, 0xfe476544, 0x28aae50, 0xf71fa380, 0x3, 0xf71fa458), at 0xfa90c320
      [17] 0xfad8f044(0x867800, 0x41f3198, 0xf71fb500, 0xf71fa670, 0xf71fb500, 0xf71fa5b0), at 0xfad8f044
      [18] 0xfad838dc(0x28aad08, 0xf71fb234, 0xfe7f165c, 0xfdf88a80, 0xf71fad6c, 0xf71fae44), at 0xfad838dc
      [19] 0xfb293ed0(0x28dba98, 0x28aad08, 0xc54400, 0xf71fb500, 0xf71fb4d0, 0xfdf88a80), at 0xfb293ed0
      [20] 0xfa758038(0x28dbab0, 0xfe6ea03c, 0xfe6ea03c, 0xf71fb5a0, 0x1, 0xf71fb5c0), at 0xfa758038
      [21] 0xfa9d9918(0xf71fbed0, 0x28dbdc0, 0xfdf88a80, 0x28e05b8, 0xf71fbf28, 0xfe4eab9c), at 0xfa9d9918
      [22] 0xfa614f74(0x17be110, 0xf71fc000, 0x0, 0x0, 0xfa629ee8, 0xfe275df0), at 0xfa614f74


      Without the Coherence library we don't see this kind of bad-address core dump.
        • 1. Re: Reg Core Dump
          Mfalco-Oracle
          Hi 923394,

          Can you provide some more details including:
          - Coherence version
          - OS Version
          - Client type (Java, C++)
          - JDK version if Java, Compiler version if C++

          Additionally if you have a reproducer you can share that would be of significant help.

          thanks,

          Mark
          Oracle Coherence
          • 2. Re: Reg Core Dump
            926868
            - Coherence version: 3.6.1.0.0
            - OS version: Solaris 10 10/09 s10s_u8wos_08a SPARC
            - Client type (Java, C++): C++
            - JDK version if Java, Compiler version if C++: CC: Sun C++ 5.9 SunOS_sparc Patch 124863-01 2007/07/25
            • 3. Re: Reg Core Dump
              Mfalco-Oracle
              Hi 923865,

              The compiler patch version you are on is lower than the minimum supported version for Coherence on your platform. See http://docs.oracle.com/cd/E24290_01/coh.371/e22839/gs_install.htm#BABDCDFG for details. While I can't say for sure that upgrading to the latest, or even the minimum supported, patch level will address the issue, it should be done regardless and may well resolve things.

              thanks,

              Mark
              Oracle Coherence
              • 4. Re: Reg Core Dump
                926868
                Sometimes we also see the stack trace below, caused by Coherence:

                Current function is coherence::native::NativeAtomic64::peek
                79 return m_lAtomic;
                (dbx) where
                =>[1] coherence::native::NativeAtomic64::peek(this = 0x3844119f), line 79 in "NativeAtomic64.hpp"
                [2] coherence::lang::Object::_detach(this = 0x38441197, fEscaped = false), line 761 in "Object.hpp"
                [3] 0xfa2a7fc0(0x3192d5b0, 0x0, 0xc0000000, 0xfdf88a80, 0x0, 0x4f5ca40), at 0xfa2a7fc0
                [4] 0xfa7c5570(0x3192d528, 0x1, 0x1000, 0x3f1728, 0x3f1400, 0xfdf88a80), at 0xfa7c5570
                [5] 0xfab9537c(0x3192d5c8, 0x1ffffe0, 0xfdf88a80, 0x80000000, 0xc0, 0x80), at 0xfab9537c
                [6] coherence::lang::Object::_detach(this = 0x3192d5c8, fEscaped = false), line 774 in "Object.hpp"
                [7] 0xfa7ccd34(0xf71fa288, 0x2, 0x18, 0xfdf88a80, 0x1, 0x1), at 0xfa7ccd34
                [8] 0xfa886744(0x2c0d4e0, 0xf71f9d14, 0xf71f9d14, 0xfdf88a80, 0x0, 0xf71fa134), at 0xfa886744
                [9] 0xfa90c320(0x655adb0, 0xfe476544, 0x2bc2a40, 0xf71fa380, 0x2, 0xf71fa458), at 0xfa90c320
                [10] 0xfad8f044(0x867800, 0x655ae78, 0xf71fb500, 0xf71fa670, 0xf71fb500, 0xf71fa5b0), at 0xfad8f044
                [11] 0xfad838dc(0x2bc28f8, 0xf71fb234, 0xfe7f165c, 0xfdf88a80, 0xf71fad6c, 0xf71fae44), at 0xfad838dc
                [12] 0xfb293ed0(0x2c09620, 0x2bc28f8, 0xc54400, 0xf71fb500, 0xf71fb4d0, 0xfdf88a80), at 0xfb293ed0
                [13] 0xfa758038(0x2c09638, 0xfe6ea03c, 0xfe6ea03c, 0xf71fb5a0, 0x1, 0xf71fb5c0), at 0xfa758038
                [14] 0xfa9d9918(0xf71fbed0, 0x2c09948, 0xfdf88a80, 0x2c0e9c8, 0xf71fbf28, 0xfe4eab9c), at 0xfa9d9918
                [15] 0xfa614f74(0x256e910, 0xf71fc000, 0x0, 0x0, 0xfa629ee8, 0xfe275df0), at 0xfa614f74
                (dbx)

                Please advise.
                • 5. Re: Reg Core Dump
                  Mfalco-Oracle
                  Hi 923865,

                  Please confirm that you've upgraded to at least the minimum supported compiler patch level. Our minimum patch level requirement is based on bugs found within the compiler, and using anything less than the specified version means that your problem could very well be the result of one of the compiler bugs we'd previously identified.

                  That being said, the new information you've provided does give a bit more of a hint as to what may be going wrong. The stack in question shows that the reference counter of a Coherence-managed object was being decremented from a non-thread-safe handle, and the object in question had apparently already been destructed, so the handle should not have been referencing it. I would check that your code follows the thread-safety guidelines (http://docs.oracle.com/cd/E24290_01/coh.371/e22839/cpp_objectmod.htm#sthref79) for our C++ API. My guess would be that you have code which uses a non-thread-safe handle in a multi-threaded context, e.g. a global handle, and that the reference count is getting corrupted and the object is being deleted too early. The thread-safety guidelines in the above link cover where and how to use thread-safe handles.
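To illustrate the failure mode described above in general terms, here is a minimal sketch of a reference count that stays correct when several threads attach and detach concurrently. It is written in standard C++11 (so it needs a newer compiler than Sun C++ 5.9) and does not use the Coherence API; the names `Counted` and `hammer` are hypothetical. If `m_refs` were a plain `long` instead of `std::atomic<long>`, concurrent updates could be lost, the count could reach zero while references still exist, and the object could be deleted too early.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for a reference-counted object.
// This is NOT the Coherence Object class, only an illustration.
class Counted {
public:
    // Starts with one reference, owned by the creator.
    explicit Counted(std::atomic<bool>& destroyed)
        : m_refs(1), m_destroyed(destroyed) {}

    // Atomically add a reference.
    void attach() { m_refs.fetch_add(1, std::memory_order_relaxed); }

    // Atomically drop a reference; the thread that drops the last one
    // deletes the object. fetch_sub returns the value before the decrement.
    void detach() {
        if (m_refs.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            m_destroyed.store(true);
            delete this;
        }
    }

    long refs() const { return m_refs.load(); }

private:
    ~Counted() {}  // private: destruction only via detach()

    std::atomic<long>  m_refs;
    std::atomic<bool>& m_destroyed;  // lets the caller observe destruction
};

// Runs `threads` threads that each attach and detach `iterations` times,
// then returns the remaining reference count.
long hammer(Counted* obj, int threads, int iterations) {
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([obj, iterations] {
            for (int i = 0; i < iterations; ++i) {
                obj->attach();
                obj->detach();
            }
        });
    }
    for (auto& th : pool) th.join();
    return obj->refs();
}
```

With the atomic counter, the count returns to 1 after all threads finish and the object is destroyed only when the creator's own reference is released. In Coherence itself the equivalent protection comes from using the thread-safe handle types described in the guidelines linked above, not from code like this.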

                  If you are unable to resolve this issue based on the above, it is likely that we'll need a reproducer in order to provide further assistance. Again please ensure that you are using at least the minimum supported compiler version as well.

                  Mark
                  Oracle Coherence
                  • 6. Re: Reg Core Dump
                    926868
                    We are using C++ compiler version CC: Sun C++ 5.9 SunOS_sparc Patch 124863-01 2007/07/25.


                    After the core dump, all C++ string variables in the stack trace point to the same address (0xfef7bafc).
                    Variables other than C++ strings can be printed and debugged normally.

                    The same address (0xfef7bafc) appears for the string variables in every core dump stack trace.

                    Example:
                    (dbx) print "StringVariable"
                    dbx: cannot access address 0xfef7bafc
                    (dbx) examine 0xfef7bafc
                    0xfef7bafc: npos : dbx: core file read error: address 0xfef7bafc not in data space
                    (dbx)

                    ------------------------
                    • 7. Re: Reg Core Dump
                      Mfalco-Oracle
                      Hi user 923865,

                      As mentioned earlier in this thread, the compiler patch version you've indicated is not supported with Coherence for C++, and you will need to upgrade to at least patch level 124863-14. Please give this a try and let us know if the issue persists.
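As a rough illustration of the comparison being made, the sketch below extracts the patch revision from a compiler version banner in the format quoted earlier in this thread and checks it against the 124863-14 minimum. The function names `patchRevision` and `meetsMinimum` are hypothetical, and the code assumes that a higher revision number of the same patch ID supersedes lower ones and that the banner always contains the text "Patch <id>-<rev>".

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Extracts <rev> from "Patch <id>-<rev>" inside a version banner.
// Returns -1 if no patch string is found or the patch ID differs
// from expectedId. (Hypothetical helper, for illustration only.)
int patchRevision(const char* banner, long expectedId) {
    const char* p = std::strstr(banner, "Patch ");
    if (p == 0) return -1;

    long id  = 0;
    int  rev = 0;
    if (std::sscanf(p, "Patch %ld-%d", &id, &rev) != 2) return -1;

    return id == expectedId ? rev : -1;
}

// True if the banner reports patch 124863 at revision 14 or later,
// the minimum stated in this thread.
bool meetsMinimum(const char* banner) {
    const long kPatchId     = 124863;
    const int  kMinRevision = 14;
    return patchRevision(banner, kPatchId) >= kMinRevision;
}
```

Passing the banner reported above, "CC: Sun C++ 5.9 SunOS_sparc Patch 124863-01 2007/07/25", yields revision 1, which is below the required 14.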

                      Mark
                      Oracle Coherence
                      • 8. Re: Reg Core Dump
                        926868
                        We are using Coherence version 3.6.1.0.0.
                        Please let us know which version of the C++ compiler was used to build "libcoherence.so" on your side.
                        • 9. Re: Reg Core Dump
                          Mfalco-Oracle
                          Hi User 923865,

                          You can refer to the Coherence 3.6.x docs regarding these requirements here (http://docs.oracle.com/cd/E15357_01/coh.360/e15726/gs_install.htm#BABDCDFG), but the answer is the same as posted above. It looks like you will need to start by upgrading your compiler to the minimum supported version of 124863-14.

                          Mark
                          Oracle Coherence