2 Replies Latest reply on Oct 26, 2011 9:34 PM by Ali Bahrami-Oracle

    relocation errors on solaris 10 u10 due to symbol versioning from ld

      After updating to Solaris 10 u10, I've been seeing issues with libstdc++ (from gcc 4.2.4 and 4.4.5) built using binutils 2.19 (GNU ld and gas). I'm getting relocation errors when using certain symbols from libstdc++. LD_DEBUG output shows that these symbols are being skipped because the runtime linker believes that they have the hidden bit set:
      03888: 1: symbol=_ZNSi6ignoreEv; lookup in file=a.out [ ELF ]
      03888: 1: symbol=_ZNSi6ignoreEv; lookup in file=/home/server/lib/libstdc++.so.6 [ ELF ]
      03888: 1: symbol=_ZNSi6ignoreEv; hash index=3055; version=32770; skipping symbol with GNU version hidden bit set in file=/home/server/lib/libstdc++.so.6
      03888: 1: symbol=_ZNSi6ignoreEv; lookup in file=/lib/libm.so.2 [ ELF ]
      03888: 1: symbol=_ZNSi6ignoreEv; lookup in file=/home/server/lib/libgcc_s.so.1 [ ELF ]
      03888: 1: symbol=_ZNSi6ignoreEv; lookup in file=/lib/libc.so.1 [ ELF ]
      03888: 1:
      03888: 1:
      03888: 1: libc.so.1: a.out: fatal: relocation error: file a.out: symbol _ZNSi6ignoreEv: referenced symbol not found

      Using elfdump on libstdc++ shows that certain symbols seem to have extra bits set that are now being interpreted as hidden. The previous output was for the ver to be 32770 (0x8002). This occurs on both sparc and x86 distributions.

      Output from elfdump:
      index value size type bind oth ver shndx name
      [72] 0x00048ca4 0x0000003a FUNC GLOB D 2H .text _ZNKSt14basic_ifstreamIcSt11char_traitsIcEE7is_openEv
      [384] 0x0004893c 0x0000001f FUNC GLOB D 2H .text _ZNKSbIwSt11char_traitsIwESaIwEE11_M_disjunctEPKw
      [647] 0x00048bf8 0x00000035 FUNC GLOB D 2H .text _ZNSt19istreambuf_iteratorIcSt11char_traitsIcEEppEv
      [887] 0x0004895c 0x0000003a FUNC GLOB D 2H .text _ZNKSt13basic_fstreamIwSt11char_traitsIwEE7is_openEv
      [935] 0x00048b90 0x00000042 FUNC GLOB D 2H .text _ZNSs7_M_moveEPcPKcj
      [958] 0x00048998 0x0000003a FUNC GLOB D 2H .text _ZNKSt13basic_fstreamIcSt11char_traitsIcEE7is_openEv
      [990] 0x00048aa8 0x00000040 FUNC GLOB D 2H .text _ZNSbIwSt11char_traitsIwESaIwEE9_M_assignEPwjw
      [1154] 0x00048a6c 0x00000039 FUNC GLOB D 2H .text _ZNKSs15_M_check_lengthEjjPKc
      [1331] 0x000489d4 0x0000002f FUNC GLOB D 2H .text _ZNSbIwSt11char_traitsIwESaIwEE4_Rep26_M_set_length_and_sharableEj
      [1406] 0x00048ae8 0x00000042 FUNC GLOB D 2H .text _ZNSbIwSt11char_traitsIwESaIwEE7_M_moveEPwPKwj
      [1520] 0x00048f48 0x000001fe FUNC GLOB D 2H .text _ZNSt13basic_istreamIwSt11char_traitsIwEE6ignoreEi
      [1604] 0x00048ce0 0x0000003a FUNC GLOB D 2H .text _ZNKSt14basic_ofstreamIwSt11char_traitsIwEE7is_openEv
      [1615] 0x00048c30 0x00000037 FUNC GLOB D 2H .text _ZNSt19istreambuf_iteratorIwSt11char_traitsIwEEppEv
      [1831] 0x00048b2c 0x00000042 FUNC GLOB D 2H .text _ZNSbIwSt11char_traitsIwESaIwEE7_M_copyEPwPKwj
      [2080] 0x00048828 0x00000010 FUNC GLOB D 2H .text _ZNSt11char_traitsIwE2eqERKwS2_
      [2185] 0x00048a04 0x0000002c FUNC GLOB D 2H .text _ZNSs4_Rep26_M_set_length_and_sharableEj
      [2342] 0x00048e48 0x00000100 FUNC GLOB D 2H .text _ZNSt13basic_istreamIwSt11char_traitsIwEE6ignoreEv
      [2397] 0x00048b70 0x00000020 FUNC GLOB D 2H .text _ZNSs9_M_assignEPcjc
      [2400] 0x00049148 0x000001fa FUNC GLOB D 2H .text _ZNSi6ignoreEi
      [2617] 0x00048818 0x00000010 FUNC GLOB D 2H .text _ZNSt11char_traitsIcE2eqERKcS2_
      [2798] 0x00048c68 0x0000003a FUNC GLOB D 2H .text _ZNKSt14basic_ofstreamIcSt11char_traitsIcEE7is_openEv
      [3055] 0x00048d58 0x000000f0 FUNC GLOB D 2H .text _ZNSi6ignoreEv
      [3326] 0x00048a30 0x00000039 FUNC GLOB D 2H .text _ZNKSbIwSt11char_traitsIwESaIwEE15_M_check_lengthEjjPKc
      [3336] 0x00048d1c 0x0000003a FUNC GLOB D 2H .text _ZNKSt14basic_ifstreamIwSt11char_traitsIwEE7is_openEv
      [3381] 0x00048bd4 0x00000023 FUNC GLOB D 2H .text _ZNSs7_M_copyEPcPKcj
      [3447] 0x00048920 0x0000001b FUNC GLOB D 2H .text _ZNKSs11_M_disjunctEPKc

      My question is: why is this now being interpreted as hidden? I'd also like to know where the ver field above comes from, since it doesn't seem to map to any field in the Elf32_Sym structure. Looking at the raw object file, I don't see the source of this hidden attribute (or even the 32770 value from before). The raw data from the .dynsym table for one of the problematic symbols (3055 above), mapped to Elf32_Sym, is:
      st_name: 3f 10 00 00 st_value: 58 8d 04 00 st_size: f0 00 00 00 st_info: 12 st_other: 00 st_shndx: 0b 00

      It's my understanding that visibility is set within the st_other field (which is zero here), so this seems to be something else.
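
      As a cross-check of the reading above, the 16 raw bytes can be unpacked according to the Elf32_Sym layout. This is only a sketch: it assumes little-endian byte order (inferred from the bytes 58 8d 04 00 matching elfdump's 0x00048d58), and the variable names are chosen for illustration.

```python
import struct

# Raw .dynsym bytes for symbol [3055], copied from the post.
raw = bytes.fromhex("3f100000" "588d0400" "f0000000" "12" "00" "0b00")

# Elf32_Sym field order: st_name, st_value, st_size (32-bit each),
# st_info, st_other (8-bit each), st_shndx (16-bit).
# "<" = little-endian, which is an assumption about this particular dump.
st_name, st_value, st_size, st_info, st_other, st_shndx = struct.unpack("<IIIBBH", raw)

bind = st_info >> 4           # ELF32_ST_BIND: 1 is STB_GLOBAL
sym_type = st_info & 0xF      # ELF32_ST_TYPE: 2 is STT_FUNC
visibility = st_other & 0x3   # ELF32_ST_VISIBILITY: 0 is STV_DEFAULT

print(hex(st_value), hex(st_size), bind, sym_type, visibility, st_shndx)
```

      The decoded fields agree with the elfdump row (FUNC, GLOB, default visibility), and none of them carries the 32770 value, which is consistent with the hidden flag being stored somewhere other than the symbol table entry.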

      Additionally, I've noticed that if the interpreter within the application is libc and is a symlink, the program crashes. For example:
      bash-3.00$ elfdump -i /tmp/protoc

      Interpreter Section: .interp
      bash-3.00$ LD_LIBRARY_PATH=/usr/lib:/lib /tmp/protoc
      Segmentation Fault (core dumped)
      bash-3.00$ LD_LIBRARY_PATH=/lib:/usr/lib /tmp/protoc
      Missing input file.

      /usr/lib/libc.so.1 is just a link to /lib/libc.so.1, so there must be a bug in the new ld when the interpreter is a symlink.
      bash-3.00$ ls -l /usr/lib/libc.so.1 /lib/libc.so.1
      -rwxr-xr-x 1 root bin 1424620 Jul 4 23:13 /lib/libc.so.1
      lrwxrwxrwx 1 root root 19 Jun 29 14:29 /usr/lib/libc.so.1 -> ../../lib/libc.so.1

      Both of these problems are new with u10 (kernel 147441-02). Any help on these issues, particularly the first, is greatly appreciated.

        • 1. Re: relocation errors on solaris 10 u10 due to symbol versioning from ld
          We're looking into the "relocation" issue. More later.

          As for the libc interpreter, I can't reproduce the problem with a simple program.

          % cat main.c
          #include <stdio.h>
          void main(){ (void) printf("Hello world\n");}
          % LD_OPTIONS=-I/usr/lib/libc.so.1 cc -o main main.c
          % LD_LIBRARY_PATH=/usr/lib:/lib ./main
          Hello world
          % LD_LIBRARY_PATH=/lib:/usr/lib ./main
          Hello world
          % LD_OPTIONS=-I/lib/libc.so.1 cc -o main main.c
          % LD_LIBRARY_PATH=/usr/lib:/lib ./main
          Hello world
          % LD_LIBRARY_PATH=/lib:/usr/lib ./main
          Hello world

          Do you have a stack trace from protoc?

          • 2. Re: relocation errors on solaris 10 u10 due to symbol versioning from ld
            Ali Bahrami-Oracle
            Hi Bennet,

            Solaris 10 Update 10 inherited a bunch of work we did for Solaris 11,
            including some support for the GNU extensions to our scheme for ELF
            symbol versioning.

            The GNU versioning hidden bit is the topmost bit of the ELF symbol version,
            which is found in the versym section. This is an array that parallels the symbol
            table. In the GNU version of ELF versioning, a symbol version has 15 bits,
            plus this flag. The flag means "disable" or "ignore". It's not part of symbol visibility,
            despite the name, and so, not part of st_other.
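
            To illustrate the "parallel array" point: the versym entry for a symbol sits at the same index as the symbol itself, and each entry is 16 bits wide. The sketch below uses a made-up four-entry section rather than data from the real libstdc++, and the helper name is hypothetical.

```python
import struct

def versym_entry(versym_bytes, sym_index, little_endian=True):
    """Return the 16-bit versym value for the symbol at sym_index."""
    fmt = "<H" if little_endian else ">H"
    # Entry i lives at byte offset 2 * i, mirroring symbol table index i.
    return struct.unpack_from(fmt, versym_bytes, 2 * sym_index)[0]

# Hypothetical versym contents for a 4-symbol table (illustration only).
demo = struct.pack("<4H", 0, 1, 0x8002, 2)

print(hex(versym_entry(demo, 2)))
```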

            You can learn more about it here: http://lists.debian.org/lsb-spec/1999/12/msg00017.html

            Note that elfdump shows the version as 2H, rather than 32770. The H represents
            the top bit (0x8000), allowing you to see that it's really in version 2, but hidden.
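
            The relationship between 32770 and 2H can be sketched as a simple bit split. The constant and function names below are chosen for illustration and are not taken from the Solaris sources.

```python
VERSYM_HIDDEN = 0x8000    # top bit: the GNU "hidden" flag
VERSYM_VERSION = 0x7FFF   # low 15 bits: the version index

def split_versym(value):
    """Return (version_index, hidden) for a 16-bit versym entry."""
    return value & VERSYM_VERSION, bool(value & VERSYM_HIDDEN)

def elfdump_style(value):
    """Format a versym value the way the elfdump 'ver' column appears in this thread."""
    index, hidden = split_versym(value)
    return f"{index}H" if hidden else str(index)

print(elfdump_style(32770))
```

            With the value from the LD_DEBUG output, 32770 is 0x8002, which splits into version index 2 with the hidden flag set.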

            I believe that ld.so.1 needs to have its logic for this test tweaked. As Rod said yesterday,
            we're looking into it. As a workaround, I believe that you can disable the runtime
            linker's check by removing the DT_VERSYM entry from the object's dynamic section:

            % elfedit -e 'dyn:delete versym' libstdc++.so.6 libstdc++.so.6-alt

            Be sure to move the original object to a safe name (e.g. libstdc++.so.6.orig)
            before putting the altered one in its place.

            It is also worth noting that this object was produced by gcc configured to use
            the GNU ld. This is not a well supported combination under Solaris, and we
            recommend configuring gcc to use the native Solaris ld. A gcc that targeted
            the Solaris ld would not have encountered this issue.