7 Replies Latest reply on Sep 15, 2004 9:28 PM by 807575

    SunStudio9 Compiler, Constructor and buggy behaviour

    807575
      Hi,

      We've recently upgraded to the sun studio 9 compiler and I'm noticing some rather odd behaviour while running through the debugger. This "odd behaviour" ultimately causes a coredump and I've managed to track it back to a place where it seems the constructors are getting quite a bit mixed up. I'm at a loss for why the compiled code is working in this way and not sure whether the compiler has chewed anything up. If anyone has any ideas, I would be most grateful.

      This is a stripped down example, but you can get the point...

      struct ds_def
      {
      ds_def() : last_sta_nam(" ") {}

      fn_string<16> last_sta_nam; // this is a basic fixed-size string
      // ... there are other data members in this struct also
      };

      struct intf_entry_def
      {
      intf_entry_def() { /* ... does some minor initialisation */ }

      ds_def ds_;
      };

      class X
      {
      public:
      void foo()
      {
      intf_ptr_ = new intf_entry_def;
      }
      private:
      intf_entry_def * intf_ptr_;
      };

      Using dbx, I stopped the running executable when in the ds_def::ds_def() constructor and had a look at some of the data members and their addresses....

      (dbx) examine this
      0x021a2588: 0x015db798

      (dbx) print sizeof(*this)
      sizeof(*this) = 784

      (dbx) examine &last_sta_nam
      0x021a2870: 0x00000000
      (dbx) examine &this->last_sta_nam
      0x021a2808: 0x015db798

      (dbx) print -r last_sta_nam
      last_sta_nam = {
      last_sta_nam.data_ptr_ = (nil)
      data_size_ = 0
      data_set_ = false
      pad_char_ = '\0'
      data_ = ""
      }
      (dbx) print -r this->last_sta_nam
      this->last_sta_nam = {
      this->last_sta_nam.data_ptr_ = 0x21a2818 " "
      data_size_ = 16U
      data_set_ = true
      pad_char_ = ' '
      data_ = " "
      }

      Now, the odd thing here is that &last_sta_nam and &this->last_sta_nam point to two separate areas in memory. As far as I'm aware, going through "this" shouldn't make a difference, but it does. This is why I'm wondering whether some compiler magic is occurring and unfortunately getting it wrong. Printing via the "this" pointer yields the right result, while printing without it doesn't.
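For reference, the expectation stated above can be checked in isolation. The following is only a minimal sketch: `ds_def_sketch`, its members and `member_addresses_agree` are hypothetical stand-ins, with a plain char array in place of fn_string<16>. Inside a constructor, an unqualified member name is shorthand for this->member, so both spellings should name the same storage.

```cpp
// Hypothetical stand-in for the real struct; only the shape matters here.
struct ds_def_sketch
{
    ds_def_sketch() : first_(0), last_sta_nam_(), same_address_(false)
    {
        // An unqualified member name inside a member function means
        // this->member, so the two addresses are expected to agree.
        same_address_ = (&last_sta_nam_ == &this->last_sta_nam_);
    }

    int  first_;
    char last_sta_nam_[16];   // plain array standing in for fn_string<16>
    bool same_address_;
};

// Returns true when both spellings resolved to the same storage.
bool member_addresses_agree()
{
    ds_def_sketch d;
    return d.same_address_;
}
```

If a debugger reports two different addresses for these expressions while compiled code like this reports one, the debugger is probably working from a different description of the type than the code that is actually running.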

      For a number of the other data members, the address is the same with and without "this", as expected.

      In intf_entry_def::intf_entry_def(), both sets of values are still visible but, confusingly, the two expressions still resolve to different areas.

      (dbx) examine &ds_.last_sta_nam
      0x021a2870: 0x00000000
      (dbx) examine &this->ds_.last_sta_nam
      0x021a2808: 0x015db798

      Finally, once the objects have all been constructed and we're back in X::foo, both forms finally yield the same address in memory, but unfortunately it is the wrong one. The original last_sta_nam is left somewhere we can't reach, and the coredump occurs when last_sta_nam is used (its invariants are broken).

      (dbx) examine &entry_ptr_.ds_.last_sta_nam
      0x021a2870: 0x00000000
      (dbx) examine &this->entry_ptr_.ds_.last_sta_nam
      0x021a2870: 0x00000000

      I'm aware that the C++ standard leaves a number of things such as memory layout up to the implementation, but this doesn't seem like normal C++ behaviour to me.

      Is there anyone out there knowledgeable enough in the Sun C++ implementation with any ideas?

      thx,
      J
        • 1. Re: SunStudio9 Compiler, Constructor and buggy behaviour
          807575
          The oddities you are seeing are more likely to be debugger artifacts than incorrect code generation.

          First of all, are you compiling with any optimization options (-O or -xO)? The optimizer shuffles code around, making it hard to debug. Try compiling with -g and without any -O or -xO options.

          Next, how are you stopping in the constructor? If you use "stop in constructor", it's possible that dbx is stopping too soon, before all the entry code has been run, and in particular before the "this" pointer has been set. Try setting a breakpoint on the first executable statement in the constructor, and see if that makes a difference.

          If still no joy, post (or email) enough code that I can try to duplicate your problem.
          • 2. Re: SunStudio9 Compiler, Constructor and buggy behaviour
            807575
            The result is a coredump, and the template class's data members are completely ruined after the constructor has run, whereas they are correct and visible while stepping through the constructor in the debugger. Debugger artifacts may be clouding the issue, but there are very real problems happening in the code.

            The code is compiled using -g, and there are no code optimisation flags such as -O or -xO being used.

            Debugger is invoked by using "stop in ds_def::ds_def". There are no executable statements in the constructor (members are built using initialization, not assignment), and the highlighted statement line is the closing curly brace prior to return.

            Thanks for your help so far. I will try and get a sample of code that has this problem, but so far my attempts have not been successful and this can only be reproduced in the full executable.
            • 3. Re: SunStudio9 Compiler, Constructor and buggy behaviour
              807575
              ok, I found the cause of the error. The structure has been defined twice in the global namespace.

              This would normally cause an error in compilation, but as they are built as separate compilation units it manages to get through the compiler and linker ok. This is why I was seeing different addresses for the data members, and why the data was trashed on return to the calling function.

              Maybe the linker should throw an error during the build?

              Anyway, here is some code that highlights this problem for your reference.

              // part.h

              #ifndef PART_H
              #define PART_H

              class A
              {
              public:
              virtual void foo() = 0;
              };

              #endif

              // partA.h

              #include <part.h>
              #include <iostream>

              struct ds_def
              {
              ds_def() : a_(1), a2_(9), b_(3), c_(4) {}
              int a_;
              int a2_;
              int b_;
              int c_;
              };

              class X : public A
              {
              public:
              virtual void foo();

              private:
              ds_def *ds_;
              };

              // partA.cc

              #include <partA.h>
              #include <partList.h>

              using namespace std;

              void X::foo()
              {
              ds_ = new ds_def;
              cout << "X - a_(should be 1) = " << ds_->a_
              << ",a2_(should be 9) = " << ds_->a2_
              << ",b_(should be 3) = " << ds_->b_
              << ",c_(should be 4) = " << ds_->c_ << endl;
              }

              struct initX
              {
              initX()
              {
              PartList::registerA(&x_, 1);
              }
              X x_;
              } initX;

              // partB.h

              #include <part.h>
              #include <iostream>
              struct ds_def
              {
              ds_def() : a_(1), b_(2), c_(3) {}
              int a_;
              int b_;
              int c_;
              };

              class Y : public A
              {
              public:
              Y()
              {
              }

              void foo();

              private:
              ds_def *ds_;
              };

              // partB.cc

              #include <partB.h>
              #include <partList.h>

              using namespace std;

              void Y::foo()
              {
              ds_ = new ds_def;
              cout << "Y - a_(should be 1) = " << ds_->a_
              << ",b_(should be 2) = " << ds_->b_
              << ",c_(should be 3) = " << ds_->c_ << endl;
              }

              struct initY
              {
              initY()
              {
              PartList::registerA(&y_, 2);
              }
              Y y_;
              } initY;

              // partList.h

              #include <part.h>
              #include <map>

              using std::map;

              class PartList
              {
              public:
              static void registerA(A* a, const int& i);

              static A* get(const int& i);

              private:
              static map<int, A*> list_;
              };

              // partList.cc

              #include <partList.h>

              using namespace std;

              void PartList::registerA(A* a, const int& i)
              {
              list_[i] = a; // the "[i]" was eaten by the forum markup in the original post
              }

              A* PartList::get(const int& i)
              {
              return list_[i];
              }

              map<int, A*> PartList::list_;

              // main.cc

              #include <part.h>
              #include <partList.h>

              int main()
              {
              PartList::get(1)->foo();
              PartList::get(2)->foo();
              }

              # makeit.sh

              CC -g -c -I`pwd` -o partList.o partList.cc
              CC -g -c -I`pwd` -o partA.o partA.cc
              CC -g -c -I`pwd` -o partB.o partB.cc
              CC -g -c -I`pwd` -o main.o main.cc
              CC -g -o memerr main.o partList.o partA.o partB.o

              • 4. Re: SunStudio9 Compiler, Constructor and buggy behaviour
                807575
                Your partList.cc is not compilable. After fixing the definitions of register and get, I did not find any multiple definitions of any data object.

                The linker will complain about multiple definitions of global objects, unless you use the -zmuldefs option, which tells the linker to ignore them.

                Another way to get multiple definitions of global objects is to put objects into a shared library with local or symbolic linkage. The library will have its own copy that is not necessarily shared with other parts of the program.


                • 5. Re: SunStudio9 Compiler, Constructor and buggy behaviour
                  807575
                  apologies for that. can't think why it copied incorrectly.

                  the linker didn't complain about multiple definitions of global objects, and in the code above we broke the one definition rule. you can see that ds_def is defined in both partA.h and partB.h with different data members. it would have been nice to get a linker error, but then the standard doesn't require the implementation to diagnose this.

                  interestingly enough, it looks like the 5.4 & 5.6 compilers implement things differently. in our previous version (CC: Forte Developer 7 C++ 5.4 2002/03/09) this would build and work in a manner that looks ok. but we really know that this is a problem just waiting to surface, which is why when we upgraded (CC: Sun C++ 5.6 2004/06/02) we got the following results, which show that the b_ printed by Y::foo is really 9 when it should be 2.

                  X - a_(should be 1) = 1,a2_(should be 9) = 9,b_(should be 3) = 3,c_(should be 4) = 4
                  Y - a_(should be 1) = 1,b_(should be 2) = 9,c_(should be 3) = 3
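One plausible reading of that output, which the thread itself does not confirm, is that the linker kept a single copy of the inline ds_def constructor (the four-member one from partA.h) and Y::foo then read the result through the three-member layout from partB.h. The sketch below imitates that effect in a well-defined way. `wide_def`, `narrow_def` and `read_wide_as_narrow` are hypothetical names, and it assumes the usual case where structs of plain ints have no padding.

```cpp
#include <cstring>

// Hypothetical stand-ins for the two conflicting ds_def definitions.
struct wide_def   { int a_, a2_, b_, c_; };   // member order from partA.h
struct narrow_def { int a_, b_, c_; };        // member order from partB.h

// Fill storage with the values partA.h's constructor would write, then
// view the leading bytes through partB.h's layout. memcpy keeps this
// sketch well defined; the real program reaches a similar state only
// through undefined behaviour.
narrow_def read_wide_as_narrow()
{
    wide_def w = { 1, 9, 3, 4 };
    narrow_def n;
    std::memcpy(&n, &w, sizeof n);   // copies only the first three ints
    return n;
}
```

Under those assumptions the narrow view reads a_ = 1, b_ = 9, c_ = 3, which matches the "Y" line above. If this reading is right, the real constructor would also have written one int past the end of the smaller allocation, which is a separate source of heap corruption.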

                  it's all undefined behaviour, and all our fault. i'm just glad there was a quick resolution, so thanks for your help.

                  rgds,
                  J
                  • 6. Re: SunStudio9 Compiler, Constructor and buggy behaviour
                    807575
                    this is an aside... but i've just realised why the code didn't copy correctly and wouldn't compile. it's a saving and re-rendering issue in html.

                    the code was correct as i cut and pasted it, but the forum converted the [ i ] to an italics html tag.

                    so [ i ] gets converted to italics when there are no spaces inside the brackets.


                    worth knowing!
                    • 7. Re: SunStudio9 Compiler, Constructor and buggy behaviour
                      807575
                      We are apparently using different meanings for the phrase "multiple definitions".

                      The linker can detect multiple definitions of a symbol. A symbol definition is a name assigned to a region of storage. If you say "int x=1;" or "myclass y(1);" or "int foo() { return 0; }" in two modules, the linker will complain, because a symbol can be defined only once in a program.

                      In your example, there are no multiple definitions in that sense. The code defines a type two different ways, but no conflict is visible to the linker. The only use of the type is to create objects on the heap. Heap objects do not have names and the linker knows nothing about the use of the heap.
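Since the linker cannot see the clash, one common way to avoid it is to give each module's type its own qualified name. This is only a sketch of that idea, not what the original poster actually did: the namespace names `part_a` and `part_b` and the helper `part_b_b_value` are hypothetical. With distinct names, the two constructors are distinct entities and neither can be substituted for the other.

```cpp
// Hypothetical namespace names; the member lists are copied from the
// partA.h and partB.h examples earlier in the thread.
namespace part_a
{
    struct ds_def
    {
        ds_def() : a_(1), a2_(9), b_(3), c_(4) {}
        int a_;
        int a2_;
        int b_;
        int c_;
    };
}

namespace part_b
{
    struct ds_def
    {
        ds_def() : a_(1), b_(2), c_(3) {}
        int a_;
        int b_;
        int c_;
    };
}

// Constructs the partB flavour and reports the member that came out
// wrong in the thread.
int part_b_b_value()
{
    part_b::ds_def d;
    return d.b_;
}
```

Here `part_b_b_value()` returns 2, because `part_b::ds_def` can only be built by its own constructor. Renaming one of the structs, or moving a type that is private to one .cc file into an unnamed namespace in that file, would serve the same purpose.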