This discussion is archived
10 Replies. Latest reply: Mar 22, 2012 9:24 AM by koval

CC name mangling confusion

koval Newbie
While compiling QtCreator with Solaris Studio I came across an interesting issue with name mangling.
Class QmlDesigner::AbstractView contains the method
ModelNode createModelNode(const QString&, int, int, const PropertyListType&, const PropertyListType&, const QString&, ModelNode::NodeSourceType);
It is exported from the implementation file abstractview.cpp and used in qmlstate.cpp, which is linked into the same library.
nm -C applied to the resulting shared object gives:
[10448] |   1576144|     169|FUNC |GLOB |0    |11     |QmlDesigner::ModelNode QmlDesigner::AbstractView::createModelNode(const QString&,int,int,const QList<QPair<QString,QVariant> >&,const QList<QPair<QString,QVariant> >&,const QString&,QmlDesigner::ModelNode::NodeSourceType)
                                                       [__1cLQdDmlDesignerMAbstractViewPcreateModelNode6MrknHQdDString_iirknFQdDList4nFQdDPair4n0C_nIQdDVariant_____74n0AJModelNodeONodeSourceType__8_]
                                                                                                                                                                    ^                               ^
[14294] |         0|       0|FUNC |GLOB |0    |UNDEF  |QmlDesigner::ModelNode QmlDesigner::AbstractView::createModelNode(const QString&,int,int,const QList<QPair<QString,QVariant> >&,const QList<QPair<QString,QVariant> >&,const QString&,QmlDesigner::ModelNode::NodeSourceType)
                                                       [__1cLQdDmlDesignerMAbstractViewPcreateModelNode6MrknHQdDString_iirknFQdDList4nFQdDPair4n0C_nIQdDVariant_____rk54n0AJModelNodeONodeSourceType__9B_]
                                                                                                                                                                    ^^^                               ^^
I have marked the characters that differ (7 is replaced with rk5, and 8 with 9B).

The demangled output shows the same prototype, but the manglings differ (the symbol defined in section 11 comes from abstractview.cpp, the UNDEF one from qmlstate.cpp).

I guess that if the function were mangled the same way in both cases, the linker would resolve it and the resulting shared object would not have the undefined symbol.

Is there any document describing the mangling rules of the CC compiler? I once read an article on the Oracle site about ABI stability that mentions publishing such a document, but I cannot find it anywhere.

Where does the difference come from? Is it a bug in the mangler or in the demangler?
I think mangling should be a one-to-one function that does not allow two different encodings of the same prototype. Maybe the demangler does not correctly decode the small difference between the symbols above, which, decoded properly, would indicate what the compiler "thought" of the function when exporting it and when using it as extern.


I have noticed the same difference in mangling for Sun Studio 12 (CC: Sun C++ 5.9 SunOS_sparc Patch 124863-01 2007/07/25)
Solaris Studio 12.2 (CC: Sun C++ 5.11 SunOS_sparc/i386 2010/08/13)
and Solaris Studio 12.3 (CC: Sun C++ 5.12 SunOS_i386 2011/11/16)
  • 1. Re: CC name mangling confusion
    Steve_Clamage Pro
    You have run into a known bug in name mangling, documented here:
    http://docs.oracle.com/cd/E24457_01/html/E21987/glnzd.html#gkgak
    (read this short article before reading the rest of this post)

    If we fix the bug, some currently working programs will fail to link, if they mix old and new binaries.
    If we don't fix the bug, some valid programs will not link, as in your example.
    We have chosen to leave the bug in place on older platforms, to avoid breaking working programs. Those platforms are
    SPARC Solaris, 32-bit and 64-bit
    x86 Solaris, 32-bit

    However, on platforms introduced after the bug was discovered, there was no issue with backward compatibility. The bug does not exist on these platforms:
    x86 Solaris, 64-bit
    Linux, 32-bit and 64-bit

    If you are on a 32-bit x86 Solaris platform, consider whether you can build in 64-bit mode. If so, the bug will not be present, and in addition, 64-bit programs are often more efficient than 32-bit programs.

    Otherwise:

    If the problem is due to inconsistent top-level const on function parameters, such as
    int foo(const int);
    int foo(int); // same function
    it is best to fix the source code to use const consistently. Technically, the compiler should ignore the const, but the inconsistency can be confusing to human readers of the code.
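
A minimal sketch of the language rule behind this advice, assuming a C++11 compiler for `static_assert` and `<type_traits>` (newer than the compilers discussed in this thread). It shows that top-level const on a parameter does not change the function type, while const behind a reference does:

```cpp
#include <type_traits>

int foo(const int);   // declaration with top-level const
int foo(int);         // redeclares the same function; this is not an overload

// The definition may use either spelling; const here only affects the body.
int foo(const int x) { return x + 1; }

// Top-level const is dropped when the function type is formed.
static_assert(std::is_same<int(const int), int(int)>::value,
              "top-level const is not part of the function type");

// const behind a reference is not top-level, so it does change the type.
static_assert(!std::is_same<int(const int&), int(int&)>::value,
              "const on the referred-to type is part of the function type");
```

Because both declarations name one function, a conforming compiler has to emit one symbol for them; the bug described in this post is that the two spellings can end up with different symbols.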

    If the problem is the more subtle one of mixing typedefs of composite types with explicit coding of the same type, modifying the source code might not be a realistic solution. The weak-declaration solution shown in the document might be a workable (if ugly) solution in that case.

    Finally, the compiler has a hidden option to fix all known mangling bugs unconditionally. We don't publicize the option because
    - It is unstable. Future patches or releases could change mangling if more bugs are found.
    - You might need to recompile all of the C++ code, including 3rd-party libraries, using this option.
    - If you create a library with this option, it might be incompatible with code compiled without the option.
    - As with all hidden options, it is subject to change or removal without notice.
    - We view this option as "use at your own risk".

    If after all these caveats you still want to try the option, here it is:
    -Qoption ccfe -abiopt=mangle6
    Be sure to add it to every CC command, and recompile everything.

    Fortunately, none of the C++ system libraries shipped with Solaris or Studio is affected by this bug or the hidden option, so you don't need to worry about different versions of those libraries.
  • 2. Re: CC name mangling confusion
    koval Newbie
    Thanks a lot, this article was really helpful!

    Mangling was inconsistent in this case because the declaration uses the typedef alias PropertyListType and the definition uses the target type itself: QList<QPair<QString, QVariant> >
    After changing the declaration to use the typedef, everything now mangles correctly.

    QtCreator code runs into this problem in more places, but in the form of const/non-const arguments, which were easy to find: the demangled output shows const in one symbol and no const in the other.
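
A small sketch of the typedef pattern described here, using standard-library stand-ins for the Qt types. `PropertyList`, `View`, and `count` are hypothetical names chosen for illustration, and a C++11 compiler is assumed for `static_assert`:

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Stand-in for the Qt typedef in the thread (hypothetical name).
typedef std::vector<std::pair<std::string, int> > PropertyList;

struct View {
    // The declaration spells the parameter with the typedef ...
    std::size_t count(const PropertyList& props) const;
};

// ... while the definition spells out the target type. In C++ this is the
// same function, so both spellings should yield the same mangled symbol.
std::size_t View::count(
        const std::vector<std::pair<std::string, int> >& props) const {
    return props.size();
}

// A typedef introduces a new name, not a new type.
static_assert(
    std::is_same<PropertyList,
                 std::vector<std::pair<std::string, int> > >::value,
    "the typedef and its target are one type");
```

Using the same spelling in the declaration and the definition, as done above for QtCreator, sidesteps the question of whether the compiler canonicalizes the two forms identically.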

    PS I understand that fixing the bug would break compatibility but couldn't the demangler somehow show differences?
    I think adding a comment like /*typedef alias*/ before the argument type could be useful and it would not change the property of the output string that it is a valid C++ function prototype
  • 3. Re: CC name mangling confusion
    Steve_Clamage Pro
    koval wrote:
    PS I understand that fixing the bug would break compatibility but couldn't the demangler somehow show differences?
    The different manglings represent the same function; that's the problem. The bugs are in the compiler not following its own rules in a few cases, generating a syntactically valid mangling but not the canonical mangling. The demangler is able to parse the alternative manglings, and generate the same function prototype for each alternative.
    I think adding a comment like /*typedef alias*/ before the argument type could be useful and it would not change the property of the output string that it is a valid C++ function prototype
    Sorry, I don't understand what you mean.
  • 4. Re: CC name mangling confusion
    koval Newbie
    Steve_Clamage wrote:
    koval wrote:
    PS I understand that fixing the bug would break compatibility but couldn't the demangler somehow show differences?
    The different manglings represent the same function; that's the problem. The bugs are in the compiler not following its own rules in a few cases, generating a syntactically valid mangling but not the canonical mangling. The demangler is able to parse the alternative manglings, and generate the same function prototype for each alternative.
    I think adding a comment like /*typedef alias*/ before the argument type could be useful and it would not change the property of the output string that it is a valid C++ function prototype
    Sorry, I don't understand what you mean.
    I was wondering whether the alternative mangling leaves any trace that the argument type was originally a typedef alias.
    If the demangler could recognize such a situation, it could add a comment to the output string.

    For example:
    suppose there is a function void print(const std::string&)
    which the compiler mangled in some non-canonical way

    The demangler detects that the mangling is an effect of the bug with typedefs as arguments.

    In this situation the output from the demangler could be:
    void print(const /*typedef alias for*/ std::basic_string<char, std::char_traits<char>, std::allocator<char> >&)

    [Edit]
    If the mangling appears in canonical form, no comment should be emitted, so that the user can see where the differing symbols come from
    (the demangled form is easier to read, and a comment on only one of the symbols indicates that the problem is caused by this particular compiler bug)

    Edited by: koval on 2012-03-21 00:22
  • 5. Re: CC name mangling confusion
    Steve_Clamage Pro
    Now I see what you are getting at.

    Cases of non-canonical mangling probably could in principle be detected. The demangler could re-mangle the demangled name and see whether the result matches the input. I see potential problems with this approach.

    - A name mangler is a nearly complete C++ parser, meaning the demangler would become similar in size and complexity to a C++ compiler front end.

    - The mangler would not have prior declarations of names to guide the parsing. Parsing would necessarily fail for some demangled names due to ambiguities in the C++ grammar that are resolved by checking what kind of thing the name is.

    - The demangled name is not always a syntactically valid declaration, meaning the parser would either fail, or need special syntax rules.

    It might be possible to come up with some ad-hoc analyses that would find some cases of non-canonical mangling, but I don't think the complexity is worth the benefit. The bugs show up rarely, as evidenced by the years that elapsed between the initial release of the compiler and the first bug report about problems with name mangling.
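
The re-mangle-and-compare idea from this reply can be sketched as a round-trip check. This is only an illustration: `toy_demangle` and `toy_mangle` are hypothetical stand-ins over a made-up two-symbol scheme, not libdemangle routines and not the Sun CC encoding. The hard part named above, a real mangler, is exactly what the toy leaves out.

```cpp
#include <functional>
#include <string>

typedef std::function<std::string(const std::string&)> Transform;

// True when re-mangling the demangled name reproduces the input symbol,
// i.e. the symbol is the canonical spelling of its prototype.
bool is_canonical(const std::string& symbol,
                  const Transform& demangle,
                  const Transform& mangle) {
    return mangle(demangle(symbol)) == symbol;
}

// Toy scheme: two different symbols decode to the same prototype,
// mirroring the situation in this thread.
std::string toy_demangle(const std::string& symbol) {
    if (symbol == "f_i" || symbol == "f_ci") return std::string("f(int)");
    return symbol;
}

// Toy mangler: always produces the single canonical spelling.
std::string toy_mangle(const std::string& prototype) {
    if (prototype == "f(int)") return std::string("f_i");
    return prototype;
}
```

With these stand-ins, `is_canonical("f_ci", toy_demangle, toy_mangle)` is false, which is the signal a demangler would need in order to annotate a non-canonical symbol.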
  • 6. Re: CC name mangling confusion
    koval Newbie
    I didn't realize demangling is such a complicated process. I misjudged it from the quite moderate size of /usr/lib/libdemangle.so

    It would be convenient to have such a feature in the demangler, but you are right: the bug is not that painful once you know the scenarios in which it can show up.

    I am only wondering about what http://developers.sun.com/solaris/articles/CC_abi/CC_abi_content.html says:
    The ABI was published as a public document.
    Is such a document still available?


    Anyway thanks once more for your explanation of the bug, I have the newest version of QtCreator compiled with Solaris Studio and running on our Solaris boxes. After minor patches it even runs on Solaris 9!
  • 7. Re: CC name mangling confusion
    Steve_Clamage Pro
    A DEMANGLER is usually quite straightforward. You design your mangling scheme so that it is easy to demangle -- just gobble up tokens and convert to names and symbols.

    To MANGLE a name from source code requires most of a C++ compiler front end.
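
To illustrate the "gobble up tokens" half of this contrast, here is a toy demangler for a made-up scheme in which every name component is written as a decimal length followed by that many characters. This is an assumption for illustration only and is not the Sun CC encoding; `toy_demangle_qualified` is a hypothetical name.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Toy scheme (NOT the Sun CC encoding): "3foo3bar" means foo::bar.
// Returns an empty string when the input is malformed.
std::string toy_demangle_qualified(const std::string& mangled) {
    std::string out;
    std::size_t i = 0;
    while (i < mangled.size()) {
        // Token 1: the decimal length of the next component.
        std::size_t len = 0;
        std::size_t digits = 0;
        while (i < mangled.size() &&
               std::isdigit(static_cast<unsigned char>(mangled[i]))) {
            len = len * 10 + static_cast<std::size_t>(mangled[i] - '0');
            ++i;
            ++digits;
        }
        if (digits == 0 || len == 0 || len > mangled.size() - i) {
            return std::string();  // no length, or length runs past the end
        }
        // Token 2: the component itself, joined with the scope operator.
        if (!out.empty()) out += "::";
        out.append(mangled, i, len);
        i += len;
    }
    return out;
}
```

No symbol table or grammar is needed: the encoding tells the decoder exactly how many characters to consume next, which is why a demangler can stay small.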

    The ABI document you reference was for C++ 4.x, which shipped from about 1993-1998. The C++ 5.0 through 5.11 compilers supported that ABI as an option (-compat=4), but only for compatibility with old code. The ABI supports only old-style (ARM, 1990) C++. As of Studio 12.3 (C++ 5.12) that ABI is no longer supported.

    I don't know whether the old ABI doc ever appeared on a public web site. "Public" just meant Sun would give a copy to anyone who asked. Since we don't use that ABI any more, it's no longer very interesting.

    Oracle does not have a public document that describes the default (standard C++) ABI.
  • 8. Re: CC name mangling confusion
    koval Newbie
    I think I understand you now: a compiler front end would be required to determine whether a mangling is canonical, but the demangler does not need one to recover the demangled name from a mangled symbol.

    Fortunately we do not need the ABI document to be able to demangle symbols; libdemangle makes that available to programmers.

    Do you know whether the `cplus_demangle' routine from libdemangle is signal-safe, that is, whether it only uses functions that POSIX permits in a signal handler?
  • 9. Re: CC name mangling confusion
    Steve_Clamage Pro
    The function uses pthread mutex calls to make it thread-safe, and calls printf on error conditions. So no, you should not call the function from a signal handler.
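
A common way around this restriction is to have the handler only record that a signal arrived and to do the demangling later from normal program flow. The sketch below shows that pattern; `on_signal` and `take_pending_signal` are hypothetical names, and the call to the demangler is deliberately left out because its signature is not given in this thread.

```cpp
#include <csignal>

// volatile sig_atomic_t is the object type a handler may safely write.
static volatile std::sig_atomic_t g_signal_seen = 0;

// The handler takes no locks and calls no stdio or demangling routines.
extern "C" void on_signal(int signo) {
    g_signal_seen = signo;
}

// Called from normal program flow, where mutexes and printf are safe.
// Returns the pending signal number (0 if none) and clears the flag.
int take_pending_signal() {
    int signo = g_signal_seen;
    g_signal_seen = 0;
    return signo;
}
```

The main loop would install the handler with `std::signal`, poll `take_pending_signal()`, and only then walk the stack or demangle names. A single flag records only the most recent signal, so a real program may need something richer.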
  • 10. Re: CC name mangling confusion
    koval Newbie
    Thanks for the info, I was afraid it would not be suitable for signal handlers...
