Dev Tip: How to Get Finer-Grained Control of Debugging Information


    with Oracle Solaris Studio

     

    by Ivan Soleimanipour

     

    The new options in Oracle Solaris Studio 12.4 provide much finer-grained control over debug information, which allows you to choose how much information is provided and to reduce the amount of disk space needed for the executable.

     


     

    Introduction

     

    Oracle Solaris Studio 12.4 provides new options that give finer control over the information embedded into an executable for use by the debugger and other tools. The primary motivation for this feature is to reduce the size of debugging information at the cost of some debugging features. This article first describes the controls and then shows examples of how to use them, along with the resulting size savings.

     

    Curbing Complexity

     

    The complete list of the finer-grained knobs might be overwhelming, so there should be an easier way to gain the same level of control with simpler options. To that end we have:

     

    Flag      Comment
    -g        The traditional debugging option which will continue to provide all the information it provided historically. We'll actually pick it apart later.
    -g3       Is intended for more bulky information. At the moment, this only constitutes macro definition information.
    -g2       Same as -g.
    -g1       This is new to Oracle Solaris Studio 12.4. It is intended for minimal debuggability of deployed applications. It provides the file name and line numbers as well as simple parameter information. This is the information that is considered crucial during post-mortem debugging.
    -g0       (C++ only) Same as -g but enabling the C++ compiler to do some function inlining.
    -gnone    Don't emit any debugging information.

     

    There may be scenarios that are not captured by the above family of -g options. Consequently, there is a set of flags that provides an even finer granularity of control.

     

    The Finer-Grained Knobs

     

    The flag -g expands to the following:

     

    Flag                       Comment
    -xdebuginfo=line           Provides information about file names and line numbers. Primarily affects stack traces, source-level single stepping and ability to put breakpoints on source lines.
    -xdebuginfo=param          Provides information about parameters' types and values. The type information is not restricted to simple types such as int and float. It also includes compound types such as char * and char *argv[] and even Node *np. However, information about the internal details of structures such as Node is not provided. That is the job of tagtype information.
    -xdebuginfo=tagtype        Provides information about the fields of structures, unions, enums and classes. This includes class static data members.
    -xdebuginfo=variable       Provides information about all variables other than parameters. This includes extern and static variables including those nested inside functions.
    -xdebuginfo=decl           Provides information about the types of variables and functions whose definition wasn't compiled with debug info but for which we have declarations. This allows for accurate evaluation of variables such as errno or the calling of functions such as printf().
    -xdebuginfo=codetag        Codetag information is used by a variety of tools. For example, it helps RTC (dbx Run Time Checking) to handle bitfield accesses correctly.
    -xglobalize=yes            Makes static data definitions global and prepends unique identifiers to avoid namespace collisions.
    -xpatchpadding=fix         Adds padding to the start of functions to enable tools to patch the functions. This option together with -xglobalize=yes enables the use of the dbx Fix and Continue feature.
    +d (for unoptimized code)  By default the C++ compiler will inline member functions that appear in explicit inlined form:

     

    class Node {
        ...
        char *name() const { return n_name; }
        ...
    };

     

    The C++ compiler flag +d forces the creation of fully outlined definitions so that the member functions are visible in stack traces, can have breakpoints put on them, and can be stepped into or called from a debugger.

     

    When no optimization is requested, -g includes +d; optimized compiles do not include +d. There is a historical option, -g0, which is exactly like -g but without +d. (That is why -gnone was used as the name for requesting no debug information.)

     

    The debug flag -g1 doesn't specify +d.
    -xkeep_unref=funcs,vars    This option ensures that even unused code is available. Compilers will often not emit code or information about functions and variables that are unused by the actual program. Sometimes these functions and variables are only referenced from a debugger:

     

    (dbx) assign debug_format = "x"
    (dbx) call n->debug_dump()

     

    The following flags complete the list:

     

       

    Flag                   Comment
    -xdebuginfo=hwcprof    Provides information on variable access names by load and store instructions. Used by -xhwcprof.
    -xdebuginfo=macro      Included by -g3.
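
    To make the categories in the two tables above more concrete, here is a minimal sketch of a translation unit whose comments map each entity to the knob that, according to the descriptions above, governs its debug information. The file and all names in it (MAX_NAME, visit, visit_count, last_errno) are hypothetical and exist only for illustration; the mapping in the comments is a reading of the tables, not compiler output.

```cpp
// knobs_demo.cc: hypothetical file used only to illustrate the tables above.
#include <cassert>
#include <cerrno>
#include <cstdio>

#define MAX_NAME 32   // macro definitions: -xdebuginfo=macro (included by -g3)

class Node {          // layout of the members: -xdebuginfo=tagtype
public:
    explicit Node(const char *name) : n_name(name) {}

    // Defined inside the class, so it is a candidate for inlining;
    // +d asks for an outlined definition a debugger can stop in or call.
    const char *name() const { return n_name; }

    // Never called by the program, only meant to be called from a debugger;
    // -xkeep_unref=funcs is the knob that keeps such functions available.
    void debug_dump() const { std::printf("Node(%s)\n", n_name); }

private:
    const char *n_name;
};

// File static: described by -xdebuginfo=variable; -xglobalize=yes would
// make it global under a uniquely prefixed name.
static int visit_count = 0;

// The type and value of np: -xdebuginfo=param.
// Looking inside *np additionally needs tagtype information.
int visit(const Node *np) {
    assert(np != nullptr);
    int len = 0;                               // local: -xdebuginfo=variable
    for (const char *p = np->name(); *p; ++p)  // line info: -xdebuginfo=line
        ++len;
    ++visit_count;
    return len;
}

// errno is defined in a library, and only its declaration is seen here;
// evaluating it accurately in a debugger relies on -xdebuginfo=decl.
int last_errno() { return errno; }
```

    For example, calling visit() on a Node constructed with the name "abc" returns 3 and increments visit_count.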

     

    Your Mileage May Vary

     

    In the following, we've chosen dbx itself as a "typical" C++ program and gcc, from SPEC cpu2006, as a "typical" C program.

     

    Note: Of course no program is "typical." Some programs include every .h in every .c, while others are very careful in that regard. Some programs heavily depend on inlining while others favor outlined functions. Some programs heavily depend on the C++ Standard Template Library and others don't (dbx is among the latter).

     

    Furthermore the amount of debug and other info may heavily depend on the compiler (C, C++, or Fortran), the platform, 32- versus 64-bit and other compiler options, notably optimizations (-O, etc.).

     

    Therefore, the information below should primarily be taken as illustrations of how to use the finer-grained knobs.

     

    Sizes of Executables

     

    Executables are crammed with information. Command line utilities such as size(1), elfdump(1), and dwarfdump(1) can provide a lot of detail, but it's either too much or not enough.

     

    Oracle Solaris Studio 12.4 has a utility, dsize(1), that provides just the right amount of information for learning about the contribution of debug info to executable size. This utility is intended for field diagnosis so it is located in <install>/lib/dbx as opposed to <install>/bin.

     

    Let's start by examining what is crammed into the dbx and gcc executables when they are compiled with Oracle Solaris Studio compilers.

     

    The following table was produced using the dsize command. The -h option gives "human readable" sizes.

     

    $ dsize -h dbx-g gcc-g
                                   dbx-g      gcc-g
                             ----------------------
    L                .interp:      17  b      17  b
    L                  .hash:  143.27 Kb   37.64 Kb
    L                .dynsym:  286.45 Kb   75.27 Kb
    L                .dynstr:  623.55 Kb   96.40 Kb
    L          .SUNW_version:     544  b      64  b
    L           .SUNW_versym:   35.81 Kb    9.41 Kb
    L            .SUNW_reloc:     144  b      24  b
    L              .rela.plt:    3.43 Kb     792  b
    L                  .text:    4.52 Mb    4.98 Mb
    L                  .init:    1.09 Kb      16  b
    L                  .fini:      20  b      12  b
    L      .exception_ranges:  411.12 Kb          -
    L                .rodata:  765.55 Kb  109.23 Kb
    L               .rodata1:   45.06 Kb  178.94 Kb
    L             .SUNW_move:      24  b          -
    L                   .got:     204  b      28  b
    L                   .plt:    3.48 Kb     844  b
    L               .dynamic:     400  b     280  b
    L             .ex_shared:      52  b          -
    L                  .data:  713.25 Kb    5.19 Kb
    L               .picdata:       0  b       0  b
    L                 .data1:  165.58 Kb          -
    L                   .bss:       0  b       0  b
     S               .symtab:  380.23 Kb  127.97 Kb
     S               .strtab:  719.21 Kb  138.95 Kb
                   .annotate:  515.36 Kb  491.80 Kb
                    .comment:  521.85 Kb  191.91 Kb
     SDb         .debug_info:   27.85 Mb    2.57 Mb
     SDb         .debug_line:  839.17 Kb  458.32 Kb
     SDb       .debug_abbrev:  501.44 Kb   68.37 Kb
     SDi         .stab.index:  486.91 Kb   71.04 Kb
     S              .compcom:    8.60 Kb          -
     SDb          .debug_loc:   49.57 Kb          -
                   .shstrtab:     326  b     259  b
     SDi      .stab.indexstr:    1.57 Mb  192.16 Kb
                     Elf hdr:      52  b      52  b
           Program hdr table:     160  b     160  b
           Section hdr table:    1.41 Kb    1.17 Kb
          Total section size:   40.99 Mb    9.76 Mb
                       Tally:   40.99 Mb    9.76 Mb
                             ----------------------
             Total file size:   40.99 Mb    9.76 Mb
                        slop:      20  b      25  b
                      Loaded:    7.64 Mb    5.48 Mb
                            :        18%        56%
                  not Loaded:   33.35 Mb    4.27 Mb
                            :        81%        43%
                    Stripped:   32.33 Mb    3.60 Mb
                            :        78%        36%
                not Stripped:    8.66 Mb    6.15 Mb
                            :        21%        63%
                       Debug:   31.25 Mb    3.34 Mb
                            :        76%        34%
                       Index:    2.05 Mb  263.20 Kb
                            :         4%         2%
                        Bulk:   29.20 Mb    3.09 Mb
                            :        71%        31%
                    NonDebug:    9.74 Mb    6.41 Mb
                            :        23%        65%
                             ----------------------
                                   dbx-g      gcc-g

    ===== DWARF stats for 'dbx-g'
                   TAG TYPE:      N  NA       SIZE  AVG
                 subprogram:  388574 174   15.74 Mb   42
                    |-decls:  369829 164   15.14 Mb   42
                    |- defs:   18745 170  614.15 Kb   33
           formal_parameter:  684508 143    3.69 Mb    5
                     member:  162533  98    3.01 Mb   19
                 class_type:   30767 149    1.17 Mb   39
                 enumerator:   63839  47  843.28 Kb   13
                   variable:   29462 169  634.22 Kb   22
                    |-decls:    9224 167  266.77 Kb   29
                    |- defs:   20238 147  367.45 Kb   18
               compile_unit:     369   1  552.83 Kb 1534
                    typedef:   11362 125  364.34 Kb   32
               pointer_type:   60661  13  296.20 Kb    5
             structure_type:   12921 126  239.66 Kb   18
           enumeration_type:    7044 122  198.00 Kb   28
                 const_type:   20358  22   99.40 Kb    5
              SUN_codeflags:   19092  73   77.37 Kb    4
              lexical_block:    7894 108   69.99 Kb    9
         SUN_class_template:    1857  42   62.43 Kb   34
                inheritance:    6514  49   57.25 Kb    9
             reference_type:   10295  49   50.27 Kb    5
                  base_type:    4022   3   44.98 Kb   11
        SUN_rtti_descriptor:    1349  92   44.90 Kb   34
                 array_type:    4395 137   43.97 Kb   10
    template_type_parameter:    8951  41   43.71 Kb    5
            subroutine_type:    4852 116   35.19 Kb    7
                     friend:    6971  43   34.04 Kb    5
                   constant:   13099 103   33.00 Kb    2
              subrange_type:    4395 101   26.71 Kb    6
                 union_type:    2475 116   23.59 Kb    9
                  namespace:     385 110    7.65 Kb   20
           unspecified_type:    1962  41    7.16 Kb    3
       imported_declaration:    1352 126    7.04 Kb    5
         ptr_to_member_type:     217  59    1.91 Kb    9
     unspecified_parameters:    1628  71    1.59 Kb    1
                thrown_type:     411  40    1020  b    2
      SUN_function_template:       6   4     356  b   59
            imported_module:      48  26     248  b    5
              volatile_type:      41  20     212  b    5

    ===== DWARF stats for 'gcc-g'
                   TAG TYPE:      N  NA       SIZE  AVG
                 enumerator:   35781  23  529.56 Kb   15
                     member:   27575  24  516.29 Kb   19
                 subprogram:   12725  50  413.32 Kb   33
                    |-decls:    7042  44  183.30 Kb   26
                    |- defs:    5683  37  230.02 Kb   41
                   variable:   20126  39  384.01 Kb   19
                    |-decls:     571  30   18.48 Kb   33
                    |- defs:   19555  36  365.53 Kb   19
           formal_parameter:   25061  36  270.86 Kb   11
              lexical_block:   17076  30  150.08 Kb    9
              SUN_codeflags:    8119  23   64.50 Kb    8
               compile_unit:     155   1   58.06 Kb  383
             structure_type:    3383  37   55.09 Kb   16
               pointer_type:    7607  13   37.14 Kb    5
                    typedef:    1726  28   27.44 Kb   16
                 array_type:    2289  35   22.35 Kb    9
                 const_type:    4096  18   20.00 Kb    5
                  base_type:    1214   8   13.82 Kb   11
              subrange_type:    2289  33   13.48 Kb    6
           enumeration_type:     610  35   11.29 Kb   18
                 union_type:     455  21    5.61 Kb   12
            subroutine_type:     683  43    4.00 Kb    5
     unspecified_parameters:     356  28     356  b    1
              volatile_type:       1   1       5  b    5

     

    An executable is composed of "sections," the names of which appear usually with a "." in front. Sections marked L are loaded into memory when the program is executed.

     

    Sections not marked with L do not affect the working memory needed for execution but instead contribute to memory used by debuggers. This includes the ELF symbol table, .symtab and .strtab.

     

    The S in sections marked with S stands for strippable (see strip(1)), not for symbol. Sections containing debugging information are marked with D.

     

    Debugging information is further divided into i (index) and b (bulk). The index information has to be read by dbx during program load, whereas the bulk information is read in on demand.
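
    The flag letters described above can be summarized in a few lines of code. The following is only a sketch of how the flag column could be decoded; the SectionFlags struct and parse_flags function are illustrative names and are not part of dsize or dbx.

```cpp
#include <cassert>
#include <string>

// Properties encoded in the flag column of the dsize listing,
// following the descriptions in the text.
struct SectionFlags {
    bool loaded = false;      // L: loaded into memory when the program runs
    bool strippable = false;  // S: can be removed with strip(1)
    bool debug = false;       // D: contains debugging information
    bool index = false;       // i: index, read by dbx during program load
    bool bulk = false;        // b: bulk, read in on demand
};

// Decode a flag string such as "L", "S", "SDb" or "SDi".
// Characters other than the five letters above are ignored.
SectionFlags parse_flags(const std::string &flags) {
    SectionFlags f;
    for (char c : flags) {
        switch (c) {
        case 'L': f.loaded = true; break;
        case 'S': f.strippable = true; break;
        case 'D': f.debug = true; break;
        case 'i': f.index = true; break;
        case 'b': f.bulk = true; break;
        default: break;
        }
    }
    // Debug information is divided into index or bulk, never both.
    assert(!(f.index && f.bulk));
    return f;
}
```

    With this sketch, parse_flags("SDb") describes .debug_info as strippable, bulk debug information that is not loaded, while parse_flags("L") describes a section such as .text.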

     

    After the detailed section information, coarse-grained statistics are presented. The most relevant are these:

     

             Total file size:   40.99 Mb    9.76 Mb
                       Debug:   31.25 Mb    3.34 Mb
                            :        76%        34%

     

    which shows what percentage of the executable's size is taken up by debug information. In this case, 76 percent of the file size for dbx and 34 percent of the file size for gcc is taken up with debug information.
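
    The percentage lines in the listing are simply each category's share of the total file size. As a sketch, the arithmetic for the dbx column looks like this; percent_of_total is a hypothetical helper, and the use of truncation rather than rounding is an assumption inferred from the listing, where 7.64 Mb of 40.99 Mb (about 18.6 percent) is shown as 18%.

```cpp
#include <cassert>

// Share of `part` in `total` as a whole percentage.
// Truncation is an assumption inferred from the dsize listing,
// not documented behavior of the tool.
int percent_of_total(double part, double total) {
    assert(total > 0.0);
    return static_cast<int>(100.0 * part / total);
}
```

    For dbx, percent_of_total(31.25, 40.99) gives 76 for Debug, percent_of_total(7.64, 40.99) gives 18 for Loaded, and percent_of_total(29.20, 40.99) gives 71 for Bulk, matching the listing.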

     

    Further down, the bulk of DWARF Debug information itself is broken down into records and a statistical breakdown of their prevalence is presented.

     

    Executable Size Versus Disk Storage

     

    By default, all debugging information goes into the executable. One way to shrink executable size is to use the -xs=no option, which will leave the bulk of debugging information in .o's and not include it in the executable.

     

    When compiled with the default of -xs=yes, the final executable will contain all the debug information, so it will be roughly the same size as the sum of the sizes of all the object files. When compiled with -xs=no, the object files remain the same size but only the index information is copied into the executable. For example, dbx has 31 MB of debug info and 2 MB of index info. Compiled with -xs=no, dbx will be 29 MB smaller than when it is compiled with the default of -xs=yes.

     

    Note that the debugger will need access to the object files when debugging an application compiled with -xs=no.
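
    The 29 MB figure follows directly from the dsize numbers: debug information is the sum of index and bulk, and -xs=no keeps only the index in the executable. A minimal sketch of that estimate follows; the function name is hypothetical and the result is an approximation, not a measured size.

```cpp
#include <cassert>

// Estimated executable size (in MB) when compiled with -xs=no.
// Assumes, as described in the text, that only the index part of the
// debug information stays in the executable and the bulk part stays
// in the object files.
double executable_mb_with_xs_no(double total_mb, double debug_mb,
                                double index_mb) {
    assert(total_mb >= debug_mb && debug_mb >= index_mb);
    double bulk_mb = debug_mb - index_mb;  // what stays behind in the .o's
    return total_mb - bulk_mb;
}
```

    With the dbx figures (40.99 MB total, 31.25 MB debug, 2.05 MB index), the bulk is 29.20 MB and the estimate for the -xs=no executable is about 11.79 MB.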

     

    Generating Plain Debuggable Executables

     

    If you're only interested in plain debugging (no Run Time Checking, no Fix and Continue), there's a whole host of features which can be turned off without any loss of primary debuggability.

     

    Compile your code as follows:

     

    cc/CC ... -g \
            -xdebuginfo=no%codetag \
            -xkeep_unref=no%funcs,no%vars \
            -xglobalize=no \
            -xpatchpadding=0

     

    The savings are practically non-existent for C++ and modest for C in our example, but we'll cover the circumstances where they might become significant below.

     

    -g:
                                   dbx-g      gcc-g
                       .text:    4.52 Mb    4.98 Mb
                     .strtab:  719.21 Kb  138.95 Kb
             Total file size:   40.99 Mb    9.76 Mb
                       Debug:   31.25 Mb    3.34 Mb

    Plain debugging:
                                   dbx-g      gcc-g
                       .text:    4.30 Mb    4.90 Mb
                     .strtab:  644.40 Kb  118.69 Kb
             Total file size:   40.57 Mb    9.53 Mb
                       Debug:   31.26 Mb    3.28 Mb

     

    Note that the debug size actually increases by a small amount for dbx because of the longer command line that gets recorded in DW_TAG_compile_unit.DW_AT_SUN_command_line.

     

    The bulk of the savings comes from reduction in .text size, which is attributable to the flags -xkeep_unref=no%funcs and -xpatchpadding=0.

     

    Results from patch padding can be highly variable. Patch padding puts a four-word (for sparc-32) patch area in front of each function so that calls to an old generation of a function can be routed to the patched new generation.

     

    When optimization is used for 32-bit executables on SPARC, the compiler will put patch padding only in front of "small" functions, so -xpatchpadding=0 isn't going to be of much benefit.

     

    For 64-bit SPARC executables, a patch area is always needed, so -xpatchpadding=0 is likely to be more useful regardless of optimization.

     

    On x86 platforms, functions are aligned on cache lines, so they will invariably have padding in front of them anyway and, again, -xpatchpadding=0 might not make much difference.

     

    There are some minor savings in .strtab due to -xglobalize=no.

     

    Globalization converts file static symbols to global symbols and, in order to avoid potential name conflicts, will prefix the global name with a string such as $XA4AZhK$N8GSUys. With -xglobalize=no, symbol names become shorter.

     

    A program usually doesn't have an inordinate amount of file statics, so -xglobalize=no might not save much. However, as an example, some really old code may be peppered with things such as

     

    static char *RCSId_%F% = "...";

     

    in header files and then the number of globalized symbols may skyrocket.
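
    A back-of-the-envelope estimate shows why such headers matter. The sketch below assumes, purely for illustration, that every translation unit includes a header defining some file statics and that each globalized symbol name grows by one prefix of the length seen in the sample string above. The function name and the scenario are hypothetical; real prefixes are generated by the compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Sample prefix quoted in the text; used here only for its length.
const std::string kSamplePrefix = "$XA4AZhK$N8GSUys";

// Rough number of extra symbol-name bytes caused by globalization,
// assuming one prefix per file static per translation unit.
std::size_t extra_name_bytes(std::size_t translation_units,
                             std::size_t statics_per_header,
                             std::size_t prefix_length) {
    assert(prefix_length > 0);
    return translation_units * statics_per_header * prefix_length;
}
```

    For instance, with 369 translation units (the compile_unit count reported for dbx) and a single such static in a commonly included header, the estimate is 369 * 16 = 5904 extra bytes of symbol names for that one variable alone.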

     

    Biggest Bang for the Buck

     

    For C++ programs, the biggest size savings with the least loss of functionality may be achieved by compiling your code to exclude declaration information; note that this retains RTC and Fix and Continue functionality.

     

    CC ... -g -xdebuginfo=no%decl

     

    This produces the following results:

     

    -g:
                                   dbx-g      gcc-g
             Total file size:   40.99 Mb    9.76 Mb
                       Debug:   31.25 Mb    3.34 Mb

    ===== DWARF stats for 'dbx-g'
                 subprogram:  388574 174   15.74 Mb   42
                    |-decls:  369829 164   15.14 Mb   42
                    |- defs:   18745 170  614.15 Kb   33
                   variable:   29462 169  634.22 Kb   22
                    |-decls:    9224 167  266.77 Kb   29
                    |- defs:   20238 147  367.45 Kb   18

    Removing declaration information:
                                   dbx-g      gcc-g
             Total file size:   21.48 Mb    9.36 Mb
                       Debug:   11.74 Mb    2.94 Mb

    ===== DWARF stats for 'dbx-nodecl'
                 subprogram:   21651 137  708.68 Kb   33
                    |-decls:    2906  76   96.12 Kb   33
                    |- defs:   18745 137  612.56 Kb   33
                   variable:   22124 123  421.73 Kb   19
                    |-decls:     144  29    4.92 Kb   34
                    |- defs:   21980 123  416.81 Kb   19

     

    Note that not all the decls have been removed from dbx; it is built using some separately compiled libraries, and the decls from them have been retained.

     

    You can see how dbx's executable size was cut almost in half, going from 41 MB to 21.5 MB.

     

    The main loss of functionality here is the ability to inspect variables or call functions in other libraries not compiled with debuginfo.

     

    -g0

     

    For unoptimized C++ code, using -g will stop function inlining, which will reduce performance. The option -g0 exists to allow inlining of functions together with the generation of debug information for unoptimized builds. Note that the flag that disables C++ inlining is +d, and there is no flag with an opposite meaning.

     

    The size savings are not significant, and the loss in functionality may be considered too severe.

     

    However, -g1 also doesn't turn on +d, so using -g1 with +d might be something you want to consider if you need to see more functions in the stack traces.

     

    -g:
                                   dbx-g
                       .text:    4.52 Mb
             Total file size:   40.99 Mb
                       Debug:   31.25 Mb

    ===== DWARF stats for 'dbx-g'
                 subprogram:  388574 174   15.74 Mb   42
                    |-decls:  369829 164   15.14 Mb   42
                    |- defs:   18745 170  614.15 Kb   33

    -g0:
                                   dbx+d
                       .text:    4.72 Mb
             Total file size:   40.19 Mb
                       Debug:   30.71 Mb

    ===== DWARF stats for 'dbx-g'
                 subprogram:  386054 167   15.69 Mb   42
                    |-decls:  373706 157   15.28 Mb   42
                    |- defs:   12348 164  419.94 Kb   34

     

    Inlining eliminates outlined definitions but the inlined calls might grow the code. It looks like here the effects cancel each other out. There is a measurable difference in the number of function definitions but that savings is swamped by the amount of information for declarations. So perhaps -g0 is best used with -xdebuginfo=no%decl.

     

    -g1

     

    To get minimally useful debugging functionality, compile your code as follows:

     

    $ CC/cc ... -g1 -xdebuginfo=no%codetag

    -g:
                                   dbx-g      gcc-g
             Total file size:   40.99 Mb    9.76 Mb
                       Debug:   31.25 Mb    3.34 Mb

    -g1:
                                  dbx-g1     gcc-g1
             Total file size:   15.14 Mb    7.53 Mb
                       Debug:    5.86 Mb    1.20 Mb

     

    However, this isn't a very meaningful scenario. -g1 is intended for production code, which is usually optimized, and if it's a library, its users would appreciate that it plays well with RTC (that's why -g1 includes -xdebuginfo=codetag). Such code would be more realistically compiled as follows:

     

    $ CC/cc ... -g1 -O3

    -g:
                                   dbx-g      gcc-g
                       .text:    4.52 Mb    4.98 Mb
                 .debug_info:   27.85 Mb    2.57 Mb
                 .debug_line:  839.17 Kb  458.32 Kb
                  .debug_loc:   49.57 Kb          -
             Total file size:   40.99 Mb    9.76 Mb
                       Debug:   31.25 Mb    3.34 Mb

    -g1 -O3:
                                  dbx-g1     gcc-g1
                       .text:    2.75 Mb    2.93 Mb
                 .debug_info:    3.13 Mb  587.89 Kb
                 .debug_line:    1.11 Mb 1021.36 Kb
                  .debug_loc:  903.85 Kb  552.29 Kb
             Total file size:   14.73 Mb    7.02 Mb
                       Debug:    7.18 Mb    2.34 Mb

     

    Optimization makes for smaller code (compare sizes of .text sections) and because of -g1, we get much less debug info. However, the savings in the amount of debug info are not as substantial as they could be, because for optimized code, parameter locations are described using location lists, stored in .debug_loc, which are bulkier.

     

    Adding Structure Information Back In

     

    When debugging a program compiled with -g1, you will know the types and values of parameters:

     

    (dbx) where -v
      ...
      [6] main_cmd_loop(interp = 0x605c10) (optimized), at 0xe1a84 (line ~1926) in "dbxmain.cc"
      ...
    (dbx) whatis interp
    register struct Interp *interp;
    (dbx) print *interp
    *interp = {
        /* dbxmain.o: Possibly missing debuginfo: tagtype -- see `help -xdebuginfo' */
    }

     

    Note how dbx suggests that the reason you can't see the contents of interp is that -g1 doesn't include -xdebuginfo=tagtype.

     

    You could recompile your whole application with plain -g or tack on -xdebuginfo=tagtype to supplement -g1, but this information is rather bulky. However, there's a way to provide tagtype information to the debugger that doesn't affect the primary executable size.

     

    The trick is to create one .c or .cc file (for example, AllFiles.cc) that includes all pertinent header files. Compile this single file into a shared library (for example, libAllFiles.so). This file can be compiled with full debug, but only needs to be compiled with -xdebuginfo=tagtype.

     

    Next you can issue the dbx commands:

     

    (dbx) loadobject -load libAllFiles.so
    (dbx) module AllFiles.o

     

    This will supplement the debugging information from the a.out with all the type information available in libAllFiles.so. Note that libAllFiles.so will not participate in the execution of a.out. The end result is that if you, for example, see a stack frame like the following:

     

    [5] pass1(Node *n) "pass1.c":456

     

    you will be able to issue the following:

     

    (dbx) print *n

     

    and get all the Node details.

     

    Summary

     

    Let's recap how the size of an executable evolves as we throttle debugging information.

     

                                                       

    Debug Flag    dbx Executable                            gcc Executable
                  Executable Size (MB)   Percentage of -g   Executable Size (MB)   Percentage of -g
    -g            40.99                  100%               9.76                   100%
    -g0           40.19                  98%                NA
    Plain         40.57                  99%                9.53                   98%
    No decls      21.48                  52%                9.36                   96%
    -g1           15.14                  37%                7.53                   77%
    -g1 -xO3      14.73                  36%                7.02                   72%
    -gnone        9.78                   24%                5.44                   56%
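
    The "Percentage of -g" columns are each executable's size relative to the plain -g build. A minimal sketch of that calculation is shown below; percent_of_g is a hypothetical helper, and rounding to the nearest whole percent is an assumption that happens to reproduce the table's figures.

```cpp
#include <cassert>
#include <cmath>

// Size of a build relative to the -g build, as a whole percentage.
// Rounding to nearest is an assumption chosen to match the table.
int percent_of_g(double size_mb, double g_size_mb) {
    assert(g_size_mb > 0.0);
    return static_cast<int>(std::lround(100.0 * size_mb / g_size_mb));
}
```

    For example, percent_of_g(21.48, 40.99) yields 52 for dbx without declaration information, and percent_of_g(7.53, 9.76) yields 77 for gcc with -g1.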

     

    As you can see, debug information can take significant space. Most debug information is only used by the debugger, so it costs disk space rather than memory in the running application.

     

    The new options in Oracle Solaris Studio 12.4 provide much finer-grained control of the amount of debug information emitted, which allows a developer to choose just how much information is provided and to reduce the amount of disk space needed for the executable.

     

     

    Revision 1.0, 12/18/2014

     
