When Debug Symbols Get Large

ddulaney · on March 9, 2023

I recently was troubleshooting a crash that backtraced through the boost::sml library [0]. The crash didn't actually have anything to do with the library, but it was used as the core event loop.

The backtrace -- as in, just the output from running `bt` in GDB -- was over a thousand wrapped lines long. There were ~5 stack frames that took up over a hundred lines of console each to print just the function name. That product's debug builds recently hit the 2GB line, which is enough that old versions of binutils complain.

I don't know what the solution is. There's some really neat stuff you can do with template metaprogramming, and in stripped release builds it compiles down to something extremely tiny. Plus the code is very clean to read. But it does feel like there isn't any kind of central vision for the C++ debugging experience, and bad interactions between highly complex modern C++ libraries, the compiler, and the debugger are probably only going to get worse unless somebody (the ISO committee? vendors? every single library author individually?) thinks really hard about debugging support.

[0]: https://github.com/boost-ext/sml

Blackthorn · on March 9, 2023

This is a big problem with GDB/DWARF and C++. It just does not work anywhere near as well as it does for C.

You don't even need template metaprogramming to get a horrible experience. Just imagine using std::transform and wanting to step. You obviously want to step through your lambda that you pass it. You don't want to step through the over 9000 lines of whatever the fuck libstdc++ is doing to make std::transform work. In most cases you can't even set a breakpoint in your lambda because breakpoints are set by line number, so you'll hit the break in the transform. Leaving you needing to reformat your source code and recompile just to set a reasonable breakpoint.
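For illustration, here is a minimal sketch of the reformatting workaround mentioned above. The function name `doubled` is hypothetical and not from any real codebase. The idea is only that giving the lambda body its own source line gives a line-number breakpoint somewhere to land that is not shared with the `std::transform` call.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical example function; the name is an assumption for illustration.
std::vector<int> doubled(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    // One-line form, where the call and the lambda body share a source line:
    //   std::transform(in.begin(), in.end(), out.begin(), [](int x) { return x * 2; });
    // Reformatted so the lambda body sits on a line of its own:
    std::transform(in.begin(), in.end(), out.begin(),
                   [](int x) {
                       return x * 2;  // a breakpoint on this line is inside the lambda
                   });
    return out;
}
```

Calling `doubled({1, 2, 3})` yields a vector holding 2, 4, 6; the behavior is identical to the one-line form, only the layout changes.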

This particular problem could be solved if the debugger let you say "skip through some namespace (like std) but stop if it calls something else", but C++ debugging has all sorts of nonsense like this. Honestly, I stopped using debuggers with C++ over a decade ago. It's just not worth fighting with it.

I think the people who need to think about debugging ergonomics aren't the standards committee or the library authors. It's whoever is writing the next debugger. GDB is great for C but it really doesn't map to C++ well at all.

mark_undoio · on March 9, 2023

GDB has the "skip" command to help with this but it has limitations.

It stops you stepping into namespaces you don't want to see but doesn't do anything when you step out into them.

It also doesn't handle when code you don't want to see calls back into code you do - and, yes, I've recommended formatting lambdas to allow breakpoints before.

I believe DWARF does support column information these days, so it should actually be possible to solve the one-line lambda problem with some work in GDB.

For looking at any data in C++ it's also very important to have GDB's pretty printers set up (and, likely, write some of your own).

https://sourceware.org/gdb/onlinedocs/gdb/Pretty_002dPrinter...

https://undo.io/resources/gdb-watchpoint/here-quick-way-pret... (this one is from my boss)

Joker_vD · on March 10, 2023

The Visual Studio debugger has always supported "Just My Code" for .NET applications, and I believe they recently added something to this effect for C/C++ as well, although not nearly as flawless, of course.

ddulaney · on March 9, 2023

My solution in this case was to convince my company to buy CLion, which works reasonably well and which I believe is backed by LLDB.

stonemetal12 · on March 9, 2023

They should make a way for typedefs to trickle down to debug symbols. It would make error messages better reflect the code and offer a solution to template type bloat in debug symbols.

yjftsjthsd-h · on March 9, 2023

> over a hundred lines of console each to print just the function name

At what point is it better to just rename everything to use the sha256 of the "real" name? It's obviously only for machine use anyways, so it's not like one more layer of indirection would hurt.

ddulaney · on March 12, 2023

Heh, you’re not wrong.

It was handy to see the outer couple of layers of template expansion, though, so I could know at least which library it came from. I’d be very interested in some way to adjust the displayed format to be different from the “real” type name (which is of course different from the machine-readable mangled name). It might be better to put that functionality in the debugging tool.

jcranmer · on March 9, 2023

constexpr evaluation helps a lot with that, since you can get a lot of the Turing-complete nature without having to go crazy with nested templates to get what you need.
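As a rough sketch of the difference (both names below are made up for illustration, and the loop in the `constexpr` function assumes C++14 or later): the template version needs one class instantiation per step, each with its own type name, while the `constexpr` version is a single ordinary function. How much this shrinks debug info in practice depends on the compiler and the code.

```cpp
#include <cstdint>

// Template metaprogramming style: FactorialTmpl<10> instantiates
// FactorialTmpl<9>, FactorialTmpl<8>, ... down to FactorialTmpl<0>.
template <std::uint64_t N>
struct FactorialTmpl {
    static constexpr std::uint64_t value = N * FactorialTmpl<N - 1>::value;
};

template <>
struct FactorialTmpl<0> {
    static constexpr std::uint64_t value = 1;
};

// constexpr style: one plain function, usable at compile time and at run time.
constexpr std::uint64_t factorial(std::uint64_t n) {
    std::uint64_t result = 1;
    for (std::uint64_t i = 2; i <= n; ++i) {
        result *= i;
    }
    return result;
}

// Both forms are evaluated by the compiler here.
static_assert(FactorialTmpl<10>::value == 3628800, "template version");
static_assert(factorial(10) == 3628800, "constexpr version");
```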

Joker_vD · on March 9, 2023

While I truly appreciate the effort... what the hell exactly goes into the Chromium's PDB file that it barely fits on a DVD-ROM? I am willing to bet that the executable and the source code combined together would take less space, so what is even in there?

nwallin · on March 9, 2023

> what the hell exactly goes into the Chromium's PDB file that it barely fits on a DVD-ROM?

Debug symbols in general tend to be very big. You need to be able to map each instruction in the binary to a line of code. You can expect a compiler to break each line of code into several instructions, and you can also expect that the instructions are out of order and interleaved with neighboring lines of code. So each instruction will need its own unique entry.
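To make that concrete, here is a simplified sketch of an address-to-line table. It is not how DWARF or PDB actually encode this information (the real formats are more compact), and `LineEntry` and `line_for_address` are hypothetical names. It only shows why the row count tracks the number of instruction ranges rather than the number of source lines.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <vector>

// One row per contiguous run of instructions that came from the same line.
struct LineEntry {
    std::uint64_t address;  // first instruction address covered by this row
    std::uint32_t line;     // source line that produced those instructions
};

// Rows must be sorted by address. Returns the source line for pc,
// or 0 if pc lies before the first row.
std::uint32_t line_for_address(const std::vector<LineEntry>& table,
                               std::uint64_t pc) {
    // Find the first row that starts after pc; the row before it covers pc.
    auto it = std::upper_bound(
        table.begin(), table.end(), pc,
        [](std::uint64_t addr, const LineEntry& e) { return addr < e.address; });
    if (it == table.begin()) {
        return 0;
    }
    return std::prev(it)->line;
}
```

With interleaved code, a table such as `{0x1000, 10}, {0x1004, 12}, {0x1008, 10}` needs two separate rows for line 10, which is where much of the bulk comes from.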

> I am willing to bet that the executable and the source code combined together would take less space,

     ~ $ ll -h /var/cache/distfiles/chromium-110.0.5481.177.tar.xz 
    -rw-rw-r-- 1 portage portage 1.6G Feb 22 11:02 /var/cache/distfiles/chromium-110.0.5481.177.tar.xz
     ~ $ tar -xaf /var/cache/distfiles/chromium-110.0.5481.177.tar.xz
     ~ $ du -sh chromium-110.0.5481.177/
    13G chromium-110.0.5481.177/

LanternLight83 · on March 9, 2023

Love the use of distfiles for an on-the-spot demonstration, makes me miss Gentoo. I suppose functional package-manager users have similar files in their stores. Thanks for sharing.

Joker_vD · on March 9, 2023

Excuse me? On Windows, "C:\Program Files (x86)\Google\Chrome" takes only 603 MiB, and it contains a 295 MiB file called "chrome.7z" inside an "Installer" subdirectory. What does the Linux distribution contain that takes up the rest of the space?

nwallin · on March 9, 2023

That's not the Linux distribution, it's the source code. Straight from Google.

https://commondatastorage.googleapis.com/chromium-browser-of...

The compiled binary is much smaller:

     ~ $ du -h /usr/lib64/chromium-browser
    16K /usr/lib64/chromium-browser/MEIPreload
    360K /usr/lib64/chromium-browser/locales
    383M /usr/lib64/chromium-browser

Joker_vD · on March 9, 2023

Huh, so the size comparison actually goes "compiled binary" < "debug symbols" < "source code", interesting to know, thanks.

ddulaney · on March 9, 2023

Debug symbols are often larger than the original source code though. The main culprit is C++ template expansions, which can get crazy long.

the_mitsuhiko · on March 9, 2023

It's not just Chrome. We (Sentry) handle a lot of native crashes and the amount of really large debug symbols went up in recent years. C++ template instantiations are a huge reason for it.

glandium · on March 10, 2023

Rust generics are good at it too.

the_mitsuhiko · on March 10, 2023

Absolutely, and in some ways even worse. However, Rust is a rather small player on our platform compared with C and C++.

kg · on March 9, 2023

Web browsers contain a staggering amount of code, often linked into one huge binary so that LTCG can happen. They are basically an OS kernel+drivers+user space bundled into one.

planede · on March 9, 2023

If I had to guess, symbols for a lot of template instantiations.

helloooooooo · on March 9, 2023

Because Chrome is a) a project with a lot of code, b) a lot of the code is templates, and c) Chrome provides full debug symbols, which include all symbol names, type information and source code line numbers.

brucedawson · on March 9, 2023

And local variables, including tracking where they are stored at various points in a function

xxpor · on March 9, 2023

I don't know literally a single thing about the PDB format, but assuming it is at least somewhat similar to DWARF, it's REALLY easy to get an explosion of symbols in there, especially if you're looking at individual object files.

A clean checkout of chromium is 5 GB (excluding the .git folder) by itself, the symbols will of course be larger.

Analemma_ · on March 9, 2023

C++ templates can result in huge symbol names, especially if you're using them for complex metaprogramming, which I assume Chrome is (especially for stuff like the JavaScript engine). But as another commenter alluded to, constexpr might help with this a little - if people choose to use it.

zwieback · on March 9, 2023

Is this due to C++ name mangling? I just had a flashback to the early 90s when I ran into that problem with Watcom C++ and Windows symbol loading. We had to resort to rewriting the exports and imports to shorter names just to be able to debug. Bad memories!

zetafunction · on March 9, 2023

Templated code can lead to some really long symbol names. As a random tangent, I was trying to figure out why Chrome's stack symbolizer wasn't working for some stack frames this week… and came across this comment in the symbolizer.

  char demangled[256];  // Big enough for sane demangled symbols.

Turns out the longest symbol name in the Chrome binary is 32k characters long... and some of the tests have even longer ones (the longest one I found was 98k characters).
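For comparison, a minimal sketch of demangling without a fixed-size buffer, assuming GCC or Clang with the Itanium C++ ABI (`abi::__cxa_demangle` from `<cxxabi.h>`). This is not Chrome's symbolizer, which may have reasons for avoiding `malloc` in its context. The helper names `demangle`, `Nest` and `nested_name` are hypothetical.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang, Itanium ABI only)
#include <cstdlib>
#include <string>
#include <typeinfo>
#include <utility>

// Returns the demangled name, or the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates one with malloc
    // that is large enough for the whole name.
    char* raw = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || raw == nullptr) {
        return std::string(mangled);
    }
    std::string result(raw);
    std::free(raw);  // the caller owns the malloc'ed buffer
    return result;
}

// A type whose printed name roughly doubles in length with each level.
template <int Depth>
struct Nest {
    using type = std::pair<typename Nest<Depth - 1>::type,
                           typename Nest<Depth - 1>::type>;
};

template <>
struct Nest<0> {
    using type = int;
};

// Demangled name of an eight-level nested pair type.
std::string nested_name() {
    return demangle(typeid(Nest<8>::type).name());
}
```

Even this toy type produces a demangled name well past 256 characters, so a fixed `char[256]` buffer would not hold it.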

vintagedave · on March 9, 2023

Crikey. We (C++Builder) were looking at some Boost debugging issues a couple of years ago and realised some symbols were a couple of thousand characters long. I thought _that_ was bad!

terrelln · on March 9, 2023

I've seen a multi-MB symbol coming from generated code.

cokernel_hacker · on March 10, 2023

Fun fact: the MSVC C++ ABI gives up if the mangled name is >= 4096 characters, it just replaces the symbol with md5(mangled name): https://github.com/llvm/llvm-project/blob/d32f71a91a432db2d9...