
What are best practices for finding a bug in a C program that only shows up in an optimized build?

My program uses a third-party library that crashes with a segmentation fault at some point. I tried compiling the library with debug symbols and without compiler optimization, and the crash went away. My suspicion is that compiler optimizations revealed this bug. What are best practices for debugging cases like this?

EDIT - (corrected the statement above: "revealed" instead of "caused")

I think I was misunderstood. I did not intend to blame the compiler or anything like that. I only asked for best practices for finding a bug in a situation where I don't have debug symbols for the 3rd party library (the crash backtrace leads into the 3rd party library).


What you describe is quite common, and it's almost never a bug in the compiler's optimizer. Optimization changes a lot about how your code is laid out: variables get reordered, kept in registers, or optimized away entirely. If you have a buffer overflow, it might overwrite memory that happens not to matter in the debug build, while in the optimized build the same overflow lands on memory that is very important.

Use valgrind to track down memory errors - they're almost always the cause of the symptoms you see.


Your suspicion is that optimization caused a bug. My suspicion is that your code has constructs that lead to Undefined Behavior, and when the optimizer is on, this Undefined Behavior manifests itself as erroneous behavior or crash. Don't blame the optimizer. Find UB in your code... might be tricky, though. Possible culprits:

  • Out-of-bounds index
  • Returning the address of a local (temporary) variable
  • A zillion other things
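
As a rough illustration of the second culprit, here is a minimal sketch. The function name make_label and the "item-%d" format are made up for the example. The broken form is shown only in a comment; the compiled form lets the caller own the storage. GCC and Clang usually warn about the broken pattern, so building with warnings enabled is worth trying first.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * BROKEN (comment only): buf lives in the callee's stack frame, so the
 * returned pointer dangles as soon as the function returns. It may appear
 * to work in a debug build and fail in an optimized one.
 *
 *     const char *make_label(int id) {
 *         char buf[32];
 *         snprintf(buf, sizeof buf, "item-%d", id);
 *         return buf;
 *     }
 */

/* Safer variant (hypothetical helper): the caller passes the buffer and
 * its size. Returns 0 on success, -1 if the label did not fit. */
int make_label(char *out, size_t out_size, int id)
{
    assert(out != NULL && out_size > 0);
    int n = snprintf(out, out_size, "item-%d", id);
    /* snprintf returns the length it wanted to write; detect truncation. */
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

The caller declares something like char label[32]; and passes label and sizeof label, so the lifetime of the storage is visible at the call site.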


Compile with both debug symbols and compiler optimization enabled (for example, -g -O2 with GCC); hopefully it will still fail. Allow the system to generate a core file (ulimit -c unlimited, then re-run the program). Load the core file into gdb to see what happened.
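
If changing the shell limit is inconvenient (for example, the program is started by another process), the same effect can probably be had from inside the program. The following is a minimal sketch for POSIX systems; the function name enable_core_dumps is made up for this example, and where the core file ends up still depends on system configuration.

```c
#include <sys/resource.h>

/* Raise the soft core-file size limit to the hard limit, which is roughly
 * what "ulimit -c unlimited" does when the hard limit is unlimited.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;

    lim.rlim_cur = lim.rlim_max;   /* soft limit may always go up to the hard limit */
    return setrlimit(RLIMIT_CORE, &lim);
}
```

Call it once at the top of main, reproduce the crash, and then open the result with gdb ./program core.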

Another powerful tool is valgrind. Run your program under valgrind with the option --db-attach=yes, and it will stop and start the debugger as soon as it detects an invalid read or write. (Valgrind 3.11 and later removed --db-attach; there, use --vgdb-error=1 and connect from gdb through vgdb instead.) Invalid reads and writes are likely to provoke a segfault, and even if they don't, they should be fixed anyway.

Good luck.


Keep putting debug statements or messageboxes in the place you think the code is crashing. The crash will occur between two messageboxes and this will help you locate the faulty code as long as the code wasn't changed too much.
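
A minimal sketch of such a debug statement follows. The names checkpoint_at and CHECKPOINT are made up for this example. It writes to stderr rather than stdout, because buffered stdout output can be lost when the process crashes, which would make the last visible message misleading.

```c
#include <stdio.h>

/* Print a checkpoint line to stderr. stderr is normally unbuffered; the
 * explicit fflush is a cheap safeguard in case buffering was changed with
 * setvbuf. Returns the number of characters written, negative on error. */
int checkpoint_at(const char *file, int line, const char *func, const char *msg)
{
    int n = fprintf(stderr, "[checkpoint] %s:%d (%s): %s\n", file, line, func, msg);
    fflush(stderr);
    return n;
}

/* Convenience macro that records where the checkpoint was placed. */
#define CHECKPOINT(msg) checkpoint_at(__FILE__, __LINE__, __func__, (msg))
```

Put CHECKPOINT("before library call") and CHECKPOINT("after library call") around the suspect code. Keep in mind that adding calls like this can change the optimized code enough to move or hide the crash.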

Also, comment out blocks of code until the crash stops happening. Then comment them back in, one at a time, until the crash returns. Whatever you last commented back in must be causing the crash, directly or indirectly.

Both of these methods are useful for general debugging and half your work is already done if you are able to reliably reproduce the crash.

I did not give specific advice for debugging compiler optimisations because it's highly unlikely the crash is caused by that. The optimisations are generally tested very robustly to ensure they do not change the function or semantics of the code in any way.


If the backtrace leads to the third-party library, use gdb to break before the library call. Verify that the parameters you're passing to the library are valid (i.e., aren't uninitialized pointers, aren't pointers to free'd memory, aren't out of range, etc.)
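
The same checks can also be made permanent with a thin wrapper around the library call. This is only a sketch: lib_process, its signature, and the LIB_MAX_LEN limit are hypothetical stand-ins, since the real library is not named here. Plain C cannot tell whether a non-NULL pointer refers to freed memory, so that part still needs gdb, valgrind, or a sanitizer.

```c
#include <stddef.h>
#include <stdio.h>

#define LIB_MAX_LEN 4096u   /* assumed upper bound, not taken from any real library */

/* Hypothetical stand-in for the third-party function. */
int lib_process(const unsigned char *data, size_t len)
{
    (void)data;
    (void)len;
    return 0;
}

/* Validate arguments before handing them to the library.
 * Returns -1 without calling the library if an argument looks wrong. */
int checked_lib_process(const unsigned char *data, size_t len)
{
    if (data == NULL) {
        fprintf(stderr, "checked_lib_process: data is NULL\n");
        return -1;
    }
    if (len == 0 || len > LIB_MAX_LEN) {
        fprintf(stderr, "checked_lib_process: len %zu out of range\n", len);
        return -1;
    }
    return lib_process(data, len);
}
```

A breakpoint on the early-return lines then catches bad arguments before the crash inside the library happens.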

Can you use strace to trace the system calls (or ltrace to trace library calls) and then try to determine the execution path in the third-party library? Put a printf, which shows up as a write system call, or some other recognizable system call just before the failing library call so you have a starting point in the strace output.

If you really think it's a bug in the third-party library, you'll have to compile it with optimizations on so you can reproduce the failure. Are you saying that your compiler can only include debug symbols for non-optimized builds? gdb should still work for optimized builds.


Well, going through the compiled binary isn't going to help.

So that leaves going through your code to find out what part is causing the segfault. I would just work through your code manually and start commenting things out. Once you find what's causing the error, then you can determine what to do with it. It might be worth adding printfs in select locations to see exactly where the program fails.

Think of it as doing a binary search for the error ;)


If it only blows up when you turn on optimization, then that's a strong hint you've invoked undefined behavior somewhere. Unfortunately, that UB may be nowhere near the code that actually generated the segfault (as I've discovered several times in the past).
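
One commonly cited illustration of optimization-dependent UB is a signed overflow check written after the fact. The sketch below is a generic example, not taken from the code in question; the function names are made up. It shows the risky form in a comment and a well-defined alternative that decides before adding. With GCC or Clang, building with -fsanitize=undefined is one way to have this class of problem reported at run time.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * UB-prone pattern (comment only):
 *
 *     if (a + 1 < a) { ...handle overflow... }
 *
 * Signed overflow is undefined, so an optimizer is allowed to assume the
 * condition is never true and remove the branch. An unoptimized build may
 * happen to wrap around and take the branch, hiding the problem.
 */

/* Well-defined check: decide whether a + b would overflow before adding. */
bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    if (b < 0)
        return a < INT_MIN - b;
    return false;
}

/* Adds only when the result is representable; returns false otherwise. */
bool checked_add(int a, int b, int *result)
{
    if (add_would_overflow(a, b))
        return false;
    *result = a + b;
    return true;
}
```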

Every time this has happened to me (which hasn't been that often), the cause was a buffer overflow somewhere else in the code. I never developed a repeatable, generally applicable technique for finding the problem, though (unless you want to call hours stepping through a debugger and swearing a generally applicable technique).
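
One technique that sometimes narrows such an overflow down is to surround a suspect buffer with guard values and check them at intervals. The sketch below assumes you can change the declaration of the suspect buffer; all names are made up. It is a heuristic only: the overflow itself is still undefined behavior, so a clobbered guard proves a problem, while an intact guard proves nothing.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define GUARD_VALUE  0xDEADBEEFu
#define PAYLOAD_SIZE 64

/* A buffer with a known value placed directly before and after it. */
struct guarded_buf {
    uint32_t      head;
    unsigned char data[PAYLOAD_SIZE];
    uint32_t      tail;
};

void guarded_init(struct guarded_buf *g)
{
    g->head = GUARD_VALUE;
    memset(g->data, 0, sizeof g->data);
    g->tail = GUARD_VALUE;
}

/* Returns true while neither guard has been overwritten. */
bool guarded_intact(const struct guarded_buf *g)
{
    return g->head == GUARD_VALUE && g->tail == GUARD_VALUE;
}
```

Calling guarded_intact at a few points between initialization and the crash gives a rough bracket for when the overwrite happened, which is also a natural place to set a gdb watchpoint on the guard.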
