fatal error disappeared when running with gdb
I have a program which produces a fatal error with a testcase, and I can locate the problem by reading the log and the stack trace of the fatal - it turns out that there is a read operation upon a null pointer.
But when I try to attach gdb to it and set a breakpoint around the suspicious code, the null pointer just cannot be observed! The program works smoothly without any error.
This is a single-process, single-thre开发者_开发百科ad program, I didn't experience this kind of thing before. Can anyone give me some comments? Thanks.
Appended: I also tried to call pause() syscall before the fatal-trigger code, and expected to make the program sleep before fatal point and then attach the gdb on it on-the-fly, sadly, no fatal occurred.
It's only guesswork without looking at the code, but debuggers sometimes do this:
- They initialize certain stuff for you
- The timing of the operations is changed
I don't have a quote on GDB, but I do have one on valgrind (granted the two do wildly different things..)
My program crashes normally, but doesn't under Valgrind, or vice versa. What's happening?
When a program runs under Valgrind, its environment is slightly different to when it runs natively. For example, the memory layout is different, and the way that threads are scheduled is different.
Same would go for GDB.
Most of the time this doesn't make any difference, but it can, particularly if your program is buggy.
So the true problem is likely in your program.
There can be several things happening.. The timing of the application can be changed, so if it's a multi threaded application it is possible that you for example first set the ready flag and then copy the data into the buffer, without debugger attached the other thread might access the buffer before the buffer is filled or some pointer is set.
It's could also be possible that some application has anti-debug functionality. Maybe the piece of code is never touched when running inside a debugger.
One way to analyze it is with a core dump. Which you can create by ulimit -c unlimited
then start the application and when the core is dumped you could load it into gdb with gdb ./application ./core
You can find a useful write-up here: http://www.ffnn.nl/pages/articles/linux/gdb-gnu-debugger-intro.php
If it is an invalid read on a pointer, then unpredictable behaviour is possible. Since you already know what is causing the fault, you should get rid of it asap. In general, expect the unexpected when dealing with faulty pointer operations.
精彩评论