开发者

Irreproducible runtime errors - general approach?

I'm facing a prob开发者_StackOverflow中文版lem that is so mysterious, that I don't even know how to formulate this question... I cannot even post any piece of code.

I develop a big project on my own, started from scratch. It's nearly release time, but I can't get rid of some annoying error. My program writes an output file from time to time and during that I get either:

  • std::string out_of_range error
  • std::string length_error
  • just lots of nonsense on output

Worth noting that those errors appear very rarely and can never be reproduced, even with the same input. Memcheck shows no memory violation, even on runs where errors were previously noted. Cppcheck has no complains as well. I use STL and pthreads intensively, but without the latter one errors also happen.

I tried both newest g++ and icpc. I am running on some version of Ubuntu, but I don't believe that's the reason.

I would appreciate any help from you, guys, on how to tackle such problems. Thanks in advance.


Enable coredumps (ulimit -c or setrlimit()), get a core and start gdb'ing. Or, if you can, make a setup where you always run under gdb, so that when the error eventually happen you have some information available.


The symptoms hint at a memory corruption.

If I had to guess, I'd say that something is corrupting the internal state of the std::string object that you're writing out. Does the string object live on the stack? Have you eliminated stack smashing as a possible cause (that wouldn't be detectable by valgrind)?

I would also suggest running your executable under a debugger, set up in such a way that it would trigger a breakpoint whenever the problem happens. This would allow you to examine the state of your process at that point, which might be helpful in figuring out what's going on.


gdb and valgrind are very useful tools for debugging errors like this. valgrind is especially powerful for identifying memory access problems and memory leaks.


I encountered strange optimization bugs in gcc (like a ++i being assembled to i++ in rare circumstances). You could try declaring some critical variables volatile but if valgrind doesn't find anything, chances are low. And of course it's like shooting in the dark...

If you can at least detect that something is wrong in a certain run from inside the program, like detecting nonsensical output, you could then call an empty "gotNonsense()" function that you can break into with gdb.


If you cannot determine where exactly in the code does your program crash, one way to find that place would be using a debug output. Debug output is good way of debugging bugs that cannot be reproduced, because you will get more information about the bug the next time it happens, without the need to actively reproduce it. I recommend using some logging lib for that, boost provides one, for example.


You are using STL intensively, so you can try to run your program with libstdc++ in debug mode. It will do extra checks on iterators, containers and algorithms. To use the libstdc++ debug mode, compile your application with the compiler flag -D_GLIBCXX_DEBUG

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜