Corrupt stack problem in C/C++ program

I am running a C/C++ program on Linux servers to serve videos. The program's core functionality (say it is named Plugin) is to convert videos, and we fork a separate Plugin process for each video request. But I am having a weird problem: sometimes the server load average gets unexpectedly high. What I see from the top command at that point is that some processes have been running for a long time and are using a huge amount of CPU.

When I attach gdb to one of these running processes and take a backtrace, what I find is a corrupt stack: "Previous frame inner to this frame (corrupt stack?)". I searched the net and found that this occurs if the program gets a segmentation fault.

But as far as I know, if the program gets a segmentation fault, it should crash and exit at that point. Surprisingly, the program is still running after the segmentation fault.

What can cause this? I know there must be some big problems in the program, but I just can't work out where to start fixing it. It would be great if any of you could shed some light on this.

Thanks in advance


Attaching the debugger changes the behavior of the process, so you most probably won't get reliable investigation results. A corrupted stack message from the debugger can also mean that the particular debugger does not understand the text info in the binary.

I would recommend running pstack several times in succession on the problematic process (this is known as "Monte Carlo performance profiling"), and also attaching strace or truss to it to check which system calls the process is making while it consumes CPU.


Run your program under Valgrind and fix any invalid memory writes that it finds.
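
As an illustration of the kind of defect this turns up, here is a minimal sketch of a classic off-by-one heap write and its fix. The function name copy_title is hypothetical and not taken from the asker's program. The buggy variant described in the comment is the sort of thing Valgrind's Memcheck tool flags as an invalid write at the point where it happens.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of s,
 * or NULL if allocation fails. The caller must free() the result. */
char *copy_title(const char *s) {
    size_t len = strlen(s);

    /* Buggy variant: malloc(len) leaves no room for the terminating '\0',
     * so the copy below would write one byte past the end of the block.
     * Allocating len + 1 bytes is the fix. */
    char *out = malloc(len + 1);
    if (out == NULL) {
        return NULL;
    }

    /* Copy the characters plus the terminating '\0'. */
    memcpy(out, s, len + 1);
    return out;
}
```

A one-byte overrun like this often does not crash at the faulty line. It silently damages neighbouring memory and shows up much later as unrelated misbehaviour, which is why a tool that reports the write itself is useful.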


Certain optimisations, such as frame pointer omission, can make it harder for the debugger to understand the stack.


If you have the code, compile the program in debug mode and run Valgrind on it.

If you don't have the code, contact the author/provider of the program.

The corrupt stack message simply means the code is doing something weird with memory. It does not mean the program has had a segmentation fault. Also, the program can keep running if it chooses to handle the SIGSEGV signal.
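
To make the last point concrete, here is a minimal sketch of a process that receives SIGSEGV and carries on. It assumes a POSIX system. The names on_segv, recover_point and survive_segfault are made up for this illustration, and raise() stands in for a real invalid memory access so that the example stays well defined.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf recover_point;               /* where the handler jumps back to */
static volatile sig_atomic_t segv_count = 0;   /* number of SIGSEGVs handled */

/* Handler: record the signal and jump back into normal code
 * instead of letting the default action kill the process. */
static void on_segv(int signo) {
    (void)signo;
    segv_count++;
    siglongjmp(recover_point, 1);
}

/* Installs the handler, triggers one SIGSEGV, and returns the number
 * of signals handled (or -1 if the handler could not be installed). */
int survive_segfault(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0) {
        return -1;
    }

    /* sigsetjmp returns 0 on the direct call and 1 after siglongjmp.
     * The second argument (1) makes it save and restore the signal mask. */
    if (sigsetjmp(recover_point, 1) == 0) {
        raise(SIGSEGV);   /* stand-in for a real invalid memory access */
    }
    return (int)segv_count;
}
```

Calling survive_segfault() returns normally with the process still alive. A related case worth checking in the program or any library it links against: if a SIGSEGV handler simply returns after a real invalid access, Linux typically resumes at the faulting instruction, which faults again, so the process can spin at high CPU without ever exiting.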

If by forking you mean that you have some process which spawns and runs other, smaller processes, just monitor for such spikes and restart the process. This assumes that you have no access to the code to fix the program.


There could be some interesting manipulation of the stack taking place at the assembly level, such as true tail-recursion optimization, self-modifying code, or non-returning functions, that leaves the debugger unable to back-trace the stack properly and makes it report a corrupted stack. That doesn't necessarily mean that memory is corrupted, but something non-traditional is definitely happening under the hood.
