Multithreading, Multiprocessing with STOP and Continue Signals

Posted: 2023-01-14 18:29

I am working on a project where I need to get the native stack of the Java application. I am able to achieve this partially thanks to ptrace, multiprocessing, and signals.

On Linux, a normal Java application has at least 14 threads. Of these 14, I am interested only in the main thread, whose native stack I need to capture. To that end, I start a separate process with fork() that monitors the native stack of the main thread. In short, there are two processes: one being monitored, and one that does the monitoring using ptrace and signal handling.

Steps in the monitoring process:

Get the main thread ID out of the 14 threads from the monitored process.
ptrace_attach on the main ID.
ptrace_cont on the main ID.

continuous loop starts

{

kill(main_ID, SIGSTOP)
nanosleep and check the status from the /proc/[pid]/stat file.
ptrace_peekdata to read the stack and navigate.
ptrace_cont on the main ID.
nanosleep and check the status from the /proc/[pid]/stat file.

}

ptrace_detach on the main ID.

This perfectly gives the native stack information continuously. However, sometimes I encounter an issue:

When I call kill(main_ID, SIGSTOP) on the main thread, the other threads of the process sometimes end up in the stopped state (T) and the entire process blocks. This does not happen consistently; sometimes the entire process runs correctly. I cannot understand this behavior, as I am only signaling the main thread. Why are the other threads affected?

Can someone help me analyze this problem?

I also tried sending SIGCONT and SIGSTOP to all of the threads of the process but the issue still occurs sometimes.

Thanks, Sandeep

Assuming you are using Linux, you should be using tkill(2) or tgkill(2) instead of kill(2). On FreeBSD, you should use the SYS_thr_kill2 syscall. Per the tkill(2) manpage:

tgkill() sends the signal sig to the thread with the thread ID tid in the thread group tgid. (By contrast, kill(2) can only be used to send a signal to a process (i.e., thread group) as a whole, and the signal will be delivered to an arbitrary thread within that process.)

Ignore the notes about tkill(2) and friends being intended for internal thread library use; debuggers and tracers commonly use them to send signals to specific threads.

Also, you should use waitpid(2) (or some variation of it) to wait for the thread to receive the SIGSTOP instead of polling on /proc/[pid]/stat. This approach will be more efficient and more responsive.
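
As a rough illustration of the two suggestions above, here is a minimal Linux-only sketch in C. It is not the asker's code: every helper name (send_signal_to_thread, wait_for_stop, attach_thread, stop_thread, resume_thread, peek_word, demo_sample_child) and the demo_marker variable are made up for this example. It assumes the tracer is allowed to ptrace the target, and it demonstrates the calls on a single-threaded forked child, where the thread ID equals the process ID. For a Java process, tgid would be the process ID and tid the ID of the main thread. Because kill(2) targets the process as a whole, a SIGSTOP sent that way can be picked up by an untraced thread and stop the entire thread group, which may well be what the question describes; tgkill directs it at the traced thread only.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Thread-directed signal. Uses the raw syscall because glibc only
 * ships a tgkill() wrapper from version 2.30 onward. */
static int send_signal_to_thread(pid_t tgid, pid_t tid, int sig)
{
    return (int)syscall(SYS_tgkill, tgid, tid, sig);
}

/* Block until `tid` stops. Returns the stop signal, or -1 on error
 * or if the thread exited instead of stopping. */
static int wait_for_stop(pid_t tid)
{
    int status;
    pid_t r;
    do {
        /* __WALL: also report threads that are not the group leader */
        r = waitpid(tid, &status, __WALL);
    } while (r == -1 && errno == EINTR);
    if (r != tid || !WIFSTOPPED(status))
        return -1;
    return WSTOPSIG(status);
}

/* Attach to one thread, consume the stop caused by the attach, and let
 * it run again. Simplification: assumes that first stop is the SIGSTOP
 * generated by PTRACE_ATTACH. */
static int attach_thread(pid_t tid)
{
    if (ptrace(PTRACE_ATTACH, tid, NULL, NULL) == -1)
        return -1;
    if (wait_for_stop(tid) == -1)
        return -1;
    return ptrace(PTRACE_CONT, tid, NULL, NULL) == -1 ? -1 : 0;
}

/* Stop only `tid`. Signals other than SIGSTOP that arrive first are
 * forwarded to the thread unchanged. */
static int stop_thread(pid_t tgid, pid_t tid)
{
    if (send_signal_to_thread(tgid, tid, SIGSTOP) == -1)
        return -1;
    for (;;) {
        int sig = wait_for_stop(tid);
        if (sig == -1 || sig == SIGSTOP)
            return sig == SIGSTOP ? 0 : -1;
        if (ptrace(PTRACE_CONT, tid, NULL, (void *)(long)sig) == -1)
            return -1;
    }
}

/* Resume with signal 0: the SIGSTOP is discarded rather than delivered,
 * so it does not turn into a stop of the whole process. */
static int resume_thread(pid_t tid)
{
    return ptrace(PTRACE_CONT, tid, NULL, NULL) == -1 ? -1 : 0;
}

/* Read one word from the stopped thread's address space. PEEKDATA
 * returns the data itself, so errno is the only reliable error signal. */
static int peek_word(pid_t tid, void *addr, long *out)
{
    errno = 0;
    long word = ptrace(PTRACE_PEEKDATA, tid, addr, NULL);
    if (word == -1 && errno != 0)
        return -1;
    *out = word;
    return 0;
}

static long demo_marker = 0x5AFE; /* value the demo reads back from the child */

/* Demo: fork an idle child, read one word from it while it is stopped,
 * then clean up. Returns the word read, or -1 on any failure. */
static long demo_sample_child(void)
{
    long word = -1;
    pid_t child = fork();
    if (child == -1)
        return -1;
    if (child == 0) {
        for (;;)
            pause(); /* child idles until it is killed */
    }
    if (attach_thread(child) == 0 && stop_thread(child, child) == 0) {
        if (peek_word(child, &demo_marker, &word) != 0)
            word = -1;
        resume_thread(child);
    }
    kill(child, SIGKILL); /* demo cleanup; a real tracer would PTRACE_DETACH */
    waitpid(child, NULL, 0);
    return word;
}
```

In a real sampler, the stop_thread / peek_word / resume_thread sequence would replace the kill, nanosleep, and /proc polling steps of the loop, with peek_word called repeatedly to walk the stack. The sketch ignores ptrace event stops and threads that exit mid-sample, both of which a robust tool would need to handle.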

Finally, it appears that you are doing some form of stack sampling. You may want to look at Google PerfTools, which includes a CPU sampler that uses stack sampling to estimate which functions consume the most CPU time. You may be able to reuse the work these tools have already done, since stack sampling can be tricky to make robust.
