
A running bash script is hung somewhere. Can I find out what line it is on?

E.g. does the bash debugger support attaching to existing processes and examining the current state?

Or can I easily find out by looking at the bash process entries in /proc? Is there a convenient tool to give line numbers in active files?

I don't want to have to kill and restart the process.

This is on Linux - Ubuntu 10.04.


I recently found myself in a similar position. I had a shell script that was not identifiable through other means (such as arguments, etc.)

There are ways to find out a lot more about a running process than you would expect.

Use lsof -p $pid to see which files are open, which may give you some clues. Note that a file can be "deleted" and still be held open by the script. As long as the script doesn't close the file, it can still read from it and write to it, and the file still takes up space on the file system.
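
If lsof is not installed, roughly the same information can be read from /proc/$pid/fd, where each entry is a symlink to the open file. The following is a minimal sketch; the function name list_fds is made up for illustration, and reading another user's process normally requires root.

```shell
#!/bin/bash
# List the open file descriptors of a PID and where each one points.
# A target ending in "(deleted)" is a file that was unlinked but is still open.
list_fds() {
    local pid="$1" fd
    for fd in /proc/"$pid"/fd/*; do
        # Skip the unexpanded glob if the directory is empty or unreadable.
        [ -L "$fd" ] || continue
        printf '%s -> %s\n' "${fd##*/}" "$(readlink "$fd")"
    done
}
```

Calling list_fds "$pid" prints one line per descriptor, so piping it through grep '(deleted)' picks out the unlinked files.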

Use strace to trace the system calls the script makes while it runs. Bash reads the script file as it goes, so you can see some of the commands as they are read, just before they are executed. Look for read system calls in the output of this command:

strace -p $pid -s 1024

The -s 1024 option makes strace print strings up to 1024 characters long (by default, strace truncates strings after 32 characters).

Examine the directory /proc/$pid to see details about the script. In particular, /proc/$pid/environ gives you the process environment, with entries separated by null bytes. To read this "file" properly, use this command:

tr '\0' '\n' < /proc/$pid/environ

You can pipe that into less or save it in a file. There is also /proc/$pid/cmdline, but it may give you only the shell name (-bash, for instance).
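
The /proc directory can also give a rough answer to the original line-number question. Bash keeps the script file open (commonly on file descriptor 255) and, before starting an external command, sets the read offset to the end of the last command it parsed. The sketch below relies on that behaviour and is an estimate only: the function name script_line is hypothetical, the script path has to be supplied by hand, and inside a loop or function the result points at the end of the whole compound command rather than the line inside it.

```shell
#!/bin/bash
# Estimate which line a running bash script has reached, using the read
# offset of the descriptor bash holds open on the script file.
# Arguments: PID of the bash process, path to the script it is running.
script_line() {
    local pid="$1" script="$2" fd pos target
    target=$(readlink -f "$script")
    for fd in /proc/"$pid"/fd/*; do
        # Find the descriptor that points at the script (commonly fd 255).
        if [ "$(readlink "$fd")" = "$target" ]; then
            pos=$(awk '/^pos:/ {print $2}' "/proc/$pid/fdinfo/${fd##*/}")
            # Count the newlines in the bytes bash has consumed so far.
            head -c "$pos" "$script" | wc -l
            return 0
        fi
    done
    return 1
}
```

For example, script_line 1234 /path/to/script.sh prints a line number, where both the PID and the path are placeholders. If the script file was edited after the process started, the number no longer matches what bash actually read.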


No real solution, but in most cases a script is waiting for a child process to terminate. List its children (pidof needs -x to match a script that is run by a shell):

ps --ppid "$(pidof -x yourscript)"

You could also set up signal handlers in your shell script to toggle the printing of commands:

#!/bin/bash

trap "set -x" SIGUSR1
trap "set +x" SIGUSR2

while true; do
    sleep 1
done

Then turn tracing on and off with:

kill -USR1 "$(pidof -x yourscript)"
kill -USR2 "$(pidof -x yourscript)"
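
To see what the toggled trace looks like, here is a small self-contained demonstration. It is only a sketch: the function name trace_demo, the temporary log file and the sleep intervals are choices made for this example. Bear in mind that bash runs a trap only after the current foreground command finishes, so a script blocked inside a long-running child starts tracing once that child returns.

```shell
#!/bin/bash
# Start a looping worker with the two traps installed, switch tracing on
# and off again, and print the trace that bash wrote to stderr.
trace_demo() {
    local log pid
    log=$(mktemp)
    bash -c '
        trap "set -x" USR1
        trap "set +x" USR2
        while true; do sleep 0.2; done
    ' 2>"$log" &
    pid=$!
    sleep 0.5              # give the worker time to install its traps
    kill -USR1 "$pid"      # tracing on
    sleep 1
    kill -USR2 "$pid"      # tracing off
    sleep 0.5
    kill "$pid"
    wait "$pid" 2>/dev/null || true
    cat "$log"
    rm -f "$log"
}
```

Running trace_demo prints the commands that were traced between the two signals, each prefixed with "+".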


Use pstree to show which Linux command or executable your script is calling. For example, 21156 is the PID of my hanging script:

ocfs2cts1:~ # pstree -pl 21156
activate_discon(21156)───mpirun(15146)─┬─fillup_contig_b(15149)───sudo(15231)───chmod(15232)
                                       ├─ssh(15148)
                                       └─{mpirun}(15147)

So I know it is hanging at the chmod command. Then show the kernel stack trace of that process (reading /proc/<pid>/stack normally requires root):

ocfs2cts1:~ # cat /proc/15232/stack 
[<ffffffffa05377ef>] __ocfs2_cluster_lock.isra.39+0x1bf/0x620 [ocfs2]
[<ffffffffa053856d>] ocfs2_inode_lock_full_nested+0x12d/0x840 [ocfs2]
[<ffffffffa0538dbb>] ocfs2_inode_lock_atime+0xcb/0x170 [ocfs2]
[<ffffffffa0531e61>] ocfs2_readdir+0x41/0x1b0 [ocfs2]
[<ffffffff8120d03c>] iterate_dir+0x9c/0x110
[<ffffffff8120d453>] SyS_getdents+0x83/0xf0
[<ffffffff815e126e>] entry_SYSCALL_64_fastpath+0x12/0x6d
[<ffffffffffffffff>] 0xffffffffffffffff

Oh, boy, it's likely a deadlock bug...
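
The same descent can be scripted when pstree is not available. The sketch below walks the process tree with pgrep -P and prints each process's name, state and wait channel from /proc. The function name walk_tree is made up for this example, and on some kernels wchan shows only 0 for processes you are not allowed to inspect.

```shell
#!/bin/bash
# Recursively print a PID and all of its descendants with the command
# name, the process state and the kernel wait channel of each.
walk_tree() {
    local pid="$1" indent="${2:-}" child
    printf '%s%s %s state=%s wchan=%s\n' "$indent" "$pid" \
        "$(cat "/proc/$pid/comm" 2>/dev/null)" \
        "$(awk '/^State:/ {print $2}' "/proc/$pid/status" 2>/dev/null)" \
        "$(cat "/proc/$pid/wchan" 2>/dev/null)"
    # pgrep -P lists the direct children of the given parent PID.
    for child in $(pgrep -P "$pid"); do
        walk_tree "$child" "$indent  "
    done
}
```

With the example above, walk_tree 21156 would end at the chmod process. A state of D (uninterruptible sleep) on the last process is a strong hint that it is stuck inside the kernel.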


I've combined several answers from this thread.

  • First, determine the process ID. Say script_which_hangs.sh has hung; simply filter the ps output by the name of the script or process:
> ps -ef | grep script_which_hangs
sos      1464260 1349476  0 14:08 ?        00:00:00 bash /repo/scripts/script_which_hangs.sh
sos      1464652 1316191  0 14:08 pts/4    00:00:00 grep --color=auto script_which_hangs
  • Then take the PID, 1464260, from the output above and run:
DPID=1464260; pstree -pal ${DPID}; lsof -p ${DPID}; pstree -pl ${DPID}
  • The first pstree shows the tree including all arguments (the output is quite long),
  • then lsof shows the open files,
  • and the last pstree prints the tree without arguments, so it is easier to read for a quick overview of the situation.

Then study the output, and run the command again a few times to see whether anything changes. It could probably be extended to report current CPU and memory usage, but I have not looked for a command that does that.

pstree also has an -s option for including parent processes, but it is not very useful here.
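
For the CPU and memory figures mentioned above, ps can print them directly. This is a minimal sketch assuming the procps version of ps that Linux distributions ship; the function name snapshot is made up for illustration.

```shell
#!/bin/bash
# Print one line for the given PID and one for each of its direct children:
# state, CPU and memory share, elapsed time and full command line.
snapshot() {
    local pid="$1"
    # Selection options in procps ps are additive, so -p and --ppid combine.
    ps -o pid,ppid,stat,%cpu,%mem,etime,args -p "$pid" --ppid "$pid"
}
```

A loop such as while sleep 2; do snapshot 1464260; done repeats the snapshot every two seconds. If the CPU column stays at zero and the state stays the same, the process is more likely blocked than busy.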
