Troubleshooting SIGTERMs with tee on a cluster within SGE jobs

I have some legacy scientific code running on a Rocks cluster with SGE. An application-specific job submission script generates the qsub scripts (i.e. the scripts that Sun Grid Engine takes and runs).

Within the qsub script, my legacy app is called. This app sends its output to STDOUT. SGE intercepts STDOUT and spools it into a file in the user's home directory, so the user can see results build up in real time. I want this behavior to be maintained, but at the same time, I want to transparently log all output in the background. I figured tee would be perfect to achieve this.

So I modified the job submission script to run the app and pipe STDOUT to tee, which saves STDOUT to a file that is copied to a central store once the job completes. The app is run and piped to tee as follows:

\$GMSCOMMAND | tee \$SCRATCHDIR/gamess_output.log

The problem is, ever since I started piping the app's output to tee, the app has been dying with SIGTERMs, especially when I request several nodes. I tried using the -i (ignore interrupts) option with tee: it makes no difference.

Things work fine if I redirect the app output to a file and then cat the file once the app is done, but then users can't watch results build up in real time (which is an important requirement).

Any ideas about why this use of tee might be failing? Or alternatively, any ideas about how else I might achieve the desired functionality?


I don't know why your particular case is failing, but one option might be to make $GMSCOMMAND do its own logging (effectively putting the tee inside the app). I guess this option depends on the cost of changing the legacy app.

Failing that, you could wrap the legacy app with your own script or application that does the redirection/duplication.
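One way such a wrapper might look, as a rough sketch only: since redirecting the app straight to a file is reported to work, the app below writes directly to the log, and a separate tail process follows that log onto STDOUT for SGE to spool. The function name run_logged is made up for illustration, and the --pid option assumes GNU tail (which a Linux-based Rocks cluster should have). Whether this avoids the SIGTERMs on your cluster is untested.

```shell
#!/bin/bash
# Hypothetical wrapper: run_logged LOGFILE COMMAND [ARGS...]
# The command writes straight to LOGFILE (no pipe on its stdout),
# while GNU tail copies the growing file to our stdout.
run_logged() {
    local logfile=$1
    shift

    # Create the log first so tail has something to open.
    : > "$logfile"

    # Run the app in the background with stdout going to the log.
    "$@" > "$logfile" &
    local app_pid=$!

    # Follow the log from its first line; --pid (GNU tail) makes tail
    # exit on its own once the app process no longer exists.
    tail -n +1 -f --pid="$app_pid" "$logfile" &
    local tail_pid=$!

    # Collect the app's exit status, then let tail drain and finish.
    wait "$app_pid"
    local rc=$?
    wait "$tail_pid"
    return "$rc"
}
```

In the job script this would be called as something like: run_logged "$SCRATCHDIR/gamess_output.log" $GMSCOMMAND. A side benefit is that the app's own exit status is returned, so the job script can still tell whether the run failed.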


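If pipes are your problem, perhaps you can get around this by using a while/read loop with process substitution (this needs bash, not plain sh). Does this work for you?

# IFS= and -r keep leading whitespace and backslashes in each line intact
while IFS= read -r line; do
    printf '%s\n' "$line"
    printf '%s\n' "$line" >> "${SCRATCHDIR}/gamess_output.log"
done < <(${GMSCOMMAND})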