
Why do I have to use .wait() with Python's subprocess module?

I'm running a Perl script through the subprocess module in Python on Linux. The function that runs the script is called several times with variable input.

import subprocess

def script_runner(variable_input):

    out_file = open('out_' + variable_input, 'wt')
    error_file = open('error_' + variable_input, 'wt')

    process = subprocess.Popen(['perl', 'script', 'options'], shell=False,
                           stdout=out_file, stderr=error_file)

However, if I run this function, say, twice, the execution of the first process will stop when the second process starts. I can get my desired behavior by adding

process.wait()

after calling the script, so I'm not really stuck. However, I want to find out why I cannot run the script through subprocess as many times as I want, and have these computations run in parallel, without having to wait for each run to finish before starting the next.

UPDATE

The culprit was not so exciting: the Perl script used a common file that was rewritten on each execution.

However, the lesson I learned from this was that the garbage collector does not kill the process once it is running: letting the Popen object go out of scope had no influence on my script once I had the file conflict sorted out.
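In case it helps anyone else, here is a minimal sketch of how each run can get its own scratch file so that parallel runs no longer clash. The --workfile option is hypothetical: it assumes the Perl script can be told where to write its working file, so adapt it to whatever your script actually accepts.

import subprocess
import tempfile

def script_runner(variable_input):
    out_file = open('out_' + variable_input, 'wt')
    error_file = open('error_' + variable_input, 'wt')
    # Give every run its own working file instead of the shared one.
    scratch = tempfile.NamedTemporaryFile(prefix='scratch_' + variable_input + '_',
                                          delete=False)
    # '--workfile' is made up for this sketch; use whatever option your
    # Perl script actually understands for its working-file path.
    return subprocess.Popen(['perl', 'script', '--workfile', scratch.name],
                            stdout=out_file, stderr=error_file)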


If you are using Unix, and wish to run many processes in the background, you could use subprocess.Popen this way:

x_fork_many.py:

import subprocess
import os
import sys
import time
import random
import gc  # Just to test the hypothesis that garbage collection of p = Popen() is causing the problem.

# This spawns many (3) children in quick succession
# and then reports as each child finishes.
if __name__=='__main__':
    N=3
    if len(sys.argv)>1:
        x=random.randint(1,10)
        print('{p} sleeping for {x} sec'.format(p=os.getpid(),x=x))
        time.sleep(x)
    else:
        for _ in range(N):
            # Re-launch this same script with an extra argument so the child sleeps.
            args = [sys.executable, sys.argv[0], 'sleep']
            p = subprocess.Popen(args)
        gc.collect()
        for i in range(N):
            pid,retval=os.wait()
            print('{p} finished'.format(p=pid))

The output looks something like this:

% x_fork_many.py 
15562 sleeping for 10 sec
15563 sleeping for 5 sec
15564 sleeping for 6 sec
15563 finished
15564 finished
15562 finished

I'm not sure why you are getting the strange behavior when not calling .wait(). However, the script above suggests (at least on Unix) that saving subprocess.Popen(...) processes in a list or set is not necessary. Whatever the problem is, I don't think it has to do with garbage collection.

P.S. Maybe your Perl scripts are conflicting in some way, causing one to end with an error while another is running. Have you tried starting multiple invocations of the Perl script from the command line?


You have to call wait() in order to "wait" for your Popen to finish.

Since Popen runs the Perl script in the background, if you do not wait(), it will be stopped at the end of the "process" object's life... that is, at the end of script_runner.
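A minimal sketch, if you want to check that behaviour on Linux yourself: start a child without keeping the Popen object, let the reference go out of scope, and then test whether the child is still alive ('sleep 10' just stands in for the long-running Perl script):

import os
import subprocess
import time

def start_child():
    p = subprocess.Popen(['sleep', '10'])
    return p.pid  # only the pid escapes; the Popen object is dropped here

pid = start_child()
time.sleep(1)
os.kill(pid, 0)  # raises OSError if the process is already gone
print('child {0} is still alive'.format(pid))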


As said by ericdupo, the task is killed because you overwrite your process variable with a new Popen object, and since there are no more references to your previous Popen object, it is destroyed by the garbage collector. You can prevent this by keeping a reference to your objects somewhere, like a list:

processes = []
def script_runner(variable_input):

    out_file = open('out_' + variable_input, 'wt')
    error_file = open('error_' + variable_input, 'wt')

    process = subprocess.Popen(['perl', 'script', 'options'], shell=False,
                           stdout=out_file, stderr=error_file)
    processes.append(process)

This should be enough to prevent your previous Popen object from being destroyed.


I think you want to do

list_process = []
def script_runner(variable_input):

    out_file = open('out_' + variable_input, 'wt')
    error_file = open('error_' + variable_input, 'wt')

    process = subprocess.Popen(['perl', 'script', 'options'], shell=False,
                           stdout=out_file, stderr=error_file)
    list_process.append(process)
# call script_runner several times here, then wait for all of them:
for process in list_process:
    process.wait()

so your processes will run in parallel.
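A slightly fuller sketch of the same pattern that also closes the output files once every run has finished; the command and the out_/error_ file names mirror the question, and the 'a'/'b'/'c' inputs are only examples:

import subprocess

list_process = []
open_files = []

def script_runner(variable_input):
    out_file = open('out_' + variable_input, 'wt')
    error_file = open('error_' + variable_input, 'wt')
    open_files.extend([out_file, error_file])
    process = subprocess.Popen(['perl', 'script', 'options'],
                               stdout=out_file, stderr=error_file)
    list_process.append(process)

# launch several runs in parallel
for tag in ['a', 'b', 'c']:
    script_runner(tag)

# wait for every run to finish, then release the file handles
for process in list_process:
    process.wait()
for f in open_files:
    f.close()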
