Better multithreaded use of Python subprocess.Popen & communicate()?
I'm running multiple commands which may take some time, in parallel, on a Linux machine running Python 2.6.
So I used the subprocess.Popen class and the process.communicate() method to parallelize execution of multiple command groups and capture each group's output all at once after execution.
    def run_commands(commands, print_lock):
        # this part runs in parallel.
        outputs = []
        for command in commands:
            proc = subprocess.Popen(shlex.split(command), stdout=subprocess.PIPE,
                                    stderr=subprocess.STDOUT, close_fds=True)
            output, unused_err = proc.communicate()  # buffers the output
            retcode = proc.poll()  # ensures subprocess termination
            outputs.append(output)
        with print_lock:  # print them at once (synchronized)
            for output in outputs:
                for line in output.splitlines():
                    print(line)
Elsewhere, it is called like this:
    processes = []
    print_lock = Lock()
    for ...:
        commands = ...  # a group of commands is generated, which takes some time.
        processes.append(Thread(target=run_commands, args=(commands, print_lock)))
        processes[-1].start()
    for p in processes:
        p.join()
    print('done.')
The expected result is that the output of each command group is displayed all at once, while the groups themselves execute in parallel.
But starting with the second output group (of course, which thread ends up second varies due to scheduling indeterminism), the output begins to print without newlines, each line is padded with as many spaces as the number of characters printed on the previous line, and input echo is turned off -- the terminal state becomes "garbled" or "crashed". (If I issue the reset shell command, the terminal returns to normal.)
At first I suspected the handling of '\r', but that was not the cause. As you can see in my code, I handle it properly with splitlines(), and I confirmed that by applying repr() to the output.
I think the cause is the concurrent use of pipes in Popen and communicate() for stdout/stderr. I also tried the check_output shortcut method from Python 2.7, with no success. Of course, the problem described above does not occur if I serialize all command executions and prints.
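For reference, here is the kind of variation I have been experimenting with. This is only a sketch: the os.devnull redirection of stdin (to keep the children away from the shared terminal) and the single write per group are my own guesses at a mitigation, not a confirmed fix.

    import os
    import shlex
    import subprocess
    import sys

    def run_commands_sketch(commands, print_lock):
        # Same structure as run_commands above, but the children get
        # os.devnull as stdin so they cannot read from (or reconfigure)
        # the shared terminal, and each group's output is written in a
        # single call while holding the lock.
        outputs = []
        devnull = open(os.devnull, 'rb')
        try:
            for command in commands:
                proc = subprocess.Popen(shlex.split(command),
                                        stdin=devnull,
                                        stdout=subprocess.PIPE,
                                        stderr=subprocess.STDOUT,
                                        close_fds=True)
                output, _ = proc.communicate()  # waits for the child to exit
                outputs.append(output)
        finally:
            devnull.close()
        with print_lock:
            sys.stdout.write(''.join(outputs))
            sys.stdout.flush()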
Is there any better way to handle Popen and communicate() in parallel?
A final result, inspired by the comment from J.F. Sebastian:
http://bitbucket.org/daybreaker/kaist-cs443/src/247f9ecf3cee/tools/manage.py
It seems to be a Python bug.
I am not sure it is clear what run_commands actually needs to be doing, but it seems to simply poll a subprocess, ignore the return code, and continue in the loop. When you get to the part where you print the output, how do you know the subprocesses have completed?
In your example code I noticed your use of

    for line in output.splitlines():

to partially address the '\r' issue; using

    for line in output.splitlines(True):

would have been helpful (passing True keeps the line endings).
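As a rough sketch of what I mean about the return code (untested against your actual commands, and run_one is just an illustrative name): communicate() already waits for the child to exit, so you can check the code it leaves on the Popen object instead of relying on a bare poll().

    import shlex
    import subprocess
    import sys

    def run_one(command):
        # communicate() blocks until the child exits, so returncode is
        # guaranteed to be set afterwards.
        proc = subprocess.Popen(shlex.split(command),
                                stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
        output, _ = proc.communicate()
        if proc.returncode != 0:
            sys.stderr.write('%r exited with %d\n' % (command, proc.returncode))
        return output

Your run_commands could then collect run_one(command) for each command in a group and still print the group's output under the lock.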