How can Python continuously fill multiple threads of a subprocess?
I'm running an app, foo, on Linux. From a Bash script/terminal prompt, my application runs multi-threaded with this command:
$ foo -config x.ini -threads 4 < inputfile
System Monitor and top report that foo averages about 380% CPU load (quad-core machine). I've recreated this functionality in Python 2.6.x with:
proc = subprocess.Popen("foo -config x.ini -threads 4", \
shell=True, stdin=subprocess.PIPE, \
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
mylist = ['this', 'is', 'my', 'test', 'app', '.']
for line in mylist:
    txterr = ''
    proc.stdin.write(line.strip() + '\n')
    while not proc.poll() and not txterr.count('Finished'):
        txterr += proc.stderr.readline()
    print proc.stdout.readline().strip(),
Foo runs slower and top reports a CPU load of 100%. Foo also runs fine with shell=False, but is still slow:
proc = subprocess.Popen("foo -config x.ini -threads 4".split(), \
shell=False, stdin=subprocess.PIPE, \
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
Is there a way to have Python's subprocess continuously fill all of foo's threads?
When you call a command with Popen like this, it doesn't matter whether it's called from Python or from the shell. It's the "foo" command that starts its threads, not Python.

So the answer is: yes, subprocesses can be multi-threaded when called from Python.
First things first: are you guessing it is single-threaded only because it is using 100% of CPU rather than 400%?

It would be better to check how many threads it has actually started. Run the top program and hit the H key to show threads, or use ps -eLf and make sure the NLWP column shows multiple threads.
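If you'd rather check from Python, the per-process thread count is also exposed on Linux in /proc/&lt;pid&gt;/status. A minimal sketch (Linux-only; count_threads is a made-up helper name):

def count_threads(pid):
    # the "Threads:" field in /proc/<pid>/status is the same NLWP count ps shows
    with open('/proc/%d/status' % pid) as f:
        for line in f:
            if line.startswith('Threads:'):
                return int(line.split()[1])
    return 1  # fallback: every process has at least one thread

print count_threads(proc.pid)  # expect > 1 if foo really is multi-threaded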
Linux can be pretty twitchy with CPU affinity; by default, the scheduler will NOT move a process away from the last processor it used. Which means that if all four threads of your program were started on a single processor, they will ALL share that processor forever. You must use a tool like taskset(1) to force a CPU affinity on processes that must run on separate processors for a long time, e.g.:

taskset -p -c 0 <pid1> ; taskset -p -c 1 <pid2> ; taskset -p -c 2 <pid3> ; taskset -p -c 3 <pid4>

You can retrieve the current affinity with taskset -p <pid>.
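The same pinning can be scripted from Python by shelling out to taskset. A rough sketch (assumes taskset from util-linux is installed; pid1 through pid4 are placeholders for your actual process ids):

import subprocess

def pin_to_cpu(pid, cpu):
    # "taskset -p -c <cpu> <pid>" re-pins an already-running process
    subprocess.check_call(['taskset', '-p', '-c', str(cpu), str(pid)])

# spread four processes across cores 0-3; pid1..pid4 are placeholders
for cpu, pid in enumerate([pid1, pid2, pid3, pid4]):
    pin_to_cpu(pid, cpu)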
(One day I wondered why my Folding At Home processes were using much less CPU time than I expected, and found that the bloody scheduler had placed three FaH tasks on ONE HyperThread sibling and the fourth FaH task on the other HT sibling of the same core. The other three processors were idle. The first core also ran quite hot, and the other three cores were four or five degrees colder. Heh.)
If your Python script doesn't feed the foo process fast enough, then you could offload reading stdout and stderr to threads:
from Queue import Empty, Queue
from subprocess import PIPE, Popen
from threading import Thread

def start_thread(target, *args):
    t = Thread(target=target, args=args)
    t.daemon = True
    t.start()
    return t

def signal_completion(queue, stderr):
    # put a token on the queue each time a task reports 'Finished' on stderr
    for line in iter(stderr.readline, ''):
        if 'Finished' in line:
            queue.put(1)  # signal completion
    stderr.close()

def print_stdout(q, stdout):
    """Print buffered stdout upon receiving a completion signal."""
    text = []
    for line in iter(stdout.readline, ''):
        if not q.empty():
            try:
                q.get_nowait()
            except Empty:  # benign race: queue emptied between check and get
                text.append(line)
            else:  # received completion signal
                print ''.join(text),
                text = [line]  # current line starts the next task's output
                q.task_done()
        else:  # buffer stdout until the task is finished
            text.append(line)
    stdout.close()
    if text:
        print ''.join(text),  # print the rest unconditionally

queue = Queue()
proc = Popen("foo -config x.ini -threads 4".split(), bufsize=1,
             stdin=PIPE, stdout=PIPE, stderr=PIPE)
threads = [start_thread(print_stdout, queue, proc.stdout)]
threads += [start_thread(signal_completion, queue, proc.stderr)]

mylist = ['this', 'is', 'my', 'test', 'app', '.']
for line in mylist:
    proc.stdin.write(line.strip() + '\n')
proc.stdin.close()
proc.wait()
for t in threads:
    t.join()  # wait for stdout to be fully printed
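If writing to proc.stdin can itself block (for example, when foo consumes its input slowly and the pipe buffer fills up), the same pattern extends to the writer side. A minimal sketch reusing start_thread from above (feed_stdin is a made-up name):

def feed_stdin(stdin, lines):
    # write every input line, then close the pipe so foo sees EOF
    for line in lines:
        stdin.write(line.strip() + '\n')
    stdin.close()

writer = start_thread(feed_stdin, proc.stdin, mylist)
# ... the main thread is free to do other work here ...
writer.join()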