开发者

Running a command line from python and piping arguments from memory

I was wondering if there was a way to run a command line executable in python, but pass it the argument values from memory, without having to write the memory data into a temporary file on disk. From what I have seen, it seems to that the subprocess.Popen(args) is the preferred way to run programs from inside python scripts.

For example, I have a pdf file in memory. I want to convert it to text using the commandline function pdftotext which is present in most linux distros. But I would prefer not to write the in-memory pdf file to a temporary file on disk.

pdfInMemory = myPdfReader.read()
convertedText = subproces开发者_开发百科s.<method>(['pdftotext', ??]) <- what is the value of ??

what is the method I should call and how should I pipe in memory data into its first input and pipe its output back to another variable in memory?

I am guessing there are other pdf modules that can do the conversion in memory and information about those modules would be helpful. But for future reference, I am also interested about how to pipe input and output to the commandline from inside python.

Any help would be much appreciated.


with Popen.communicate:

import subprocess
out, err = subprocess.Popen(["pdftotext", "-", "-"], stdout=subprocess.PIPE).communicate(pdf_data)


os.tmpfile is useful if you need a seekable thing. It uses a file, but it's nearly as simple as a pipe approach, no need for cleanup.

tf=os.tmpfile()
tf.write(...)
tf.seek(0)
subprocess.Popen(  ...    , stdin = tf)

This may not work on Posix-impaired OS 'Windows'.


Popen.communicate from subprocess takes an input parameter that is used to send data to stdin, you can use that to input your data. You also get the output of your program from communicate, so you don't have to write it into a file.

The documentation for communicate explicitly warns that everything is buffered in memory, which seems to be exactly what you want to achieve.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜