subprocess isn't outputting anything

I'm trying to use Python to run pdftotext, but my code isn't working. When I run the code below, I expect the content variable to contain the text of the PDF, but all I get is an empty string.

Does anybody know what I'm missing?

import subprocess

def getPDFContent(path):
    path = "/path/to/a valid/pdffile.pdf"

    process = subprocess.Popen(["pdftotext", path], shell=False,
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    content, err = process.communicate()[0:2]
    return content, err


By default, pdftotext writes nothing to stdout; instead, it creates a .txt file with the same base name as the PDF. To get the text on stdout, pass - as the second argument (the output file name) in the call to pdftotext:

process = subprocess.Popen(["pdftotext", path, "-"], shell=False, 
    stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
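
For reference, here is a minimal sketch of the same fix wrapped in a function, using subprocess.run instead of Popen. The names build_pdftotext_command and get_pdf_content are made up for illustration, and decoding the output as UTF-8 is an assumption about how your pdftotext build is configured. It also keeps stderr in its own pipe, because with stderr=subprocess.STDOUT the second value returned by communicate() is always None.

```python
import subprocess


def build_pdftotext_command(pdf_path):
    """Return the argument list that makes pdftotext write to stdout."""
    # "-" in place of the output file name tells pdftotext to use stdout.
    return ["pdftotext", pdf_path, "-"]


def get_pdf_content(pdf_path):
    """Run pdftotext and return (text, error_output) as strings."""
    result = subprocess.run(
        build_pdftotext_command(pdf_path),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,  # separate pipe so errors don't mix into the text
    )
    # Assumption: the output is UTF-8; undecodable bytes are replaced, not raised.
    text = result.stdout.decode("utf-8", errors="replace")
    errors = result.stderr.decode("utf-8", errors="replace")
    return text, errors
```

Because the command is passed as a list, a path containing spaces (like the one in the question) needs no extra quoting.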