开发者

Is there any method in ruby or in python to convert pdf to xml

Recently I have got an requirement like i need to convert pdf files to xml.So please tell me how to do it in pyth开发者_C百科on or ruby programming languages. please reply to my question as early as possible your help would be appreciated..


Use pdftohtml.

It can be installed with brew install pdftohtml. This adds pdftohtml to your path.

So, to convert pdf to xml, you can run pdftohtml -xml your_file.pdf your_file.xml

In python, you could use these lines:

import os

cmd = 'pdftohtml -xml your_file.pdf your_file'
os.system(cmd)

!note that pdftohtml automatically adds the 'xml' extension to your second file input

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜