
Are there any libraries or projects that convert generic document types to HTML?

Are there any projects out there that build converters from different file types to HTML or text? I mean the most common document formats: PDF, DOC(X), XLS(X), PPT(X), PS, etc. I am already aware of Unix utilities such as pdftotext, and I know of Apache's Tika and POI projects. Is there anything that has a generic interface? Something like MultiMarkdown?


As you said, the philosophy of UNIX-like systems is to use small utilities/filters for this (latex2html, t2html, txt2html, pdftohtml, etc.). You could create your own interface using shell scripting, Perl, Python, etc. and use those filters as callbacks.
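To make the "filters as callbacks" idea concrete, here is a minimal Python sketch of such an interface. It is only an illustration: the `CONVERTERS` table, the `register` and `convert` names, and the `{src}` placeholder are all made up for this example, and it assumes each registered tool writes its result to standard output. The only real tool wired in is `pdftotext` from poppler-utils, where `-` as the output file means stdout.

```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical registry: file extension -> command template.
# "{src}" is replaced by the input path; the command must write to stdout.
CONVERTERS = {
    ".pdf": ["pdftotext", "{src}", "-"],  # poppler-utils; "-" sends text to stdout
}


def register(ext, command):
    """Add or replace the external filter used for a file extension."""
    CONVERTERS[ext.lower()] = list(command)


def convert(path):
    """Run the filter registered for the file's extension and return its output."""
    src = Path(path)
    template = CONVERTERS.get(src.suffix.lower())
    if template is None:
        raise ValueError(f"no converter registered for {src.suffix!r}")
    if shutil.which(template[0]) is None:
        raise FileNotFoundError(f"{template[0]} is not installed or not on PATH")
    # Substitute the input path into the template and run the filter.
    cmd = [arg.replace("{src}", str(src)) for arg in template]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Other filters can be added with `register(".ext", [...])` once you have checked how each one is told to write to stdout; tools that only write to a file or directory would need a small wrapper first.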
