Blacklining of pdf files

I am trying to find a way to produce blacklined pdf files (i.e. differences between two versions of the file are highlighted, similar to the "Compare and merge documents" feature in MS Word).

I have tried a few different approaches thus far, with sub-optimal results:

1) Using Adobe Acrobat's "Compare Documents" feature. The main problem with this approach was that some text was interpreted as an image (why? The text could be copied and pasted), leading to very coarse-grained diffs.

2) Converting the .pdfs into Word documents and using Word's comparison feature. The issue with this approach is that the conversion from .pdf -> .doc is unreliable (some text is missing in the .doc file), and it produces some false-positive diffs (formatting characters and the like that Acrobat used to create the Word document).

3) A piece of software called Workshare (http://www.workshare.com/products/). This badly mangled the documents, to the point where they were unusable.

We generate the .pdf files programmatically using the ReportLab library running in the Django web framework. Hence, producing blacklined pdfs programmatically is possible, and will probably produce the best results, but this would be a more time-consuming task.
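To give an idea of what the programmatic route might look like, here is a minimal sketch, assuming the old and new text of each paragraph is available as plain strings before it goes into the pdf. It uses Python's standard `difflib` to diff word by word and wraps the changes in `<strike>`, `<u>` and `<font>` tags, which ReportLab's paragraph markup understands. The function name `blackline_markup` and the red/blue colour choice are just placeholders of mine, not anything from ReportLab.

```python
import difflib
from xml.sax.saxutils import escape


def blackline_markup(old_text, new_text):
    """Return paragraph markup with word-level changes marked.

    Hypothetical helper: deletions are struck through in red,
    insertions are underlined in blue, unchanged words are left as-is.
    """
    old_words = old_text.split()
    new_words = new_text.split()
    # autojunk=False so frequent words are not treated as noise.
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)

    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Escape &, < and > so the text cannot break the markup.
        removed = escape(" ".join(old_words[i1:i2]))
        added = escape(" ".join(new_words[j1:j2]))
        if tag == "equal":
            parts.append(removed)
        if tag in ("delete", "replace"):
            parts.append('<strike><font color="red">%s</font></strike>' % removed)
        if tag in ("insert", "replace"):
            parts.append('<u><font color="blue">%s</font></u>' % added)
    return " ".join(parts)


if __name__ == "__main__":
    print(blackline_markup("the quick brown fox", "the slow brown fox"))
```

The returned string would presumably be passed to `reportlab.platypus.Paragraph` in place of the plain paragraph text. This only covers text inside a single paragraph; added or removed paragraphs, tables and images would need separate handling.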

Any suggestions?

(Really? stackoverflow won't allow me to use a tag called 'blacklining'? Really??!)
