How to extract text from an Apache FOP-generated PDF in C#?

I have a problem in my C# project. I want to extract text from Apache FOP-generated PDF files programmatically, without any third-party application. I tried many libraries, including PDFBox, IKVM, PDF2Text, iTextSharp, and PDFsharp, but none of them worked. When I extract a FOP-generated PDF to a text file, I get lots of square symbols and other garbled characters.

My question is: how can I extract text from a FOP-generated PDF file in C#? Is there a library written in C# that can do this?

Thanks.


Fonts that use Identity-H encoding write glyph indexes, not character codes, directly into the page content. Such a font needs a ToUnicode entry in its font dictionary (in the PDF file) that maps those glyph indexes back to Unicode; without that entry, text extraction is not possible, which is why you get squares and garbled characters. Check whether Apache FOP has a setting for including a ToUnicode entry in the font dictionary, or for otherwise making the fonts extraction-friendly.
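
To make the role of the ToUnicode entry concrete, here is a minimal sketch of what an extractor does with it. The sample CMap fragment, the glyph numbers, and the function names are made up for illustration, and only `bfchar` entries are handled (real ToUnicode CMaps often use `bfrange` as well). It is written in Python only to keep it short; the same logic ports directly to C#.

```python
import re

# Hypothetical ToUnicode CMap fragment: glyph index -> UTF-16BE code unit(s).
SAMPLE_CMAP = """
3 beginbfchar
<0003> <0048>
<0004> <0069>
<0005> <0021>
endbfchar
"""


def parse_bfchar(cmap_text):
    """Collect the glyph-index -> Unicode-string pairs from bfchar sections."""
    mapping = {}
    for block in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        for src, dst in re.findall(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>", block):
            mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return mapping


def decode_identity_h(hex_string, mapping):
    """Decode a hex string of 2-byte glyph indexes (Identity-H) to text.

    Glyph indexes with no mapping become U+FFFD, which is roughly what an
    extractor shows when the font has no usable ToUnicode entry.
    """
    raw = bytes.fromhex(hex_string)
    glyph_ids = [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]
    return "".join(mapping.get(g, "\ufffd") for g in glyph_ids)


if __name__ == "__main__":
    mapping = parse_bfchar(SAMPLE_CMAP)
    print(decode_identity_h("000300040005", mapping))  # with ToUnicode
    print(decode_identity_h("000300040005", {}))       # without ToUnicode
```

With the mapping, the glyph indexes decode to readable text; with an empty mapping, every glyph turns into a replacement character, because the glyph indexes alone say nothing about which characters they represent.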
