Looking for a Linux PDF library to extract annotations and images from a PDF [closed]

Closed. This question is seeking recommendations for books, tools, software libraries, and more. It does not meet Stack Overflow guidelines. It is not currently accepting answers.

Closed 8 years ago.

I'm looking for a free library (Java or Ruby) that runs on Linux and can extract images and annotations from PDFs, similar to what CGPDFDocument can do on OS X.

Thanks!


I don't know about images, but using the latest version of the Ruby pdf-reader library I was able to successfully extract the annotations from a large PDF file:

require 'pdf-reader'

filename = ARGV.fetch(0) # path to the PDF to inspect

PDF::Reader.open(filename) do |reader|
  reader.pages.each do |page|
    # /Annots may be an indirect reference or an inline array;
    # deref resolves a reference and returns anything else untouched.
    annots = reader.objects.deref(page.attributes[:Annots])
    next if annots.nil? || annots.empty?

    annots.each do |annot_ref|
      annot = reader.objects.deref(annot_ref)
      next if annot[:Contents].nil?

      puts "Page #{page.number}, #{annot[:Contents].inspect}"
    end
  end
end

I imagine that something like it could be done to extract images.
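For what it's worth, here is a rough sketch of how the image side might look with the same gem. It is untested against real-world files and rests on a few assumptions: that `page.xobjects` returns a hash of name => stream objects responding to `hash` and `data`, and that only JPEG images (a single `/DCTDecode` filter) are wanted, since their raw stream bytes can be written to disk as-is. The helper name `jpeg_images` and the output file naming are made up for illustration; other encodings would need real decoding.

```ruby
# Return [name, bytes] pairs for every JPEG-encoded image XObject.
# `xobjects` is assumed to look like PDF::Reader::Page#xobjects: a Hash of
# name => stream, where each stream responds to #hash and #data.
def jpeg_images(xobjects)
  xobjects.each_with_object([]) do |(name, stream), found|
    next unless stream.hash[:Subtype] == :Image

    # /Filter may be a single name or an array of names.
    filters = Array(stream.hash[:Filter])
    next unless filters == [:DCTDecode]

    found << [name, stream.data]
  end
end

# Only runs when a PDF path is given on the command line.
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  require 'pdf-reader'

  PDF::Reader.open(ARGV[0]) do |reader|
    reader.pages.each do |page|
      jpeg_images(page.xobjects).each do |name, bytes|
        path = "page#{page.number}-#{name}.jpg" # hypothetical naming scheme
        File.binwrite(path, bytes)
        puts "Wrote #{path}"
      end
    end
  end
end
```

Images stored with other filters (for example `/FlateDecode`) are skipped here on purpose; turning those into viewable files means rebuilding an image container around the decoded pixel data, which is a bigger job than this sketch covers.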
