Looking for a Linux PDF library to extract annotations and images from a PDF [closed]
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 8 years ago.
I'm looking for a free library (Java or Ruby) that runs on Linux and can extract images and annotations from PDFs, similar to what CGPDFDocument can do on OS X.
Thanks!
I don't know about images, but using the latest version of the Ruby pdf-reader library I was able to successfully extract the annotations from a big PDF file:
require 'pdf-reader'

PDF::Reader.open(filename) do |reader|
  reader.pages.each do |page|
    # /Annots is a reference to the page's array of annotation references
    annots_ref = page.attributes[:Annots]
    actual_annots = reader.objects[annots_ref]
    if actual_annots && actual_annots.size > 0
      actual_annots.each do |annot_ref|
        actual_annot = reader.objects[annot_ref]
        # Print only annotations that carry text in /Contents
        unless actual_annot[:Contents].nil?
          puts "Page #{page.number}," + actual_annot[:Contents].inspect
        end
      end
    end
  end
end
I imagine something similar could be done to extract images.
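For what it's worth, here is a rough sketch of how the image side might look with the same gem. It is not tested against real files, so treat it as a starting point. It assumes the images you care about are stored as plain JPEG streams (a single DCTDecode filter), in which case the raw stream bytes can be written out as a .jpg unchanged. Images stored with other filters would need decoding and re-encoding, and images nested inside form XObjects are not visited. The names jpeg_image? and extract_jpegs, and the output file naming, are made up for this example.

```ruby
# True when an XObject dictionary describes an image stored as plain JPEG
# data (exactly one filter, DCTDecode), so its raw bytes are a valid .jpg.
def jpeg_image?(dict)
  dict[:Subtype] == :Image && Array(dict[:Filter]) == [:DCTDecode]
end

# Hypothetical helper: writes every plain-JPEG image referenced by each
# page's resources into out_dir and returns the list of files written.
def extract_jpegs(filename, out_dir = ".")
  require "pdf-reader" # loaded here so jpeg_image? can be used without the gem
  written = []
  PDF::Reader.open(filename) do |reader|
    reader.pages.each do |page|
      page.xobjects.each do |name, obj|
        # Resolve an indirect reference if the entry is one
        stream = reader.objects.deref(obj)
        next unless stream.is_a?(PDF::Reader::Stream) && jpeg_image?(stream.hash)

        path = File.join(out_dir, "page#{page.number}-#{name}.jpg")
        File.binwrite(path, stream.data) # raw, still-encoded stream bytes
        written << path
      end
    end
  end
  written
end
```

Calling extract_jpegs("input.pdf", "out") should leave one file per JPEG image in the (already existing) out directory, named after the page number and the XObject name.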