How to convert PDF files to images using RMagick and Ruby
I'd like to take a PDF file and convert it to images, each PDF page becoming a separate image.
"Convert a .doc or .pdf to an image and display a thumbnail in Ruby?" is a similar post, but it doesn't cover how to make separate images for each page.
Using RMagick itself, you can create images for different pages:
require 'rmagick'
pdf_file_name = "test.pdf"
im = Magick::Image.read(pdf_file_name)
The code above gives you an array, im, with one entry per page. The array is zero-indexed, so do this if you want to generate a JPEG image of the fifth page:
im[4].write(pdf_file_name + ".jpg")
But this loads and rasterizes every page of the PDF, so it can be slow for large documents.
Alternatively, if you want to create an image of the fifth page and don't want to load the complete PDF file:
require 'rmagick'
pdf_file_name = "test.pdf"
# Page indexes are zero-based, so [4] selects the fifth page.
im = Magick::Image.read(pdf_file_name + "[4]")
im[0].write(pdf_file_name + ".jpg")
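A minimal sketch of the single-page approach, assuming you would rather pass a one-based page number and let the code build the zero-based specifier. The helpers page_spec, page_image_name and write_page are hypothetical names made up for illustration, and the density of 150 DPI is an assumed value. Reading PDFs through ImageMagick also generally requires Ghostscript to be installed.

```ruby
# Build an ImageMagick page specifier. ImageMagick counts pages from 0,
# so page 5 of test.pdf becomes "test.pdf[4]". (Hypothetical helper.)
def page_spec(pdf_path, page_number)
  raise ArgumentError, "page numbers start at 1" if page_number < 1
  "#{pdf_path}[#{page_number - 1}]"
end

# Derive an output name such as "test-page5.jpg" from the PDF name.
# The directory part is dropped, so the image lands in the current directory.
def page_image_name(pdf_path, page_number, ext = "jpg")
  "#{File.basename(pdf_path, ".*")}-page#{page_number}.#{ext}"
end

# Render one page without reading the rest of the document.
def write_page(pdf_path, page_number, density = "150")
  require "rmagick" # loaded lazily so the helpers above work without it
  pages = Magick::Image.read(page_spec(pdf_path, page_number)) do |options|
    options.density = density # rasterization resolution in DPI (assumed value)
  end
  pages.first.write(page_image_name(pdf_path, page_number))
end
```

With these helpers, write_page("test.pdf", 5) would write the fifth page to test-page5.jpg.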
ImageMagick can do that with PDFs. Presumably RMagick can do it too, but I'm not familiar with it.
The code from the post you linked to:
require 'rmagick'
pdf = Magick::ImageList.new("doc.pdf")
pdf is an ImageList object, which according to the documentation delegates many of its methods to Array. You should be able to iterate over pdf and call write to write the individual images to files.
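That iteration could look roughly like the sketch below, which writes one PNG per page. The helper names output_name and pdf_to_images and the "-pageN" naming scheme are assumptions for illustration, not something taken from the linked post.

```ruby
# Hypothetical helper: ("doc.pdf", 0) -> "doc-page1.png"
def output_name(pdf_path, index, ext = "png")
  "#{File.basename(pdf_path, ".*")}-page#{index + 1}.#{ext}"
end

# Write every page of the PDF to its own image file and
# return the list of file names that were written.
def pdf_to_images(pdf_path, ext = "png")
  require "rmagick" # loaded lazily so output_name works without it
  pdf = Magick::ImageList.new(pdf_path) # reads all pages into memory
  pdf.each_with_index.map do |page, index|
    name = output_name(pdf_path, index, ext)
    page.write(name) # the file extension selects the output format
    name
  end
end
```

Calling pdf_to_images("doc.pdf") would then produce doc-page1.png, doc-page2.png and so on in the current directory.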
Since I can't find a way to deal with PDFs on a per-page basis in RMagick, I'd recommend first splitting the PDF into pages with pdftk's burst command, then dealing with the individual pages in RMagick. This is probably slower than an all-in-one solution, but unfortunately no all-in-one solution presents itself.
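As a rough sketch of that two-step workflow, assuming pdftk is installed and on the PATH: the helper names burst_command and burst_and_convert and the page_%03d.pdf output pattern are choices made for this example only.

```ruby
# Build the argument list for pdftk's burst operation, which writes
# one single-page PDF per page using a printf-style output pattern.
def burst_command(pdf_path, pattern = "page_%03d.pdf")
  ["pdftk", pdf_path, "burst", "output", pattern]
end

# Split the PDF with pdftk, then convert each single-page PDF to a JPEG.
def burst_and_convert(pdf_path)
  require "rmagick" # loaded lazily so burst_command works without it
  # Passing separate arguments to system avoids shell interpretation.
  system(*burst_command(pdf_path)) or raise "pdftk burst failed"
  Dir.glob("page_*.pdf").sort.each do |page_pdf|
    image = Magick::Image.read(page_pdf).first
    image.write(page_pdf.sub(/\.pdf\z/, ".jpg"))
  end
end
```

Note that the glob picks up any existing page_*.pdf files in the current directory, so running this in an empty working directory is the safer choice.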
There's also PDF::Toolkit for Ruby, which hooks into pdftk, but I've never used it.