Is there a programmatic way to transform a sequence of image files into a PDF? [closed]
Closed 4 years ago.
I have a sequence of JPG images. Each of the scans is already cropped to the exact size of one page. They are sequential pages of a valuable and out-of-print book. The publishing application requires that these pages be submitted as a single PDF file.
I could take each of these images and just paste them into a word processor (e.g. OpenOffice). Unfortunately, it's a very big book and I've got quite a few of these books to get through, so that would obviously be time-consuming. This is volunteer work!
My second idea was to use LaTeX (actually pdflatex): I could make a very simple document that consists of nothing more than a series of in-line image includes. I'm sure that this approach could be made to work; it's just a little on the complex side for something which seems like a very simple job.
It occurred to me that there must be a simpler way - so any suggestions?
I'm on Ubuntu 9.10, my primary programming language is Python, but if the solution is super-simple I'd happily adopt any technology that works.
UPDATE: Can somebody explain what's going wrong here?
sal@bobnit:/media/NIKON D200/DCIM/100HPAIO/bat$ convert '*.jpg' bat.pdf
convert: unable to open image `*.jpg': No such file or directory @ blob.c/OpenBlob/2439.
convert: missing an image filename `bat.pdf' @ convert.c/ConvertImageCommand/2775.
Is there a way in the convert command syntax to specify that bat.pdf is the output?
Thanks
It occurred to me that there must be a simpler way - so any suggestions?
You're right, there is! Try this:
sudo apt-get install imagemagick
cd ~/rare-book-images
convert "*.jpg" rare-book.pdf
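Note: convert treats its last argument as the output file. Inside quotes, the shell does not expand *.jpg; the literal pattern is passed to convert, which may or may not expand it itself. If you get an "unable to open image" error, try omitting the quotes so that the shell does the expansion, and see if that gets you the results you expect.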
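Since you mention Python, here is a minimal sketch of driving convert from a script instead of relying on shell expansion. It assumes convert is on your PATH; the helper names natural_key and build_convert_command are made up for illustration. The file names are sorted "naturally" so that page10.jpg lands after page2.jpg, which matters for sequential pages.

```python
import re
import subprocess
from glob import glob


def natural_key(filename):
    """Sort key that compares digit runs as numbers, so 'page10' follows 'page2'."""
    parts = re.split(r"(\d+)", filename)
    # re.split with a capture group alternates text, digits, text, ...
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def build_convert_command(filenames, output):
    """Return the argument list for convert: inputs in page order, output last."""
    return ["convert", *sorted(filenames, key=natural_key), output]


if __name__ == "__main__":
    pages = glob("*.jpg")  # expanded by Python, so shell quoting is not involved
    if pages:
        # check=True raises CalledProcessError if convert exits with an error
        subprocess.run(build_convert_command(pages, "rare-book.pdf"), check=True)
```

Because the arguments are passed as a list rather than through a shell, directory or file names containing spaces should not need any extra quoting.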
If you're interested in a Python solution, you can use the ReportLab library. For example:
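from glob import glob
from reportlab.lib.pagesizes import letter
from reportlab.platypus import Image, SimpleDocTemplate
# One letter-sized PDF; each image becomes one flowable in the document.
doc = SimpleDocTemplate('image-collection.pdf', pagesize=letter)
# glob() returns names in arbitrary order, so sort them to keep the pages in sequence.
parts = [Image(filename) for filename in sorted(glob('*.jpg'))]
doc.build(parts)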
This will take all the jpg files in your current directory and produce a file called "image-collection.pdf".
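One caveat, as far as I know: ReportLab sizes an Image from its pixel dimensions by default and reports a layout error when a flowable is larger than the frame on the page, so full-page scans will probably need scaling down. Here is a rough sketch of the arithmetic only; fit_within is a hypothetical helper, and the scan and frame sizes below are invented numbers.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) down to fit inside the box, keeping the aspect ratio."""
    scale = min(max_width / width, max_height / height, 1.0)  # 1.0 cap: never enlarge
    return width * scale, height * scale


# Hypothetical 2480x3508 pixel scan and a 456x636 point frame (illustrative values only).
draw_width, draw_height = fit_within(2480, 3508, 456, 636)
print(f"{draw_width:.1f} x {draw_height:.1f}")
```

The resulting values could then be passed as Image(filename, width=draw_width, height=draw_height). The pixel size of each scan has to be read from the file first, and the usable frame is roughly doc.width by doc.height, less any frame padding.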
I wonder if you could just do it with a for loop with an \includegraphics command inside, plus some suitably nifty standard image file naming, all inside a LaTeX file. This might have the advantage of allowing title pages, page numbering and so on. (I'm not sure whether either of the other solutions does this, and I can't be bothered to check. I'm just pondering out loud here, really.)
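To make the LaTeX idea concrete, here is a small sketch that swaps the loop inside LaTeX for a Python script that writes the .tex file. It assumes zero-padded file names without spaces (page001.jpg, page002.jpg, ...) so that a plain sort gives the page order; make_latex and book.tex are names invented for the example.

```python
from pathlib import Path


def make_latex(image_names):
    """Return LaTeX source that puts each image on a page of its own."""
    lines = [
        r"\documentclass{article}",
        r"\usepackage{graphicx}",
        r"\begin{document}",
    ]
    for name in image_names:
        # Scale each scan to fit the text area without distorting it.
        lines.append(
            r"\noindent\includegraphics"
            r"[width=\textwidth,height=\textheight,keepaspectratio]{" + name + "}"
        )
        lines.append(r"\clearpage")
    lines.append(r"\end{document}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    names = sorted(p.name for p in Path(".").glob("*.jpg"))
    if names:
        Path("book.tex").write_text(make_latex(names), encoding="utf-8")
```

Running pdflatex book.tex afterwards should give one image per page with the default page numbering left in place; a title page could be added by hand before the first \includegraphics line.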