How to crop PDF margins using pdftk and /MediaBox
I used pdftk to uncompress a PDF and then opened it as a text file.
/MediaBox [0 0 612 792]
I would like to reduce the margins, for instance
/MediaBox [100 0 512 792]
Unfortunately, it doesn't work. I can change the 0 into a 2 or a 9, but I cannot put 100, for instance.
Any idea why?
The string 100 is two characters longer than 0. When you add characters in a text editor, the file gets longer and everything after the edit moves to a later byte position. That's why replacing the 0 with 9, 2, or any other single digit works fine: the length stays the same. A text editor can in theory be used to edit a PDF, but it's not simple, and you have to respect the file's internal structure. The xref table, near the end of a PDF, tells the reader the exact byte offset of each object. It has to be updated whenever the length or location of anything changes.
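To make the offset problem concrete, here is a minimal Python sketch. The two-object body is a made-up toy, not a complete PDF, and the helper name object_offsets is only illustrative. It compares where each object starts before and after the two kinds of edit:

```python
import re

# Toy body of an uncompressed PDF (hypothetical, not a complete file):
# object 1 holds the MediaBox, object 2 comes after it.
BODY = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Page /MediaBox [0 0 612 792] >>\nendobj\n"
    b"2 0 obj\n<< /Type /Catalog >>\nendobj\n"
)

def object_offsets(data: bytes) -> dict[int, int]:
    """Return {object number: byte offset of its 'N G obj' header}."""
    return {
        int(m.group(1)): m.start()
        for m in re.finditer(rb"^(\d+) \d+ obj", data, re.MULTILINE)
    }

# Same-length edit (0 -> 9) versus an edit that adds two bytes (0 -> 100).
same_length = BODY.replace(b"[0 0 612 792]", b"[9 0 612 792]")
longer = BODY.replace(b"[0 0 612 792]", b"[100 0 512 792]")

before = object_offsets(BODY)
print(object_offsets(same_length) == before)  # True: no object moved
print(object_offsets(longer)[2] - before[2])  # 2: object 2 moved by two bytes
```

An xref table written for the original body would still point at the old position of object 2, which is why the longer edit breaks the file while the single-digit edit does not.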
The reason the manual method above using pdftk doesn't work is that you are adding two bytes in the middle of the file. This breaks the xref table. If you manually update all the xref entries, this will work, but it is potentially very tedious. Using sed or any other text editing tool will not solve the problem. podofo does the xref calculation for you.
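For a sense of what updating the xref entries by hand involves, here is a rough Python sketch that recomputes a classic xref table for an edited body. It rests on several assumptions: the file is uncompressed, it has a single classic xref section, objects are numbered contiguously from 1 with generation 0, and no stream contains a line that looks like an object header. The function name rebuild_xref and the toy body are hypothetical, and this is not how podofo itself is implemented.

```python
import re

# Toy body after the MediaBox edit (hypothetical, not a complete PDF).
BODY = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Page /MediaBox [100 0 512 792] >>\nendobj\n"
    b"2 0 obj\n<< /Type /Catalog >>\nendobj\n"
)

def rebuild_xref(body: bytes, root_obj: int) -> bytes:
    """Append a freshly computed xref table, trailer and startxref to body."""
    # Find the byte offset of every 'N 0 obj' header in the edited body.
    offsets = {
        int(m.group(1)): m.start()
        for m in re.finditer(rb"^(\d+) 0 obj", body, re.MULTILINE)
    }
    size = max(offsets) + 1  # entry 0 is the mandatory free entry
    lines = [b"xref\n", b"0 %d\n" % size, b"0000000000 65535 f \n"]
    for num in range(1, size):
        # Each entry is exactly 20 bytes: 10-digit offset, 5-digit
        # generation, the in-use flag 'n', and a two-byte line ending.
        lines.append(b"%010d 00000 n \n" % offsets[num])
    trailer = b"trailer\n<< /Size %d /Root %d 0 R >>\n" % (size, root_obj)
    # startxref holds the offset of the 'xref' keyword, i.e. len(body).
    tail = b"startxref\n%d\n%%%%EOF\n" % len(body)
    return body + b"".join(lines) + trailer + tail

print(rebuild_xref(BODY, root_obj=2).decode("ascii"))
```

Real files add complications (stream lengths, incremental updates, xref streams), which is why a tool that rewrites the table for you is the more practical route.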
Use sed to replace every occurrence:
sed 's/MediaBox \[0 0 612 792\]/MediaBox [100 0 512 792]/g' <in.pdf >out.pdf
Or use podofobox (part of the PoDoFo tools):
- http://podofo.sourceforge.net/tools.html
It does not require uncompressing the PDF streams first (as pdftk does):
podofobox in.pdf out.pdf media 10000 0 51200 79200
As you can see, podofobox takes the MediaBox values multiplied by 100, as integers, so you simply append two zeros (00) to each value you read in the MediaBox field.
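The conversion is simple arithmetic. As a small Python sketch, assuming the factor of 100 and the argument order shown in the command above (the helper name podofobox_args is made up for illustration):

```python
def podofobox_args(mediabox, box="media"):
    """Turn MediaBox values into podofobox arguments (each value * 100)."""
    # round() guards against float noise such as 595.28 * 100.
    return [box] + [str(round(value * 100)) for value in mediabox]

command = ["podofobox", "in.pdf", "out.pdf", *podofobox_args([100, 0, 512, 792])]
print(" ".join(command))
# podofobox in.pdf out.pdf media 10000 0 51200 79200
```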
There are better ways to change the margins of a PDF:
- http://code.google.com/p/sopdf/
- https://pypi.org/project/pypdf/
- http://code.activestate.com/recipes/576837-crop-pdf-file-with-pypdf/
- For Ghostscript, see this page: Cropping a PDF using Ghostscript 9.01
hope you found an answer to that since posting :-)