How can I programmatically remove a page from a PDF document on a Mac?
I have a bunch of PDF documents and all of them contain a title page that I want to remove.
Is there a way to programmatically remove them?
Most of the PDF utilities I found can only combine documents but not remove pages. In the print dialog I can choose to print from page 2 to the end and then print to a file, but I can't find any way to access this function programmatically.
Use pdftk.
To remove page 8:
pdftk in.pdf cat 1-7 9-end output out.pdf
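Since the question is about a whole batch of files, the same command can be wrapped in a loop. The following is only a rough sketch: the function names `pages_without` and `strip_title_pages` and the `-notitle.pdf` output suffix are made up for illustration, and it assumes that pdftk is on the PATH, that every document has more than one page, and that the page being removed is not the last one.

```shell
#!/bin/sh
# Hypothetical helper (not part of pdftk): print the pdftk page ranges
# that keep every page except page number $1.
# Assumes $1 is not the last page of the document.
pages_without() {
    n=$1
    if [ "$n" -le 1 ]; then
        echo "2-end"
    else
        echo "1-$((n - 1)) $((n + 1))-end"
    fi
}

# Hypothetical wrapper: remove page 1 from every PDF given as an argument,
# writing the result next to the original as NAME-notitle.pdf.
strip_title_pages() {
    for f in "$@"; do
        [ -f "$f" ] || continue
        # Left unquoted on purpose: the helper may print two range arguments.
        pdftk "$f" cat $(pages_without 1) output "${f%.pdf}-notitle.pdf"
    done
}
```

After sourcing the file, something like strip_title_pages *.pdf should process every PDF in the current directory while leaving the originals untouched.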
Just for the record: you can also use Ghostscript:
gs \
-o removed-page-1-from-input.pdf \
-sDEVICE=pdfwrite \
-dFirstPage=2 \
/path/to/input.pdf
However, pdftk is the better tool for that job (and was already recommended to you).
Also, this Ghostscript command line could change some of the properties of your input.pdf because it essentially re-distills it. This may or may not be what you want. To control individual aspects of this behavior (or to suppress some of them), a more complicated command line with more parameters is required.
pdftk will re-use the original PDF objects for each page as-is.
Update
Ghostscript has the additional parameter -dLastPage too. Together with -dFirstPage this allows for the extraction of page ranges.
The newest versions sport a new parameter, -sPageList. This could be used like this:
-sPageList="1, 5-10, 12-"
to extract pages 1, 5-10 and 12-last from the input document. However, I've not (yet) personally tested this new feature and I'm not sure how reliably it works.
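For completeness, here is how -sPageList might be wired into a full call. This is an untested sketch in the same spirit as the caveat above: the function names `normalize_pagelist` and `extract_pages` are hypothetical, and stripping the spaces from the list is a cautious assumption on my part, since I'm not certain that Ghostscript tolerates them.

```shell
#!/bin/sh
# Hypothetical helper: remove spaces from a page list such as "1, 5-10, 12-".
# Assumption: a list without spaces is the safer form to hand to Ghostscript.
normalize_pagelist() {
    printf '%s' "$1" | tr -d ' '
}

# Hypothetical wrapper: write the pages of input $1 selected by list $3
# to output file $2, using the pdfwrite device as in the commands above.
extract_pages() {
    gs -o "$2" -sDEVICE=pdfwrite \
       -sPageList="$(normalize_pagelist "$3")" \
       "$1"
}
```

Under these assumptions, extract_pages input.pdf out.pdf "2-" would be the -sPageList counterpart of the -dFirstPage=2 command for dropping a title page.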
For older versions of Ghostscript (as well as the most recent one), it should work to feed the same input PDF multiple times, each time with different parameters, to the same Ghostscript call in order to extract non-contiguous page selections from a document. You could even combine pages from different documents this way:
gs \
-o selected-pages.pdf \
-sDEVICE=pdfwrite \
-dFirstPage=2 \
-dLastPage=2 \
in1.pdf \
\
-dFirstPage=10 \
-dLastPage=15 \
in1.pdf \
\
-dFirstPage=1 \
-dLastPage=1 \
in1.pdf \
\
-dFirstPage=4 \
-dLastPage=6 \
in2.pdf
Caveats: Combining pages from different documents which use non-embedded fonts or identical font names but different encodings and/or different subsets (with identical fontname-prefixes) may lead to a faulty PDF in the result.
-[PDFDocument removePageAtIndex:] looks like it should make this possible. By the way, Preview.app can remove a page, but it isn't scriptable, so that's not a programmatic solution.