Merge / convert multiple PDF files into one PDF [closed]
How could I merge / convert multiple PDF files into one large PDF file?
I tried the following, but the content of the target file was not as expected:
convert file1.pdf file2.pdf merged.pdf
I need a very simple/basic command line (CLI) solution. Ideally I could pipe the output of the merge/convert straight into pdf2ps (as originally attempted in my previously asked question here: Linux piping (convert -> pdf2ps -> lp)).
Considering that pdfunite is part of poppler, it has a higher chance of already being installed, and its usage is also simpler than that of pdftk:
pdfunite in-1.pdf in-2.pdf in-n.pdf out.pdf
Just make sure you remember to provide out.pdf, or else it will overwrite the last file in your command.
A safer solution may include a test that the output file does not exist yet:
export output_file=out.pdf && \
! test -e "$output_file" && \
pdfunite in-1.pdf in-2.pdf in-n.pdf "$output_file"
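The same "refuse to overwrite" check can be sketched in Python for use from a script. This is only a minimal sketch: the function names build_pdfunite_command and merge_with_pdfunite are hypothetical, and it assumes pdfunite is on the PATH.

```python
import subprocess
from pathlib import Path


def build_pdfunite_command(inputs, output):
    """Return the pdfunite argument list, refusing to target an existing file."""
    if not inputs:
        raise ValueError("need at least one input PDF")
    if Path(output).exists():
        # Same idea as `! test -e "$output_file"` in the shell version.
        raise FileExistsError(f"{output} already exists; refusing to overwrite it")
    return ["pdfunite", *map(str, inputs), str(output)]


def merge_with_pdfunite(inputs, output):
    """Run pdfunite; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_pdfunite_command(inputs, output), check=True)


# Example call (assumes these files exist and pdfunite is installed):
# merge_with_pdfunite(["in-1.pdf", "in-2.pdf", "in-n.pdf"], "out.pdf")
```

Keeping the command construction separate from the subprocess call makes the overwrite check easy to exercise without running pdfunite.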
Try the good ghostscript:
gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=merged.pdf mine1.pdf mine2.pdf
or, for better results with low-resolution PDFs (thanks to Adriano for pointing this out):
gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress -sOutputFile=merged.pdf mine1.pdf mine2.pdf
In both cases the output resolution is much higher than what you get using convert like this:
convert -density 300x300 -quality 100 mine1.pdf mine2.pdf merged.pdf
This way you don't need to install anything else; you just work with what is already installed on your system (at least both come by default on my box).
UPDATE: First of all, thanks for all your nice comments! A tip that may work for you: after some googling, I found a superb trick to shrink the size of PDFs. With it I reduced one PDF from 300 MB to just 15 MB at an acceptable resolution, and all of this with good old Ghostscript. Here it is:
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/default -dNOPAUSE -dQUIET -dBATCH -dDetectDuplicateImages -dCompressFonts=true -r150 -sOutputFile=output.pdf input.pdf
I'm sorry, I managed to find the answer myself using google and a bit of luck : )
For those interested:
I installed pdftk (the PDF toolkit) on our Debian server, and the following command produced the desired output:
pdftk file1.pdf file2.pdf cat output output.pdf
OR
gs -q -sPAPERSIZE=letter -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=output.pdf file1.pdf file2.pdf file3.pdf ...
This in turn can be piped directly into pdf2ps.
This is the easiest solution if you have multiple files and do not want to type in the names one by one:
qpdf --empty --pages *.pdf -- out.pdf
Also, pdfjoin a.pdf b.pdf will create a new b-joined.pdf with the contents of a.pdf and b.pdf.
pdfunite is fine for merging entire PDFs. If you want, for example, pages 2-7 from file1.pdf and pages 1,3,4 from file2.pdf, you have to use pdfseparate to split the files into separate PDFs for each page, and then give those to pdfunite.
At that point you probably want a program with more options. qpdf is the best utility I've found for manipulating PDFs. pdftk is bigger and slower, and Red Hat/Fedora don't package it because of its dependency on gcj. Other PDF utilities have Mono or Python dependencies. I found that qpdf produced a much smaller output file than pdfseparate and pdfunite when assembling pages into a 30-page output PDF: 970kB vs. 1,6450 kB. Because it offers many more options, qpdf's command line is not as simple; the original request to merge file1 and file2 can be performed with
qpdf --empty --pages file1.pdf file2.pdf -- merged.pdf
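For the page-selection case mentioned above (pages 2-7 from file1.pdf and pages 1,3,4 from file2.pdf), qpdf accepts an optional page range after each input file inside --pages ... --. The following is a rough sketch that only assembles such a command line; the helper name build_qpdf_command is hypothetical, and it assumes qpdf is on the PATH when the command is actually run.

```python
import subprocess


def build_qpdf_command(selections, output):
    """Build a qpdf command from (filename, page_range) pairs.

    page_range is a qpdf range string such as "2-7" or "1,3,4";
    use None to take every page of that file.
    """
    args = ["qpdf", "--empty", "--pages"]
    for filename, page_range in selections:
        args.append(filename)
        if page_range:
            args.append(page_range)
    args += ["--", output]
    return args


cmd = build_qpdf_command(
    [("file1.pdf", "2-7"), ("file2.pdf", "1,3,4")],
    "merged.pdf",
)
print(" ".join(cmd))
# To execute it (requires qpdf and the input files):
# subprocess.run(cmd, check=True)
```

The printed line is the command you would otherwise type by hand, so it can be checked before anything is run.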
You can use the convert command directly,
e.g.
convert sub1.pdf sub2.pdf sub3.pdf merged.pdf
Use pdftools from PyPI.
Download the tar.gz file, uncompress it, and run a command like the one below:
python pdftools-1.1.0/pdfmerge.py -o output.pdf -d file1.pdf file2.pdf file3
You should install python3 before you run the above command.
This tool supports the following operations:
- Add
- Insert
- Remove
- Rotate
- Split
- Merge
- Zip
You can find more details on GitHub; it is open source.
Apache PDFBox http://pdfbox.apache.org/
PDFMerger This application will take a list of pdf documents and merge them, saving the result in a new document.
usage: java -jar pdfbox-app-x.y.z.jar PDFMerger "Source PDF files (2 ..n)" "Target PDF file"
You can use sejda-console, free and open source.
Unzip it and run sejda-console merge -f file1.pdf file2.pdf -o merged.pdf
It preserves bookmarks, link annotations, AcroForms, etc. It actually has quite a lot of options you can play with; just run sejda-console merge -h to see them all.
I am biased being one of the developers of PyMuPDF (a Python binding of MuPDF).
You can easily do what you want with it (and much more). Skeleton code works like this:
#-------------------------------------------------
import fitz                      # the binding PyMuPDF
fout = fitz.open()               # new PDF for joined output
flist = ["1.pdf", "2.pdf"]       # ... list of filenames to be joined
for f in flist:
    fin = fitz.open(f)           # open an input file
    fout.insert_pdf(fin)         # append f (called insertPDF in older PyMuPDF releases)
    fin.close()
fout.save("joined.pdf")
#-------------------------------------------------
That's about it. Several options are available for selecting only page ranges, maintaining a joint table of contents, reversing page sequence, changing page rotation, etc.
We are on PyPI.
Although it's not a command-line solution, it may help macOS users:
- Select your PDF files
- Right-click on your highlighted files
- Select Quick Actions > Create PDF
I second the pdfunite recommendation. I was, however, getting "Argument list too long" errors as I was attempting to merge more than 2k PDF files.
I turned to Python for this, with two external packages: PyPDF2 (to handle all things PDF related) and natsort (to do a "natural" sort of the directory's file names). In case this can help someone:
from pathlib import Path

from PyPDF2 import PdfMerger
import natsort

DIR = Path("dir-with-pdfs/")
OUTPUT = "output.pdf"

paths = DIR.glob("*.pdf")
paths = natsort.natsorted(paths)

merger = PdfMerger()
for path in paths:
    merger.append(path)
merger.write(OUTPUT)
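If installing natsort is not an option, a "natural" ordering can be approximated with the standard library alone. This is a rough sketch, not a drop-in replacement for natsort: the helper name natural_key is hypothetical, and it only handles plain runs of digits (no signs, decimals, or locale rules).

```python
import re


def natural_key(name):
    """Sort key that compares digit runs as integers and other text case-insensitively."""
    parts = re.split(r"(\d+)", str(name))
    # re.split with a capturing group alternates text, digits, text, ...
    # so odd positions always hold the digit runs.
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


names = ["page10.pdf", "page2.pdf", "Page1.pdf"]
print(sorted(names, key=natural_key))
```

With a plain sorted(names), "page10.pdf" would come before "page2.pdf"; the key above puts the files in numeric order instead. It can be passed to sorted() in place of the natsort call, for example sorted(DIR.glob("*.pdf"), key=natural_key).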
If you want to convert all the downloaded images into one pdf then execute
convert img{0..19}.jpg slides.pdf
I used qpdf from the terminal and it worked for me on both Windows (MobaXterm) and Linux. For example, the command to join A.pdf with B.pdf into a new file C.PDF is:
qpdf --empty --pages oficios/A.pdf informes/B.pdf -- salida/C.PDF
If you need more documentation, see https://net2.com/how-to-merge-or-split-pdf-files-on-linux/
You can use the free and open source pdftools (disclaimer: I am the author of it).
It is basically a Python interface to the LaTeX pdfpages package.
To merge pdf files one by one, you can run:
pdftools --input-file file1.pdf --input-file file2.pdf --output output.pdf
To merge together all the pdf files in a directory, you can run:
pdftools --input-dir ./dir_with_pdfs --output output.pdf
Here is a Bash script which checks for merging errors.
I had the problem that a few PDF merges produced error messages. As it takes quite a lot of trial and error to find the corrupt PDFs, I wrote a script for it.
The following Bash script merges all available PDFs in a folder one by one and prints a success status after each merge. Just copy it into the folder with the PDFs and execute it from there.
#!/bin/bash
PDFOUT=_all_merged.pdf
rm -f "${PDFOUT}"
for f in *.pdf
do
    printf "processing %-50s" "$f ..." >&2
    if [ -f "$PDFOUT" ]; then
        # https://stackoverflow.com/questions/8158584/ghostscript-to-merge-pdfs-compresses-the-result
        # -dPDFSETTINGS=/prepress
        status=$(gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile="${PDFOUT}.new" "${PDFOUT}" "$f" 2> /dev/null)
        if [ "$status" ]
        then
            echo "gs ERROR: $status" >&2
        else
            echo "successful" >&2
        fi
        mv "${PDFOUT}.new" "${PDFOUT}"
    else
        cp "$f" "${PDFOUT}"
        echo "successful" >&2
    fi
done
example output:
processing inp1.pdf ... successful
processing inp2.pdf ... successful
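The same one-by-one idea can be sketched in Python, returning the list of inputs that failed instead of printing a status line. This is a sketch under assumptions: the names gs_append and merge_one_by_one are hypothetical, a merge step counts as failed when gs exits non-zero or prints anything on stdout (mirroring the check in the script), and, unlike the script, a failed step is discarded so the previous merge result is kept.

```python
import shutil
import subprocess
from pathlib import Path


def gs_append(merged, new_pdf, tmp):
    """Merge `merged` and `new_pdf` into `tmp` with the gs flags used in the script."""
    result = subprocess.run(
        ["gs", "-dBATCH", "-dNOPAUSE", "-q", "-sDEVICE=pdfwrite",
         f"-sOutputFile={tmp}", str(merged), str(new_pdf)],
        capture_output=True, text=True,
    )
    # Assumption: any stdout text or a non-zero exit status means trouble.
    return result.returncode == 0 and not result.stdout.strip()


def merge_one_by_one(pdfs, output, append=gs_append):
    """Append PDFs to `output` one at a time; return the inputs that failed."""
    output = Path(output)
    tmp = output.with_name(output.name + ".new")
    output.unlink(missing_ok=True)            # start fresh, like `rm -f`
    failed = []
    for pdf in pdfs:
        if not output.exists():
            shutil.copyfile(pdf, output)      # the first file seeds the output
        elif append(output, pdf, tmp):
            tmp.replace(output)               # keep the new merge result
        else:
            failed.append(pdf)                # previous output stays untouched
            tmp.unlink(missing_ok=True)
    return failed
```

The append step is passed in as a parameter so that a different tool (or a stub) can be substituted for Ghostscript.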
Here's a method I use which works and is easy to implement. This will require both the fpdf and fpdi libraries which can be downloaded here:
- FPDF: http://www.fpdf.org/en/download.php
- FPDI: https://www.setasign.com/products/fpdi/downloads
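require('fpdf.php');
require('fpdi.php');
$files = ['doc1.pdf', 'doc2.pdf', 'doc3.pdf'];
$pdf = new FPDI();
foreach ($files as $file) {
    // setSourceFile() returns the number of pages in the source document
    $pageCount = $pdf->setSourceFile($file);
    // import every page of the file, not only the first one
    for ($pageNo = 1; $pageNo <= $pageCount; $pageNo++) {
        $tpl = $pdf->importPage($pageNo, '/MediaBox');
        $pdf->addPage();
        $pdf->useTemplate($tpl);
    }
}
$pdf->Output('F','merged.pdf');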
I like Chasmo's idea, but I prefer to take advantage of things like
convert $(ls *.pdf) ../merged.pdf
Giving multiple source files to convert merges them into one common PDF. This command merges all files with the .pdf extension in the current directory into merged.pdf in the parent directory.
PdfCpu works great (the first argument is the output file):
pdfcpu merge c.pdf a.pdf b.pdf
https://pdfcpu.io/core/merge
If you want to join all PDF files in a directory with Ghostscript, you can use find to do just that. Here's an example:
find . -name '*.pdf' -exec gs -q -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress -sOutputFile=../out.pdf {} +
This finds all PDFs in the current directory (including subdirectories) and creates out.pdf in the parent directory. Note that find does not return files in sorted order. It might be useful if you are looking for a quick way to process an entire directory with Ghostscript.
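If the page order matters, the file list can be sorted before it is handed to Ghostscript. The sketch below only builds the command line with the same pdfwrite flags; the helper name build_gs_merge_command is hypothetical, it looks at the top level of the directory only (unlike find), and it assumes gs is on the PATH when the command is run.

```python
import subprocess
from pathlib import Path


def build_gs_merge_command(directory, output):
    """Build a gs command merging every *.pdf in `directory`, sorted by name."""
    pdfs = sorted(Path(directory).glob("*.pdf"))   # non-recursive, name order
    if not pdfs:
        raise FileNotFoundError(f"no PDF files found in {directory}")
    # Keep `output` outside `directory`, or it would be picked up as an input
    # on the next run.
    return ["gs", "-q", "-dBATCH", "-dNOPAUSE", "-sDEVICE=pdfwrite",
            "-dPDFSETTINGS=/prepress", f"-sOutputFile={output}",
            *map(str, pdfs)]


# Example call (requires Ghostscript and at least one PDF in the directory):
# subprocess.run(build_gs_merge_command(".", "../out.pdf"), check=True)
```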
pdfconcat -o out.pdf 1.pdf 2.pdf
"pdfconcat is a small and fast command-line utility written in ANSI C that can concatenate (merge) several PDF files into a long PDF document."
Yet another option, useful if you also want to select pages inside the documents to be merged:
pdfjoin image.jpg '-' doc_only_first_pages.pdf '1,2' doc_with_all_pages.pdf '-'
It comes with the texlive-extra-utils package.