How do I reverse the order of the pages in a pdf file using pyPdf?
I have a pdf file "myFile.pdf". I would like to reverse the order of its pages using pyPdf. How?
from pyPdf import PdfFileWriter, PdfFileReader
output_pdf = PdfFileWriter()

with open(r'input.pdf', 'rb') as readfile:
    input_pdf = PdfFileReader(readfile)
    total_pages = input_pdf.getNumPages()
    for page in xrange(total_pages - 1, -1, -1):
        output_pdf.addPage(input_pdf.getPage(page))
    with open(r'output.pdf', "wb") as writefile:
        output_pdf.write(writefile)
Thanks for sharing these suggestions. I used them and edited the code a bit so that selecting and saving the file is done through graphical dialogs. I'm new to all of this, so what I added might not be efficient or clean, but it worked for me and I thought I'd share it.
from PyPDF2 import PdfFileWriter, PdfFileReader
import tkinter as tk
from tkinter import filedialog
import ntpath
import os

output_pdf = PdfFileWriter()

# return the directory part of the given file path
def path_leaf(path):
    head, tail = ntpath.split(path)
    return head

# graphical file selection
def grab_file_path():
    file_dialog_window = tk.Tk()
    file_dialog_window.withdraw()  # hides the tk.Tk() window
    # use dialog to select file
    grabbed_file_path = filedialog.askopenfilename()
    return grabbed_file_path

# file to be reversed
filePath = grab_file_path()

# open file and read
with open(filePath, 'rb') as readfile:
    input_pdf = PdfFileReader(readfile)

    # reverse order one page at a time
    for page in reversed(input_pdf.pages):
        output_pdf.addPage(page)

    # graphical way to choose where to save, starting at the input file's location
    dirOfFileToBeSaved = path_leaf(filePath)
    locationOfFileToBeSaved = filedialog.asksaveasfilename(
        initialdir=dirOfFileToBeSaved,
        initialfile='name of reversed file.pdf',
        title="Select or type file name and location",
        filetypes=[("pdf files", "*.pdf")])

    # write the file created (the input file must still be open here)
    with open(locationOfFileToBeSaved, "wb") as writefile:
        output_pdf.write(writefile)

# open the file when done (os.startfile is Windows-only)
os.startfile(locationOfFileToBeSaved)
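One caveat on the last line: os.startfile only exists on Windows. If the script needs to run elsewhere, something along these lines might do. This is only a sketch: opener_command and open_with_default_app are hypothetical helper names, not part of PyPDF2 or the answer above, and the Linux branch assumes a desktop where xdg-open is installed.

```python
import os
import subprocess
import sys


def opener_command(path, platform=sys.platform):
    """Return the command used to open `path`, or None on Windows.

    Hypothetical helper for illustration only.
    """
    if platform == "win32":
        return None  # Windows uses os.startfile rather than a command
    if platform == "darwin":
        return ["open", path]  # macOS
    return ["xdg-open", path]  # assumption: xdg-utils is available


def open_with_default_app(path):
    """Open `path` in the system's default application."""
    command = opener_command(path)
    if command is None:
        os.startfile(path)  # only available on Windows
    else:
        subprocess.run(command, check=False)
```

With that in place, the final line of the script could call open_with_default_app(locationOfFileToBeSaved) instead of os.startfile directly.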
pyPdf stopped being updated long before January 2019. In my testing it is not compatible with Python 3.6, and it likely does not work with Python 3 at all:
In [1]: import pyPdf
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-1-bba5a42e9137> in <module>
----> 1 import pyPdf

c:\temp\envminecart\lib\site-packages\pyPdf\__init__.py in <module>
----> 1 from pdf import PdfFileReader, PdfFileWriter
      2 __all__ = ["pdf"]

ModuleNotFoundError: No module named 'pdf'
(Moving the __all__ assignment above the import fixes this specific problem, but other SyntaxErrors due to Python 2 syntax then pop up.)
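For what it's worth, the traceback is consistent with Python 3 having dropped implicit relative imports (PEP 328): inside a package, "from pdf import ..." is looked up as a top-level module rather than as the sibling file. A small way to see that behaviour without installing pyPdf is sketched below. The package and module names (_demo_pkg_implicit, _demo_pdf_backend) are made up for the demonstration, and this does not claim to reproduce everything that goes wrong in pyPdf itself.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def implicit_import_error():
    """Build a throwaway package that imports a sibling module the
    Python 2 way, and return the error Python 3 reports (or None)."""
    with tempfile.TemporaryDirectory() as tmp:
        package_dir = Path(tmp) / "_demo_pkg_implicit"
        package_dir.mkdir()
        # Sibling module living inside the package.
        (package_dir / "_demo_pdf_backend.py").write_text("VALUE = 1\n")
        # No leading dot: Python 3 searches for a top-level module
        # instead of the sibling file next to __init__.py.
        (package_dir / "__init__.py").write_text(
            "from _demo_pdf_backend import VALUE\n"
        )
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module("_demo_pkg_implicit")
            return None
        except ModuleNotFoundError as error:
            return str(error)
        finally:
            # Leave the interpreter state as it was found.
            sys.path.remove(tmp)
            sys.modules.pop("_demo_pkg_implicit", None)


print(implicit_import_error())
```

Writing the import as "from ._demo_pdf_backend import VALUE" (with the leading dot) is the Python 3 spelling of the same thing.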
Fortunately, its successor project, PyPDF2, works cleanly on Python 3.6 (at least). It appears the core user-facing API was intentionally kept compatible with pyPdf, so nosklo's answer can be used in modern Python after a pip install PyPDF2 just by changing pyPdf to PyPDF2 in the import statement and switching xrange to range:
from PyPDF2 import PdfFileWriter, PdfFileReader
output_pdf = PdfFileWriter()

with open(r'input.pdf', 'rb') as readfile:
    input_pdf = PdfFileReader(readfile)
    total_pages = input_pdf.getNumPages()
    for page in range(total_pages - 1, -1, -1):
        output_pdf.addPage(input_pdf.getPage(page))
    with open(r'output.pdf', "wb") as writefile:
        output_pdf.write(writefile)
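If the three-argument range looks opaque, the index arithmetic can be checked on its own, without any PDF library. The helper name reversed_indices below is made up for this illustration:

```python
def reversed_indices(total_pages):
    """Indices from the last page down to 0, as in the loop above."""
    # start at the last index, stop before -1, step backwards by 1
    return list(range(total_pages - 1, -1, -1))


# A 4-page document is visited as pages 3, 2, 1, 0.
print(reversed_indices(4))

# The same sequence comes from reversing an ascending range.
print(reversed_indices(4) == list(reversed(range(4))))
```

An empty document gives an empty list, so the loop body simply never runs in that case.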
I would also recommend the more Pythonic approach of just iterating over the pages directly using reversed:
from PyPDF2 import PdfFileWriter, PdfFileReader
output_pdf = PdfFileWriter()

with open('input.pdf', 'rb') as readfile:
    input_pdf = PdfFileReader(readfile)

    for page in reversed(input_pdf.pages):
        output_pdf.addPage(page)

    with open('output.pdf', "wb") as writefile:
        output_pdf.write(writefile)
I don't know if this .pages collection was available in the original pyPdf, but arguably it doesn't really matter much at this point.
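As for why reversed works on .pages at all: the built-in accepts any object that supports len() and integer indexing, even if it is not a real list. The sketch below uses a made-up FakePages class as a stand-in, on the assumption that .pages behaves like such a sequence; it is not PyPDF2's actual implementation.

```python
class FakePages:
    """Hypothetical stand-in for a lazily loaded page collection."""

    def __init__(self, labels):
        self._labels = list(labels)

    def __len__(self):
        return len(self._labels)

    def __getitem__(self, index):
        # Items are looked up one at a time, on demand.
        return self._labels[index]


pages = FakePages(["page 1", "page 2", "page 3"])

# reversed() walks the indices from len(pages) - 1 down to 0.
print(list(reversed(pages)))
```

No __reversed__ method is needed for this; __len__ and __getitem__ are enough.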