Java: combining 2000-5000 PDFs into one using iText yields OutOfMemoryError
I have been eyeballing this code for a long time, trying to reduce the amount of memory it uses, and it still throws java.lang.OutOfMemoryError: Java heap space. As a last resort, I want to ask the community how I can improve this code to avoid the OutOfMemoryError.
I have a driver/manifest file (a .txt file) that contains information about the PDFs. I have about 2000-5000 PDFs inside a zip file that I need to combine. Before combining them, I need to add 2-3 more pages to each PDF. A Manifest object holds information about one PDF.
try {
    blankPdf = new PdfReader(new FileInputStream(config.getBlankPdf()));
    mdxBacker = new PdfReader(new FileInputStream(config.getMdxBacker()));
    theaBacker = new PdfReader(new FileInputStream(config.getTheaBacker()));
    mdxAffidavit = new PdfReader(new FileInputStream(config.getMdxAffidavit()));
    theaAffidavit = new PdfReader(new FileInputStream(config.getTheaAffidavit()));
    ImmutableList<Manifest> manifestList = //Read manifest file and obtain List<Manifest>
    File zipFile = new File(config.getInputDir() + File.separator + zipName);
    //Extracting PDF into `process` folder
    ZipUtil.extractAll(config.getExtractPdfDir(), zipFile);
    outputPdfName = zipName.replace(".zip", ".pdf");
    outputZipStream = new FileOutputStream(config.getOutputDir() +
            File.separator + outputPdfName);
    document = new Document(PageSize.LETTER, 0, 0, 0, 0);
    writer = new PdfCopy(document, outputZipStream);
    document.open(); //Open the document
    //Start combining PDF files together
    for (Manifest m : manifestList) {
        //Obtain full path to the current pdf
        String pdfFilePath = config.getExtractPdfDir() + File.separator + m.getPdfName();
        //Before combining PDF, add backer and affidavit to individual PDF
        PdfReader pdfReader = PdfUtil.addBackerAndAffidavit(config, pdfType, m,
                pdfFilePath, blankPdf, mdxBacker, theaBacker, mdxAffidavit,
                theaAffidavit);
        for (int pageNumber = 1; pageNumber <= pdfReader.getNumberOfPages(); pageNumber++) {
            document.newPage();
            PdfImportedPage page = writer.getImportedPage(pdfReader, pageNumber);
            writer.addPage(page);
        }
    }
} catch (DocumentException e) {
} catch (IOException e) {
} finally {
    if (document != null) document.close();
    try {
        if (outputZipStream != null) outputZipStream.close();
        if (writer != null) writer.close();
    } catch (IOException e) {
    }
}
Please rest assured that I have looked at this code for a long time and have tried rewriting it many times to reduce the amount of memory it uses. After the OutOfMemoryError, there are still lots of PDF files that have not had the 2-3 extra pages added, so I think the problem is inside addBackerAndAffidavit. However, I try to close every resource I open, and it still throws the exception. Please help.
You need to invoke PdfWriter#freeReader() at the end of every loop iteration to free the PdfReader involved. PdfCopy also has freeReader(); it overrides the method from PdfWriter and does the same thing. See also the javadoc:
freeReader
public void freeReader(PdfReader reader) throws IOException
Description copied from class: PdfWriter
Use this method to writes the reader to the document and free the memory used by it. The main use is when concatenating multiple documents to keep the memory usage restricted to the current appending document.
Overrides: freeReader in class PdfWriter
Parameters: reader - the PdfReader to free
Throws: IOException - on error
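In the loop from the question, that means calling writer.freeReader(pdfReader) right after the inner page loop, once all pages of the current PDF have been added. To illustrate why this helps, here is a toy model in plain Java. It is not iText code: ToyReader and ToyCopy are hypothetical stand-ins for PdfReader and PdfCopy, and the idea that the copier keeps per-reader state until freeReader is called is an assumption made for the sketch, consistent with the javadoc above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FreeReaderSketch {

    /** Hypothetical stand-in for PdfReader: only holds some page data. */
    static final class ToyReader {
        final byte[][] pages;

        ToyReader(int pageCount) {
            pages = new byte[pageCount][1024];
        }

        int getNumberOfPages() {
            return pages.length;
        }
    }

    /** Hypothetical stand-in for PdfCopy: keeps per-reader state until freed. */
    static final class ToyCopy {
        private final Map<ToyReader, List<Integer>> importedByReader = new HashMap<>();
        private int pagesWritten = 0;

        void addPage(ToyReader reader, int pageNumber) {
            // Remember which pages came from which reader (assumed behaviour).
            importedByReader.computeIfAbsent(reader, r -> new ArrayList<>()).add(pageNumber);
            pagesWritten++;
        }

        void freeReader(ToyReader reader) {
            // Drop everything held on behalf of this reader.
            importedByReader.remove(reader);
        }

        int retainedReaders() {
            return importedByReader.size();
        }

        int pagesWritten() {
            return pagesWritten;
        }
    }

    /** Merges fileCount toy documents and returns the peak number of readers retained. */
    static int merge(ToyCopy writer, int fileCount, int pagesPerFile, boolean free) {
        int peak = 0;
        for (int i = 0; i < fileCount; i++) {
            ToyReader reader = new ToyReader(pagesPerFile);
            for (int p = 1; p <= reader.getNumberOfPages(); p++) {
                writer.addPage(reader, p);
            }
            peak = Math.max(peak, writer.retainedReaders());
            if (free) {
                // The counterpart of writer.freeReader(pdfReader) in the real loop.
                writer.freeReader(reader);
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("peak without freeReader: " + merge(new ToyCopy(), 2000, 3, false));
        System.out.println("peak with freeReader:    " + merge(new ToyCopy(), 2000, 3, true));
    }
}
```

In this model the peak is 2000 retained readers without the call and 1 with it, which is the "restricted to the current appending document" behaviour the javadoc describes. How much memory the real PdfCopy holds per reader is not something this sketch measures.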