Whole PDF compression
I'm working on a tool that writes PDFs, and I'm trying to find a way to compress the objects and streams in the PDF. A number of the PDFs that I'm generating are fairly large, but they can be reduced substantially by compressing the objects (or most of the PDF structure) into a Flate stream. I swear I've seen this done before, but none of the PDFs that I've looked at seem to do it. I also tried using Acrobat X to compress a file with "entire file compression", but it seems to compress only the streams.
I've tried using object streams (ObjStm), but few readers other than Adobe's support them. I need something with broader support outside of Adobe.
Any suggestions are appreciated!
In PDF you can have two types of compression:
- stream compression - the stream data is compressed using one of several filters, but the PDF file structure is not compressed.
- object compression - the file structure is compressed as well: objects that are not themselves streams are packed into compressed object streams.
These are the only supported compression scenarios in PDF. Selecting the right compression method depends largely on the data you want to compress: page content streams usually use Flate compression, 1bpp images use CCITT G4 or, better, JBIG2, color images compress better with JPEG2000, etc.
Object compression has been available since Acrobat 6 (PDF 1.5).
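As a rough illustration of the stream compression case, here is a minimal Python sketch that Flate-compresses a page content stream and wraps it as an indirect stream object. The function name `flate_stream_object`, the object number, and the sample content are made up for the example; it only shows the `/Filter /FlateDecode` mechanics, not a complete PDF writer.

```python
import zlib


def flate_stream_object(obj_num: int, data: bytes) -> bytes:
    """Return `data` as an indirect PDF stream object using /FlateDecode.

    Hypothetical helper for illustration only.
    """
    compressed = zlib.compress(data, 9)
    header = b"%d 0 obj\n<< /Length %d /Filter /FlateDecode >>\nstream\n" % (
        obj_num,
        len(compressed),
    )
    return header + compressed + b"\nendstream\nendobj\n"


# Repetitive content-stream operators compress well with Flate.
content = b"BT /F1 12 Tf 72 720 Td (Hello) Tj ET\n" * 200
stream_obj = flate_stream_object(4, content)
print(len(content), "bytes raw ->", len(stream_obj), "bytes as a Flate stream object")
```

Only the bytes between `stream` and `endstream` are compressed here; the object header and the rest of the file structure stay as plain text, which is exactly the limitation described in the question.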
You could also look at whether you can reduce the size of the data itself (i.e., are the fonts subsetted, are the images at the optimum dpi, does the file contain dead objects?).
Check out the PDF Specification, sections 7.5.7 (Object Streams) and 7.5.8 (Cross-Reference Streams).
I'm positive that iText can read and write these files, but I never use it that way because the results are significantly harder to debug. There might be a sample PDF... but I don't see any.
I was hoping one of the iText in Action 2nd edition samples covered object streams, but didn't find one.
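For reference, the object stream layout from section 7.5.7 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not iText's implementation: the helper name `build_object_stream` and the sample objects are hypothetical, every packed object is assumed to be a non-stream object with generation number 0, and the cross-reference stream that a reader needs to locate the packed objects (section 7.5.8) is not written.

```python
import zlib


def build_object_stream(obj_num: int, objects: dict) -> bytes:
    """Pack non-stream objects into one Flate-compressed /ObjStm object.

    `objects` maps object number -> serialized object body (bytes),
    written without the surrounding "obj" / "endobj" keywords.
    Hypothetical helper for illustration only.
    """
    pairs = []
    bodies = []
    offset = 0
    for num, body in objects.items():
        # Each index entry is "object number, offset relative to /First".
        pairs.append(b"%d %d" % (num, offset))
        bodies.append(body)
        offset += len(body) + 1  # +1 for the newline that separates bodies
    index = b" ".join(pairs) + b"\n"
    payload = index + b"\n".join(bodies)
    compressed = zlib.compress(payload, 9)
    header = (
        b"%d 0 obj\n<< /Type /ObjStm /N %d /First %d /Length %d"
        b" /Filter /FlateDecode >>\nstream\n"
    ) % (obj_num, len(objects), len(index), len(compressed))
    return header + compressed + b"\nendstream\nendobj\n"


sample_objects = {
    1: b"<< /Type /Catalog /Pages 2 0 R >>",
    2: b"<< /Type /Pages /Kids [] /Count 0 >>",
}
objstm = build_object_stream(10, sample_objects)
print(objstm[: objstm.index(b"stream\n")].decode("ascii"))
```

Because the object bodies and their index sit inside a single Flate stream, the dictionaries compress together, which is where the size reduction comes from. It is also why these files are harder to debug by hand, and why readers without PDF 1.5 support cannot open them.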