Linux PDF/Postscript Optimizing

I have a reporting system built with Java and iText. The PDF templates are created in Scribus, and the Java code merges the data into the document using iText. The files are then copied to an NFS share, and a Bash script prints them.

I use acroread to convert them to PS, then lpr the PS.
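
For reference, the convert-then-print step might look roughly like the sketch below. The actual script is not shown here, so the function name print_reports and the spool directory argument are hypothetical. The sketch also assumes acroread's -toPostScript mode accepts -pairs to name the output file explicitly.

```shell
#!/bin/bash
# Sketch of the convert-then-print step. print_reports and its spool
# directory argument are hypothetical names, not taken from the real script.

print_reports() {
  local spool_dir="$1" pdf ps
  for pdf in "$spool_dir"/*.pdf; do
    [ -e "$pdf" ] || continue   # the glob matched nothing
    ps="${pdf%.pdf}.ps"
    # Assumption: -pairs takes an explicit "input.pdf output.ps" pair.
    if acroread -toPostScript -pairs "$pdf" "$ps" && [ -s "$ps" ]; then
      lpr "$ps"                 # send the PostScript file to the default printer
    else
      echo "conversion failed: $pdf" >&2
    fi
  done
}
```

Checking that the PS file is non-empty before calling lpr keeps a failed conversion from queuing a blank job.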

The FOSS application pdftops is horribly inefficient.

My main problem is that the PDFs generated using iText/Scribus are very large. I've recently run into a problem where acroread crashes because it hits 4 GB of memory usage on large (300+ page) documents. (Adobe is painfully slow at moving things to 64-bit.)

Now, I can use Adobe Reader on Windows with the Reduce File Size option (or whatever it's called), and it greatly reduces the size of the PDF (by more than 10x). It appears to remove a lot of metadata about form fields and the like, and it produces a PDF that is basically a print image.

My question: does anyone know of a good solution or program for doing something similar on Linux? Ideally, it would optimize the PDF, reduce its size, and reduce the PS complexity so the printer could print faster. Right now it takes about 15-20 seconds per page to print.


To reduce the size of a PDF file, use pdfsizeopt, the software I am developing. pdfsizeopt runs on Linux, Mac OS X, Windows (and possibly on other systems as well).

pdfsizeopt has lots of dependencies, so it might be a bit cumbersome to install (about 10 minutes of your time). I'm working on making installation easier.

If you need something quickly, you can try one of its dependencies: Multivalent's tool.pdf.Compress, which is a pure Java tool.

Get Multivalent20060102.jar, install Java, and run:

java -cp Multivalent20060102.jar tool.pdf.Compress input.pdf
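
If several reports need compressing, the same command can be wrapped in a loop. The following is only a sketch: the MULTIVALENT_JAR variable and both function names are made up for illustration, and it assumes Compress writes its result next to the input as input-o.pdf.

```shell
#!/bin/bash
# Sketch: run Multivalent's tool.pdf.Compress over several PDFs.
# MULTIVALENT_JAR and both function names are illustrative assumptions.
MULTIVALENT_JAR="${MULTIVALENT_JAR:-Multivalent20060102.jar}"

# Assumed output name: input.pdf -> input-o.pdf, in the same directory.
compressed_name() {
  echo "${1%.pdf}-o.pdf"
}

compress_all() {
  local pdf
  for pdf in "$@"; do
    # Same invocation as the single-file command above.
    if java -cp "$MULTIVALENT_JAR" tool.pdf.Compress "$pdf"; then
      echo "compressed: $pdf -> $(compressed_name "$pdf")"
    else
      echo "failed: $pdf" >&2
    fi
  done
}
```

Usage would be something like `compress_all /path/to/reports/*.pdf`, with the path adjusted to wherever the generated files live.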

There are limitations on what gs -sDEVICE=pdfwrite can do:

  • it can't generate xref streams (so the PDF will be larger than necessary)
  • it can't generate object streams (so the PDF will be larger than necessary)
  • it doesn't deduplicate images or other objects (i.e., if the same image appears multiple times in the input PDF, gs makes a copy in the output for each occurrence)
  • it emits images suboptimally
  • it re-samples images to low resolution
  • it sometimes omits hyperlinks in the PDF
  • it can't convert some constructs (so the output PDF may be visually different from the input)

Neither pdfsizeopt nor Multivalent's tool.pdf.Compress suffers from these limitations.


gs \
  -dCompatibilityLevel=1.4 \
  -dPDFSETTINGS=/screen \
  -dNOPAUSE \
  -dBATCH \
  -sDEVICE=pdfwrite \
  -sOutputFile=output.pdf \
   input.pdf
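
To see how much the command above actually saves, it can be wrapped in a small script that compares file sizes before and after. This is a rough sketch: the function names are invented for illustration, and the preset is a parameter because /screen downsamples images to roughly 72 dpi, which may be too coarse for printed reports. /printer is one alternative preset worth trying.

```shell
#!/bin/bash
# Sketch: shrink a PDF with Ghostscript and report the size change.
# file_size, shrink_pdf and report_sizes are illustrative names only.

# Print the size of a file in bytes.
file_size() {
  wc -c < "$1" | tr -d ' '
}

# Same gs invocation as above; the third argument selects the preset
# (/screen by default, or /ebook, /printer, /prepress).
shrink_pdf() {
  local in="$1" out="$2" preset="${3:-/screen}"
  gs -dCompatibilityLevel=1.4 -dPDFSETTINGS="$preset" \
     -dNOPAUSE -dBATCH -dQUIET \
     -sDEVICE=pdfwrite -sOutputFile="$out" "$in"
}

# Print "before -> after bytes (N% of original)".
report_sizes() {
  local before after
  before=$(file_size "$1")
  after=$(file_size "$2")
  if [ "$before" -eq 0 ]; then
    echo "empty input: $1" >&2
    return 1
  fi
  echo "$before -> $after bytes ($(( after * 100 / before ))% of original)"
}
```

A typical run would be `shrink_pdf input.pdf output.pdf /printer && report_sizes input.pdf output.pdf`.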

Ghostscript seems to solve most of this issue. I now have a different problem with Ghostscript garbling the embedded fonts, but I'll open a new question for that.
