
How to convert an HTML web page into a PDF file using Java

I've been searching the internet for a way to convert an HTML page into a PDF file using Java. I found a lot of pointers, but in short, they either don't work or are too difficult to implement. I also downloaded a commercial product, pd4ml; its API is something I'd be happy to work with, except that when I crawled a simple Wikipedia page, I got an out-of-memory error (even with -Xmx set to 1024 MB). Some approaches suggest converting HTML -> XHTML -> FO -> PDF. However, I am getting a lot of exceptions from the XHTML-to-FO XSL file, and judging by the documentation, it's not something I have enough time to understand right now.

Here are my questions/concerns. 1. Is there another cohesive API out there that will easily convert HTML to PDF (commercial or not)? 2. Is there a way I can simply capture an HTML page and store it as a single file? This approach would be similar to Internet Explorer's way of saving a web page as a web archive (single file, MHT format).

Any help is appreciated. (BTW, I know this question has been asked repeatedly, but in addition to the original spirit of the question, I'm open to other ways.) Thanks.


Try wkhtmltopdf, which uses WebKit. Another option (the one I'm currently using) is OpenOffice, remote-controlled via macros.
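Since wkhtmltopdf is a command-line tool rather than a Java library, one way to call it from Java is to launch it as an external process. The following is only a minimal sketch: it assumes the `wkhtmltopdf` executable is installed and on the PATH, and the class name `WkHtmlToPdfRunner`, its helper methods, and the 60-second timeout are made up for illustration. It relies on the basic `wkhtmltopdf <input> <output>` invocation, where the input is a URL or a local HTML file.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper class; not part of wkhtmltopdf itself.
public class WkHtmlToPdfRunner {

    // Builds the command line: <executable> <input URL or file> <output PDF>.
    public static List<String> buildCommand(String executable, String source, Path outputPdf) {
        return Arrays.asList(executable, source, outputPdf.toString());
    }

    // Runs wkhtmltopdf and waits for it to finish.
    // Assumption: "wkhtmltopdf" can be resolved via the PATH.
    public static void convert(String source, Path outputPdf)
            throws IOException, InterruptedException {
        ProcessBuilder builder =
                new ProcessBuilder(buildCommand("wkhtmltopdf", source, outputPdf));
        // Forward the tool's progress and error messages to this JVM's console.
        builder.inheritIO();
        Process process = builder.start();

        // The 60-second limit is an arbitrary choice for this sketch.
        if (!process.waitFor(60, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            throw new IOException("wkhtmltopdf did not finish within 60 seconds");
        }
        if (process.exitValue() != 0) {
            throw new IOException("wkhtmltopdf exited with code " + process.exitValue());
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java WkHtmlToPdfRunner <url-or-html-file> <output.pdf>");
            return;
        }
        convert(args[0], Paths.get(args[1]));
    }
}
```

Because rendering happens in a separate process, the page is not loaded into the JVM heap, which may help with the out-of-memory problem mentioned in the question.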


You may use the iText open source Java lib for that, and read this

or use the YaHPConverter open source Java lib.

or do this with the help of ICEpdf, a popular open source lib

or use pd4ml, but it is not free; there is only a trial.

or use this, and this is the manual for it.


My 2 cents using open source tools:

You can capture a screenshot of the HTML page with Selenium or WebDriver and save it to an image file from your Java code. Once you have the image file, you can convert it to PDF, again from your Java code.

EDIT: It seems you can do all that in one step using iText's HTML to PDF.


I am not sure, but you could try:

1) Cobra HTML rendering engine http://lobobrowser.org/cobra.jsp

2) HTMLEditorKit, part of the JDK

3) JWebPane

Use the rendering kit to parse and render the HTML. The rendered output is a Swing component, which iText can then use to generate the PDF output.


You can try out Pdfcrowd. It is an easy-to-use commercial online API with many options and support for Java.

It can create PDFs either from web pages or from raw HTML code.
