Convert HTML to PDF
In reference to an earlier post (PDF Report generation), I have decided to use a solution similar to http://www.alistapart.com/articles/boom
For those of you who don't want to read either reference - I'm creating a report and need it as a PDF. I've decided to go the HTML -> PDF route using .NET.
So, let's say I get the HTML file exactly like I want it. What is the best way to convert said page to PDF? In essence, I'd like the user to see a "preview" in HTML and then be able to convert said page to PDF. The library I'm currently experimenting with is ABCPdf.
My first thought was to save the page to the filesystem and reference its URL in the conversion routine during an event handler on the page itself. This has its problems, because I'd have to save the page each time it was rendered in preparation to print it. Actually, it just seems backasswards.
My next thought was to use the page's Render method to capture the page as a byte stream and use that (since ABCPdf supports converting a stream of HTML). If this is my answer, I'm lost as to how to actually pull it off. Have a "Print" button whose handler does a Me.Render() to a byte stream and sends that to the conversion routine? Is that even possible?
Bottom line - once a page is rendered in nice HTML, how do you initiate a conversion of that page to PDF? Workarounds and other solutions are welcome.
I'm hoping I'm missing something obvious, as this has got to be "the easy part".
OK, got it working - and it was fairly simple. Just passing this along to the next guy who might need the answer. I used the page's Url property and passed it to the ABCPdf AddImageUrl() method. I also had to use chaining, since the output was more than one page. Thanks for all of the help.
' Render the current page's URL into a new PDF document.
Dim oPdfDoc As New Doc()
Dim iPageID As Int32
Dim MyUrl = Request.Url
iPageID = oPdfDoc.AddImageUrl(MyUrl.AbsoluteUri)

' Chain onto additional PDF pages until all of the HTML has been placed.
While True
    oPdfDoc.FrameRect()
    If Not oPdfDoc.Chainable(iPageID) Then
        Exit While
    End If
    oPdfDoc.Page = oPdfDoc.AddPage()
    iPageID = oPdfDoc.AddImageToChain(iPageID)
End While

' Flatten each page before saving.
For i As Int32 = 1 To oPdfDoc.PageCount
    oPdfDoc.PageNumber = i
    oPdfDoc.Flatten()
Next

oPdfDoc.Save(Server.MapPath("test.pdf"))
oPdfDoc.Clear()
I was in the same situation as you, and after evaluating a lot of options, including iTextSharp and ABCPdf, I ended up with wkhtmltopdf: http://code.google.com/p/wkhtmltopdf/.
How do you do it from C#? You don't (directly).
Spawning a worker PRINCE.EXE process may be your only option.
PRINCE.EXE will read the HTML from "standard input" and send the PDF to "standard output". Use the command line "%dir%\PRINCE -" with no output file name.
You may find you need a separate COM component to spawn PRINCE, as the System.Management classes might not work for you. Use Visual Basic or C++ to make your COM component.
Putting HTML in the database is a bad idea generally, but may be okay in your case, as it sounds like it's essentially static.
EDIT
Changed "child PRINCE.EXE" to "worker PRINCE.EXE process". I have a funny feeling that PRINCE.EXE needs to not be a child process.
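For what it's worth, here is a minimal sketch of the "spawn a worker process and pipe HTML in, PDF out" idea, written as a VB.NET console program. It assumes System.Diagnostics.Process is usable in your environment, which is exactly the part this answer is unsure about, so treat it as something to try rather than a confirmed fix. ConvertHtmlToPdf is a hypothetical helper name, the executable path is taken from the command line rather than guessed, and the "-" argument is simply the one quoted above; check the converter's own documentation for the exact flags.

```vbnet
Imports System.Diagnostics
Imports System.IO
Imports System.Text
Imports System.Threading.Tasks

Module PrinceSketch
    ' Hypothetical helper: starts the converter, writes the HTML to its
    ' standard input, and returns whatever bytes it writes to standard output.
    Function ConvertHtmlToPdf(exePath As String, arguments As String, html As String) As Byte()
        Dim psi As New ProcessStartInfo(exePath, arguments)
        psi.UseShellExecute = False          ' required for stream redirection
        psi.RedirectStandardInput = True
        psi.RedirectStandardOutput = True
        psi.CreateNoWindow = True

        Using proc As Process = Process.Start(psi)
            Dim htmlBytes As Byte() = Encoding.UTF8.GetBytes(html)

            ' Feed stdin on a separate task so that a full output pipe
            ' cannot block the write while nothing is reading stdout.
            Dim writer As Task = Task.Run(
                Sub()
                    proc.StandardInput.BaseStream.Write(htmlBytes, 0, htmlBytes.Length)
                    proc.StandardInput.Close()   ' signals end of input
                End Sub)

            Using pdf As New MemoryStream()
                ' Read raw bytes; the PDF must not pass through a text decoder.
                proc.StandardOutput.BaseStream.CopyTo(pdf)
                writer.Wait()
                proc.WaitForExit()
                Return pdf.ToArray()
            End Using
        End Using
    End Function

    Sub Main(args As String())
        ' args(0): full path to the converter executable (assumption: supplied by the caller).
        Dim pdfBytes As Byte() = ConvertHtmlToPdf(args(0), "-", "<html><body><h1>Report</h1></body></html>")
        File.WriteAllBytes("test.pdf", pdfBytes)
    End Sub
End Module
```

In an ASP.NET page, the same helper could be called from a button handler and the returned bytes written to the response, provided the account the site runs under is allowed to start the executable.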