Converting HTML to PDF
15 Feb
When I’m looking at a web page with Safari, all I have to do is hit Command-P and save as PDF, and I have a nice PDF of the page.
Is there a way to automate this server-side on a standard Linux system? I need a web app to generate these PDFs on the fly. And they need to look good, with different designs, etc., so it’s more than just getting text to show on a page.
I’ve come across HTMLDOC, and it does the job in theory, but only for crummy HTML: it supports most of HTML 3.2, some of HTML 4.0, and no CSS. That’s not great if I also want to produce reasonable HTML output.
Are there any other options out there that you’re aware of? Is there code in WebKit for this, and would it be possible to get some of that to run on a Debian box? Hm, doesn’t sound likely.

First, render the HTML to PostScript. There are a few options for this. One I just found (but haven’t tried) is:
<a href="http://user.it.uu.se/~jan/html2ps.html">http://user.it.uu.se/~jan/html2ps.html</a>
The way I always used to do it was to run Netscape in batch print mode and direct the output to a file. I assume you can do the same with Firefox. The beauty of this approach is that you know you will get the same rendering as Firefox; the downside is that you can’t install Firefox on a server without X libraries, and it’s a fairly heavy way of doing it.
Once you have the PostScript, ps2pdf will make it into a PDF for you. It’s available as a default package in every Linux distro I’ve ever used.
Looks like Typo just typo’d the tilde in the URL. It should read:
http://user.it.uu.se/~jan/html2ps.html
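For what it’s worth, the two-step route described above (HTML to PostScript with html2ps, then PostScript to PDF with ps2pdf) could be driven from a web app roughly like this. This is only a sketch: it assumes both tools are installed and on the PATH, the function names build_commands and convert are made up for illustration, and the intermediate .ps file is simply written next to the PDF.

```python
import subprocess
from pathlib import Path


def build_commands(html_path, pdf_path):
    """Return the two command lines for the HTML -> PostScript -> PDF route."""
    # Assumption: the intermediate PostScript file sits next to the PDF.
    ps_path = str(Path(pdf_path).with_suffix(".ps"))
    # html2ps writes its PostScript output to the file named by -o.
    html_to_ps = ["html2ps", "-o", ps_path, str(html_path)]
    # ps2pdf (shipped with Ghostscript) takes an input and an output file.
    ps_to_pdf = ["ps2pdf", ps_path, str(pdf_path)]
    return [html_to_ps, ps_to_pdf]


def convert(html_path, pdf_path):
    """Run both steps in order; raises CalledProcessError if either fails."""
    for command in build_commands(html_path, pdf_path):
        subprocess.run(command, check=True)
```

Calling convert("page.html", "page.pdf") would then leave page.ps and page.pdf in the working directory. Keeping the command construction separate from the execution makes it easy to log or inspect exactly what will be run before handing it to the shell tools.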
Prince, which I haven’t tried, has a command line interface. Don’t know much about it. <a href="http://www.princexml.com/">http://www.princexml.com/</a>
Mark, thanks for the tip. It’s amazing how helpful you always are, and it’s deeply appreciated. Thank you!
Andreas, Prince looks like a perfect fit, except, of course, for the $3800 server license, which is a little steep. But at first glance, it looks clean.
Heh, that’s what I get for not looking at the price before commenting. :o)
You could install OpenOffice, run a vncserver so OpenOffice has a display to connect to in the background, and use a macro to convert HTML to PDF with OpenOffice (which would let you convert anything OpenOffice can read to PDF).
Hi Malte
Thanks for the tip.
Thanks to your suggestion, I found <a href="http://www.xml.com/pub/a/2006/01/11/from-microsoft-to-openoffice.html">this</a>, which has some example macros and spells out in more detail how this could be done.
/Lars