I finally gave PrinceXML a quick look after my inquiries about generating PDFs (on the server) with Firefox/Gecko. I received no fewer than ten recommendations advocating Prince, so I figured I'd write up the experience as recompense.

Prince is basically a print / typesetting / PDF generation tool based on popular web standards (HTML, XML, CSS, SVG, etc.). For applications that are developed for the web first and then want PDF/print capabilities added on top, a tool like Prince makes a lot of sense. You can take your web content, sprinkle on a bit of print-specific CSS, and get very acceptable PDF output. This beats the pants off having to write and maintain a separate LaTeX, DocBook, FO, or raw PDF apparatus to get good print support for your primarily web-based content.

I'm not going to get into a whole lot of Prince's more advanced features here; there's a demo of Prince in action on YouTube that lays everything out nicely and the documentation is available on the web.

First Impressions

The first thing I noticed -- after the restrictive license and the $3,800 price tag -- was that, for a proprietary, closed source software outfit, these people appear to have their shit together. I was expecting something really bloated and over-engineered, designed primarily for Windows, maybe with a crappy port to one of the "enterprise" Linux distributions. I have no idea why I thought that was the case - maybe because I knew it was payware; or, maybe it was due to the product name having "XML" in it when XML isn't really all that interesting to what the product actually does. The reality is that the product is well-designed, very lightweight, and has packages available for Windows, MacOS X, Solaris 10, various Linux distros, and FreeBSD.

EDIT: I should have noted that there is "a free Personal license for interactive use on a single computer" that embeds a small P on the upper right-hand area of the first page. The $3,800 license mentioned above is for running Prince on the server, which is what I'm most interested in, and is their most expensive option. Professional and Academic licenses are also available and are significantly more affordable. My apologies for any confusion this may have caused.

You won't find Prince in your package or ports repository but the simple packages provided are the next best thing. There don't seem to be any external/dynamic library dependencies, which is weird considering all of the stuff it has to do: SGML/XML/HTML/CSS parsing, PDF generation, HTTP interaction (including SSL and crypto support), decoding multiple image formats, text layout, embedded SVG support, etc. The compressed distributables are about 4MB on average. The lone prince binary on MacOS weighs in at a mere 11MB (and that includes both PPC and x86 versions of the program). This tiny binary appears to be entirely self-contained. A feat of coding, or maybe build engineering - impressive either way.
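
If you want to check the dependency claim on your own machine, something like the following lists what a binary links against. This is only a sketch: it assumes otool is available on MacOS X and ldd elsewhere, and the path in the commented usage line is a guess at where install.sh puts the real binary (the prince on your PATH may be a wrapper script), so adjust it to whatever you actually find.

```shell
# List the shared libraries a binary links against.
# Uses otool on MacOS X (Darwin) and ldd on Linux/FreeBSD.
list_deps() {
    case "$(uname -s)" in
        Darwin) otool -L "$1" ;;
        *)      ldd "$1" ;;
    esac
}

# Hypothetical path, not taken from the Prince docs:
# list_deps /usr/local/lib/prince/bin/prince
```

A short list of system libraries (or none at all) would back up the "self-contained" impression.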

I'm assuming it's written in C (EDIT: it's actually Mercury but compiles down to C), which is always a good thing in my book so long as I'm not the maintainer. You typically don't get the aforesaid attributes out of any other language environment. And there's an abundance of MIT/BSD licensed library code in C floating around out there that could be used to cobble together most of the baseline functionality mentioned above.

All of this to say that Prince passes my "don't waste my time" test with flying colors. I'm pleasantly surprised by the whole experience thus far and ready to put it through its paces.

Read The License

Before we do anything, you should really read the license, paying special attention to this bit:

Licensee shall not modify, adapt, translate or create derivative works based upon the Software. Licensee shall not reverse engineer, decompile, disassemble or otherwise attempt to discover the source code of the Software.

We're about to officially become Licensees so you should be cool with that before proceeding.

Installation

The following assumes you're running on MacOS X and have /usr/local/bin on your PATH. The process should be basically the same on Linux or FreeBSD:

$ cd /tmp
$ curl http://www.princexml.com/download/prince-6.0r5-macosx.tar.gz | tar xvzf -
$ cd prince* 
$ sudo ./install.sh < /dev/null
Prince 6.0
Install directory
    This is the directory in which Prince 6.0 will be installed.
    Press Enter to accept the default directory or enter an alternative.
    [/usr/local]:
Installing Prince 6.0...
Creating directories...
Installing files...
Installation complete.
  Thank you for choosing Prince 6.0, we hope you find it useful.
  Please visit http://www.princexml.com for updates and development news.

I'm usually one to insist on package management but this was pretty painless and there's no library/header spew so uninstalling isn't a big deal.

Kicking The Tires

We'll start with something simple - this site's main index:

$ time prince -o tomayko.pdf http://tomayko.com 
real    0m7.819s
user    0m1.670s
sys     0m0.222s
$ open tomayko.pdf

The result, tomayko.pdf, looks great considering I've made no print-specific tweaks. The margins I have set for browser viewing are a bit much, though, and we don't want the navigation, search box, or page numbers. This is trivially fixed with a user stylesheet:

$ cat <<EOF > print.css 
body { margin:0 !important }
#footer, #nav, .pages { display:none }
EOF
$ prince -o tomayko-print.pdf -s print.css http://tomayko.com
$ open tomayko-print.pdf

Much better: tomayko-print.pdf. Note that I had to use !important in my user CSS to override properties specified in the page CSS. As an alternative to passing the user stylesheet on the command line, I could have used a print-specific stylesheet linked directly from my site's HTML (e.g., <link media='print' ...>).
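
Here's a minimal sketch of the linked-stylesheet alternative, using a throwaway local page rather than my real site. The file names, element ids, and the temp directory are all made up for illustration; the only Prince invocation is the same -o form used above, and it's skipped if prince isn't installed.

```shell
# Build a tiny page that carries its own print stylesheet.
workdir=$(mktemp -d)

# Print-only rules: drop the margin and hide the page chrome.
cat <<'EOF' > "$workdir/print.css"
body { margin: 0 }
#footer, #nav, .pages { display: none }
EOF

# The page links the stylesheet with media="print", so browsers
# ignore it on screen.
cat <<'EOF' > "$workdir/page.html"
<html>
<head>
  <title>Print stylesheet test</title>
  <link rel="stylesheet" media="print" href="print.css">
</head>
<body>
  <div id="nav">navigation (should not appear in the PDF)</div>
  <p>Body text (should appear in the PDF).</p>
  <div id="footer">footer (should not appear in the PDF)</div>
</body>
</html>
EOF

# Only render if prince is actually installed.
if command -v prince >/dev/null 2>&1; then
    prince -o "$workdir/page.pdf" "$workdir/page.html"
fi
```

There's no !important here because nothing else on this test page competes with the print rules; on a real site with existing screen CSS you may still need it.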

Big Documents and Performance

Let's try something a bit more complex. The single web page version of The FreeBSD Handbook is almost 5MB of DocBook-generated HTML. It uses a wide variety of HTML's markup capabilities and a custom set of DocBook-based CSS.

$ time prince -o handbook.pdf \
    http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/book.html
[snip "entity already defined" warnings]
real    2m53.200s
user    1m40.743s
sys     0m8.389s
$ open handbook.pdf

1m40s of CPU time is fast (the 2m53s of wall-clock time includes downloading the page). It barfed a bunch of "entity already defined" warnings on stderr (as it should) but the result is quite acceptable. I'm not going to link to the handbook PDF here because it's a little over 11 MB. You can find it with a little digging if you're really curious or just grab Prince for yourself and generate your own.
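
For the curious, here's a quick way to split the wall-clock figure from that run into CPU time and everything else. The time_split helper is just something made up for this post; the inputs are the real/user/sys numbers from the output above, converted to seconds.

```shell
# Split wall-clock time into CPU time and time spent waiting.
# Arguments are seconds: real user sys.
time_split() {
    awk -v real="$1" -v user="$2" -v sys="$3" 'BEGIN {
        cpu = user + sys
        printf "cpu: %.1fs  waiting: %.1fs\n", cpu, real - cpu
    }'
}

# Handbook run: real 2m53.200s, user 1m40.743s, sys 0m8.389s
time_split 173.200 100.743 8.389
# prints: cpu: 109.1s  waiting: 64.1s
```

I didn't measure where that minute of waiting went, but fetching 5MB of HTML is the obvious suspect. To take the network out of the picture, you could download the page with curl first and time prince against the local copy.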

Images / SVG Support

There's one last test I'd like to show.

One of the huge problems with taking web content to print is image resolution. Images on the web, when taken directly to print at the same apparent size, usually end up somewhere between 72 and 100 DPI. An image with anything more complex than vertical and horizontal lines looks absolutely horrible in print at 100 DPI. Even simple charts and graphs will appear grainy and jaggy on paper at such low resolution; logos too. Content that needs to go to both media will usually have separate web- and print-optimized versions - a pain to maintain.

This is one of the reasons I dig SVG so much, and Prince supports SVG very well. I was half expecting the SVG to be rasterized to a high-resolution bitmap image and then embedded in the PDF, but what Prince does is so much better.

Sam Ruby has been embedding little handcrafted SVG images in his weblog entries for a little while now so we'll use him as our guinea pig:

$ prince -o svg.pdf http://intertwingly.net/blog/2008/02/01/SVG-Tidy
$ open svg.pdf

Check this out: svg.pdf. It might not seem very interesting at first but zoom in 500% or so and look at the little graphic: it's perfect. Prince is converting the SVG directly into PDF drawing instructions, retaining its vector goodness. This makes for perfect image output at any DPI.
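
If you'd rather not depend on someone else's weblog staying put, a local test along these lines should show the same thing. It's a sketch under a couple of assumptions: the file names and the hand-written circle are made up, and I'm assuming Prince picks up an SVG referenced from an img element the same way it handles the SVG on Sam's page. The render step is skipped if prince isn't installed.

```shell
workdir=$(mktemp -d)

# A hand-written SVG: one circle, nothing else.
cat <<'EOF' > "$workdir/circle.svg"
<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100" viewBox="0 0 100 100">
  <circle cx="50" cy="50" r="40" fill="none" stroke="black" stroke-width="2"/>
</svg>
EOF

# A page that does nothing but show the SVG.
cat <<'EOF' > "$workdir/svg-test.html"
<html>
<body>
  <p>The circle below should stay sharp at any zoom level.</p>
  <img src="circle.svg" alt="circle">
</body>
</html>
EOF

# Only render if prince is actually installed.
if command -v prince >/dev/null 2>&1; then
    prince -o "$workdir/svg-test.pdf" "$workdir/svg-test.html"
fi
```

Open the resulting PDF and zoom way in: if the edge of the circle stays smooth, the SVG went through as vector drawing instructions rather than a bitmap.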


Prince is an extremely impressive piece of Non-Free software that is otherwise very well-suited to the philosophy of Unix. I have little doubt that it would be more than capable of handling anything I would throw at it and can heartily recommend it if you're comfortable with the license and the price.