HTML and CSS to PDF

11155 · May 13, 2010, 1:23pm

I would like to take a rails page and convert it to a pdf. I don't want to have to generate the code myself for making the pdf, so it should obey css. What is the best tool for doing this? Does the tool use the standard css, or can I provide it alternative print-css?

Thanks in advance, Jonathan Steel

11155 · May 13, 2010, 1:48pm

Jonathan Steel wrote:

I would like to take a rails page and convert it to a pdf. I don't want to have to generate the code myself for making the pdf, so it should obey css. What is the best tool for doing this?

Prince is supposed to be great, but it's expensive. Free alternatives include wkhtmltopdf, prawn_format, acts_as_flying_saucer...

Does the tool use the standard css, or can I provide it alternative print-css?

Print CSS is standard.

Thanks in advance, Jonathan Steel

Best,

11155 · May 13, 2010, 2:12pm

Marnen Laibow-Koser wrote:

Jonathan Steel wrote:

I would like to take a rails page and convert it to a pdf. I don't want to have to generate the code myself for making the pdf, so it should obey css. What is the best tool for doing this?

Prince is supposed to be great, but it's expensive. Free alternatives include wkhtmltopdf, prawn_format, acts_as_flying_saucer...

A lot of people seem to be raving about Prince, which I find odd coming from a community like Rails. It is indeed expensive. I tried it on a complex site, and it came crashing to its knees. I have looked into prawn and am so far impressed. I'm just scared that it won't be able to do a complex pdf that looks like what you would see on a web site.

As for the others, I had planned on looking into wkhtmltopdf, and have not heard of the other two. Thanks for your input.

Does the tool use the standard css, or can I provide it alternative print-css?

Print CSS is standard.

I know it's standard for when you are printing. But are there tools that will pick up the print css and use that?

11155 · May 13, 2010, 3:37pm

Prince is supposed to be great, but it's expensive. Free alternatives include wkhtmltopdf, prawn_format, acts_as_flying_saucer...

I'm trying out acts_as_flying_saucer, but I can't get it working. It just seems to ignore my render_pdf action and generates a normal html page. Do you have a sample project of it working? There is very limited documentation on this project, but I am interested in it.

Colin_Law1 · May 13, 2010, 3:53pm

Did you see this? Google translates it pretty well. http://www.jrubyonrails.de/2010/03/pdf-und-html-mit-acts-as-flying-saucer.html

Colin

Jeff_Burly · May 13, 2010, 4:31pm

Hi Jonathan,

I've used prince on a few projects running in production in the past couple of years and have nothing but good things to say about it. Not sure what problems you experienced when you tried prince out, but I have yet to see any problems with it.

If you can't afford the license for prince or want to stick with foss, then I'd also highly recommend wkhtmltopdf. The main reason, other than cost, to go with prince over wkhtmltopdf is that prince has greater print-related css coverage than wkhtmltopdf (or more importantly, the underlying pieces that make up wkhtmltopdf, specifically webkit) does at this time. The main reason why I chose prince over wkhtmltopdf for those specific projects was that the client required certain must-haves in terms of resulting pdf output from the underlying html/css (specifically related to css regarding preventing pdf page-breaks inside of defined html elements) that could be handled by prince but not by wkhtmltopdf.
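For readers unfamiliar with the kind of rule being described, here is a minimal Ruby sketch that wraps a fragment of HTML with a print-only stylesheet using the standard CSS `page-break-inside` property. The helper name `with_print_css` and the class name `keep-together` are made up for illustration; whether a given converter honours the rule is exactly the coverage question raised above.

```ruby
# Sketch only: wraps an HTML fragment in a page with a print-only stylesheet.
# "keep-together" is a hypothetical class name, not something from the thread.
def with_print_css(body_html)
  # page-break-inside: avoid asks the renderer not to split the element across pages
  print_css = "@media print { .keep-together { page-break-inside: avoid; } }"
  "<html><head><style>#{print_css}</style></head><body>#{body_html}</body></html>"
end

puts with_print_css('<div class="keep-together">Invoice row</div>')
```

The same rule can live in a separate stylesheet linked with `media="print"`, so the on-screen layout stays untouched.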

Check the latest wkhtmltopdf (or webkit) in terms of css print-related coverage as these differences are likely narrowing. Or better yet, test wkhtmltopdf out to see if it meets the needs of your project regardless.
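A quick way to run that test from Ruby is to shell out to the wkhtmltopdf binary. The sketch below only builds the argument list; the file names are placeholders, and it assumes the installed version accepts the `--print-media-type` option, which tells it to apply the print stylesheet instead of the screen one. Check `wkhtmltopdf --help` for your version before relying on it.

```ruby
# Sketch only: builds the argument list for a wkhtmltopdf run.
# File names are hypothetical; the flag is assumed to exist in your version.
def wkhtmltopdf_command(input_html, output_pdf, print_media: true)
  cmd = ["wkhtmltopdf"]
  # Ask the renderer to use print CSS rather than screen CSS
  cmd << "--print-media-type" if print_media
  cmd + [input_html, output_pdf]
end

cmd = wkhtmltopdf_command("page.html", "page.pdf")
puts cmd.join(" ")
# system(*cmd) would run the conversion if wkhtmltopdf is installed
```

Passing the arguments as an array to `system` avoids shell quoting problems with odd file names.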

As for the cost of prince, my clients had no problem paying for the server license, especially given the cost savings they've realized over time compared to if they would have had to pay me or some other developer to dev the custom pdf generation code using one of the low-level pdf-gen'ing libs (like prawn). It's just a lot easier/cheaper to have ui devs make mods to html/css, especially in a web app context where the app provides an on-screen preview of what the pdf will (just-about) look like before the pdf is actually gen'd. I doubt I'll ever go back to using low-level pdf-gen'ing libs again.

Jeff

11155 · May 13, 2010, 4:40pm

Thanks for the great info Jeff.

You raise the same points that I have raised in our team. It would be easier to convert html into a pdf instead of us spending the time to develop code using Prawn for custom views.

Prince worked on a simple page, but when I tried it on a more complicated one I got a ton of errors like the following:

prince: /Users/jonathan/tmp/97.html:552: error: Opening and ending tag mismatch: div line 480 and html

followed by just as many:

prince: /Users/jonathan/tmp/97.html:552: error: Premature end of data in tag div line 372

and ending with:

prince: /Users/jonathan/tmp/97.html: error: could not load input file
prince: error: no input documents to process

I tried out wkhtmltopdf and really like it. My only concern at this point is that I can't get page breaking to work, and I found some recent posts that would suggest it can't do page breaking.

I'm looking at acts_as_flying_saucer now, but can't get it to work for complicated examples. I will probably be going with either prawn, wkhtmltopdf, or acts_as_flying_saucer.

Colin_Law1 · May 13, 2010, 4:44pm

Thanks for the great info Jeff.

You raise the same points that I have raised in our team. It would be easier to convert html into a pdf instead of us spending the time to develop code using Prawn for custom views.

Prince worked on a simple page, but when I tried it on a more complicated one I got a ton of errors like the following:

prince: /Users/jonathan/tmp/97.html:552: error: Opening and ending tag mismatch: div line 480 and html

followed by just as many:

prince: /Users/jonathan/tmp/97.html:552: error: Premature end of data in tag div line 372

and ending with:

prince: /Users/jonathan/tmp/97.html: error: could not load input file
prince: error: no input documents to process

Did you check first that the html is valid, by viewing the source (view, page source or similar in the browser), copying the complete text and pasting it into the W3C HTML validator (google will find it)?

Colin

11155 · May 13, 2010, 4:48pm

Colin Law wrote:

Jeff_Burly · May 13, 2010, 5:02pm

Sounds like the underlying html/css in your "complicated" test might not be valid, such that prince is saying that it isn't able to generate the pdf because it can't parse/process that html/css? You might want to run that html/css thru a validator first, like http://validator.w3.org/ , to first fix any invalid html/css and then try it again.
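The errors quoted earlier ("Opening and ending tag mismatch", "Premature end of data in tag div") are tag-balance problems. As a rough illustration of what such a check does, here is a small Ruby sketch that tracks open tags on a stack. It is not a substitute for validator.w3.org: the function name is invented, and it ignores comments, doctypes and script contents.

```ruby
# Sketch only: reports tag-balance problems in an HTML string.
# Returns an array of messages; an empty array means the tags balance.
def tag_problems(html)
  # Elements that never take a closing tag
  void_tags = %w[area base br col embed hr img input link meta param source track wbr]
  problems = []
  stack = []
  html.scan(%r{<(/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*?(/?)>}) do |closing, name, self_closing|
    name = name.downcase
    next if void_tags.include?(name) || self_closing == "/"
    if closing.empty?
      stack.push(name)
    elsif stack.last == name
      stack.pop
    else
      problems << "unexpected </#{name}>, innermost open tag is #{stack.last.inspect}"
      # Recover by unwinding to the matching open tag, if there is one
      idx = stack.rindex(name)
      stack.slice!(idx..-1) if idx
    end
  end
  stack.each { |name| problems << "<#{name}> is never closed" }
  problems
end

puts tag_problems("<html><body><div><div></div></body></html>")
```

Run against the saved page source (not the live DOM), this points at the same kind of unclosed `div` that the converter stumbled over.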

As for flyingsaucer, I looked into using that a while back but just didn't like all of the dependencies required to get it working at the time, especially for a ruby/rails project. But, maybe if you already have a jvm installed, or are already running jruby, or ....

Whatever you end up using to gen your pdfs with, another tool you might find useful is pdftk (PDFtk, The PDF Toolkit) for any pre-/post-processing of your pdfs, like splitting pdfs into pages, stitching pdf pages together, adding watermarks, etc.
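To make those three operations concrete, here is a hedged Ruby sketch that builds pdftk argument lists for merging, splitting and watermarking. The method names and file names are invented for the example; the `cat`, `burst` and `background` operations are pdftk's own, but check `pdftk --help` on your install before depending on them.

```ruby
# Sketch only: argument lists for three common pdftk operations.
# Method and file names are hypothetical.
def pdftk_merge(inputs, output)
  # cat concatenates the input files in the order given
  ["pdftk", *inputs, "cat", "output", output]
end

def pdftk_split(input)
  # burst writes one PDF per page into the current directory
  ["pdftk", input, "burst"]
end

def pdftk_watermark(input, watermark_pdf, output)
  # background places watermark_pdf behind every page of input
  ["pdftk", input, "background", watermark_pdf, "output", output]
end

puts pdftk_merge(["cover.pdf", "report.pdf"], "combined.pdf").join(" ")
# system(*pdftk_split("combined.pdf")) would run the split if pdftk is installed
```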

Jeff

Jeff_Burly · May 13, 2010, 5:09pm

Not really a valid test of prince then is it? Kind of like testing a toaster that wasn't plugged in: "hey, this toaster is junk, ... it didn't even heat up the bread". --Jeff

11155 · May 13, 2010, 5:10pm

Jeff Burlysystems wrote:

Sounds like the underlying html/css in your "complicated" test might not be valid, such that prince is saying that it isn't able to generate the pdf because it can't parse/process that html/css? You might want to run that html/css thru a validator first, like http://validator.w3.org/ , to first fix any invalid html/css and then try it again.

I think I will try this just because it's a good exercise anyway.

As for flyingsaucer, I looked into using that a while back but just didn't like all of the dependencies required to get it working at the time, especially for a ruby/rails project. But, maybe if you already have a jvm installed, or are already running jruby, or ....

I played with it some more and it turns out that as soon as I install the acts_as_flying_saucer plugin, the layout of all my pages gets totally messed up. Uninstall the plugin, pages go back to normal. So this one is definitely out of the picture. I might look at the underlying Java library and compile a simple binary that we can use to convert saved pages. More work than doing it in rails, but it would be just like prince or wkhtmltopdf.

Whatever you end up using to gen your pdfs with, another tool you might find useful is pdftk (PDFtk, The PDF Toolkit) for any pre-/post-processing of your pdfs, like splitting pdfs into pages, stitching pdf pages together, adding watermarks, etc.

Thanks. This tool does look interesting.

Colin_Law1 · May 13, 2010, 8:18pm

You can't expect anything like this to work reliably with invalid html. You could install the html validator plugin for firefox, then it will check the html on the fly for you.

Colin

11155 · May 14, 2010, 10:13am

Jonathan Steel wrote:

Act as flying saucer demo: http://github.com/amardaxini/acts_as_flying_saucer_demo

Just make sure your html is proper.

For any query regarding acts_as_flying_saucer, you can drop me a mail at amardaxini@gmail.com.

Amar Daxini (http://railstech.com)

11155 · May 14, 2010, 10:16am

Jonathan Steel wrote:

Jeff Burlysystems wrote:

Sounds like the underlying html/css in your "complicated" test might not be valid, such that prince is saying that it isn't able to generate the pdf because it can't parse/process that html/css? You might want to run that html/css thru a validator first, like http://validator.w3.org/ , to first fix any invalid html/css and then try it again.

By default, acts_as_flying_saucer sets the stylesheet media attribute to print; that's why the layout gets messed up.

I have just updated the plugin and added some features as well.

Peter_De_Berdt · May 14, 2010, 5:06pm

As some others have pointed out, you simply need to provide Prince with valid XHTML. It even provides you with a nice syntax error to trace down where YOUR code is wrong. You wouldn’t expect Ruby to let you get away with “ruts ‘Hello World’” instead of “puts ‘Hello World’”. We’ve used Prince in a project that created the most complex documents you can ever imagine, using almost everything Prince has to offer, including plenty of SVG images with dynamic data in them. We’re speaking of hundreds of pages too. It works without a hitch and support from the developer is simply amazing. Yes, it’s expensive, but it’s worth every cent if you plan on generating plenty of PDF documents with complex layouts.

Best regards

Peter De Berdt

11155 · May 14, 2010, 5:23pm

prince: /Users/jonathan/tmp/97.html: error: could not load input

it. Being so expensive, Prince pretty much had to work without a hitch if I was going to spend any time trying to make it work.

So what I really meant by this was that with several other options available, things had to work almost immediately if I was going to spend any time considering them. Although it is definitely our site that is causing the problem with Prince, finding the problem in the site could be just as much work as creating the pdf from scratch using something like prawn. If I did fix the site, then I still had no guarantee that it would work with Prince when I was done, so I might as well just make the PDF from scratch.

Peter_De_Berdt · May 14, 2010, 5:33pm

Invalid pages in the browser are just as evil, aren’t they? I sometimes wish browsers were a lot less forgiving on that part; it would avoid a lot of seemingly unrelated issues, especially when you start using Javascript DOM manipulations.

When we tried Prawn, the size of the PDF it produced was about twice that of a similar Prince document. I’m assuming the output is far from optimized. That said, it’s a very valid option in some simple PDF cases and certainly less pricey.

Best regards

Peter De Berdt

11155 · May 14, 2010, 5:51pm

Invalid pages in the browser are just as evil, aren't they? I sometimes wish browsers were a lot less forgiving on that part; it would avoid a lot of seemingly unrelated issues, especially when you start using Javascript DOM manipulations.

Yah, I knew somebody would bring that up as soon as I made my last comment. I think the problem is probably due to our DOM manipulation. Our site renders properly in every browser we have tried, so it really took me by surprise when these html parsers started choking on the pages.

Colin_Law1 · May 14, 2010, 7:37pm

It may render ok at the moment, but can you sleep soundly knowing you have invalid html and that the next release of firefox may interpret the invalid html in a different way?

Colin
