Converting HTML to PDF using PHP
If you spend enough time building web applications, it's likely that you will eventually need a solution for generating PDF documents. Consider scenarios like invoice PDFs, printable tickets, gift vouchers in email attachments or printable shipping labels. Let's look at four options to accomplish this in a PHP application (or scroll to the conclusion at the bottom of this article).
The problem
Generating and laying out PDFs from code is a challenge because PDF is not a very structured file format. Internally, PDFs are binary encoded and contain an array of blocks of content that can be styled and positioned anywhere in the document. Unlike HTML, we can't just write some text to a file and add a .pdf extension to create a PDF. Luckily, there are a couple of good (open-source) solutions that allow you to convert HTML to PDF.
Things to take into account
First of all, the ideal solution for your project depends on your use-case. We'll be comparing Wkhtmltopdf, Dompdf, Chromium and a custom Satori+PdfKit package. For each of these, we'll be looking at developer experience and performance.
For example, all of these solutions require fine-tuning your HTML template because they only support the HTML/CSS specifications to varying degrees, which impacts developer experience a lot. Additionally, if you need to generate thousands of invoices per hour, you might want to do it as fast as possible without wasting too much CPU time or memory.
Finally, if you don't feel like spending too much time on generating PDFs, document APIs like DocRaptor are great to hit the ground running.
Headless Chromium (e.g. in Browsershot)
Developer experience
Chromium emerges as a clear winner for developer experience. Leveraging an actual browser engine to render the PDF ensures great compatibility with modern CSS, HTML and even JavaScript. If your template works in your browser, it'll also work as a PDF. Our spatie/browsershot package is a great choice for using Chromium to create PDFs (or screenshots) of HTML webpages in PHP.
As a bonus, using TailwindCSS in your PDF template is as simple as including the Tailwind CDN in your HTML.
Despite these benefits, installing something like Browsershot or one of its Chromium-based alternatives is often a bit annoying. Having to deal with additional dependencies like the Node runtime, the Puppeteer package and the Chromium engine will likely cause you problems at some point.
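Once those dependencies are in place, the PHP side stays small. The snippet below is a minimal sketch, assuming spatie/browsershot was installed through Composer and that Node, Puppeteer and Chromium are available on the machine. The ticket template and the output file name are made up for illustration; CSS grid is used on purpose, since that is the kind of modern layout a real browser engine handles without tweaks.

```php
<?php
// Assumption: `composer require spatie/browsershot` has been run and
// Node, Puppeteer and Chromium are installed on this machine.
require __DIR__ . '/vendor/autoload.php';

use Spatie\Browsershot\Browsershot;

// Hypothetical ticket template, laid out with CSS grid.
$html = <<<'HTML'
<!doctype html>
<html>
<head>
  <meta charset="utf-8">
  <style>
    body { font-family: sans-serif; }
    .ticket { display: grid; grid-template-columns: 2fr 1fr; gap: 1rem; }
    .stub { background: #eee; padding: 1rem; }
  </style>
</head>
<body>
  <div class="ticket">
    <div><h1>Concert ticket</h1><p>Row 12, seat 4</p></div>
    <div class="stub">Admit one</div>
  </div>
</body>
</html>
HTML;

Browsershot::html($html)
    ->format('A4')        // paper size
    ->showBackground()    // include CSS backgrounds, which are left out by default
    ->save(__DIR__ . '/ticket.pdf'); // the .pdf extension makes Browsershot write a PDF
```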
Performance
If you're looking for a performant solution, this isn't it either. Chrome's memory-eating memes are still relevant in 2024, and it's not like it's fast either. On a higher-end DigitalOcean Droplet, generating an empty PDF document took just over 1 second, consuming roughly 100MB of memory and pushing average CPU usage to 30% on 4 threads*.
So in conclusion: Browsershot or one of its Chromium-based alternatives is great if you need extensive/modern HTML markup or JavaScript support, but it will take some time and server resources to render.
Wkhtmltopdf
Developer experience
Utilizing the Qt WebKit engine (a Qt port of Apple's WebKit engine), the Wkhtmltopdf CLI supports much of the modern HTML and CSS spec, with a couple of (minor?) limitations. For example, CSS flexbox is a little weird sometimes, CSS grid and canvas are not supported at all, and JavaScript support is limited to what the dated engine understands.
Luckily, using Wkhtmltopdf in your PHP app is pretty simple with a package like Snappy. You still need to manually sort out the wkhtmltopdf runtime, but it's available in most OS package repositories (like apt, Homebrew, Chocolatey, ...).
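To give an idea of what that looks like, here is a minimal sketch using Snappy. It assumes the knplabs/knp-snappy package was installed through Composer and that the wkhtmltopdf binary lives at the path shown, which is a placeholder you will need to adjust. The shipping label markup is made up, and it sticks to a table layout to stay clear of the flexbox quirks mentioned above.

```php
<?php
// Assumption: `composer require knplabs/knp-snappy` has been run and the
// wkhtmltopdf binary is installed. The path below is a placeholder.
require __DIR__ . '/vendor/autoload.php';

use Knp\Snappy\Pdf;

$snappy = new Pdf('/usr/local/bin/wkhtmltopdf');
$snappy->setOption('page-size', 'A4'); // maps to wkhtmltopdf's --page-size flag

// Hypothetical shipping label, using a table instead of flexbox or grid.
$html = '
<html>
<body style="font-family: sans-serif;">
  <table width="100%" cellpadding="8" style="border: 1px solid #000;">
    <tr>
      <td><strong>Ship to</strong><br>Jane Doe<br>Example Street 1</td>
      <td align="right"><strong>Order</strong><br>#1234</td>
    </tr>
  </table>
</body>
</html>';

// The last argument allows overwriting an existing file; without it,
// Snappy throws when the output file already exists.
$snappy->generateFromHtml($html, __DIR__ . '/label.pdf', [], true);
```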
Performance
Performance-wise, Wkhtmltopdf dramatically outperforms its Chromium-based counterpart, consuming minimal memory and CPU resources (5-10MB of memory and 6% CPU*).
So in conclusion, Wkhtmltopdf is a pretty performant solution for rendering HTML to PDF using a real browser engine. However, its outdated HTML/CSS support means you will need some time to fine-tune your HTML template for your PDF.
Dompdf
Developer experience
For my use-case of generating super basic gift vouchers as fast as possible, Dompdf is the clear winner. Because it's using a custom HTML renderer under the hood, it has some quirks when rendering modern HTML/CSS, but once you've ironed out the kinks in your HTML template, it's insanely fast.
Installation couldn't be easier either. The package is written entirely in PHP without any external dependencies or binaries needed. Just install it using Composer and you're good to go.
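As a minimal sketch of that workflow, assuming the package was installed with `composer require dompdf/dompdf`: the voucher markup and file name below are made up, and the template deliberately uses a plain table with simple CSS, which is the kind of markup Dompdf's renderer tends to handle without surprises.

```php
<?php
// Assumption: `composer require dompdf/dompdf` has been run.
require __DIR__ . '/vendor/autoload.php';

use Dompdf\Dompdf;

// Hypothetical gift voucher template with a simple table layout.
$html = '
<html>
<body style="font-family: sans-serif;">
  <table width="100%" cellpadding="12" style="border: 2px solid #333;">
    <tr>
      <td><h1>Gift voucher</h1><p>Worth 25 EUR</p></td>
      <td align="right">Code: <strong>ABCD-1234</strong></td>
    </tr>
  </table>
</body>
</html>';

$dompdf = new Dompdf();
$dompdf->loadHtml($html);
$dompdf->setPaper('A4', 'portrait');
$dompdf->render(); // lays out the document in pure PHP, no external binary involved

// output() returns the PDF as a string, ready to save or attach to an email.
file_put_contents(__DIR__ . '/voucher.pdf', $dompdf->output());
```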
Performance
As mentioned above, performance-wise Dompdf is a clear winner. It's an order of magnitude faster than the second-best solution in this article. A thousand PDFs will generate in just over 10 seconds using not even a couple of megabytes of memory. On my (not cheap) testing server*, CPU was the only bottleneck. So depending on your hardware, you might be able to push this package even further.
Satori + PdfKit
vercel/satori is what inspired me to begin this comparative experiment. Satori is Vercel's new library to efficiently convert HTML and CSS to SVG. Since SVG is easier to convert to PDF, I thought of using Satori for quick HTML -> SVG -> PDF conversions. Unfortunately, performance wasn't as good as I expected, especially compared to other solutions in this list.
Developer experience
Because Satori is not an HTML rendering engine, but rather a conversion library, HTML/CSS support is very limited. Prepare to tweak your HTML a lot to get more complex layouts to work in your PDF and expect to deal with annoying issues like embedding font files. However, TailwindCSS is supported out of the box, so that's a big plus.
Installation isn't as easy as it could be either, needing a Node runtime and several npm packages such as PdfKit and an html-to-react library. This could be improved a lot by releasing this solution as a separate all-inclusive package, but at the moment it's not an ideal solution.
Performance
Luckily, performance somewhat makes up for the less-than-ideal DX. However, the wkhtmltopdf CLI is still faster and supports more of the HTML & CSS specifications. So it's a fun experiment, but not a viable alternative yet.
In conclusion
Some raw numbers from my benchmarking*:
| | memory | cpu | empty PDF | 1 page PDF |
|---|---|---|---|---|
| Browsershot | 100MB | 30% | 1007ms | 1029ms |
| Wkhtmltopdf | 30-40MB | 6% | 260ms | 267ms |
| Dompdf | 30MB? | 100% | 11ms | 20ms |
| Satori+PdfKit | 40MB | 30% | 600ms | 710ms |
So depending on your use-case, you might want to consider:
- Dompdf for very fast PDFs that take somewhat more work to set up,
- or a Chromium-based solution, such as spatie/browsershot, for convenient but comparatively slower PDFs.
* I benchmarked all 4 solutions on a "CPU-optimized, regular Intel" DigitalOcean Droplet with 4 vCPUs and 8GB of RAM. CPU percentages are obviously relative but should be comparable within this article. Benchmarks ran synchronously, generating 40 PDFs and averaging those numbers for each benchmark. Dompdf memory usage is a bit uncertain due to a memory leak (I didn't investigate any further).
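If you want to run a similar comparison on your own hardware, the approach in the footnote (40 synchronous runs, averaged) can be sketched roughly as below. This is not the script behind the numbers above: the `benchmark()` helper is hypothetical, the workload is a stand-in you would replace with a call to your PDF library of choice, and CPU usage is not measured at all. Also note that `memory_get_peak_usage()` only reports memory used by the PHP process itself, so it will not see child processes such as Chromium or wkhtmltopdf.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: runs $generate one call after another and
 * returns the average wall-clock time per run plus PHP's peak memory.
 */
function benchmark(callable $generate, int $iterations = 40): array
{
    if ($iterations < 1) {
        throw new InvalidArgumentException('Need at least one iteration.');
    }

    $durations = [];
    for ($i = 0; $i < $iterations; $i++) {
        $start = hrtime(true);                         // monotonic clock, in nanoseconds
        $generate();
        $durations[] = (hrtime(true) - $start) / 1e6;  // convert to milliseconds
    }

    return [
        'iterations'     => $iterations,
        'avg_ms'         => array_sum($durations) / count($durations),
        // Memory of this PHP process only, not of any child processes.
        'peak_memory_mb' => memory_get_peak_usage(true) / 1024 / 1024,
    ];
}

// Stand-in workload: swap this closure for an actual PDF generation call.
$result = benchmark(function (): void {
    usleep(2000);
});

printf(
    "%d runs, %.1f ms on average, %.1f MB peak memory\n",
    $result['iterations'],
    $result['avg_ms'],
    $result['peak_memory_mb']
);
```

Running each solution through the same closure-based helper keeps the timing code identical across libraries, so differences in the averages come from the PDF generation itself.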