The open source vs cloud dilemma

PDF generation is deceptively simple. Render some data into a template, produce a file. In practice, the infrastructure around that simple operation — scaling, monitoring, error handling, font management, security patching, queue management — is where engineering time disappears.

Open source tools give you full control and zero vendor lock-in. Cloud APIs give you zero infrastructure and predictable costs. Neither is universally better. The right choice depends on your volume, your team's DevOps capacity, your compliance requirements, and how central document generation is to your product.

Let's look at the actual options in each camp.

The open source landscape

Gotenberg — Docker-based conversion engine

Gotenberg wraps Chromium and LibreOffice inside a Docker container and exposes a REST API for converting HTML, Markdown, Word, and other formats to PDF. It is one of the most widely used self-hosted PDF tools in the Docker ecosystem.
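To make the "REST API" part concrete, here is a minimal sketch of posting an HTML string to Gotenberg's Chromium HTML route. It assumes Node 18+ (global fetch, FormData, and Blob) and a container reachable at http://localhost:3000. The function name and the injectable fetchImpl parameter are illustrative and not part of Gotenberg, so check the route against the version you deploy.

```javascript
// Sketch: convert an HTML string to PDF via a self-hosted Gotenberg instance.
// Assumptions: Node 18+ and a Gotenberg container reachable at baseUrl.
// fetchImpl is a hypothetical seam that lets tests swap in a stub.
async function htmlToPdfViaGotenberg(html, baseUrl = 'http://localhost:3000', fetchImpl = fetch) {
  const form = new FormData();
  // The Chromium HTML route expects the main document to be named index.html.
  form.append('files', new Blob([html], { type: 'text/html' }), 'index.html');

  const response = await fetchImpl(`${baseUrl}/forms/chromium/convert/html`, {
    method: 'POST',
    body: form,
  });
  if (!response.ok) {
    throw new Error(`Gotenberg returned HTTP ${response.status}`);
  }
  // The response body is the PDF itself.
  return Buffer.from(await response.arrayBuffer());
}
```

In a real service you would wrap this call with the timeouts, retries, and health checks that the rest of this article counts as part of the self-hosting bill.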

Stirling-PDF — the Swiss army knife

Stirling-PDF is a self-hosted web application for PDF manipulation: merge, split, compress, convert, OCR, add watermarks, and more. It is not a generation tool per se — it's a processing tool.

Carbone.io — template-driven document engine

Carbone is a template engine that takes DOCX, XLSX, or ODT files as templates, injects JSON data, and produces PDF output (via LibreOffice conversion). The community edition is open source; advanced features require a paid edition.

WeasyPrint — Python CSS-to-PDF engine

WeasyPrint is a Python library that converts HTML and CSS to PDF without a browser. It implements its own CSS rendering engine, so there is no Chromium dependency.

Puppeteer — headless Chrome

Puppeteer and Playwright drive a headless Chromium browser and call page.pdf() to generate PDFs. The output matches what Chrome's own print engine produces, with full CSS and JavaScript support.
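A minimal sketch of the page.pdf() flow is below. It assumes the puppeteer package is installed. The launchBrowser parameter is an illustrative seam (not a Puppeteer concept) so the browser can be stubbed in tests, and the A4 and printBackground options are example choices rather than requirements.

```javascript
// Sketch: render an HTML string to a PDF buffer with headless Chrome.
// launchBrowser is injected by the caller, for example:
//   const puppeteer = require('puppeteer');
//   const pdf = await htmlToPdfWithChrome(html, () => puppeteer.launch());
async function htmlToPdfWithChrome(html, launchBrowser) {
  const browser = await launchBrowser();
  try {
    const page = await browser.newPage();
    // Wait for network activity to settle so fonts and images have loaded.
    await page.setContent(html, { waitUntil: 'networkidle0' });
    return await page.pdf({ format: 'A4', printBackground: true });
  } finally {
    // Always close the browser, even on failure, to avoid leaking Chromium processes.
    await browser.close();
  }
}
```

Launching a browser per document keeps the example short. Production setups usually reuse a pool of browsers, which is where much of the operational complexity comes from.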

The cloud API landscape

Typsetter — Typst-powered API

Typsetter is a PDF generation API built on the Typst typesetting language. Templates are written in Typst with Tera (Jinja2-like) templating for data injection. No browser, no LibreOffice — the Typst compiler produces PDFs directly.

PDFMonkey — HTML template cloud API

PDFMonkey lets you design HTML/CSS templates in a web editor and generate PDFs via API. It uses a Chrome-based renderer in the cloud.

CraftMyPDF — drag-and-drop builder

CraftMyPDF provides a visual drag-and-drop template builder with a REST API for generation. Designed for non-developers to create templates.

DocRaptor — Prince XML in the cloud

DocRaptor is a cloud API powered by the Prince XML engine, which is widely regarded as having the best CSS-to-PDF output quality in the industry.

The comparison

| Criteria | Open Source (self-hosted) | Cloud APIs (general) | Typsetter |
|---|---|---|---|
| Cost at 1K PDFs/mo | $10–$50 (server) | $15–$49 (plan) | $0 (free tier) |
| Cost at 10K PDFs/mo | $50–$150 (server + ops) | $49–$199 (plan) | $99 (Pro plan) |
| Cost at 100K PDFs/mo | $200–$500 (cluster) | $500–$2,000+ | Custom pricing |
| Setup time | Hours to days | Minutes | Minutes |
| Ongoing maintenance | You own it all | Zero | Zero |
| Scaling | Manual (containers, LB) | Automatic | Automatic |
| Template system | Varies (HTML, DOCX) | Varies (HTML, drag-drop) | Typst + visual editor |
| Render speed | 500ms–4s (depends on tool) | 500ms–3s | ~340ms avg |
| Data sovereignty | Full control | Third-party servers | EU servers, encrypted |
| Batch processing | Build it yourself | Some support it | Native CSV batch |

A note on cost

The "Cost at 100K PDFs/mo" row is where open source shines. At extreme volume, the per-document cost of a self-hosted solution drops to fractions of a cent, while cloud APIs charge per document. If you're generating hundreds of thousands of PDFs monthly, the infrastructure investment can pay for itself — assuming you have the team to maintain it.
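To put numbers on "fractions of a cent", the sketch below divides the monthly figures from the 100K row by the volume. The function name is illustrative, and the inputs are the table's rough ranges for the server or plan bill only, so treat the results as ballpark figures.

```javascript
// Per-document cost in US cents for a given monthly bill and volume.
function costPerDocumentCents(monthlyCostUsd, pdfsPerMonth) {
  if (pdfsPerMonth <= 0) {
    throw new RangeError('pdfsPerMonth must be positive');
  }
  return (monthlyCostUsd * 100) / pdfsPerMonth;
}

// Self-hosted range from the table: $200 to $500 at 100K PDFs/month.
const selfHostedLow = costPerDocumentCents(200, 100000);  // 0.2 cents per PDF
const selfHostedHigh = costPerDocumentCents(500, 100000); // 0.5 cents per PDF

// Cloud range from the table: $500 to $2,000 at 100K PDFs/month.
const cloudLow = costPerDocumentCents(500, 100000);       // 0.5 cents per PDF
const cloudHigh = costPerDocumentCents(2000, 100000);     // 2 cents per PDF
```

On these figures the self-hosted server bill works out to between a fifth and a half of a cent per document, while the cloud range runs from half a cent to two cents.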

The hidden costs of self-hosting

The comparison table above tells only half the story. Self-hosted solutions have real costs that don't appear in the server bill:

DevOps and infrastructure

Running Gotenberg or Puppeteer in production means managing Docker containers, configuring health checks, setting up auto-scaling rules, and maintaining a CI/CD pipeline for your PDF service. For Chromium-based tools, you're managing a browser pool — one of the more error-prone pieces of infrastructure you can own.

Monitoring and alerting

When your PDF service throws a 500 at 2am because Chromium ran out of memory and the OOM killer terminated the process, someone on your team gets paged. You need metrics on render times, memory usage, queue depth, and error rates. This is not optional infrastructure — it's the cost of running a production service.
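As a rough illustration of the minimum instrumentation this implies, the sketch below wraps any async render function and tracks render counts, error rate, and a 95th-percentile render time in memory. The names are hypothetical. A real deployment would export these values to whatever metrics system you already run, and would also track memory usage and queue depth, which are omitted here.

```javascript
// Sketch: wrap an async render function and record basic metrics in memory.
function withRenderMetrics(renderFn) {
  const metrics = { total: 0, errors: 0, durationsMs: [] };

  async function render(...args) {
    const start = performance.now();
    metrics.total += 1;
    try {
      return await renderFn(...args);
    } catch (err) {
      metrics.errors += 1;
      throw err; // Re-throw so callers still see the failure.
    } finally {
      metrics.durationsMs.push(performance.now() - start);
    }
  }

  function snapshot() {
    const sorted = [...metrics.durationsMs].sort((a, b) => a - b);
    // Nearest-rank 95th percentile; 0 if nothing has been rendered yet.
    const p95Ms = sorted.length ? sorted[Math.ceil(sorted.length * 0.95) - 1] : 0;
    return {
      total: metrics.total,
      errorRate: metrics.total ? metrics.errors / metrics.total : 0,
      p95Ms,
    };
  }

  return { render, snapshot };
}
```

An alerting rule would then fire on thresholds over snapshot(), for example an error rate above a few percent or a p95 that drifts upward over time.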

Security updates

Chromium vulnerabilities are disclosed regularly. When a critical CVE drops for the browser engine your PDF service depends on, you need to patch and redeploy. LibreOffice (used by Carbone and Gotenberg for office format conversion) has its own security surface. These updates cannot be deferred indefinitely.

Scaling under load

A batch job that generates 10,000 invoices at month-end will saturate a single Gotenberg container. You need horizontal scaling, a job queue (Redis, RabbitMQ, SQS), retry logic for transient failures, and dead-letter handling for documents that consistently fail. This is a meaningful engineering project.
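The retry and dead-letter pieces can be sketched in a few lines. This is an in-memory illustration under simplifying assumptions: jobs run one at a time, the function and option names are made up for the example, and a production version would sit on a real queue such as Redis, RabbitMQ, or SQS.

```javascript
// Sketch: process render jobs with bounded retries and a dead-letter list.
async function processBatch(jobs, renderFn, { maxAttempts = 3, baseDelayMs = 200 } = {}) {
  const completed = [];
  const deadLetter = [];

  for (const job of jobs) {
    let succeeded = false;
    let lastError = null;

    for (let attempt = 1; attempt <= maxAttempts && !succeeded; attempt += 1) {
      try {
        completed.push({ job, result: await renderFn(job) });
        succeeded = true;
      } catch (err) {
        lastError = err;
        if (attempt < maxAttempts) {
          // Exponential backoff between attempts: base, 2x base, 4x base, ...
          const delayMs = baseDelayMs * 2 ** (attempt - 1);
          await new Promise((resolve) => setTimeout(resolve, delayMs));
        }
      }
    }

    if (!succeeded) {
      // All attempts failed: park the job for inspection instead of retrying forever.
      deadLetter.push({ job, error: String(lastError) });
    }
  }

  return { completed, deadLetter };
}
```

Even this toy version leaves out concurrency limits, persistence across restarts, and idempotency, which is why the full system counts as a real project.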

Font and rendering consistency

Chromium renders fonts differently across operating systems. A PDF rendered on your Mac in development may look different from what your Ubuntu container produces. Font installation, fallback configuration, and rendering consistency testing are ongoing concerns.

Reality check

We estimate the typical engineering cost of maintaining a self-hosted PDF pipeline at 4–10 hours per month for a healthy setup. At a blended rate of $100/hr, that's $400–$1,000/month in developer time alone — before the server bill. This is not a criticism of self-hosting; it is a cost that should be factored into the decision.
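The arithmetic is simple enough to keep in a helper. The sketch below adds the server bill to the engineering estimate above. The function name is illustrative, the $100/hr default is the blended rate from this section, and the server figures are the 10K PDFs/month range from the comparison table.

```javascript
// Monthly cost of self-hosting: server bill plus engineering time.
function monthlySelfHostingCost({ serverUsd, engineeringHours, hourlyRateUsd = 100 }) {
  return serverUsd + engineeringHours * hourlyRateUsd;
}

// Low end: $50 server and 4 hours of maintenance.
const lowEstimate = monthlySelfHostingCost({ serverUsd: 50, engineeringHours: 4 });    // 450
// High end: $150 server and 10 hours of maintenance.
const highEstimate = monthlySelfHostingCost({ serverUsd: 150, engineeringHours: 10 }); // 1150
```

Set against the $49 to $199 cloud plan range the table lists for the same volume, developer time rather than the server bill dominates the comparison.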

When to choose open source

Self-hosting is the right call in several scenarios. Being honest about this matters, because the wrong choice in either direction wastes money and engineering time.

Data sovereignty and compliance

If your documents contain personally identifiable information (PII), financial data, or health records, and your compliance framework (HIPAA, SOC 2 Type II, specific GDPR interpretations) requires that data never leave your infrastructure, self-hosting is the only option. No cloud API can satisfy a requirement that prohibits third-party data processing.

Air-gapped and offline environments

Government, defense, and certain financial systems operate in environments with no internet access. A self-hosted tool running inside your network is the only viable approach. Gotenberg and WeasyPrint both work well in air-gapped Docker environments.

Extreme volume

At 500K+ PDFs per month, the per-document economics of cloud APIs become difficult to justify. A well-tuned Gotenberg cluster on dedicated hardware can produce PDFs at a fraction of a cent each. The infrastructure investment is significant, but at this scale you likely have the DevOps team to support it.

Custom rendering requirements

If your documents require specific rendering behavior — custom font shaping, specialized page layout algorithms, or deep integration with internal systems — owning the rendering pipeline gives you control that no API can match.

Good fit for self-hosting

Strict data residency requirements. Air-gapped environments. 500K+ PDFs/month with a DevOps team to support the infrastructure. Need for custom rendering behavior.

Bad fit for self-hosting

Small team without dedicated DevOps. Variable or unpredictable volume. Need to ship fast without infrastructure work. Limited budget for ongoing maintenance.

When to choose cloud

Cloud APIs earn their keep when infrastructure work is a distraction from your core product.

Speed to market

Integrating Typsetter takes about an hour: create an account, pick a template, call the API. Integrating Gotenberg takes a day minimum: set up Docker, configure the container, write the HTML rendering layer, build the API wrapper, set up health checks. If you're building an MVP or adding PDF generation as a feature (not your core product), the cloud path is faster by an order of magnitude.

No DevOps team

A startup with three developers does not have bandwidth to manage a Chromium-based PDF service alongside their product. The maintenance burden is real and ongoing. A cloud API converts that variable cost into a predictable monthly bill.

Predictable costs

Cloud pricing is simple: you pay per document or per plan tier. Self-hosting costs are unpredictable — a memory leak, a scaling incident, or a failed Chromium update can consume days of engineering time in a single month.

Automatic scaling

Month-end invoice runs, seasonal spikes, marketing campaigns that trigger document generation — cloud APIs handle these transparently. With self-hosting, you either over-provision (wasting money) or under-provision (risking failures during peaks).

Typsetter — cloud API call
    // One API call. No Docker. No Chromium. No pool management.
    const response = await fetch('https://api.typsetter.dev/v1/render', {
      method: 'POST',
      headers: {
        'Authorization': 'Bearer ts_live_sk_YOUR_KEY',
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        template: 'invoice-professional',
        format: 'pdf',
        data: {
          client_name: 'Acme Corporation',
          invoice_number: 'INV-2026-042',
          line_items: [
            { description: 'API integration', amount: 3200 },
            { description: 'Template design', amount: 1800 },
          ],
        },
      }),
    });

    // Fail loudly on a non-2xx response instead of saving an error body as a PDF.
    if (!response.ok) {
      throw new Error(`Render failed: HTTP ${response.status}`);
    }

    const pdfBuffer = await response.arrayBuffer(); // ~340ms. Done.

The hybrid approach: open source for dev, cloud for prod

There is a middle path that some teams adopt successfully: use an open source tool during development and staging, then switch to a cloud API for production.

How it works

In local development and CI, you run Gotenberg or WeasyPrint in a Docker container. Templates are developed and tested against the self-hosted renderer. In production, the same data payload is sent to a cloud API like Typsetter instead. A simple environment variable switches the rendering backend.

Hybrid approach — environment-based routing
    async function generatePdf(templateId, data) {
      if (process.env.PDF_BACKEND === 'local') {
        // Dev/staging: self-hosted Gotenberg
        return await renderWithGotenberg(templateId, data);
      }
      // Production: cloud API
      return await renderWithTypsetter(templateId, data);
    }

Trade-offs of the hybrid approach

Practical tip

The hybrid approach works best when both environments use the same input format. If your cloud API accepts HTML (like PDFMonkey or DocRaptor), pairing it with a local Gotenberg instance is seamless. If your cloud API uses a different template language (like Typsetter's Typst), the hybrid approach is less practical — you'd maintain two template sets.

Decision framework

Use this quick checklist to narrow your choice:

  1. Is data sovereignty a hard requirement? If yes, self-host. No exceptions.
  2. Are you generating 100K+ PDFs/month? If yes, evaluate the TCO of self-hosting vs cloud. At extreme volume, self-hosting often wins on cost.
  3. Do you have a dedicated DevOps team? If no, the maintenance burden of self-hosting will fall on developers who should be building features.
  4. Is PDF generation core to your product? If yes, owning the infrastructure may be strategic. If it's a supporting feature, delegate it.
  5. Do you need to ship this week? If yes, cloud. No contest.
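If it helps to see the checklist as logic, here is one possible encoding. The field names and the order in which the questions are applied are assumptions made for this sketch (the checklist itself does not rank questions 2 through 5), so treat the output as a starting point for discussion, not a verdict.

```javascript
// Sketch: one way to encode the checklist above as a function.
function recommendPdfBackend(answers) {
  const {
    dataSovereigntyRequired,
    pdfsPerMonth,
    hasDevOpsTeam,
    pdfIsCoreProduct,
    mustShipThisWeek,
  } = answers;

  // Question 1: a hard data sovereignty requirement overrides everything else.
  if (dataSovereigntyRequired) return 'self-host';
  // Question 5: a tight deadline points to cloud.
  if (mustShipThisWeek) return 'cloud';
  // Question 3: without a DevOps team, maintenance lands on feature developers.
  if (!hasDevOpsTeam) return 'cloud';
  // Questions 2 and 4: high volume or a core-product role justify a TCO comparison.
  if (pdfsPerMonth >= 100000 || pdfIsCoreProduct) return 'compare-tco';
  return 'cloud';
}
```

For example, a three-person startup generating 5,000 invoices a month with no DevOps team gets 'cloud', while a team with infrastructure engineers producing 250,000 documents a month gets 'compare-tco'.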

Conclusion

The open source tools in this space are genuinely excellent. Gotenberg is well-engineered. WeasyPrint solves a real problem elegantly. Carbone's template approach is clever. Puppeteer gives you unmatched rendering fidelity. There is no shame in choosing any of them.

But for most teams — especially those without dedicated infrastructure engineers — the total cost of self-hosting exceeds the subscription price of a cloud API. The hidden costs (monitoring, scaling, security patches, debugging rendering inconsistencies at 2am) are real and recurring.

Typsetter exists specifically for teams that want to skip the infrastructure work. Fast renders, a proper template system, batch processing, and scheduled generation — all accessible through a REST API with no Docker containers to manage, no Chromium to babysit, and no scaling surprises at month-end.

Start with the free tier. If self-hosting turns out to be the right call for your use case, you'll know within a week. If it isn't, you'll have a working PDF pipeline in production before lunch.

Try Typsetter for free

100 PDFs/month on the free plan. API key in 30 seconds. No credit card required.