Generate PDFs from a Database — PostgreSQL & MySQL to PDF Guide

Why your database is your document engine

Every invoice you send is a SELECT query away from being a PDF. The customer name, the line items, the totals, the due date — it all lives in your database already. The missing piece is the transformation layer: something that takes structured data and produces a typeset document.

Most teams solve this with HTML templates rendered through a headless browser. That works, but it introduces heavy infrastructure (Chromium), slow render times (2–4 seconds), and a maintenance burden that grows with scale. A purpose-built PDF API eliminates that middleware entirely.

The pattern is straightforward: query your database, shape the result into a JSON payload, POST it to an API, get a PDF binary back. No browser. No server-side rendering. No LaTeX installation.

Architecture overview

The data flow for database-to-PDF generation has four stages:

DatabasePostgreSQL / MySQL

→

Query + TransformYour app code

→

Typsetter APIPOST /v1/render

→

PDF OutputBinary / URL / S3

Your application queries the database, maps the rows to the variable schema your template expects, sends the JSON to Typsetter, and receives a PDF in the response body. The template itself — layout, fonts, colors, page structure — is managed in the Typsetter dashboard. Your code only sends data.

Example 1: Node.js + PostgreSQL — monthly invoices

This example queries a PostgreSQL database for all customers with unbilled orders, generates an invoice PDF for each one, and writes the files to disk. In production you would upload to S3 or attach to an email — the PDF generation step is identical.

generate-invoices.mjs
import pg from 'pg';
import { writeFile } from 'node:fs/promises';

const pool = new pg.Pool({
  host: 'localhost',
  database: 'myapp',
  user: 'myapp',
  password: process.env.DB_PASSWORD,
});

const TYPSETTER_KEY = process.env.TYPSETTER_API_KEY;

// 1. Query all customers with unbilled orders this month
const { rows: customers } = await pool.query(`
  SELECT c.id, c.name, c.email, c.address,
         json_agg(json_build_object(
           'description', o.description,
           'quantity', o.quantity,
           'unit_price', o.unit_price
         )) AS line_items,
         SUM(o.quantity * o.unit_price) AS total
  FROM customers c
  JOIN orders o ON o.customer_id = c.id
  WHERE o.billed = false
    AND o.created_at >= date_trunc('month', CURRENT_DATE)
  GROUP BY c.id
`);

console.log(`Found ${customers.length} customers to invoice`);

// 2. Generate a PDF for each customer
for (const customer of customers) {
  const res = await fetch('https://api.typsetter.dev/v1/render', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${TYPSETTER_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      template: 'invoice-professional',
      format: 'pdf',
      data: {
        client_name: customer.name,
        client_address: customer.address,
        invoice_number: `INV-${new Date().toISOString().slice(0,7)}-${customer.id}`,
        date: new Date().toLocaleDateString('en-US'),
        items: customer.line_items,
        total: customer.total,
        currency: 'USD',
      },
    }),
  });

  const pdf = Buffer.from(await res.arrayBuffer());
  await writeFile(`./invoices/INV-${customer.id}.pdf`, pdf);
  console.log(`Generated invoice for ${customer.name}`);
}

await pool.end();

The key insight: your SQL query does the heavy lifting of joining tables and aggregating line items. The API call is a simple POST with the query result mapped to template variables. No HTML rendering, no CSS debugging, no browser process management.

Example 2: Python + MySQL — employee pay slips

This example connects to a MySQL database, queries employee payroll data for the current month, and generates a pay slip PDF for each employee. Python's requests library keeps the API call minimal.

generate_payslips.py
import os
import mysql.connector
import requests
from datetime import date

db = mysql.connector.connect(
    host="localhost",
    database="payroll",
    user="payroll_app",
    password=os.getenv("DB_PASSWORD"),
)
cursor = db.cursor(dictionary=True)

# 1. Query payroll data for the current month
cursor.execute("""
    SELECT e.id, e.full_name, e.department, e.position,
           p.gross_salary, p.tax, p.deductions,
           p.net_salary, p.pay_period
    FROM employees e
    JOIN payroll p ON p.employee_id = e.id
    WHERE p.pay_period = %s
""", (date.today().strftime("%Y-%m"),))

employees = cursor.fetchall()
print(f"Generating pay slips for {len(employees)} employees")

TYPSETTER_KEY = os.getenv("TYPSETTER_API_KEY")

# 2. Generate a PDF for each employee
for emp in employees:
    resp = requests.post(
        "https://api.typsetter.dev/v1/render",
        headers={"Authorization": f"Bearer {TYPSETTER_KEY}"},
        json={
            "template": "pay-slip-1",
            "format": "pdf",
            "data": {
                "employee_name": emp["full_name"],
                "department": emp["department"],
                "position": emp["position"],
                "gross_salary": str(emp["gross_salary"]),
                "tax": str(emp["tax"]),
                "deductions": str(emp["deductions"]),
                "net_salary": str(emp["net_salary"]),
                "pay_period": emp["pay_period"],
            },
        },
    )
    resp.raise_for_status()

    filename = f"payslips/{emp['id']}-{emp['pay_period']}.pdf"
    with open(filename, "wb") as f:
        f.write(resp.content)
    print(f"Generated: {filename}")

cursor.close()
db.close()

The pattern is identical regardless of language or database: query rows, map to template variables, call the API. The Typsetter template handles all formatting — columns, alignment, headers, footers, page breaks — so your application code stays focused on data.

Batch approach: generate hundreds of PDFs at once

When you need to generate documents for an entire table — monthly invoices for all customers, quarterly statements for all accounts — looping through individual API calls works but is not optimal. Typsetter supports batch generation natively.

Option A: CSV batch upload

Export your query result as a CSV file and upload it to the Typsetter batch endpoint. Each row becomes one PDF. The response is a ZIP file containing all generated documents.

batch-from-csv.mjs
import { readFile } from 'node:fs/promises';

const csv = await readFile('./customers-march.csv');

const form = new FormData();
form.append('template', 'invoice-professional');
form.append('file', new Blob([csv]), 'customers.csv');

const res = await fetch('https://api.typsetter.dev/v1/batch', {
  method: 'POST',
  headers: { 'Authorization': 'Bearer ts_live_sk_YOUR_KEY' },
  body: form,
});

// Response is a ZIP file containing one PDF per row
const zip = Buffer.from(await res.arrayBuffer());
await writeFile('./invoices-march.zip', zip);

Option B: parallel API calls with concurrency control

If you need more control over the process — custom error handling per record, progress tracking, conditional logic — use parallel API calls with a concurrency limiter.

parallel-generation.mjs
// Simple concurrency limiter — 10 parallel requests max
async function parallelMap(items, fn, concurrency = 10) {
  const results = [];
  const executing = new Set();
  for (const item of items) {
    const p = fn(item).then(r => {
      executing.delete(p);
      return r;
    });
    executing.add(p);
    results.push(p);
    if (executing.size >= concurrency) {
      await Promise.race(executing);
    }
  }
  return Promise.all(results);
}

// Generate all invoices with max 10 concurrent API calls
const pdfs = await parallelMap(customers, async (customer) => {
  const res = await fetch('https://api.typsetter.dev/v1/render', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${TYPSETTER_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      template: 'invoice-professional',
      format: 'pdf',
      data: customer,
    }),
  });
  return { id: customer.id, pdf: await res.arrayBuffer() };
});

Tip

For batches over 1,000 documents, use the CSV batch endpoint. It processes documents server-side in parallel and returns a single ZIP — no need to manage concurrency or handle individual failures in your code.

Scheduling: automate recurring document generation

Monthly invoices, weekly reports, quarterly statements — these are recurring tasks. You have two options for scheduling.

Option 1: Typsetter schedules (zero infrastructure)

Typsetter has a built-in schedules feature. You configure a schedule in the dashboard, point it at a data source (webhook URL that returns JSON, or a static data payload), and Typsetter generates the PDFs on the schedule you define. The generated documents are delivered via webhook, email, or stored in your Typsetter account.

This is the simplest approach — no cron job, no server, no Lambda function. You expose one endpoint that returns the current data, and Typsetter handles the rest.

Option 2: cron job or task scheduler

If you need full control, run the generation script on a schedule using your platform's native scheduler:

crontab (Linux) / Task Scheduler (Windows)
# Generate monthly invoices at midnight on the 1st of each month
0 0 1 * * node /app/scripts/generate-invoices.mjs

# Generate pay slips on the 25th of each month
0 0 25 * * python3 /app/scripts/generate_payslips.py

# Or use your platform's equivalent:
# - AWS: EventBridge + Lambda
# - GCP: Cloud Scheduler + Cloud Functions
# - Heroku: Heroku Scheduler
# - Railway / Render: cron jobs

Storing generated PDFs

Once you have the PDF binary, you need to put it somewhere. The right choice depends on your architecture.

Upload to S3 (or any object storage)

upload-to-s3.mjs
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';

const s3 = new S3Client({ region: 'us-east-1' });

async function storePdf(key, pdfBuffer) {
  await s3.send(new PutObjectCommand({
    Bucket: 'my-invoices',
    Key: `invoices/${key}.pdf`,
    Body: pdfBuffer,
    ContentType: 'application/pdf',
  }));
}

// After generating the PDF:
const pdfBuffer = Buffer.from(await res.arrayBuffer());
await storePdf(`INV-2026-03-${customer.id}`, pdfBuffer);

Save URL reference in the database

A common pattern is to store the S3 URL (or file path) back in the database, linked to the record that generated it. This lets you serve the PDF on demand without regenerating it.

store-reference.mjs
// After uploading to S3, update the invoice record
await pool.query(
  `UPDATE invoices SET pdf_url = $1, generated_at = NOW()
   WHERE id = $2`,
  [`https://my-invoices.s3.amazonaws.com/invoices/INV-${id}.pdf`, invoiceId]
);

Other storage options

Local filesystem: Simple for small-scale or development. Not recommended for production — no redundancy, no CDN, no access control.
Google Cloud Storage / Azure Blob: Same pattern as S3 with their respective SDKs.
Email attachment: Generate the PDF and attach it directly to an outbound email using Nodemailer, SendGrid, or Postmark.
Database BLOB column: Technically possible but generally discouraged. Databases are not optimized for serving binary files at scale.

Error handling and retries

When generating PDFs from database data at scale, things will go wrong. Missing fields, malformed data, network timeouts, rate limits. Robust error handling makes the difference between a script that works in development and one that works in production.

resilient-generation.mjs
async function generateWithRetry(data, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      const res = await fetch('https://api.typsetter.dev/v1/render', {
        method: 'POST',
        headers: {
          'Authorization': `Bearer ${TYPSETTER_KEY}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({ template: 'invoice-professional', data }),
      });

      if (res.status === 429) {
        // Rate limited — wait and retry
        const retryAfter = res.headers.get('Retry-After') || 2;
        await new Promise(r => setTimeout(r, retryAfter * 1000));
        continue;
      }

      if (!res.ok) {
        const err = await res.json();
        throw new Error(`API error ${res.status}: ${err.message}`);
      }

      return Buffer.from(await res.arrayBuffer());
    } catch (err) {
      if (attempt === maxRetries) throw err;
      // Exponential backoff: 1s, 2s, 4s
      await new Promise(r => setTimeout(r, 1000 * 2 ** (attempt - 1)));
    }
  }
}

// Track failures for reporting
const failures = [];
for (const customer of customers) {
  try {
    const pdf = await generateWithRetry(customer);
    await storePdf(customer.id, pdf);
  } catch (err) {
    failures.push({ id: customer.id, error: err.message });
    console.error(`Failed: customer ${customer.id} — ${err.message}`);
  }
}

if (failures.length) {
  console.error(`${failures.length} failures out of ${customers.length}`);
  // Send alert, write to dead letter queue, etc.
}

Key principles

Always validate your data before sending it to the API. A missing required field produces a clearer error from your validation layer than from the API response. Log every failure with the record ID so you can re-run just the failed records.

Performance tips

When generating thousands of PDFs from database queries, performance bottlenecks are more likely in your database and network layer than in the PDF API itself. Here are the optimizations that matter most.

1. Use connection pooling

Opening a new database connection per query is expensive. Both the Node.js pg and Python mysql-connector examples above use connection pools. Size the pool to match your concurrency — if you run 10 parallel API calls, you need at least 10 pool connections.

2. Paginate large result sets

Do not load 100,000 customer rows into memory at once. Use LIMIT and OFFSET (or cursor-based pagination with WHERE id > last_id) to process in chunks of 100–500 rows.

paginated-query.mjs
let lastId = 0;
const PAGE_SIZE = 200;

while (true) {
  const { rows } = await pool.query(
    `SELECT * FROM customers
     WHERE id > $1 ORDER BY id LIMIT $2`,
    [lastId, PAGE_SIZE]
  );

  if (rows.length === 0) break;
  lastId = rows[rows.length - 1].id;

  // Process this batch (parallel API calls)
  await parallelMap(rows, generatePdf, 10);
  console.log(`Processed up to ID ${lastId}`);
}

3. Run API calls in parallel

Typsetter's API handles concurrent requests well. Instead of sequential calls (N × 340ms), run 10–20 in parallel (N/10 × 340ms). The concurrency limiter pattern shown in the batch section prevents overwhelming either the API or your own server.

4. Cache templates

If you are using multiple templates, the Typsetter API caches compiled templates server-side. The first render of a template in a session may take slightly longer; subsequent renders are faster. Group your generation by template to take advantage of this.

5. Keep payloads lean

Only send the data your template actually uses. Sending the entire customer object with 50 fields when the template only uses 8 adds unnecessary serialization and transfer time. Map your query results to the exact shape the template expects.

Benchmark

With 10 parallel requests and cursor-based pagination, a test run generating 5,000 invoices from a PostgreSQL database completed in 2 minutes and 48 seconds — an effective rate of ~30 PDFs per second including database query time, API calls, and S3 uploads.

Start generating PDFs from your database today

The code in this guide is production-ready. The pattern is the same whether you are generating 10 invoices or 100,000 statements: query your database, shape the data, call the API. Typsetter handles the typesetting, page layout, fonts, and PDF compilation — your code handles the data.

The free tier gives you 100 PDFs per month to build and test your integration. No credit card, no approval process — create an account, get an API key, and start rendering.

Try Typsetter for free

100 PDFs/month on the free plan. API key in 30 seconds. No credit card required.

Get Your Free API Key → View API Docs

How to Generate PDFs from a Database (PostgreSQL & MySQL) with Typsetter

Why your database is your document engine

Architecture overview

Example 1: Node.js + PostgreSQL — monthly invoices

Example 2: Python + MySQL — employee pay slips

Batch approach: generate hundreds of PDFs at once

Option A: CSV batch upload

Option B: parallel API calls with concurrency control

Scheduling: automate recurring document generation

Option 1: Typsetter schedules (zero infrastructure)

Option 2: cron job or task scheduler

Storing generated PDFs

Upload to S3 (or any object storage)

Save URL reference in the database

Other storage options

Error handling and retries

Performance tips

1. Use connection pooling

2. Paginate large result sets

3. Run API calls in parallel

4. Cache templates

5. Keep payloads lean

Start generating PDFs from your database today

Try Typsetter for free