---
title: "PostgreSQL vertical scaling — 3 approaches in 2026"
excerpt: "PostgreSQL vertical scaling options compared: hardware and config tuning, connection pooling with query optimization, and analytical offload to ClickHouse®. Know when vertical hits its ceiling."
authors: "Tinybird"
categories: "AI Resources"
createdOn: "2026-04-13 00:00:00"
publishedOn: "2026-04-13 00:00:00"
updatedOn: "2026-04-13 00:00:00"
status: "published"
---

These are the main options for **PostgreSQL vertical scaling**:

1. **Hardware and config tuning** — maximize what the current server delivers
2. **Connection pooling + query optimization** — reduce resource contention without hardware changes
3. **Analytical query offload** — CDC to ClickHouse® via Tinybird for queries Postgres cannot serve at scale

**PostgreSQL vertical scaling** means extracting more performance from a single server before distributing load across multiple nodes. Tuning [PostgreSQL](https://www.ibm.com/think/topics/postgresql) configuration to use available hardware efficiently is almost always the right first step before introducing distributed infrastructure.

But vertical scaling has hard limits. Analytical queries that scan billions of rows are structurally expensive regardless of hardware. Knowing when you have hit the ceiling — and what to do next — is the real engineering judgment required.

Before choosing an approach, identify the root cause: **resource underutilization** (conservative defaults not using available hardware), **connection saturation** (too many idle connections), or **query patterns** (analytical scans that no hardware can make fast). Profile with `pg_stat_statements` to confirm.
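As a sketch of that triage, a high rows-per-call ratio or high mean latency usually marks an analytical scan, while high call counts with low latency point to OLTP traffic that tuning and pooling can help. The thresholds and sample numbers below are illustrative assumptions, not PostgreSQL constants:

```python
# Illustrative triage of pg_stat_statements output. Thresholds and the
# sample rows are assumptions for the sketch, not PostgreSQL constants.
def classify(query_stats):
    """Label each query by rows-per-call ratio and mean latency."""
    labels = {}
    for name, calls, mean_ms, rows in query_stats:
        if rows / calls > 10_000 or mean_ms > 1_000:
            labels[name] = "analytical scan"   # offload candidate
        else:
            labels[name] = "OLTP-style"        # tuning/pooling candidate
    return labels

stats = [
    # (query label, calls, mean_exec_time ms, total rows)
    ("daily_revenue_report", 120, 4_500, 90_000_000),
    ("get_user_by_id", 2_000_000, 0.4, 2_000_000),
]
print(classify(stats))
```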

## **Three ways to implement PostgreSQL vertical scaling**

### **Option 1: Hardware upgrade + configuration tuning**

The most direct approach. PostgreSQL ships with conservative default configuration designed for minimal hardware. On production servers with 32–256 GB RAM and fast NVMe storage, the defaults leave most resources idle.

**How it works:** upgrade server hardware (more RAM, faster NVMe, more CPU cores), then recalibrate `postgresql.conf` to use those resources. Key parameters directly affect query performance and concurrency.

**Critical `postgresql.conf` parameters to tune:**

```ini
# postgresql.conf — tuned for a 64 GB RAM server

# Memory
shared_buffers = 16GB             # 25% of RAM — PostgreSQL's own buffer cache
effective_cache_size = 48GB       # 75% of RAM — hint to query planner for OS cache
work_mem = 64MB                   # per-sort/hash, multiply by active queries
maintenance_work_mem = 2GB        # for VACUUM, CREATE INDEX, pg_restore

# WAL / checkpoint
wal_buffers = 64MB
checkpoint_completion_target = 0.9
max_wal_size = 4GB
min_wal_size = 1GB

# Planner
random_page_cost = 1.1            # for NVMe; default 4.0 is tuned for spinning disk
effective_io_concurrency = 200    # for NVMe; 2 for spinning disk

# Autovacuum — more aggressive on busy tables
autovacuum_max_workers = 6
autovacuum_vacuum_cost_delay = 2ms
autovacuum_vacuum_scale_factor = 0.01
```
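The sizing rules behind that config can be expressed as a small calculator. The percentages mirror this article's guidance (25% / 75% of RAM); the function name is ours, and real tuning should be validated against your workload rather than applied blindly:

```python
# Sketch: derive the memory settings above from total RAM, using the
# rules of thumb from this article (not official PostgreSQL defaults).
def pg_memory_settings(ram_gb: int) -> dict:
    return {
        "shared_buffers": f"{ram_gb // 4}GB",            # 25% of RAM
        "effective_cache_size": f"{ram_gb * 3 // 4}GB",  # 75% of RAM
        "work_mem": "64MB",                              # per sort/hash op
        "maintenance_work_mem": f"{min(2, max(1, ram_gb // 32))}GB",
    }

# For the 64 GB server used in the config block above:
print(pg_memory_settings(64))
```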

**Check current resource utilization before tuning:**

```sql
-- Top queries by total execution time (requires pg_stat_statements)
SELECT query, calls, total_exec_time::bigint AS total_ms,
       mean_exec_time::bigint AS mean_ms,
       rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;

-- Current buffer cache hit rate (target > 99%)
SELECT round(100.0 * sum(heap_blks_hit) /
             nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0), 2) AS hit_rate_pct
FROM pg_statio_user_tables;
```

**When this fits:**

- Your server has remaining capacity (CPU not consistently above 80%, and `pg_stat_statements` shows no obvious index or config gaps yet)
- Buffer cache hit rate is below target (for example, `shared_buffers` is still at the conservative default and has not been sized for your actual RAM)
- Queries are OLTP-style: point lookups, small range scans, indexed access
- You want the simplest path to more capacity without new infrastructure

**Trade-offs:** hardware upgrades have a cost ceiling — a 256 GB RAM server costs much more than a 64 GB server and provides limited incremental benefit beyond a certain point. Analytical queries scanning millions of rows are slow not because of RAM but because of PostgreSQL's row-oriented storage model. No hardware upgrade fixes a structural query mismatch.

**Prerequisites:** server access for hardware changes, PostgreSQL 14+, `pg_stat_statements` extension enabled.

### **Option 2: Connection pooling + query optimization**

The second most common cause of PostgreSQL slowness is **connection saturation** — too many connections holding memory and creating contention — combined with **unoptimized queries** that execute more work than necessary.

**How it works:** add **PgBouncer** in transaction mode to cap the number of actual PostgreSQL server connections regardless of how many clients connect. Then systematically optimize slow queries using `EXPLAIN ANALYZE` and targeted indexing.

**PgBouncer in transaction mode:**

```ini
# pgbouncer.ini
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
pool_mode = transaction
max_client_conn = 2000
default_pool_size = 50        # actual connections to PostgreSQL
max_db_connections = 100      # hard cap on connections per database
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
```

With transaction pooling, 2000 client connections share 50 actual PostgreSQL server connections — each connection uses ~5–10 MB of server memory, so this significantly reduces memory pressure.
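A back-of-the-envelope check of that claim, assuming ~7 MB per backend (the midpoint of the 5–10 MB range above):

```python
# Assumed per-backend memory: ~7 MB, midpoint of the 5-10 MB range.
MB_PER_BACKEND = 7

def backend_memory_mb(server_connections: int) -> int:
    """Approximate backend process memory for N PostgreSQL connections."""
    return server_connections * MB_PER_BACKEND

direct = backend_memory_mb(2000)  # one backend per client, no pooler
pooled = backend_memory_mb(50)    # PgBouncer transaction pooling
print(f"direct: {direct} MB, pooled: {pooled} MB")
```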

**Optimize a slow query with EXPLAIN ANALYZE:**

```sql
-- Identify a slow query
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.id, u.email, count(o.id) AS order_count, sum(o.amount) AS total_spent
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE o.created_at >= now() - INTERVAL '90 days'
  AND u.status = 'active'
GROUP BY u.id, u.email
ORDER BY total_spent DESC
LIMIT 100;
```

**Add targeted indexes for common patterns:**

```sql
-- Partial index for active users (reduces index size)
CREATE INDEX CONCURRENTLY idx_users_active ON users (id, email)
WHERE status = 'active';

-- Covering index for order lookup by user + date (avoids heap fetch)
CREATE INDEX CONCURRENTLY idx_orders_user_date ON orders (user_id, created_at DESC)
INCLUDE (amount, status);

-- Partial index for recent orders only
CREATE INDEX CONCURRENTLY idx_orders_recent ON orders (created_at DESC, user_id)
WHERE created_at >= '2025-01-01';
```

**When this fits:**

- Connection counts are high (above 200 server connections) and memory is under pressure
- `pg_stat_statements` shows specific queries with high `total_exec_time` that lack appropriate indexes
- You want query-level improvements without hardware changes or new infrastructure

**Trade-offs:** PgBouncer transaction mode is incompatible with session-scoped PostgreSQL features: named prepared statements, session-level advisory locks, and session-level `SET` (transaction-scoped `SET LOCAL` still works). Query optimization has diminishing returns — once every hot query has an appropriate index, the next bottleneck is query volume or data size. This approach does not help if queries are slow because they scan large amounts of data rather than because they lack indexes.

**Prerequisites:** PgBouncer 1.18+, `pg_stat_statements` enabled, `auto_explain` for identifying slow queries in production logs.

### **Option 3: Analytical query offload — CDC to ClickHouse® via Tinybird**

When vertical tuning reaches its limit, the remaining bottleneck is usually **analytical queries** that are structurally expensive in a row-oriented engine. Aggregations over millions of rows, dashboard queries scanning time ranges, and reporting pipelines compete with OLTP transactions for the same CPU and I/O — and no amount of vertical scaling eliminates that competition.

**How it works:** enable logical replication on PostgreSQL, stream changes into Tinybird via CDC, and route analytical queries to Tinybird's ClickHouse®-backed engine instead of hitting PostgreSQL. The OLTP database keeps serving transactional workloads; analytical queries move to an engine designed for them.

**Enable logical replication and create a publication:**

```sql
-- postgresql.conf: wal_level = logical (requires restart)

-- Verify WAL level
SHOW wal_level;

-- Create publication for analytical tables
CREATE PUBLICATION analytics_pub FOR TABLE orders, events, page_views, sessions;

-- A CDC connector (e.g. Debezium, Estuary, or Sequin) consumes this slot
-- and forwards change events to Tinybird via Kafka or HTTP ingest
SELECT pg_create_logical_replication_slot('tinybird_analytics', 'pgoutput');
```

**ClickHouse® schema in Tinybird for the events table:**

```sql
CREATE TABLE events_raw (
  event_id    UInt64,
  user_id     UInt64,
  session_id  UInt64,
  event_type  LowCardinality(String),
  page_path   String,
  event_time  DateTime,
  updated_at  DateTime
)
ENGINE = ReplacingMergeTree(updated_at)
PARTITION BY toYYYYMM(event_time)
ORDER BY (user_id, event_id)
```

**Analytical materialized view:**

```sql
CREATE MATERIALIZED VIEW events_hourly_mv
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(hour)
ORDER BY (event_type, hour)
AS SELECT
  toStartOfHour(event_time) AS hour,
  event_type,
  countState() AS total_events_state,
  uniqState(user_id) AS unique_users_state,
  uniqState(session_id) AS unique_sessions_state
FROM events_raw
GROUP BY hour, event_type
```

**Query the rollup with `-Merge` combinators:**

```sql
SELECT hour, event_type,
       countMerge(total_events_state) AS events,
       uniqMerge(unique_users_state) AS users
FROM events_hourly_mv
WHERE hour >= today() - 7
GROUP BY hour, event_type
ORDER BY hour
```

**When this fits:**

- Vertical tuning is done and the remaining slow queries are **analytical scans** — aggregations, time-range reports, dashboard queries
- You need **sub-second response times** on those queries under concurrent load from dashboards or APIs
- You also need the same analytical data accessible via APIs for [user-facing analytics](https://www.tinybird.co/blog/user-facing-analytics)

**Trade-offs:** adds CDC infrastructure and Tinybird ingestion pipeline. Data reflects PostgreSQL changes with a short lag (typically seconds). Requires modeling analytical queries as Tinybird Pipes. Does not help with OLTP write or read pressure directly — it removes analytical overhead.

**Prerequisites:** PostgreSQL 13+ with `wal_level = logical`, a CDC connector (Debezium + Kafka, Estuary Flow, or Sequin), Tinybird account.

### **Summary: picking the right option**

| Criterion | Hardware + config | Connection pooling + indexes | Analytical offload (Tinybird) |
|---|---|---|---|
| **Best for** | Underutilized hardware | Connection saturation + slow indexed queries | Analytical scans competing with OLTP |
| **Setup complexity** | Low | Low | Medium |
| **Ops overhead** | Low | Low | Low (managed) |
| **Sub-second analytics** | No | No | Yes |
| **Hardware ceiling** | Yes | Yes | No (separate engine) |
| **API reuse** | No | No | Yes |

## **Decision framework: what to choose for PostgreSQL vertical scaling**

Apply these in sequence — they are not mutually exclusive:

- **Start with hardware + config tuning** if you have not tuned `shared_buffers`, `work_mem`, `autovacuum`, or `random_page_cost` for your actual hardware. This is almost always under-done and provides immediate gains.
- **Add PgBouncer + index optimization** if connection counts are high, memory is under pressure, or `pg_stat_statements` shows high-execution-time queries without appropriate indexes.
- **Add analytical offload** when tuning is complete and the remaining slow queries are analytical scans. This is the lever that breaks through the vertical ceiling for analytical workloads — not by adding hardware, but by removing that workload type from PostgreSQL entirely.

**Bottom line:** most PostgreSQL performance problems can be fixed with proper config + indexing. But when analytical query interference remains after tuning, ClickHouse® via Tinybird is the cleanest path — it removes the root cause rather than scaling around it.

## **What is PostgreSQL vertical scaling and when does it matter?**

**PostgreSQL vertical scaling** means improving performance on a single [database](https://www.oracle.com/database/what-is-database/) server — adding hardware and tuning configuration to use those resources effectively. It is the first and most cost-effective scaling approach before introducing distributed architecture.

Most PostgreSQL performance problems are caused by misconfiguration, connection overhead, or unoptimized queries — not true hardware limits. Fixing these first extracts the most leverage with the least complexity.

Vertical scaling has a ceiling, though. [OLTP vs OLAP](https://www.tinybird.co/blog/oltp-vs-olap) workload mixing means no single server can serve high-frequency transactional writes and analytical aggregations over billions of rows without one affecting the other. At that point, workload separation — not bigger hardware — is the right move. The goal after tuning is predictable [low latency](https://www.cisco.com/site/us/en/learn/topics/cloud-networking/what-is-low-latency.html) for both OLTP and analytical query tiers.

## **Schema and pipeline design**

### **Practical schema rules for vertical scaling**

**Rule 1: use partial indexes for filtered queries.** A partial index on `WHERE status = 'active'` is smaller and faster than a full index — if 5% of rows match, the partial index is 20x smaller.

**Rule 2: avoid `SELECT *` in application queries.** Explicit column selection reduces I/O and `shared_buffers` pressure, especially on wide tables.

**Rule 3: tune autovacuum per table.** High-write tables need `autovacuum_vacuum_scale_factor = 0.01` and lower `autovacuum_vacuum_cost_delay` — defaults are designed for average load.

**Rule 4: set `CONNECTION LIMIT` per role.** Prevents a runaway process from exhausting all connections before PgBouncer intervenes.
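Rule 1's size claim is simple arithmetic on the match fraction (ignoring fixed per-page overhead, which makes real ratios somewhat smaller):

```python
# Rule 1 as arithmetic: a partial index stores only matching rows, so
# its size relative to a full index is roughly the match fraction.
def partial_index_shrink_factor(match_fraction: float) -> float:
    return 1.0 / match_fraction

print(partial_index_shrink_factor(0.05))  # 5% of rows match -> ~20x smaller
```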

### **Example: high-write table with per-table autovacuum tuning**

```sql
-- High-write events table with custom autovacuum settings
CREATE TABLE events (
  event_id    BIGSERIAL PRIMARY KEY,
  user_id     BIGINT NOT NULL,
  event_type  TEXT NOT NULL,
  payload     JSONB,
  created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
) WITH (
  autovacuum_vacuum_scale_factor    = 0.01,  -- vacuum after 1% dead tuples (default 20%)
  autovacuum_analyze_scale_factor   = 0.005, -- analyze after 0.5%
  autovacuum_vacuum_cost_delay      = 2,     -- less delay between vacuum pages
  fillfactor                        = 80     -- leave 20% free space for HOT updates
);

-- Partial covering index for recent events (the most queried).
-- Note: static date predicates require periodic recreation as your "recent" window shifts.
-- For long-lived tables, prefer time-based partitioning instead of a fixed partial index.
CREATE INDEX CONCURRENTLY idx_events_recent_user
  ON events (user_id, created_at DESC)
  INCLUDE (event_type)
  WHERE created_at >= '2026-01-01';
```

**Monitor vacuum health on busy tables:**

```sql
SELECT relname, n_dead_tup, n_live_tup,
       round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 2) AS dead_pct,
       last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
WHERE n_dead_tup > 10000
ORDER BY n_dead_tup DESC
LIMIT 20;
```

### **Failure modes**

1. **`work_mem` set too high causing OOM.** `work_mem` applies per sort or hash operation, and queries can have multiple. With 200 concurrent connections at `work_mem = 256MB`, the server runs out of memory quickly. Set conservatively (32–64 MB) and override per session for known heavy queries: `SET LOCAL work_mem = '1GB'`.

2. **PgBouncer transaction mode breaking prepared statements.** Named prepared statements are scoped to a server connection, which is reassigned between transactions in pooling mode. Applications using named prepared statements fail. Use session mode for those applications, or configure the ORM to use protocol-level (unnamed) prepared statements.

3. **Autovacuum unable to keep up.** High update rates accumulate dead tuples faster than defaults clean them. Table bloat grows, queries degrade. Set `autovacuum_vacuum_scale_factor = 0.01` on high-write tables and monitor `pg_stat_user_tables.n_dead_tup`.

4. **`shared_buffers` too large starving OS page cache.** Many teams stay at 25% of RAM and let the OS cache handle the rest. Pushing above ~40% can reduce OS cache and hurt sequential reads. Set `effective_cache_size` correctly (75% of RAM) so the planner accounts for both.

5. **Vertical ceiling reached with analytical scans remaining.** After full tuning, dashboard and reporting queries still consume most CPU and I/O. Adding RAM no longer helps — the queries are row-scanning by nature. Profile with `pg_stat_statements` to confirm the pattern, then implement analytical offload via CDC to ClickHouse®.
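The arithmetic behind failure mode 1 (`work_mem` OOM) is worth making explicit — allocation is per sort/hash operation, multiplied across concurrent operations:

```python
# Worst-case work_mem exposure. ops_per_query = 1 is the optimistic
# case; complex plans run several sort/hash operations at once.
def worst_case_gb(connections: int, work_mem_mb: int,
                  ops_per_query: int = 1) -> float:
    return connections * work_mem_mb * ops_per_query / 1024

print(worst_case_gb(200, 256))  # 50.0 GB -> OOM territory on a 64 GB server
print(worst_case_gb(200, 64))   # 12.5 GB with the conservative setting
```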

## **Why ClickHouse® for PostgreSQL vertical scaling**

ClickHouse® is the right escape hatch when PostgreSQL vertical scaling reaches its ceiling for analytical workloads. PostgreSQL's row-oriented storage reads entire rows to aggregate values. ClickHouse®'s columnar engine reads only the columns referenced in the query — an aggregation over a wide table that touches 3 columns reads a fraction of the data.
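To see the scale of that difference, compare bytes scanned under assumed uniform column widths — a hypothetical 1-billion-row table with 50 eight-byte columns, queried on 3 of them:

```python
# Hypothetical wide table: 1B rows, 50 uniform 8-byte columns, query
# touching 3 columns. Real tables vary, and columnar compression
# widens the gap further.
ROWS, COLS, BYTES_PER_VALUE = 1_000_000_000, 50, 8
QUERY_COLS = 3

row_store_bytes = ROWS * COLS * BYTES_PER_VALUE        # whole rows read
col_store_bytes = ROWS * QUERY_COLS * BYTES_PER_VALUE  # only 3 columns read
print(row_store_bytes // col_store_bytes)  # 16x less data scanned
```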

This is the [fastest database for analytics](https://www.tinybird.co/blog/fastest-database-for-analytics) model: not faster hardware, but a storage format purpose-built for the analytical query pattern. Freeing PostgreSQL from analytical work — rather than adding more RAM — is the highest-leverage move when vertical tuning is complete.

## **Why Tinybird is a strong fit for PostgreSQL vertical scaling**

When vertical scaling has been maximized and analytical queries remain the bottleneck, Tinybird provides the managed path to ClickHouse® without operational overhead. You get [real-time change data capture](https://www.tinybird.co/blog/real-time-change-data-capture) from PostgreSQL, ClickHouse®-backed query execution, and REST API publishing in one platform.

Define the analytical queries overloading PostgreSQL as Tinybird Pipes, connect CDC, and route dashboard and API traffic to Tinybird endpoints. PostgreSQL handles OLTP; Tinybird handles [real-time data processing](https://www.tinybird.co/blog/real-time-data-processing) for the analytical tier — without operating distributed PostgreSQL or a self-hosted ClickHouse® cluster.

Next step: run `SELECT query, total_exec_time, calls FROM pg_stat_statements ORDER BY total_exec_time DESC LIMIT 10`. Identify which top queries are analytical aggregations vs. transactional lookups. The analytical ones are your candidates for Tinybird offload.

## **Frequently Asked Questions (FAQs)**

### **What are the most impactful PostgreSQL vertical scaling config changes?**

`shared_buffers` (25% of RAM), `work_mem` (32–64 MB), `effective_cache_size` (75% of RAM), `random_page_cost` (1.1 for NVMe), and autovacuum `scale_factor = 0.01` for high-write tables. Then `max_wal_size` and checkpoint settings to reduce I/O spikes.

### **How do I know when PostgreSQL vertical scaling has reached its ceiling?**

When indexes exist for all high-frequency queries, autovacuum is keeping up, and the remaining slow queries are analytical aggregations or time-range scans — those are structurally expensive regardless of hardware. `pg_stat_statements` will show them at the top by `total_exec_time`.

### **What is PgBouncer transaction mode and when should I use it?**

PgBouncer transaction mode releases the server connection after each transaction, letting many clients share few server connections. Use it when client count exceeds ~200 and you see memory pressure or connection errors. Avoid it if your application uses named prepared statements, advisory locks, or session-scoped `SET` variables.

### **How does analytical offload via Tinybird complement PostgreSQL vertical scaling?**

They are sequential steps: tune vertically first, then offload what tuning cannot fix. Vertical scaling improves PostgreSQL's ability to serve all workloads on one server. Analytical offload removes the workload type that vertical scaling cannot address — analytical scans over large datasets. The combination maximizes performance per dollar.

### **Does PostgreSQL vertical scaling help with connection pool exhaustion?**

No. `max_connections` is a configured limit, and each connection spawns a backend process that uses ~5–10 MB of memory regardless of how much RAM you add. The fix is **PgBouncer**, not more hardware. Set `max_connections` to 100–300 and let PgBouncer multiplex application connections on top.

### **What PostgreSQL metrics should I monitor to guide vertical scaling decisions?**

Monitor: buffer cache hit rate (`pg_statio_user_tables`), dead tuple ratio (`pg_stat_user_tables`), top queries by total execution time (`pg_stat_statements`), checkpoint frequency (`pg_stat_bgwriter`), connection count vs. `max_connections`, and `autovacuum_count` per table. These point to the right lever — hardware, config, indexing, or analytical offload.
