Your database is slow. Before you throw hardware at it or start rewriting queries at random, it helps to know why.
Most SQL performance problems have a short list of causes: missing indexes, queries that scan more data than they need to, analytics patterns running against a transactional database, or connection handling that breaks down under load. Each has a fix. Some are a one-liner. A few require rethinking the architecture.
This post covers all of them, in order of how often they show up.
Check what's actually slow first
Don't guess. Run EXPLAIN ANALYZE before touching anything.
EXPLAIN ANALYZE
SELECT u.name, count(o.id) AS order_count
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2026-01-01'
GROUP BY u.name
ORDER BY order_count DESC
LIMIT 20;
In Postgres, the output shows you execution time, rows scanned vs. rows returned, and which nodes are doing the most work. Look for:
- Seq Scan on a large table: usually means a missing index
- rows=1000000 with actual rows=12: the planner estimated wrong, often due to stale statistics
- Hash Join on a huge table: might be better served with a different join strategy or a filtered subquery
- High actual time on a node that should be cheap: points to data volume or a type mismatch slowing the index lookup
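If you hit the second case, where the planner's row estimates are far from the actual counts, refreshing table statistics is often enough to get a better plan. A minimal example against the tables from the query above:
ANALYZE users;
ANALYZE orders;
Autovacuum usually keeps statistics current, but bulk loads and large deletes can leave them stale until the next run.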
Postgres also has pg_stat_statements, which surfaces your slowest queries by total execution time across all calls:
SELECT query, calls, total_exec_time, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
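If that view doesn't exist, it's because pg_stat_statements ships as an extension: it has to be listed in shared_preload_libraries (which requires a restart) and enabled in the database before it records anything.
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;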
Fix the queries at the top of this list. A query taking 800ms that runs 10,000 times a day is more important than one taking 5 seconds that runs twice.
Add the right indexes
The most common cause of slow queries is scanning rows you don't need to. Indexes fix this.
The simplest case: a WHERE clause on an unindexed column.
-- Slow: full table scan on 10M users
SELECT * FROM users WHERE email = 'ada@example.com';
-- Fix
CREATE INDEX CONCURRENTLY idx_users_email ON users (email);
CONCURRENTLY lets the index build without blocking writes to the table. Use it in production.
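One caveat: if a CONCURRENTLY build fails partway through (for example, a cancelled statement or a constraint violation), Postgres leaves behind an invalid index that adds write overhead without serving any queries. You can list those leftovers and drop them:
SELECT indexrelid::regclass AS index_name
FROM pg_index
WHERE NOT indisvalid;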
For multi-column filters, column order in the index matters. Put the columns compared with equality first, then the range condition:
-- Query filters by status then created_at
SELECT * FROM orders
WHERE status = 'pending' AND created_at > now() - interval '7 days';
-- Index: status first (equality match), then created_at (range)
CREATE INDEX idx_orders_status_created ON orders (status, created_at);
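A useful side effect: a B-tree index can also serve queries that filter only on its leading column, though generally not ones that filter only on a later column.
-- Can use idx_orders_status_created (status is the leading column)
SELECT count(*) FROM orders WHERE status = 'pending';
-- Generally can't use it efficiently: created_at is not the leading column
SELECT count(*) FROM orders WHERE created_at > now() - interval '7 days';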
If a query filters on one column and sorts by another, an index on the filter column followed by the sort column returns rows already in order, so the database can skip the sort. Adding INCLUDE for the selected columns makes it a covering index that never has to touch the table itself:
-- This query both filters and sorts by created_at
SELECT id, user_id, total FROM orders
WHERE user_id = 42
ORDER BY created_at DESC
LIMIT 10;
-- Covering index: user_id for the filter, created_at for the sort, include the projected columns
CREATE INDEX idx_orders_user_created ON orders (user_id, created_at DESC)
INCLUDE (id, total);
One warning: indexes aren't free. Every write to the table updates every index on it. Don't index every column. Index the columns that appear in WHERE, JOIN ON, and ORDER BY clauses in your slowest queries.
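To check whether the indexes you already have are earning their keep, Postgres tracks per-index scan counts in pg_stat_user_indexes. Indexes with zero or near-zero scans on a busy table are candidates for removal:
SELECT schemaname, relname AS table_name, indexrelname AS index_name, idx_scan
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC
LIMIT 10;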
Stop using SELECT *
SELECT * fetches every column from every matched row. If you need 3 columns from a 40-column table, you're fetching, transferring, and discarding 37 columns you never use, on every row.
-- Slow: reads all 40 columns
SELECT * FROM events WHERE user_id = 42;
-- Fast: reads only what you need
SELECT id, event_type, created_at FROM events WHERE user_id = 42;
This matters more than it sounds. Wide tables with JSONB or TEXT columns balloon row sizes. Fetching those columns takes real I/O even when you immediately discard them in application code.
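You can get a rough sense of how much width you're throwing away by comparing the full row size to the size of the columns you actually use. A sketch using Postgres's pg_column_size, with the same three columns as above:
SELECT
  avg(pg_column_size(e.*)) AS avg_full_row_bytes,
  avg(pg_column_size(e.id) + pg_column_size(e.event_type) + pg_column_size(e.created_at)) AS avg_needed_bytes
FROM events e
WHERE user_id = 42;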
Fix N+1 queries
N+1 is the most common application-level performance problem. It happens when you load a list of records and then query for related data on each one individually.
// N+1: 1 query for users + N queries for their orders
const users = await db.query("SELECT id, name FROM users LIMIT 100");
for (const user of users) {
  user.orders = await db.query(
    "SELECT * FROM orders WHERE user_id = $1", [user.id]
  );
}
This fires 101 queries. Replace it with a JOIN or a single IN query:
// 1 query for all 100 users' orders
const userIds = users.map((u) => u.id);
const rows = await db.query(`
  SELECT u.id, u.name, o.id AS order_id, o.total
  FROM users u
  LEFT JOIN orders o ON o.user_id = u.id
  WHERE u.id = ANY($1)
`, [userIds]);
ORM users: N+1 is almost always a missing .include() / .with() / .preload() call on a relationship. Turn on query logging in development and watch for repeated queries with different ID parameters.
Use connection pooling
Each Postgres connection is an OS process. Opening a new one takes 20-50ms and consumes memory. If your application opens a new connection per request and you're handling 500 req/sec, that's 10-25 seconds of cumulative connection setup work every second.
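You can see how many connections you're holding open, and how many are sitting idle, directly from Postgres:
SELECT state, count(*)
FROM pg_stat_activity
WHERE datname = current_database()
GROUP BY state;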
Use a connection pool. In Node.js with pg:
import { Pool } from "pg";
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20, // max pool size
  idleTimeoutMillis: 30_000,
  connectionTimeoutMillis: 2_000,
});
// Reuses connections from the pool
const result = await pool.query("SELECT ...");
For higher scale, put PgBouncer in front of Postgres in transaction pooling mode. It multiplexes thousands of application connections onto a small number of actual Postgres connections. A Postgres instance that falls over at 200 direct connections can handle thousands of application connections behind PgBouncer, usually with no application code changes.
Avoid functions on indexed columns in WHERE clauses
Wrapping a column in a function inside a WHERE clause usually prevents the database from using an index on that column. The index stores the original values, so it can't be used to look up values that are transformed at query time.
-- Slow: index on created_at can't be used
WHERE date_trunc('day', created_at) = '2026-05-11'
-- Fast: rewrite to a range query
WHERE created_at >= '2026-05-11' AND created_at < '2026-05-12'
-- Slow: index on email can't be used
WHERE lower(email) = 'ada@example.com'
-- Fast option 1: store emails lowercased on insert
-- Fast option 2: create a functional index
CREATE INDEX idx_users_email_lower ON users (lower(email));
WHERE lower(email) = 'ada@example.com' -- now uses the functional index
Partition large tables
Tables with hundreds of millions of rows often need partitioning to stay fast over time. Partitioning splits the physical storage into smaller chunks that the query planner can skip entirely based on filter values.
Range partitioning by time is the most common pattern for append-heavy tables:
CREATE TABLE events (
  id BIGSERIAL,
  user_id BIGINT,
  event_type TEXT,
  created_at TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (created_at);
CREATE TABLE events_2026_05 PARTITION OF events
  FOR VALUES FROM ('2026-05-01') TO ('2026-06-01');
CREATE TABLE events_2026_06 PARTITION OF events
  FOR VALUES FROM ('2026-06-01') TO ('2026-07-01');
A query with WHERE created_at >= '2026-05-01' AND created_at < '2026-06-01' now scans only the May partition, not the entire table. For a 5-year table with monthly partitions, that's skipping 59 of 60 partitions.
Pair partitioning with indexes on each partition. Partition pruning and index scans together are far faster than either alone.
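On Postgres 11 and later, creating an index on the partitioned parent automatically creates a matching index on every existing partition and on partitions added later, so one statement covers them all:
CREATE INDEX idx_events_user_created ON events (user_id, created_at);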
Materialize expensive aggregations
If you're running the same expensive aggregate query repeatedly (dashboard totals, leaderboards, summary stats), compute the result once and store it. Recomputing from raw data on every request doesn't scale.
-- Materialized view: computed once, queried many times
CREATE MATERIALIZED VIEW daily_order_totals AS
SELECT
  date_trunc('day', created_at) AS day,
  status,
  count(*) AS order_count,
  sum(total) AS revenue
FROM orders
GROUP BY 1, 2;
CREATE INDEX ON daily_order_totals (day);
-- Refresh on a schedule or after bulk inserts
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_order_totals;
CONCURRENTLY allows reads during the refresh. It requires a unique index on the view.
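A unique index over the grouping columns satisfies that requirement, since (day, status) identifies each row of this view:
CREATE UNIQUE INDEX idx_daily_order_totals_uniq ON daily_order_totals (day, status);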
When the problem isn't the query
All of the above assumes your bottleneck is query execution. Sometimes it isn't.
You're running analytics on a transactional database. Postgres and MySQL are designed for OLTP: many small reads and writes, indexed by primary key, with strong consistency guarantees. Analytical queries — aggregations over millions of rows, GROUP BY across multiple dimensions, time-series rollups — are a different workload entirely. Running them on a transactional database puts load on your primary, competes with application queries for I/O, and doesn't get better with indexes alone.
If you're running queries like this against Postgres:
SELECT
  date_trunc('hour', created_at) AS hour,
  event_type,
  region,
  count(*) AS events,
  count(DISTINCT user_id) AS unique_users
FROM events
WHERE created_at > now() - interval '30 days'
GROUP BY 1, 2, 3
ORDER BY 1;
...and it's slow even after indexing, that's a workload mismatch. That query needs a columnar OLAP database, not a row-store transactional one.
Tinybird runs on ClickHouse®, which stores data by column rather than by row. Analytical queries that scan billions of rows and aggregate across multiple dimensions return in milliseconds, because the database only reads the columns it needs, and reads them in compressed, vectorized chunks.
The same query in Tinybird, exposed as an API endpoint:
-- analytics_hourly.pipe
NODE hourly
SQL >
    %
    SELECT
        toStartOfHour(created_at) AS hour,
        event_type,
        region,
        count() AS events,
        uniq(user_id) AS unique_users
    FROM events
    WHERE created_at > now() - interval {{ Int32(days, 30) }} day
    GROUP BY hour, event_type, region
    ORDER BY hour
TYPE ENDPOINT
Parameters, auth, and HTTP serving are handled by Tinybird. Your application calls a URL. The analytics load moves off your Postgres primary entirely.
You've hit hardware limits. After indexes, connection pooling, query rewrites, and partitioning, the remaining lever is hardware: more RAM (so the working set fits in memory), faster storage (NVMe vs. SSD), or read replicas to distribute query load. These help. They're also expensive and don't fix architectural problems — a bad query on faster hardware is still a bad query.
Fix the queries first.
