Analytics queries kill transaction performance. Not because they're buggy — because they're doing exactly what they're supposed to do. A query aggregating 30 days of order data is doing a lot of legitimate work: it reads millions of rows, fills buffer cache pages that your transactional queries also need, and holds a connection open long enough to interfere with normal traffic.
The fix isn't to make the analytics query faster. The fix is to stop having the two workloads share the same resources.
This post covers the options in order of complexity: from quick wins that buy time, to architectural changes that separate the workloads properly.
Understand what's actually competing
Postgres has a shared buffer cache. When an analytics query reads millions of rows, it floods the cache with pages that transactional queries need for their indexed lookups. Those indexed lookups then go to disk. Latency spikes.
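To see the competition directly, the pg_buffercache extension (it ships in contrib, so availability depends on your environment) reports which relations currently occupy shared buffers. Run it before and after a dashboard load and watch your hot indexes lose ground:

CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- Top 10 relations by share of the buffer cache
SELECT c.relname, count(*) AS buffers
FROM pg_buffercache b
JOIN pg_class c
  ON b.relfilenode = pg_relation_filenode(c.oid)
 AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                           WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;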
Connections compound this. Postgres connections are OS processes. A long-running analytics query holds a connection and its memory allocation for its entire duration. If you have 100 connection slots and 20 dashboards each holding one connection for a slow GROUP BY, that's 20% of your capacity gone.
Run this to see what's currently holding connections and for how long:
SELECT
  pid,
  state,
  wait_event_type,
  wait_event,
  round(extract(epoch from now() - query_start)) AS seconds,
  left(query, 80) AS query_preview
FROM pg_stat_activity
WHERE state != 'idle'
  AND pid != pg_backend_pid()  -- exclude this monitoring query itself
ORDER BY seconds DESC;
Anything in state = 'active' with a long seconds value and a GROUP BY or COUNT DISTINCT in the query is your culprit.
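Once you have the pid, you can cancel the query without dropping the connection, or terminate the backend entirely:

-- Replace 12345 with the pid from pg_stat_activity
SELECT pg_cancel_backend(12345);    -- cancel the query, keep the connection
SELECT pg_terminate_backend(12345); -- drop the connection entirely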
Option 1: Read replica for analytics
The cheapest architectural change with the highest immediate impact. Create a read replica and route all analytics queries there.
Primary DB ──replication──> Replica <──reads── Analytics queries
    ▲
    └──writes + reads── Application (transactional)
Your application sends write traffic and latency-sensitive reads to the primary. Dashboard queries, report generation, and aggregations hit the replica. The primary's buffer cache is no longer being evicted by analytics scans.
In practice:
import { Pool } from "pg";

// Two connection pools — one per workload
const primaryPool = new Pool({ connectionString: process.env.DATABASE_PRIMARY_URL });
const replicaPool = new Pool({ connectionString: process.env.DATABASE_REPLICA_URL });

// Transactional queries hit the primary
async function createOrder(data: OrderData) {
  return primaryPool.query("INSERT INTO orders ...", [...]);
}

// Analytics queries hit the replica
async function getRevenueByDay(days: number) {
  // Pass the interval as a bind parameter instead of
  // interpolating it into the SQL string
  return replicaPool.query(
    `SELECT date_trunc('day', created_at) AS day, sum(total) AS revenue
     FROM orders
     WHERE created_at > now() - make_interval(days => $1)
     GROUP BY 1 ORDER BY 1`,
    [days]
  );
}
Replication lag is the tradeoff. Replicas typically lag the primary by a few hundred milliseconds to a few seconds. For dashboards that show hourly or daily aggregations, that's fine. For analytics that need to reflect data written in the last second, it's a problem.
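You don't have to guess at the lag. The primary reports per-replica lag in pg_stat_replication, and on the replica you can approximate it from the last replayed transaction (this number grows while the primary is idle, so treat it as an upper bound):

-- On the primary: lag as measured per replica
SELECT client_addr, write_lag, flush_lag, replay_lag
FROM pg_stat_replication;

-- On the replica: approximate replay lag
SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;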
Option 2: Materialized views with scheduled refresh
If you're running the same analytics queries repeatedly, compute them once and store the result. Recomputing from 30 million rows on every dashboard load doesn't scale.
CREATE MATERIALIZED VIEW revenue_by_day AS
SELECT
  date_trunc('day', created_at) AS day,
  status,
  count(*) AS order_count,
  sum(total) AS revenue
FROM orders
GROUP BY 1, 2;
CREATE UNIQUE INDEX ON revenue_by_day (day, status);
Refresh it on a schedule with pg_cron:
-- Refresh every 15 minutes
SELECT cron.schedule(
  'refresh-revenue-mv',
  '*/15 * * * *',
  'REFRESH MATERIALIZED VIEW CONCURRENTLY revenue_by_day'
);
CONCURRENTLY means reads are not blocked during the refresh. The unique index is required for it to work.
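A failed refresh fails silently from the dashboard's point of view: it just keeps serving the old data. Recent pg_cron versions record run history in cron.job_run_details, which is worth checking:

-- Confirm the refresh job is actually succeeding
SELECT jobid, status, return_message, start_time, end_time
FROM cron.job_run_details
ORDER BY start_time DESC
LIMIT 10;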
Now your dashboard hits the materialized view:
SELECT day, sum(revenue) AS revenue
FROM revenue_by_day
WHERE day > now() - interval '30 days'
GROUP BY day
ORDER BY day;
This query scans a small pre-aggregated table instead of the full orders history. It's fast, and it puts almost no load on the primary.
The tradeoff: data is only as fresh as the last refresh. For a refresh every 15 minutes, dashboards can be up to 15 minutes stale. Whether that's acceptable depends entirely on the use case.
Option 3: Partial indexes and time-based partitioning
If you can't move to a replica and don't want stale materialized views, you can at least minimize how much of the table your analytics queries touch.
For queries that always filter on recent data, a partial index covers only the rows that match:
-- Index only recent orders. Index predicates must use immutable
-- expressions, so now() won't work here: use a fixed cutoff date
-- and recreate the index periodically to advance the window.
CREATE INDEX idx_orders_recent ON orders (created_at, status)
WHERE created_at > '2026-01-01';
This index is small relative to the full table, and analytics queries that filter to dates inside the cutoff use it and scan far fewer pages.
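One caveat: the planner picks a partial index only when a query's WHERE clause provably implies the index predicate, and it can't prove implication through now(). Filter on literal dates that fall inside the cutoff:

-- Can use idx_orders_recent: the literal cutoff is inside the indexed range
SELECT status, count(*)
FROM orders
WHERE created_at > '2026-03-01'
GROUP BY status;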
For tables that have grown to hundreds of millions of rows, partitioning by time reduces scan scope at the storage level:
CREATE TABLE orders (
  id BIGSERIAL,
  user_id BIGINT,
  total NUMERIC(10,2),
  status TEXT,
  created_at TIMESTAMPTZ NOT NULL
) PARTITION BY RANGE (created_at);

CREATE TABLE orders_2026_q1 PARTITION OF orders
  FOR VALUES FROM ('2026-01-01') TO ('2026-04-01');

CREATE TABLE orders_2026_q2 PARTITION OF orders
  FOR VALUES FROM ('2026-04-01') TO ('2026-07-01');
A query scoped to Q2 2026 reads only orders_2026_q2. The planner prunes every other partition at plan time, without reading any of them. Analytics queries become bounded even without materialized views.
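To confirm pruning is happening, EXPLAIN a scoped query and check that only the target partition appears in the plan:

-- Only orders_2026_q2 should appear in the plan output
EXPLAIN (COSTS OFF)
SELECT count(*)
FROM orders
WHERE created_at >= '2026-04-01' AND created_at < '2026-07-01';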
Option 4: statement_timeout per role
Block long-running analytics queries from holding connections indefinitely. Create a dedicated analytics role with a statement timeout:
-- LOGIN so analytics clients connect as this role directly;
-- ALTER ROLE ... SET applies its settings at connection time
CREATE ROLE analytics_reader LOGIN;
GRANT CONNECT ON DATABASE yourdb TO analytics_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO analytics_reader;

ALTER ROLE analytics_reader SET statement_timeout = '30s';
ALTER ROLE analytics_reader SET work_mem = '64MB';
Any query from the analytics_reader role that runs longer than 30 seconds is automatically cancelled. It won't save your buffer cache from eviction — but it prevents runaway queries from holding connections for minutes.
Setting a higher work_mem for the analytics role lets sort and hash operations use more memory instead of spilling to disk, which makes analytics queries faster without changing the primary's global configuration. Keep in mind that work_mem applies per sort or hash operation, not per query, so a complex query can consume several multiples of it.
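To verify the overrides landed, rolconfig in pg_roles lists per-role settings:

-- Per-role settings, applied when the role connects
SELECT rolname, rolconfig
FROM pg_roles
WHERE rolname = 'analytics_reader';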
Option 5: Move analytics off Postgres
Read replicas, materialized views, and partitioning all buy time. They don't change the fundamental constraint: Postgres is a row-store transactional database, and analytical workloads are structurally a bad fit for it.
A query like this:
SELECT
  date_trunc('hour', created_at) AS hour,
  region,
  event_type,
  count(*) AS events,
  count(DISTINCT user_id) AS unique_users
FROM events
WHERE created_at > now() - interval '7 days'
GROUP BY 1, 2, 3
ORDER BY 1;
...runs against a row-store database by reading entire rows and discarding most columns. On a columnar OLAP database, it reads only the four columns it needs, compressed and vectorized, and skips every row outside the time window at the storage layer.
Tinybird is managed ClickHouse®: a columnar OLAP database with a serverless API layer. You ingest events via HTTP or Kafka, write the query as a SQL Pipe, and it becomes a parameterized REST endpoint. The analytics load moves off Postgres entirely.
The same query as a Tinybird endpoint:
-- event_stats.pipe
NODE stats
SQL >
    %
    SELECT
        toStartOfHour(created_at) AS hour,
        region,
        event_type,
        count() AS events,
        uniq(user_id) AS unique_users
    FROM events
    WHERE created_at > now() - interval {{ Int32(days, 7) }} day
    GROUP BY hour, region, event_type
    ORDER BY hour

TYPE ENDPOINT
Your dashboard calls https://api.tinybird.co/v0/pipes/event_stats.json?days=7. Postgres handles writes and transactional reads. Tinybird handles the analytics. Both do what they're actually designed for.
Which option to pick
| Situation | Option |
|---|---|
| Analytics and transactions share one instance today | Read replica first |
| Same analytics queries run repeatedly | Materialized views + pg_cron |
| Table has hundreds of millions of rows | Partitioning |
| Runaway queries holding connections | statement_timeout per role |
| Sub-second freshness required at scale | Move to OLAP (Tinybird) |
| Growing analytics team hitting Postgres directly | Move to OLAP (Tinybird) |
The options aren't mutually exclusive. A read replica with materialized views on top handles most teams. When that's not enough — when you need millisecond latency, real-time freshness, or analytics APIs serving end users — the workload belongs on a different database.
