ClickHouse® is fast by default. But "fast by default" means fast when the schema and queries are designed well. The same data can return in 40ms or 40 seconds depending on whether the sort key is used, whether you're scanning raw events instead of a rollup, and whether the BI tool wrapping your queries is generating COUNT(DISTINCT) and SELECT * without you realizing it.
Most SQL query performance improvements in ClickHouse come from schema decisions made at table creation time, not from query hints or index tuning after the fact. Get the sort key and table engine right and many queries are fast without further optimization.
This post covers the five most impactful changes you can make to get faster queries in ClickHouse, plus diagnostics and BI tool patterns.
1. Sort key design
The sort key is the most important performance lever in ClickHouse. It determines the order of rows on disk, which determines how much data ClickHouse reads to satisfy a filter.
ClickHouse stores data in granules (default: 8,192 rows each) and reads the minimum number of granules that could contain rows matching your WHERE clause. The primary key index maps the first row of each granule, so if rows are sorted by (org_id, timestamp), a query filtering on org_id = 'acme' skips all granules where the entire range belongs to other orgs.
Wrong sort key:
CREATE TABLE events (
event_time DateTime,
user_id String,
org_id String,
event_type LowCardinality(String)
)
ENGINE = MergeTree
ORDER BY (user_id, event_time); -- but queries filter on org_id first
A query WHERE org_id = 'acme' AND event_time >= ... scans the full table because org_id is not in the sort key. ClickHouse reads every granule to find matching rows.
Right sort key:
ORDER BY (org_id, event_time, event_type, user_id)
Now WHERE org_id = 'acme' AND event_time >= now() - 7 DAY reads only the granules belonging to that org in that time range. At 100 billion rows with 1,000 orgs, this is the difference between scanning 100 billion rows and scanning 100 million.
Rules:
- Put your highest-cardinality filter first, usually tenant or org ID.
- Time column second. Most queries have a time range.
- Additional filter columns after time.
- Never put a column you never filter on in the sort key.
2. LowCardinality on dimension columns
LowCardinality is one of the highest-impact optimizations available for free in ClickHouse. It applies dictionary encoding to columns with a bounded set of distinct values (typically fewer than 10,000), which reduces memory usage for GROUP BY operations and speeds up filtering significantly.
-- Without LowCardinality: stores raw string bytes for every row
event_type String,
-- With LowCardinality: stores an integer dictionary key per row
event_type LowCardinality(String),
For event tracking schemas, apply LowCardinality to: event name/type, platform, device type, country, plan tier, status values, any string with a bounded set of values. The compression improvement is typically 3-10x on those columns, and GROUP BY on LowCardinality columns uses integer comparison rather than string comparison.
Don't apply LowCardinality to high-cardinality columns like user_id, session_id, or order_id. The overhead of managing a dictionary larger than 65,535 values negates the benefit.
3. Avoid functions on sort key columns
Once you have the right sort key, the most common way to bypass it is wrapping a sort key column in a function in your WHERE clause. ClickHouse can't use the sort index when the filter column is transformed.
-- All of these break sort key usage
WHERE toDate(event_time) = today()
WHERE toString(org_id) = 'acme'
WHERE lower(event_type) = 'purchase'
WHERE toYYYYMM(event_time) = 202605
All of these force a full scan. Rewrite as range conditions on the raw column:
-- Uses sort key correctly
WHERE event_time >= today() AND event_time < today() + 1
WHERE org_id = 'acme'
WHERE event_type = 'purchase'
WHERE event_time >= '2026-05-01' AND event_time < '2026-06-01'
This is the most common performance issue when BI tools generate SQL against ClickHouse. Tools like Grafana, Metabase, and Tableau sometimes wrap date columns in functions. The 5 rules for writing faster SQL queries in ClickHouse all trace back to respecting the sort key and partition boundaries.
4. PREWHERE for selective filters
PREWHERE is a ClickHouse-specific optimization that reads a filter column first, finds matching rows, and then reads the remaining columns only for those rows. It's most effective when the filter is highly selective and the table is wide.
-- Standard WHERE: reads all columns, then filters
SELECT user_id, event_type, properties
FROM events
WHERE event_time >= now() - INTERVAL 7 DAY
AND event_name = 'purchase';
-- PREWHERE: reads event_name first, finds matching rows, then reads other columns
SELECT user_id, event_type, properties
FROM events
PREWHERE event_name = 'purchase'
WHERE event_time >= now() - INTERVAL 7 DAY;
On a wide table where event_name = 'purchase' matches 0.5% of rows, PREWHERE reduces I/O by ~99.5% for the non-PREWHERE columns. ClickHouse applies this optimization automatically in many cases, but explicitly specifying PREWHERE ensures it applies when your most selective filter is on a non-sort-key column.
Don't use PREWHERE with:
- Columns that are part of the sort key (the optimizer already uses them efficiently).
- Non-selective filters (filters matching >10% of rows gain little from PREWHERE).
Nullablecolumns (requires extra null checks that can negate the benefit).
For wide tables with many columns and selective attribute filters, PREWHERE consistently shows 5-20x I/O reduction. The ClickHouse query optimization documentation covers when the optimizer applies it automatically and when to force it manually.
5. Replace COUNT(DISTINCT) with uniq()
COUNT(DISTINCT column) is exact but expensive at scale. ClickHouse uses a hash set to track distinct values, which requires memory proportional to cardinality and grows linearly with the number of rows.
uniq() uses HyperLogLog with ~2% error and is 10-100x faster on high-cardinality columns. For most product analytics use cases, the approximation is imperceptible:
-- Slow: exact distinct count
SELECT COUNT(DISTINCT user_id) FROM events WHERE event_time >= today() - 7;
-- Fast: ~2% error, 10-100x faster
SELECT uniq(user_id) FROM events WHERE event_time >= today() - 7;
For low-cardinality columns (dozens to hundreds of distinct values), the difference is negligible either way. Use uniqExact() when exactness is contractually required (billing, compliance). Use uniq() for dashboards and product metrics.
The same pattern applies to approximate percentiles. quantile(0.95)(duration_ms) is much faster than quantileExact(0.95)(duration_ms) at scale, with error rates low enough for SLA monitoring.
6. Pre-aggregate with materialized views
Scanning billions of raw events for every dashboard query is the most common reason ClickHouse deployments feel slow after initial setup. The data is large, queries scan a lot of it, and dashboards that load in seconds at first take 10+ seconds as the table grows.
The fix is pre-aggregation. A materialized view computes aggregates at insert time and writes them to a rollup table. Dashboard queries hit the rollup, not the raw table. The pattern and options for materialized views in ClickHouse include backfilling historical data when you add a new rollup.
CREATE TABLE events_daily
(
day Date,
org_id String,
event_name LowCardinality(String),
events UInt64,
unique_users AggregateFunction(uniq, String),
revenue AggregateFunction(sum, Float64)
)
ENGINE = AggregatingMergeTree
PARTITION BY toYYYYMM(day)
ORDER BY (org_id, day, event_name);
CREATE MATERIALIZED VIEW events_daily_mv TO events_daily AS
SELECT
toDate(event_time) AS day,
org_id,
event_name,
count() AS events,
uniqState(user_id) AS unique_users,
sumState(revenue) AS revenue
FROM events
GROUP BY day, org_id, event_name;
-- Dashboard query reads rollup, not raw events
SELECT
day,
event_name,
sum(events) AS total_events,
uniqMerge(unique_users) AS unique_users,
sumMerge(revenue) AS revenue
FROM events_daily
WHERE org_id = 'acme'
AND day >= today() - 30
GROUP BY day, event_name
ORDER BY day;
A 30-day dashboard query now reads at most 30 rows per org/event-name combination, regardless of whether the underlying table has 1 billion or 1 trillion rows. Query latency stays flat as data grows.
Diagnosing slow queries
ClickHouse's system.query_log records every query with rows read, bytes read, and duration. Use it to find what's slow:
SELECT
substr(query, 1, 120) AS query_preview,
read_rows,
formatReadableSize(read_bytes) AS bytes_read,
query_duration_ms,
user
FROM system.query_log
WHERE type = 'QueryFinish'
AND event_time >= now() - INTERVAL 1 HOUR
AND query_duration_ms > 500
ORDER BY read_bytes DESC
LIMIT 20;
A query with high read_bytes relative to read_rows is reading wide rows, often SELECT * from a wide table. A query with high read_rows but low read_bytes is reading narrow data but scanning too many rows, usually a sort key miss.
For deeper diagnosis, use EXPLAIN PIPELINE to see the query execution plan:
EXPLAIN PIPELINE
SELECT org_id, count(), uniq(user_id)
FROM events
WHERE event_time >= now() - INTERVAL 7 DAY
GROUP BY org_id;
The pipeline output shows which indexes are used, how many granules are read, and where parallelism happens. A query using no index shows FullScan. A query using the sort key shows PrimaryKeyCondition. If you see FullScan on a query you expect to be indexed, check for function wrapping on filter columns.
Common fixes for BI tool-generated SQL
When a BI tool connects to ClickHouse, the SQL it generates often bypasses the optimizations above:
| BI-generated SQL | Problem | Fix |
|---|---|---|
SELECT * | Reads all columns | Define a view with only the needed columns |
WHERE toDate(ts) = ... | Wraps sort key | Rewrite as range filter in the BI tool |
COUNT(DISTINCT x) | Slow exact count | Use uniq() in a pre-aggregated layer |
| Query hits raw table | No pre-aggregation | Point BI tool at rollup table |
| No partition filter | Reads all partitions | Add event_time filter to every query |
The cleanest fix is to create dedicated views or rollup tables that the BI tool queries, designed so that even a SELECT * on them is fast. The view selects only the columns the BI tool uses, filters to the relevant time range, and reads from the pre-aggregated table.
Tinybird for fast query APIs
For query APIs that need to be fast under concurrent load, Tinybird is managed ClickHouse with analytics at scale and observability built in. Pipes let you define the optimized query once (with proper sort key usage, PREWHERE, and rollup tables) and expose it as a parameterized HTTP endpoint. The built-in query analytics show you latency percentiles, cache hit rates, and read bytes per endpoint, so you can identify regressions before your users do.
