---
title: "How to optimize ClickHouse® for high-throughput streaming"
excerpt: "ClickHouse streaming analytics processes millions of events per second. Here's how to build pipelines that keep up with your data."
authors: "Cameron Archer"
categories: "AI Resources"
createdOn: "2025-10-09 22:02:31"
publishedOn: "2025-10-17 22:02:31"
updatedOn: "2025-10-17 22:02:31"
status: "published"
---

Streaming analytics systems process millions of events per second, but most databases struggle to ingest data fast enough while still returning query results in milliseconds. ClickHouse{% sup %}®{% /sup %} is built differently: its columnar storage, vectorized execution, and merge tree architecture can handle both high-velocity writes and sub-second analytical queries on billions of rows.

This guide covers how to optimize ClickHouse{% sup %}®{% /sup %} for high-throughput streaming workloads, from schema design and ingestion tuning to query optimization and infrastructure decisions. You'll learn the specific settings, patterns, and tradeoffs that determine whether your system handles thousands or millions of events per second. Unlike complex stream processors like Apache Flink, which [most teams don't actually need](https://www.tinybird.co/blog-posts/flink-is-95-problem), ClickHouse{% sup %}®{% /sup %} provides a simpler approach using familiar SQL.

## Why ClickHouse{% sup %}®{% /sup %} excels at high-throughput streaming analytics

ClickHouse{% sup %}®{% /sup %} is a columnar database designed for analytical processing, which makes it particularly well-suited for streaming analytics workloads. The database stores data by column rather than by row, which means queries that aggregate or filter specific fields can skip reading unnecessary data entirely.

This columnar approach delivers compression ratios of [10x or more](https://clickhouse.com/engineering-resources/database-compression) compared to row-oriented databases. When you're ingesting millions of events per second, that compression directly reduces both storage costs and the amount of data that queries need to scan.

ClickHouse{% sup %}®{% /sup %} uses [vectorized query execution](https://clickhouse.com/docs/en/introduction/performance.html), processing data in batches rather than one row at a time. Modern CPUs can apply the same operation to multiple values simultaneously through SIMD instructions, which is why [ClickHouse{% sup %}®{% /sup %} queries](https://www.tinybird.co/blog-posts/clickhouse-query-examples) can return results in under 100 milliseconds even when scanning billions of rows.

The database scales both vertically and horizontally. You can add more CPU cores and memory to a single server, or you can distribute data across multiple shards with replication for fault tolerance.

## Core architecture for real-time ingestion and queries

A typical [streaming analytics setup](https://www.tinybird.co/blog-posts/real-time-streaming-data-architectures-that-scale) with ClickHouse{% sup %}®{% /sup %} involves four main components that work together. Each component handles a specific part of the data flow, from ingestion to serving queries.

### 1. Streaming sources to buffer tables

Data arrives from sources like Apache Kafka, webhooks, or HTTP APIs. ClickHouse{% sup %}®{% /sup %} can consume directly from Kafka using the Kafka table engine, which creates a consumer that reads messages and writes them to a target table.

Buffer tables accumulate incoming data in memory before flushing to disk. This reduces the number of small data parts created during high-velocity ingestion, which otherwise slows down background merges and queries.
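As a sketch, a Buffer table in front of a hypothetical `events` MergeTree table might look like this (table names, columns, and thresholds are all illustrative, not a recommendation):

```sql
-- Hypothetical target table for raw events
CREATE TABLE default.events
(
    timestamp  DateTime,
    user_id    String,
    event_type LowCardinality(String)
)
ENGINE = MergeTree
ORDER BY (timestamp, user_id);

-- Buffer table: flushes to default.events when all min thresholds are met
-- or any max threshold is exceeded, per layer.
-- Signature: Buffer(database, table, num_layers,
--                   min_time, max_time, min_rows, max_rows, min_bytes, max_bytes)
CREATE TABLE default.events_buffer AS default.events
ENGINE = Buffer(default, events, 16, 10, 100, 10000, 1000000, 10000000, 100000000);
```

Clients insert into `events_buffer`, and ClickHouse{% sup %}®{% /sup %} writes fewer, larger parts into `events`.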

You can also batch inserts on the client side, which gives you more control over batch sizes and error handling. For detailed examples of client-side batch inserts from application code in Python, see our [ClickHouse{% sup %}®{% /sup %} Python batch insert guide](https://www.tinybird.co/blog-posts/clickhouse-python-example). The tradeoff is that you're responsible for managing consumer logic and handling failures.

### 2. Materialized views for roll-ups

[Materialized views in ClickHouse{% sup %}®{% /sup %}](https://www.tinybird.co/blog-posts/what-are-materialized-views-and-why-do-they-matter-for-realtime) automatically transform or aggregate data as it's written to a source table. This pre-computation means queries can read from smaller, pre-aggregated tables instead of scanning raw events. Learn how to [create materialized views in ClickHouse{% sup %}®{% /sup %}](https://www.tinybird.co/blog-posts/clickhouse-create-materialized-view-example) with complete syntax examples.

Common patterns include hourly summaries, top-N calculations for most active users, and approximate cardinality using HyperLogLog. Each materialized view writes to its own MergeTree table that you can query independently.

The tradeoff is that backfilling becomes more complex. If you need to recalculate historical aggregates, you'll need to rebuild the materialized view or manually insert corrected data.
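For example, an hourly roll-up over a hypothetical `events` table can be maintained with a materialized view writing into a SummingMergeTree target (names and columns are illustrative):

```sql
-- Pre-aggregated target table: rows with the same (hour, event_type)
-- are summed during background merges.
CREATE TABLE events_hourly
(
    hour       DateTime,
    event_type LowCardinality(String),
    events     UInt64
)
ENGINE = SummingMergeTree
ORDER BY (hour, event_type);

-- The view fires on every insert into events and writes rollup rows.
CREATE MATERIALIZED VIEW events_hourly_mv TO events_hourly AS
SELECT
    toStartOfHour(timestamp) AS hour,
    event_type,
    count() AS events
FROM events
GROUP BY hour, event_type;
```

Dashboards then query `events_hourly` instead of scanning raw events.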

### 3. Denormalized analytics tables

ClickHouse{% sup %}®{% /sup %} performs best when related data lives in the same table rather than split across multiple tables with joins. This means duplicating user attributes into each event row instead of storing them separately.

For example, instead of joining an events table with a users table, you might embed user_name, user_tier, and user_region directly into each event. This increases storage slightly but eliminates expensive join operations.

The tradeoff is write amplification. When a user attribute changes, you may need to update many rows. For streaming analytics where reads vastly outnumber writes, this tradeoff usually makes sense.
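A denormalized schema along these lines might look like the following (the columns are illustrative):

```sql
-- User attributes are embedded in each event row, so dashboards can
-- filter and group by tier or region without a join.
CREATE TABLE events_denormalized
(
    timestamp   DateTime,
    event_type  LowCardinality(String),
    user_id     String,
    user_name   String,
    user_tier   LowCardinality(String),
    user_region LowCardinality(String)
)
ENGINE = MergeTree
ORDER BY (user_region, user_tier, timestamp);
```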

### 4. Parameterized API endpoints

Exposing ClickHouse{% sup %}®{% /sup %} queries as REST APIs lets applications fetch analytics without writing SQL directly. Parameterized queries accept filters, limits, and other options through URL parameters while preventing SQL injection.

For multi-tenant applications, row-level security filters data based on the authenticated user or API token. This ensures each tenant sees only their own data when querying shared tables.

Caching API responses reduces query load for frequently accessed data. Rate limiting protects your cluster from runaway queries.
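ClickHouse{% sup %}®{% /sup %}'s native query parameters are one building block for such endpoints: `{name:Type}` placeholders are bound server-side, so user input never becomes SQL text. A sketch, assuming an `events` table:

```sql
-- Parameters are typed and bound by the server, preventing SQL injection.
SELECT event_type, count() AS events
FROM events
WHERE user_id = {user_id:String}
  AND timestamp >= {since:DateTime}
GROUP BY event_type
ORDER BY events DESC
LIMIT {limit:UInt32};
```

Over the HTTP interface, values are passed as `param_user_id=...` query-string arguments alongside the query.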

## Designing tables for million-row-per-second inserts

Schema design directly affects how much data ClickHouse{% sup %}®{% /sup %} can ingest per second and how fast queries run. The right table engine, sorting key, and partitioning strategy can be the difference between handling [millions of rows per second](https://www.tinybird.co/blog-posts/1b-rows-per-second-clickhouse) versus struggling with thousands.

### 1. Choose the right engine and order-by key

The MergeTree family of engines is designed for high-throughput inserts and fast analytical queries. ReplicatedMergeTree adds replication across servers for fault tolerance. Specialized variants like SummingMergeTree and AggregatingMergeTree pre-aggregate data during background merges.

The `ORDER BY` clause determines how data is sorted on disk. For streaming workloads, common patterns include:

- **Timestamp first:** Orders data chronologically, which works well for time-range queries
- **Device or user ID first:** Groups related events together, useful for session analysis
- **Tuple of hot filters:** Combines frequently filtered columns to maximize index effectiveness

The primary key in ClickHouse{% sup %}®{% /sup %} is a prefix of the `ORDER BY` key used for sparse indexing. Unlike traditional databases, ClickHouse{% sup %}®{% /sup %} doesn't enforce uniqueness. Instead, it creates granules (blocks of rows) and stores min/max values for each granule, letting queries skip entire granules that don't match filters.
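Putting those pieces together, a "tuple of hot filters" schema might look like this (illustrative table and columns):

```sql
-- Tenant and device first for point lookups; timestamp last so time ranges
-- prune within each device. The PRIMARY KEY is a prefix of ORDER BY and is
-- what the sparse index keeps in memory.
CREATE TABLE device_events
(
    tenant_id UInt32,
    device_id String,
    timestamp DateTime,
    metric    Float64
)
ENGINE = MergeTree
PRIMARY KEY (tenant_id, device_id)
ORDER BY (tenant_id, device_id, timestamp);
```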

### 2. Partition by time and cardinality

Partitioning splits a table into separate directories based on a partition key, typically a time-based column like date or hour. This makes dropping old data fast and can speed up queries that filter by the partition key.

Too many partitions hurt performance though. Each partition creates separate files on disk, and ClickHouse{% sup %}®{% /sup %} tracks metadata for all of them. A good guideline is keeping partitions under a few thousand. For most streaming workloads, daily or hourly partitions work well.

Late-arriving data can still be inserted into older partitions at any time, but each late insert creates new parts in those partitions and triggers additional merges. If late arrivals are common, consider using longer partition intervals.
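A daily-partitioned table, and the cheap retention operation partitioning enables, might look like this (names are illustrative):

```sql
CREATE TABLE events_partitioned
(
    timestamp DateTime,
    user_id   String,
    payload   String
)
ENGINE = MergeTree
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY (timestamp, user_id);

-- Dropping a whole day is a metadata operation, not a rewrite.
ALTER TABLE events_partitioned DROP PARTITION 20250101;
```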

### 3. Apply codecs and compression wisely

ClickHouse{% sup %}®{% /sup %} supports several compression codecs that trade CPU time for storage savings. LZ4 is the default and balances speed with compression. ZSTD provides better compression ratios but uses more CPU during both compression and decompression.

Per-column codecs can optimize specific data types:

- **Delta and DoubleDelta:** Work well for timestamps and monotonically increasing values
- **Gorilla:** Designed for floating-point time series data
- **LZ4 vs ZSTD:** LZ4 for hot data queried frequently, ZSTD for cold data rarely accessed

Higher compression uses more CPU during writes and reads. For data queried often, LZ4 minimizes query latency. For data rarely accessed, ZSTD significantly reduces storage costs.
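In DDL, per-column codecs look like the following sketch (column choices are illustrative):

```sql
CREATE TABLE metrics
(
    -- Delta encodes differences between consecutive timestamps,
    -- which then compress very well under ZSTD.
    timestamp   DateTime CODEC(Delta, ZSTD),
    sensor_id   UInt32,
    -- Gorilla targets slowly-changing floating-point time series.
    value       Float64 CODEC(Gorilla),
    -- Rarely-read payloads favor compression ratio over speed.
    raw_payload String CODEC(ZSTD(3))
)
ENGINE = MergeTree
ORDER BY (sensor_id, timestamp);
```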

### 4. Avoid costly mutations and TTL clashes

`ALTER TABLE` operations that modify or delete existing data (mutations) are expensive because they rewrite entire data parts. Frequent mutations slow ingestion by competing for disk I/O and CPU with background merges.

For workloads that need updates or deletes, ReplacingMergeTree with a version column offers an alternative. This engine automatically deduplicates rows with the same `ORDER BY` key during merges, keeping only the row with the highest version number.

TTL (Time To Live) rules automatically delete old data based on a time column. However, TTL operations can trigger large numbers of merges if not configured carefully. Align TTL intervals with partition boundaries to minimize merge overhead.
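Both ideas can be combined in one table definition, as in this sketch (names and the 90-day interval are illustrative):

```sql
-- Rows sharing the same user_id are deduplicated during merges, keeping
-- the highest version; rows older than 90 days are expired by TTL.
-- Both happen lazily in background merges, not at insert time.
CREATE TABLE user_state
(
    user_id    String,
    updated_at DateTime,
    tier       LowCardinality(String),
    version    UInt64
)
ENGINE = ReplacingMergeTree(version)
ORDER BY user_id
TTL updated_at + INTERVAL 90 DAY;
```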

## Optimizing merge tree settings for streaming workloads

ClickHouse{% sup %}®{% /sup %} continuously merges small data parts into larger ones in the background. Tuning these merge settings significantly impacts both ingestion throughput and query performance.

### Adjusting `max_partitions_to_read`

The `max_partitions_to_read` setting limits how many partitions a single query can scan. This prevents expensive queries from accidentally reading months of data when they only need a few days.

You can set this limit globally in `users.xml` or per-query using the `SETTINGS` clause. For streaming analytics where queries typically focus on recent data, a limit of 100-1000 partitions is often reasonable.

When a query exceeds this limit, ClickHouse{% sup %}®{% /sup %} returns an error rather than executing. This acts as a guardrail against runaway queries.
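Applied per query, the guardrail looks like this (the limit value is illustrative):

```sql
-- Errors out instead of scanning unbounded history if the WHERE clause
-- accidentally matches too many partitions.
SELECT count()
FROM events
WHERE timestamp >= now() - INTERVAL 7 DAY
SETTINGS max_partitions_to_read = 8;
```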

### Controlling background merges

ClickHouse{% sup %}®{% /sup %} uses [background threads to merge](https://www.tinybird.co/blog-posts/what-i-learned-operating-clickhouse) small data parts into larger ones. The `background_pool_size` setting controls how many merge threads run concurrently. More threads speed up merges but increase CPU and disk I/O usage.

The `max_bytes_to_merge_at_max_space_in_pool` setting limits the maximum size of parts that can be merged together. Larger values reduce the number of parts on disk, improving query performance, but increase time and resources needed for each merge.

For high-throughput ingestion, you may need to increase background threads to keep up with incoming data. However, too many threads can starve queries of resources.

### Fine-tuning `insert_quorum` and replication

For replicated tables, `insert_quorum` controls how many replicas must acknowledge a write before it's considered successful. By default, quorum writes are disabled (`insert_quorum = 0`), which maximizes throughput: a write is acknowledged as soon as the replica that received it commits the data.

Higher quorum values increase durability at the cost of write latency. For streaming workloads where losing a small window of data during a replica failure is acceptable, the default is often sufficient.

Async inserts (`async_insert=1`) buffer multiple small inserts into larger batches before writing to disk. This dramatically improves throughput for high-frequency, small-batch workloads, but adds a small delay (controlled by `async_insert_busy_timeout_ms`) before data becomes visible to queries.

## Tuning ingestion from Kafka and other streams

The Kafka table engine provides native integration with Kafka topics, but it requires careful configuration to achieve high throughput and low latency.

### 1. Kafka table settings

The `kafka_num_consumers` setting controls how many consumer threads ClickHouse{% sup %}®{% /sup %} creates to read from Kafka. More consumers increase throughput but also increase memory usage and concurrent connections to Kafka. With proper configuration, Kafka integration can achieve [2 million rows per second](https://mux.com/blog/latency-and-throughput-tradeoffs-of-clickhouse-kafka-table-engine) on a single server.

The `kafka_max_block_size` setting determines how many rows ClickHouse{% sup %}®{% /sup %} accumulates before flushing to the target MergeTree table. Larger blocks reduce the number of small parts created, improving merge performance, but increase memory usage and delay before data becomes queryable.

ClickHouse{% sup %}®{% /sup %} supports several message formats:

- **JSONEachRow:** Human-readable but slower to parse
- **Protobuf and Avro:** Binary formats that are faster and more compact, but require schema definitions
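A typical wiring of these settings, assuming a MergeTree table `events` already exists (broker address, topic, and values are placeholders):

```sql
-- Kafka engine table: a consumer group reading JSON messages.
CREATE TABLE events_queue
(
    timestamp  DateTime,
    user_id    String,
    event_type String
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:9092',
         kafka_topic_list = 'events',
         kafka_group_name = 'clickhouse_events',
         kafka_format = 'JSONEachRow',
         kafka_num_consumers = 4,
         kafka_max_block_size = 1048576;

-- A materialized view moves consumed rows into the MergeTree table;
-- without one, reading from events_queue consumes messages directly.
CREATE MATERIALIZED VIEW events_queue_mv TO events AS
SELECT timestamp, user_id, event_type
FROM events_queue;
```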

### 2. Batched synchronous vs async inserts

For direct inserts from application code, batching multiple rows into a single `INSERT` statement dramatically improves throughput. The `max_insert_block_size` setting controls the default batch size, though you can override it per query.

Synchronous inserts wait for data to be written to disk before returning. This provides strong consistency but limits throughput. Async inserts (`async_insert=1`) buffer data in memory and return immediately, then flush to disk in the background.

The tradeoff with async inserts is that data may be lost if the server crashes before flushing. For many streaming analytics use cases where some data loss is acceptable, async inserts provide a significant throughput boost.
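In SQL, an async insert is opted into per statement or per session, roughly like this (setting values are illustrative, and the timeout name assumes a recent ClickHouse{% sup %}®{% /sup %} version):

```sql
-- The server buffers small inserts and flushes them as one block.
-- wait_for_async_insert = 0 returns before the flush: lowest latency,
-- but data is lost if the server crashes before flushing.
INSERT INTO events (timestamp, user_id, event_type)
SETTINGS async_insert = 1,
         wait_for_async_insert = 0,
         async_insert_busy_timeout_ms = 200
VALUES (now(), 'user_42', 'page_view');
```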

### 3. Handling schema evolution without downtime

Adding new columns to a ClickHouse{% sup %}®{% /sup %} table is fast and doesn't require rewriting existing data. Use `ALTER TABLE ADD COLUMN` with a default value, and new rows will include the column while old rows use the default.

For more complex schema changes, create a new materialized view that reads from the source table and writes to a new target table with the updated schema. This allows [gradual migration without stopping ingestion](https://www.tinybird.co/blog-posts/clickhouse-schema-migration-while-streaming).

ReplacingMergeTree can also help with schema migrations by allowing you to insert updated versions of rows with the same key. During merges, ClickHouse{% sup %}®{% /sup %} keeps only the latest version.
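The additive case is a one-liner; existing parts are not rewritten, and old rows simply report the default (column name is illustrative):

```sql
ALTER TABLE events ADD COLUMN country LowCardinality(String) DEFAULT 'unknown';
```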

## Minimizing query latency on hot data

Even with optimized ingestion, queries can be slow if they scan too much data or don't use ClickHouse{% sup %}®{% /sup %}'s indexing features. For visualizing performance metrics, you can [connect Grafana to ClickHouse{% sup %}®{% /sup %}](https://www.tinybird.co/blog-posts/clickhouse-grafana-example) to monitor query latency and throughput in real-time.

### Bloom filters and skip indexes

Skip indexes (also called data skipping indexes) let ClickHouse{% sup %}®{% /sup %} skip entire granules of data that don't match a query's `WHERE` clause. Bloom filter indexes are particularly useful for high-cardinality string columns like user IDs or session IDs: the `bloom_filter` type handles exact-match filters, while `tokenbf_v1` handles token searches within longer strings.

To add a Bloom filter index, use `ALTER TABLE ADD INDEX`. The index is applied to new parts as they're written; for existing data, run `ALTER TABLE ... MATERIALIZE INDEX` or wait for the parts to be rewritten by background merges.

Choose indexes based on your most common query patterns. If you frequently filter by `user_id`, add a Bloom filter on that column. If you filter by timestamp ranges, the primary index (based on `ORDER BY`) is usually sufficient.
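A sketch of adding and backfilling a Bloom filter index on `user_id` (index name, false-positive rate, and granularity are illustrative):

```sql
-- bloom_filter(0.01): ~1% false-positive rate; each index entry covers
-- 4 granules of the primary index.
ALTER TABLE events
    ADD INDEX idx_user_id user_id TYPE bloom_filter(0.01) GRANULARITY 4;

-- New parts are indexed on write; build the index for existing parts:
ALTER TABLE events MATERIALIZE INDEX idx_user_id;
```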

### Using pre-computed aggregates

Pre-computed aggregate tables can dramatically reduce query latency for common dashboards and reports. Instead of calculating sums or averages on the fly, you query a table that already contains the results.

[Materialized views](https://www.tinybird.co/blog-posts/clickhouse-create-materialized-view-example) can maintain these aggregates incrementally as new data arrives. For example, a materialized view might calculate hourly active users and write results to a separate table that's much faster to query than the raw event stream.

The tradeoff is freshness. If you need up-to-the-second accuracy, pre-computed aggregates may not be sufficient. However, for many use cases, a delay of a few seconds or minutes is acceptable in exchange for much faster queries.

### Balancing `max_threads` and concurrency

The `max_threads` setting controls how many CPU cores ClickHouse{% sup %}®{% /sup %} uses for a single query. More threads can speed up large queries but increase CPU usage and can starve other queries of resources.

For streaming analytics workloads with many concurrent queries, you may want to limit `max_threads` to prevent any single query from monopolizing the CPU. A value of 4-8 threads per query is often a good starting point.

The `max_concurrent_queries` setting limits the total number of queries that can run simultaneously. This prevents overload during traffic spikes and ensures each query gets a reasonable share of CPU and memory.
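Per-query caps can be attached directly to hot dashboard queries, for example (values are illustrative; `max_concurrent_queries` itself is a server-level setting configured in `config.xml` rather than per query):

```sql
-- Cap this query at 4 cores and ~2 GB of memory so it cannot
-- monopolize the server under concurrency.
SELECT user_id, count() AS events
FROM events
GROUP BY user_id
ORDER BY events DESC
LIMIT 10
SETTINGS max_threads = 4, max_memory_usage = 2000000000;
```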

## Next steps to ship streaming analytics features with Tinybird

[Tinybird](https://tinybird.co) is a [managed ClickHouse{% sup %}®{% /sup %} platform](https://www.tinybird.co/product/managed-clickhouse) designed for developers who want to integrate real-time analytics into applications without managing infrastructure. The platform handles ingestion, storage, query optimization, and API creation.

You can get started with Tinybird by creating a free account at [https://cloud.tinybird.co/signup](https://cloud.tinybird.co/signup). The free tier includes generous limits for development and testing, with no credit card required.

Tinybird provides rapid ingestion from Kafka, webhooks, and other streaming sources, along with real-time SQL pipes that transform and aggregate data as it arrives. You can deploy parameterized API endpoints with built-in authentication, rate limiting, and row-level security.

The platform includes a CLI for local development and CI/CD integration, allowing you to define data pipelines as code, test them locally, and deploy to production with a single command.

## FAQs about streaming analytics in ClickHouse{% sup %}®{% /sup %}

### How do I secure per-tenant analytics APIs?

Row-level security (RLS) filters data based on the authenticated user or API token, ensuring each tenant only sees their own data. You can implement RLS using query parameters, user attributes stored in ClickHouse{% sup %}®{% /sup %}, or token scopes that encode tenant IDs.

API token scopes limit which endpoints a token can access and which data it can read. Query parameter validation prevents SQL injection and ensures only safe values are passed to queries.
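As one possible shape for RLS, assuming each tenant connects as its own ClickHouse{% sup %}®{% /sup %} user and `tenant_id` stores that username (both assumptions, not a prescribed design):

```sql
-- Every SELECT on events is transparently filtered to the caller's rows.
CREATE ROW POLICY tenant_isolation ON events
FOR SELECT
USING tenant_id = currentUser()
TO ALL;
```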

### Can I colocate ClickHouse{% sup %}®{% /sup %} servers with Kafka brokers?

Colocating ClickHouse{% sup %}®{% /sup %} and Kafka in the same availability zone or rack can minimize network latency, which is important for high-throughput streaming workloads. However, ensure each service has sufficient resources and they don't compete for CPU, memory, or disk I/O.

For cross-zone or cross-region replication, ensure you have sufficient network bandwidth and low latency. Plan for the additional cost of data transfer, which can be significant for high-volume streaming workloads.

### What is the safest way to back up streaming tables?

Incremental backups with `BACKUP TABLE` to external object storage like S3 or Google Cloud Storage provide a good balance of safety and efficiency. These backups capture only data that has changed since the last backup, reducing storage costs and backup time.

Snapshot replicas during low-traffic windows to minimize impact on production queries. Verify restore procedures regularly to ensure backups are valid and you can recover quickly in case of data loss.
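The full-then-incremental pattern looks roughly like this (bucket URL and credentials are placeholders):

```sql
-- Full backup to object storage.
BACKUP TABLE default.events
TO S3('https://bucket.s3.amazonaws.com/backups/events_full', 'key', 'secret');

-- Later: incremental backup containing only parts added since the base.
BACKUP TABLE default.events
TO S3('https://bucket.s3.amazonaws.com/backups/events_incr', 'key', 'secret')
SETTINGS base_backup = S3('https://bucket.s3.amazonaws.com/backups/events_full', 'key', 'secret');
```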
