Kafka is capable of producing millions of events per second, but those events only become useful when you can consume and query them. ClickHouse is a popular database for analyzing Kafka topic streams, and there are several ways to get Kafka data into ClickHouse: directly through its built-in Kafka table engine, or through managed connectors from Tinybird.
This guide walks through complete examples of connecting Kafka to ClickHouse, from basic table setup to production-ready streaming pipelines with materialized views and API endpoints.
What is the ClickHouse Kafka engine?
The Kafka table engine is a built-in ClickHouse feature that reads streaming data directly from Apache Kafka topics. It acts as a consumer that continuously pulls messages from Kafka and makes them queryable in ClickHouse without needing separate ETL tools or batch loading scripts.
Unlike batch ingestion, which loads data at scheduled intervals, the Kafka engine provides the continuous data flow that 90% of organizations consider important or very important for their analytics needs. Messages arrive in ClickHouse as soon as they're published to Kafka, so your analytics reflect what's happening right now rather than what happened hours ago, a capability that 59% of SMBs are already using for real-time analytics.
Here's how it works: you create a special table type that connects to your Kafka cluster and subscribes to one or more topics. When you query this table, ClickHouse reads the latest messages from Kafka. The data isn't stored in this table permanently though, so you'll typically use a materialized view to move it into a MergeTree table for long-term storage.
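To make that pattern concrete before walking through it step by step, here's a minimal sketch of the glue: a materialized view that reads from a Kafka engine table and writes into a MergeTree table. The table names are illustrative only; the quickstart below builds each piece with a full schema.

-- Illustrative sketch: events_kafka is a Kafka engine table and events is
-- a MergeTree table; the materialized view copies each batch of consumed
-- messages into permanent storage as it arrives.
CREATE MATERIALIZED VIEW events_mv TO events AS
SELECT * FROM events_kafka;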
Prerequisites for a Kafka-to-ClickHouse pipeline
Before connecting Kafka to ClickHouse, you'll want a few components in place. These requirements make sure your pipeline can establish connections, authenticate properly, and handle data flow between systems.
Kafka broker
You'll need a running Kafka instance that is network-accessible from your ClickHouse server. The Kafka broker handles message storage and delivery, and you'll need permissions to create topics and to produce and consume messages on them.
ClickHouse server or Tinybird workspace
You can use either a self-hosted ClickHouse installation or a managed ClickHouse service like Tinybird. Self-hosting gives you complete control but requires expertise in distributed systems, storage optimization, and performance tuning; meanwhile, 57.7% of organizations have already moved to cloud-based streaming analytics solutions. Tinybird provides a managed ClickHouse service that handles infrastructure automatically, letting you focus on building data pipelines rather than managing clusters.
Network and auth requirements
Your ClickHouse server requires network access to your Kafka brokers, which might mean configuring firewall rules or security groups. You'll also need Kafka connection strings, authentication credentials (if your cluster uses SASL or SSL), and any consumer group configurations your organization requires.
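Before creating any tables, it's worth confirming that the ClickHouse host can actually reach the broker. The hostname, port, and client.properties file below are placeholders for your own cluster details:

# Check that the broker port is reachable from the ClickHouse host
nc -zv kafka.example.com 9092

# List topics to confirm the broker accepts your credentials
# (client.properties is a placeholder file holding SASL/SSL settings, if required)
kafka-topics.sh --list \
  --bootstrap-server kafka.example.com:9092 \
  --command-config client.properties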
Quickstart example: streaming JSON from Kafka to ClickHouse
This example walks through the complete workflow for streaming JSON data from a Kafka topic into a ClickHouse table using the Kafka table engine. You'll create a Kafka topic, define the necessary ClickHouse tables, and verify that data flows correctly through the pipeline.
1. Create a Kafka topic
First, create a Kafka topic to hold your streaming data. This command creates a topic called user_events with a single partition:
kafka-topics.sh --create --topic user_events \
--bootstrap-server localhost:9092 \
--partitions 1 \
--replication-factor 1
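You can confirm the topic exists and check its partition count with the describe command:

kafka-topics.sh --describe --topic user_events \
  --bootstrap-server localhost:9092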
2. Create the Kafka engine table
The Kafka engine table acts as a consumer that reads from your topic. This table definition specifies the Kafka broker address, topic name, consumer group, and message format:
CREATE TABLE user_events_kafka (
user_id String,
event_type String,
timestamp DateTime64(3),
properties String
)
ENGINE = Kafka
SETTINGS
kafka_broker_list = 'localhost:9092',
kafka_topic_list = 'user_events',
kafka_group_name = 'clickhouse_consumer',
kafka_format = 'JSONEachRow';
The JSONEachRow format expects one JSON object per line, which is how most Kafka producers send data.
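For reference, messages for the schema above would look like this, with one JSON object per line and keys matching the column names:

{"user_id":"user_123","event_type":"page_view","timestamp":"2024-01-15 10:30:00.000","properties":"{}"}
{"user_id":"user_456","event_type":"click","timestamp":"2024-01-15 10:30:05.000","properties":"{\"button\":\"signup\"}"}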
3. Create the target MergeTree table
Data from the Kafka engine table requires a permanent home. Create a MergeTree table with the same schema to store your events:
CREATE TABLE user_events (
user_id String,
event_type String,
timestamp DateTime64(3),
properties String
)
ENGINE = MergeTree()
ORDER BY (event_type, timestamp);
The ORDER BY clause determines how ClickHouse sorts and stores data on disk, which affects query performance. Ordering by event_type and timestamp works well for queries that filter by event type and time range.
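For example, a query like this one only has to scan the rows for a single event type within a narrow time window (the one-hour window is just an illustration):

SELECT count() AS page_views
FROM user_events
WHERE event_type = 'page_view'
  AND timestamp >= now() - INTERVAL 1 HOUR;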
4. Insert sample messages
Push test messages to your Kafka topic using the Kafka console producer:
echo '{"user_id":"user_123","event_type":"page_view","timestamp":"2024-01-15 10:30:00.000","properties":"{}"}' | \
kafka-console-producer.sh --topic user_events --bootstrap-server localhost:9092
You can send multiple messages by repeating this command or by reading from a file containing one JSON object per line.
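For example, assuming a file named events.jsonl containing one JSON object per line, you can pipe the whole file into the producer:

# events.jsonl is a placeholder name for a file of newline-delimited JSON
kafka-console-producer.sh --topic user_events \
  --bootstrap-server localhost:9092 < events.jsonl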
5. Verify the data
Query the Kafka engine table to see the latest messages. This query reads directly from Kafka without writing anything to permanent storage. Note that recent ClickHouse versions disable direct SELECTs from streaming engines by default, so you may need to enable the stream_like_engine_allow_direct_select setting first:
SELECT * FROM user_events_kafka LIMIT 5;
Keep in mind that reading from the Kafka engine table consumes the messages for its consumer group, so each message only appears once and won't be returned by later queries. Once you set up a materialized view in the next section, messages will move to permanent storage automatically instead.
Alternative approach: Stream with HTTP
If you don't need the full complexity of Kafka but still want streaming ingestion, Tinybird's Events API provides a lightweight HTTP-based alternative. Instead of managing Kafka brokers, topics, and consumer groups, you can stream data directly to ClickHouse using standard HTTP POST requests.
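As a rough sketch, sending an event is a single HTTP request. The data source name, token variable, and API host below are placeholders, and the exact host depends on your workspace's region:

curl -X POST "https://api.tinybird.co/v0/events?name=user_events" \
  -H "Authorization: Bearer $TINYBIRD_TOKEN" \
  -d '{"user_id":"user_123","event_type":"page_view","timestamp":"2024-01-15 10:30:00.000"}'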