ClickHouse® + Redis — 3 Ways to Connect in {{ year }}
These are the main options for a ClickHouse® integration Redis pipeline:
- Tinybird
- ClickHouse® Cloud + ClickPipes (Kafka source)
- Self-managed or custom (Redis Streams, Kafka, or batch export)
Redis is an in-memory database used for caching, session management, pub/sub messaging, and real-time counters. Many teams want that data in ClickHouse® for analytical queries, reporting, and real-time dashboards at scale.
A ClickHouse® integration Redis setup uses Redis Streams, Kafka bridging, or batch export to move Redis data into ClickHouse® for real-time analytics without putting analytical load on Redis itself.
Below we compare the three options in depth—architecture, code, trade-offs, and when to use each.
Looking for minimal ops and instant APIs?
Tinybird combines managed ingestion from Redis Streams or Kafka, managed ClickHouse®, and one-click API publishing from SQL—no ClickHouse® Kafka engine to operate yourself.
Three ways to implement ClickHouse® integration Redis
Here are the three ways to connect Redis to ClickHouse®, starting with the lowest-ops option.
Option 1: Tinybird — managed ClickHouse® with API layer
Tinybird is a real-time data platform built on ClickHouse®. It combines ingestion, storage, and API publishing in one product.
How it works: you run a Redis Streams consumer (Python, Node.js, or any language) that reads entries and POSTs them to the Tinybird Events API over HTTP. No Kafka required. Alternatively, bridge Redis Streams → Kafka (via Kafka Connect Redis Source connector), then connect Tinybird's Kafka connector to the topic.
Data lands in Tinybird's ClickHouse®-backed data sources. You define Pipes (SQL) and publish them as REST endpoints with a single click.
Redis Streams consumer → Tinybird Events API (Python example):
```python
import redis
import requests

r = redis.Redis(host="localhost", port=6379)
TINYBIRD_EVENTS_URL = "https://api.tinybird.co/v0/events?name=redis_events"
HEADERS = {"Authorization": "Bearer <YOUR_TOKEN>"}

last_id = "0-0"  # start from the beginning of the stream
while True:
    # Block for up to 5s waiting for new entries; read at most 100 at a time
    entries = r.xread({"events": last_id}, block=5000, count=100)
    if entries:
        for stream, messages in entries:
            for msg_id, data in messages:
                # redis-py returns bytes; decode keys and values for JSON
                payload = {k.decode(): v.decode() for k, v in data.items()}
                payload["stream_id"] = msg_id.decode()
                requests.post(TINYBIRD_EVENTS_URL, headers=HEADERS, json=payload)
                last_id = msg_id
```
When Tinybird fits:
- You want to get Redis data into ClickHouse® with minimal ops
- You need APIs and dashboards from the same data
- You prefer a Redis to ClickHouse® pipeline with an API layer built in
Prerequisites: Redis 5+ (Streams enabled) and a Tinybird account; the Kafka connector path additionally requires a Kafka cluster.
Option 2: ClickHouse® Cloud + ClickPipes (Kafka source)
ClickHouse® Cloud's ClickPipes supports Kafka as a data source. There is no native Redis connector.
How it works: you configure a Kafka Connect Redis Source connector that reads from Redis Streams and writes to a Kafka topic. Then create a Kafka ClickPipe in the ClickHouse® Cloud console pointing to your broker and topic.
Kafka Connect Redis Source connector config:
```properties
# Redis Stream source connector (property names may vary by connector version)
name=redis-streams-source
connector.class=com.redis.kafka.connect.RedisStreamSourceConnector
redis.hosts=redis://localhost:6379
redis.stream.name=events
kafka.topic=redis-events
tasks.max=1
```
ClickPipe destination table (ClickHouse® SQL):
```sql
CREATE TABLE redis_events
(
    stream_id String,
    event_type LowCardinality(String),
    user_id String,
    value String,
    ts DateTime DEFAULT now()
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(ts)
ORDER BY (event_type, ts);
```
When it fits:
- You want managed ClickHouse® with a Kafka-based ingestion path
- You're already on ClickHouse® Cloud and can configure Kafka Connect
- Your main need is data replication; you'll build your own API or BI layer
Prerequisites: Redis with Streams enabled, a Kafka Connect cluster, and a Redis Source connector (e.g. the Redis Kafka Connector). Data flows: Redis Streams → Kafka Connect → Kafka → ClickPipes → ClickHouse® Cloud.
Option 3: Self-managed or custom (Redis Streams, Kafka, or batch)
With self-managed ClickHouse®, the streaming pattern uses a Redis Streams consumer → Kafka → Kafka table engine → materialized view → MergeTree.
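The first hop of that pattern—the Redis Streams consumer that produces to Kafka—might be sketched as below. This is a minimal sketch, not a hardened bridge: `r` is assumed to be a redis-py client and `producer` a confluent-kafka `Producer`, and the stream and topic names are illustrative.

```python
import json

def entry_to_message(stream, msg_id, fields):
    """Convert one Redis Stream entry (bytes keys/values, as returned by
    redis-py xread) into a JSON-encoded Kafka message value."""
    payload = {k.decode(): v.decode() for k, v in fields.items()}
    payload["stream_id"] = msg_id.decode()
    payload["stream"] = stream
    return json.dumps(payload).encode()

def bridge(r, producer, stream="events", topic="redis-events"):
    """Forward new stream entries to Kafka. `r` is a redis-py client and
    `producer` a confluent-kafka Producer (both assumed, not created here)."""
    last_id = "$"  # "$" = only entries added after the bridge starts
    while True:
        # Block up to 5s waiting for new entries
        for _name, messages in r.xread({stream: last_id}, block=5000) or []:
            for msg_id, fields in messages:
                producer.produce(topic, value=entry_to_message(stream, msg_id, fields))
                last_id = msg_id
        producer.flush()
```

Starting from `"$"` skips historical entries; use `"0-0"` (and a persisted offset) if you need to replay the whole stream on restart.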
ClickHouse® Kafka table engine + materialized view:
```sql
-- Kafka engine (reads from Kafka topic)
CREATE TABLE redis_events_kafka
(
    stream_id String,
    event_type String,
    user_id String,
    value String,
    ts DateTime
)
ENGINE = Kafka
SETTINGS
    kafka_broker_list = 'kafka:9092',
    kafka_topic_list = 'redis-events',
    kafka_group_name = 'clickhouse_consumer',
    kafka_format = 'JSONEachRow';

-- Target MergeTree table
CREATE TABLE redis_events
(
    stream_id String,
    event_type LowCardinality(String),
    user_id String,
    value String,
    ts DateTime
)
ENGINE = ReplacingMergeTree()
PARTITION BY toYYYYMM(ts)
ORDER BY (event_type, stream_id);

-- Materialized view wires the two tables
CREATE MATERIALIZED VIEW redis_events_mv TO redis_events AS
SELECT * FROM redis_events_kafka;
```
Batch export is an alternative for non-real-time needs: periodically scan Redis keyspaces (SCAN), read values with HGETALL or LRANGE, serialize to JSON, and INSERT into ClickHouse®.
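A sketch of that batch loop follows. The function and table names are ours, `r` is assumed to be a redis-py client, and `clickhouse_url` is assumed to point at the ClickHouse HTTP interface (port 8123 by default):

```python
import json
from urllib import parse, request

def rows_to_jsoneachrow(rows):
    """Serialize a list of dicts to ClickHouse JSONEachRow format:
    one JSON object per line."""
    return "\n".join(json.dumps(row) for row in rows)

def export_user_hashes(r, clickhouse_url):
    """Scan all user:* hashes with a redis-py client `r` and bulk-insert
    them into ClickHouse over HTTP. Table `redis_users` is illustrative."""
    rows = []
    for key in r.scan_iter(match="user:*", count=1000):
        fields = r.hgetall(key)  # {b"field": b"value", ...}
        row = {k.decode(): v.decode() for k, v in fields.items()}
        row["redis_key"] = key.decode()
        rows.append(row)
    if rows:
        query = parse.urlencode({"query": "INSERT INTO redis_users FORMAT JSONEachRow"})
        req = request.Request(f"{clickhouse_url}/?{query}",
                              data=rows_to_jsoneachrow(rows).encode(),
                              method="POST")
        request.urlopen(req)
```

One bulk INSERT per scan is deliberate: ClickHouse prefers large batches over many small inserts.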
When it fits:
- You already run ClickHouse® and Kafka and want full control
- You have data-engineering capacity to manage the full stack
- Batch export works when sub-minute freshness is not required
Decision framework: which option fits your situation
The right choice depends on four variables: freshness requirement, team capacity, whether you need an API layer, and cost tolerance.
| Situation | Recommended option |
|---|---|
| Real-time analytics, need REST APIs, minimal ops | Tinybird |
| Already on ClickHouse® Cloud, have Kafka | ClickPipes + Kafka |
| Self-managed ClickHouse®, full control | Self-managed |
| Batch analytics, no real-time requirement | Batch SCAN export |
| Cost is priority, have eng capacity | Self-managed |
Choose Tinybird when you need real-time data ingestion from Redis and REST APIs from the same data—without operating ClickHouse® infrastructure or a Kafka cluster.
Choose ClickPipes when you're already on ClickHouse® Cloud and have Kafka running. You get managed ClickHouse® ingestion but build your own serving layer.
Choose self-managed when you have a data-engineering team comfortable operating Redis, Kafka, and ClickHouse®. You get full schema control and no managed-service dependency.
Summary table
| Option | Ingestion path | API layer | Ops burden | Kafka required |
|---|---|---|---|---|
| Tinybird | HTTP (Events API) or Kafka | Built in (Pipes) | Low | No |
| ClickHouse® Cloud ClickPipes | Kafka | Build your own | Medium | Yes |
| Self-managed | Kafka or batch SCAN | Build your own | High | Depends |
What is Redis and why integrate it with ClickHouse®?
Redis as the data source
Redis is an in-memory database that supports strings, hashes, lists, sets, sorted sets, streams, and HyperLogLog. Teams use it for caching, session storage, rate limiting, leaderboards, pub/sub, and real-time counters.
Redis excels at single-digit millisecond latency for individual key lookups. But Redis is memory-bound: it is not designed for analytical queries over large time ranges, high-cardinality groupings, or long-retention historical data.
A ClickHouse® integration Redis setup keeps Redis for operational writes while analytics run against a replica in ClickHouse®.
How to get data out of Redis
Redis Streams (Redis 5+) provide a persistent, append-only log with consumer groups. Each entry has a unique ID (<timestamp>-<seq>) and a key-value field map. This is the ideal egress mechanism for streaming into ClickHouse® or Tinybird.
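Because the first half of the entry ID is a millisecond epoch timestamp, the event time can be recovered from the ID alone. A small helper (the function name is ours) might look like:

```python
from datetime import datetime, timezone

def stream_id_to_datetime(stream_id: str) -> datetime:
    """Parse a Redis Stream entry ID ("<ms-timestamp>-<seq>") and return
    the embedded millisecond timestamp as a UTC datetime."""
    ms, _, _seq = stream_id.partition("-")
    return datetime.fromtimestamp(int(ms) / 1000, tz=timezone.utc)

print(stream_id_to_datetime("1712345678901-0"))  # an entry written in April 2024
```

This is how the `ts` column in the schemas below can be derived without storing a separate timestamp field in Redis.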
Keyspace notifications let you react to key mutations in real time. Kafka Connect has community Redis Source connectors that read from Streams and publish to Kafka topics.
For batch: use SCAN to iterate keyspaces and read values with type-specific commands (HGETALL, LRANGE, SMEMBERS).
Why route Redis data to ClickHouse®
A Redis to ClickHouse® pipeline enables queries Redis can't handle: historical aggregations, time-window rollups, and user-facing analytics dashboards over billions of events.
Streaming via Redis Streams keeps the replica near real-time (seconds latency). Batch export via SCAN is simpler when freshness requirements are relaxed. Either way, you keep Redis as the operational store and use ClickHouse® for analytical scale.
Schema and pipeline design
Mapping Redis data structures to ClickHouse® tables
Redis data is schemaless and heterogeneous. Moving to ClickHouse® requires an explicit schema.
Redis Streams entries map cleanly:
```sql
CREATE TABLE redis_stream_events
(
    stream_id String,               -- Redis entry ID: "1712345678901-0"
    event_type LowCardinality(String),
    user_id String,
    payload String,                 -- JSON blob for variable fields
    ts DateTime64(3)                -- Derived from stream ID timestamp
)
ENGINE = ReplacingMergeTree(ts)
PARTITION BY toYYYYMM(ts)
ORDER BY (event_type, user_id, stream_id);
```
Hashes (user profiles, sessions) → one row per key, wide table. Sorted sets (leaderboards) → key, score, member columns.
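For the sorted-set case, a flattening step (the helper name is ours) that turns one `ZRANGE ... WITHSCORES` snapshot into rows could look like:

```python
def zset_to_rows(key, members_with_scores, snapshot_ts):
    """Flatten one sorted set, as returned by redis-py
    zrange(key, 0, -1, withscores=True), into (key, member, score, ts)
    rows for a ClickHouse leaderboard table."""
    return [
        {
            "zset_key": key,
            "member": m.decode() if isinstance(m, bytes) else m,
            "score": float(s),
            "ts": snapshot_ts,  # snapshot time; scans lose state between runs
        }
        for m, s in members_with_scores
    ]

rows = zset_to_rows("leaderboard:daily",
                    [(b"alice", 120.0), (b"bob", 95.0)],
                    "2024-04-05 00:00:00")
```

Tagging every row with the snapshot timestamp is what makes the historical questions (rank distribution over time) answerable later.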
Deduplication with ReplacingMergeTree
Redis Streams consumer groups deliver at-least-once. Use stream_id as the deduplication key and ReplacingMergeTree to collapse replays:
```sql
-- At query time, deduplicate with FINAL
SELECT event_type, count() AS total
FROM redis_stream_events FINAL
WHERE ts >= now() - INTERVAL 1 HOUR
GROUP BY event_type
ORDER BY total DESC;
```
Failure modes to plan for
- Memory eviction: Redis may evict keys under memory pressure. Consumer groups should ACK only after confirmed ingestion into ClickHouse® or Tinybird.
- Stream backlog: Monitor `XLEN` and consumer lag via `XPENDING`. Alert when lag exceeds your SLA.
- Schema drift: New fields in Redis Streams entries require updating the destination schema. Plan for nullable columns or a flexible JSON column.
- Telegraf buffer overflow: If using Telegraf as a bridge, set `metric_buffer_limit` and configure a fallback output.
Why ClickHouse® for Redis analytics
ClickHouse® is a columnar OLAP database built for analytical queries over large datasets. MergeTree tables and vectorized execution deliver sub-second queries on billions of rows—exactly the scale Redis events generate at production volume.
ClickHouse® natively supports time-series functions (toStartOfMinute, toStartOfHour), window functions, and materialized views for pre-aggregated rollups. A ClickHouse® integration Redis setup fits real-time analytics, historical event analytics, and real-time data processing at the scale Redis alone cannot serve.
Columnar compression also reduces storage costs dramatically—typical event data compresses 5x–20x compared to row-oriented stores.
Why Tinybird is the best Redis to ClickHouse® option
Most teams don't need to operate ClickHouse® infrastructure—they need fast analytics on Redis data exposed as APIs.
Tinybird is purpose-built for this. You configure a Redis Streams consumer or Kafka bridge that sends data to Tinybird's Events API or Kafka connector. Tinybird handles ingestion, storage, and API publishing in one product.
You avoid operating the ClickHouse® Kafka engine and the overhead of maintaining a separate API layer. Define Pipes in SQL, publish as REST endpoints, and serve dashboards or product features with sub-100ms latency and automatic scaling.
No VPC configuration, no cluster management, no API gateway to build. You focus on schema and pipe logic; Tinybird runs the infrastructure.
Frequently Asked Questions (FAQs)
Does ClickHouse® Cloud support Redis natively?
ClickHouse® Cloud does not have a native Redis connector in ClickPipes. You use the Kafka data source: configure a Kafka Connect Redis Source connector that reads from Redis Streams and writes to Kafka, then create a Kafka ClickPipe to load into ClickHouse® Cloud.
You operate the Redis → Kafka path; ClickHouse® Cloud ingests from Kafka.
Can I use Tinybird for Redis to ClickHouse® without Kafka?
Yes. Push Redis Streams data into Tinybird via the Events API (HTTP): a Redis Streams consumer reads entries and POSTs them directly to Tinybird. No Kafka cluster required.
Tinybird stores data in ClickHouse®-backed data sources and lets you publish Pipes as REST APIs. You own the Redis consumer; Tinybird is the destination and API layer.
How do I handle duplicates in a Redis to ClickHouse® pipeline?
Use the Redis Stream entry ID (e.g. 1712345678901-0) as a deduplication key. Configure ReplacingMergeTree with a version derived from the stream ID timestamp.
With at-least-once delivery from Redis Streams consumer groups, the same entry may arrive more than once on failure. ReplacingMergeTree collapses duplicates during background merges. Use FINAL at query time for strong deduplication consistency.
What's the latency of a ClickHouse® integration Redis pipeline?
With Tinybird Events API (HTTP push), end-to-end latency from Redis entry to queryable ClickHouse® data is typically under 5 seconds. With a Kafka bridge, latency depends on your Kafka producer batch settings (typically 1–10 seconds).
Batch SCAN export latency equals your scan schedule interval—minutes to hours.
Is ClickHouse® integration Redis good for leaderboard analytics?
Yes. Redis sorted sets power real-time leaderboards at low latency. But they can't answer historical questions: "What was the rank distribution last week?" or "Which users consistently rank in the top 10%?"
Replicate sorted set data into ClickHouse® via Redis Streams and run time-window analytics, cohort analysis, and long-retention queries in SQL.
What Redis data structures work best for ClickHouse® integration?
Redis Streams are the best fit—they provide a durable, ordered, append-only log with consumer group semantics. Every write is captured and can be replayed.
Hashes and sorted sets work with batch SCAN export but lose intermediate state between scans. Pub/sub messages are ephemeral and must be captured in a consumer before they disappear.
