ClickHouse® + InfluxDB — 3 Ways to Connect in {{ year }}
These are the main options for building an InfluxDB-to-ClickHouse® integration pipeline:
- Tinybird
- ClickHouse® Cloud + ClickPipes (Kafka source)
- Self-managed or custom (Telegraf, Kafka, or batch via InfluxDB query API)
InfluxDB is a time-series database built for metrics, IoT sensor data, infrastructure monitoring, and time-stamped events. Many teams want a copy of that data in ClickHouse® for long-retention analytical queries, cross-measurement aggregations, and real-time dashboards at scale.
An InfluxDB-to-ClickHouse® setup uses Telegraf outputs, Kafka bridging, or the InfluxDB query API to move InfluxDB metrics into ClickHouse® for real-time analytics without putting analytical load on InfluxDB.
Below we compare all three options in depth—architecture, real configuration examples, trade-offs, and when to use each.
Looking for minimal ops and instant APIs?
Tinybird combines managed ingestion from your Telegraf→Kafka or Telegraf→HTTP path, managed ClickHouse®, and one-click API publishing from SQL—no ClickHouse® Kafka engine to operate yourself.
Three ways to implement a ClickHouse® integration with InfluxDB
This section is the core: the three options to connect InfluxDB to ClickHouse®, in order.
Option 1: Tinybird — managed ClickHouse® with API layer
Tinybird is a real-time data platform built on ClickHouse®. It combines ingestion, storage, and API publishing in one product.
How it works: configure Telegraf with an HTTP or Kafka output plugin that forwards metrics to Tinybird. Use [[outputs.http]] to POST metrics directly to the Tinybird Events API, or [[outputs.kafka]] to write to Kafka consumed by the Tinybird Kafka connector.
Data lands in Tinybird's ClickHouse®-backed data sources. You define Pipes (SQL) and publish them as REST endpoints.
Telegraf config → Tinybird Events API:
# Agent settings: flush every 10 seconds
# (flush_interval is an agent-level option, not an outputs.http option)
[agent]
flush_interval = "10s"
# Read InfluxDB's internal metrics from its /debug/vars endpoint
[[inputs.influxdb]]
urls = ["http://localhost:8086/debug/vars"]
# Send to Tinybird Events API
[[outputs.http]]
url = "https://api.tinybird.co/v0/events?name=metrics"
method = "POST"
data_format = "json"
[outputs.http.headers]
Authorization = "Bearer ${TINYBIRD_TOKEN}"
Content-Type = "application/json"
When Tinybird fits:
- You want to get InfluxDB data into ClickHouse® with minimal ops
- You need APIs and dashboards from the same metrics data
- You prefer an InfluxDB to ClickHouse® pipeline with an API layer built in
Prerequisites: Telegraf configured with [[outputs.http]] or [[outputs.kafka]]. Data flows: InfluxDB → Telegraf → Kafka or HTTP → Tinybird. Latency depends on your Telegraf flush_interval (typically 10 seconds).
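If you want to validate the Events API path before wiring up Telegraf, you can exercise it directly. A minimal Python sketch, assuming a workspace token in the TINYBIRD_TOKEN environment variable and a data source named metrics (both placeholders); the Events API accepts newline-delimited JSON, one event per line:

```python
import json
import os
import urllib.request

def to_ndjson(rows):
    """Serialize a list of metric dicts to newline-delimited JSON,
    the format the Tinybird Events API expects."""
    return "\n".join(json.dumps(row) for row in rows)

def post_events(rows, token, name="metrics"):
    """POST one batch of rows to the Events API in a single HTTP call."""
    req = urllib.request.Request(
        f"https://api.tinybird.co/v0/events?name={name}",
        data=to_ndjson(rows).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    return urllib.request.urlopen(req)

rows = [
    {"ts": "2024-01-01 00:00:00.000", "measurement": "cpu",
     "host": "web-1", "cpu_usage": 42.5},
    {"ts": "2024-01-01 00:00:10.000", "measurement": "cpu",
     "host": "web-1", "cpu_usage": 40.1},
]

payload = to_ndjson(rows)
# Only send when a token is configured, so the sketch runs offline.
if os.environ.get("TINYBIRD_TOKEN"):
    post_events(rows, os.environ["TINYBIRD_TOKEN"])
```

Batching several events per request, as Telegraf does with its flush interval, is the efficient pattern here too.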
Option 2: ClickHouse® Cloud + ClickPipes (Kafka source)
ClickHouse® Cloud's ClickPipes supports Kafka as a data source. There is no native InfluxDB connector.
How it works: configure Telegraf with a Kafka output plugin ([[outputs.kafka]]) that writes metrics to a topic. Then create a Kafka ClickPipe in the ClickHouse® Cloud console pointing to your broker and topic.
Telegraf config → Kafka:
# Read InfluxDB's internal metrics from its /debug/vars endpoint
[[inputs.influxdb]]
urls = ["http://localhost:8086/debug/vars"]
[[outputs.kafka]]
brokers = ["kafka:9092"]
topic = "influx-metrics"
data_format = "json"
routing_key = "measurement"
ClickHouse® Cloud destination table:
CREATE TABLE influx_metrics
(
ts DateTime64(3),
measurement LowCardinality(String),
host LowCardinality(String),
region LowCardinality(String),
cpu_usage Float64,
mem_usage Float64
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(ts)
ORDER BY (measurement, host, ts);
When it fits:
- You want managed ClickHouse® with a Kafka-based ingestion path
- You're already on ClickHouse® Cloud and can configure Telegraf→Kafka
- Your main need is metric replication; you'll build your own API or BI layer
Prerequisites: Telegraf with a Kafka output plugin. Data flows: InfluxDB → Telegraf → Kafka → ClickPipes → ClickHouse® Cloud.
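One mismatch to watch for: Telegraf's JSON serializer nests each metric as {"name", "tags", "fields", "timestamp"}, while the destination table above has one flat column per tag and field. A hedged Python sketch of the flattening step, which you could run in a small transform between Kafka and ClickPipes or avoid by reshaping the table (column names here match the example table and are otherwise assumptions):

```python
import json
from datetime import datetime, timezone

def flatten(telegraf_json: str) -> dict:
    """Flatten a Telegraf JSON-serialized metric into a flat row
    matching the influx_metrics table (ts, measurement, tags..., fields...)."""
    m = json.loads(telegraf_json)
    row = {
        # Telegraf timestamps are Unix seconds by default
        "ts": datetime.fromtimestamp(m["timestamp"], tz=timezone.utc)
              .strftime("%Y-%m-%d %H:%M:%S.%f")[:-3],
        "measurement": m["name"],
    }
    row.update(m.get("tags", {}))      # host, region, ...
    row.update(m.get("fields", {}))    # cpu_usage, mem_usage, ...
    return row

wire = ('{"name": "cpu", "tags": {"host": "web-1", "region": "eu"},'
        ' "fields": {"cpu_usage": 42.5, "mem_usage": 63.0},'
        ' "timestamp": 1704067200}')
row = flatten(wire)
```

The flat shape keeps the ClickPipe mapping trivial: each JSON key lands in the column of the same name.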
Option 3: Self-managed or custom (Telegraf, Kafka, or batch Flux export)
With self-managed ClickHouse®, the streaming pattern uses Telegraf → Kafka → Kafka table engine → materialized view → MergeTree.
ClickHouse® Kafka table engine + materialized view:
-- Kafka engine reads from topic
CREATE TABLE influx_metrics_kafka
(
ts DateTime64(3),
measurement String,
host String,
region String,
cpu_usage Float64,
mem_usage Float64
)
ENGINE = Kafka
SETTINGS
kafka_broker_list = 'kafka:9092',
kafka_topic_list = 'influx-metrics',
kafka_group_name = 'clickhouse_influx_consumer',
kafka_format = 'JSONEachRow';
-- Target MergeTree table
CREATE TABLE influx_metrics
(
ts DateTime64(3),
measurement LowCardinality(String),
host LowCardinality(String),
region LowCardinality(String),
cpu_usage Float64,
mem_usage Float64
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(ts)
ORDER BY (measurement, host, ts);
-- Materialized view
CREATE MATERIALIZED VIEW influx_metrics_mv TO influx_metrics AS
SELECT * FROM influx_metrics_kafka;
Batch sync via Flux (InfluxDB 2.x) for periodic export:
from(bucket: "metrics")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "cpu")
|> pivot(rowKey:["_time"], columnKey: ["_field"], valueColumn: "_value")
|> yield(name: "cpu_export")
Export results to CSV, then INSERT into ClickHouse® via the Python clickhouse-driver or the HTTP interface.
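The CSV-to-ClickHouse® step can be a single HTTP request against ClickHouse®'s HTTP interface: the INSERT ... FORMAT CSV statement goes in the query parameter and the CSV rows in the request body. A minimal Python sketch (host, port, and table name are placeholders; authentication is omitted):

```python
import csv
import io
import os
import urllib.parse
import urllib.request

def insert_csv(host: str, table: str, rows: list) -> urllib.request.Request:
    """Build an HTTP request that inserts rows into ClickHouse via
    its HTTP interface using FORMAT CSV."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    query = urllib.parse.urlencode(
        {"query": f"INSERT INTO {table} FORMAT CSV"})
    return urllib.request.Request(
        f"http://{host}:8123/?{query}",
        data=buf.getvalue().encode("utf-8"),
        method="POST",
    )

# Rows exported from the Flux query above (ts, host, usage_user)
rows = [
    ["2024-01-01 00:00:00", "web-1", 42.5],
    ["2024-01-01 00:00:10", "web-1", 40.1],
]
req = insert_csv("localhost", "cpu_export", rows)

# Only send when explicitly enabled, so the sketch runs offline.
if os.environ.get("CLICKHOUSE_SEND"):
    urllib.request.urlopen(req)
```

Schedule this after each Flux export (cron or an InfluxDB Task webhook) and you have a working batch pipeline with no extra infrastructure.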
When it fits:
- You already run ClickHouse® and Kafka and want full control
- You have data-engineering capacity to manage the full stack
- Batch Flux export works when sub-minute freshness is not required
Decision framework: which option fits your situation
The right choice depends on four variables: freshness requirement, team capacity, whether you need an API layer, and cost tolerance.
| Situation | Recommended option |
|---|---|
| Real-time metrics APIs, minimal ops | Tinybird |
| Already on ClickHouse® Cloud, have Kafka + Telegraf | ClickPipes + Kafka |
| Self-managed ClickHouse®, full control | Self-managed |
| Long-retention analytics, batch is fine | Batch Flux export |
| IoT analytics with sub-second freshness | Tinybird (Kafka path) |
Choose Tinybird when you need real-time data ingestion from InfluxDB and REST APIs from the same data—without operating ClickHouse® infrastructure.
Choose ClickPipes when you're already on ClickHouse® Cloud and have Telegraf→Kafka running. You get managed ClickHouse® ingestion but build your own serving layer.
Choose self-managed when your team is comfortable with Telegraf, Kafka, and ClickHouse® and needs full control over schema and configuration.
Summary table
| Option | Ingestion path | API layer | Ops burden | Kafka required |
|---|---|---|---|---|
| Tinybird | Telegraf HTTP or Kafka | Built in (Pipes) | Low | No |
| ClickHouse® Cloud ClickPipes | Telegraf → Kafka | Build your own | Medium | Yes |
| Self-managed | Telegraf → Kafka or batch Flux | Build your own | High | Depends |
What is InfluxDB and why integrate it with ClickHouse®?
InfluxDB as the data source
InfluxDB is a purpose-built time-series database: data is organized into measurements (analogous to tables), tags (indexed metadata), and fields (measured values), with a mandatory timestamp.
InfluxDB excels at high-frequency metrics ingestion from Internet of Things (IoT) devices, infrastructure agents, and application instrumentation. It handles sub-second write throughput and provides fast point-in-time queries.
But InfluxDB is expensive for long-retention analytical workloads: aggregations across many measurements, multi-year historical analysis, and user-facing analytics dashboards strain InfluxDB at scale.
How to get data out of InfluxDB
Telegraf is the standard agent. Its output plugins write to Kafka ([[outputs.kafka]]), HTTP endpoints ([[outputs.http]]), or directly via the SQL output plugin ([[outputs.sql]]).
InfluxDB 2.x Tasks run scheduled Flux queries to export measurements; InfluxDB 3.x replaces Flux with SQL and InfluxQL. The InfluxDB v2 API supports querying and exporting data in CSV or annotated CSV format for batch ingestion.
For InfluxDB to ClickHouse® the typical pattern: configure Telegraf with a Kafka or HTTP output and forward metrics to Tinybird or ClickHouse® Cloud.
Why route InfluxDB metrics to ClickHouse®
An InfluxDB to ClickHouse® pipeline enables workloads InfluxDB handles poorly: multi-year historical analysis, cross-measurement joins, and real-time analytics APIs serving hundreds of concurrent users.
ClickHouse® achieves 2–3 million data points per second in multi-threaded ingestion and 10:1 to 30:1 compression ratios for time-series data—significantly more efficient than InfluxDB's TSM (Time-Structured Merge Tree) storage for analytical queries.
Schema and pipeline design
Mapping InfluxDB's tag + field model to ClickHouse® tables
InfluxDB's tag + field + timestamp model maps directly to ClickHouse® columns:
CREATE TABLE cpu_metrics
(
ts DateTime64(9), -- Nanosecond precision
host LowCardinality(String), -- Tag: low-cardinality indexed metadata
region LowCardinality(String), -- Tag
cpu LowCardinality(String), -- Tag: cpu0, cpu1, etc.
usage_user Float64, -- Field
usage_system Float64, -- Field
usage_idle Float64 -- Field
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(ts)
ORDER BY (host, cpu, ts); -- Tags first, then timestamp
Use LowCardinality(String) for tags (low-cardinality string metadata) for better compression and faster GROUP BY. Use Float64 or Int64 for fields (measured values). Use DateTime64(9) for nanosecond timestamp precision.
For measurements with many dynamic field names, consider a JSON column or key-value approach to avoid schema migrations:
CREATE TABLE metrics_flexible
(
ts DateTime64(3),
measurement LowCardinality(String),
tags Map(String, String), -- All tags as a map
fields Map(String, Float64) -- All numeric fields as a map
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(ts)
ORDER BY (measurement, ts);
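Feeding the Map-based table means splitting each incoming metric into string tags and numeric fields. A Python sketch of that split, assuming Telegraf-style input (the filter-by-type heuristic is an assumption; a Map(String, Float64) column cannot hold string field values, so non-numeric fields are dropped here):

```python
import json

def to_map_row(metric: dict) -> dict:
    """Split one metric into the Map(String, String) / Map(String, Float64)
    shape of the metrics_flexible table."""
    return {
        "ts": metric["timestamp"],
        "measurement": metric["name"],
        "tags": {k: str(v) for k, v in metric.get("tags", {}).items()},
        # Keep only numeric fields; Map(String, Float64) can't hold strings
        "fields": {k: float(v) for k, v in metric.get("fields", {}).items()
                   if isinstance(v, (int, float))},
    }

metric = {"name": "cpu", "timestamp": 1704067200,
          "tags": {"host": "web-1"},
          "fields": {"usage_user": 42.5, "status": "ok"}}
row = to_map_row(metric)
line = json.dumps(row)  # ready for JSONEachRow ingestion
```

The trade-off: queries read values with fields['usage_user'] instead of a plain column, which is slower than dedicated columns but survives schema drift without migrations.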
Pre-aggregation with AggregatingMergeTree
ClickHouse® supports AggregatingMergeTree for pre-aggregated rollups—ideal for metrics at scale:
CREATE TABLE cpu_hourly
(
hour DateTime,
host LowCardinality(String),
avg_cpu AggregateFunction(avg, Float64),
max_cpu AggregateFunction(max, Float64)
)
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(hour)
ORDER BY (host, hour);
-- Materialized view feeds rollup automatically
CREATE MATERIALIZED VIEW cpu_hourly_mv TO cpu_hourly AS
SELECT
toStartOfHour(ts) AS hour,
host,
avgState(usage_user) AS avg_cpu,
maxState(usage_user) AS max_cpu
FROM cpu_metrics
GROUP BY hour, host;
This makes long-range queries (e.g. "average CPU by host for the last 90 days") instant even over billions of raw data points.
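The reason avgState and avgMerge can work incrementally is that the average's partial aggregation state is just a (sum, count) pair, and such pairs merge associatively across parts and inserts. A Python sketch of the idea (this illustrates the concept only, not ClickHouse®'s actual on-disk state format):

```python
def avg_state(values):
    """Partial aggregation state for avg: a (sum, count) pair."""
    return (sum(values), len(values))

def avg_merge(*states):
    """Merge partial states from different parts/inserts, then finalize."""
    total = sum(s for s, _ in states)
    count = sum(c for _, c in states)
    return total / count

# Two inserts land in two parts; each keeps only its partial state.
part1 = avg_state([10.0, 20.0])       # (30.0, 2)
part2 = avg_state([30.0])             # (30.0, 1)

# Query time: merge the tiny states instead of rescanning raw rows.
overall = avg_merge(part1, part2)     # 20.0
```

At query time you call avgMerge(avg_cpu) over cpu_hourly, touching one small state per (host, hour) instead of billions of raw rows.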
Failure modes to plan for
- Telegraf buffer overflow: if Telegraf's output buffer fills (for example, when Kafka is unavailable), metrics are dropped. Set metric_buffer_limit and configure a [[outputs.file]] fallback.
- Clock skew: InfluxDB allows backfilling with past timestamps. Ensure your ClickHouse® ORDER BY (timestamp) handles out-of-order writes correctly.
- Schema drift: new InfluxDB fields or tags require updating the destination schema. Plan for nullable columns or the flexible Map approach.
- Telegraf flush interval: Telegraf batches before flushing (default flush_interval = "10s"). Tune it for your freshness SLA.
Why ClickHouse® for InfluxDB analytics
ClickHouse® is a columnar OLAP database built for analytical queries over large volumes. MergeTree tables and vectorized execution deliver sub-second queries on billions of time-series data points.
ClickHouse® natively supports time-series functions (toStartOfMinute, toStartOfHour, plus tumble and hop via the experimental window-view feature) and AggregatingMergeTree for pre-aggregated rollups. This makes it a natural fit for metrics analytics at scale.
Where InfluxDB stores data in TSM (Time-Structured Merge Tree) storage optimized for recent writes, ClickHouse® handles 10:1 to 30:1 compression on time-series data and processes 2–3 million points per second in multi-threaded ingestion scenarios. An InfluxDB-to-ClickHouse® setup fits real-time analytics, infrastructure observability at scale, and real-time data processing over metrics data without InfluxDB's long-retention cost.
Why Tinybird is the best InfluxDB to ClickHouse® option
Most teams don't need to operate ClickHouse® infrastructure—they need fast analytics on metrics data exposed as APIs.
Tinybird is purpose-built for this. You configure a Telegraf pipeline that sends metrics to Tinybird's Events API or Kafka connector. Tinybird handles ingestion, storage, and API publishing in one product.
You avoid operating the ClickHouse® Kafka engine and the overhead of maintaining a separate API layer. Define Pipes in SQL, publish as REST endpoints, and serve dashboards or product features with sub-100ms latency and automatic scaling.
For IoT and infrastructure analytics, Tinybird's Kafka connector handles high-throughput Telegraf output at scale—including metrics from thousands of hosts—without configuration overhead.
Frequently Asked Questions (FAQs)
Does ClickHouse® Cloud support InfluxDB natively?
ClickHouse® Cloud does not have a native InfluxDB connector in ClickPipes. You use the Kafka data source: configure Telegraf with a Kafka output plugin, then create a Kafka ClickPipe to load into ClickHouse® Cloud.
You operate the InfluxDB → Telegraf → Kafka path; ClickHouse® Cloud ingests from Kafka.
Can I use Tinybird for InfluxDB to ClickHouse® without Kafka?
Yes. Use Telegraf's [[outputs.http]] plugin to POST metrics directly to the Tinybird Events API. No Kafka cluster required.
Tinybird stores data in ClickHouse®-backed data sources and lets you publish Pipes as REST APIs. You own the Telegraf pipeline; Tinybird is the destination and API layer.
How do I map InfluxDB's tag + field model to ClickHouse® columns?
Map tags (low-cardinality indexed metadata like host, region, service) to LowCardinality(String) columns. Map fields (measured values) to Float64, Int64, or String.
Include timestamp as DateTime64(9) for nanosecond precision and use it in ORDER BY. For variable field names, use Map(String, Float64) to avoid schema migrations per measurement.
Is an InfluxDB-to-ClickHouse® integration good for long-retention analytics?
Yes—this is the primary motivation. InfluxDB is optimized for recent, high-frequency metric writes. Long-retention storage is expensive, and complex analytical queries across multiple measurements or long time ranges are slow.
An InfluxDB-to-ClickHouse® pipeline moves aged metrics to ClickHouse® where columnar compression handles long retention efficiently. Use InfluxDB for recent operational queries and ClickHouse® for historical trend analysis and real-time analytics over full history.
How does ClickHouse® handle high-frequency metric writes from InfluxDB?
ClickHouse® is optimized for high-throughput batch inserts—inserting thousands of rows at once is far more efficient than one-by-one inserts. Telegraf naturally batches metrics before flushing, which aligns with ClickHouse®'s optimal insert pattern.
Configure Telegraf's flush_interval and metric_batch_size to control batch size. Tinybird's Events API accepts batch JSON arrays, and the Kafka connector processes topics at millions of events per second.
Can I join InfluxDB measurements in ClickHouse® with SQL?
Yes—this is one of ClickHouse®'s key advantages over InfluxDB. InfluxDB's Flux query language supports joins but they are complex and slow for large datasets. InfluxQL (1.x) has no native join capability.
In ClickHouse®, you create separate tables per measurement and join them with standard SQL. Pre-aggregate with materialized views and AggregatingMergeTree to make cross-measurement analytical queries instantaneous.
