---
title: "ClickHouse® + Snowflake — 3 Ways to Connect in 2026"
excerpt: "Get Snowflake data into ClickHouse® for real-time analytics. Tinybird, ClickPipes S3, or self-managed—compare the three options and choose the right pipeline architecture."
authors: "Tinybird"
categories: "AI Resources"
createdOn: "2026-03-30 00:00:00"
publishedOn: "2026-03-30 00:00:00"
updatedOn: "2026-03-30 00:00:00"
status: "published"
---

# **ClickHouse® + Snowflake — 3 Ways to Connect in {{ year }}**

These are the main options for a **ClickHouse® integration Snowflake** pipeline:

1. [Tinybird](https://www.tinybird.co/)
2. ClickHouse® Cloud + ClickPipes (S3 or Kafka source)
3. Self-managed or custom (Snowflake Streams + Kafka, or batch export via COPY INTO S3)

**Snowflake** is a **cloud data warehouse** used for **batch analytics**, **BI workloads**, and **structured data storage**. Many teams want a copy of that data in ClickHouse® for **sub-100ms analytical queries**, **real-time APIs**, and [user-facing analytics](https://www.tinybird.co/blog/user-facing-analytics) that Snowflake's job-pool model cannot support at scale.

A **ClickHouse® integration Snowflake** setup uses **scheduled syncs**, **Snowflake Streams + Kafka**, or **S3 staging** to move Snowflake data into ClickHouse® for [real-time analytics](https://www.tinybird.co/blog/real-time-analytics-a-definitive-guide) at a fraction of Snowflake's query cost.

Below we compare all three options in depth—architecture, real code examples, trade-offs, and when to use each.

**Looking for minimal ops and instant APIs?**

Tinybird has a **native Snowflake Connector**—scheduled sync directly from Snowflake tables, **managed ClickHouse®**, and **one-click API publishing** from SQL. Your **Snowflake to ClickHouse® pipeline** can power real-time APIs at roughly **1/10th the cost** of running those queries directly in Snowflake.

## **Three ways to implement ClickHouse® integration Snowflake**

This section is the core: the three options to connect Snowflake to ClickHouse®, in order.

### **Option 1: Tinybird — native Snowflake Connector with API layer**

Tinybird is a **real-time data platform** built on ClickHouse® with a **native Snowflake Connector**. It is the only option that requires zero pipeline infrastructure: no S3 buckets, no Kafka clusters, no custom extraction scripts.

**How it works:** authenticate Tinybird with your Snowflake account using the one-time SQL snippet Tinybird generates. Select a table or view and set a sync schedule (interval or cron expression). Tinybird runs a background job that extracts the delta from Snowflake and ingests it into a ClickHouse®-backed data source.

You then define **Pipes** (SQL) and publish them as REST endpoints with a single click.
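
As a minimal sketch, assuming the connector has created an `orders` data source from the Snowflake table above (names are illustrative), a Pipe's SQL might look like this before you publish it as an endpoint:

```sql
-- Hypothetical Tinybird Pipe node over the synced `orders` data source.
-- Publishing it turns this query into a REST endpoint.
SELECT
    toDate(created_at) AS day,
    count()            AS orders,
    sum(amount)        AS revenue
FROM orders
WHERE created_at >= now() - INTERVAL 30 DAY
GROUP BY day
ORDER BY day DESC
```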

**One-time Snowflake setup (Tinybird-generated SQL):**

```sql
-- Create a dedicated role and user for Tinybird
CREATE ROLE tinybird_role;
CREATE USER tinybird_user
  PASSWORD = '<password>'
  DEFAULT_ROLE = tinybird_role;
GRANT ROLE tinybird_role TO USER tinybird_user;

-- Grant access to your database and schema
GRANT USAGE ON DATABASE analytics TO ROLE tinybird_role;
GRANT USAGE ON SCHEMA analytics.public TO ROLE tinybird_role;
GRANT SELECT ON TABLE analytics.public.orders TO ROLE tinybird_role;

-- Create GCS integration (for Tinybird's internal staging)
CREATE STORAGE INTEGRATION tinybird_gcs_integration
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = GCS
  ENABLED = TRUE
  STORAGE_ALLOWED_LOCATIONS = ('gcs://tinybird-staging/');
```

Tinybird handles the extraction, staging, and ingestion automatically after this one-time setup.

**When Tinybird fits:**

- You want to get **Snowflake data into ClickHouse®** with zero pipeline infrastructure
- You need low-latency APIs and [real-time dashboards](https://www.tinybird.co/blog/real-time-dashboards-are-they-worth-it) on top of Snowflake data
- You want a **Snowflake to ClickHouse® pipeline** that costs significantly less than running queries in Snowflake

**Prerequisites:** a Snowflake account (any cloud region), a Tinybird workspace, and the one-time SQL setup above. Data flows: Snowflake → Tinybird background job → Tinybird data sources → Pipes → REST API.

### **Option 2: ClickHouse® Cloud + ClickPipes (S3 or Kafka source)**

ClickHouse® Cloud's **ClickPipes** supports **S3** and **Kafka** as data sources. There is no native Snowflake connector.

**How it works (S3 path):** configure a Snowflake **Task** that runs `COPY INTO @external_stage` on a schedule, exporting to an S3 bucket in Parquet format. Then create an **S3 ClickPipe** in the ClickHouse® Cloud console.

**Snowflake setup — export to S3 on schedule:**

```sql
-- Create file format and external stage
CREATE FILE FORMAT parquet_format TYPE = PARQUET;
CREATE OR REPLACE STAGE s3_stage
  URL = 's3://my-bucket/snowflake-export/'
  CREDENTIALS = (AWS_KEY_ID = '<key>' AWS_SECRET_KEY = '<secret>')
  FILE_FORMAT = parquet_format;

-- Create a scheduled Task to run every 15 minutes
CREATE OR REPLACE TASK export_orders_task
  WAREHOUSE = COMPUTE_WH
  SCHEDULE = '15 MINUTE'
AS
  COPY INTO @s3_stage/orders/
  FROM (SELECT * FROM orders WHERE updated_at > DATEADD(minute, -15, CURRENT_TIMESTAMP()))
  MAX_FILE_SIZE = 157286400  -- 150 MB per file
  HEADER = TRUE;

ALTER TASK export_orders_task RESUME;
```

**Load from S3 into ClickHouse® Cloud (one-time or via ClickPipes):**

```sql
-- Direct S3 load (one-time)
INSERT INTO orders
SELECT * FROM s3(
  'https://my-bucket.s3.amazonaws.com/snowflake-export/orders/*.parquet',
  '<aws_key>', '<aws_secret>',
  'Parquet'
)
SETTINGS
  input_format_null_as_default = 1,
  input_format_parquet_case_insensitive_column_matching = 1;
```

For **near-real-time CDC**: capture table changes with **Snowflake Streams**, publish them to Kafka with a producer job, then create a **Kafka ClickPipe** in ClickHouse® Cloud.

**When it fits:**

- You want managed ClickHouse® and control the export mechanism yourself
- You already have S3 staging infrastructure or Kafka running
- Your main need is analytical replication from **Snowflake into ClickHouse®**

**Prerequisites:** a Snowflake Task configured to export to S3, or a CDC job publishing Snowflake Stream changes to Kafka. Data flows: Snowflake → S3 (or Kafka) → ClickPipes → ClickHouse® Cloud.

### **Option 3: Self-managed or custom (Snowflake Streams + Kafka, or batch S3)**

With **self-managed ClickHouse®**, the common batch pattern is: a Snowflake **Task** exports to S3 via `COPY INTO @external_stage`, then ClickHouse® loads via the **S3 table function**.

For near-real-time CDC: **Snowflake Streams** capture row-level changes (inserts, updates, deletes). A Snowflake Task reads the stream and a producer job publishes the changes to a Kafka topic (the Snowflake Connector for Kafka moves data in the opposite direction, from Kafka into Snowflake, so the export side needs its own producer or CDC tool). ClickHouse® consumes via the **Kafka table engine** + materialized view → MergeTree.
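
A minimal sketch of the Snowflake side, assuming an `orders` source table and a hypothetical `orders_cdc_outbox` staging table that your Kafka producer reads from (object names are illustrative):

```sql
-- Capture row-level changes on the source table
CREATE OR REPLACE STREAM orders_stream ON TABLE orders;

-- Drain the stream into an outbox table on a schedule;
-- a separate producer job forwards outbox rows to the Kafka topic
CREATE OR REPLACE TASK forward_orders_cdc
  WAREHOUSE = COMPUTE_WH
  SCHEDULE = '1 MINUTE'
  WHEN SYSTEM$STREAM_HAS_DATA('ORDERS_STREAM')
AS
  INSERT INTO orders_cdc_outbox
  SELECT order_id, customer_id, amount, status, created_at,
         METADATA$ACTION, METADATA$ISUPDATE
  FROM orders_stream;

ALTER TASK forward_orders_cdc RESUME;
```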

**ClickHouse® CDC via Kafka table engine + Snowflake Streams:**

```sql
-- Kafka engine reads CDC from Snowflake Streams → Kafka
CREATE TABLE orders_kafka
(
    order_id    UInt64,
    customer_id UInt64,
    amount      Float64,
    status      LowCardinality(String),
    created_at  DateTime,
    `METADATA$ACTION`   LowCardinality(String),  -- INSERT / DELETE (backticks required: '$' is invalid in unquoted identifiers)
    `METADATA$ISUPDATE` UInt8
)
ENGINE = Kafka
SETTINGS
    kafka_broker_list = 'kafka:9092',
    kafka_topic_list = 'snowflake-orders-cdc',
    kafka_group_name = 'clickhouse_snowflake_consumer',
    kafka_format = 'JSONEachRow';

-- Target table with ReplacingMergeTree for upserts
CREATE TABLE orders
(
    order_id    UInt64,
    customer_id UInt64,
    amount      Float64,
    status      LowCardinality(String),
    created_at  DateTime,
    updated_at  DateTime DEFAULT now()
)
ENGINE = ReplacingMergeTree(updated_at)
PARTITION BY toYYYYMM(created_at)
ORDER BY (customer_id, order_id);

CREATE MATERIALIZED VIEW orders_mv TO orders AS
SELECT
    order_id, customer_id, amount, status, created_at, now() AS updated_at
FROM orders_kafka
WHERE `METADATA$ACTION` = 'INSERT' OR `METADATA$ISUPDATE` = 1;
```

**When it fits:**

- You already run ClickHouse® and want full control over the pipeline
- You have data-engineering capacity to operate Snowflake, Kafka, and ClickHouse®
- Batch S3 export fits when near-real-time is not required

### **Decision framework: which option fits your situation**

The right choice depends on **four variables**: freshness requirement, team capacity, whether you need an API layer, and cost sensitivity.

| Situation | Recommended option |
|---|---|
| Real-time APIs, minimal ops, no pipeline infra | Tinybird |
| Already on ClickHouse® Cloud, have S3/Kafka | ClickPipes |
| Self-managed ClickHouse®, CDC required | Self-managed + Snowflake Streams |
| One-time migration or batch analytics | S3 export + S3 table function |
| Cost is driving the move from Snowflake | Tinybird (10× cheaper at high concurrency) |

**Choose Tinybird** when you need [real-time data ingestion](https://www.tinybird.co/blog/real-time-data-ingestion) from Snowflake and REST APIs from the same data—without managing any pipeline infrastructure.

**Choose ClickPipes** when you're already on ClickHouse® Cloud and have S3 or Kafka configured. You get managed ClickHouse® ingestion but build your own serving layer.

**Choose self-managed** when your team has data-engineering capacity to operate Snowflake, Kafka, and ClickHouse®—and needs full control over CDC semantics and schema mapping.

### **Summary table**

| Option | Ingestion path | API layer | Ops burden | Infrastructure required |
|--------|---------------|-----------|------------|------------------------|
| Tinybird | Native Snowflake Connector | Built in (Pipes) | Low | None |
| ClickHouse® Cloud ClickPipes | S3 or Kafka | Build your own | Medium | S3 bucket or Kafka |
| Self-managed | S3 or Snowflake Streams → Kafka | Build your own | High | S3, Kafka, ClickHouse® |

## **What is Snowflake and why integrate it with ClickHouse®?**

### **Snowflake as the data source**

Snowflake is a **cloud-agnostic [data warehouse](https://www.tinybird.co/blog/why-data-warehouses)** available on AWS, GCP, and Azure. It separates compute from storage, scales virtual warehouses independently, and supports diverse analytical workloads through standard SQL.

Teams use it for **BI dashboards**, **ETL pipelines**, **data sharing**, and **structured analytics**. Snowflake handles batch workloads well.

But Snowflake is **not optimized for user-facing real-time APIs**. Its **job pool model** introduces non-deterministic latency—queries queue before execution. **Query concurrency** is limited per virtual warehouse. For [user-facing analytics](https://www.tinybird.co/blog/user-facing-analytics) needing **sub-100ms** responses at high concurrency, Snowflake's architecture creates a bottleneck.

### **How to get data out of Snowflake**

**COPY INTO @external_stage** exports query results to S3, GCS, or Azure Blob in CSV, JSON, or Parquet format. Parquet is preferred—it preserves type information, compresses efficiently, and supports nested structures.
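
For example, a hedged sketch of a partitioned Parquet unload, reusing the `s3_stage` stage from Option 2 (column and path names are illustrative); date-based prefixes make it easy for the downstream loader to pick up only new files:

```sql
-- Unload the last day of orders as Parquet, partitioned by date prefix
COPY INTO @s3_stage/orders/
FROM (
  SELECT * FROM orders
  WHERE created_at >= DATEADD(day, -1, CURRENT_TIMESTAMP())
)
PARTITION BY ('date=' || TO_VARCHAR(created_at, 'YYYY-MM-DD'))
FILE_FORMAT = (TYPE = PARQUET)
HEADER = TRUE;
```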

**Snowflake Streams** capture row-level changes (inserts, updates, deletes) for CDC-style replication. To push those changes to Kafka for near-real-time replication, a Task reads the stream and a producer job writes to a Kafka topic; the **Snowflake Connector for Kafka** works in the opposite direction (loading Kafka data into Snowflake), so it cannot do this export for you.

**Snowflake Connector for Python** and the **JDBC/ODBC drivers** support direct SQL extraction for custom pipelines.

### **Why route Snowflake data to ClickHouse®**

Snowflake processes queries in a job pool with shared-compute overhead. For 100 queries per minute at 10 MB per query over 1 TB of data, Snowflake requires multiple virtual warehouses running 24/7—costing roughly **$30K/month**.

Tinybird's ClickHouse®-backed platform for the same workload costs roughly **$3K/month** (10× cheaper). You keep Snowflake as the source of truth for batch analytics and use ClickHouse® as the [real-time data processing](https://www.tinybird.co/blog/real-time-data-processing) and serving layer.

## **Schema and pipeline design**

### **Mapping Snowflake types to ClickHouse® columns**

Snowflake uses micro-partitions internally with a flexible `VARIANT` type for semi-structured data. In ClickHouse® you define an explicit schema with typed columns.

Key type mappings:

```sql
-- Snowflake → ClickHouse® type mapping
-- NUMBER(38,0)     → Int64 or UInt64
-- NUMBER(18,6)     → Decimal(18, 6) or Float64
-- VARCHAR          → String or LowCardinality(String)
-- BOOLEAN          → UInt8
-- TIMESTAMP_NTZ    → DateTime64(3)
-- DATE             → Date
-- VARIANT          → String (JSON), extract at query time

CREATE TABLE orders
(
    order_id    UInt64,
    customer_id UInt64,
    amount      Float64,
    status      LowCardinality(String),     -- Low-cardinality VARCHAR
    metadata    String,                      -- VARIANT → JSON String
    created_at  DateTime64(3),
    updated_at  DateTime64(3)
)
ENGINE = ReplacingMergeTree(updated_at)
PARTITION BY toYYYYMM(created_at)
ORDER BY (customer_id, order_id);
```

For `VARIANT` columns, extract fields at query time using ClickHouse®'s JSON functions:

```sql
SELECT
    order_id,
    JSONExtractString(metadata, 'shipping_carrier') AS carrier,
    JSONExtractFloat(metadata, 'shipping_cost') AS shipping_cost
FROM orders
WHERE created_at >= today() - INTERVAL 7 DAY;
```

### **Incremental sync strategy**

Full table exports from Snowflake are slow and consume credits. After the initial load, use **incremental syncs** that export only the delta:

```sql
-- Snowflake: export only rows updated since last sync
COPY INTO @s3_stage/orders/
FROM (
  SELECT * FROM orders
  WHERE updated_at > '{{ last_sync_timestamp }}'
)
MAX_FILE_SIZE = 157286400
HEADER = TRUE;
```
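
Where `{{ last_sync_timestamp }}` comes from is up to your orchestrator; one simple approach is to ask ClickHouse® what it already has (a sketch, assuming the `orders` target table above):

```sql
-- ClickHouse®: high-water mark of already-ingested rows,
-- substituted into the Snowflake export query above
SELECT max(updated_at) AS last_sync_timestamp
FROM orders;
```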

Tinybird's native Snowflake Connector handles incremental sync automatically using a watermark column.

### **Failure modes to plan for**

- **Sync failures:** Tinybird's Snowflake Connector logs each sync job with success/failure status. Configure alerts via the `datasources_ops_log` service data source.
- **Schema drift:** new Snowflake columns require updating the ClickHouse® schema. Plan for nullable columns and monitor Snowflake's `INFORMATION_SCHEMA.COLUMNS` for changes (see the sketch after this list).
- **Stale data windows:** scheduled syncs have a lag equal to your sync interval. For sub-interval freshness, use Snowflake Streams + Kafka for CDC.
- **Large initial loads:** full table exports from Snowflake take time and consume credits. Use incremental sync after the initial load.
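
A hedged sketch of a drift check on the Snowflake side, assuming the `analytics.public.orders` table from earlier; compare the result against the columns your ClickHouse® table defines:

```sql
-- Snowflake: current columns of the source table, for comparison
-- against the ClickHouse® target schema
SELECT column_name, data_type, is_nullable
FROM analytics.information_schema.columns
WHERE table_schema = 'PUBLIC'
  AND table_name   = 'ORDERS'
ORDER BY ordinal_position;
```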

## **Why ClickHouse® for Snowflake analytics**

ClickHouse® is a **columnar OLAP [database](https://www.oracle.com/database/what-is-database/)** built for analytical queries over large volumes. **MergeTree** tables and **vectorized execution** deliver **sub-second queries** on billions of rows.

Where Snowflake processes queries in a job pool with shared-compute overhead, ClickHouse® responds in **62ms p90** for [real-time dashboards](https://www.tinybird.co/blog/real-time-dashboards-are-they-worth-it) serving thousands of concurrent users (Resend case study). This is the fundamental difference: Snowflake is built for warehousing, ClickHouse® is built for serving.

A **ClickHouse® integration Snowflake** setup fits [user-facing analytics](https://www.tinybird.co/blog/user-facing-analytics), real-time APIs over warehouse data, and workloads where Snowflake's concurrency and latency limits are a bottleneck.

## **Why Tinybird is the best Snowflake to ClickHouse® option**

Most teams moving from Snowflake to a real-time serving layer don't want to build a pipeline—they want **fast APIs on Snowflake data**.

Tinybird is purpose-built for this. The native Snowflake Connector eliminates S3 staging, Kafka clusters, and custom extraction scripts. Authenticate Tinybird with Snowflake, select a table, set a sync schedule—done.

Tinybird handles **extraction, ingestion, storage, and API publishing** in one product. Define Pipes in SQL, publish as REST endpoints, and serve dashboards or product features at **sub-100ms latency** and automatic scaling.

The cost advantage is real: 100 queries per minute at 10 MB per query costs ~**$30K/month** in Snowflake virtual warehouses. The same workload on Tinybird costs ~**$3K/month**. Exploratory queries in Tinybird are free. You keep Snowflake as the source of truth and use Tinybird for the real-time serving layer.

## **Frequently Asked Questions (FAQs)**

### **Does ClickHouse® Cloud support Snowflake natively?**

ClickHouse® Cloud does not have a **native Snowflake connector** in ClickPipes. You use **S3** as the staging layer: configure a Snowflake Task that exports data to an S3 bucket via `COPY INTO`, then create an S3 ClickPipe to load into ClickHouse® Cloud.

Alternatively, publish table changes to Kafka yourself (for example, from Snowflake Streams via a producer job), then create a Kafka ClickPipe. You operate the Snowflake → S3 or Snowflake → Kafka path; ClickHouse® Cloud ingests from S3 or Kafka.

### **Can I use Tinybird for Snowflake to ClickHouse® without building a pipeline?**

Yes—this is Tinybird's native Snowflake Connector use case. Authenticate Tinybird with your Snowflake account using the one-time SQL snippet, select a table or view, set a sync schedule (interval or cron), and Tinybird handles extraction, ingestion, and storage in ClickHouse®-backed data sources.

No S3 buckets, no Kafka clusters, no custom extraction scripts required.

### **How much cheaper is ClickHouse® vs Snowflake for real-time serving?**

Significantly. For 100 queries per minute at 10 MB per query over 1 TB of data, Snowflake requires multiple virtual warehouses running 24/7—costing roughly **$30K/month**.

Tinybird's ClickHouse®-backed platform for the same workload costs roughly **$3K/month** (10× cheaper). Exploratory queries in Tinybird are free. You keep Snowflake for batch analytics and use Tinybird for [real-time data ingestion](https://www.tinybird.co/blog/real-time-data-ingestion) and the serving layer.

### **What is the difference between Snowflake COPY INTO and Snowflake Streams?**

**COPY INTO @external_stage** exports a snapshot of your data to S3 or GCS on a schedule. It is simple and works for batch analytics, but lag equals your export schedule (minutes to hours), and deletes and intermediate updates are not captured as events; you only get the latest state.

**Snowflake Streams** capture row-level changes (inserts, updates, deletes) in near real time. Pushed into Kafka by a Task plus a producer job, they enable CDC-style replication into ClickHouse® with seconds of lag. Use Streams when you need accurate upsert/delete propagation and sub-minute freshness.

### **How do I handle Snowflake VARIANT columns in ClickHouse®?**

Map `VARIANT` columns to `String` in ClickHouse® and store them as JSON. Use ClickHouse®'s JSON functions (`JSONExtractString`, `JSONExtractFloat`, `JSONExtractArrayRaw`) to extract fields at query time.

For frequently-accessed fields, extract them at ingest time using a **Tinybird Pipe transformation** or a ClickHouse® **materialized view** that extracts JSON fields into typed columns—improving query performance significantly.
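
A minimal sketch of the materialized-view approach, assuming the `orders` table above and two illustrative fields inside `metadata`:

```sql
-- Typed target table for the extracted fields
CREATE TABLE orders_shipping
(
    order_id      UInt64,
    carrier       LowCardinality(String),
    shipping_cost Float64,
    created_at    DateTime64(3)
)
ENGINE = MergeTree
ORDER BY (carrier, created_at);

-- Extract JSON fields once, at ingest time, instead of per query
CREATE MATERIALIZED VIEW orders_shipping_mv TO orders_shipping AS
SELECT
    order_id,
    JSONExtractString(metadata, 'shipping_carrier') AS carrier,
    JSONExtractFloat(metadata, 'shipping_cost')     AS shipping_cost,
    created_at
FROM orders;
```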

### **Can I do real-time CDC from Snowflake into ClickHouse®?**

Yes, using **Snowflake Streams** plus Kafka. Snowflake Streams capture inserts, updates, and deletes from a table. A Snowflake Task reads the stream and a producer job publishes the changes to Kafka. ClickHouse® consumes via the Kafka table engine + materialized view into ReplacingMergeTree.

This provides CDC-style replication with seconds of latency. Deletes require application-level handling: row deletes in MergeTree run as asynchronous mutations rather than cheap per-row operations, so for CDC it is usually better to use a soft-delete column and filter at query time.
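
For example, assuming a hypothetical `is_deleted` flag propagated from the CDC stream into the target table, the serving query collapses versions and filters out deleted rows:

```sql
-- FINAL collapses ReplacingMergeTree versions; the flag hides soft-deleted rows
SELECT order_id, customer_id, amount, status
FROM orders FINAL
WHERE is_deleted = 0;
```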

