Migrate from ClickHouse self-hosted to Tinybird

If you're running ClickHouse self-hosted and looking for a more streamlined, production-ready analytics platform, this guide will walk you through migrating to Tinybird. You'll learn how to translate your existing ClickHouse concepts, take advantage of Tinybird's unique features, and integrate the platform into your application architecture.

Why migrate from ClickHouse self-hosted to Tinybird?

While ClickHouse is an excellent OLAP database, self-hosting comes with significant operational overhead. You're responsible for:

  • Infrastructure management: Cluster operations, sharding, replication, and scaling
  • Security: Authentication, authorization, and network security
  • Monitoring: Performance tuning, query optimization, and observability
  • Maintenance: Updates, backups, disaster recovery, and capacity planning
  • Development workflow: Building APIs, handling authentication, and creating deployment pipelines

Tinybird abstracts away these operational complexities while providing additional capabilities that transform ClickHouse from a database into a complete analytics platform.

Benefits of Tinybird over ClickHouse self-hosted

Tinybird offers several key advantages over managing ClickHouse yourself:

Fully managed infrastructure

Tinybird handles all the operational complexity of running ClickHouse at scale:

  • Automatic scaling: Infrastructure scales transparently based on your workload
  • Zero cluster operations: No need to manage shards, replicas, or cluster topology
  • Built-in monitoring: Comprehensive observability with performance metrics and alerting
  • Managed updates: Automatic updates and patches without downtime

API-first architecture

Transform SQL queries into production-grade REST APIs:

  • Instant API endpoints: Turn any SQL query into a scalable HTTP API with authentication
  • Parameterized queries: Build dynamic APIs with template variables and type validation
  • Built-in authentication: Token-based and JWT authentication with fine-grained permissions
  • Rate limiting: Built-in protection against abuse and cost control

Real-time data ingestion

Simplified data ingestion with multiple options:

  • Events API: High-throughput streaming ingestion for real-time events via HTTP
  • Native connectors: Managed integrations for Kafka, S3, GCS, and more
  • Data Sources API: Batch ingestion for CSV, Parquet, and NDJSON files
  • Schema evolution: Handle schema changes without downtime

Development workflow

Modern development practices built in:

  • Local development: Full local environment with Tinybird Local container
  • Version control: Git-based workflow for all data assets
  • CI/CD integration: Deploy changes through your existing pipelines
  • Testing and validation: Built-in deployment validation and rollback capabilities

BI and tool integration

Connect to your existing analytics tools:

  • ClickHouse HTTP interface: Use existing ClickHouse drivers and tools
  • PostgreSQL compatibility: Connect BI tools like Grafana, Metabase, and Tableau
  • Query API: Direct SQL execution for existing applications

ClickHouse to Tinybird concept translation

Tinybird has an opinionated approach to structuring analytics projects using two primary file types: .datasource files (tables) and .pipe files (queries, views, and materializations).

Database → Workspace

In ClickHouse, you organize tables within databases. In Tinybird:

  • Organization: Top-level container for all your projects
  • Workspace: Equivalent to a database, contains all your data sources and pipes

Learn more about Tinybird Core Concepts.

Table → Data Source

ClickHouse tables become Tinybird data sources, defined in .datasource files. Data sources in Tinybird define table schemas, engines, sorting keys, and partition keys. They may also contain references to connectors from which they receive data.

ClickHouse CREATE TABLE statement:

ClickHouse table creation
CREATE TABLE user_events
(
    user_id UInt32,
    event_type String,
    timestamp DateTime,
    properties JSON
)
ENGINE = MergeTree
ORDER BY (user_id, timestamp)
PARTITION BY toYYYYMM(timestamp)

Tinybird datasources/user_events.datasource:

user_events.datasource
DESCRIPTION >
    User events tracking with properties for analytics

SCHEMA >
    `user_id` UInt32 `json:$.user_id`,
    `event_type` String `json:$.event_type`,
    `timestamp` DateTime `json:$.timestamp`,
    `properties` String `json:$.properties`

ENGINE "MergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(timestamp)"
ENGINE_SORTING_KEY "user_id, timestamp"

# optional connector configuration
KAFKA_CONNECTION_NAME kafka_connection # The name of the .connection file
KAFKA_TOPIC topic_name
KAFKA_GROUP_ID {{ tb_secret("KAFKA_GROUP_ID") }}

Key differences:

  • JSONPath expressions: Map JSON fields to columns automatically
  • Declarative schema: Define structure without SQL DDL
  • Built-in description: Document your data sources inline
  • Source connector definitions: Define where the data comes from

Learn more about data sources.

Views and parametrized queries → Pipes

ClickHouse views become Tinybird pipes, which can be used to define API endpoints, materializations, internal queries, scheduled copies, sinks, or building blocks for other pipes.

ClickHouse parametrized view:

ClickHouse parametrized view
CREATE VIEW user_activity_summary
AS SELECT
    user_id,
    event_type,
    count() AS event_count,
    min(timestamp) AS first_event,
    max(timestamp) AS last_event
FROM user_events
WHERE timestamp >= {start_date:DateTime} 
  AND timestamp <= {end_date:DateTime}
  AND user_id = {target_user:UInt32}
GROUP BY user_id, event_type
ORDER BY event_count DESC

Tinybird pipes/user_activity_summary.pipe:

user_activity_summary.pipe
DESCRIPTION >
    User activity summary with date range and user filtering

NODE activity_summary
SQL >
    %
    SELECT
        user_id,
        event_type,
        count() AS event_count,
        min(timestamp) AS first_event,
        max(timestamp) AS last_event
    FROM user_events
    WHERE timestamp >= {{DateTime(start_date)}}
      AND timestamp <= {{DateTime(end_date)}}
      AND user_id = {{UInt32(target_user)}}
    GROUP BY user_id, event_type
    ORDER BY event_count DESC

Key differences:

  • Template variables: Use the % token at the beginning of a query and the {{Type(variable)}} syntax for parameters (see the example below)
  • Advanced logic: Use {% %} template blocks for conditional and more advanced query logic
  • Node structure: Organize complex queries into multiple nodes
  • Type safety: Automatic type validation for parameters
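
For example, once this pipe is published as an API endpoint (by adding TYPE ENDPOINT, as shown later in this guide), each template parameter becomes a query string parameter. The call below is a minimal sketch that assumes the pipe is deployed in your workspace and that YOUR_TOKEN has read access to it:

Calling the parameterized pipe
curl -G "https://api.tinybird.co/v0/pipes/user_activity_summary.json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  --data-urlencode "start_date=2024-01-01 00:00:00" \
  --data-urlencode "end_date=2024-01-31 23:59:59" \
  --data-urlencode "target_user=123"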

Learn more about pipes.

Materialized Views → Materialized Views

ClickHouse incremental materialized views translate to Tinybird materialized views with target data sources.

ClickHouse materialized view:

ClickHouse materialized view
-- Target table
CREATE TABLE user_metrics_hourly
(
    hour DateTime,
    user_id UInt32,
    event_count AggregateFunction(count),
    unique_events AggregateFunction(uniq, String)
)
ENGINE = AggregatingMergeTree()
ORDER BY (hour, user_id);

-- Materialized view
CREATE MATERIALIZED VIEW user_metrics_hourly_mv
TO user_metrics_hourly
AS SELECT
    toStartOfHour(timestamp) AS hour,
    user_id,
    countState() AS event_count,
    uniqState(event_type) AS unique_events
FROM user_events
GROUP BY hour, user_id;

Tinybird implementation:

Target data source (datasources/user_metrics_hourly.datasource):

user_metrics_hourly.datasource
DESCRIPTION >
    Hourly aggregated user metrics with state functions

SCHEMA >
    `hour` DateTime,
    `user_id` UInt32,
    `event_count` AggregateFunction(count, UInt64),
    `unique_events` AggregateFunction(uniq, String)

ENGINE "AggregatingMergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(hour)"
ENGINE_SORTING_KEY "hour, user_id"

Materialized view (materializations/user_metrics_hourly.pipe):

user_metrics_hourly.pipe
DESCRIPTION >
    Incremental materialization of user metrics

NODE hourly_aggregation
SQL >
    SELECT
        toStartOfHour(timestamp) AS hour,
        user_id,
        countState() AS event_count,
        uniqState(event_type) AS unique_events
    FROM user_events
    GROUP BY hour, user_id

TYPE MATERIALIZED
DATASOURCE user_metrics_hourly
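
Because the target data source stores intermediate aggregate states written with countState() and uniqState(), queries that read from it need the matching -Merge functions. A minimal sketch using the Query API (covered later in this guide), assuming YOUR_TOKEN has read access to user_metrics_hourly:

Querying the aggregated data source
curl -G "https://api.tinybird.co/v0/sql" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  --data-urlencode "q=SELECT hour, user_id, countMerge(event_count) AS events, uniqMerge(unique_events) AS uniques FROM user_metrics_hourly GROUP BY hour, user_id ORDER BY hour DESC LIMIT 10 FORMAT JSON"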

Learn more about materialized views.

Refreshable Materialized Views → Copy Pipes

ClickHouse refreshable materialized views become Tinybird copy pipes with scheduling.

ClickHouse refreshable materialized view:

ClickHouse refreshable materialized view
CREATE MATERIALIZED VIEW daily_user_summary
REFRESH EVERY 1 HOUR
ENGINE = MergeTree()
ORDER BY (date, user_id)
AS SELECT
    toDate(timestamp) AS date,
    user_id,
    count() AS total_events,
    countDistinct(event_type) AS unique_event_types
FROM user_events
GROUP BY date, user_id;

Tinybird implementation:

Target Data Source (datasources/daily_user_summary.datasource):

daily_user_summary.datasource
DESCRIPTION >
    Daily user activity summary

SCHEMA >
    `date` Date,
    `user_id` UInt32,
    `total_events` UInt64,
    `unique_event_types` UInt64

ENGINE "MergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(date)"
ENGINE_SORTING_KEY "date, user_id"

Copy Pipe (copies/daily_user_summary.pipe):

daily_user_summary.pipe
DESCRIPTION >
    Hourly copy job for daily user summaries

NODE daily_aggregation
SQL >
    SELECT
        toDate(timestamp) AS date,
        user_id,
        count() AS total_events,
        countDistinct(event_type) AS unique_event_types
    FROM user_events
    GROUP BY date, user_id

TYPE COPY
TARGET_DATASOURCE daily_user_summary
COPY_SCHEDULE "0 * * * *" # Every hour

Learn more about copy pipes.

Schema evolution

Evolving your schema in Tinybird is handled through deployments. Edit your .datasource or .pipe files and deploy changes:

# Validate changes before deploying
tb deploy --check

# Deploy changes
tb deploy

Tinybird automatically handles backfills and schema migrations based on your changes.

Learn more about Schema Evolution and Deployments.

Integrating Tinybird into your application

Data ingestion services

Tinybird provides multiple ways to ingest data. You do not need to set up ingestion infrastructure or use INSERT statements. Tinybird offers several native ingestion connectors and APIs.

Events API

Best for real-time streaming data; the Events API can also be used to send events directly from the client to Tinybird. Send JSON or NDJSON events via HTTP POST:

Events API ingestion
curl -X POST "https://api.tinybird.co/v0/events?name=user_events" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"user_id": 123, "event_type": "page_view", "timestamp": "2024-01-15 10:30:00", "properties": "{\"page\": \"/dashboard\"}"}'

Learn more about the Events API.

Data Sources API

For batch ingestion of CSV, Parquet, or NDJSON files:

Batch file ingestion
curl -X POST "https://api.tinybird.co/v0/datasources?mode=append&name=user_events" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "csv=@user_events.csv"

Learn more about the Data Sources API.

Connectors

Managed integrations for popular data sources:

  • Kafka: Real-time streaming with automatic schema detection
  • S3/GCS: Automated file ingestion from cloud storage

Connectors handle authentication, schema mapping, and error handling automatically.

Learn more about Connectors.

Table functions

Tinybird supports ClickHouse table functions for specialized data ingestion and transformation:

  • url(): Load data directly from HTTP/HTTPS URLs
  • iceberg(): Import data from Apache Iceberg tables
  • postgres(): Import data from PostgreSQL databases
  • mysql(): Import data from MySQL databases

Example using postgres() to import data from PostgreSQL:

pipes/import_from_postgres.pipe
DESCRIPTION >
    Import user data from PostgreSQL database

NODE postgres_import
SQL >
    SELECT
        user_id,
        email,
        created_at,
        last_login
    FROM postgres(
        'postgresql://username:password@hostname:5432/database',
        'users'
    )
    WHERE created_at >= yesterday()

TYPE COPY
TARGET_DATASOURCE user_data
COPY_SCHEDULE "0 */6 * * *" # Every 6 hours

This example creates a scheduled copy pipe that imports user data from a PostgreSQL table every 6 hours.

Learn more about table functions.

Query development

Tinybird offers flexible options for querying data, from direct SQL to API endpoints.

Pipes vs Query API

Pipes are the default approach for production applications:

  • Type-safe parameters: Automatic validation and conversion
  • Versioning: Track changes and roll back if needed
  • Performance optimization: Automatic query optimization and caching
  • Security: Fine-grained access control with tokens

Query API for direct SQL execution:

  • Migration path: Easy transition from existing ClickHouse applications (see the example below)
  • Ad-hoc analysis: Quick data exploration and debugging
  • BI tools: Direct integration with SQL-based tools
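
As a sketch of that migration path, SQL you already run against ClickHouse can usually be sent to the Query API unchanged. The example below assumes YOUR_TOKEN has read access to the user_events data source:

Query API example
curl -G "https://api.tinybird.co/v0/sql" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  --data-urlencode "q=SELECT event_type, count() AS event_count FROM user_events GROUP BY event_type ORDER BY event_count DESC FORMAT JSON"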

Example API endpoint from a pipe:

endpoints/user_activity.pipe
TOKEN analytics_api READ

DESCRIPTION >
    User activity API endpoint with authentication

NODE user_metrics
SQL >
    %
    SELECT
        user_id,
        event_type,
        count() AS event_count,
        max(timestamp) AS last_activity
    FROM user_events
    WHERE user_id = {{UInt32(user_id)}}
      AND timestamp >= {{DateTime(start_date, '2024-01-01 00:00:00')}}
    GROUP BY user_id, event_type
    ORDER BY event_count DESC

TYPE ENDPOINT

Access the endpoint:

curl "https://api.tinybird.co/v0/pipes/user_activity.json?user_id=123&start_date=2024-01-01" \
  -H "Authorization: Bearer READ_TOKEN"

Learn more about API Endpoints.

BI integration using ClickHouse interface

Connect your existing BI tools using the ClickHouse HTTP interface:

Grafana configuration:

  • Host: clickhouse.tinybird.co
  • Port: 443 (HTTPS)
  • Username: Your Workspace name
  • Password: Your Tinybird token

Supported tools:

  • Grafana
  • Metabase
  • Apache Superset
  • Tableau (via ClickHouse ODBC driver)
  • Any tool supporting ClickHouse HTTP protocol

The interface exposes four databases:

  • organization: for the Organization data sources, like organization.workspaces, organization.pipe_stats_rt
  • system: for system.tables, system.columns
  • tinybird: for the workspace service data sources like tinybird.datasources_ops_log, tinybird.pipe_stats_rt
  • your workspace name: for your workspace data sources.

See an example Grafana integration.
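
Before wiring up a BI tool, you can sanity-check the connection with any standard ClickHouse HTTP client. The curl call below is a minimal sketch that assumes YOUR_WORKSPACE_NAME and YOUR_TOKEN are the username and password described above:

Querying the ClickHouse HTTP interface with curl
curl "https://clickhouse.tinybird.co" \
  -u "YOUR_WORKSPACE_NAME:YOUR_TOKEN" \
  --data-binary "SELECT * FROM tinybird.pipe_stats_rt LIMIT 5 FORMAT PrettyCompact"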

Deployment options

Tinybird offers flexible deployment options to meet your infrastructure and compliance requirements.

Tinybird Cloud hosting

Fully-managed service with global regions:

  • No infrastructure management: Tinybird handles all operations
  • Global availability: Multiple regions across AWS and GCP
  • Automatic scaling: Infrastructure scales as needed based on selected plan
  • Enterprise features: SSO, audit logs, advanced security controls

Tinybird self-managed regions

Bring Your Own Cloud deployment for enhanced control:

  • Data sovereignty: Keep data in your cloud environment
  • Network isolation: Deploy within your VPC/VNet
  • Compliance: Meet specific regulatory requirements
  • Custom configurations: Tailored performance optimizations

Benefits of self-managed regions:

  • Complete control over infrastructure and data location
  • Integration with existing cloud commitments and reserved instances
  • Direct connection to internal data sources without network egress
  • Custom security policies and network configurations

Deployment process:

  1. Register a self-managed region with Tinybird Cloud
  2. Deploy Tinybird infrastructure in your cloud environment
  3. Connect and manage through Tinybird CLI with --region flag

Self-managed regions are currently in beta. Contact Tinybird support for deployment assistance and requirements.

Learn more about self-managed Regions.

Open source projects using Tinybird

Many successful open source projects have adopted Tinybird for their analytics needs:

  • Ghost: Publishing platform using Tinybird for member analytics and engagement metrics

  • Papermark: Document sharing platform with real-time analytics on document views and interactions

  • Inbox Zero: Email management tool leveraging Tinybird for email analytics and automation insights

  • LocalStack: Cloud development platform using Tinybird for usage analytics and performance monitoring

  • OpenStatus: Website monitoring platform built on Tinybird for real-time uptime and performance analytics

  • Dub: Link management platform using Tinybird for click analytics and link performance tracking

Each of these projects leverages Tinybird's real-time capabilities, API-first architecture, and managed infrastructure to focus on its core product instead of analytics infrastructure, while keeping the flexibility to choose between hosted and self-hosted deployments.

Migration strategy

To successfully migrate from ClickHouse self-hosted to Tinybird:

1. Assessment and planning

  • Inventory your current setup: Document tables, views, materialized views, and queries
  • Identify integration points: Map how your application currently interacts with ClickHouse
  • Plan data migration: Choose between full migration or gradual transition

2. Local development setup

# Install Tinybird CLI
curl https://tinybird.co | sh

# Authenticate
tb login

# Start local development environment
tb local start

3. Schema translation

  • Convert ClickHouse tables to .datasource files
  • Transform views and materialized views to .pipe files
  • Set up materialized views and copy pipes for aggregations

4. Data migration

  • Use the Data Sources API for bulk historical data (see the sketch below)
  • Set up real-time connectors for ongoing ingestion
  • Validate data consistency between systems
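
A minimal sketch of the bulk-load and validation steps, assuming the user_events schema from earlier in this guide, NDJSON ingestion through the Data Sources API, and a token with the appropriate append and read scopes:

Bulk historical data migration
# Export historical data from self-hosted ClickHouse as NDJSON
clickhouse-client --query "SELECT * FROM user_events FORMAT JSONEachRow" > user_events.ndjson

# Append it to the matching Tinybird data source
curl -X POST "https://api.tinybird.co/v0/datasources?mode=append&name=user_events&format=ndjson" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "ndjson=@user_events.ndjson"

# Spot-check that row counts match on both sides
clickhouse-client --query "SELECT count() FROM user_events"
curl -G "https://api.tinybird.co/v0/sql" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  --data-urlencode "q=SELECT count() FROM user_events FORMAT JSON"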

5. Application integration

  • Replace direct ClickHouse queries with Tinybird API endpoints or requests to the Query API
  • Update authentication to use Tinybird tokens
  • Configure BI tools to use Tinybird's ClickHouse interface

6. Production deployment

  • Deploy to Tinybird Cloud or set up self-managed region
  • Configure monitoring and alerting via Tinybird service data sources
  • Gradually shift traffic from ClickHouse to Tinybird

Next steps

Ready to start your migration? Here are some immediate next steps:

  1. Sign up for a free Tinybird account and explore the platform
  2. Follow the quick start guide to understand core concepts
  3. Install the CLI and set up local development
  4. Join our community for migration support and best practices

For complex migrations or enterprise requirements, contact our team for personalized assistance and migration planning.
