ClickHouse has earned its reputation as one of the fastest analytical databases available, with over 36,000 GitHub stars and adoption at companies processing petabytes of data. Many teams have found great success with ClickHouse in production, though the journey from initial setup to reliable operation at scale often involves navigating some growing pains.
Self-hosting a ClickHouse installation gives you the performance benefits of ClickHouse while letting you maintain control over your infrastructure. You avoid vendor lock-in and keep costs predictable. That all makes sense. As with any self-hosted solution, the tradeoff comes in balancing ClickHouse's power and utility with the operational complexity that can come with it. Many teams successfully take on this operational complexity as part of their core competency. But for teams who don't want to become ClickHouse experts (but still want that ClickHouse power), exploring different deployment options might align better with their priorities and constraints.
This post explores an alternative - Tinybird self-managed regions - for those who want to install ClickHouse on their own server while avoiding the more nuanced complexities of ClickHouse. I'll cover some of the things that make ClickHouse tough to operate at scale, how Tinybird's self-managed option offers an alternative deployment path for self-hosters, and provide a walkthrough on how to install ClickHouse on your own server using Tinybird self-managed.
Understanding ClickHouse operational considerations
ClickHouse is a powerful database, and with that power comes operational complexity that's worth understanding before diving in. My colleague Javi Santana wrote a thoughtful 2-part series on operating ClickHouse at scale over the years (Part 1, Part 2). There's a lot of practical wisdom in those posts about what successful ClickHouse operations look like in practice.
While many teams have built excellent ClickHouse operations, it's helpful to understand what you'll be taking on. Here are some of the key operational areas to consider when self-hosting ClickHouse:
Infrastructure
A production ClickHouse deployment requires more than just spinning up a few containers. For high availability, you need:
- Minimum 2 ClickHouse instances with replication
- ZooKeeper or ClickHouse Keeper for coordination (isolated from database nodes)
- Load balancer with replica health awareness
- Storage architecture decisions: Local SSD vs. cloud storage vs. hybrid approaches
The rule of thumb for hardware sizing: a 32-core machine can process around 5GB/s healthily. Beyond that, you're looking at horizontal scaling with all the coordination complexity that entails.
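To make the coordination requirement concrete, here is a minimal sketch of what replication looks like at the SQL level. The cluster, database, and table names are hypothetical, and it assumes each node's config defines the {shard}/{replica} macros and that a ZooKeeper or ClickHouse Keeper ensemble is already running.
# Hypothetical cluster/database/table names; requires Keeper and per-node macros.
clickhouse-client --multiquery <<'SQL'
CREATE DATABASE IF NOT EXISTS analytics ON CLUSTER my_cluster;

-- Replicated storage: coordination state lives in ZooKeeper / ClickHouse Keeper.
CREATE TABLE IF NOT EXISTS analytics.events_local ON CLUSTER my_cluster
(
    event_time DateTime,
    user_id    UInt64,
    action     String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/analytics.events_local', '{replica}')
PARTITION BY toYYYYMM(event_time)
ORDER BY (user_id, event_time);

-- Distributed wrapper so reads and writes fan out across shards and replicas.
CREATE TABLE IF NOT EXISTS analytics.events ON CLUSTER my_cluster
AS analytics.events_local
ENGINE = Distributed(my_cluster, analytics, events_local, rand());
SQL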
Another practical configuration challenge: storage. Zero-copy replication can dramatically reduce storage costs, but it requires careful setup of shared storage (like S3) with proper IAM policies, network configuration, and monitoring of data consistency across replicas. And as of this writing, zero-copy replication is disabled by default on ClickHouse open-source v22.8 and higher and is not recommended for production use, adding complexity and cost if you need to store a lot of data.
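If you do go down that path, it's worth checking how the relevant merge-tree settings are actually configured on each node. The exact setting names and defaults vary by version, so treat this as a starting point rather than a recipe:
# Inspect zero-copy-related merge-tree settings on a node (names vary by version).
clickhouse-client --query "
    SELECT name, value, changed
    FROM system.merge_tree_settings
    WHERE name LIKE '%zero_copy%'
"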
Operations
Upgrades are a monthly reality with ClickHouse's release cycle. Each upgrade brings potential risks:
- Backward-incompatible replication protocol changes
- Data storage format incompatibilities
- SQL behavior changes and performance regressions
- Settings modifications requiring careful review
From our experience operating ClickHouse at scale, we've found that upgrades can sometimes take longer than initially planned when unexpected incompatibilities arise. The upgrade process benefits from careful orchestration across replicas to maintain availability.
Query management becomes important as you mix workload types:
- Real-time traffic (sub-second responses)
- Long-running analytical queries (minutes to hours)
- Backfill operations (can take days)
Load balancing these workloads requires careful request routing and replica management. Real-time queries should get dedicated replicas optimized for low latency, while backfills must run on separate nodes to prevent resource contention. Keep in mind what that means for storage costs: A 300TB dataset with 10 replicas is actually costing you 3000TB in storage.
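One building block for that isolation in plain ClickHouse is per-workload settings profiles, sketched below with hypothetical profile names and purely illustrative limits; routing each workload to the right replicas still has to happen at your load balancer.
# Hypothetical profiles separating latency-sensitive traffic from backfills.
clickhouse-client --multiquery <<'SQL'
-- Tight limits for real-time API traffic.
CREATE SETTINGS PROFILE IF NOT EXISTS realtime_profile
SETTINGS max_execution_time = 2, max_threads = 8, max_memory_usage = 4000000000;

-- Looser limits for long-running backfills, ideally on separate replicas.
CREATE SETTINGS PROFILE IF NOT EXISTS backfill_profile
SETTINGS max_execution_time = 0, max_threads = 32, max_memory_usage = 64000000000;
SQL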
Ingestion
This is an area where scaling considerations become particularly important. ClickHouse's performance depends on balancing four competing processes:
- Part generation from incoming data
- Background merges to optimize storage
- Read queries accessing data
- Mutations for updates/deletes
Common challenges and errors that show up when managing this balance:
- "Too many parts" errors causing tables to go read-only
- Out-of-memory crashes during high ingestion periods
- Materialized views requiring manual detaching and reattaching
- Query performance degradation as unmerged parts accumulate
Each new materialized view slows down ingestion. Memory issues during part generation can bring down entire nodes. Recovery often requires deep knowledge of ClickHouse internals and careful manipulation of system tables.
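A first line of defense is watching active part counts before the parts_to_throw_insert threshold turns a table read-only. A query along these lines (the thresholds that matter are workload- and version-dependent) is a common health check:
# Active part counts per partition; sustained high counts signal merge pressure.
clickhouse-client --query "
    SELECT database, table, partition, count() AS active_parts
    FROM system.parts
    WHERE active
    GROUP BY database, table, partition
    ORDER BY active_parts DESC
    LIMIT 20
"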
Areas requiring extra attention
Some operations in ClickHouse that seem straightforward can have surprising implications:
POPULATEs: Using CREATE MATERIALIZED VIEW ... POPULATE seems convenient but can duplicate data if not executed carefully. Safe backfills require manual coordination and can take days for large datasets.
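As an illustration of the safer pattern, here is a hedged sketch with hypothetical table and view names: create the view without POPULATE, writing to an explicit target table, then backfill older data with a controlled insert whose cutover timestamp matches when the view went live.
# Hypothetical tables; the cutover timestamp must match when the view started receiving inserts.
clickhouse-client --multiquery <<'SQL'
CREATE TABLE IF NOT EXISTS analytics.daily_counts
(
    day    Date,
    action String,
    hits   UInt64
)
ENGINE = SummingMergeTree
ORDER BY (day, action);

-- New rows flow through the view from now on (no POPULATE).
CREATE MATERIALIZED VIEW IF NOT EXISTS analytics.daily_counts_mv
TO analytics.daily_counts AS
SELECT toDate(event_time) AS day, action, count() AS hits
FROM analytics.events_local
GROUP BY day, action;

-- Backfill everything before the cutover in a separate, controlled insert.
INSERT INTO analytics.daily_counts
SELECT toDate(event_time) AS day, action, count() AS hits
FROM analytics.events_local
WHERE event_time < '2024-01-01 00:00:00'
GROUP BY day, action;
SQL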
Memory management: Settings like max_memory_usage, max_bytes_before_external_group_by, and max_concurrent_queries interact in complex ways. Getting them wrong can mean OOM kills, query timeouts, or resource starvation.
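For example, a heavy aggregation can be bounded per query rather than relying on server defaults. The values below are illustrative rather than recommendations, the source table is the hypothetical one from the earlier sketch, and max_concurrent_queries is a server-level setting configured separately.
# Cap memory per query and spill GROUP BY state to disk before hitting the cap.
clickhouse-client --query "
    SELECT user_id, count() AS events
    FROM analytics.events_local
    GROUP BY user_id
    SETTINGS max_memory_usage = 8000000000,
             max_bytes_before_external_group_by = 4000000000
"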
System table maintenance: ClickHouse generates extensive system logs that can consume disk space rapidly. Without proper retention policies, system tables can grow larger than your actual data.
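A common mitigation is to put a retention policy on the log tables themselves. This is a sketch of the ad-hoc approach; ClickHouse may recreate a log table when its schema changes, so configuring the TTL in the query_log section of the server config is the more durable option.
# Keep query_log bounded; apply the same idea to other system log tables you retain.
clickhouse-client --query "
    ALTER TABLE system.query_log
    MODIFY TTL event_date + INTERVAL 30 DAY
"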
Tinybird self-managed: Another way to install ClickHouse on your server
It's worth being clear about what Tinybird self-managed is and isn't: Tinybird self-managed is not the same as ClickHouse open-source. The operational model and abstractions are different from managing a raw ClickHouse cluster. While both provide high-performance analytics capabilities, they approach deployment and management quite differently.
Tinybird and ClickHouse share the foundation of high-performance analytics for complex, real-time use cases. Every Tinybird deployment includes a hosted and scalable ClickHouse database. The key difference is that Tinybird provides a layer of abstractions and tooling that changes the development and deployment experience. If you're curious about how Tinybird's approach compares to a traditional ClickHouse setup, the ClickHouse-to-Tinybird migration guide offers a helpful mapping of core concepts.
With Tinybird self-managed regions, instead of managing a raw ClickHouse database cluster, you'll deploy Tinybird's managed ClickHouse service on your infrastructure. The idea is to give you the control and data locality/sovereignty of self-hosting ClickHouse with similar performance characteristics and reduced operational complexity. If that sounds good to you, keep reading.
Benefit for open source companies
Commercial open source companies often choose open source software for their own technology stack - this ensures that users who want to self-host can do so without external dependencies.
Tinybird self-managed gives commercial open source companies the ability to offer a hosted SaaS solution (using Tinybird Cloud as the backend for that hosted service) while still giving open source users a self-managed deployment option.
So, if you're a COSS company interested in ClickHouse, you could consider this hybrid path: using Tinybird Cloud for your hosted service, and offering Tinybird self-managed to your open source users.
What you get vs. what you manage
What Tinybird handles:
- ClickHouse cluster configuration and optimization
- Automated upgrades with testing and rollback capabilities
- Query optimization and automatic routing
- API hosting for application integration
- Ingestion backpressure and batch optimization
- Materialized view lifecycle management
- System monitoring and alerting
- Schema changes and data migrations
What you manage:
- Kubernetes cluster or VM infrastructure
- Network policies and security
- Backup storage configuration
- Resource scaling decisions
The key insight: Tinybird provides the ClickHouse expertise and performance while you maintain infrastructure control.
Technical advantages over raw ClickHouse
Built-in API layer: any ClickHouse SQL query becomes a hosted API endpoint running on your infrastructure, which eliminates the need for an ORM-like interface to the database or building a separate API backend.
Ingestion handling: Tinybird self-managed deployments include HTTP streaming ingestion and native connectors for Kafka, S3, and GCS, eliminating the need for additional ingestion infrastructure.
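As a hedged sketch of what that surface looks like on a self-managed region (the host, data source name, pipe name, and token are all hypothetical), both ingestion and consumption are plain HTTPS calls:
# Hypothetical host, data source, pipe, and token.
TB_HOST="https://tinybird.example.com"
TB_TOKEN="<your-token>"
# Stream NDJSON events into a data source called "events" via the Events API.
curl -X POST "$TB_HOST/v0/events?name=events" \
  -H "Authorization: Bearer $TB_TOKEN" \
  -d '{"event_time":"2024-01-01 00:00:00","user_id":42,"action":"signup"}'
# Query a pipe published as an API endpoint called "daily_counts".
curl "$TB_HOST/v0/pipes/daily_counts.json?token=$TB_TOKEN"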
Operational features that would typically require significant development effort:
- Simplified upgrades managed by Tinybird
- Schema changes and data migrations via tb deploy
- Monitoring and alerting capabilities
The development workflow advantage
Tinybird includes modern development tooling designed to simplify and accelerate feature development and deployment with ClickHouse:
# Install and authenticate
curl https://tinybird.co | sh && tb login
# Local development with Docker
tb local start
# AI-powered project generation
tb create --prompt "Create a user analytics system with session tracking and conversion funnels"
# Hot rebuilds for instant feedback
tb dev
# Setup self-managed region (AWS only, requires prerequisites)
tb infra init
# Connect to self-managed instance
tb login --host https://your-domain.com
# Deploy to self-managed infrastructure
tb --cloud deploy
This workflow bridges the gap between local development and production deployment, something raw ClickHouse makes more complicated.
Implementation guide: How to deploy on Tinybird self-managed
Prerequisites and planning
Before deployment, assess your infrastructure requirements based on the documentation:
Resource requirements:
- Compute: At least 4 vCPUs and 16 GB RAM for development environments
- Storage: At least 100 GB for ClickHouse volume and 10 GB for Redis volume
- Required container paths: /redis-data and /var/lib/clickhouse
- Network: Sufficient bandwidth for expected query/ingestion rates
- Public HTTPS URL: Your deployment must be publicly accessible (e.g., https://tinybird.example.com)
For automated deployment with tb infra init (AWS only):
- AWS CLI with credentials configured
- Terraform CLI and kubectl installed
- EKS cluster with AWS Load Balancer Controller and external-dns
- Route 53 hosted zone for your domain
Deployment process
Important note: Tinybird self-managed regions are designed and recommended for development environments and smaller production workloads. Please contact us if you'd like to discuss additional functionality such as high availability, S3 storage, or anything else about scaling the capacity of your Tinybird self-managed region.
Tinybird offers two deployment approaches:
Option 1: Automated setup with tb infra init (AWS only currently):
# Prerequisites: AWS CLI, kubectl, Terraform CLI configured
# EKS cluster with AWS Load Balancer Controller and external-dns
# Log into Tinybird Cloud
tb login
# Automated setup (prompts for AWS region, DNS zone, etc.)
tb infra init
# Connect to your self-managed instance
tb login --host https://your-domain.com
# Use the instance
tb --cloud workspace ls
Option 2: Manual setup with tb infra add:
# Add the region to Tinybird Cloud
tb infra add
# Provides required environment variables:
# TB_INFRA_TOKEN, TB_INFRA_WORKSPACE, TB_INFRA_ORGANIZATION, TB_INFRA_USER
# Deploy the tblocal container on your infrastructure
# with the provided environment variables
The deployment uses Tinybird's Local container. The container must be publicly accessible via HTTPS and includes the complete Tinybird platform with ClickHouse backend.
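As a rough sketch of that manual path, the container can be run with the environment variables from tb infra add. The image name, internal port, and host volume paths shown here are assumptions to verify against the current self-managed docs, and the container should sit behind your own HTTPS load balancer.
# Sketch only: verify image tag, port, and paths against the self-managed docs.
docker run -d --name tinybird-local \
  -e TB_INFRA_TOKEN="<token>" \
  -e TB_INFRA_WORKSPACE="<workspace>" \
  -e TB_INFRA_ORGANIZATION="<organization>" \
  -e TB_INFRA_USER="<user-email>" \
  -p 80:7181 \
  -v /var/lib/tinybird/clickhouse:/var/lib/clickhouse \
  -v /var/lib/tinybird/redis:/redis-data \
  tinybirdco/tinybird-local:latest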
Post-deployment configuration
Data source setup connects to your existing systems. Tinybird supports direct connections to Kafka, databases via CDC, object storage, and HTTP streaming endpoints.
User management integrates with your existing authentication:
- OIDC/SAML for enterprise authentication
- API key management for service accounts
- Workspace isolation for multi-tenancy
- Row-level security policies
Performance tuning adapts to your workload:
- Resource allocation per service
- Query concurrency limits
- Ingestion batch sizes
- Cache configuration
Upgrade management
Tinybird simplifies the complex ClickHouse upgrade process:
Tinybird handles ClickHouse upgrades within new Docker image releases. Instead of upgrading your ClickHouse cluster manually, you can update the Tinybird Local image, which handles any included ClickHouse upgrades safely alongside the other services within the Tinybird container.
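In practice, an upgrade on a self-managed region looks roughly like pulling the newer image and recreating the container. This sketch assumes the hypothetical docker run example above and that all state lives on the mounted volumes:
# Pull the newer image and recreate the container; data persists on the mounted volumes.
docker pull tinybirdco/tinybird-local:latest
docker stop tinybird-local && docker rm tinybird-local
# ...then re-run the docker run command from the deployment step.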
Conclusion: choosing your ClickHouse deployment strategy
If you want to install your own ClickHouse, you have options. Tinybird self-managed regions provide a path to get ClickHouse's performance benefits while maintaining infrastructure control and reducing operational overhead.
Here's a framework to help think through the options:
Raw ClickHouse might be a good fit if you have deep ClickHouse expertise in-house, can dedicate engineering resources to operations, and cost optimization is a key priority.
Tinybird self-managed could work well if you're looking to balance operational simplicity with infrastructure control, want to avoid building ClickHouse expertise from scratch, and want the added ingestion/API services baked into Tinybird.
Tinybird Cloud offers another path if operational overhead isn't something you want to take on and you're comfortable with data residing in managed cloud regions.
For many teams considering ClickHouse, Tinybird self-managed offers a great balance of control, performance, and operational simplicity. The platform handles the complex parts of ClickHouse operations, adds some additional features, and gives you the deployment flexibility you need.
If you're ready to evaluate Tinybird self-managed for your use case, you don't have to commit to any deployment model just yet. Start developing locally with Tinybird Local. You can use the local dev environment to understand the workflow and validate your use case. Then, when you're ready to deploy, you can choose to set up a self-managed region on your own infrastructure (or deploy to Tinybird Cloud if you prefer). If you need support, contact the Tinybird team or join the Tinybird Slack community to discuss your specific requirements for self-managed deployment.
The goal isn't just to run ClickHouse. It's to run it reliably, efficiently, and without consuming your engineering team's time on operational complexity that doesn't differentiate your product. Tinybird self-managed helps you do just that.
Additional resources
Tinybird Self-Managed Documentation:
- Add a self-managed region manually
- Use the CLI to add a self-managed region (AWS)
- ClickHouse-to-Tinybird migration guide
Related Blog Posts:
- What I learned operating ClickHouse at scale - Part 1
- What I learned operating ClickHouse at scale - Part 2
- Want a managed ClickHouse? Here are some options
Community and Support:
Frequently asked questions
What is Tinybird self-managed?
Tinybird self-managed is a deployable version of Tinybird that runs in your own cloud environment or on-premises infrastructure. It provides the same ClickHouse-powered analytics platform as Tinybird Cloud but gives you complete control over where your data resides and how your infrastructure is configured.
How is Tinybird self-managed different from open source ClickHouse?
While both use ClickHouse as the underlying database, Tinybird self-managed includes additional services like ingestion APIs, query optimization, monitoring, hosted API endpoints, and automated operational features. Open source ClickHouse generally requires you to build and maintain these components yourself.
Is Tinybird self-managed ready for production?
Tinybird self-managed is currently recommended primarily for development environments and smaller production workloads. For larger production deployments, contact Tinybird to discuss scaling capacity and production readiness.
What are the minimum requirements for Tinybird self-managed?
You need at least 4 vCPUs and 16 GB RAM for development environments, with 100 GB for ClickHouse storage and 10 GB for Redis storage. The deployment also requires a publicly accessible HTTPS URL.
How much does Tinybird self-managed cost?
Tinybird self-managed is free. You'll only pay for the underlying infrastructure costs (compute, storage, network) on your cloud provider. You can also purchase an optional support plan if you'd like assistance running Tinybird on your own infrastructure.
Can I migrate from raw ClickHouse to Tinybird self-managed?
Yes, Tinybird provides migration documentation to help you move from raw ClickHouse deployments. The process typically involves data assessment, parallel deployment, gradual migration, and performance validation. Read the ClickHouse-to-Tinybird migration guide for more information.
What cloud providers does Tinybird self-managed support?
Currently, automated deployment with tb infra init is available for AWS only. Manual deployment with tb infra add can work on any cloud provider that supports container deployment.
How do upgrades work with Tinybird self-managed?
Tinybird handles ClickHouse upgrades as part of the managed service. Instead of manually upgrading your ClickHouse cluster, you update the Tinybird Local container image, which safely handles any included ClickHouse upgrades.
Can I use Tinybird self-managed for regulatory compliance?
Yes, Tinybird self-managed is designed for scenarios requiring data residency in specific regions or on-premises infrastructure to meet regulatory requirements like GDPR or HIPAA.
How do I get started with Tinybird self-managed?
Start by installing the Tinybird CLI (curl https://tinybird.co | sh), then run tb local start to create a local development environment. From there, you can set up a self-managed region using either tb infra init (AWS) or tb infra add (manual setup) before deploying with tb --cloud deploy.