CedarDB topped ClickBench recently with faster query times than ClickHouse on several analytical workloads. The result surprised people because ClickBench is ClickHouse's own benchmark, and ClickHouse has dominated analytical database performance for years.
This comparison examines architecture differences, real-world performance patterns, and operational trade-offs between the two systems. You'll learn where each database excels, when benchmark results matter for production applications, and how to choose between them based on your specific requirements.
Why compare ClickHouse and CedarDB?
CedarDB appeared at the top of ClickBench recently, which got people's attention. ClickBench is the analytical database benchmark that ClickHouse maintains, and seeing a new database beat ClickHouse on its own benchmark raises questions.

CedarDB comes from a research team at Technical University of Munich. The system compiles SQL queries directly to machine code and claims to handle both analytical queries (OLAP) and transactional workloads (OLTP) in one database. ClickHouse, by contrast, focuses specifically on analytical queries and has been open source since 2016, after several years of production use at Yandex.
The real question isn't just "which is faster on benchmarks?" but "does CedarDB's performance translate to advantages for real applications?" This comparison looks at architecture, performance patterns, and operational trade-offs to help you decide which makes sense for your use case.
Architecture at a glance
ClickHouse stores data in columns rather than rows, which makes analytical queries faster. When you ask "what's the average order value last month?" ClickHouse only reads the columns you need, skipping everything else. The system was built at Yandex starting in 2009 and has been refined through thousands of production deployments.
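The difference between the two layouts can be sketched in a few lines of Python. This is a toy illustration with made-up order data, not how ClickHouse lays out bytes on disk; the column names and the `to_columns` helper are assumptions for the example.

```python
# Toy illustration of row-oriented vs column-oriented layouts (hypothetical data).
rows = [
    {"order_id": 1, "customer": "a", "order_value": 20.0, "country": "DE"},
    {"order_id": 2, "customer": "b", "order_value": 35.0, "country": "US"},
    {"order_id": 3, "customer": "c", "order_value": 50.0, "country": "FR"},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def average(values):
    return sum(values) / len(values)


columns = to_columns(rows)

# Row layout: every row (with all of its fields) is visited to get one field.
avg_from_rows = average([row["order_value"] for row in rows])

# Column layout: only the order_value column is read; the others are skipped.
avg_from_columns = average(columns["order_value"])
```

Both computations return the same average. The point is that the columnar version never touches `customer` or `country`, which is where the I/O savings come from on wide tables.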
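CedarDB takes a different path. Instead of interpreting SQL queries, it compiles them to native machine code before execution, the approach used by Hyper and by Umbra, the TUM research system CedarDB is based on. (DuckDB, often mentioned alongside them, uses vectorized interpretation rather than compilation.) The compilation step adds overhead but can produce faster execution for complex queries with multiple joins.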
Storage layer
Both systems use columnar storage, but they optimize for different workloads. ClickHouse compresses data aggressively, often achieving 15-20x compression ratios on production datasets, with some cases reaching 100x. The MergeTree table engine appends new data in sorted blocks, then merges blocks in the background.
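The append-then-merge pattern can be sketched roughly as follows. This is a simplified model assuming each insert becomes one sorted, immutable part; the function names are hypothetical and the real MergeTree engine does far more (compression, indexes, part selection).

```python
import heapq


def write_part(rows, key):
    """Sort one inserted batch by the sort key, producing an immutable part."""
    return sorted(rows, key=key)


def merge_parts(parts, key):
    """Merge already-sorted parts into one larger sorted part."""
    return list(heapq.merge(*parts, key=key))


def by_timestamp(row):
    return row[0]


# Two inserts arrive independently; each becomes its own sorted part.
part_a = write_part([(3, "click"), (1, "view")], by_timestamp)
part_b = write_part([(2, "view"), (4, "purchase")], by_timestamp)

# A background merge later combines them without re-sorting from scratch.
merged = merge_parts([part_a, part_b], by_timestamp)
```

Because every part is already sorted, the merge is a linear pass over its inputs, which is why it can run in the background without blocking inserts.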
CedarDB's storage layer handles both column scans and row-level updates. This flexibility comes with trade-offs. The system can't compress as aggressively as ClickHouse because it maintains structures for transactional updates, which increases storage requirements.
Query execution path
ClickHouse processes data in blocks of tens of thousands of rows at a time (roughly 65,000 by default). This vectorized execution amortizes per-row overhead, keeps the working set in CPU caches, and reduces branch mispredictions. The approach works well for scanning large amounts of data quickly.
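A minimal sketch of batch-at-a-time processing, assuming a single integer column and a tiny batch size so the behavior is easy to follow. Real engines use batches of tens of thousands of rows and SIMD instructions; the helper names here are illustrative only.

```python
def batches(values, size):
    """Yield consecutive slices of at most `size` values."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def sum_where_positive(column, batch_size):
    """Filter and aggregate one batch at a time instead of one row at a time."""
    total = 0
    for batch in batches(column, batch_size):
        # One tight loop per batch; a real engine would vectorize this step.
        total += sum(v for v in batch if v > 0)
    return total


column = [5, -2, 7, 0, -1, 3, 8, -4, 6]
result = sum_where_positive(column, batch_size=4)
```

The per-batch loop is the unit of work that the engine dispatches once, so the cost of deciding what to do is paid per batch rather than per row.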
CedarDB compiles each query to machine code before running it. For a query with five joins and three aggregations, compilation might take 50-100 milliseconds, but the compiled code runs faster than interpreted execution. Simple queries don't benefit as much because compilation overhead exceeds any execution speedup.
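The trade-off is a simple break-even calculation. The sketch below uses hypothetical timings (only the 75 ms compile time is taken from the 50-100 ms range above) to show when compilation pays for itself.

```python
def total_time_ms(compile_ms, exec_ms):
    """End-to-end latency for a compiled query: compile once, then run."""
    return compile_ms + exec_ms


def compilation_pays_off(compile_ms, interpreted_ms, compiled_ms):
    """True when compile time plus compiled execution beats interpretation."""
    return total_time_ms(compile_ms, compiled_ms) < interpreted_ms


# Hypothetical complex query: 900 ms interpreted vs 300 ms compiled.
complex_query = compilation_pays_off(compile_ms=75, interpreted_ms=900, compiled_ms=300)

# Hypothetical simple lookup: 20 ms interpreted vs 5 ms compiled.
simple_query = compilation_pays_off(compile_ms=75, interpreted_ms=20, compiled_ms=5)
```

With these assumed numbers the complex query wins (375 ms vs 900 ms) while the simple one loses (80 ms vs 20 ms), which matches the pattern described above.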
Parallelization model
ClickHouse automatically spreads query work across all CPU cores on a machine. For distributed setups, each node processes its local data independently, then sends results to a coordinator node for final aggregation. There's no shared storage between nodes.
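The local-then-coordinator pattern can be sketched with partial aggregation states. This is a simplified model assuming three shards held as in-memory lists; the function names are hypothetical and not part of any ClickHouse API.

```python
def local_partial(values):
    """Each node reduces its own shard to a small (sum, count) state."""
    return (sum(values), len(values))


def coordinator_average(partials):
    """The coordinator combines partial states into the final average."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


# Three nodes, each holding a different slice of the data.
shards = [[10, 20, 30], [40, 50], [60]]
partials = [local_partial(shard) for shard in shards]
avg = coordinator_average(partials)
```

Shipping `(sum, count)` rather than per-shard averages matters: averaging the three shard averages here would give a different, incorrect answer because the shards have different sizes.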
CedarDB focuses on single-machine parallelism with careful thread coordination. The system was designed to maximize throughput on modern servers with 64 or 128 cores, but it doesn't yet offer mature distributed query execution across multiple machines.
ClickBench and other benchmark results
ClickBench measures analytical query performance on a flat table with 100 million rows of web analytics data. The benchmark includes 43 queries covering common patterns like aggregations, string operations, and filters.
CedarDB completes the full ClickBench suite in roughly 60-70% of the time ClickHouse takes on single-node runs. The performance advantage shows up most clearly on queries with complex string operations and multiple aggregations. However, individual query differences are often less than one second, which may not matter for typical dashboards.
ClickBench single-node
Looking at specific query patterns helps understand where each system excels. CedarDB shows lower median latency on aggregations that touch many columns. ClickHouse performs better on queries that benefit from its sparse primary indexes and data skipping capabilities.
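Data skipping with a sparse index can be illustrated with a small sketch. It assumes unique, sorted keys and a tiny granule size; the index stores only the first key of each granule, and `granules_for_range` is a hypothetical helper, not a ClickHouse function.

```python
from bisect import bisect_right

GRANULE_SIZE = 4  # tiny for illustration; real granules hold thousands of rows

# Sorted primary key values: 0, 2, 4, ..., 38 (20 keys, 5 granules).
sorted_keys = list(range(0, 40, 2))

# Sparse index: one entry per granule, holding that granule's first key.
marks = sorted_keys[::GRANULE_SIZE]


def granules_for_range(marks, low, high):
    """Return indexes of granules that may contain keys in [low, high]."""
    first = max(bisect_right(marks, low) - 1, 0)
    last = bisect_right(marks, high) - 1
    return list(range(first, last + 1))


# Only two of the five granules need to be read for this range.
selected = granules_for_range(marks, 10, 18)
```

The index holds 5 entries for 20 rows, and a range query reads only the granules whose key range overlaps the filter. The rest of the table is never scanned.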
Data loading tells another story. ClickHouse ingests the benchmark dataset faster because its append-optimized storage doesn't maintain transactional structures. CedarDB uses more memory during query execution for compiled code and intermediate results.
ClickBench multi-node
ClickHouse scales nearly linearly across multiple nodes for most ClickBench queries. A three-node cluster typically completes the benchmark 2.5x to 2.8x faster than a single node. Adding nodes increases total throughput proportionally.
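To put "nearly linearly" in numbers, scaling efficiency is the measured speedup divided by the node count. A quick calculation with the figures above:

```python
def scaling_efficiency(speedup, nodes):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / nodes


low = scaling_efficiency(2.5, 3)
high = scaling_efficiency(2.8, 3)
```

A 2.5x to 2.8x speedup on three nodes works out to roughly 83% to 93% of ideal linear scaling, with the remainder going to coordination and network overhead.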
CedarDB doesn't currently support distributed queries, so multi-node comparisons aren't possible yet. This matters if your data volume or query concurrency exceeds what one machine can handle, even a large one.
TPC-H benchmark results
TPC-H tests analytical performance with a more complex schema including multiple tables and foreign keys. The benchmark includes 22 queries with multi-way joins, subqueries, and complex aggregations.
CedarDB performs well on join-heavy TPC-H queries where code generation provides clear benefits. ClickHouse handles TPC-H effectively too, though some join patterns run slower than on systems specifically optimized for join processing. Both systems complete the full TPC-H suite in reasonable time.
Streaming ingest throughput
ClickHouse was designed for high-throughput streaming ingestion. Production deployments commonly handle hundreds of thousands to millions of rows per second per node. The system batches incoming data automatically and merges it in the background without blocking queries.
CedarDB supports real-time ingestion but with different trade-offs. ACID transaction support adds overhead that reduces raw ingestion throughput compared to ClickHouse's append-only approach. The system provides stronger consistency guarantees but at the cost of insert performance.
Real-world performance for dashboards and AI workloads
Benchmarks provide useful data points, but production workloads rarely match synthetic tests exactly. Real applications involve concurrent users, varied query patterns, and data that doesn't match benchmark schemas.
High-concurrency dashboards
ClickHouse handles hundreds to thousands of concurrent queries through asynchronous execution and efficient resource management. Query latency increases predictably as concurrency grows, but the system maintains stable performance under sustained load.
CedarDB's compiled execution can provide lower latency for individual queries. Less is known about its behavior under sustained high concurrency because the system is newer with fewer published multi-user benchmarks. The compilation cache helps when multiple users run similar queries.
Both systems support result caching. ClickHouse offers granular control over cache behavior through query settings and materialized views that pre-compute common aggregations.
Vector similarity search
ClickHouse added vector similarity search with specialized functions for computing distances between high-dimensional vectors. The system supports approximate nearest neighbor search using HNSW indexes, now in beta status, which works for AI applications like semantic search and recommendation systems.
CedarDB is building vector search features but currently offers less mature support. If vector similarity search is central to your application, ClickHouse provides more production-ready capabilities today.
Time-series rollups
Both systems handle time-series aggregations efficiently but through different mechanisms. ClickHouse uses materialized views to pre-compute rollups at different time granularities. You define the aggregation once, and ClickHouse maintains it automatically as new data arrives.
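Conceptually, an insert-triggered rollup behaves like the sketch below. It assumes rows of `(timestamp_seconds, value)` and hourly buckets; the `HourlyRollup` class is a hypothetical stand-in for the idea, not ClickHouse's implementation.

```python
from collections import defaultdict


class HourlyRollup:
    """Maintains per-hour counts and sums as new rows arrive."""

    def __init__(self):
        self.counts = defaultdict(int)
        self.sums = defaultdict(float)

    def on_insert(self, rows):
        """Update the rollup incrementally from one inserted batch."""
        for ts_seconds, value in rows:
            hour = ts_seconds - ts_seconds % 3600  # truncate to the hour
            self.counts[hour] += 1
            self.sums[hour] += value


rollup = HourlyRollup()
rollup.on_insert([(10, 1.0), (3599, 2.0)])
rollup.on_insert([(3600, 4.0), (20, 3.0)])
```

Each insert only touches the buckets its rows fall into, so queries against the rollup read a handful of pre-aggregated values instead of rescanning raw events.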
CedarDB computes rollups on demand with good performance due to compiled execution. The system's support for incremental materialization is still developing, so maintaining real-time rollups may require more manual work than ClickHouse's automatic background merges.
Operational complexity and tooling
Running a database in production involves more than query performance. Debugging issues, scaling capacity, and maintaining uptime all require mature tooling and documentation.
Cluster scaling and upgrades
ClickHouse supports online cluster scaling where you add nodes without downtime. Open source ClickHouse does not rebalance existing data onto new shards automatically, though, so resharding is a manual operation that can take hours or days depending on data volume.
CedarDB currently focuses on single-node deployments, which simplifies operations but limits scaling options. Vertical scaling works well up to a point, but eventually you hit hardware limits. The largest AWS instances offer 192 cores and 24TB of memory, which handles substantial workloads but can't scale indefinitely.
Version upgrades in ClickHouse are well-documented, with rolling upgrades supported across minor versions. The project maintains clear compatibility guidelines between major versions.
Observability and alerting
ClickHouse exposes detailed system tables that provide visibility into query execution, resource usage, and cluster health. You query these tables with SQL to build custom monitoring dashboards and alerts.
CedarDB's observability tooling is still maturing. The system provides basic metrics and logging, but the ecosystem of third-party monitoring integrations is smaller than ClickHouse's. ClickHouse integrates with Prometheus, Grafana, and Datadog out of the box.
Security and RBAC
ClickHouse offers role-based access control with permissions at the database, table, and column level. The system supports multiple authentication methods including SQL users, LDAP, and Kerberos.
CedarDB implements ACID transactions with row-level locking, which provides different consistency guarantees than ClickHouse's eventual consistency model. The RBAC system is less feature-complete than ClickHouse's mature access control. Both systems support encrypted connections and data encryption at rest.
Ecosystem, community and vendor options
The size of a database's ecosystem affects how easily you integrate it into your stack. Larger communities mean more answered questions, more third-party tools, and more developers familiar with the technology.
SQL and driver compatibility
ClickHouse implements a large subset of standard SQL with some proprietary extensions for analytical functions. The system provides native drivers for Python, Go, Java, JavaScript, and Rust.
CedarDB aims for PostgreSQL wire protocol compatibility, which means many existing PostgreSQL tools and libraries work without modification. This compatibility simplifies migration from PostgreSQL, though ClickHouse-specific features such as its proprietary SQL extensions aren't available in CedarDB. Both systems support JDBC and ODBC connections.
Third-party integrations
ClickHouse has a mature ecosystem:
- ETL tools: Airbyte, Fivetran, dbt, Apache Kafka, Apache Flink
- BI platforms: Grafana, Tableau, Metabase, Apache Superset, Looker
- Cloud services: AWS Kinesis, Google Pub/Sub, Azure Event Hubs
CedarDB's integration ecosystem is growing but smaller. PostgreSQL compatibility helps because many tools that support PostgreSQL can connect to CedarDB with minimal changes.
Release cadence and community size
ClickHouse releases new versions monthly with active development from both ClickHouse Inc. and the broader community. The project has over 30,000 GitHub stars and thousands of production deployments worldwide.
CedarDB is a newer commercial product from a research team. The system benefits from academic research but has a smaller user base and less public documentation than ClickHouse.
Cost and licensing considerations
Total cost includes not just licensing fees but also hardware, operational overhead, and developer time. These factors often outweigh software license costs.
Hardware efficiency
ClickHouse achieves high performance through efficient CPU, memory, and storage use. Columnar storage and compression typically mean you store more data on less hardware compared to row-oriented databases.
CedarDB also uses hardware efficiently, particularly CPU cycles through compiled query execution. The system's dual optimization for analytical and transactional workloads may require more memory than ClickHouse's specialized analytical engine.
Storage costs matter for large datasets. ClickHouse's aggressive compression often reduces storage requirements by 10x or more, with users reporting 600GB reduced to 35GB, which directly lowers cloud storage bills.
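The reported 600GB to 35GB reduction can be turned into a ratio with a short calculation:

```python
def compression_ratio(raw_gb, compressed_gb):
    """How many times smaller the compressed data is."""
    return raw_gb / compressed_gb


def space_saved_fraction(raw_gb, compressed_gb):
    """Fraction of the original footprint that compression removes."""
    return 1 - compressed_gb / raw_gb


ratio = compression_ratio(600, 35)
saved = space_saved_fraction(600, 35)
```

That works out to a ratio of about 17x, or roughly 94% less storage to pay for.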
Managed service pricing
ClickHouse Cloud offers managed hosting with usage-based pricing. You pay for compute resources and storage separately, which allows independent scaling of each dimension.
CedarDB is available as a commercial product with pricing that hasn't been publicly disclosed. Managed service options are still developing.
Tinybird provides managed ClickHouse with a focus on developer experience and API creation, handling infrastructure management and scaling automatically.
Hidden DevOps costs
Self-hosting ClickHouse requires expertise in database operations, cluster management, and performance tuning. Many organizations underestimate the ongoing time investment for maintaining production database infrastructure.
CedarDB's single-node focus reduces some operational complexity compared to distributed ClickHouse clusters. You still handle backups, monitoring, security patches, and capacity planning. Managed services eliminate most DevOps overhead but add monthly costs.
When to choose one database over the other
The right choice depends on your specific requirements, team capabilities, and long-term plans. Neither database is universally better.
Choose ClickHouse when you need a proven system for analytical queries on large datasets with mature tooling and a large community. ClickHouse works well when your workload is primarily append-only with infrequent updates and you might need to scale horizontally across multiple nodes.
Choose CedarDB when you want to experiment with a newer system that unifies analytical and transactional workloads. CedarDB makes sense when you need frequent row-level updates alongside analytical queries and can work within single-node scaling limits.
| Factor | ClickHouse | CedarDB |
|---|---|---|
| Maturity | Open source since 2016, widely deployed in production | New, launched 2024 |
| Community | Large, active open source community | Small, commercial product |
| Scaling | Horizontal and vertical | Primarily vertical |
| Updates | Optimized for append-only | Supports frequent updates |
| Ecosystem | Extensive integrations | Growing, PostgreSQL compatible |
For most analytical workloads, ClickHouse offers a safer choice with proven scalability and a larger support ecosystem. CedarDB represents an interesting ClickHouse alternative if its specific features align with your requirements and you're comfortable with newer technology.
How managed ClickHouse on Tinybird removes the ops burden
Tinybird provides managed ClickHouse infrastructure that removes the operational complexity. The platform handles provisioning, scaling, backups, and monitoring automatically so developers focus on building applications rather than managing infrastructure.
The service includes features designed for application developers integrating ClickHouse into their backends. Tinybird provides streaming data ingestion, SQL-based data transformation pipelines, and automatically generated REST APIs from your queries. If you prefer direct database access from your application, see our guides for Java and Python.
Developers define data sources and queries as code using .datasource and .pipe files, which enables version control and CI/CD workflows.
To get started with Tinybird, sign up for a free account and follow the quickstart guide. You can have a working API backed by ClickHouse running in minutes rather than days of infrastructure setup.
Frequently asked questions about ClickHouse vs CedarDB
Does CedarDB support vector indexes for similarity search?
CedarDB is developing vector search capabilities but currently offers less mature support than ClickHouse. ClickHouse provides vector indexes based on the HNSW algorithm for approximate nearest neighbor search, currently in beta, which matters for AI applications like semantic search and recommendation engines.
Can I migrate my ClickHouse tables to CedarDB without reloading data?
No direct migration tools exist between ClickHouse and CedarDB. Moving data requires exporting from ClickHouse, typically as CSV or Parquet files, and importing into CedarDB. You'll likely need schema adjustments because the systems use different table engines and optimization strategies.
What is the maturity of CedarDB backup and restore tooling?
CedarDB's backup and restore tools are still developing. ClickHouse offers mature backup solutions including incremental backups, point-in-time recovery, and integration with object storage like S3. Additional backup tooling is available from vendors like Altinity and ClickHouse Inc.
Is CedarDB open source or source-available?
CedarDB follows a commercial licensing model rather than open source. ClickHouse uses the Apache License 2.0 for its open source version, which allows free use, modification, and distribution. ClickHouse Inc. also offers commercial licenses with additional features.
How do both databases handle schema changes on large tables?
ClickHouse supports online schema changes through ALTER TABLE commands that modify metadata without rewriting data for many operations. CedarDB also supports schema evolution, but its approach differs due to its transactional architecture. The feature set is still expanding compared to ClickHouse's mature ALTER operations.
