One of the things I enjoy most about working at Tinybird is that contributing to ClickHouse® is part of the job. We track experimental features, test them against real workloads, and when we find bugs, we fix them upstream.
That's what happened with the Alias table engine. We found two bugs: one in how DDL dependencies are tracked, and another where inserts through an Alias silently failed to trigger materialized views. We shipped a fix for the first one and collaborated on the second.
What the Alias engine does
You create an alias table with:
CREATE TABLE my_alias ENGINE = Alias('target_table');
Any query or operation on my_alias is forwarded to target_table, without any data being copied.
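As a minimal sketch of that forwarding, assuming a ClickHouse® build that ships the experimental Alias engine. The name target_table comes from the statement above; the columns and values are made up for illustration.

```sql
-- The Alias engine is experimental, so it has to be enabled for the session.
SET allow_experimental_alias_table_engine = 1;

-- Hypothetical target table; the schema is only for illustration.
CREATE TABLE target_table
(
    id UInt64,
    value String
)
ENGINE = MergeTree
ORDER BY id;

-- The alias declares no columns of its own.
CREATE TABLE my_alias ENGINE = Alias('target_table');

-- Writes through the alias are expected to land in target_table...
INSERT INTO my_alias VALUES (1, 'a'), (2, 'b');

-- ...so reads through either name should return the same rows.
SELECT count() FROM target_table;
SELECT count() FROM my_alias;
```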
Why it's interesting for us
It will simplify how Deployments work. A Deployment is what we call the process that applies changes to a workspace in Tinybird Forward. Our users can update their schemas and queries in arbitrary ways and we ensure that their data stays consistent during the migration, with zero downtime, and with convenient staging and discard mechanisms so they can validate the final state of their data. It's one of our proudest and most differentiating features and, as such, it gets a lot of use. Changing the sorting keys of a bunch of tables connected by materialized views takes our users a single command and frees them to work on what really matters for the products they build on Tinybird.
During Deployments, Tinybird routes ingestion and reads to different tables and views to ensure data integrity without downtime while schemas are being updated and data rewritten. Currently, we rely on our application layer for this routing. Pushing it down to ClickHouse® will simplify our code and eliminate a bunch of failure modes, such as components in our stack holding stale or inconsistent write targets at a given time.
This feature is still experimental (SET allow_experimental_alias_table_engine = 1), but it solves a real set of problems we care about. So we're doing our best to help test it and push it forward.
The bugs
Dependency tracking
ClickHouse® has a setting called check_referential_table_dependencies. When enabled, it prevents you from dropping a table that other tables or views depend on. If you have a materialized view reading from table T1 and writing to table T2, the setting prevents you from dropping T1 or T2 while the materialized view exists. With an Alias table, it should prevent you from dropping its target.
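The materialized view case can be sketched as follows. The names t1, t2 and mv are hypothetical, and the comments describe the behavior expected from the description above rather than captured output.

```sql
-- Hypothetical source and destination tables.
CREATE TABLE t1 (id UInt64) ENGINE = MergeTree ORDER BY id;
CREATE TABLE t2 (id UInt64) ENGINE = MergeTree ORDER BY id;

-- The view reads from t1 and writes to t2, so it depends on both.
CREATE MATERIALIZED VIEW mv TO t2 AS SELECT id FROM t1;

SET check_referential_table_dependencies = 1;

-- With the setting enabled, both drops are expected to be rejected while mv exists.
DROP TABLE t1;
DROP TABLE t2;

-- Dropping the view first removes the dependency, so the tables can go afterwards.
DROP VIEW mv;
DROP TABLE t1;
DROP TABLE t2;
```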
And that's precisely what wasn't working as expected. When an Alias table is created without a fully qualified name (Alias(target_table) instead of Alias(target_database.target_table)), the target table is looked up in the alias's database, not in the session database. But the dependency tracking mechanism was using the current session database, which broke the dependency safety checks. The following statement illustrates this:
CREATE TABLE db_one.my_alias ENGINE = Alias('target_table');
Here, target_table is resolved to db_one.target_table. But the dependency tracking mechanism was resolving it to current_database.target_table, using whatever database the session happened to be connected to.
So if you ran this from a session connected to db_two, the dependency was recorded as db_two.target_table instead of db_one.target_table. And check_referential_table_dependencies couldn't prevent you from dropping db_one.target_table, because it didn't know the alias depended on it.
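A reproduction along these lines should show the difference. This is a sketch: it assumes a client session where USE switches the current database, and the single-column schema is made up. The database and table names follow the example above.

```sql
SET allow_experimental_alias_table_engine = 1;

CREATE DATABASE db_one;
CREATE DATABASE db_two;

-- Hypothetical schema for the target table.
CREATE TABLE db_one.target_table (id UInt64) ENGINE = MergeTree ORDER BY id;

-- Connect the session to a different database than the one the alias lives in.
USE db_two;

-- The unqualified target is resolved by the engine to db_one.target_table.
CREATE TABLE db_one.my_alias ENGINE = Alias('target_table');

SET check_referential_table_dependencies = 1;

-- Before the fix: the dependency was recorded against db_two.target_table,
-- so nothing blocked this drop and the alias was left without a target.
-- After the fix: the drop is expected to be rejected, because
-- db_one.my_alias is known to depend on db_one.target_table.
DROP TABLE db_one.target_table;
```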
The fix was small. The core change is one line:
// Before:
addQualifiedNameFromArgument(table_engine, 0);
// After:
addQualifiedNameFromArgumentUsingTableDatabase(table_engine, 0);
The new method addQualifiedNameFromArgumentUsingTableDatabase resolves unqualified table names using the database of the table being created instead of the session database. This matches the actual behavior of the Alias engine. After this change, you can rely on check_referential_table_dependencies to save you from ending up with an Alias with no target, which would break your ingestion and queries.
Materialized Views from Alias tables
In our testing, we also found a bug in how materialized views were triggered (not triggered, in fact) by inserts to Alias tables.
┌──────┐          Inserts to Alias were
│Alias │          not triggering MV
└──────┘
   │              Data was written to Table1
   ▼              but not to Table2
┌──────┐        ┌──────┐        ┌──────┐
│Table1│───────▶│  MV  │───────▶│Table2│
└──────┘        └──────┘        └──────┘
For this one, I proposed a fix to start the conversation with nauu, the core contributor who had been working on the Alias engine. After discussing tradeoffs and our different use cases for the new engine, they ended up implementing a solution that covers all of those use cases. Thanks, nauu, for being receptive to our feedback and for implementing the solution.
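To check the behavior in the diagram on a given build, a setup like the following should be enough. It is a sketch: the lowercase table names mirror the diagram, the single-column schema is made up, and the comments state what the description above leads us to expect rather than captured output.

```sql
SET allow_experimental_alias_table_engine = 1;

-- Hypothetical schema mirroring the diagram: table1 -> mv -> table2.
CREATE TABLE table1 (id UInt64) ENGINE = MergeTree ORDER BY id;
CREATE TABLE table2 (id UInt64) ENGINE = MergeTree ORDER BY id;
CREATE MATERIALIZED VIEW mv TO table2 AS SELECT id FROM table1;

-- The alias sits in front of the view's source table.
CREATE TABLE my_alias ENGINE = Alias('table1');

-- Insert through the alias instead of directly into table1.
INSERT INTO my_alias VALUES (1), (2), (3);

-- The rows should be in table1 in either case.
SELECT count() FROM table1;

-- With the bug, table2 stayed empty because mv never fired.
-- With the fix, it should hold the same rows as table1.
SELECT count() FROM table2;
```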
Contributing upstream
Neither of these bugs is a big deal in isolation. But you only find them by actually running experimental features against real workloads, which is what we do. We've been evaluating the Alias engine for how Tinybird handles routing during deployments, and these bugs surfaced during that work.
We've contributed to ClickHouse® before (JOIN support for parallel replicas, query optimizations) and we'll keep doing it. We have our own fork so we can be more agile rolling out fixes, and for improvements to features in which our strategy diverges from ClickHouse, Inc.'s. But we still upstream all bug fixes that can benefit other ClickHouse® users.
We take pride in how our product makes ClickHouse® accessible to any developer at any scale, without them having to deal with infra or setup complexity. So the work requires a strong product mindset, empathy towards other engineers who use Tinybird, and a keen eye for good DevEx. But you also get to dive into the internals of ClickHouse®, and even contribute to it. If you also enjoy work like this, we're hiring.
