PricingDocs
Bars

Data Platform

Managed ClickHouse
Production-ready with Tinybird's DX
Streaming ingestion
High-throughput streaming ingest
Schema iteration
Safe migrations with zero downtime
Connectors
Plug and play Kafka, S3, and GCS

Developer Experience

Instant SQL APIs
Turn SQL into an endpoint
BI & Tool Connections
Connect your BI tools and ORMs
Tinybird Code
Ingest and query from your terminal

Enterprise

Tinybird AI
AI resources for LLMs and agents
High availability
Fault-tolerance and auto failovers
Security and compliance
Certified SOC 2 Type II for enterprise
Sign inSign up
Product []

Data Platform

Managed ClickHouse
Production-ready with Tinybird's DX
Streaming ingestion
High-throughput streaming ingest
Schema iteration
Safe migrations with zero downtime
Connectors
Plug and play Kafka, S3, and GCS

Developer Experience

Instant SQL APIs
Turn SQL into an endpoint
BI & Tool Connections
Connect your BI tools and ORMs
Tinybird Code
Ingest and query from your terminal

Enterprise

Tinybird AI
AI resources for LLMs and agents
High availability
Fault-tolerance and auto failovers
Security and compliance
Certified SOC 2 Type II for enterprise
PricingDocs
Resources []

Learn

Blog
Musings on transformations, tables and everything in between
Customer Stories
We help software teams ship features with massive data sets
Videos
Learn how to use Tinybird with our videos
ClickHouse for Developers
Understand ClickHouse with our video series

Build

Templates
Explore our collection of templates
Tinybird Builds
We build stuff live with Tinybird and our partners
Changelog
The latest updates to Tinybird

Community

Slack Community
Join our Slack community to get help and share your ideas
Open Source Program
Get help adding Tinybird to your open source project
Schema > Evolution
Join the most read technical biweekly engineering newsletter

Our Columns:

Skip the infra work. Deploy your first ClickHouse
project now

Get started for freeRead the docs
A geometric decoration with a matrix of rectangles.

Product /

ProductWatch the demoPricingSecurityRequest a demo

Company /

About UsPartnersShopCareers

Features /

Managed ClickHouseStreaming IngestionSchema IterationConnectorsInstant SQL APIsBI & Tool ConnectionsTinybird CodeTinybird AIHigh AvailabilitySecurity & Compliance

Support /

DocsSupportTroubleshootingCommunityChangelog

Resources /

ObservabilityBlogCustomer StoriesTemplatesTinybird BuildsTinybird for StartupsRSS FeedNewsletter

Integrations /

Apache KafkaConfluent CloudRedpandaGoogle BigQuerySnowflakePostgres Table FunctionAmazon DynamoDBAmazon S3

Use Cases /

User-facing dashboardsReal-time Change Data Capture (CDC)Gaming analyticsWeb analyticsReal-time personalizationUser-generated content (UGC) analyticsContent recommendation systemsVector search
All systems operational

Copyright © 2025 Tinybird. All rights reserved

|

Terms & conditionsCookiesTrust CenterCompliance Helpline
Tinybird wordmark
PricingDocs
Bars

Data Platform

Managed ClickHouse
Production-ready with Tinybird's DX
Streaming ingestion
High-throughput streaming ingest
Schema iteration
Safe migrations with zero downtime
Connectors
Plug and play Kafka, S3, and GCS

Developer Experience

Instant SQL APIs
Turn SQL into an endpoint
BI & Tool Connections
Connect your BI tools and ORMs
Tinybird Code
Ingest and query from your terminal

Enterprise

Tinybird AI
AI resources for LLMs and agents
High availability
Fault-tolerance and auto failovers
Security and compliance
Certified SOC 2 Type II for enterprise
Sign inSign up
Product []

Data Platform

Managed ClickHouse
Production-ready with Tinybird's DX
Streaming ingestion
High-throughput streaming ingest
Schema iteration
Safe migrations with zero downtime
Connectors
Plug and play Kafka, S3, and GCS

Developer Experience

Instant SQL APIs
Turn SQL into an endpoint
BI & Tool Connections
Connect your BI tools and ORMs
Tinybird Code
Ingest and query from your terminal

Enterprise

Tinybird AI
AI resources for LLMs and agents
High availability
Fault-tolerance and auto failovers
Security and compliance
Certified SOC 2 Type II for enterprise
PricingDocs
Resources []

Learn

Blog
Musings on transformations, tables and everything in between
Customer Stories
We help software teams ship features with massive data sets
Videos
Learn how to use Tinybird with our videos
ClickHouse for Developers
Understand ClickHouse with our video series

Build

Templates
Explore our collection of templates
Tinybird Builds
We build stuff live with Tinybird and our partners
Changelog
The latest updates to Tinybird

Community

Slack Community
Join our Slack community to get help and share your ideas
Open Source Program
Get help adding Tinybird to your open source project
Schema > Evolution
Join the most read technical biweekly engineering newsletter

Skip the infra work. Deploy your first ClickHouse
project now

Get started for freeRead the docs
A geometric decoration with a matrix of rectangles.

Product /

ProductWatch the demoPricingSecurityRequest a demo

Company /

About UsPartnersShopCareers

Features /

Managed ClickHouseStreaming IngestionSchema IterationConnectorsInstant SQL APIsBI & Tool ConnectionsTinybird CodeTinybird AIHigh AvailabilitySecurity & Compliance

Support /

DocsSupportTroubleshootingCommunityChangelog

Resources /

ObservabilityBlogCustomer StoriesTemplatesTinybird BuildsTinybird for StartupsRSS FeedNewsletter

Integrations /

Apache KafkaConfluent CloudRedpandaGoogle BigQuerySnowflakePostgres Table FunctionAmazon DynamoDBAmazon S3

Use Cases /

User-facing dashboardsReal-time Change Data Capture (CDC)Gaming analyticsWeb analyticsReal-time personalizationUser-generated content (UGC) analyticsContent recommendation systemsVector search
All systems operational

Copyright © 2025 Tinybird. All rights reserved

|

Terms & conditionsCookiesTrust CenterCompliance Helpline
Tinybird wordmark
PricingDocs
Bars

Data Platform

Managed ClickHouse
Production-ready with Tinybird's DX
Streaming ingestion
High-throughput streaming ingest
Schema iteration
Safe migrations with zero downtime
Connectors
Plug and play Kafka, S3, and GCS

Developer Experience

Instant SQL APIs
Turn SQL into an endpoint
BI & Tool Connections
Connect your BI tools and ORMs
Tinybird Code
Ingest and query from your terminal

Enterprise

Tinybird AI
AI resources for LLMs and agents
High availability
Fault-tolerance and auto failovers
Security and compliance
Certified SOC 2 Type II for enterprise
Sign inSign up
Product []

Data Platform

Managed ClickHouse
Production-ready with Tinybird's DX
Streaming ingestion
High-throughput streaming ingest
Schema iteration
Safe migrations with zero downtime
Connectors
Plug and play Kafka, S3, and GCS

Developer Experience

Instant SQL APIs
Turn SQL into an endpoint
BI & Tool Connections
Connect your BI tools and ORMs
Tinybird Code
Ingest and query from your terminal

Enterprise

Tinybird AI
AI resources for LLMs and agents
High availability
Fault-tolerance and auto failovers
Security and compliance
Certified SOC 2 Type II for enterprise
PricingDocs
Resources []

Learn

Blog
Musings on transformations, tables and everything in between
Customer Stories
We help software teams ship features with massive data sets
Videos
Learn how to use Tinybird with our videos
ClickHouse for Developers
Understand ClickHouse with our video series

Build

Templates
Explore our collection of templates
Tinybird Builds
We build stuff live with Tinybird and our partners
Changelog
The latest updates to Tinybird

Community

Slack Community
Join our Slack community to get help and share your ideas
Open Source Program
Get help adding Tinybird to your open source project
Schema > Evolution
Join the most read technical biweekly engineering newsletter
Back to Blog
Share this article:
Back

ClickHouse Tips #4: Materializing UNION ALL queries

Learn how to materialize queries on ClickHouse when there are UNION ALL operations involved, and when you can use SimpleAggregateFunction instead of AggregateFunction. Part 4.
The Data Base
Elena Torró
Elena TorróSoftware Developer

The problem

There are times when you have a query that involves UNION ALL statements and you’d like to materialize it. If you try to do it on ClickHouse, it will fail. The solution is to create two materialized views that write to the same table. In this post we’ll explain how to do it on ClickHouse and on Tinybird.

We’ll also use SimpleAggregateFunction(function, type). It is a data type that stores the value of the given aggregate function without storing the full state as AggregateFunction does. According to the documentation, “it can be applied to functions for which the following property holds: the result of applying a function f to a row set S1 UNION ALL S2 can be obtained by applying f to parts of the row set separately, and then again applying f to the results: f(S1 UNION ALL S2) = f(f(S1) UNION ALL f(S2))”. What does this mean?

Imagine that function is sum, one set is [1, 3, 7] and another set is [2, 4, 6]. Doing sum(sum(1, 3, 7), sum(2, 4, 6)) will yield the same result as doing sum(1, 3, 7, 2, 4, 6). So if we want to use sum in a materialize view, we can store the result in a SimpleAggregationFunction(sum, type) data type. Other functions where this happen are any, anyLast, min, max, and more.

Therefore, SimpleAggregateFunctions data types will be useful, for instance, when using a data source to store the results of two materialized views in order to perform the same output as doing an “UNION ALL” of two tables, but faster.

A practical example

Let’s see first an example of what we would like to achieve. In this example, we’ll be using Unsplash open data.

We have will be using two source tables: collections and keywords. The first one contains information about every time a photo has been added to a collection, and the second one about the keywords used to describe each photo.

Our purpose is to create an enpoint to get the last time a photo was added into a collection and some information about this collection along with the complete list of keywords for this photo. In order to explore the data and get an idea of what we want to accomplish, we will be doing something similar to the following aggregation over the result of an UNION ALL:

How can we achieve the same result but using materialized views? We can not materialize the result of an UNION ALL, but we can make use of the AggregatingMergeTree engine.

First of all, let’s create the destination table, photo_last_collected:

What should we take into account here:

  • Every SimpleAggregateFunction using groupArrayArray must have Array type, and can’t be Nullable. The groupArray function can’t contain null values.
  • We could also use groupUniqArrayArray in order to avoid duplicated keywords

Now, let’s create two materialized views, one per table. This view materializes the photo id and the keywords that belong to each photo to the photo_last_collected Data Source:

And the following view materializes the photo id, the last time it was added to a collection, the collection id and the collection title that belong to each photo to the photo_last_collected Data Source:

Key points:

  • As a rule of thumb, column names must have the same name and must be inserted in the same order into the destination Data Source
  • The keywords_list can not simply be an empty array, we must cast it correctly to Array(String), because that’s the data type of the destination column
  • last_time_collected_at, collection_id and collection_title can’t either be Null, we have to cast them to the exact type of the column in the landing Data Source they’re being saved in

If you create the Tinybird Data Sources and Pipe files under a project with this structure (the folder structure is already created when you do tb init with our CLI):

You’d just have to run tb push --push-deps --populate --wait to create the Data Source on Tinybird, the materialized views, and populate the Data Source.

Then, you’d only have to get the data already joined from the photo_last_collected destination table.

As you can see, the result is the same as from the first query, and it’s about 6x faster.

In Tinybird's Data Flow graph you can see how the collections and keywords photos are materialize data into the photo_last_collected Data Source
In Tinybird's Data Flow graph you can see how the collections and keywords photos are materialize data into the photo_last_collected Data Source

A similar result would have been obtained if we had used AggregateFunction when defining the photo_last_collected Data Source. Using SimpleAggregateFunction has several advantages:

  • As SimpleAggregateFunction just saves the value instead of the full state of its aggregate function (like AggregateFunction does), it’s more performant.
  • It lets us write the cleaner and simpler queries, as we don’t have to use the -State suffixes for the Materialized Views and we can just query the landing datasource doing SELECT * FROM photo_last_collected. Otherwise we’d have to use the -Merge suffix to merge the intermediate states of the aggregated columns.
Do you like this post? Spread it!

Skip the infra work. Deploy your first ClickHouse
project now

Get started for freeRead the docs
A geometric decoration with a matrix of rectangles.
Tinybird wordmark

Product /

ProductWatch the demoPricingSecurityRequest a demo

Company /

About UsPartnersShopCareers

Features /

Managed ClickHouseStreaming IngestionSchema IterationConnectorsInstant SQL APIsBI & Tool ConnectionsTinybird CodeTinybird AIHigh AvailabilitySecurity & Compliance

Support /

DocsSupportTroubleshootingCommunityChangelog

Resources /

ObservabilityBlogCustomer StoriesTemplatesTinybird BuildsTinybird for StartupsRSS FeedNewsletter

Integrations /

Apache KafkaConfluent CloudRedpandaGoogle BigQuerySnowflakePostgres Table FunctionAmazon DynamoDBAmazon S3

Use Cases /

User-facing dashboardsReal-time Change Data Capture (CDC)Gaming analyticsWeb analyticsReal-time personalizationUser-generated content (UGC) analyticsContent recommendation systemsVector search
All systems operational

Copyright © 2025 Tinybird. All rights reserved

|

Terms & conditionsCookiesTrust CenterCompliance Helpline

Related posts

The Data Base
Feb 25, 2021
ClickHouse tips #3: the transform function
Javi Santana
Javi SantanaCo-founder
1ClickHouse tips #3: the transform function
The Data Base
Mar 24, 2021
ClickHouse tips #5: Adding and subtracting intervals
Alberto Romeu
Alberto RomeuSoftware Engineer
1ClickHouse tips #5: Adding and subtracting intervals
The Data Base
Apr 23, 2021
ClickHouse tips #6: Filtering data in subqueries to avoid joins
Xoel López
Xoel LópezFounder at TheirStack
1ClickHouse tips #6: Filtering data in subqueries to avoid joins
The Data Base
Feb 10, 2021
ClickHouse tips #2: Debugging ClickHouse on Visual Studio Code
Alberto Romeu
Alberto RomeuSoftware Engineer
1ClickHouse tips #2: Debugging ClickHouse on Visual Studio Code
The Data Base
Jun 01, 2021
ClickHouse tips #8: Generating time-series on ClickHouse
Xoel López
Xoel LópezFounder at TheirStack
1ClickHouse tips #8: Generating time-series on ClickHouse
The Data Base
Jan 26, 2021
Clickhouse Tips #1: Calculating Aggregations After a Given Date
Javi Santana
Javi SantanaCo-founder
1Clickhouse Tips #1: Calculating Aggregations After a Given Date
The Data Base
May 18, 2021
ClickHouse tips #7: Forward and backfilling null values
Alejandra Rodriguez
Alejandra Rodriguez
1ClickHouse tips #7: Forward and backfilling null values
The Data Base
Jul 08, 2021
ClickHouse tips #9: Filling gaps in time-series on ClickHouse
Xoel López
Xoel LópezFounder at TheirStack
1ClickHouse tips #9: Filling gaps in time-series on ClickHouse
The Data Base
Feb 16, 2021
Clickhouse, Open Source and Tinybird
Javi Santana
Javi SantanaCo-founder
1Clickhouse, Open Source and Tinybird
The Data Base
Sep 22, 2021
ClickHouse tips #11: Best way to get query types
Tinybird
TinybirdTeam
1ClickHouse tips #11: Best way to get query types