---
title: "ClickHouse® tips #3: the transform function"
excerpt: "Using the transform function to join two tables when joinGet is not available. Part 3."
authors: "Javi Santana"
categories: "The Data Base"
createdOn: "2023-09-25 00:00:00"
publishedOn: "2021-02-25 00:00:00"
updatedOn: "2022-09-08 00:00:00"
status: "published"
---

<p>Clickhouse® has a powerful feature, <a href="https://clickhouse.tech/docs/en/engines/table-engines/special/join/"><code>JOIN</code></a> engines, that allows us to prepare a table to be joined with better performance than a regular table (MergeTree, Log…). It also allows to use <a href="https://clickhouse.tech/docs/en/sql-reference/functions/other-functions/#joinget"><code>joinGet</code></a> to get table values using a key.</p><p>Somtimes you don’t have a JOIN table but you’d like to use something with the joinGet performance. Unfortunately, you can’t use joinGet with something created on the fly.</p><p>However there is a way to do that, using <a href="https://clickhouse.tech/docs/en/sql-reference/functions/other-functions/#transformx-array-from-array-to-default">transform</a></p><p>The basic structure would be like this</p><figure class="kg-card kg-embed-card"><iframe loading="lazy" width="100%" src="https://snippets.tinybird.co/endpoint?code=WITH%20%28%0A%20%20%20%20SELECT%20%28groupArray%28key%29%2C%20groupArray%28value%29%29%20FROM%20my_table%0A%29%20AS%20key_value%0ASELECT%20transform%28column%2C%20key_value.1%2C%20key_value.2%29%20FROM%20other_table%0A&amp;language=Tinybird&amp;title=transform_example&amp;run=p.eyJ1IjogIjdmOTIwMmMzLWM1ZjctNDU4Ni1hZDUxLTdmYzUzNTRlMTk5YSIsICJpZCI6ICI3N2NiMDljOS1hYjVjLTQ2MzYtYTA4Yi00ZDYwNDQwMzUyZTYifQ.XhZPypxjhaeJbdJzNozOKmKPbFqM9CWkAil8WiXFaAE&amp;token="></iframe></figure><h3 id="a-real-life-example">A real-life example</h3><p>Given a <code>sales</code> table like this with 1M rows (download the csv <a href="https://storage.googleapis.com/tinybird-assets/datasets/tips/tips-3/sales.csv">here</a>):</p><figure class="kg-card kg-embed-card"><iframe loading="lazy" width="100%" src="https://snippets.tinybird.co/endpoint?code=SELECT%20%2A%20FROM%20sales%0A&amp;language=Tinybird&amp;title=sales.csv&amp;run=run&amp;token=p.eyJ1IjogIjdmOTIwMmMzLWM1ZjctNDU4Ni1hZDUxLTdmYzUzNTRlMTk5YSIsICJpZCI6ICI3N2NiMDljOS1hYjVjLTQ2MzYtYTA4Yi00ZDYwNDQwMzUyZTYifQ.XhZPypxjhaeJbdJzNozOKmKPbFqM9CWkAil8WiXFaAE"></iframe></figure><p>and a <code>exchange_rate</code> table like this (CSV <a href="https://storage.googleapis.com/tinybird-assets/datasets/tips/tips-3/exchange_rate.csv">here</a>), with daily data for 3 years:</p><figure class="kg-card kg-embed-card"><iframe loading="lazy" width="100%" src="https://snippets.tinybird.co/endpoint?code=SELECT%20%2A%20FROM%20exchange_rate%0A&amp;language=Tinybird&amp;title=exchange_rate.csv&amp;run=run&amp;token=p.eyJ1IjogIjdmOTIwMmMzLWM1ZjctNDU4Ni1hZDUxLTdmYzUzNTRlMTk5YSIsICJpZCI6ICI3N2NiMDljOS1hYjVjLTQ2MzYtYTA4Yi00ZDYwNDQwMzUyZTYifQ.XhZPypxjhaeJbdJzNozOKmKPbFqM9CWkAil8WiXFaAE"></iframe></figure><p>Imagine you want to get the total amount of sales in dollars per country. You could do it with a join like this if the data is small:</p><figure class="kg-card kg-embed-card"><iframe loading="lazy" width="100%" src="https://snippets.tinybird.co/endpoint?code=SELECT%20country%2C%20SUM%28amount_in_local_currency%20%2A%20rate_to_dollar%29%20dollars%0AFROM%20%28%0A%20%20%20%20SELECT%20%2A%20FROM%20sales%20s%0A%20%20%20%20JOIN%20exchange_rate%20r%0A%20%20%20%20ON%20s.country%20%3D%3D%20r.country%20AND%20toDate%28s.timestamp%29%20%3D%20r.date%0A%29%0AGROUP%20BY%20country%0AORDER%20BY%20country%0A&amp;language=Tinybird&amp;title=query_with_join&amp;run=run&amp;token=p.eyJ1IjogIjdmOTIwMmMzLWM1ZjctNDU4Ni1hZDUxLTdmYzUzNTRlMTk5YSIsICJpZCI6ICI3N2NiMDljOS1hYjVjLTQ2MzYtYTA4Yi00ZDYwNDQwMzUyZTYifQ.XhZPypxjhaeJbdJzNozOKmKPbFqM9CWkAil8WiXFaAE"></iframe></figure><p>But if the data is big, using <code>transform</code> would have a better performance. You’d do something like this:</p><figure class="kg-card kg-embed-card"><iframe loading="lazy" width="100%" src="https://snippets.tinybird.co/endpoint?code=WITH%20%0A%28%0A%20%20SELECT%20%28groupArray%28cityHash64%28date%2Ccountry%29%29%2C%20groupArray%28rate_to_dollar%29%29%20%0A%20%20from%20exchange_rate%0A%29%20AS%20key_value%0ASELECT%20%20%0A%20%20country%2C%0A%20%20sum%28amount_in_local_currency%2Atransform%28cityHash64%28toDate%28timestamp%29%2C%20country%29%2C%20key_value.1%2C%20key_value.2%2C%200.0%29%29%20AS%20dollars%0AFROM%20sales%0AGROUP%20BY%20country%0AORDER%20BY%20country%0A&amp;language=Tinybird&amp;title=query_with_transform&amp;run=run&amp;token=p.eyJ1IjogIjdmOTIwMmMzLWM1ZjctNDU4Ni1hZDUxLTdmYzUzNTRlMTk5YSIsICJpZCI6ICI3N2NiMDljOS1hYjVjLTQ2MzYtYTA4Yi00ZDYwNDQwMzUyZTYifQ.XhZPypxjhaeJbdJzNozOKmKPbFqM9CWkAil8WiXFaAE"></iframe></figure><h3 id="run-it-yourself">Run it yourself</h3><p>We’ve created two CSVs with fake data and those schemas. If you want to replicate the queries above in your account, create the data sources with this command:</p><figure class="kg-card kg-embed-card"><iframe loading="lazy" width="100%" src="https://snippets.tinybird.co/endpoint?code=TOKEN%3Dyour_admin_token%0A%0Acurl%20%5C%20%0A%20%20-H%20%22Authorization%3A%20Bearer%20%24TOKEN%22%20%5C%20%20%20%20%0A%20%20-X%20POST%20%5C%0A%20%20-d%20url%3D%27https%3A%2F%2Fstorage.googleapis.com%2Ftinybird-assets%2Fdatasets%2Ftips%2Ftips-3%2Fsales.csv%27%20%5C%0A%20%20https%3A%2F%2Fapi.tinybird.co%2Fv0%2Fdatasources%0A%0A%0Acurl%20%5C%20%0A%20%20-H%20%22Authorization%3A%20Bearer%20%24TOKEN%22%20%5C%0A%20%20-X%20POST%20%5C%0A%20%20-d%20url%3D%27https%3A%2F%2Fstorage.googleapis.com%2Ftinybird-assets%2Fdatasets%2Ftips%2Ftips-3%2Fexchange_rate.csv%27%20%5C%0A%20%20https%3A%2F%2Fapi.tinybird.co%2Fv0%2Fdatasources%0A%0A&amp;language=shell&amp;title=create_datasources&amp;run=&amp;token="></iframe></figure>
