Choosing the ENGINE_PARTITION_KEY¶
Partitioning is not intended to speed up SELECT queries (the ENGINE_SORTING_KEY key is sufficient to make range queries fast), partitions are intended for data manipulation. If the number of partitions is huge, the SELECT query will run slower if it has to look in many partitions.
Leaving empty the ENGINE_PARTITION_KEY. If you don’t have it clear beforehand or the table is relatively small (a few gigabytes), leave it empty. The data will be placed in a single partition.
Using a date column. Depending on the filter, you can choose more or less granularity based on your needs.
toYear(date_column)should work fine most of the times
Using a column to handle a controlled amount of different values to partition depending on how you consume and insert data.
ENGINE_PARTITION_KEY "event_type % 8"
If you have doubts about choosing a partition key for your use case, contact us at firstname.lastname@example.org.
Using Materialized Views¶
When aggregating data, it’s useful to save the results in a Materialized View, and then use the Materialized View in your queries.
Materialized Views work as insert triggers over the left most table in the Materialized Node SQL query, that means when new data is appended to the left most table the Materialized Node query is executed. As a rule of thumb, use aggregate functions (
avg, …) and
GROUP BY expressions over the left most table in the topmost node or query being materialized.