MergeTree engine

The MergeTree engine is designed for high data ingest rates and huge data volumes. It is the default engine for Data Sources.

Key features:

  • Append-only writes: Data is ingested in immutable parts, which are later compacted in the background.
  • Automatic background merges: Tinybird handles merging of parts automatically, improving query performance and reducing storage overhead over time.
  • Scalable storage layout: Data is stored in a columnar format and split into parts and granules to support efficient reads at scale.
  • Best-effort ordering: Data is stored roughly in the order defined by the sorting key, enabling faster range-based filtering.
  • Supports partitioning: Ingested data can be split into logical partitions, which are used for background operations like TTL expiration and deletes.

Creating a MergeTree Data Source

Here's how to define a MergeTree data source:

SCHEMA >
    `event_date` Date `json:$.date`,
    `event_type` LowCardinality(String) `json:$.event_type`,
    `user_id` UInt64 `json:$.user_id`,
    `payload` JSON `json:$.payload`

[ENGINE "MergeTree"]
[ENGINE_SORTING_KEY "event_type, event_date, user_id"]
[ENGINE_PARTITION_KEY "toYYYYMM(event_date)"]
[ENGINE_PRIMARY_KEY "event_type, event_date"]
[ENGINE_TTL "event_date + INTERVAL 30 DAY"]
[ENGINE_SETTINGS index_granularity=8192]

[INDEXES >
    index_name index_expression TYPE index_type GRANULARITY index_granularity]

Engine Settings

  • ENGINE_SORTING_KEY - (Optional, but highly recommended for query performance). Defaults to tuple().

    • Defines how data is physically ordered within each part of the Data Source.
    • Sorting happens at the granule level (blocks of 8192 rows), not by individual row values.
    • Use this to accelerate filtering and improve scan efficiency.

    Tips:

    • Choose columns that appear early in your most common query filters.
    • Example: If filtering by event_type and event_date, use ENGINE_SORTING_KEY = (event_type, event_date).

  • ENGINE_PARTITION_KEY - (Optional) When left empty, Tinybird will write ingested data to one partition.

    • Controls how ingested data is logically split into partitions, which serve as the minimum unit for background merges, TTL-based expiration, data deletions and populates.
    • Partition pruning ensures partitions are omitted from reading when the query allows it.

    Tips:

    • Unlike the SORTING_KEY, Partitions do not drastically improve query speed.
    • Use a time-based expression like toYYYYMM(event_date) for monthly partitions.
    • Limit inserts to no more than 12 partitions to avoid TOO_MANY_PARTS errors.

  • ENGINE_PRIMARY_KEY - (Optional) Defines the primary key for the table. Defaults to the value of ENGINE_SORTING_KEY.

    • Specifies the primary key used for deduplication and index building.
    • Can be a subset of the sorting key to reduce index size while maintaining sort order benefits.

    Tips:

    • Useful when a long sorting key is needed (e.g. for aggregation or deduplication). A shorter primary key helps keep the index size smaller.

  • ENGINE_TTL - (Optional)
    • Defines rules for data expiration.
    • Must evaluate to a Date or DateTime, for example "event_date + INTERVAL 1 DAY".

  • ENGINE_SETTINGS - (Optional)
    • Custom settings that tune the MergeTree behavior, defined below.
    • Example: index_granularity=8192 sets the size of granules, aligning with sorting key behavior.

  • INDEXES > - (Optional)
    • Define additional secondary indexes to improve filter performance on non-sorting key columns.

Sorting Keys and Indexes in Queries

Take the "event_type, event_date" sorting key as an example. In this case, the sorting and index can be illustrated as follows:

      Whole data:     [-------------------------------------------------------------------------]
      event_type:     [aaaaaaaaaaaaaaaaaabbbbcdeeeeeeeeeeeeefgggggggghhhhhhhhhiiiiiiiiikllllllll]
      event_date:     [1111111222222233331233211111222222333211111112122222223111112223311122333]
      Marks:           |      |      |      |      |      |      |      |      |      |      |
                      a,1    a,2    a,3    b,3    e,2    e,3    g,1    h,2    i,1    i,3    l,3
      Marks numbers:   0      1      2      3      4      5      6      7      8      9      10

If the data query specifies:

  • event_type in ('a', 'h'), the server reads the data in the ranges of marks [0, 3) and [6, 8).
  • event_type in ('a', 'h') AND event_date = 3, the server reads the data in the ranges of marks [1, 3) and [7, 8).
  • event_date = 3, the server reads the data in the range of marks [1, 10].

The examples above show that it's always more effective to use an index than a full scan.

MergeTree Engine Mechanics

In Tinybird, ingested data is written into partitions, which are defined by the ENGINE_PARTITION_KEY. Each insert operation generates one or more data parts, with one part per partition represented in the batch. A Data Source is therefore composed of many such parts, each belonging to a specific partition.

Using very granular partitions (e.g., by exact timestamp or user ID) can lead to an excessive number of small parts being written, which slows down inserts and hinders the engine’s ability to merge efficiently. Partition by a time-based column (year or month) when possible.

Within each part, rows are sorted lexicographically according to the ENGINE_SORTING_KEY. For example, if the sorting key is (event_type, event_date), rows are ordered first by event_type, then by event_date within each event_type. This ordering plays a key role in query efficiency—order matters.

To optimize performance and storage, the MergeTree engine runs a background merge process that consolidates small parts into larger ones within the same partition. Parts from different partitions are never merged. These merges help reduce storage overhead and improve read performance, but:

  • They do not guarantee that all rows with the same sorting or primary key end up in the same part.

  • They are automatic and depend on factors like server load and part size.

  • They cannot be triggered manually or precisely controlled.

Each data part is further divided into granules, which are the smallest unit of data Tinybird reads during queries. Granules:

  • Always contain whole rows (never split across granules).

  • Hold between 1 and index_granularity rows (default is typically 8192).

  • Start with a row whose sorting key is saved as a mark in an index file.

Tinybird generates an index file for each data part, storing these marks per granule for every column, not just those in the sorting or primary key. This indexing structure allows Tinybird to efficiently skip irrelevant granules and read only the necessary data from column files during query execution.

Data Skipping Indexes

SCHEMA >
    `event_date` Date `json:$.date`,
    `event_type` LowCardinality(String) `json:$.event_type`,
    `user_id` UInt64 `json:$.user_id`,
    `payload` JSON `json:$.payload`

ENGINE "MergeTree"
ENGINE_SORTING_KEY "event_type, event_date, user_id"

INDEXES >
    index_name index_expression TYPE index_type GRANULARITY index_granularity

For tables from the MergeTree family, data skipping indices can be specified.

These indices compute and store aggregate information about a specified expression across blocks of data. Each block contains a number of granules, where the number of granules per block is defined by granularity_value. The size of each granule is determined by the index_granularity setting in the table engine.

During SELECT queries, these precomputed aggregates help reduce data read from storage by allowing the engine to skip large blocks of data when it's clear that the WHERE clause conditions cannot be met.

You can omit the GRANULARITY clause; if not specified, granularity_value defaults to 1.

Example

SCHEMA >
    u64 UInt64,
    i32 Int32,
    s String

INDEXES >
    INDEX idx1 u64 TYPE bloom_filter GRANULARITY 3,
    INDEX idx2 u64 * i32 TYPE minmax GRANULARITY 3,
    INDEX idx3 u64 * length(s) TYPE set(1000) GRANULARITY 4

Indices from the example can be used by Tinybird to reduce the amount of data read from storage in the following queries:

SELECT count() FROM ds WHERE u64 == 10
SELECT count() FROM ds WHERE u64 * i32 >= 1234
SELECT count() FROM ds WHERE u64 * length(s) == 1234

Data skipping indexes can also be created on composite columns:

-- on columns of type Map:
INDEX map_key_index mapKeys(map_column) TYPE bloom_filter
INDEX map_value_index mapValues(map_column) TYPE bloom_filter

-- on columns of type Tuple:
INDEX tuple_1_index tuple_column.1 TYPE bloom_filter
INDEX tuple_2_index tuple_column.2 TYPE bloom_filter

-- on columns of type Nested:
INDEX nested_1_index col.nested_col1 TYPE bloom_filter
INDEX nested_2_index col.nested_col2 TYPE bloom_filter

Available Types of Indices

MinMax

Stores extremes of the specified expression (if the expression is tuple, then it stores extremes for each element of tuple), uses stored info for skipping blocks of data like the primary key.

Syntax: minmax

Set

Stores unique values of the specified expression (no more than max_rows rows, max_rows=0 means “no limits”). Uses the values to check if the WHERE expression isn't satisfiable on a block of data.

Syntax: set(max_rows)

Bloom Filter

Stores a Bloom filter for the specified columns. An optional false_positive parameter with possible values between 0 and 1 specifies the probability of receiving a false positive response from the filter. Default value: 0.025. Supported data types: Int*, UInt*, Float*, Enum, Date, DateTime, String, FixedString, Array, LowCardinality, Nullable, UUID and Map. For the Map data type, the client can specify if the index should be created for keys or values using mapKeys or mapValues function.

Syntax: bloom_filter([false_positive])

N-gram Bloom Filter

Stores a Bloom filter that contains all n-grams from a block of data. Only works with datatypes: String, FixedString and Map. Can be used for optimization of EQUALS, LIKE and IN expressions.

Syntax: ngrambf_v1(n, size_of_bloom_filter_in_bytes, number_of_hash_functions, random_seed)

  • n: ngram size,
  • size_of_bloom_filter_in_bytes: Bloom filter size in bytes (you can use large values here, for example, 256 or 512, because it can be compressed well).
  • number_of_hash_functions: The number of hash functions used in the Bloom filter.
  • random_seed: The seed for Bloom filter hash functions.

Token Bloom Filter

The same as ngrambf_v1, but stores tokens instead of ngrams. Tokens are sequences separated by non-alphanumeric characters.

Syntax: tokenbf_v1(size_of_bloom_filter_in_bytes, number_of_hash_functions, random_seed)

Functions Support

Conditions in the WHERE clause contains calls of the functions that operate with columns. If the column is a part of an index, Tinybird tries to use this index when performing the functions. Tinybird supports different subsets of functions for using indexes.

Indexes of type set can be utilized by all functions. The other index types are supported as follows:

Function (operator) / Indexprimary keyminmaxngrambf_v1tokenbf_v1bloom_filter
equals (=, ==)
notEquals(!=, <>)
like
notLike
match
startsWith
endsWith
multiSearchAny
in
notIn
less (<)
greater (>)
lessOrEquals (<=)
greaterOrEquals (>=)
empty
notEmpty
has
hasAny
hasAll
hasToken
hasTokenOrNull
hasTokenCaseInsensitive (*)
hasTokenCaseInsensitiveOrNull (*)

Functions with a constant argument that is less than ngram size can’t be used by ngrambf_v1 for query optimization.

(*) For hasTokenCaseInsensitive and hasTokenCaseInsensitiveOrNull to be effective, the tokenbf_v1 index must be created on lowercased data, for example INDEX idx (lower(str_col)) TYPE tokenbf_v1(512, 3, 0).

Bloom filters can have false positive matches, so the ngrambf_v1, tokenbf_v1, and bloom_filter indexes can't be used for optimizing queries where the result of a function is expected to be false.

For example:

  • Can be optimized:
    • s LIKE '%test%'
    • NOT s NOT LIKE '%test%'
    • s = 1
    • NOT s != 1
    • startsWith(s, 'test')
  • Cannot be optimized:
    • NOT s LIKE '%test%'
    • s NOT LIKE '%test%'
    • NOT s = 1
    • s != 1
    • NOT startsWith(s, 'test')

Settings

For a list of supported settings, see Engine settings.

Updated