---
title: "ClickHouse® ReplacingMergeTree examples and best use cases"
excerpt: "ClickHouse ReplacingMergeTree example that handles deduplication correctly. Most implementations miss critical settings. Don't."
authors: "Cameron Archer"
categories: "AI Resources"
createdOn: "2025-10-14 19:28:27"
publishedOn: "2025-10-16 19:28:27"
updatedOn: "2025-10-16 19:28:27"
status: "published"
---

When you insert updated records into ClickHouse{% sup %}®{% /sup %}, you often want to keep only the latest version of each row rather than accumulating duplicates. The ReplacingMergeTree table engine handles this automatically by removing duplicate rows during background merge operations, keeping either the most recently inserted row or the one with the highest version number.

This article explains how ReplacingMergeTree works, when to use it, and how to configure it correctly with working examples for common patterns like CDC ingestion, user state tracking, and streaming data deduplication.

## What is a ReplacingMergeTree and when to use it

ReplacingMergeTree is a ClickHouse{% sup %}®{% /sup %} table engine that removes duplicate rows with the same sorting key during background merge operations. The engine keeps only one row from each group of duplicates, typically the row with the highest version number or the most recently inserted row.

You'll find this engine useful when maintaining [current state tables](https://www.tinybird.co/blog-posts/when-to-use-columnar-database) where each record represents the latest version of an entity. For example, if you're tracking user profiles and receive updates throughout the day, ReplacingMergeTree automatically keeps only the newest profile data for each user. The same pattern works well for product catalogs, order statuses, or any scenario where you care about current state rather than full history.

The [deduplication happens during merge operations](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replacingmergetree/), not when you insert data. This means you might see duplicate rows in query results until ClickHouse{% sup %}®{% /sup %} completes a merge in the background.
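
A quick way to see this behavior is with a throwaway table (names and values here are illustrative). Each `INSERT` creates a separate data part, so both rows remain visible until a merge runs:

```sql
CREATE TABLE dedup_demo (
    id UInt64,
    value String
) ENGINE = ReplacingMergeTree()
ORDER BY id;

INSERT INTO dedup_demo VALUES (1, 'first');
INSERT INTO dedup_demo VALUES (1, 'second');

-- Both rows may appear here: each INSERT created its own part
SELECT * FROM dedup_demo;

-- Force a merge, after which only the last-inserted row remains
OPTIMIZE TABLE dedup_demo FINAL;
SELECT * FROM dedup_demo;
```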

## How row versioning and deduplication work

When ClickHouse{% sup %}®{% /sup %} merges data parts in a ReplacingMergeTree table, it looks for rows with identical values in all columns specified by the `ORDER BY` clause. For each group of matching rows, the engine picks one to keep and discards the rest.

The engine decides which row to keep using one of two methods:

* **Version column:** If you specify a version column when creating the table, ClickHouse{% sup %}®{% /sup %} keeps the row with the highest value in that column
* **Insertion order:** Without a version column, the engine keeps the last inserted row based on when data arrived

The timing of merges depends on ClickHouse{% sup %}®{% /sup %}'s internal scheduling and how you insert data. Smaller, more frequent inserts create more parts that need merging, while larger batch inserts create fewer parts.

## Create table syntax with minimal and advanced parameters

The basic syntax for [creating a ReplacingMergeTree table](https://www.tinybird.co/blog-posts/clickhouse-create-table-example) uses `ENGINE = ReplacingMergeTree()` with an `ORDER BY` clause that defines your deduplication key.

### Minimal example

The simplest ReplacingMergeTree table uses only the `ORDER BY` clause to define which columns identify unique rows:

```sql
CREATE TABLE user_activity (
    user_id UInt64,
    activity_type String,
    event_time DateTime,
    session_id String
) ENGINE = ReplacingMergeTree()
ORDER BY (user_id, activity_type);
```

This table deduplicates based on the combination of `user_id` and `activity_type`. When multiple rows have the same values for both columns, ClickHouse{% sup %}®{% /sup %} keeps the last inserted row during merges.

### Adding a version column

For more control over which row to keep, you can specify a version column as a parameter to the engine:

```sql
CREATE TABLE user_activity (
    user_id UInt64,
    activity_type String,
    event_time DateTime,
    session_id String,
    version UInt64
) ENGINE = ReplacingMergeTree(version)
ORDER BY (user_id, activity_type);
```

Now ClickHouse{% sup %}®{% /sup %} keeps the row with the highest `version` value for each unique combination of `user_id` and `activity_type`. This works well when you have explicit version numbers from your source system, like CDC sequence numbers from PostgreSQL.
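
For example, using the `user_activity` table above, the row with the higher version wins even when it arrives first (values are illustrative):

```sql
INSERT INTO user_activity VALUES (42, 'login', '2025-10-01 09:05:00', 'sess_b', 2);
INSERT INTO user_activity VALUES (42, 'login', '2025-10-01 09:00:00', 'sess_a', 1);

-- After a merge (or with FINAL), only the version = 2 row remains
SELECT * FROM user_activity FINAL WHERE user_id = 42;
```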

### Using a delete flag column for soft deletes

For scenarios where you want to mark rows as deleted rather than physically removing them, ReplacingMergeTree supports an optional `is_deleted` column as its second parameter:

```sql
CREATE TABLE user_activity (
    user_id UInt64,
    activity_type String,
    event_time DateTime,
    session_id String,
    version UInt64,
    is_deleted UInt8
) ENGINE = ReplacingMergeTree(version, is_deleted)
ORDER BY (user_id, activity_type);
```

The `is_deleted` column must be `UInt8`, with `0` for active rows and `1` for deleted rows. If the row with the highest version for a key has `is_deleted = 1`, queries with `FINAL` exclude that key, and `OPTIMIZE TABLE ... FINAL CLEANUP` can physically remove the rows (behind the `allow_experimental_replacing_merge_with_cleanup` setting in recent versions). Note that the `1`/`-1` sign convention belongs to the CollapsingMergeTree engine, not ReplacingMergeTree.
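
A minimal end-to-end sketch of this soft-delete flow (the table and values are illustrative; recent ClickHouse{% sup %}®{% /sup %} versions name the flag column `is_deleted` and expect `UInt8` values `0`/`1`):

```sql
CREATE TABLE accounts (
    account_id UInt64,
    status String,
    version UInt64,
    is_deleted UInt8
) ENGINE = ReplacingMergeTree(version, is_deleted)
ORDER BY account_id;

INSERT INTO accounts VALUES (1, 'active', 1, 0);
-- A later "delete" arrives as a regular insert with the flag set
INSERT INTO accounts VALUES (1, 'closed', 2, 1);

-- FINAL keeps the highest-version row per key and filters deleted rows,
-- so account 1 no longer appears
SELECT * FROM accounts FINAL;
```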

## Choosing the right ORDER BY and version columns

The `ORDER BY` clause in a ReplacingMergeTree table serves two purposes: it defines the physical sorting of data on disk and acts as the deduplication key. Choosing the right columns for this clause directly affects both query performance and deduplication behavior.

### Single primary key

For tables where each row represents a unique entity, use a single column in your `ORDER BY` clause:

```sql
CREATE TABLE user_profiles (
    user_id UInt64,
    email String,
    name String,
    updated_at DateTime,
    version UInt64
) ENGINE = ReplacingMergeTree(version)
ORDER BY user_id;
```

This pattern works well for maintaining current state tables where each user, product, or order has exactly one active record at any time.

### Composite key

When your business logic requires multiple columns to identify a unique row, use a composite key:

```sql
CREATE TABLE user_sessions (
    user_id UInt64,
    session_id String,
    start_time DateTime,
    end_time DateTime,
    version UInt64
) ENGINE = ReplacingMergeTree(version)
ORDER BY (user_id, session_id);
```

The order of columns in the composite key matters for query performance. As with any MergeTree table, place lower-cardinality columns first: this improves compression and lets the primary index prune data effectively for filters on the leading columns.

### High-cardinality caution

Avoid using high-cardinality columns like timestamps or UUIDs in your `ORDER BY` clause. A high-cardinality column is one with many unique values, like a timestamp that changes every second or a UUID that's different for every row. When you use columns like this in your `ORDER BY` clause, you create too many unique combinations, which prevents effective deduplication and slows down merge performance.

Instead, use stable identifiers that truly represent your business key, and store high-cardinality values as regular columns.
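
As a sketch of this guidance (the table is illustrative), keep the timestamp out of the key and deduplicate on the stable identifiers:

```sql
CREATE TABLE page_views (
    user_id UInt64,
    page_path String,
    last_seen DateTime,   -- high-cardinality, stored as a regular column
    version UInt64
) ENGINE = ReplacingMergeTree(version)
ORDER BY (user_id, page_path);  -- stable business key, not (user_id, last_seen)
```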

## Querying the latest row state without surprises

A regular `SELECT` query against a ReplacingMergeTree table returns all rows, including duplicates, until ClickHouse{% sup %}®{% /sup %} completes background merge operations. You'll see duplicate rows in your results because ClickHouse{% sup %}®{% /sup %} stores data in multiple parts on disk, and deduplication only happens when those parts merge together.

### SELECT FINAL

The `FINAL` modifier forces ClickHouse{% sup %}®{% /sup %} to deduplicate rows at query time, guaranteeing you see only the latest version of each row:

```sql
SELECT
    user_id,
    activity_type,
    event_time,
    session_id
FROM user_activity
FINAL
WHERE user_id = 12345;
```

This query returns deduplicated results immediately, regardless of whether background merges have completed. However, `FINAL` adds overhead because ClickHouse{% sup %}®{% /sup %} performs deduplication logic for every query execution. The performance impact grows with table size and the number of data parts, though proper partitioning can [improve performance by 7×](https://altinity.com/blog/clickhouse-replacingmergetree-explained-the-good-the-bad-and-the-ugly).
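
If your table is partitioned and a given deduplication key never spans partitions (for example, updates for a key always land in the same month), you can reduce `FINAL` overhead with the `do_not_merge_across_partitions_select_final` setting, which lets partitions be processed independently:

```sql
SELECT user_id, activity_type, event_time
FROM user_activity FINAL
WHERE user_id = 12345
SETTINGS do_not_merge_across_partitions_select_final = 1;
```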

### Materialized views without FINAL

A better approach for frequently accessed deduplicated data uses [materialized views](https://www.tinybird.co/blog-posts/what-are-materialized-views-and-why-do-they-matter-for-realtime) to pre-compute the latest state:

```sql
CREATE MATERIALIZED VIEW user_activity_latest
ENGINE = AggregatingMergeTree()
ORDER BY (user_id, activity_type)
AS SELECT
    user_id,
    activity_type,
    argMaxState(event_time, version) AS latest_event_time,
    argMaxState(session_id, version) AS latest_session_id
FROM user_activity
GROUP BY user_id, activity_type;
```

This materialized view maintains deduplicated state automatically as new data arrives. Queries against the view run fast because the deduplication work happens during insert time rather than query time.
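
Because the view stores intermediate aggregate states (`argMaxState`), reading it requires the matching `-Merge` combinator:

```sql
SELECT
    user_id,
    activity_type,
    argMaxMerge(latest_event_time) AS event_time,
    argMaxMerge(latest_session_id) AS session_id
FROM user_activity_latest
GROUP BY user_id, activity_type;
```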

## When to run select final or optimize final

The decision to use `FINAL` or trigger manual optimization depends on your latency requirements and query patterns.

### Latency tradeoffs

`SELECT FINAL` adds query overhead but guarantees consistency for every read. This approach makes sense for low-frequency queries where you need guaranteed deduplication, like administrative dashboards or compliance reports. For high-throughput query workloads, the overhead of `FINAL` becomes too expensive.

You can accept eventual consistency and let background merges handle deduplication naturally. Most analytics use cases tolerate seeing duplicate rows for a few minutes until merges complete.
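
A common middle ground deduplicates at query time with `GROUP BY` and `argMax`, which often costs less than `FINAL` on large tables:

```sql
SELECT
    user_id,
    activity_type,
    argMax(event_time, version) AS event_time,
    argMax(session_id, version) AS session_id
FROM user_activity
GROUP BY user_id, activity_type;
```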

### Automated nightly optimize

For batch processing scenarios where you need deduplicated data at specific times, schedule `OPTIMIZE TABLE FINAL` to force merges:

```sql
OPTIMIZE TABLE user_activity FINAL;
```

This command merges all parts immediately and completes all pending deduplication. You might run this nightly before generating daily reports, or before exporting data to downstream systems. Keep in mind that `OPTIMIZE TABLE FINAL` is resource-intensive and can consume significant CPU and I/O.
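
If the table is partitioned, you can cap that cost by optimizing one partition at a time (the partition value here is illustrative):

```sql
-- Merge only October 2025 for a table partitioned by toYYYYMM(event_time)
OPTIMIZE TABLE user_activity PARTITION 202510 FINAL;
```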

## Performance tuning and common pitfalls

ReplacingMergeTree performance depends heavily on how you structure inserts and configure merge behavior.

### Too many small parts

Frequent small inserts create many small data parts on disk. ClickHouse{% sup %}®{% /sup %} has to track and merge all of them, which adds overhead to both merge operations and queries. You can monitor part count with this query:

```sql
SELECT
    table,
    count() AS parts_count,
    sum(rows) AS total_rows
FROM system.parts
WHERE active AND table = 'user_activity'
GROUP BY table;
```

If you see hundreds or thousands of active parts, batch your inserts into larger groups. Aim for inserts of at least 10,000 rows, and preferably 100,000 or more for high-volume tables. For real-world production insights on managing parts and merges at scale, see our experience [operating ClickHouse{% sup %}®{% /sup %} clusters](https://www.tinybird.co/blog-posts/what-i-learned-operating-clickhouse).
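
You can also watch merge activity directly in `system.merges` to confirm that parts are being compacted:

```sql
SELECT table, elapsed, progress, num_parts
FROM system.merges
WHERE table = 'user_activity';
```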

### Large partitions

The `PARTITION BY` clause affects merge efficiency and `FINAL` query performance. Large partitions with millions of rows take longer to merge and deduplicate at query time. With year-based partitioning, queries can process [2.35× faster](https://clickhouse.com/docs/en/guides/replacing-merge-tree) compared to no partitioning. Monthly partitions work well for most use cases:

```sql
CREATE TABLE user_activity (
    user_id UInt64,
    activity_type String,
    event_time DateTime,
    version UInt64
) ENGINE = ReplacingMergeTree(version)
PARTITION BY toYYYYMM(event_time)
ORDER BY (user_id, activity_type);
```

This creates one partition per month, which keeps partition size manageable while still allowing efficient pruning for time-range queries.
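
A time-range filter on the partitioning column lets ClickHouse{% sup %}®{% /sup %} skip whole partitions; for example, this query only needs to read the October 2025 partition:

```sql
SELECT count()
FROM user_activity
WHERE event_time >= '2025-10-01' AND event_time < '2025-11-01';
```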

## Streaming and CDC patterns with ReplacingMergeTree

ReplacingMergeTree works particularly well for ingesting data from streaming sources and [change data capture systems](https://www.tinybird.co/blog-posts/real-time-change-data-capture). The engine's deduplication behavior naturally handles updates and overwrites from streaming data.

### Kafka engine ingestion

The [Kafka table engine](https://www.tinybird.co/blog-posts/event-sourcing-with-kafka) feeds data directly into ReplacingMergeTree tables for real-time deduplication:

```sql
CREATE TABLE user_activity_kafka (
    user_id UInt64,
    activity_type String,
    event_time DateTime,
    version UInt64
) ENGINE = Kafka()
SETTINGS
    kafka_broker_list = 'localhost:9092',
    kafka_topic_list = 'user_activity',
    kafka_group_name = 'clickhouse_consumer',
    kafka_format = 'JSONEachRow';

CREATE MATERIALIZED VIEW user_activity_mv TO user_activity AS
SELECT * FROM user_activity_kafka;
```

This pattern continuously ingests from Kafka and writes to your ReplacingMergeTree table. The materialized view handles the data flow automatically. This architecture powers many use cases, from user behavior tracking to [real-time sentiment analysis with Kafka Streams](https://www.tinybird.co/blog-posts/real-time-sentiment-analysis-with-kafka-streams-sb).

### Upserts from PostgreSQL CDC

[Change data capture from PostgreSQL](https://www.tinybird.co/blog-posts/postgres-cdc) or other OLTP databases produces a stream of inserts, updates, and deletes. ReplacingMergeTree handles this stream naturally when you use transaction IDs or log sequence numbers (LSNs) as version columns:

```sql
CREATE TABLE products (
    product_id UInt64,
    name String,
    price Decimal(10, 2),
    updated_at DateTime,
    postgres_lsn UInt64,
    is_deleted UInt8
) ENGINE = ReplacingMergeTree(postgres_lsn, is_deleted)
ORDER BY product_id;
```

Your CDC tool inserts each change event with the LSN as the version. Updates overwrite previous versions, and deletes insert a row with `is_deleted = 1`.
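
To read the current catalog, a `FINAL` query returns the highest-LSN row per product, with delete markers handled by the engine's delete-flag semantics:

```sql
SELECT product_id, name, price
FROM products FINAL
ORDER BY product_id;
```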

## Faster time to production with Tinybird

[Tinybird](https://tinybird.co) handles the infrastructure complexity of running ReplacingMergeTree tables in production. The platform manages merge optimization, monitoring, and scaling automatically so you can focus on your application logic rather than cluster operations. If you're evaluating options for running ClickHouse{% sup %}®{% /sup %} in production, check out the [Tinybird vs ClickHouse{% sup %}®{% /sup %} Cloud cost comparison](https://www.tinybird.co/blog/tinybird-vs-clickhouse-cloud-cost-comparison) to understand how managed platforms differ in pricing and features.

When you deploy a ReplacingMergeTree table to Tinybird, the service handles part management, merge scheduling, and resource allocation. The platform provides built-in observability for tracking merge activity, part counts, and deduplication status through the Tinybird console.

[Sign up for a free Tinybird account](https://cloud.tinybird.co/signup) to try ReplacingMergeTree without managing infrastructure. The platform includes managed ingestion from Kafka, PostgreSQL, and other sources, plus hosted API endpoints for querying your deduplicated data.

## FAQs about ReplacingMergeTree

### Can I query ReplacingMergeTree without using FINAL?

Yes, regular `SELECT` queries work but may return duplicate rows until background merges complete. Use `FINAL` only when you need guaranteed deduplication, keeping in mind the performance overhead.

### How do I backfill historical data without breaking merges?

With a version column, insertion order doesn't matter: the row with the highest version wins regardless of when it arrives. Assign proper version numbers to historical records, and batch large backfills (for example with `INSERT INTO ... SELECT`) to avoid creating too many small parts.
