TOO_MANY_PARTS ClickHouse error¶
This error occurs when a ClickHouse table has too many parts, which can cause performance issues and system instability. It's common with high-frequency inserts or improper partitioning strategies.
The TOO_MANY_PARTS error in ClickHouse (and Tinybird) happens when a table accumulates too many parts, causing performance degradation, increased memory usage, and system instability. This typically occurs with high-frequency insert operations, improper partitioning strategies, or when background merge operations can't keep up with the insertion rate.
What causes this error¶
You'll typically see it when:
- High-frequency INSERT operations that create many small parts
- Too many concurrent insert operations
- Background merges that are too slow or under-resourced to keep up with the insert rate
- Partitioning by high-cardinality columns, which creates too many partitions
- Misconfigured MergeTree engine or merge settings
ClickHouse raises this error when the number of active parts in a single partition exceeds the parts_to_throw_insert MergeTree setting (300 by default in older versions, higher in recent releases). Too many parts significantly degrade query performance, so implement proper partitioning and monitor merge operations.
Example errors¶
Fails: too many parts in table
INSERT INTO events (user_id, event_type, timestamp)
VALUES (123, 'click', '2024-01-01 10:00:00')
-- Error: TOO_MANY_PARTS
Fails: high-frequency inserts
-- Each INSERT statement creates a new part; issued at high
-- frequency, these outpace background merges
INSERT INTO metrics (timestamp, value) VALUES ('2024-01-01 10:00:00', 100);
INSERT INTO metrics (timestamp, value) VALUES ('2024-01-01 10:00:01', 101);
INSERT INTO metrics (timestamp, value) VALUES ('2024-01-01 10:00:02', 102);
-- Error: TOO_MANY_PARTS
Fails: improper partitioning
-- Partitioning by a high-cardinality column
CREATE TABLE events
(
    id UInt64,
    user_id UInt64,
    event_type String,
    timestamp DateTime
)
ENGINE = MergeTree()
PARTITION BY user_id  -- one partition per user: far too many
ORDER BY (timestamp, id)
How to fix it¶
Check current parts¶
Monitor the number of parts in your table:
Check table parts
SELECT
    database,
    table,
    partition,
    name,
    active,
    rows,
    bytes_on_disk
FROM system.parts
WHERE database = currentDatabase()
  AND table = 'your_table_name'
ORDER BY partition, name
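This check is easy to automate client-side. A minimal sketch, assuming the third-party clickhouse-connect Python driver; the flag_crowded_partitions helper and the 200-part alert threshold are illustrative choices, not part of any library, and the driver call is commented out so the logic runs without a live server:

```python
# Sketch: flag partitions whose active part count approaches the
# TOO_MANY_PARTS threshold.

# import clickhouse_connect  # third-party driver, assumed available
# client = clickhouse_connect.get_client(host="localhost")
# rows = client.query(
#     "SELECT table, partition, count() FROM system.parts "
#     "WHERE active GROUP BY table, partition"
# ).result_rows

def flag_crowded_partitions(rows, threshold=200):
    """rows: (table, partition, part_count) tuples.
    Returns partitions at or above the part-count threshold."""
    return [(t, p, n) for t, p, n in rows if n >= threshold]

sample = [
    ("events", "202401", 250),
    ("events", "202402", 12),
    ("metrics", "202401", 199),
]
print(flag_crowded_partitions(sample))
# -> [('events', '202401', 250)]
```

Running this periodically and alerting before the part count reaches parts_to_throw_insert gives you time to batch inserts or force a merge.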
Monitor merge operations¶
Check if merge operations are running:
Check merge operations
-- In-progress merges are listed in system.merges
SELECT
    database,
    table,
    elapsed,
    progress,
    num_parts,
    result_part_name
FROM system.merges
Optimize table¶
Force table optimization to merge parts:
Optimize table
-- Optimize the whole table to merge parts (expensive on large tables)
OPTIMIZE TABLE your_table_name FINAL;

-- Or optimize a single partition (here January 2024, for toYYYYMM partitioning)
OPTIMIZE TABLE your_table_name PARTITION ID '202401' FINAL;
Check partitioning strategy¶
Review and improve your partitioning strategy:
Improve partitioning
-- Instead of partitioning by user_id (high cardinality),
-- use time-based partitioning
CREATE TABLE events
(
    id UInt64,
    user_id UInt64,
    event_type String,
    timestamp DateTime
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(timestamp)  -- monthly partitions
ORDER BY (timestamp, user_id)
Common patterns and solutions¶
Time-based partitioning¶
Use time-based partitioning for time-series data:
Time-based partitioning
CREATE TABLE events
(
    id UInt64,
    user_id UInt64,
    event_type String,
    timestamp DateTime
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(timestamp)  -- monthly partitions
ORDER BY (timestamp, user_id)
SETTINGS
    merge_with_ttl_timeout = 86400,  -- at most one TTL merge per day
    merge_max_block_size = 8192
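The impact of the partition key on partition count is simple arithmetic. A rough back-of-the-envelope helper (the partitions_for function is hypothetical, for illustration only):

```python
# Rough partition-count estimates for common partition keys
# over a given retention window.

def partitions_for(key, retention_days, distinct_users=0):
    """Approximate number of partitions a partition key creates."""
    if key == "toYYYYMM":      # monthly partitions
        return max(1, retention_days // 30)
    if key == "toYYYYMMDD":    # daily partitions
        return retention_days
    if key == "user_id":       # one partition per distinct user
        return distinct_users
    raise ValueError(f"unknown key: {key}")

print(partitions_for("toYYYYMM", 365))    # -> 12
print(partitions_for("toYYYYMMDD", 365))  # -> 365
print(partitions_for("user_id", 365, distinct_users=1_000_000))  # -> 1000000
```

A monthly key keeps the partition count in the dozens over a year of retention, while a per-user key scales with your user base, which is why high-cardinality keys lead straight to TOO_MANY_PARTS.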
Batch inserts¶
Implement batch inserts to reduce part creation:
Batch inserts
-- Instead of many single-row inserts, use one batch insert
INSERT INTO events (user_id, event_type, timestamp) VALUES
    (123, 'click', '2024-01-01 10:00:00'),
    (124, 'view', '2024-01-01 10:00:01'),
    (125, 'purchase', '2024-01-01 10:00:02'),
    (126, 'click', '2024-01-01 10:00:03')
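The same batching can be enforced in the client. A minimal sketch of an insert buffer that flushes on size or age; InsertBuffer and its flush callback are hypothetical names, and in practice the callback would issue one multi-row INSERT through your ClickHouse driver:

```python
import time

class InsertBuffer:
    """Accumulates rows and flushes them as one batch when either
    max_rows is reached or max_age_seconds has elapsed."""

    def __init__(self, flush_fn, max_rows=10_000, max_age_seconds=5.0):
        self.flush_fn = flush_fn   # e.g. lambda rows: client.insert("events", rows)
        self.max_rows = max_rows
        self.max_age = max_age_seconds
        self.rows = []
        self.first_row_at = None

    def add(self, row):
        if self.first_row_at is None:
            self.first_row_at = time.monotonic()
        self.rows.append(row)
        if (len(self.rows) >= self.max_rows
                or time.monotonic() - self.first_row_at >= self.max_age):
            self.flush()

    def flush(self):
        if self.rows:
            self.flush_fn(self.rows)   # one INSERT instead of many
            self.rows = []
            self.first_row_at = None

batches = []
buf = InsertBuffer(batches.append, max_rows=3)
for i in range(7):
    buf.add((i, "click"))
buf.flush()                            # drain the remainder
print([len(b) for b in batches])       # -> [3, 3, 1]
```

Seven single-row inserts collapse into three INSERT statements, so the table receives three parts instead of seven.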
Merge settings optimization¶
Configure merge settings for better performance:
Merge settings
-- Table settings are changed with ALTER TABLE ... MODIFY SETTING
ALTER TABLE your_table_name
    MODIFY SETTING
        merge_with_ttl_timeout = 3600,  -- at most one TTL merge per hour
        merge_max_block_size = 8192,    -- rows per block during merges
        min_bytes_for_wide_part = 0,    -- store every part in the wide format
        min_rows_for_wide_part = 0
Background merge configuration¶
Configure background merge operations:
Background merge
-- These are MergeTree table settings, changed per table with
-- ALTER TABLE ... MODIFY SETTING (not session-level SET)
ALTER TABLE your_table_name
    MODIFY SETTING
        old_parts_lifetime = 480,       -- seconds before inactive parts are deleted
        parts_to_delay_insert = 150,    -- throttle inserts at this active part count
        parts_to_throw_insert = 300,    -- raise TOO_MANY_PARTS at this count
        max_bytes_to_merge_at_max_space_in_pool = 1000000000  -- ~1 GB merge size cap
Tinybird-specific notes¶
In Tinybird, TOO_MANY_PARTS errors often occur when:
- Data Sources receive high-frequency data
- Pipes create many small output parts
- Materialized Views accumulate too many parts
- Background merge operations can't keep up
To debug in Tinybird:
- Check Data Source insertion patterns
- Review Pipe transformation frequency
- Monitor Materialized View part counts
- Consider batching data operations
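Batching before sending data to Tinybird follows the same idea as batching ClickHouse inserts. A sketch that groups events into NDJSON payloads of the kind the Events API accepts; the endpoint URL, Data Source name, and token in the comment are placeholders you would substitute:

```python
import json

def to_ndjson_batches(events, batch_size=500):
    """Group event dicts into NDJSON payloads, one HTTP request
    body per batch, instead of one request per event."""
    for i in range(0, len(events), batch_size):
        chunk = events[i:i + batch_size]
        yield "\n".join(json.dumps(e) for e in chunk)

# events = [{"user_id": 123, "event_type": "click"}, ...]
# for body in to_ndjson_batches(events):
#     POST body (e.g. with urllib.request) to
#     https://api.tinybird.co/v0/events?name=<your_datasource>
#     with an Authorization: Bearer <token> header (placeholders)

payloads = list(to_ndjson_batches([{"n": i} for i in range(5)], batch_size=2))
print(len(payloads))   # -> 3
print(payloads[0])     # two NDJSON lines: {"n": 0} and {"n": 1}
```

Five events become three requests instead of five; with a realistic batch size of a few hundred rows, the part-creation rate on the underlying Data Source drops accordingly.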
In Tinybird, use Data Source monitoring to track insertion patterns and optimize accordingly.
Best practices¶
Partitioning strategy¶
- Use time-based partitioning for time-series data
- Avoid partitioning by high-cardinality columns
- Keep partition sizes reasonable
- Monitor partition growth over time
Insert optimization¶
- Use batch inserts instead of single inserts
- Implement appropriate insert intervals
- Monitor insertion rates
- Use bulk insert operations
Merge optimization¶
- Configure appropriate merge settings
- Monitor merge operation performance
- Ensure sufficient system resources
- Use background merge operations
Configuration options¶
Merge settings¶
Merge configuration
-- Session-level merge settings
SELECT name, value, description
FROM system.settings
WHERE name LIKE '%merge%';

-- MergeTree engine settings (parts_to_throw_insert, old_parts_lifetime, ...)
SELECT name, value, description
FROM system.merge_tree_settings
WHERE name LIKE '%merge%' OR name LIKE '%part%'
Table settings¶
Table settings
-- Configure table-specific settings
ALTER TABLE your_table_name
    MODIFY SETTING
        merge_with_ttl_timeout = 3600,
        merge_max_block_size = 8192,
        min_bytes_for_wide_part = 0,
        min_rows_for_wide_part = 0
System settings¶
System settings
-- Server-wide merge concurrency is set via background_pool_size in the
-- server's config.xml; the settings below are MergeTree table settings,
-- changed with ALTER TABLE ... MODIFY SETTING (not session-level SET)
ALTER TABLE your_table_name
    MODIFY SETTING
        old_parts_lifetime = 86400,                           -- keep inactive parts for 1 day
        max_bytes_to_merge_at_max_space_in_pool = 1000000000  -- ~1 GB merge size cap
Alternative solutions¶
Use ReplacingMergeTree¶
Use ReplacingMergeTree for deduplication:
ReplacingMergeTree
CREATE TABLE events
(
    id UInt64,
    user_id UInt64,
    event_type String,
    timestamp DateTime,
    version UInt32
)
ENGINE = ReplacingMergeTree(version)
PARTITION BY toYYYYMM(timestamp)
ORDER BY (timestamp, user_id, id)
Implement TTL¶
Use TTL to automatically remove old parts:
TTL implementation
CREATE TABLE events
(
    id UInt64,
    user_id UInt64,
    event_type String,
    timestamp DateTime
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(timestamp)
ORDER BY (timestamp, user_id)
TTL timestamp + INTERVAL 1 YEAR  -- drop data older than 1 year
Use AggregatingMergeTree¶
Use AggregatingMergeTree for pre-aggregated data:
AggregatingMergeTree
CREATE TABLE event_summary
(
    date Date,
    user_id UInt64,
    event_count AggregateFunction(count),
    last_event SimpleAggregateFunction(max, DateTime)
)
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(date)
ORDER BY (date, user_id)
Monitoring and prevention¶
Parts monitoring¶
Parts tracking
-- Monitor parts count per partition
SELECT
    database,
    table,
    partition,
    count() AS part_count,
    sum(rows) AS total_rows,
    sum(bytes_on_disk) AS total_bytes
FROM system.parts
WHERE database = currentDatabase()
  AND active
GROUP BY database, table, partition
ORDER BY part_count DESC
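Diffing two snapshots of this query shows whether merges are keeping up. A sketch of the comparison logic; part_count_deltas is a hypothetical helper, and the snapshots would come from running the query above on a schedule:

```python
def part_count_deltas(prev, curr):
    """Compare two snapshots of {(table, partition): part_count} and
    return entries whose part count grew between polls, i.e. where
    background merges are falling behind the insert rate."""
    return {
        key: curr[key] - prev.get(key, 0)
        for key in curr
        if curr[key] > prev.get(key, 0)
    }

prev = {("events", "202401"): 40, ("events", "202402"): 10}
curr = {("events", "202401"): 90, ("events", "202402"): 8}
print(part_count_deltas(prev, curr))   # -> {('events', '202401'): 50}
```

A steadily growing delta for the same partition is an early warning, well before ClickHouse starts throwing TOO_MANY_PARTS.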
Merge performance monitoring¶
Merge performance
-- Monitor running merges and their progress
SELECT
    database,
    table,
    elapsed,
    progress,
    num_parts,
    total_size_bytes_compressed
FROM system.merges
ORDER BY elapsed DESC
Table health monitoring¶
Table health
-- Monitor table health metrics
SELECT
    database,
    table,
    count() AS total_parts,
    countIf(active) AS active_parts,
    sum(rows) AS total_rows,
    sum(bytes_on_disk) AS total_bytes
FROM system.parts
WHERE database = currentDatabase()
GROUP BY database, table
ORDER BY total_parts DESC