Slow queries troubleshooting

Common issues and solutions for slow query performance in ClickHouse and Tinybird.

Common causes

Missing indexes

Issue: Queries scanning entire tables without proper indexes

Solution: Add appropriate indexes for frequently queried columns

-- Add index for frequently filtered columns
ALTER TABLE events ADD INDEX idx_user_id user_id TYPE minmax GRANULARITY 1

Large result sets

Issue: Queries returning too much data

Solution: Use LIMIT and proper filtering

-- Add limits to prevent large results
SELECT * FROM events 
WHERE date >= today() - 7 
LIMIT 1000

Performance patterns

Inefficient WHERE clauses

Issue: Non-selective filters causing full table scans

Solution: Use selective filters and proper column order

-- Optimize filter order (most selective first)
SELECT * FROM events 
WHERE user_id = 123 AND date >= '2023-01-01' AND event_type = 'click'

Complex aggregations

Issue: Heavy aggregations on large datasets

Solution: Pre-aggregate data in Pipes

-- Pre-aggregate in data source
SELECT 
  user_id,
  count() as event_count,
  sum(value) as total_value
FROM events 
GROUP BY user_id

Query optimization

Using FINAL modifier

Issue: Queries not using FINAL for ReplacingMergeTree

Solution: Add FINAL when needed

-- Use FINAL for latest versions
SELECT * FROM events FINAL WHERE user_id = 123

Avoiding SELECT *

Issue: Selecting unnecessary columns

Solution: Select only needed columns

-- Select specific columns
SELECT user_id, event_type, timestamp 
FROM events 
WHERE date = today()

Memory and resource issues

TOO_MANY_ROWS_OR_BYTES

Issue: Queries exceeding memory limits

Solution: Use streaming queries and proper limits

-- Use streaming for large datasets
SELECT * FROM events 
WHERE date >= today() - 30
SETTINGS max_block_size = 10000

Memory pressure

Issue: Queries consuming too much memory

Solution: Optimize query structure

-- Break down complex queries
WITH filtered_data AS (
  SELECT * FROM events WHERE date >= today() - 7
)
SELECT count() FROM filtered_data

Best practices

  1. Use indexes - Add indexes for frequently filtered columns
  2. Limit results - Always use LIMIT for large datasets
  3. Pre-aggregate - Do heavy aggregations in data sources
  4. Use selective filters - Filter early and often
  5. Monitor query plans - Use EXPLAIN to understand query execution