Too many rows or bytes troubleshooting

Common issues and solutions for TOO_MANY_ROWS_OR_BYTES errors in ClickHouse and Tinybird.

Common causes

Large result sets

Issue: Queries returning more data than allowed

Solution: Use LIMIT and proper filtering

-- Add limits to prevent large results
SELECT * FROM events
WHERE date >= today() - 7
LIMIT 10000

Missing WHERE clauses

Issue: Queries without proper filters, causing full table scans

Solution: Add selective filters using sorting key columns

-- Add specific filters using sorting key columns
SELECT * FROM events
WHERE user_id = 123 AND date >= '2023-01-01'

Ensure your filters use columns from the sorting key (ENGINE_SORTING_KEY) to enable efficient index usage and reduce data scanned.

Memory management

Streaming queries

Issue: Large datasets consuming too much memory

Solution: Filter data and use LIMIT to reduce memory usage

-- Filter data and limit results to reduce memory usage
SELECT * FROM events
WHERE date >= today() - 30
LIMIT 10000

Chunked processing

Issue: Processing entire datasets at once

Solution: Process data in chunks

-- Process by date ranges
SELECT * FROM events
WHERE date BETWEEN '2023-01-01' AND '2023-01-07'

Query optimization

Using aggregations

Issue: Returning raw data instead of summaries

Solution: Use aggregations to reduce result size, or pre-aggregate with Materialized Views

-- Aggregate at query time
SELECT
  user_id,
  count() as event_count,
  sum(value) as total_value
FROM events
WHERE date >= today() - 7
GROUP BY user_id

For better performance, consider pre-aggregating using Materialized Views to shift computation to ingestion time.

Column selection

Issue: Selecting unnecessary columns

Solution: Select only needed columns

-- Select specific columns only
SELECT user_id, event_type, timestamp
FROM events
WHERE date = today()

Configuration solutions

Adjusting limits

Issue: Default limits too restrictive

Solution: Use LIMIT in queries to control result size

-- Use LIMIT to control result size
SELECT * FROM events
WHERE date >= today() - 7
LIMIT 1000000

Using sampling

Issue: Need to analyze large datasets

Solution: Use sampling for analysis

-- Use sampling for large datasets
SELECT * FROM events SAMPLE 0.1
WHERE date >= today() - 30

Best practices

  1. Always use LIMIT - Prevent unexpectedly large results
  2. Add selective filters - Filter data early using sorting key columns
  3. Use aggregations - Summarize data instead of returning raw rows, or use Materialized Views
  4. Select specific columns - Avoid SELECT * on large tables
  5. Use LIMIT - Always use LIMIT to control result size and prevent memory issues
  6. Filter by sorting key - Ensure filters use columns from ENGINE_SORTING_KEY
Updated