ILLEGAL_AGGREGATION ClickHouse error¶
This error usually means you're trying to use an aggregate function in a context where it's not allowed, or you're mixing aggregate and non-aggregate columns incorrectly.
The ILLEGAL_AGGREGATION
error in ClickHouse (and Tinybird) happens when you try to use aggregate functions in an invalid context or mix aggregate and non-aggregate columns incorrectly. This is different from NOT_AN_AGGREGATE
in that it specifically relates to the placement or combination of aggregate functions.
What causes this error¶
You'll typically see it when:
- Using aggregate functions in WHERE clauses
- Mixing aggregate and non-aggregate columns without proper grouping
- Using aggregate functions in subqueries incorrectly
- Having aggregate functions in ORDER BY without GROUP BY
- Using window functions with aggregate functions incorrectly
Example errors¶
Fails: aggregate in WHERE clause
SELECT * FROM events WHERE COUNT(*) > 10
Fails: aggregate in ORDER BY without GROUP BY
SELECT user_id, COUNT(*) FROM events ORDER BY COUNT(*)
Fails: aggregate in HAVING without GROUP BY
SELECT user_id, COUNT(*) FROM events HAVING COUNT(*) > 10
Fails: aggregate in WHERE clause
SELECT COUNT(*) FROM events WHERE COUNT(*) > 10
How to fix it¶
Use HAVING instead of WHERE for aggregates¶
Move aggregate conditions to HAVING clause:
Fix: use HAVING for aggregate conditions
SELECT user_id, COUNT(*) as count FROM events GROUP BY user_id HAVING COUNT(*) > 10
Add proper GROUP BY¶
Include all non-aggregate columns in GROUP BY:
Fix: add GROUP BY
SELECT user_id, COUNT(*) as count FROM events GROUP BY user_id
Use subqueries for complex aggregations¶
Break down complex queries using subqueries:
Fix: use subquery for aggregate filtering
SELECT * FROM events WHERE user_id IN ( SELECT user_id FROM events GROUP BY user_id HAVING COUNT(*) > 10 )
Use window functions instead¶
For row-level calculations with aggregates:
Fix: use window function
SELECT user_id, COUNT(*) OVER (PARTITION BY user_id) as user_count FROM events
Common patterns and solutions¶
Simple aggregation¶
When you want to aggregate the entire table:
Correct: aggregate entire table
SELECT COUNT(*) as total_events FROM events
Grouped aggregation¶
When you want to aggregate by groups:
Correct: group by user_id
SELECT user_id, COUNT(*) as event_count FROM events GROUP BY user_id
Multiple columns in GROUP BY¶
Include all non-aggregate columns:
Correct: multiple columns in GROUP BY
SELECT user_id, event_type, COUNT(*) as count FROM events GROUP BY user_id, event_type
Using HAVING correctly¶
HAVING filters on aggregate results:
Correct: HAVING with GROUP BY
SELECT user_id, COUNT(*) as count FROM events GROUP BY user_id HAVING COUNT(*) > 5
Advanced solutions¶
Conditional aggregation¶
Use CASE statements within aggregate functions:
Conditional aggregation
SELECT user_id, COUNT(*) as total_events, COUNT(CASE WHEN event_type = 'click' THEN 1 END) as clicks FROM events GROUP BY user_id
Nested aggregations¶
Aggregate functions cannot be nested except via sumState()
and sumMerge()
. For nested aggregations, use state functions:
Nested aggregation with state functions
SELECT sumMerge(sumState(value)) as nested_sum FROM events
Multiple aggregates¶
You can use multiple aggregate functions in the same query:
Multiple aggregates
SELECT user_id, COUNT(*) as total_events, SUM(value) as total_value, AVG(value) as avg_value FROM events GROUP BY user_id
Using subqueries for complex logic¶
Break down complex aggregations:
Complex aggregation with subquery
SELECT user_id, event_count FROM ( SELECT user_id, COUNT(*) as event_count FROM events GROUP BY user_id ) t WHERE event_count > 10
Window functions vs aggregates¶
When to use window functions¶
For row-level calculations:
Window function example
SELECT user_id, event_type, COUNT(*) OVER (PARTITION BY user_id) as user_event_count, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY timestamp) as event_rank FROM events
When to use aggregate functions¶
For group-level calculations:
Aggregate function example
SELECT user_id, COUNT(*) as event_count FROM events GROUP BY user_id
Tinybird-specific notes¶
In Tinybird, this error often occurs when:
- Building Pipes that mix aggregate and non-aggregate operations
- Using Materialized Views incorrectly
- Forgetting to group by all non-aggregate columns in Pipe nodes
- Working with nested aggregations in Pipes
To debug in Tinybird:
- Check each Pipe node for proper GROUP BY clauses
- Verify that Materialized Views have correct aggregation logic
- Use the Pipe validation to catch these errors early
- Test aggregation logic with simple queries first
Best practices¶
Query structure¶
- Plan your aggregations before writing complex queries
- Use HAVING for aggregate conditions, WHERE for row conditions
- Include all non-aggregate columns in GROUP BY
- Test aggregation logic with small datasets first
- Use HAVING for aggregate filters instead of WHERE
Performance considerations¶
- Use appropriate indexes for GROUP BY columns
- Consider using Materialized Views for complex aggregations
- Monitor query performance on large datasets
- Use approximate functions for large aggregations when possible
Error prevention¶
- Use query validation tools
- Test aggregation logic step by step
- Document complex aggregation patterns
- Implement proper error handling