INCORRECT_RESULT_OF_SCALAR_SUBQUERY ClickHouse error¶
Common cause: Your scalar subquery is returning multiple rows when it should return only one row.
The INCORRECT_RESULT_OF_SCALAR_SUBQUERY
error occurs when a scalar subquery (a subquery that should return a single value) returns multiple rows instead. Scalar subqueries must return exactly one row and one column to be used in expressions, comparisons, or as function arguments.
What causes this error¶
This error typically happens when:
- Multiple rows returned: Subquery returns more than one row
- No rows returned: Subquery returns no rows (should return NULL)
- Missing aggregation: Using subquery without aggregate function
- Missing LIMIT: Subquery without LIMIT clause
- Wrong WHERE conditions: Subquery conditions not specific enough
- GROUP BY issues: Subquery with GROUP BY returning multiple rows
- JOIN problems: Subquery with JOIN returning multiple matches
Example errors¶
-- Error: Subquery returns multiple rows SELECT user_id, (SELECT event_type FROM events WHERE user_id = 123) AS user_event_type; -- Error: Subquery without aggregation SELECT user_id, (SELECT timestamp FROM events WHERE user_id = users.user_id) AS last_event_time; -- Error: Subquery with GROUP BY SELECT user_id, (SELECT count() FROM events WHERE user_id = users.user_id GROUP BY event_type) AS event_count; -- Error: Subquery with JOIN SELECT user_id, (SELECT event_type FROM events e JOIN users u ON e.user_id = u.user_id) AS event_type; -- Error: Subquery without LIMIT SELECT user_id, (SELECT timestamp FROM events WHERE user_id = users.user_id ORDER BY timestamp DESC) AS latest_time;
How to fix it¶
1. Use aggregate functions¶
Ensure the subquery returns exactly one value using aggregate functions:
-- ❌ Wrong: Subquery returns multiple rows SELECT user_id, (SELECT event_type FROM events WHERE user_id = 123) AS user_event_type; -- ✅ Correct: Use aggregate function SELECT user_id, (SELECT any(event_type) FROM events WHERE user_id = 123) AS user_event_type; -- ❌ Wrong: No aggregation SELECT user_id, (SELECT timestamp FROM events WHERE user_id = users.user_id) AS last_event_time; -- ✅ Correct: Use aggregate function SELECT user_id, (SELECT max(timestamp) FROM events WHERE user_id = users.user_id) AS last_event_time;
2. Use LIMIT clause¶
Limit the subquery to return only one row:
-- ❌ Wrong: Multiple rows possible SELECT user_id, (SELECT timestamp FROM events WHERE user_id = users.user_id ORDER BY timestamp DESC) AS latest_time; -- ✅ Correct: Use LIMIT 1 SELECT user_id, (SELECT timestamp FROM events WHERE user_id = users.user_id ORDER BY timestamp DESC LIMIT 1) AS latest_time; -- ❌ Wrong: No LIMIT SELECT user_id, (SELECT event_type FROM events WHERE user_id = users.user_id) AS event_type; -- ✅ Correct: Use LIMIT 1 SELECT user_id, (SELECT event_type FROM events WHERE user_id = users.user_id LIMIT 1) AS event_type;
3. Use specific WHERE conditions¶
Make the subquery conditions more specific to return only one row:
-- ❌ Wrong: Ambiguous conditions SELECT user_id, (SELECT event_type FROM events WHERE user_id = users.user_id) AS event_type; -- ✅ Correct: Specific conditions SELECT user_id, (SELECT event_type FROM events WHERE user_id = users.user_id ORDER BY timestamp DESC LIMIT 1) AS latest_event_type; -- ❌ Wrong: Multiple matches possible SELECT user_id, (SELECT count() FROM events WHERE user_id = users.user_id) AS event_count; -- ✅ Correct: Use aggregate function SELECT user_id, (SELECT count() FROM events WHERE user_id = users.user_id) AS event_count;
Use any()
for arbitrary single values, LIMIT 1
for specific ordering, or aggregate functions like max()
, min()
, count()
for deterministic results.
4. Handle empty results¶
Use aggregate functions that handle empty results gracefully:
-- ❌ Wrong: No rows returned SELECT user_id, (SELECT event_type FROM events WHERE user_id = users.user_id AND event_type = 'nonexistent') AS event_type; -- ✅ Correct: Handle empty results SELECT user_id, (SELECT any(event_type) FROM events WHERE user_id = users.user_id AND event_type = 'nonexistent') AS event_type; -- ✅ Correct: Use COALESCE for default values SELECT user_id, COALESCE((SELECT any(event_type) FROM events WHERE user_id = users.user_id), 'no_events') AS event_type;
Common patterns and solutions¶
Aggregate functions in subqueries¶
-- ❌ Wrong: No aggregation SELECT user_id, (SELECT timestamp FROM events WHERE user_id = users.user_id) AS last_time; -- ✅ Correct: Use aggregate function SELECT user_id, (SELECT max(timestamp) FROM events WHERE user_id = users.user_id) AS last_time; -- ❌ Wrong: Multiple rows possible SELECT user_id, (SELECT event_type FROM events WHERE user_id = users.user_id) AS event_type; -- ✅ Correct: Use aggregate function SELECT user_id, (SELECT any(event_type) FROM events WHERE user_id = users.user_id) AS event_type;
Subqueries with LIMIT¶
-- ❌ Wrong: No LIMIT clause SELECT user_id, (SELECT timestamp FROM events WHERE user_id = users.user_id ORDER BY timestamp DESC) AS latest_time; -- ✅ Correct: Use LIMIT 1 SELECT user_id, (SELECT timestamp FROM events WHERE user_id = users.user_id ORDER BY timestamp DESC LIMIT 1) AS latest_time; -- ❌ Wrong: Multiple rows possible SELECT user_id, (SELECT event_type FROM events WHERE user_id = users.user_id) AS event_type; -- ✅ Correct: Use LIMIT 1 SELECT user_id, (SELECT event_type FROM events WHERE user_id = users.user_id LIMIT 1) AS event_type;
Subqueries with specific conditions¶
-- ❌ Wrong: Ambiguous conditions SELECT user_id, (SELECT count() FROM events WHERE user_id = users.user_id GROUP BY event_type) AS event_count; -- ✅ Correct: Specific aggregation SELECT user_id, (SELECT count() FROM events WHERE user_id = users.user_id) AS total_events; -- ❌ Wrong: Multiple matches SELECT user_id, (SELECT event_type FROM events WHERE user_id = users.user_id AND event_type = 'click') AS click_event; -- ✅ Correct: Use LIMIT or aggregate SELECT user_id, (SELECT any(event_type) FROM events WHERE user_id = users.user_id AND event_type = 'click') AS click_event;
Advanced solutions¶
Using EXISTS instead of scalar subqueries¶
-- Use EXISTS for boolean checks SELECT user_id, CASE WHEN EXISTS(SELECT 1 FROM events WHERE user_id = users.user_id AND event_type = 'click') THEN 'has_clicks' ELSE 'no_clicks' END AS click_status FROM users;
Using window functions¶
-- Use window functions instead of scalar subqueries SELECT user_id, first_value(event_type) OVER (PARTITION BY user_id ORDER BY timestamp DESC) AS latest_event_type, max(timestamp) OVER (PARTITION BY user_id) AS last_event_time FROM events;
Using JOINs instead of subqueries¶
-- Use JOIN instead of scalar subquery SELECT u.user_id, e.event_type AS latest_event_type, e.timestamp AS last_event_time FROM users u LEFT JOIN ( SELECT user_id, event_type, timestamp FROM events WHERE (user_id, timestamp) IN ( SELECT user_id, max(timestamp) FROM events GROUP BY user_id ) ) e ON u.user_id = e.user_id;
Tinybird-specific notes¶
In Tinybird, this error commonly occurs when:
- Pipe transformations: Scalar subqueries in pipe SQL
- Endpoint queries: Subqueries in endpoint definitions
- Data source queries: Subqueries in data source SQL
- Aggregate queries: Complex subquery logic
Debugging in Tinybird¶
- Check subquery logic: Verify subqueries return single values
- Review pipe SQL: Ensure subqueries use proper aggregation
- Test with tb sql: Use CLI to test subquery behavior
- Validate subquery results: Check if subqueries return expected single values
- Use Query Builder: Test subqueries interactively
Common Tinybird scenarios¶
-- In pipe definitions, use proper scalar subqueries NODE user_stats SQL > SELECT user_id, (SELECT max(timestamp) FROM your_data_source WHERE user_id = e.user_id) AS last_event_time FROM your_data_source e GROUP BY user_id; -- In endpoint queries, ensure single value results SELECT user_id, (SELECT any(event_type) FROM your_pipe WHERE user_id = users.user_id) AS user_event_type FROM users;
Subquery best practices¶
-- Use aggregate functions for scalar subqueries NODE processed_data SQL > SELECT user_id, (SELECT count() FROM your_data_source WHERE user_id = e.user_id) AS event_count, (SELECT max(timestamp) FROM your_data_source WHERE user_id = e.user_id) AS last_event_time FROM your_data_source e GROUP BY user_id;
Best practices¶
- Always use aggregate functions in scalar subqueries
- Use LIMIT 1 when you need a specific row
- Handle empty results with COALESCE or default values
- Test subquery logic with simple examples first
- Use EXISTS for boolean checks instead of scalar subqueries
- Consider JOINs as alternatives to complex subqueries
- Document subquery behavior for your team
- Use Query Builder to test subquery logic interactively
See also¶
- Subqueries - Subquery syntax and usage
- Aggregate Functions - Aggregate function reference
- Window Functions - Window function alternatives
- JOIN Clause - JOIN alternatives to subqueries
- Error Handling - General error handling patterns