INCORRECT_RESULT_OF_SCALAR_SUBQUERY ClickHouse error

Common cause: Your scalar subquery is returning multiple rows when it should return only one row.

The INCORRECT_RESULT_OF_SCALAR_SUBQUERY error occurs when a scalar subquery (a subquery that should return a single value) returns multiple rows instead. Scalar subqueries must return exactly one row and one column to be used in expressions, comparisons, or as function arguments.

What causes this error

This error typically happens when:

  • Multiple rows returned: Subquery returns more than one row
  • No rows returned: Subquery returns no rows (should return NULL)
  • Missing aggregation: Using subquery without aggregate function
  • Missing LIMIT: Subquery without LIMIT clause
  • Wrong WHERE conditions: Subquery conditions not specific enough
  • GROUP BY issues: Subquery with GROUP BY returning multiple rows
  • JOIN problems: Subquery with JOIN returning multiple matches

Example errors

-- Error: Subquery returns multiple rows
SELECT user_id, (SELECT event_type FROM events WHERE user_id = 123) AS user_event_type;

-- Error: Subquery without aggregation
SELECT user_id, (SELECT timestamp FROM events WHERE user_id = users.user_id) AS last_event_time;

-- Error: Subquery with GROUP BY
SELECT user_id, (SELECT count() FROM events WHERE user_id = users.user_id GROUP BY event_type) AS event_count;

-- Error: Subquery with JOIN
SELECT user_id, (SELECT event_type FROM events e JOIN users u ON e.user_id = u.user_id) AS event_type;

-- Error: Subquery without LIMIT
SELECT user_id, (SELECT timestamp FROM events WHERE user_id = users.user_id ORDER BY timestamp DESC) AS latest_time;

How to fix it

1. Use aggregate functions

Ensure the subquery returns exactly one value using aggregate functions:

-- ❌ Wrong: Subquery returns multiple rows
SELECT user_id, (SELECT event_type FROM events WHERE user_id = 123) AS user_event_type;

-- ✅ Correct: Use aggregate function
SELECT user_id, (SELECT any(event_type) FROM events WHERE user_id = 123) AS user_event_type;

-- ❌ Wrong: No aggregation
SELECT user_id, (SELECT timestamp FROM events WHERE user_id = users.user_id) AS last_event_time;

-- ✅ Correct: Use aggregate function
SELECT user_id, (SELECT max(timestamp) FROM events WHERE user_id = users.user_id) AS last_event_time;

2. Use LIMIT clause

Limit the subquery to return only one row:

-- ❌ Wrong: Multiple rows possible
SELECT user_id, (SELECT timestamp FROM events WHERE user_id = users.user_id ORDER BY timestamp DESC) AS latest_time;

-- ✅ Correct: Use LIMIT 1
SELECT user_id, (SELECT timestamp FROM events WHERE user_id = users.user_id ORDER BY timestamp DESC LIMIT 1) AS latest_time;

-- ❌ Wrong: No LIMIT
SELECT user_id, (SELECT event_type FROM events WHERE user_id = users.user_id) AS event_type;

-- ✅ Correct: Use LIMIT 1
SELECT user_id, (SELECT event_type FROM events WHERE user_id = users.user_id LIMIT 1) AS event_type;

3. Use specific WHERE conditions

Make the subquery conditions more specific to return only one row:

-- ❌ Wrong: Ambiguous conditions
SELECT user_id, (SELECT event_type FROM events WHERE user_id = users.user_id) AS event_type;

-- ✅ Correct: Specific conditions
SELECT user_id, (SELECT event_type FROM events WHERE user_id = users.user_id ORDER BY timestamp DESC LIMIT 1) AS latest_event_type;

-- ❌ Wrong: Multiple matches possible
SELECT user_id, (SELECT count() FROM events WHERE user_id = users.user_id) AS event_count;

-- ✅ Correct: Use aggregate function
SELECT user_id, (SELECT count() FROM events WHERE user_id = users.user_id) AS event_count;

Use any() for arbitrary single values, LIMIT 1 for specific ordering, or aggregate functions like max(), min(), count() for deterministic results.

4. Handle empty results

Use aggregate functions that handle empty results gracefully:

-- ❌ Wrong: No rows returned
SELECT user_id, (SELECT event_type FROM events WHERE user_id = users.user_id AND event_type = 'nonexistent') AS event_type;

-- ✅ Correct: Handle empty results
SELECT user_id, (SELECT any(event_type) FROM events WHERE user_id = users.user_id AND event_type = 'nonexistent') AS event_type;

-- ✅ Correct: Use COALESCE for default values
SELECT user_id, COALESCE((SELECT any(event_type) FROM events WHERE user_id = users.user_id), 'no_events') AS event_type;

Common patterns and solutions

Aggregate functions in subqueries

-- ❌ Wrong: No aggregation
SELECT user_id, (SELECT timestamp FROM events WHERE user_id = users.user_id) AS last_time;

-- ✅ Correct: Use aggregate function
SELECT user_id, (SELECT max(timestamp) FROM events WHERE user_id = users.user_id) AS last_time;

-- ❌ Wrong: Multiple rows possible
SELECT user_id, (SELECT event_type FROM events WHERE user_id = users.user_id) AS event_type;

-- ✅ Correct: Use aggregate function
SELECT user_id, (SELECT any(event_type) FROM events WHERE user_id = users.user_id) AS event_type;

Subqueries with LIMIT

-- ❌ Wrong: No LIMIT clause
SELECT user_id, (SELECT timestamp FROM events WHERE user_id = users.user_id ORDER BY timestamp DESC) AS latest_time;

-- ✅ Correct: Use LIMIT 1
SELECT user_id, (SELECT timestamp FROM events WHERE user_id = users.user_id ORDER BY timestamp DESC LIMIT 1) AS latest_time;

-- ❌ Wrong: Multiple rows possible
SELECT user_id, (SELECT event_type FROM events WHERE user_id = users.user_id) AS event_type;

-- ✅ Correct: Use LIMIT 1
SELECT user_id, (SELECT event_type FROM events WHERE user_id = users.user_id LIMIT 1) AS event_type;

Subqueries with specific conditions

-- ❌ Wrong: Ambiguous conditions
SELECT user_id, (SELECT count() FROM events WHERE user_id = users.user_id GROUP BY event_type) AS event_count;

-- ✅ Correct: Specific aggregation
SELECT user_id, (SELECT count() FROM events WHERE user_id = users.user_id) AS total_events;

-- ❌ Wrong: Multiple matches
SELECT user_id, (SELECT event_type FROM events WHERE user_id = users.user_id AND event_type = 'click') AS click_event;

-- ✅ Correct: Use LIMIT or aggregate
SELECT user_id, (SELECT any(event_type) FROM events WHERE user_id = users.user_id AND event_type = 'click') AS click_event;

Advanced solutions

Using EXISTS instead of scalar subqueries

-- Use EXISTS for boolean checks
SELECT user_id, 
    CASE 
        WHEN EXISTS(SELECT 1 FROM events WHERE user_id = users.user_id AND event_type = 'click') 
        THEN 'has_clicks' 
        ELSE 'no_clicks' 
    END AS click_status
FROM users;

Using window functions

-- Use window functions instead of scalar subqueries
SELECT 
    user_id,
    first_value(event_type) OVER (PARTITION BY user_id ORDER BY timestamp DESC) AS latest_event_type,
    max(timestamp) OVER (PARTITION BY user_id) AS last_event_time
FROM events;

Using JOINs instead of subqueries

-- Use JOIN instead of scalar subquery
SELECT 
    u.user_id,
    e.event_type AS latest_event_type,
    e.timestamp AS last_event_time
FROM users u
LEFT JOIN (
    SELECT user_id, event_type, timestamp
    FROM events
    WHERE (user_id, timestamp) IN (
        SELECT user_id, max(timestamp)
        FROM events
        GROUP BY user_id
    )
) e ON u.user_id = e.user_id;

Tinybird-specific notes

In Tinybird, this error commonly occurs when:

  • Pipe transformations: Scalar subqueries in pipe SQL
  • Endpoint queries: Subqueries in endpoint definitions
  • Data source queries: Subqueries in data source SQL
  • Aggregate queries: Complex subquery logic

Debugging in Tinybird

  1. Check subquery logic: Verify subqueries return single values
  2. Review pipe SQL: Ensure subqueries use proper aggregation
  3. Test with tb sql: Use CLI to test subquery behavior
  4. Validate subquery results: Check if subqueries return expected single values
  5. Use Query Builder: Test subqueries interactively

Common Tinybird scenarios

-- In pipe definitions, use proper scalar subqueries
NODE user_stats
SQL >
    SELECT 
        user_id,
        (SELECT max(timestamp) FROM your_data_source WHERE user_id = e.user_id) AS last_event_time
    FROM your_data_source e
    GROUP BY user_id;

-- In endpoint queries, ensure single value results
SELECT 
    user_id,
    (SELECT any(event_type) FROM your_pipe WHERE user_id = users.user_id) AS user_event_type
FROM users;

Subquery best practices

-- Use aggregate functions for scalar subqueries
NODE processed_data
SQL >
    SELECT 
        user_id,
        (SELECT count() FROM your_data_source WHERE user_id = e.user_id) AS event_count,
        (SELECT max(timestamp) FROM your_data_source WHERE user_id = e.user_id) AS last_event_time
    FROM your_data_source e
    GROUP BY user_id;

Best practices

  1. Always use aggregate functions in scalar subqueries
  2. Use LIMIT 1 when you need a specific row
  3. Handle empty results with COALESCE or default values
  4. Test subquery logic with simple examples first
  5. Use EXISTS for boolean checks instead of scalar subqueries
  6. Consider JOINs as alternatives to complex subqueries
  7. Document subquery behavior for your team
  8. Use Query Builder to test subquery logic interactively

See also