QUERY_WAS_CANCELLED ClickHouse error¶
This error occurs when a query is cancelled by the system, user, or external factors. It's common with long-running queries, system maintenance, or resource constraints.
The QUERY_WAS_CANCELLED error in ClickHouse (and Tinybird) happens when a query execution is terminated before completion. This can occur due to various reasons including user cancellation, system timeouts, resource constraints, maintenance operations, or external system interventions.
What causes this error¶
You'll typically see it when:
- Query execution time exceeds timeout limits
- User manually cancels a running query
- System resources become unavailable
- Maintenance operations interrupt query execution
- Network connectivity issues occur
- System overload forces query termination
- External monitoring systems kill long-running queries
- Resource limits are exceeded during execution
Query cancellation is often a safety mechanism to prevent system overload or resource exhaustion.
Example errors¶
SELECT COUNT(DISTINCT user_id) FROM events
WHERE timestamp >= '2020-01-01'
-- Error: Query was cancelled (timeout)
-- User cancels a long-running query
SELECT * FROM large_table ORDER BY timestamp
-- Error: Query was cancelled by user
SELECT
user_id,
COUNT(*) as event_count,
AVG(value) as avg_value
FROM events
GROUP BY user_id
-- Error: Query was cancelled (resource limit)
SELECT * FROM events WHERE timestamp > '2024-01-01'
-- Error: Query was cancelled (maintenance)
How to fix it¶
Check query timeout settings¶
Verify current timeout configurations:
SELECT
name,
value,
description
FROM system.settings
WHERE name LIKE '%timeout%' OR name LIKE '%execution%'
Monitor system resources¶
Check system resource availability:
SELECT
metric,
value
FROM system.metrics
WHERE metric IN ('Memory', 'CPU', 'DiskSpace')
Check for maintenance¶
Look for active maintenance operations:
SELECT
query_id,
user,
query,
state,
start_time
FROM system.processes
WHERE state = 'Maintenance'
Review query complexity¶
Simplify complex queries that might timeout:
-- Instead of complex aggregation
SELECT
user_id,
COUNT(*) as event_count
FROM events
WHERE timestamp >= '2024-01-01'
GROUP BY user_id
HAVING event_count > 100
-- Use simpler approach with LIMIT
SELECT
user_id,
COUNT(*) as event_count
FROM events
WHERE timestamp >= '2024-01-01'
GROUP BY user_id
ORDER BY event_count DESC
LIMIT 1000
Common patterns and solutions¶
Query timeout management¶
Set appropriate timeouts for different query types:
-- Set longer timeout for complex queries
SET max_execution_time = 600; -- 10 minutes
-- Set memory limits
SET max_memory_usage = 10000000000; -- 10GB
-- Execute query with custom settings
SELECT * FROM large_table
WHERE timestamp >= '2024-01-01'
SETTINGS max_execution_time = 600
Resource optimization¶
Optimize queries to use fewer resources:
-- Use streaming approach for large datasets
SELECT
user_id,
COUNT(*) as event_count
FROM events
WHERE timestamp >= '2024-01-01'
GROUP BY user_id
SETTINGS
max_bytes_before_external_group_by = 1000000000,
max_bytes_before_external_sort = 1000000000
Incremental processing¶
Break down large operations into smaller parts:
-- Process data in smaller chunks
SELECT COUNT(*) FROM events
WHERE timestamp >= '2024-01-01'
AND timestamp < '2024-01-02'
SELECT COUNT(*) FROM events
WHERE timestamp >= '2024-01-02'
AND timestamp < '2024-01-03'
Query monitoring¶
Monitor query execution to prevent cancellation:
SELECT
query_id,
user,
query,
state,
start_time,
elapsed
FROM system.processes
WHERE state = 'Running'
ORDER BY elapsed DESC
Tinybird-specific notes¶
In Tinybird, QUERY_WAS_CANCELLED errors often occur when:
- API endpoint timeouts are exceeded
- Materialized View processing is interrupted
- Pipe transformations take too long
- System maintenance affects query execution
- Resource limits are reached
To debug in Tinybird:
- Check API endpoint timeout settings
- Monitor Materialized View processing
- Review Pipe transformation complexity
- Check system status for maintenance
In Tinybird, use Materialized Views to pre-compute complex aggregations and avoid long-running queries.
Best practices¶
Query design¶
- Break down complex queries into simpler parts
- Use appropriate WHERE clauses to limit data
- Implement proper indexing strategies
- Use Materialized Views for complex aggregations
Timeout management¶
- Set appropriate timeouts for different query types
- Monitor query execution times
- Implement retry logic with exponential backoff
- Use query queuing for long-running operations
Resource management¶
- Monitor system resource usage
- Implement proper connection pooling
- Use streaming queries for large datasets
- Set appropriate memory and CPU limits
Configuration options¶
Timeout settings¶
-- Set various timeout parameters
SET max_execution_time = 300; -- 5 minutes
SET max_query_size = 1000000000; -- 1GB
SET max_memory_usage = 8000000000; -- 8GB
SET max_bytes_before_external_group_by = 1000000000; -- 1GB
Resource limits¶
-- Set resource constraints
SET max_memory_usage = 10000000000; -- 10GB
SET max_bytes_before_external_group_by = 2000000000; -- 2GB
SET max_bytes_before_external_sort = 2000000000; -- 2GB
Query settings¶
-- Use settings for specific queries
SELECT * FROM large_table
SETTINGS
max_execution_time = 600,
max_memory_usage = 15000000000,
max_bytes_before_external_group_by = 3000000000
Alternative solutions¶
Use Materialized Views¶
Pre-compute complex aggregations:
-- Create Materialized View for complex aggregation
CREATE MATERIALIZED VIEW user_event_summary
ENGINE = SummingMergeTree()
ORDER BY user_id
AS SELECT
user_id,
COUNT(*) as event_count,
COUNT(DISTINCT event_type) as unique_events
FROM events
GROUP BY user_id
-- Query the Materialized View instead
SELECT * FROM user_event_summary
WHERE event_count > 100
Implement retry logic¶
Add retry mechanisms to your application:
-- In your application, implement retry logic
-- Example pseudo-code:
--
-- max_retries = 3
-- base_delay = 1 second
--
-- for attempt in range(max_retries):
-- try:
-- result = execute_query(query)
-- break
-- except QueryWasCancelled:
-- if attempt < max_retries - 1:
-- delay = base_delay * (2 ** attempt)
-- time.sleep(delay)
-- else:
-- raise
Query queuing¶
Implement query queuing for long-running operations:
-- In your application, implement query queuing
-- Example pseudo-code:
--
-- query_queue = Queue()
--
-- def execute_with_queue(query):
-- query_queue.put(query)
-- while not query_queue.empty():
-- current_query = query_queue.get()
-- try:
-- result = execute_query(current_query)
-- return result
-- except QueryWasCancelled:
-- # Re-queue with exponential backoff
-- time.sleep(backoff_delay)
-- query_queue.put(current_query)
Monitoring and prevention¶
Query performance tracking¶
-- Track query performance patterns
SELECT
user,
COUNT(*) as total_queries,
COUNT(CASE WHEN state = 'Cancelled' THEN 1 END) as cancelled_queries,
AVG(elapsed) as avg_duration
FROM system.processes
WHERE start_time >= now() - INTERVAL 1 HOUR
GROUP BY user
Cancellation analysis¶
-- Analyze cancellation patterns
SELECT
user,
query,
start_time,
elapsed,
state
FROM system.processes
WHERE state = 'Cancelled'
AND start_time >= now() - INTERVAL 24 HOUR
ORDER BY start_time DESC
Resource monitoring¶
-- Monitor system resources
SELECT
metric,
value,
description
FROM system.metrics
WHERE metric IN (
'Memory',
'CPU',
'DiskSpace',
'NetworkReceive',
'NetworkSend'
)