NETWORK_ERROR ClickHouse error¶
The NETWORK_ERROR error in ClickHouse (and Tinybird) occurs when the client application cannot establish or maintain a network connection to the ClickHouse server. Common causes include connection timeouts, network interruptions, server unavailability, firewall rules, and DNS resolution problems.
What causes this error¶
You'll typically see it when:
- Network connection is interrupted during query execution
- Server is unreachable due to network issues
- Connection timeout occurs
- Firewall blocks the connection
- DNS resolution fails
- Server is down or restarting
- Network latency exceeds timeout limits
- Connection pool is exhausted
- SSL/TLS handshake fails
Network errors are often transient. Implement retry logic with exponential backoff for better reliability.
Example errors¶
Fails: connection timeout
SELECT * FROM events WHERE timestamp > '2024-01-01'
-- Error: Network error: Connection timeout
Fails: server unreachable
INSERT INTO events (user_id, event_type, timestamp)
VALUES (123, 'click', '2024-01-01 10:00:00')
-- Error: Network error: Connection refused
Fails: DNS resolution failure
-- When trying to connect to a hostname that can't be resolved
SELECT COUNT(*) FROM events
-- Error: Network error: Name or service not known
Fails: SSL handshake failure
-- When the SSL/TLS connection fails
SELECT * FROM users LIMIT 10
-- Error: Network error: SSL handshake failed
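The four example messages above each point to a different troubleshooting path. As a minimal sketch, a client could map message fragments to a first hint. The `LIKELY_CAUSES` table and the `likely_cause` helper are hypothetical names, and the fragments are taken only from the examples above; real messages may be worded differently depending on the client and driver.

```python
# Hypothetical lookup table: fragments come from the example messages above.
LIKELY_CAUSES = {
    "connection timeout": "server slow or unreachable; check latency and timeout settings",
    "connection refused": "nothing listening on that host/port; check server status and firewall",
    "name or service not known": "DNS resolution failed; check the hostname",
    "ssl handshake failed": "TLS problem; check the port, certificates, and protocol versions",
}


def likely_cause(error_message: str) -> str:
    """Return a troubleshooting hint for a network error message."""
    lowered = error_message.lower()
    for fragment, hint in LIKELY_CAUSES.items():
        if fragment in lowered:
            return hint
    return "unknown network error; start with basic connectivity checks"


print(likely_cause("Network error: Connection refused"))
```

Matching on substrings is fragile, so treat the result as a starting point for the checks in the next section rather than a diagnosis.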
How to fix it¶
Check network connectivity¶
Verify basic network connectivity:
Check network connectivity
# Test basic TCP connectivity from your client (Python example)
import socket

try:
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(5)
    # connect_ex returns 0 on success and an error code otherwise
    result = sock.connect_ex(('your-host', 9000))
    sock.close()
    if result == 0:
        print("Port is open")
    else:
        print("Port is closed or unreachable")
except OSError as e:
    # Raised, for example, when the hostname cannot be resolved
    print(f"Connection failed: {e}")
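Because DNS resolution failure is one of the listed causes, it can help to test name resolution separately from the TCP check. The sketch below uses only the standard library; `can_resolve` is a hypothetical helper name, and `your-host` is a placeholder as elsewhere on this page.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if the hostname resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(hostname, None)) > 0
    except (socket.gaierror, UnicodeError):
        # gaierror: the resolver could not find the name.
        # UnicodeError: the name is not a valid hostname (e.g. a label is too long).
        return False


if not can_resolve("your-host"):
    print("DNS resolution failed: check the hostname and your resolver settings")
```

If the name resolves but the TCP check still fails, the problem is more likely a firewall rule or a server that is not listening.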
Verify server status¶
Check if the ClickHouse server is running:
Check server status
# From another client or server, test the connection
ping your-clickhouse-host
telnet your-clickhouse-host 9000
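If the server's HTTP interface is exposed, its `/ping` endpoint offers another liveness check. This sketch assumes the default HTTP port 8123 and a plain-HTTP listener; both may differ in your deployment, and managed services may not expose this endpoint at all. `clickhouse_http_ping` is a hypothetical helper name.

```python
import http.client


def clickhouse_http_ping(host: str, port: int = 8123, timeout: float = 5.0) -> bool:
    """Return True if GET /ping answers with HTTP 200 and the body 'Ok.'."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/ping")
        response = conn.getresponse()
        body = response.read().decode().strip()
        return response.status == 200 and body == "Ok."
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, DNS failures, and malformed replies.
        return False
    finally:
        conn.close()
```

A `False` result only says that the HTTP interface did not answer as expected; the native port 9000 should still be checked separately.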
Check firewall settings¶
Verify firewall configurations:
Firewall check
# List the firewall rules and confirm that port 9000 is allowed

# Example for Linux:
sudo ufw status
sudo iptables -L

# Example for Windows:
netsh advfirewall firewall show rule name=all
Review connection settings¶
Check client connection configurations:
Connection settings
# In your client application, verify the connection settings
# Example for Python clickhouse-driver:
from clickhouse_driver import Client

client = Client(
    host='your-host',
    port=9000,
    database='your_database',
    # Timeouts are connection arguments, not entries in the settings dict
    connect_timeout=10,         # 10 seconds
    send_receive_timeout=300,   # 5 minutes
    sync_request_timeout=300,   # 5 minutes
)
Common patterns and solutions¶
Connection retry logic¶
Implement retry mechanisms for network errors:
Retry logic
# In your application, implement retry logic with exponential backoff
import time

from clickhouse_driver import Client
from clickhouse_driver.errors import NetworkError

client = Client(host='your-host')

def execute_with_retry(query, max_retries=3, base_delay=1):
    for attempt in range(max_retries):
        try:
            return client.execute(query)
        except NetworkError:
            if attempt == max_retries - 1:
                raise  # Out of attempts: surface the error
            # Wait 1s, 2s, 4s, ... before the next attempt
            time.sleep(base_delay * (2 ** attempt))
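When many clients retry on the same schedule, their retries can arrive in synchronized bursts. A common refinement is to randomize each delay ("full jitter") and cap it. The sketch below only computes the delays; `backoff_delays` and its parameters are hypothetical names, not part of any driver.

```python
import random
from typing import Iterator, Optional


def backoff_delays(max_retries: int = 3, base_delay: float = 1.0,
                   max_delay: float = 30.0,
                   rng: Optional[random.Random] = None) -> Iterator[float]:
    """Yield one randomized delay (in seconds) per retry attempt."""
    rng = rng or random.Random()
    for attempt in range(max_retries):
        # The upper bound doubles on each attempt, up to max_delay.
        cap = min(max_delay, base_delay * (2 ** attempt))
        # Full jitter: pick uniformly between 0 and the current bound.
        yield rng.uniform(0, cap)


for delay in backoff_delays(max_retries=4):
    print(f"would sleep for {delay:.2f}s")
```

In a retry loop, each yielded value would replace the fixed `base_delay * (2 ** attempt)` sleep.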
Connection pooling¶
Use connection pooling to manage connections:
Connection pooling
# Implement connection pooling in your application (simplified sketch)
import threading

from clickhouse_driver import Client

class ConnectionPool:
    def __init__(self, host, max_connections=10):
        self.host = host
        self.max_connections = max_connections  # Maximum idle connections kept
        self.connections = []
        self.lock = threading.Lock()

    def _create_connection(self):
        return Client(host=self.host)

    def get_connection(self):
        with self.lock:
            if self.connections:
                return self.connections.pop()
        return self._create_connection()

    def return_connection(self, conn):
        with self.lock:
            if len(self.connections) < self.max_connections:
                self.connections.append(conn)
                return
        conn.disconnect()  # Pool is full: close the extra connection
Health checking¶
Implement connection health monitoring:
Health checking
# Add health checks to your connection management
def check_connection_health(connection):
    try:
        connection.execute("SELECT 1")  # Simple health check query
        return True
    except Exception:
        return False

def get_healthy_connection(connection_pool, create_new_connection):
    # connection_pool: iterable of open clients
    # create_new_connection: factory used when no pooled client is healthy
    for conn in connection_pool:
        if check_connection_health(conn):
            return conn
    return create_new_connection()
Timeout management¶
Set appropriate timeout values:
Timeout management
# Configure appropriate timeouts for different scenarios
# Example for Python clickhouse-driver:
from clickhouse_driver import Client

client = Client(
    host='your-host',
    port=9000,
    connect_timeout=10,        # Connection timeout (seconds)
    send_receive_timeout=300,  # Timeout for sending and receiving data
    sync_request_timeout=300,  # Timeout for the server ping
    tcp_keepalive=True,        # TCP keepalive with system defaults
)
Tinybird-specific notes¶
In Tinybird, NETWORK_ERROR errors often occur when:
- API endpoints are unreachable
- Network issues between client and Tinybird
- Rate limiting causes connection drops
- Workspace maintenance affects connectivity
- API token authentication fails
To debug in Tinybird:
- Check your network connectivity to Tinybird
- Verify API token validity
- Check Tinybird status page for service issues
- Review rate limiting and API usage
In Tinybird, use the status page to check for known service issues before troubleshooting network problems.
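When the failing call is an HTTP API request, the response status code (if one arrives at all) helps separate the cases above. The sketch below relies on standard HTTP status semantics; how they map onto the Tinybird-specific causes is an assumption, and `diagnose_http_status` is a hypothetical helper name.

```python
def diagnose_http_status(status: int) -> str:
    """Map an HTTP status code to a first troubleshooting step."""
    if status in (401, 403):
        return "authentication: check that the API token is valid and has the required scope"
    if status == 429:
        return "rate limiting: slow down and retry after a delay"
    if 500 <= status < 600:
        return "service issue: check the status page before debugging locally"
    if 200 <= status < 300:
        return "ok: the endpoint is reachable"
    return "other: inspect the response body for details"


print(diagnose_http_status(429))
```

If no status code is received at all, the request never completed, which points back to connectivity, DNS, or TLS rather than to the token or rate limits.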
Best practices¶
Network resilience¶
- Implement retry logic with exponential backoff
- Use connection pooling to manage connections
- Set appropriate timeout values
- Monitor network performance metrics
Error handling¶
- Handle network errors gracefully
- Implement circuit breaker patterns
- Log network issues for debugging
- Provide user feedback for connection problems
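One way to combine graceful handling, logging, and user feedback is to wrap each database call and return a small result object instead of letting the exception escape. This is a sketch under stated assumptions: `QueryOutcome` and `run_gracefully` are hypothetical names, and only Python's built-in `ConnectionError` and `TimeoutError` are caught. With a specific driver you would add its own network exception class to the `except` clause.

```python
import logging
from dataclasses import dataclass
from typing import Any, Callable, Optional

logger = logging.getLogger("clickhouse.network")


@dataclass
class QueryOutcome:
    ok: bool
    value: Any = None
    user_message: Optional[str] = None


def run_gracefully(operation: Callable[[], Any]) -> QueryOutcome:
    """Run a database call; on a network failure, log it and return a friendly message."""
    try:
        return QueryOutcome(ok=True, value=operation())
    except (ConnectionError, TimeoutError) as exc:
        # Keep the technical detail in the log, not in the user-facing text.
        logger.warning("Network failure: %s: %s", type(exc).__name__, exc)
        return QueryOutcome(
            ok=False,
            user_message="The database is temporarily unreachable. Please try again shortly.",
        )
```

A caller checks `outcome.ok` and shows `outcome.user_message` when it is `False`; non-network exceptions still propagate unchanged.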
Monitoring¶
- Monitor connection success rates
- Track network latency and timeouts
- Alert on connection failures
- Monitor server availability
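For alerting, a success rate over the most recent attempts is often more useful than a lifetime average, because old successes can mask a current outage. A minimal sketch with a fixed-size window follows; `RollingSuccessRate` is a hypothetical class and the window size of 100 is an arbitrary choice.

```python
from collections import deque


class RollingSuccessRate:
    """Track the connection success rate over the last `window` attempts."""

    def __init__(self, window: int = 100):
        # A deque with maxlen discards the oldest outcome automatically.
        self._outcomes = deque(maxlen=window)

    def record(self, success: bool) -> None:
        self._outcomes.append(success)

    def rate(self) -> float:
        if not self._outcomes:
            return 1.0  # No data yet: report healthy rather than alerting
        return sum(self._outcomes) / len(self._outcomes)


tracker = RollingSuccessRate(window=4)
for outcome in (True, True, False, True):
    tracker.record(outcome)
print(tracker.rate())
```

Returning 1.0 for an empty window is a design choice that avoids false alerts at startup; return `None` instead if you need to tell "no data" apart from "healthy".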
Configuration options¶
Network settings¶
Network configuration
-- Check current network settings
SELECT name, value, description
FROM system.settings
WHERE name LIKE '%timeout%' OR name LIKE '%network%'
Connection settings¶
Connection configuration
# Configure connection parameters
# Example for Python clickhouse-driver:
from clickhouse_driver import Client

client = Client(
    host='your-host',
    port=9000,
    database='your_database',
    user='your_user',
    password='your_password',
    # Timeouts and keepalive are connection arguments, not settings entries
    connect_timeout=10,
    send_receive_timeout=300,
    sync_request_timeout=300,
    tcp_keepalive=True,
)
SSL/TLS configuration¶
SSL configuration
# Configure SSL/TLS for secure connections
# Example for Python clickhouse-driver:
from clickhouse_driver import Client

client = Client(
    host='your-host',
    port=9440,      # Native protocol port with TLS
    database='your_database',
    user='your_user',
    password='your_password',
    secure=True,
    verify=False,   # Skips certificate verification; set to True for production
)
Alternative solutions¶
Use connection proxies¶
Implement connection proxying:
Connection proxy
# Use a connection proxy for better reliability
from clickhouse_driver import Client
from clickhouse_driver.errors import NetworkError

class ConnectionProxy:
    def __init__(self, primary_host, backup_hosts):
        self.primary_host = primary_host
        self.backup_hosts = backup_hosts
        self.current_host = primary_host

    def get_connection(self):
        try:
            return self._connect()
        except NetworkError:
            self._switch_to_backup()
            return self._connect()

    def _connect(self):
        client = Client(host=self.current_host)
        # The client connects lazily, so run a query to surface network errors
        client.execute("SELECT 1")
        return client

    def _switch_to_backup(self):
        if self.current_host == self.primary_host:
            self.current_host = self.backup_hosts[0]
        else:
            # Try the next backup host, wrapping around at the end
            current_index = self.backup_hosts.index(self.current_host)
            next_index = (current_index + 1) % len(self.backup_hosts)
            self.current_host = self.backup_hosts[next_index]
Implement circuit breaker¶
Add a circuit breaker so that repeated failures stop further calls for a recovery period:
Circuit breaker
# Implement a circuit breaker for network operations
import time

from clickhouse_driver.errors import NetworkError

class CircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_timeout=60):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout  # Seconds to stay OPEN
        self.failure_count = 0
        self.last_failure_time = 0
        self.state = 'CLOSED'  # CLOSED, OPEN, HALF_OPEN

    def call(self, func, *args, **kwargs):
        if self.state == 'OPEN':
            if time.time() - self.last_failure_time > self.recovery_timeout:
                self.state = 'HALF_OPEN'  # Allow one trial call
            else:
                raise RuntimeError("Circuit breaker is OPEN")

        try:
            result = func(*args, **kwargs)
            self._on_success()
            return result
        except NetworkError:
            self._on_failure()
            raise

    def _on_success(self):
        self.failure_count = 0
        self.state = 'CLOSED'

    def _on_failure(self):
        self.failure_count += 1
        self.last_failure_time = time.time()

        if self.failure_count >= self.failure_threshold:
            self.state = 'OPEN'
Asynchronous connections¶
Use async patterns for better performance:
Async connections
# Use async/await patterns for network operations
import asyncio

async def execute_query_async(execute_query, query):
    # execute_query is a blocking function supplied by your application;
    # it runs in the default thread pool so the event loop stays free
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, execute_query, query)

async def main(execute_query, queries):
    tasks = [
        asyncio.create_task(execute_query_async(execute_query, query))
        for query in queries
    ]
    # Failures are returned as exception objects alongside successful results
    return await asyncio.gather(*tasks, return_exceptions=True)
Monitoring and prevention¶
Network performance tracking¶
Performance monitoring
# Track network performance metrics
import time

from clickhouse_driver.errors import NetworkError

def track_network_performance(execute_query, log_metric, operation):
    # execute_query and log_metric are callables supplied by your application
    start_time = time.monotonic()
    try:
        result = execute_query(operation)
        log_metric('network_success', time.monotonic() - start_time)
        return result
    except NetworkError:
        log_metric('network_error', time.monotonic() - start_time)
        raise
Connection monitoring¶
Connection monitoring
# Monitor connection health and performance
class ConnectionMonitor:
    def __init__(self):
        self.connection_attempts = 0
        self.connection_successes = 0
        self.connection_failures = 0
        self.avg_connection_time = 0

    def track_connection_attempt(self, success, duration):
        self.connection_attempts += 1
        if success:
            self.connection_successes += 1
        else:
            self.connection_failures += 1

        # Update the running average connection time
        self.avg_connection_time = (
            (self.avg_connection_time * (self.connection_attempts - 1) + duration)
            / self.connection_attempts
        )

    def get_success_rate(self):
        if self.connection_attempts == 0:
            return 0
        return self.connection_successes / self.connection_attempts
Alerting¶
Network alerting
# Set up alerts for network issues
def check_network_health(connection_monitor, send_alert):
    # send_alert is a callable supplied by your application
    if connection_monitor.connection_attempts == 0:
        return  # No data yet, so there is nothing to alert on

    success_rate = connection_monitor.get_success_rate()
    if success_rate < 0.95:  # 95% success rate threshold
        send_alert(f"Network health degraded: {success_rate:.2%} success rate")

    if connection_monitor.avg_connection_time > 5:  # 5 second threshold
        send_alert(f"Network latency high: {connection_monitor.avg_connection_time:.2f}s")