NETWORK_ERROR ClickHouse error

A NETWORK_ERROR in ClickHouse (and Tinybird) means the client application could not reach the ClickHouse server or lost its connection to it. Common causes include connection timeouts, network interruptions, server unavailability, firewall rules, and DNS resolution problems.

What causes this error

You'll typically see it when:

  • Network connection is interrupted during query execution
  • Server is unreachable due to network issues
  • Connection timeout occurs
  • Firewall blocks the connection
  • DNS resolution fails
  • Server is down or restarting
  • Network latency exceeds timeout limits
  • Connection pool is exhausted
  • SSL/TLS handshake fails

Network errors are often transient. Implement retry logic with exponential backoff for better reliability.

Example errors

Fails: connection timeout
SELECT * FROM events WHERE timestamp > '2024-01-01'
-- Error: Network error: Connection timeout

Fails: server unreachable
INSERT INTO events (user_id, event_type, timestamp) VALUES
(123, 'click', '2024-01-01 10:00:00')
-- Error: Network error: Connection refused

Fails: DNS resolution failure
-- When trying to connect to a hostname that can't be resolved
SELECT COUNT(*) FROM events
-- Error: Network error: Name or service not known

Fails: SSL handshake failure
-- When the SSL/TLS connection fails
SELECT * FROM users LIMIT 10
-- Error: Network error: SSL handshake failed
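
The four messages above correspond fairly closely to distinct Python exception types when a connection is opened with the standard library. The sketch below shows one way to turn such an exception into a likely cause. The function name `classify_network_error` and the category labels are illustrative assumptions, not part of any ClickHouse client.

```python
import socket
import ssl


def classify_network_error(exc: BaseException) -> str:
    """Map a low-level connection exception to a likely cause (illustrative labels)."""
    # Order matters: ssl.SSLError and socket.gaierror are OSError subclasses too.
    if isinstance(exc, ssl.SSLError):
        return "ssl_handshake"
    if isinstance(exc, socket.gaierror):
        return "dns_resolution"
    if isinstance(exc, ConnectionRefusedError):
        return "connection_refused"
    # socket.timeout is an alias of TimeoutError only from Python 3.10 onward.
    if isinstance(exc, (TimeoutError, socket.timeout)):
        return "timeout"
    if isinstance(exc, OSError):
        return "other_network"
    return "not_network"
```

A label such as `dns_resolution` points at the hostname or resolver, while `connection_refused` usually means the host was reached but nothing is listening on that port.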

How to fix it

Check network connectivity

Verify basic network connectivity:

Check network connectivity
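# Test basic TCP connectivity from the client machine (Python)
import socket

host, port = 'your-host', 9000  # 9000 is the default native-protocol port

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.settimeout(5)
try:
    # connect_ex returns 0 on success and an errno value otherwise
    result = sock.connect_ex((host, port))
    if result == 0:
        print("Port is open")
    else:
        print(f"Port is closed or unreachable (errno {result})")
except OSError as e:
    # Raised for problems such as a hostname that cannot be resolved
    print(f"Connection failed: {e}")
finally:
    sock.close()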

Verify server status

Check if the ClickHouse server is running:

Check server status
# From another client or server, test the connection
ping your-clickhouse-host
telnet your-clickhouse-host 9000
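
If the server exposes ClickHouse's HTTP interface (port 8123 by default), its `/ping` endpoint answers `Ok.` when the server is up, which confirms more than an open port does. The following is a minimal sketch using only the standard library. The function name `clickhouse_http_ping` is an assumption for illustration, and the URL in the usage note is a placeholder.

```python
import http.client
import urllib.error
import urllib.request


def clickhouse_http_ping(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET <base_url>/ping answers 200 with the body "Ok."."""
    url = f"{base_url.rstrip('/')}/ping"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and resp.read().decode().strip() == "Ok."
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # DNS failure, refused connection, timeout, or an HTTP error status
        return False
```

For example, `clickhouse_http_ping("http://your-clickhouse-host:8123")` returning `False` while `ping` succeeds suggests the host is reachable but the server process or port is not.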

Check firewall settings

Verify firewall configurations:

Firewall check
# Check whether port 9000 is open
# Linux:
sudo ufw status
sudo iptables -L

# Windows:
netsh advfirewall firewall show rule name=all

Review connection settings

Check client connection configurations:

Connection settings
# Python clickhouse-driver: timeouts are Client keyword arguments, in seconds
from clickhouse_driver import Client

client = Client(
    host='your-host',
    port=9000,
    database='your_database',
    connect_timeout=10,        # 10 seconds to establish the connection
    send_receive_timeout=300,  # 5 minutes to send or receive data
    sync_request_timeout=300,  # 5 minutes for the server ping
)

Common patterns and solutions

Connection retry logic

Implement retry mechanisms for network errors:

Retry logic
# Retry on network errors with exponential backoff (Python clickhouse-driver)
import time

from clickhouse_driver import Client
from clickhouse_driver.errors import NetworkError

client = Client(host='your-host')

def execute_with_retry(query, max_retries=3, base_delay=1):
    for attempt in range(max_retries):
        try:
            return client.execute(query)
        except NetworkError:
            if attempt == max_retries - 1:
                raise  # Out of attempts: surface the error
            # Wait 1s, 2s, 4s, ... before the next attempt
            time.sleep(base_delay * (2 ** attempt))
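
When many clients retry on the same schedule, their retries can arrive at the server together. A common refinement is to cap the delay and randomize it ("full jitter"). The sketch below is a generic version of that idea and is not tied to any ClickHouse client. The names `backoff_delay` and `retry_call`, and the default exception types, are assumptions for illustration.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base_delay: float = 1.0, max_delay: float = 30.0) -> float:
    """Full-jitter delay: uniform in [0, min(max_delay, base_delay * 2**attempt)]."""
    return random.uniform(0, min(max_delay, base_delay * (2 ** attempt)))


def retry_call(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    max_retries: int = 3,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on the given exception types with jittered backoff."""
    for attempt in range(max_retries):
        try:
            return func()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # Out of attempts: surface the last error
            sleep(backoff_delay(attempt, base_delay, max_delay))
    raise ValueError("max_retries must be at least 1")
```

With clickhouse-driver, this could be called as `retry_call(lambda: client.execute(query), retry_on=(NetworkError,))`, where `client` and `query` come from your application. The `sleep` parameter exists so the delay can be replaced in tests.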

Connection pooling

Use connection pooling to manage connections:

Connection pooling
# A minimal thread-safe pool that reuses idle connections
import threading

from clickhouse_driver import Client

class ConnectionPool:
    def __init__(self, max_connections=10):
        self.max_connections = max_connections  # Cap on idle connections kept
        self.connections = []
        self.lock = threading.Lock()

    def _create_connection(self):
        # Placeholder host; adjust to your deployment
        return Client(host='your-host')

    def get_connection(self):
        with self.lock:
            if self.connections:
                return self.connections.pop()
        return self._create_connection()

    def return_connection(self, conn):
        with self.lock:
            if len(self.connections) < self.max_connections:
                self.connections.append(conn)
                return
        conn.disconnect()  # Pool is full: close the surplus connection
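
The pool above limits how many idle connections are kept, but not how many are open at once. If the goal is to avoid exhausting server-side connection limits, one alternative is a pool that blocks (with a timeout) once every slot is in use. The sketch below is one such design. `BoundedPool` and its method names are hypothetical, and `factory` stands for whatever function creates a connection in your application.

```python
import queue
from typing import Any, Callable


class BoundedPool:
    """Pool that never hands out more than max_connections connections."""

    def __init__(self, factory: Callable[[], Any], max_connections: int = 10):
        self._factory = factory
        # LIFO so that a just-released connection is reused before an empty slot.
        self._slots: "queue.LifoQueue[Any]" = queue.LifoQueue(maxsize=max_connections)
        for _ in range(max_connections):
            self._slots.put(None)  # None marks a slot with no connection yet

    def acquire(self, timeout: float = 5.0) -> Any:
        try:
            conn = self._slots.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError("connection pool exhausted") from None
        if conn is None:
            try:
                conn = self._factory()  # Create lazily, on first use of the slot
            except Exception:
                self._slots.put(None)  # Give the slot back if creation fails
                raise
        return conn

    def release(self, conn: Any) -> None:
        """Return a healthy connection to the pool."""
        self._slots.put(conn)

    def discard(self) -> None:
        """Free the slot of a broken connection without reusing it."""
        self._slots.put(None)
```

Callers would wrap `acquire()` and `release()` in `try`/`finally`, and call `discard()` instead of `release()` after a network error so the next caller gets a fresh connection.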

Health checking

Implement connection health monitoring:

Health checking
# Add health checks to your connection management
# connection_pool and create_new_connection are placeholders for your own code
def check_connection_health(connection):
    try:
        # Simple health check query
        connection.execute("SELECT 1")
        return True
    except Exception:
        return False

def get_healthy_connection():
    for conn in connection_pool:
        if check_connection_health(conn):
            return conn
    return create_new_connection()

Timeout management

Set appropriate timeout values:

Timeout management
# Python clickhouse-driver: timeouts are Client keyword arguments, in seconds
from clickhouse_driver import Client

client = Client(
    host='your-host',
    port=9000,
    connect_timeout=10,        # Connection timeout
    send_receive_timeout=300,  # Timeout for sending and receiving query data
    sync_request_timeout=300,  # Timeout for the server ping
    tcp_keepalive=True,        # TCP keep-alive using the system's settings
)

Tinybird-specific notes

In Tinybird, NETWORK_ERROR often occurs when:

  • API endpoints are unreachable
  • Network issues between client and Tinybird
  • Rate limiting causes connection drops
  • Workspace maintenance affects connectivity
  • API token authentication fails

To debug in Tinybird:

  1. Check your network connectivity to Tinybird
  2. Verify API token validity
  3. Check Tinybird status page for service issues
  4. Review rate limiting and API usage

In Tinybird, use the status page to check for known service issues before troubleshooting network problems.
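
Some of the conditions listed above, such as an invalid token or rate limiting, normally surface as HTTP status codes rather than as failed connections, so it helps to separate the two before debugging the network. The sketch below maps a status code to the first thing to check. It relies on standard HTTP semantics (401/403 for authentication, 429 for too many requests, 5xx for server errors). Whether Tinybird returns exactly these codes in each case is an assumption to confirm against its API documentation, and `triage_http_result` is a hypothetical helper name.

```python
from typing import Optional


def triage_http_result(status: Optional[int]) -> str:
    """Suggest what to check first, given an HTTP status (None means no response)."""
    if status is None:
        return "no response: check DNS, firewall, and connectivity"
    if status in (401, 403):
        return "authentication: verify the API token"
    if status == 429:
        return "rate limited: review API usage and back off"
    if 500 <= status <= 599:
        return "server side: check the status page"
    if 200 <= status <= 299:
        return "ok"
    return "client error: check the request"
```

Only the `None` case, where no HTTP response arrived at all, is a network problem in the sense of this page.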

Best practices

Network resilience

  • Implement retry logic with exponential backoff
  • Use connection pooling to manage connections
  • Set appropriate timeout values
  • Monitor network performance metrics

Error handling

  • Handle network errors gracefully
  • Implement circuit breaker patterns
  • Log network issues for debugging
  • Provide user feedback for connection problems

Monitoring

  • Monitor connection success rates
  • Track network latency and timeouts
  • Alert on connection failures
  • Monitor server availability

Configuration options

Network settings

Network configuration
-- Check current network settings
SELECT
    name,
    value,
    description
FROM system.settings
WHERE name LIKE '%timeout%' OR name LIKE '%network%'

Connection settings

Connection configuration
# Python clickhouse-driver: connection parameters are Client keyword arguments
from clickhouse_driver import Client

client = Client(
    host='your-host',
    port=9000,
    database='your_database',
    user='your_user',
    password='your_password',
    connect_timeout=10,
    send_receive_timeout=300,
    sync_request_timeout=300,
    tcp_keepalive=True,  # TCP keep-alive using the system's settings
)

SSL/TLS configuration

SSL configuration
# Configure SSL/TLS for secure connections (Python clickhouse-driver)
from clickhouse_driver import Client

client = Client(
    host='your-host',
    port=9440,  # Default secure native-protocol port
    database='your_database',
    user='your_user',
    password='your_password',
    secure=True,
    verify=False  # Set to True for production
)
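
To tell a TLS problem apart from a plain connectivity problem, it can help to attempt the handshake outside the ClickHouse client. The following sketch uses Python's standard `ssl` module with certificate verification enabled. The function name `check_tls` is an assumption for illustration, and 9440 is used as the default because it is the secure port in the example above.

```python
import socket
import ssl
from typing import Tuple


def check_tls(host: str, port: int = 9440, timeout: float = 5.0) -> Tuple[bool, str]:
    """Try a TCP connect followed by a TLS handshake; return (ok, detail)."""
    context = ssl.create_default_context()  # Verifies certificate and hostname
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, f"handshake ok, protocol {tls.version()}"
    except ssl.SSLError as exc:
        # Includes certificate verification failures
        return False, f"TLS failure: {exc}"
    except OSError as exc:
        # DNS failure, refused connection, or timeout before TLS started
        return False, f"connection failure: {exc}"
```

A result starting with `connection failure` means the problem lies before TLS (DNS, firewall, port), while `TLS failure` points at certificates or protocol settings.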

Alternative solutions

Use connection proxies

Implement connection proxying:

Connection proxy
# Fail over to backup hosts when the current host is unreachable
from clickhouse_driver import Client
from clickhouse_driver.errors import NetworkError

class ConnectionProxy:
    def __init__(self, primary_host, backup_hosts):
        self.primary_host = primary_host
        self.backup_hosts = backup_hosts
        self.current_host = primary_host

    def _connect(self):
        client = Client(host=self.current_host)
        # The client connects lazily, so run a query to force the connection
        client.execute("SELECT 1")
        return client

    def get_connection(self):
        try:
            return self._connect()
        except NetworkError:
            self._switch_to_backup()
            return self._connect()

    def _switch_to_backup(self):
        if self.current_host == self.primary_host:
            self.current_host = self.backup_hosts[0]
        else:
            # Try the next backup host, wrapping around at the end
            current_index = self.backup_hosts.index(self.current_host)
            next_index = (current_index + 1) % len(self.backup_hosts)
            self.current_host = self.backup_hosts[next_index]

Implement circuit breaker

Add circuit breaker pattern:

Circuit breaker
# Circuit breaker for network operations
import time

from clickhouse_driver.errors import NetworkError

class CircuitBreaker:
    def __init__(self, failure_threshold=5, recovery_timeout=60):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout  # Seconds to stay OPEN
        self.failure_count = 0
        self.last_failure_time = 0
        self.state = 'CLOSED'  # CLOSED, OPEN, HALF_OPEN

    def call(self, func, *args, **kwargs):
        if self.state == 'OPEN':
            if time.time() - self.last_failure_time > self.recovery_timeout:
                self.state = 'HALF_OPEN'  # Allow one trial call through
            else:
                raise Exception("Circuit breaker is OPEN")

        try:
            result = func(*args, **kwargs)
            self._on_success()
            return result
        except NetworkError:
            self._on_failure()
            raise

    def _on_success(self):
        self.failure_count = 0
        self.state = 'CLOSED'

    def _on_failure(self):
        self.failure_count += 1
        self.last_failure_time = time.time()

        if self.failure_count >= self.failure_threshold:
            self.state = 'OPEN'

Asynchronous connections

Use async patterns for better performance:

Async connections
# Run blocking queries in worker threads so they can overlap
# execute_query is a placeholder for your own blocking query function
import asyncio

async def execute_query_async(query):
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, execute_query, query)

async def main(queries):
    tasks = [asyncio.create_task(execute_query_async(q)) for q in queries]
    # return_exceptions=True returns failures as values instead of raising
    return await asyncio.gather(*tasks, return_exceptions=True)

Monitoring and prevention

Network performance tracking

Performance monitoring
# Time each query and record the outcome
# execute_query and log_metric are placeholders for your own functions
import time

from clickhouse_driver.errors import NetworkError

def track_network_performance(operation):
    start_time = time.monotonic()
    try:
        result = execute_query(operation)
    except NetworkError:
        log_metric('network_error', time.monotonic() - start_time)
        raise
    log_metric('network_success', time.monotonic() - start_time)
    return result

Connection monitoring

Connection monitoring
# Track connection attempts, outcomes, and average connection time
class ConnectionMonitor:
    def __init__(self):
        self.connection_attempts = 0
        self.connection_successes = 0
        self.connection_failures = 0
        self.avg_connection_time = 0

    def track_connection_attempt(self, success, duration):
        self.connection_attempts += 1
        if success:
            self.connection_successes += 1
        else:
            self.connection_failures += 1

        # Update the running average connection time
        self.avg_connection_time = (
            (self.avg_connection_time * (self.connection_attempts - 1) + duration)
            / self.connection_attempts
        )

    def get_success_rate(self):
        if self.connection_attempts == 0:
            return 0
        return self.connection_successes / self.connection_attempts

Alerting

Network alerting
# Alert when the success rate or latency crosses a threshold
# connection_monitor and send_alert are placeholders for your own code
def check_network_health():
    success_rate = connection_monitor.get_success_rate()
    if success_rate < 0.95:  # 95% success rate threshold
        send_alert(f"Network health degraded: {success_rate:.2%} success rate")

    if connection_monitor.avg_connection_time > 5:  # 5 second threshold
        send_alert(f"Network latency high: {connection_monitor.avg_connection_time:.2f}s")
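
A success rate computed over all attempts since startup reacts slowly: after a long healthy period, a burst of failures barely moves it. One alternative is to alert on a rolling window of recent attempts. The sketch below is a minimal version of that idea. `RollingSuccessRate`, the window size, and the `min_samples` guard are assumptions for illustration.

```python
from collections import deque


class RollingSuccessRate:
    """Success rate over the most recent `window` attempts."""

    def __init__(self, window: int = 100):
        # True = success, False = failure; old entries fall off automatically
        self._outcomes = deque(maxlen=window)

    def record(self, success: bool) -> None:
        self._outcomes.append(success)

    def rate(self) -> float:
        if not self._outcomes:
            return 1.0  # No data yet: treated as healthy (a design choice)
        return sum(self._outcomes) / len(self._outcomes)

    def should_alert(self, threshold: float = 0.95, min_samples: int = 20) -> bool:
        # Require a minimum number of samples so one early failure does not alert
        return len(self._outcomes) >= min_samples and self.rate() < threshold
```

Calling `record()` after every connection attempt and checking `should_alert()` periodically gives an alert that clears on its own once recent attempts succeed again.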
