---
title: "ClickHouse® Python examples: clickhouse-connect vs. clickhouse-driver"
excerpt: "ClickHouse Python example that works in production. Copy this code and start querying analytical data from Python in minutes."
authors: "Cameron Archer"
categories: "AI Resources"
createdOn: "2025-10-14 19:50:09"
publishedOn: "2025-10-17 19:50:09"
updatedOn: "2025-10-17 19:50:09"
status: "published"
---

ClickHouse{% sup %}®{% /sup %} stores and analyzes massive datasets at high speed. Accessing that data from your Python application requires a client library that can handle the connection protocol and query execution. Two main Python clients exist for ClickHouse{% sup %}®{% /sup %}: the official `clickhouse-connect` library and the community-maintained `clickhouse-driver`, each with different protocols, features, and tradeoffs.

## Installing a Python driver for ClickHouse{% sup %}®{% /sup %}

To talk to ClickHouse{% sup %}®{% /sup %} from Python, you need a client library that handles the connection and sends your queries to the database. Two main options exist: `clickhouse-connect`, the official client from ClickHouse{% sup %}®{% /sup %}, Inc., and `clickhouse-driver`, a community-built alternative that uses a different protocol.

Both libraries let you connect to databases, run queries, and insert data. Your choice depends on which protocol you prefer and what features you need. If you're working with Java instead, see our complete [ClickHouse{% sup %}®{% /sup %} Java integration guide](https://www.tinybird.co/blog-posts/clickhouse-java-example) with JDBC examples.

### `pip install clickhouse-connect`

The official ClickHouse{% sup %}®{% /sup %} client installs with this command:

```bash
pip install clickhouse-connect
```

This library talks to ClickHouse{% sup %}®{% /sup %} over HTTP and works with Python 3.8 or newer. ClickHouse{% sup %}®{% /sup %}, Inc. maintains it directly, so updates align with new ClickHouse{% sup %}®{% /sup %} releases.

The community-maintained client installs similarly:

```bash
pip install clickhouse-driver
```

This library uses ClickHouse{% sup %}®{% /sup %}'s native TCP protocol and supports Python 3.7 or newer. The community has maintained it for several years with stable releases, demonstrating the [highest repository issue activity](https://www.pracdata.io/p/state-of-open-source-read-time-olap-2025) among similar OLAP projects.

## Comparing clickhouse-connect and clickhouse-driver

The two clients differ in how they communicate with ClickHouse{% sup %}®{% /sup %} and what features they include. Here's what sets them apart:

| Feature            | clickhouse-connect     | clickhouse-driver       |
|--------------------|------------------------|-------------------------|
| Protocol           | HTTP/HTTPS             | Native TCP              |
| Maintainer         | ClickHouse{% sup %}®{% /sup %}, Inc.       | Community               |
| Pandas support     | Built-in               | Manual conversion       |
| Async support      | Included               | Separate package        |
| Firewall friendly  | Yes (port 8123/8443)   | Requires port 9000/9440 |

### Protocol support and performance differences

`clickhouse-connect` uses HTTP, which means it works over standard web ports. This makes it easier to use when you're behind a firewall or connecting through a proxy. `clickhouse-driver` uses ClickHouse{% sup %}®{% /sup %}'s native TCP protocol, which can be faster for large result sets because it uses a more compact binary format.

Connection pooling works automatically in `clickhouse-connect`, while `clickhouse-driver` requires manual setup if you want pooling.

### Feature gaps and ecosystem maturity

Both clients handle the core operations you'll use daily, but they differ in convenience features. `clickhouse-connect` includes pandas DataFrame support through its `query_df()` method. With `clickhouse-driver`, you convert query results to DataFrames manually using `pd.DataFrame()`.

For async operations, `clickhouse-connect` includes everything in the main package. `clickhouse-driver` requires installing the separate `clickhouse-driver[asyncio]` package.

## Creating a connection with Python

Connecting to ClickHouse{% sup %}®{% /sup %} involves three steps: importing the client library, creating a client with your connection details, and optionally testing that the connection works.

### 1. Import the client library

For `clickhouse-connect`, import the module and call `get_client()`:

```python
import clickhouse_connect

client = clickhouse_connect.get_client(host='localhost', username='default', password='')
```

For `clickhouse-driver`, import the `Client` class:

```python
from clickhouse_driver import Client

client = Client(host='localhost', user='default', password='')
```

### 2. Instantiate the client with host, port, and auth

When connecting to a local ClickHouse{% sup %}®{% /sup %} instance, use `localhost` with the default ports. For cloud deployments, use the hostname provided by your cloud service and enable secure connections.

Here's a local connection with `clickhouse-connect`:

```python
client = clickhouse_connect.get_client(
    host='localhost',
    port=8123,
    username='default',
    password='',
    database='default'
)
```

For [Tinybird](https://www.tinybird.co) with `clickhouse-connect`:

```python
client = clickhouse_connect.get_client(
    host='clickhouse.us-east-1.aws.tinybird.co',
    port=7182,
    username='default',
    password='TINYBIRD_TOKEN',
    secure=True
)
```

With `clickhouse-driver`, a cloud connection looks like this:

```python
client = Client(
    host='clickhouse.us-east-1.aws.tinybird.co',
    port=7182,
    username='default',
    password='TINYBIRD_TOKEN',
    secure=True
)
```

### 3. Validate the connection with a ping

Before running queries, test that your connection works. With `clickhouse-connect`:

```python
if client.ping():
    print("Connection successful")
```

With `clickhouse-driver`, run a simple query:

```python
result = client.execute('SELECT 1')
print(f"Connection successful: {result}")
```

If the connection fails, you'll see an error message showing what went wrong, like authentication problems, network timeouts, or incorrect hostnames.

## Running your first select query

Once connected, you can run SQL queries and get results back. The format of the results depends on which client you're using and which method you call.

### Simple select returning tuples

Both clients return query results as lists of tuples by default. Here's a basic query with `clickhouse-connect`:

```python
result = client.query('SELECT number, number * 2 FROM system.numbers LIMIT 5')
for row in result.result_rows:
    print(row)
```

This prints tuples like `(0, 0)`, `(1, 2)`, `(2, 4)`, and so on.

With `clickhouse-driver`, the syntax changes slightly:

```python
result = client.execute('SELECT number, number * 2 FROM system.numbers LIMIT 5')
for row in result:
    print(row)
```

### Parameterized select returning a pandas DataFrame

Parameterized queries prevent SQL injection by separating your query structure from user input. With `clickhouse-connect`, you can pass parameters and get results as a [pandas DataFrame](https://www.tinybird.co/blog-posts/python-real-time-dashboard):

```python
result = client.query_df(
    'SELECT number, number * {multiplier:Int32} FROM system.numbers WHERE number < {limit:Int32}',
    parameters={'multiplier': 3, 'limit': 5}
)
print(result)
```

The parameter syntax uses curly braces with the parameter name and its ClickHouse{% sup %}®{% /sup %} type. For `clickhouse-driver`, you pass parameters as a dictionary:

```python
import pandas as pd

result = client.execute(
    'SELECT number, number * %(multiplier)s FROM system.numbers WHERE number < %(limit)s',
    {'multiplier': 3, 'limit': 5}
)
df = pd.DataFrame(result, columns=['number', 'result'])
```

## Inserting data in bulk from Python

ClickHouse{% sup %}®{% /sup %} performs best with bulk inserts rather than single-row operations. Both clients support batch insertion, though the syntax differs.

### 1. Prepare sample rows or a DataFrame

First, create your table structure and prepare your data. Here's a simple table:

```python
client.command('''
    CREATE TABLE IF NOT EXISTS events (
        event_id String,
        user_id String,
        timestamp DateTime,
        event_type String
    ) ENGINE = MergeTree()
    ORDER BY (timestamp, event_id)
''')
```

Prepare data as a list of tuples:

```python
data = [
    ('evt_1', 'user_100', '2024-01-15 10:30:00', 'login'),
    ('evt_2', 'user_101', '2024-01-15 10:31:00', 'pageview'),
    ('evt_3', 'user_100', '2024-01-15 10:35:00', 'logout')
]
```

### 2. Single insert for quick tests

For testing or tiny datasets, you can insert one row. With `clickhouse-connect`:

```python
client.insert('events', [['evt_4', 'user_102', '2024-01-15 11:00:00', 'signup']])
```

This approach works but creates overhead in ClickHouse{% sup %}®{% /sup %}'s storage engine, so it's not efficient for production.

### 3. Batch insert with compression for speed

For production use, insert data in batches. With `clickhouse-connect`:

```python
client.insert('events', data, column_names=['event_id', 'user_id', 'timestamp', 'event_type'])
```

The client automatically compresses data before sending it to ClickHouse{% sup %}®{% /sup %}. For larger batches (10,000+ rows), this can cut insert time significantly.

With `clickhouse-driver`, batch inserts look like this:

```python
client.execute(
    'INSERT INTO events (event_id, user_id, timestamp, event_type) VALUES',
    data
)
```

A good batch size for most cases is between 10,000 and 100,000 rows, though ClickHouse{% sup %}®{% /sup %} can handle [much more extreme ingestion rates](https://www.tinybird.co/blog-posts/1b-rows-per-second-clickhouse) with proper optimization.

## Using async and threaded clients for high concurrency

When your application handles multiple concurrent requests, async or threaded database clients prevent blocking operations from slowing the application down.

### Async example with clickhouse-connect

The async version of `clickhouse-connect` allows non-blocking database operations:

```python
import asyncio
import clickhouse_connect

async def fetch_data():
    client = await clickhouse_connect.get_async_client(
        host='localhost',
        username='default',
        password=''
    )

    result = await client.query('SELECT count() FROM events')
    print(result.result_rows[0][0])

    await client.close()

asyncio.run(fetch_data())
```

### Thread pool example with clickhouse-driver

For thread-based concurrency, create a connection pool that manages multiple database connections:

```python
from clickhouse_driver import Client
from concurrent.futures import ThreadPoolExecutor

def execute_query(query):
    client = Client(host='localhost')
    result = client.execute(query)
    client.disconnect()
    return result

with ThreadPoolExecutor(max_workers=5) as executor:
    queries = ['SELECT count() FROM events WHERE event_type = "login"' for _ in range(10)]
    results = executor.map(execute_query, queries)
```

Each thread gets its own connection, preventing concurrent access issues.

## Handling auth, TLS, and network timeouts

Production deployments require proper security configuration and timeout handling to prevent connection failures and protect data.

### Setting credentials with environment variables

Hardcoding credentials in your code creates security risks. Store them in environment variables instead:

```python
import os
import clickhouse_connect

client = clickhouse_connect.get_client(
    host=os.environ.get('CLICKHOUSE_HOST', 'localhost'),
    port=int(os.environ.get('CLICKHOUSE_PORT', 8123)),
    username=os.environ.get('CLICKHOUSE_USER', 'default'),
    password=os.environ.get('CLICKHOUSE_PASSWORD', '')
)
```

### Enabling TLS to cloud endpoints

Cloud ClickHouse{% sup %}®{% /sup %} deployments typically require encrypted connections. With `clickhouse-connect`, enable TLS by setting `secure=True`:

```python
client = clickhouse_connect.get_client(
    host='your-instance.clickhouse.cloud',
    port=8443,
    username='default',
    password='your_password',
    secure=True
)
```

For `clickhouse-driver`:

```python
client = Client(
    host='your-instance.clickhouse.cloud',
    port=9440,
    user='default',
    password='your_password',
    secure=True
)
```

### Tuning query and socket timeouts

[Long-running queries](https://www.tinybird.co/blog-posts/how-to-safely-cancel-a-database-query) or network issues can cause your application to hang. Set timeouts to prevent this:

```python
client = clickhouse_connect.get_client(
    host='localhost',
    connect_timeout=10,
    send_receive_timeout=300
)
```

For analytical queries that might take longer, increase the `send_receive_timeout`. For transactional workloads, keep timeouts shorter to fail fast.

## Common ClickHouse{% sup %}®{% /sup %} Python errors and fixes

Certain errors appear frequently when working with ClickHouse{% sup %}®{% /sup %} from Python. Recognizing these patterns helps you debug faster.

### Authentication code 516

This error appears when credentials are incorrect or the user lacks permissions:

```text
Code: 516. DB::Exception: default: Authentication failed
```

Check that your username and password are correct. If connecting to ClickHouse{% sup %}®{% /sup %} Cloud, verify you're using the cloud credentials, not local defaults.

### TLS handshake failures

When connecting to secure endpoints, you might see SSL certificate errors:

```text
SSLError: [SSL: CERTIFICATE_VERIFY_FAILED]
```

This typically means the client can't verify the server's certificate. Update your SSL certificates:

```bash
pip install --upgrade certifi
```

### Date parsing errors during insert

ClickHouse{% sup %}®{% /sup %} is strict about date and datetime formats. If you see parsing errors:

```text
Code: 41. DB::Exception: Cannot parse DateTime
```

Ensure your datetime strings match ClickHouse{% sup %}®{% /sup %}'s expected format: `YYYY-MM-DD HH:MM:SS`. When inserting from pandas, convert datetime columns explicitly:

```python
df['timestamp'] = pd.to_datetime(df['timestamp']).dt.strftime('%Y-%m-%d %H:%M:%S')
client.insert('events', df.values.tolist(), column_names=df.columns.tolist())
```

## Using Tinybird for an API-first ClickHouse{% sup %}®{% /sup %} Python backend

Setting up and maintaining a ClickHouse{% sup %}®{% /sup %} cluster requires [expertise in distributed systems, query optimization, and infrastructure management](https://www.tinybird.co/blog-posts/what-i-learned-operating-clickhouse). Managed services handle this complexity, letting you focus on building application features.

Tinybird provides a [managed ClickHouse{% sup %}®{% /sup %} platform](https://www.tinybird.co/product/managed-clickhouse) designed for developers integrating ClickHouse{% sup %}®{% /sup %} into application backends. The platform includes built-in data ingestion, query optimization, and API generation.

With Tinybird, you can deploy ClickHouse{% sup %}®{% /sup %}-backed APIs in minutes rather than weeks. The platform handles cluster scaling, monitoring, backup management, and security configuration automatically. Tinybird also includes data pipelines as code, local development environments, and CI/CD integration.

[Sign up for a free Tinybird plan](https://cloud.tinybird.co/signup) to try managed ClickHouse{% sup %}®{% /sup %} without infrastructure work.

## Frequently asked questions about ClickHouse{% sup %}®{% /sup %} Python

### Can I use pandas with ClickHouse{% sup %}®{% /sup %}?

Yes, both major Python clients support pandas DataFrames. With `clickhouse-connect`, use the `query_df()` method to get results directly as a DataFrame, or use `insert()` with DataFrame values. `clickhouse-driver` requires manual conversion between query results and DataFrames using `pd.DataFrame()`. For local analytics without a ClickHouse{% sup %}®{% /sup %} server, [chDB provides an embedded ClickHouse{% sup %}®{% /sup %} engine](https://www.tinybird.co/blog-posts/clickhouse-vs-chdb-embedded-clickhouse) with zero-copy DataFrame integration.

### Does ClickHouse{% sup %}®{% /sup %} support prepared statements in Python?

ClickHouse{% sup %}®{% /sup %} Python clients use parameterized queries instead of traditional prepared statements. This approach provides similar protection against SQL injection by separating query structure from user input. Parameters are sent separately from the query text and properly escaped by the client library.

### How do I stream data continuously from Python to ClickHouse{% sup %}®{% /sup %}?

Rather than streaming individual rows, collect data in batches and insert periodically. ClickHouse{% sup %}®{% /sup %} is optimized for bulk operations, so inserting 10,000 rows every few seconds performs better than inserting rows one at a time. You can use a buffer in your application to collect events and flush them to ClickHouse{% sup %}®{% /sup %} when the buffer reaches a size threshold or time limit.
