---
title: Kafka connector performance optimization
meta:
  description: Optimize Kafka connector performance with schema optimization, Materialized View tuning, and throughput best practices. Learn how to reduce consumer lag and improve ingestion speed.
---

# Performance optimization

This guide covers strategies for optimizing your Kafka connector performance, focusing on schema design, Materialized View optimization, and best practices.

## Schema optimization

### Use explicit schemas

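Explicit schemas extract each field into a typed column at ingestion time, so they are faster and more efficient than storing the raw message in a single column and parsing it at query time: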

**Recommended:**
```tb
SCHEMA >
    `user_id` String `json:$.user_id`,
    `event_type` LowCardinality(String) `json:$.event_type`,
    `timestamp` DateTime `json:$.timestamp`
```

**Avoid (slower):**
```tb
SCHEMA >
    `data` String `json:$`  -- Requires parsing at query time
```

### Optimize data types

- Use `LowCardinality(String)` for enum-like fields with few distinct values
- Use the smallest integer type that fits the data (for example, `Int32` instead of `Int64`)
- Use `DateTime` for timestamps instead of `String`
- Use `Nullable()` only when a field can actually be missing

**Example:**
```tb
SCHEMA >
    `user_id` String `json:$.user_id`,
    `event_type` LowCardinality(String) `json:$.event_type`,
    `timestamp` DateTime `json:$.timestamp`,
    `count` Int32 `json:$.count`,
    `metadata` Nullable(String) `json:$.metadata`  -- Only if needed
```
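
To decide whether a column fits these types, you can profile data that has already landed. The query below is a minimal sketch: `kafka_events` is a hypothetical data source name, and the columns are the ones from the example schema above. As a rough rule from general ClickHouse guidance, `LowCardinality` tends to pay off when a column has fewer than about 10,000 distinct values, and `Int32` holds values between -2,147,483,648 and 2,147,483,647.

```sql
-- `kafka_events` is a hypothetical data source name; replace it with your own.
SELECT
    count() AS total_rows,
    uniq(event_type) AS distinct_event_types,  -- approximate distinct count
    min(`count`) AS min_count,                 -- check the value range fits Int32
    max(`count`) AS max_count,
    countIf(metadata IS NULL) AS null_metadata -- 0 suggests Nullable() isn't needed
FROM kafka_events
WHERE timestamp > now() - INTERVAL 1 day
```

If `distinct_event_types` is small relative to `total_rows`, `LowCardinality(String)` is likely a good fit for `event_type`.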

## Materialized View optimization

Complex Materialized Views can slow down ingestion. A Materialized View that triggers on appends from a Kafka data source runs every time new data is ingested, so expensive aggregations or joins in its query directly reduce ingestion performance.

### Optimization strategies

1. **Simplify aggregations** - Compute only the aggregates you actually query, and keep the grouping keys few
2. **Add filters** - Filter rows as early as possible to reduce the data volume processed
3. **Optimize joins** - Keep the right-hand side of a join small, since it is evaluated for every ingested block
4. **Avoid cascade MVs** - Don't chain Materialized Views so that one feeds another downstream of the same Kafka data source, as each step adds ingestion latency
5. **Limit MVs per data source** - Too many Materialized Views reading from the same Kafka data source can slow down ingestion
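
As an illustration of the first two strategies, here is a minimal sketch of a Materialized View query that filters early and keeps a single cheap aggregate. The source name `kafka_events` and the event types in the filter are hypothetical. The sketch also assumes the target data source uses an `AggregatingMergeTree` engine with an `events` column of type `AggregateFunction(count)`.

```sql
-- Hypothetical names: `kafka_events` and the event types in the filter.
SELECT
    toStartOfMinute(timestamp) AS minute,
    event_type,
    countState() AS events          -- intermediate state, merged at query time
FROM kafka_events
WHERE event_type IN ('purchase', 'signup')  -- filter before aggregating
GROUP BY minute, event_type
```

When reading from the target data source, the state would be finalized with `countMerge(events)` grouped by the same keys.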

## Partition distribution

Ensure even partition distribution to maximize throughput. Monitor partition lag:

```sql
SELECT
    partition,
    max(lag) as max_lag,
    avg(lag) as avg_lag,
    sum(processed_messages) as total_processed
FROM tinybird.kafka_ops_log
WHERE timestamp > now() - INTERVAL 1 hour
  AND partition >= 0
GROUP BY partition
ORDER BY max_lag DESC
```

Uneven distribution may indicate:
- Poor partition key design
- Hot partitions
- Need for more partitions
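
To put a number on the imbalance, you can compare the busiest partition with the average. This is a sketch that uses only the `tinybird.kafka_ops_log` columns from the query above. Like that query, it doesn't filter by data source or topic, so add such a filter if you run more than one connector.

```sql
SELECT
    count() AS partitions,
    max(total_processed) AS busiest_partition,
    avg(total_processed) AS average_per_partition,
    round(busiest_partition / average_per_partition, 2) AS skew_ratio
FROM (
    -- Messages processed per partition over the last hour
    SELECT
        partition,
        sum(processed_messages) AS total_processed
    FROM tinybird.kafka_ops_log
    WHERE timestamp > now() - INTERVAL 1 hour
      AND partition >= 0
    GROUP BY partition
)
```

A `skew_ratio` close to 1 means messages are spread evenly. The higher the ratio, the more one partition dominates.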

See the [partitioning strategies guide](partitioning-strategies) for detailed guidance.

## Common performance bottlenecks

### Schema parsing

**Symptoms:**
- High CPU usage
- Slow message processing
- Low throughput

**Solutions:**
1. Use explicit schemas instead of schemaless
2. Optimize JSONPath expressions
3. Reduce schema complexity
4. Use appropriate data types

### Materialized Views

**Symptoms:**
- Slow ingestion
- High memory usage
- Timeouts in Materialized Views

**Solutions:**
1. Simplify Materialized View queries
2. Add filters to reduce data volume
3. Avoid cascade MVs and limit the number of MVs reading from the same Kafka data source
4. Optimize aggregations

### Partition imbalance

**Symptoms:**
- Uneven lag across partitions
- Some partitions consistently slower than others
- Overall throughput limited by the slowest partitions

**Solutions:**
1. Review partition key strategy
2. Redistribute messages more evenly
3. Increase partitions if needed
4. Monitor partition distribution

## Best practices

1. **Use explicit schemas** - Faster parsing and better performance
2. **Optimize data types** - Use the smallest types that fit the data, and `LowCardinality` for enum-like fields
3. **Simplify Materialized Views** - Keep MVs efficient to avoid slowing ingestion
4. **Ensure even partition distribution** - Monitor and optimize partition keys
5. **Monitor performance** - Track lag, throughput, and error rates regularly
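
For the last practice, a per-minute view of throughput and lag is a reasonable starting point. The following sketch relies only on the `tinybird.kafka_ops_log` columns already used in this guide; the one-hour window is an arbitrary choice.

```sql
SELECT
    toStartOfMinute(timestamp) AS minute,
    sum(processed_messages) AS messages_per_minute,  -- throughput
    max(lag) AS max_lag                              -- worst lag seen in that minute
FROM tinybird.kafka_ops_log
WHERE timestamp > now() - INTERVAL 1 hour
  AND partition >= 0
GROUP BY minute
ORDER BY minute
```

Steady throughput with growing `max_lag` suggests the connector isn't keeping up with the topic, which points back to the schema, Materialized View, and partition checks above.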

## Related documentation

- [Monitor Kafka connectors](/forward/monitoring/kafka-clickhouse-monitoring) - Comprehensive monitoring queries and metrics
- [Partitioning strategies guide](partitioning-strategies) - Optimize partition distribution
- [Troubleshooting guide](../troubleshooting) - Resolve performance issues
