---
title: Kafka message size limits and handling
meta:
  description: Handle large Kafka messages and troubleshoot 10 MB message size limits. Learn compression strategies, splitting techniques, and how to work with quarantined messages in Tinybird's Kafka connector.
---

# Message size handling

This guide covers handling large Kafka messages in Tinybird: the per-message size limit, how to spot oversized messages in quarantine, and strategies for reducing or working around message size.

## Message size limits

Tinybird has a default message size limit of **10 MB** per message. Messages exceeding this limit are automatically sent to the Quarantine Data Source.
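Before producing, you can estimate whether a payload will clear the limit by measuring its serialized size. A minimal sketch (the `message_size_bytes` helper and `LIMIT_BYTES` constant are illustrative, not part of Tinybird's or Kafka's API):

```python
import json

LIMIT_BYTES = 10 * 1024 * 1024  # Tinybird's per-message limit

def message_size_bytes(payload: dict) -> int:
    """Size of the payload as it would be produced to Kafka (UTF-8 JSON)."""
    return len(json.dumps(payload).encode('utf-8'))

payload = {'user_id': '123', 'event': 'page_view'}
size = message_size_bytes(payload)
if size > LIMIT_BYTES:
    raise ValueError(f'Message is {size} bytes, over the 10 MB limit')
```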

## Checking message sizes

Check quarantined messages for size-related issues:

```sql
SELECT
    timestamp,
    length(__value) as message_size_bytes,
    length(__value) / 1024 / 1024 as message_size_mb,
    msg
FROM your_datasource_quarantine
WHERE timestamp > now() - INTERVAL 1 hour
ORDER BY message_size_bytes DESC
LIMIT 100
```

## Strategies for handling large messages

### Option 1: Compression

Use Kafka compression to reduce message size:

**Producer configuration:**
```python
producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    compression_type='gzip',  # or 'snappy', 'lz4', 'zstd'
    value_serializer=lambda v: json.dumps(v).encode('utf-8')
)
```

**Compression types:**
- `gzip` - Best compression ratio, higher CPU cost
- `snappy` - Good balance of speed and ratio
- `lz4` - Fastest, lower compression ratio
- `zstd` - Strong ratio at moderate CPU cost; requires Kafka 2.1+ brokers
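Before enabling compression on the producer, you can estimate how much it would shrink a representative payload. A stdlib-only sketch using gzip (snappy and lz4 require third-party packages); repetitive JSON compresses well, while already-compressed or binary data typically does not:

```python
import gzip
import json

# A representative, repetitive JSON payload
payload = json.dumps(
    [{'user_id': i, 'event': 'page_view'} for i in range(1000)]
).encode('utf-8')

compressed = gzip.compress(payload)
ratio = len(compressed) / len(payload)
print(f'{len(payload)} -> {len(compressed)} bytes (ratio {ratio:.2f})')
```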

### Option 2: Split large messages

Break large messages into smaller chunks on the producer side, giving each chunk a shared message ID, its index, and the total chunk count, then reassemble them in a Materialized View if needed.
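A minimal sketch of the producer-side split and the matching reassembly step, assuming chunks travel as base64 text inside JSON values (the `split_message` and `reassemble` helpers are illustrative):

```python
import base64
import uuid

# Chunk size applies to the raw bytes; base64 in the JSON value expands
# each chunk by ~33%, so stay well under the 10 MB limit.
CHUNK_BYTES = 1 * 1024 * 1024

def split_message(payload: bytes, chunk_size: int = CHUNK_BYTES) -> list[dict]:
    """Split a payload into ordered chunks sharing one message_id."""
    message_id = str(uuid.uuid4())
    parts = [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]
    return [
        {
            'message_id': message_id,
            'chunk_index': index,
            'total_chunks': len(parts),
            'data': base64.b64encode(part).decode('ascii'),
        }
        for index, part in enumerate(parts)
    ]

def reassemble(chunks: list[dict]) -> bytes:
    """Rebuild the original payload from chunks, regardless of arrival order."""
    ordered = sorted(chunks, key=lambda c: c['chunk_index'])
    return b''.join(base64.b64decode(c['data']) for c in ordered)
```

Produce every chunk with `message_id` as the Kafka message key so all chunks of one message land on the same partition, in order.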

### Option 3: External storage

Store large payloads in object storage (S3, GCS) and send only references in Kafka:

```python
# 1. Upload the large payload to object storage (e.g. with boto3's put_object).
# 2. Send only a small reference message through Kafka.
message = {
    'message_id': message_id,  # correlation ID for the payload
    's3_key': s3_key,          # object key returned by the upload step
    'metadata': {...}          # keep only small, queryable fields here
}
producer.send('topic', value=message)
```

### Option 4: Schema optimization

Reduce message size by storing only necessary data and using references for large content:

```json
{
  "user_id": "123",
  "profile_summary": "key points only",
  "full_profile_s3_key": "s3://bucket/profiles/123.json"
}
```

## Troubleshooting quarantined messages

### Identify size-related quarantines

```sql
SELECT
    timestamp,
    length(__value) as message_size,
    length(__value) / 1024 / 1024 as size_mb,
    msg
FROM your_datasource_quarantine
WHERE timestamp > now() - INTERVAL 24 hour
  AND length(__value) > 10 * 1024 * 1024  -- Over 10 MB
ORDER BY message_size DESC
```

### Extract useful data from quarantined messages

Even if the full message is too large, you can extract metadata:

```sql
SELECT
    timestamp,
    JSONExtractString(__value, 'message_id') as message_id,
    JSONExtractString(__value, 'user_id') as user_id,
    length(__value) as original_size
FROM your_datasource_quarantine
WHERE timestamp > now() - INTERVAL 24 hour
```

## Monitoring message sizes

### Track message size distribution

```sql
SELECT
    quantile(0.5)(message_size) as median_size,
    quantile(0.95)(message_size) as p95_size,
    quantile(0.99)(message_size) as p99_size,
    max(message_size) as max_size
FROM (
    SELECT length(__value) as message_size
    FROM your_datasource
    WHERE timestamp > now() - INTERVAL 1 hour
)
```

### Alert on large messages

```sql
SELECT
    timestamp,
    length(__value) as message_size,
    length(__value) / 1024 / 1024 as size_mb
FROM your_datasource
WHERE length(__value) > 8 * 1024 * 1024  -- Over 8 MB (80% of the 10 MB limit)
  AND timestamp > now() - INTERVAL 1 hour
ORDER BY message_size DESC
```

## Best practices

1. **Target size:** keep messages under 1 MB when possible
2. **Compression:** enable Kafka compression for large messages
3. **Lean payloads:** store only necessary data in Kafka messages
4. **References:** keep large binary data in object storage (S3, GCS) and send references
5. **Monitoring:** track message sizes regularly to catch issues early

## Common issues and solutions

### Issue: Messages consistently over 10 MB

**Solutions:**
1. Implement Kafka compression
2. Split messages into chunks
3. Move large data to external storage
4. Optimize schema to reduce size

### Issue: Compression not helping

**Solutions:**
1. Check whether the data is already compressed (images, video, archives)
2. Try a different compression type
3. Verify compression is enabled in the producer configuration
4. Consider whether the data is compressible at all (text compresses well; binary often does not)
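To check whether your data is compressible at all, compress a sample and look at the ratio. A stdlib sketch using `zlib` (`compression_ratio` is an illustrative helper); a ratio near or above 1.0 means the data is effectively already compressed or high-entropy:

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """compressed/original size; ~1.0 or above means effectively incompressible."""
    return len(zlib.compress(data)) / len(data)

text_like = b'{"event": "page_view", "user_id": 123}' * 1000
random_like = os.urandom(len(text_like))  # stands in for already-compressed data

print(compression_ratio(text_like))    # well below 1.0
print(compression_ratio(random_like))  # close to (or above) 1.0
```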

## Related documentation

- [Troubleshooting guide](../troubleshooting#error-message-too-large-or-quarantined-due-to-size) - Message size error troubleshooting
- [Quarantine Data Sources](/forward/get-data-in/quarantine) - Handling quarantined messages
- [Kafka connector documentation](/forward/get-data-in/connectors/kafka) - Main setup and configuration guide
