Local development setup

This guide helps you set up the Kafka connector for local development, including running Kafka locally, connecting to cloud Kafka from your local environment, and managing environment-specific configurations.

Prerequisites

  • Docker and Docker Compose installed
  • Tinybird Local installed (see Install Tinybird Local)
  • A Tinybird project directory
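
A quick way to confirm the prerequisites are in place before continuing:

# Confirm Docker, Docker Compose, and the Tinybird CLI are installed
docker --version
docker compose version
tb --version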

Option 1: Local Kafka with Docker Compose

The easiest way to develop locally is to run Kafka in Docker alongside Tinybird Local.

Docker Compose setup

Create a docker-compose.yml file in your project:

networks:
  kafka_network:
    driver: bridge

volumes:
  kafka-data:
  tinybird-data:

services:
  tinybird-local:
    image: tinybirdco/tinybird-local:latest
    container_name: tinybird-local
    platform: linux/amd64
    ports:
      - "7181:7181"
    networks:
      - kafka_network
    volumes:
      - ./:/workspace
      - tinybird-data:/var/lib/tinybird

  kafka:
    image: apache/kafka:latest
    hostname: broker
    container_name: broker
    ports:
      - 9092:9092
    environment:
      KAFKA_NODE_ID: 1
      KAFKA_PROCESS_ROLES: "broker,controller"
      KAFKA_CONTROLLER_QUORUM_VOTERS: "1@broker:29093"
      KAFKA_CONTROLLER_LISTENER_NAMES: "CONTROLLER"
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT,CONTROLLER:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:29092,PLAINTEXT_HOST://0.0.0.0:9092,CONTROLLER://0.0.0.0:29093
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "true"
    volumes:
      - kafka-data:/var/lib/kafka/data
    networks:
      - kafka_network

Start the services

docker compose up -d
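
Before moving on, it can help to confirm that both containers are running and that the broker responds to API requests (kafka-broker-api-versions.sh ships with the apache/kafka image):

# Check container status
docker compose ps

# Ask the broker for its supported API versions; an error here usually means Kafka isn't ready yet
docker exec -it broker /opt/kafka/bin/kafka-broker-api-versions.sh --bootstrap-server localhost:9092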

Create a topic

docker exec -it broker /opt/kafka/bin/kafka-topics.sh --create \
  --topic test-topic \
  --bootstrap-server localhost:9092 \
  --partitions 3 \
  --replication-factor 1
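
To double-check that the topic was created with the expected settings:

# Describe the topic to verify partitions and replication factor
docker exec -it broker /opt/kafka/bin/kafka-topics.sh --describe \
  --topic test-topic \
  --bootstrap-server localhost:9092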

Connection configuration

Create a connection file for local development:

connections/kafka_local.connection
TYPE kafka
KAFKA_BOOTSTRAP_SERVERS {{ tb_secret("KAFKA_BOOTSTRAP_SERVERS", "kafka:29092") }}
KAFKA_SECURITY_PROTOCOL {{ tb_secret("KAFKA_SECURITY_PROTOCOL", "PLAINTEXT") }}

Note: The bootstrap server kafka:29092 uses the Docker Compose service name, so it only resolves from inside the kafka_network Docker network (for example, from the Tinybird Local container). From your host machine, use localhost:9092 instead.
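
If you want to confirm that the internal listener is reachable by that name, a quick check from inside the Docker network (here, from the broker container itself) looks like this:

# The service name "kafka" resolves through Docker's embedded DNS on kafka_network
docker exec -it broker /opt/kafka/bin/kafka-broker-api-versions.sh --bootstrap-server kafka:29092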

Data source configuration

datasources/test_topic.datasource
SCHEMA >
    `data` String `json:$`

KAFKA_CONNECTION_NAME kafka_local
KAFKA_TOPIC test-topic
KAFKA_GROUP_ID {{ tb_secret("KAFKA_GROUP_ID", "local-dev-group") }}

Test locally

# Deploy to Tinybird Local
tb deploy

# Send a test message
echo '{"test": "data"}' | docker exec -i broker /opt/kafka/bin/kafka-console-producer.sh \
  --topic test-topic \
  --bootstrap-server localhost:9092

# Query the data
tb sql "SELECT * FROM test_topic LIMIT 10"

Option 2: Connect to cloud Kafka from local

You can also connect Tinybird Local to a cloud Kafka cluster (Confluent Cloud, AWS MSK, etc.) for testing.

Connection configuration

Use the same connection configuration as production, but with local secrets:

connections/kafka_cloud_local.connection
TYPE kafka
KAFKA_BOOTSTRAP_SERVERS {{ tb_secret("KAFKA_BOOTSTRAP_SERVERS", "your-cloud-bootstrap:9092") }}
KAFKA_SECURITY_PROTOCOL SASL_SSL
KAFKA_SASL_MECHANISM PLAIN
KAFKA_KEY {{ tb_secret("KAFKA_KEY", "your-key") }}
KAFKA_SECRET {{ tb_secret("KAFKA_SECRET", "your-secret") }}

Important: tb_secret() falls back to its default value when the secret isn't set. These defaults are only applied in Tinybird Local; in Tinybird Cloud you must set the secret values explicitly.

Set local secrets

# Set secrets for local environment
tb secret set KAFKA_BOOTSTRAP_SERVERS "your-cloud-bootstrap:9092"
tb secret set KAFKA_KEY "your-key"
tb secret set KAFKA_SECRET "your-secret"
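
To keep credentials out of your shell history, one option is to load them from a git-ignored env file first; the filename .env.kafka.local below is just an example:

# Load credentials from a local, git-ignored file (hypothetical name) and pass them to tb secret set
set -a; source .env.kafka.local; set +a
tb secret set KAFKA_BOOTSTRAP_SERVERS "$KAFKA_BOOTSTRAP_SERVERS"
tb secret set KAFKA_KEY "$KAFKA_KEY"
tb secret set KAFKA_SECRET "$KAFKA_SECRET"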

Environment-specific configurations

Using default values for local development

The tb_secret() function supports default values that are used in local environments. This allows you to use the same Connection and Data Source files across all environments:

KAFKA_BOOTSTRAP_SERVERS {{ tb_secret("KAFKA_BOOTSTRAP_SERVERS", "kafka:29092") }}
KAFKA_SECURITY_PROTOCOL {{ tb_secret("KAFKA_SECURITY_PROTOCOL", "PLAINTEXT") }}
KAFKA_KEY {{ tb_secret("KAFKA_KEY", "key") }}
KAFKA_SECRET {{ tb_secret("KAFKA_SECRET", "secret") }}
  • Local: Uses the default values (for example, kafka:29092 for local Docker Kafka)
  • Cloud: Uses the secret values set in Tinybird Cloud for each workspace
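
For the Cloud side, the same secret names need real values in each workspace. Assuming your CLI version supports the --cloud flag, that would look something like:

# Set the production values in Tinybird Cloud (--cloud targets the cloud workspace instead of Tinybird Local)
tb --cloud secret set KAFKA_BOOTSTRAP_SERVERS "your-cloud-bootstrap:9092"
tb --cloud secret set KAFKA_KEY "your-key"
tb --cloud secret set KAFKA_SECRET "your-secret"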

Testing strategies

Unit testing with sample data

Create sample messages for testing:

# Create a test topic
docker exec -it broker /opt/kafka/bin/kafka-topics.sh --create \
  --topic test-events \
  --bootstrap-server localhost:9092

# Send sample messages
cat <<EOF | docker exec -i broker /opt/kafka/bin/kafka-console-producer.sh \
  --topic test-events \
  --bootstrap-server localhost:9092
{"user_id": "123", "event": "click", "timestamp": "2024-01-01T00:00:00Z"}
{"user_id": "456", "event": "view", "timestamp": "2024-01-01T00:01:00Z"}
{"user_id": "123", "event": "purchase", "timestamp": "2024-01-01T00:02:00Z"}
EOF
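
You can read the messages back with the console consumer to confirm they landed in the topic:

# Consume the three sample messages from the beginning of the topic
docker exec -it broker /opt/kafka/bin/kafka-console-consumer.sh \
  --topic test-events \
  --bootstrap-server localhost:9092 \
  --from-beginning \
  --max-messages 3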

Integration testing

Test the full pipeline:

# 1. Deploy to local
tb deploy

# 2. Send test messages
# (use your application or kafka-console-producer)

# 3. Verify ingestion
tb sql "SELECT count() FROM your_datasource"

# 4. Test queries
tb sql "SELECT * FROM your_datasource WHERE timestamp > now() - INTERVAL 1 hour"

Schema testing

Test schema changes locally before deploying:

# Test schema with sample data
SCHEMA >
    `user_id` String `json:$.user_id`,
    `event` LowCardinality(String) `json:$.event`,
    `timestamp` DateTime `json:$.timestamp`
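
Rows that don't match the schema are sent to the quarantine Data Source ({datasource}_quarantine), which makes it easy to verify schema changes locally. Assuming a Data Source named test_events backed by the test-events topic:

# Send a message with an invalid timestamp; it should land in quarantine rather than the Data Source
echo '{"user_id": "789", "event": "click", "timestamp": "not-a-date"}' | \
  docker exec -i broker /opt/kafka/bin/kafka-console-producer.sh \
    --topic test-events \
    --bootstrap-server localhost:9092

tb sql "SELECT * FROM test_events_quarantine LIMIT 10"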

Debugging local connections

Check container logs

# Kafka logs
docker compose logs kafka

# Tinybird Local logs
docker compose logs tinybird-local
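
To cut through the noise, filter the logs for warnings and errors:

# Show only recent warnings and errors from each container
docker compose logs --tail 200 kafka | grep -iE "error|warn"
docker compose logs --tail 200 tinybird-local | grep -iE "error|kafka"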

Verify network connectivity

# From Tinybird Local container
docker exec -it tinybird-local ping kafka

# Test Kafka connectivity
docker exec -it tinybird-local telnet kafka 29092
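
ping and telnet aren't always included in container images. If they're missing, bash's built-in /dev/tcp redirection is a lightweight fallback (assuming the Tinybird Local image ships bash):

# Open a TCP connection to kafka:29092 without any extra tools
docker exec -it tinybird-local bash -c 'cat < /dev/null > /dev/tcp/kafka/29092 && echo "kafka:29092 is reachable"'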

Common issues

Issue: Connection timeout

  • Verify containers are on the same network
  • Check that the bootstrap server address matches one of the KAFKA_ADVERTISED_LISTENERS entries (kafka:29092 from inside Docker, localhost:9092 from the host)
  • Ensure Kafka container is running: docker compose ps

Issue: Topic not found

  • Create the topic before deploying
  • Check topic exists: docker exec -it broker /opt/kafka/bin/kafka-topics.sh --list --bootstrap-server localhost:9092

Issue: No messages received

  • Verify messages are being produced
  • Check consumer group: docker exec -it broker /opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list
  • Review the kafka_ops_log Service Data Source in Tinybird Local (see the example query after this list)
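
Kafka ingestion activity and errors are recorded in the kafka_ops_log Service Data Source; a simple way to inspect it locally:

# Inspect recent Kafka connector activity recorded by Tinybird Local
tb sql "SELECT * FROM tinybird.kafka_ops_log LIMIT 20"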

Best practices

  1. Use separate consumer group IDs for local development to avoid conflicts
  2. Test schema changes locally before deploying to production
  3. Use default values in tb_secret() for local development
  4. Keep local and cloud configs separate to avoid accidental deployments
  5. Clean up test topics regularly to avoid clutter (see the cleanup commands after this list)
  6. Use Docker Compose for consistent local environments
  7. Document local setup for your team
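
For item 5, the commands below remove a single test topic or reset the whole local stack, including the Kafka data volume:

# Delete a single test topic
docker exec -it broker /opt/kafka/bin/kafka-topics.sh --delete \
  --topic test-topic \
  --bootstrap-server localhost:9092

# Or tear everything down, including named volumes
docker compose down -v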