AWS MSK setup guide

This guide walks you through setting up Tinybird's Kafka connector with Amazon MSK (Managed Streaming for Apache Kafka), including IAM authentication, network configuration, and security group setup.

Prerequisites

  • An AWS account with an MSK cluster
  • AWS IAM permissions to create roles and policies
  • Access to AWS Console
  • A Tinybird workspace

Step 1: Get your MSK cluster details

Bootstrap servers

  1. In AWS Console, navigate to Amazon MSK
  2. Select your cluster
  3. Go to Properties tab
  4. Copy the Bootstrap broker string (for example, b-1.example-cluster.abc123.c2.kafka.us-east-1.amazonaws.com:9098,b-2.example-cluster.abc123.c2.kafka.us-east-1.amazonaws.com:9098)

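Important: Use the port that matches your authentication method. The ports below apply to connections from within AWS. When public access is turned on, MSK uses a separate port for the same method, shown in parentheses, and the public bootstrap broker string already includes it:

  • Port 9098 (public access: 9198) for SASL/IAM (OAUTHBEARER)
  • Port 9096 (public access: 9196) for SASL/SCRAM
  • Port 9094 (public access: 9194) for TLS
  • Port 9092 for plaintext (not recommended for production)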
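A port mismatch is easy to catch before you create the connection. The following Python sketch checks every broker in a bootstrap string against the ports expected for an authentication method. It is illustrative only: the function name and the EXPECTED_PORTS table are assumptions made for this example, not part of Tinybird or AWS tooling. The second value in each set is the port MSK uses for that method over public access (9198, 9196, 9194), taken from the AWS MSK port documentation.

```python
# Ports per authentication method. The first value is the in-AWS port; the
# second is the port MSK uses for the same method over public access.
EXPECTED_PORTS = {
    "iam": {9098, 9198},
    "scram": {9096, 9196},
    "tls": {9094, 9194},
    "plaintext": {9092},
}


def mismatched_brokers(bootstrap: str, auth: str) -> list[str]:
    """Return the brokers whose port does not fit the given auth method."""
    allowed = EXPECTED_PORTS[auth]
    bad = []
    for broker in bootstrap.split(","):
        broker = broker.strip()
        if not broker:
            continue
        # rpartition leaves sep empty when the broker has no ":port" suffix
        _host, sep, port_text = broker.rpartition(":")
        if not sep or not port_text.isdigit() or int(port_text) not in allowed:
            bad.append(broker)
    return bad
```

An empty list means every broker uses a port that fits the chosen method.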

Cluster ARN

  1. Open the Properties tab of your cluster
  2. Copy the Cluster ARN (for example, arn:aws:kafka:us-east-1:123456789012:cluster/example-cluster/abc123-def456-789)

You need this for the IAM policy configuration.
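The cluster ARN already contains the region, account ID, and cluster name that the IAM policy in Step 3 asks for. As a minimal sketch, assuming the ARN follows the shape of the example above, the parts can be pulled out with a regular expression. The function name and the group names are hypothetical.

```python
import re

# Shape of an MSK cluster ARN, based on the example above:
# arn:aws:kafka:<region>:<account id>:cluster/<cluster name>/<cluster uuid>
# The partition group also accepts values such as aws-cn and aws-us-gov.
_CLUSTER_ARN = re.compile(
    r"^arn:(?P<partition>aws[a-z-]*):kafka:(?P<region>[a-z0-9-]+):"
    r"(?P<account_id>\d{12}):cluster/(?P<cluster_name>[^/]+)/(?P<cluster_uuid>[^/]+)$"
)


def parse_cluster_arn(arn: str) -> dict[str, str]:
    """Split an MSK cluster ARN into its named parts, or raise ValueError."""
    match = _CLUSTER_ARN.match(arn.strip())
    if match is None:
        raise ValueError(f"not an MSK cluster ARN: {arn!r}")
    return match.groupdict()
```

Raising on a malformed ARN surfaces copy-and-paste errors early, before they end up in a policy document.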

Step 2: Create the Kafka connection

You can create the Kafka connection using the CLI wizard or by manually creating a Connection file.

Option 1: Use the CLI wizard

Run the following command to create a connection interactively:

tb connection create kafka

The wizard prompts you for:

  1. Connection name
  2. Bootstrap server
  3. Kafka key (for SASL/SCRAM) or IAM role ARN (for IAM authentication)
  4. Kafka secret (for SASL/SCRAM) or external ID (for IAM authentication)

After the wizard completes, edit the generated Connection file to add the MSK-specific IAM settings shown in Step 5, such as KAFKA_SASL_OAUTHBEARER_METHOD AWS and KAFKA_SASL_OAUTHBEARER_AWS_ROLE_ARN.

Option 2: Manually create a Connection file

For AWS MSK with IAM authentication, manually create the Connection file with the specific IAM settings. Complete Steps 3 and 4 first, then follow the manual setup in Step 5.

Step 3: Set up IAM authentication

MSK supports IAM authentication using OAUTHBEARER. You need to create an IAM role that Tinybird can assume.

Create IAM role

  1. In AWS Console, go to IAM > Roles
  2. Select Create role
  3. Select AWS account as the trusted entity type
  4. For Account ID, use Tinybird's AWS account ID (contact support for the correct ID for your region)
  5. Check Require external ID and enter a unique external ID (save this for later)
  6. Select Next

Create access policy

Create a policy that grants access to your MSK cluster:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "kafka-cluster:Connect",
                "kafka-cluster:AlterCluster",
                "kafka-cluster:DescribeCluster"
            ],
            "Resource": "arn:aws:kafka:<REGION>:<ACCOUNT_ID>:cluster/<CLUSTER_NAME>/*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "kafka-cluster:DescribeTopic",
                "kafka-cluster:CreateTopic",
                "kafka-cluster:WriteData",
                "kafka-cluster:ReadData"
            ],
            "Resource": "arn:aws:kafka:<REGION>:<ACCOUNT_ID>:topic/<CLUSTER_NAME>/*/<TOPIC_NAME>"
        },
        {
            "Effect": "Allow",
            "Action": [
                "kafka-cluster:AlterGroup",
                "kafka-cluster:DescribeGroup"
            ],
            "Resource": "arn:aws:kafka:<REGION>:<ACCOUNT_ID>:group/<CLUSTER_NAME>/*/<GROUP_ID>"
        }
    ]
}

Replace:

  • <REGION>: Your AWS region (for example, us-east-1)
  • <ACCOUNT_ID>: Your AWS account ID
  • <CLUSTER_NAME>: Your MSK cluster name
  • <TOPIC_NAME>: Your Kafka topic name (or * for all topics)
  • <GROUP_ID>: Your consumer group ID (or * for all groups)
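Filling in the placeholders by hand is error-prone because the same three values appear in every Resource line. The following sketch builds the policy shown above from those values. It is an illustration under stated assumptions: build_access_policy is a hypothetical helper, and "events" in the usage lines is a made-up topic name. The actions and ARN patterns are copied from the policy above.

```python
import json


def build_access_policy(region, account_id, cluster_name, topic="*", group="*"):
    """Return the access policy above as a dict, with placeholders filled in."""
    prefix = f"arn:aws:kafka:{region}:{account_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "kafka-cluster:Connect",
                    "kafka-cluster:AlterCluster",
                    "kafka-cluster:DescribeCluster",
                ],
                "Resource": f"{prefix}:cluster/{cluster_name}/*",
            },
            {
                "Effect": "Allow",
                "Action": [
                    "kafka-cluster:DescribeTopic",
                    "kafka-cluster:CreateTopic",
                    "kafka-cluster:WriteData",
                    "kafka-cluster:ReadData",
                ],
                "Resource": f"{prefix}:topic/{cluster_name}/*/{topic}",
            },
            {
                "Effect": "Allow",
                "Action": ["kafka-cluster:AlterGroup", "kafka-cluster:DescribeGroup"],
                "Resource": f"{prefix}:group/{cluster_name}/*/{group}",
            },
        ],
    }


if __name__ == "__main__":
    # "events" is a hypothetical topic name used only for this example
    policy = build_access_policy("us-east-1", "123456789012", "example-cluster", topic="events")
    print(json.dumps(policy, indent=4))
```

The printed JSON can be pasted into the IAM policy editor. Leaving topic and group at their default of * grants access to all topics and groups, so pass explicit names when you want a narrower policy.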

Alternative: Use Tinybird's API to generate the policy:

curl "https://api.tinybird.co/v0/integrations/kafka/policies/read-access-policy?msk_cluster_arn=<CLUSTER_ARN>&topics=<TOPIC_NAME>&groups=<GROUP_ID>"
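A cluster ARN contains colons and slashes, so it is safer to percent-encode the query parameters than to paste them into the URL by hand. The sketch below only builds the URL from the endpoint and parameter names shown in the curl command above; it does not send the request. Whether the call needs an authentication token or a region-specific host is not stated here, so treat both as open questions. The function name is hypothetical.

```python
from urllib.parse import urlencode

# Endpoint copied from the curl command above
BASE = "https://api.tinybird.co/v0/integrations/kafka/policies/read-access-policy"


def read_access_policy_url(cluster_arn: str, topic: str, group: str) -> str:
    """Build the policy URL with every query parameter percent-encoded."""
    query = urlencode(
        {"msk_cluster_arn": cluster_arn, "topics": topic, "groups": group}
    )
    return f"{BASE}?{query}"
```

Pass the returned string to curl in double quotes so the shell does not interpret the & characters.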

Create trust policy

The trust policy allows Tinybird to assume the role:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Principal": {
                "AWS": "arn:aws:iam::<TINYBIRD_ACCOUNT_ID>:root"
            },
            "Condition": {
                "StringEquals": {
                    "sts:ExternalId": "<EXTERNAL_ID>"
                }
            }
        }
    ]
}

Replace:

  • <TINYBIRD_ACCOUNT_ID>: Tinybird's AWS account ID (contact support for your region)
  • <EXTERNAL_ID>: The external ID you set when creating the role

Alternative: Use Tinybird's API:

curl "https://api.tinybird.co/v0/integrations/kafka/policies/trust-policy?external_id_seed=<CONNECTION_NAME>"

Attach policies to role

  1. Attach the access policy to your IAM role
  2. Set the trust policy on the role
  3. Copy the Role ARN (for example, arn:aws:iam::123456789012:role/msk-tinybird-role)

Step 4: Configure security groups

MSK security group

  1. In AWS Console, go to EC2 > Security Groups
  2. Find the security group used by your MSK cluster
  3. Add an inbound rule:
    • Type: Custom TCP
    • Port: 9098 (or the port that matches your authentication method, as listed in Step 1)
    • Source: Tinybird's IP ranges (contact support for details)

Note: For PrivateLink setups (Enterprise), security group configuration may differ.

Step 5: Create the Kafka connection (manual method)

If you didn't use the CLI wizard in Step 2, create a Connection file manually:

connections/aws_msk.connection
TYPE kafka
KAFKA_BOOTSTRAP_SERVERS <BOOTSTRAP_BROKER_STRING>
KAFKA_SECURITY_PROTOCOL SASL_SSL
KAFKA_SASL_MECHANISM OAUTHBEARER
KAFKA_SASL_OAUTHBEARER_METHOD AWS
KAFKA_SASL_OAUTHBEARER_AWS_REGION us-east-1
KAFKA_SASL_OAUTHBEARER_AWS_ROLE_ARN {{ tb_secret("AWS_ROLE_ARN") }}
KAFKA_SASL_OAUTHBEARER_AWS_EXTERNAL_ID <EXTERNAL_ID>

Set the role ARN secret:

tb [--cloud] secret set AWS_ROLE_ARN <YOUR_ROLE_ARN>

Replace:

  • <BOOTSTRAP_BROKER_STRING>: The bootstrap broker string from Step 1
  • <EXTERNAL_ID>: The external ID you set in the IAM role in Step 3
  • us-east-1: Your AWS region
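Before deploying, it can help to confirm that no placeholder was left in the file and that none of the settings shown above is missing. The following sketch does both. It assumes each line has the form "KEY value", as in the example file, and the names EXPECTED_SETTINGS, parse_connection, and lint_connection are hypothetical, not part of the Tinybird CLI.

```python
import re

# Settings that appear in the example Connection file above
EXPECTED_SETTINGS = (
    "TYPE",
    "KAFKA_BOOTSTRAP_SERVERS",
    "KAFKA_SECURITY_PROTOCOL",
    "KAFKA_SASL_MECHANISM",
    "KAFKA_SASL_OAUTHBEARER_METHOD",
    "KAFKA_SASL_OAUTHBEARER_AWS_REGION",
    "KAFKA_SASL_OAUTHBEARER_AWS_ROLE_ARN",
    "KAFKA_SASL_OAUTHBEARER_AWS_EXTERNAL_ID",
)


def parse_connection(text: str) -> dict[str, str]:
    """Read 'KEY value' lines into a dict (assumed file layout)."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        settings[key] = value.strip()
    return settings


def lint_connection(text: str) -> list[str]:
    """Report missing settings and placeholders such as <EXTERNAL_ID>."""
    settings = parse_connection(text)
    problems = [f"missing {name}" for name in EXPECTED_SETTINGS if name not in settings]
    for key, value in settings.items():
        if re.search(r"<[A-Z_]+>", value):
            problems.append(f"placeholder left in {key}")
    return problems
```

An empty result means the file has all eight settings and no leftover <PLACEHOLDER> values. It does not validate the values themselves.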

Step 6: VPC and network configuration

Standard setup

For most MSK clusters, Tinybird connects via the public endpoint. Ensure:

  1. Your MSK cluster has public access turned on
  2. Security groups allow inbound connections from Tinybird
  3. Network ACLs don't block the connection

PrivateLink setup (Enterprise)

For PrivateLink connectivity:

  1. Ensure your MSK cluster supports PrivateLink
  2. Contact Tinybird support to set up a PrivateLink endpoint
  3. Use the PrivateLink endpoint as your bootstrap server

Contact support@tinybird.co with:

  • Your MSK cluster details
  • VPC and subnet information
  • Your Tinybird organization name

Step 7: Create the Kafka Data Source

Now that your connection is configured, create a Kafka Data Source. See Create a Kafka data source in the main Kafka connector guide for detailed instructions on:

  • Using tb datasource create --kafka for a guided setup
  • Manually creating Data Source files
  • Defining schemas with JSONPath expressions
  • Configuring Kafka-specific settings

Step 8: Test the connection

Test your connection and preview data:

tb connection data aws_msk

This command prompts you to select a topic and consumer group ID, then returns preview data. This verifies that Tinybird can:

  1. Assume the IAM role
  2. Connect to your MSK cluster
  3. Authenticate using OAUTHBEARER
  4. Consume messages from the topic

Common AWS MSK issues

Issue: IAM authentication failed

Symptoms:

  • "Authentication failed" errors
  • "Unable to assume role" errors

Solutions:

  1. Verify the IAM role ARN is correct
  2. Check the trust policy allows Tinybird's AWS account
  3. Verify the external ID matches between connection and trust policy
  4. Ensure the IAM role has the correct access policy attached
  5. Check IAM role permissions in AWS Console
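The external ID comparison in the list above can be checked mechanically. The sketch below reads a trust policy document, collects the sts:ExternalId values from its sts:AssumeRole statements, and compares them with the value used in the Connection file. It assumes the policy has the shape of the trust policy in Step 3, and the function names are hypothetical.

```python
import json


def trust_policy_external_ids(policy_json: str) -> set[str]:
    """Collect the external IDs allowed by sts:AssumeRole statements."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM also accepts a single statement object
        statements = [statements]
    found = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRole" not in actions:
            continue
        condition = statement.get("Condition", {}).get("StringEquals", {})
        value = condition.get("sts:ExternalId")
        if isinstance(value, str):
            found.add(value)
        elif isinstance(value, list):
            found.update(value)
    return found


def external_id_matches(policy_json: str, connection_external_id: str) -> bool:
    """True when the connection's external ID is allowed by the trust policy."""
    return connection_external_id in trust_policy_external_ids(policy_json)
```

You can copy the trust policy JSON from the Trust relationships tab of the role in the IAM console and pass it in as a string.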

Issue: Security group blocking connection

Symptoms:

  • Connection timeout errors
  • Broker unreachable

Solutions:

  1. Verify security group allows inbound traffic on the correct port
  2. Check source IP ranges (contact Tinybird support for current IPs)
  3. Verify network ACLs don't block the connection
  4. For PrivateLink, ensure VPC endpoint is configured correctly

Issue: Network connectivity

Symptoms:

  • Connection timeout
  • Unable to reach bootstrap servers

Solutions:

  1. Verify bootstrap server address is correct
  2. Check if MSK cluster has public access turned on
  3. Verify DNS resolution for MSK cluster endpoints
  4. For PrivateLink, ensure endpoint is active and accessible
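To separate DNS problems from blocked ports, you can try a plain TCP connection to each broker. The sketch below does this with Python's standard socket module. Note the limits of the test: it runs from your own machine, so a success only shows that the address resolves and the port accepts connections from where you are, not that Tinybird's IP ranges are allowed by the security group. The function name is hypothetical.

```python
import socket


def check_brokers(bootstrap: str, timeout: float = 5.0) -> dict[str, str]:
    """Try a TCP connection to each broker and report the outcome per broker."""
    results = {}
    for broker in bootstrap.split(","):
        broker = broker.strip()
        if not broker:
            continue
        host, _, port = broker.rpartition(":")
        if not host or not port.isdigit():
            results[broker] = "invalid broker address"
            continue
        try:
            # create_connection resolves the host name, then opens a TCP socket
            with socket.create_connection((host, int(port)), timeout=timeout):
                results[broker] = "reachable"
        except socket.gaierror as exc:
            results[broker] = f"DNS resolution failed: {exc}"
        except OSError as exc:
            # covers timeouts and refused connections
            results[broker] = f"connection failed: {exc}"
    return results
```

A "DNS resolution failed" result points at the bootstrap server address, while "connection failed" after a timeout usually points at security groups, network ACLs, or public access being turned off.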

Best practices

  1. Use least-privilege IAM policies: grant access only to specific topics and groups
  2. Use unique external IDs for each connection
  3. Monitor IAM role usage in CloudTrail
  4. Rotate IAM roles periodically for security
  5. Use separate roles for different environments (dev, staging, prod)
  6. Turn on VPC flow logs to monitor network traffic
  7. Set up CloudWatch alarms for MSK cluster health

Troubleshooting IAM permissions

If you encounter permission errors, verify:

  1. Access policy grants the required Kafka cluster actions
  2. Trust policy allows Tinybird to assume the role
  3. External ID matches in both connection and trust policy
  4. Role ARN is correct in the Connection file
  5. Region matches your MSK cluster region

Use AWS CloudTrail to see detailed error messages for IAM authentication failures.
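The region check in the list above can be automated, because MSK broker host names include the region (for example, kafka.us-east-1.amazonaws.com in the bootstrap string from Step 1). The following sketch extracts the region from each broker and compares it with the value of KAFKA_SASL_OAUTHBEARER_AWS_REGION. The host name pattern is an assumption based on that example, and the function names are hypothetical.

```python
import re

# Matches the region in hosts such as
# b-1.example-cluster.abc123.c2.kafka.us-east-1.amazonaws.com
_MSK_HOST = re.compile(
    r"\.kafka(?:-serverless)?\.([a-z0-9-]+)\.amazonaws\.com(?:\.cn)?$"
)


def regions_in_bootstrap(bootstrap: str) -> set[str]:
    """Return the set of regions found in the broker host names."""
    regions = set()
    for broker in bootstrap.split(","):
        broker = broker.strip()
        # drop the :port suffix when present
        host = broker.rpartition(":")[0] or broker
        match = _MSK_HOST.search(host)
        if match:
            regions.add(match.group(1))
    return regions


def region_matches(bootstrap: str, configured_region: str) -> bool:
    """True when every recognized broker is in the configured region."""
    return regions_in_bootstrap(bootstrap) == {configured_region}
```

Hosts that do not match the pattern, such as a PrivateLink endpoint with a custom name, are ignored, so an empty set means the region could not be inferred rather than that it is wrong.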
