The Confluent Connector allows you to ingest data from your existing Confluent Cloud cluster and load it into Tinybird.
The Confluent Connector is fully managed and requires no additional tooling. Connect Tinybird to your Confluent Cloud cluster, choose a topic, and Tinybird will automatically begin consuming messages from Confluent Cloud.
Note that you need to grant READ permissions to both the Topic and the Consumer Group to ingest data from Confluent into Tinybird.
Using the UI¶
To connect Tinybird to your Confluent Cloud cluster, click the + icon next to the data project section on the left navigation menu, select Data Source, and click Confluent from the list of available Data Sources.
You will be prompted to enter the following details:
Connection name: A name for the Confluent Cloud connection in Tinybird
Bootstrap Server: The comma-separated list of bootstrap servers (including Port numbers)
Key: The Key component of the Confluent Cloud API Key
Secret: The Secret component of the Confluent Cloud API Key
Decode Avro messages with schema registry: Optionally, you can enable Schema Registry support to decode Avro messages. You will be prompted to enter the Schema Registry URL, username and password.
Once you have entered the details, click Connect. This will create the connection between Tinybird and Confluent Cloud. You will then see a list of your existing topics and can select the topic to consume from. Tinybird will create a Group ID that specifies the name of the consumer group this Kafka consumer belongs to. You can customize the Group ID, but ensure that your Group ID has read permissions to the topic.
Once you have chosen a topic, you can select the starting offset to consume from. You can choose to consume from the latest offset or the earliest offset. If you choose to consume from the earliest offset, Tinybird will consume all messages from the beginning of the topic. If you choose to consume from the latest offset, Tinybird will only consume messages that are produced after the connection is created. Select the offset, and click Next.
Tinybird will then consume a sample of messages from the topic and display the schema. You can adjust the schema and Data Source settings as needed, then click Create Data Source to create the Data Source.
Tinybird will now begin consuming messages from the topic and loading them into the Data Source.
Using .datasource files¶
If you are managing your Tinybird resources in files, there are several settings available to configure the Confluent Connector in the .datasource file:
KAFKA_CONNECTION_NAME: The name of the configured Confluent Cloud connection in Tinybird
KAFKA_BOOTSTRAP_SERVERS: The comma-separated list of bootstrap servers (including Port numbers)
KAFKA_KEY: The Key component of the Confluent Cloud API Key
KAFKA_SECRET: The Secret component of the Confluent Cloud API Key
KAFKA_TOPIC: The name of the Kafka topic to consume from
KAFKA_GROUP_ID: The Kafka Consumer Group ID to use when consuming from Confluent Cloud
KAFKA_AUTO_OFFSET_RESET: The offset to use when no previous offset can be found, e.g. when creating a new consumer. Supported values: latest, earliest
KAFKA_STORE_RAW_VALUE: Optionally, you can store the raw message in its entirety as an additional column. Supported values: 'True', 'False'
For example, to define a Data Source with a new Confluent Cloud connection in a .datasource file:
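A sketch of such a file follows; the schema, engine settings, and all values (server, key, topic, and so on) are illustrative placeholders to adapt to your own cluster:

```
SCHEMA >
    `value` String,
    `topic` LowCardinality(String),
    `partition` Int16,
    `offset` Int64,
    `timestamp` DateTime,
    `key` String

ENGINE "MergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(timestamp)"
ENGINE_SORTING_KEY "timestamp"

KAFKA_CONNECTION_NAME my_connection_name
KAFKA_BOOTSTRAP_SERVERS my_server:9092
KAFKA_KEY my_key
KAFKA_SECRET my_secret
KAFKA_TOPIC my_topic
KAFKA_GROUP_ID my_group_id
```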
Or, to define a Data Source that uses an existing Confluent Cloud connection:
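Because the existing connection already stores the bootstrap servers and credentials, only the connection name, topic, and Group ID need to be set. A sketch, with placeholder names:

```
SCHEMA >
    `value` String,
    `topic` LowCardinality(String),
    `partition` Int16,
    `offset` Int64,
    `timestamp` DateTime,
    `key` String

ENGINE "MergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(timestamp)"
ENGINE_SORTING_KEY "timestamp"

KAFKA_CONNECTION_NAME my_connection_name
KAFKA_TOPIC my_topic
KAFKA_GROUP_ID my_group_id
```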
Using INCLUDE to store connection settings¶
To avoid configuring the same connection settings across many files, or to prevent leaking sensitive information, you can store connection details in an external file and use INCLUDE to import them into one or more .datasource files.
You can find more information about INCLUDE in the Advanced Templates documentation.
As an example, you may have two Confluent Cloud
.datasource files, which re-use the same Confluent Cloud connection. You can create an include file which stores the Confluent Cloud connection details.
The Tinybird project may use the following structure:
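A hypothetical layout (file and directory names other than my_connector_name.incl are placeholders):

```
ecommerce_data_project/
├── connections/
│   └── my_connector_name.incl
└── datasources/
    ├── my_confluent_datasource.datasource
    └── another_confluent_datasource.datasource
```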
Where the file
my_connector_name.incl has the following content:
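The include file would hold only the shared connection settings, for example (placeholder values):

```
KAFKA_CONNECTION_NAME my_connection_name
KAFKA_BOOTSTRAP_SERVERS my_server:9092
KAFKA_KEY my_key
KAFKA_SECRET my_secret
```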
And the Confluent Cloud
.datasource files look like the following:
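Each .datasource file then imports the shared connection settings and adds its own topic-specific ones. A sketch, assuming the include file lives at connections/my_connector_name.incl:

```
SCHEMA >
    `value` String,
    `topic` LowCardinality(String),
    `partition` Int16,
    `offset` Int64,
    `timestamp` DateTime,
    `key` String

ENGINE "MergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(timestamp)"
ENGINE_SORTING_KEY "timestamp"

INCLUDE "connections/my_connector_name.incl"

KAFKA_TOPIC my_topic
KAFKA_GROUP_ID my_group_id
```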
When you use tb pull to pull a Confluent Cloud Data Source using the CLI, the KAFKA_KEY and KAFKA_SECRET settings will not be included in the file, to avoid exposing credentials.
Is the Confluent Cloud Schema Registry supported?¶
Yes, for decoding Avro messages. You can choose to enable Schema Registry support when connecting Tinybird to Confluent Cloud. You will be prompted to add your Schema Registry connection details, i.e. the Schema Registry URL, username, and password.
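These details typically look like the following; the endpoint shown is a made-up placeholder, and the username and password are the Schema Registry API key and secret from your Confluent Cloud cluster:

```
Schema Registry URL: https://psrc-xxxxx.us-east-2.aws.confluent.cloud
Username: <Schema Registry API key>
Password: <Schema Registry API secret>
```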
However, the Confluent Cloud Data Source schema will not be defined using the Schema Registry; the Schema Registry is only used to decode the messages.
Can Tinybird ingest compressed messages?¶
Yes, Tinybird can consume from Kafka topics where Kafka compression has been enabled, as decompressing the message is a standard function of the Kafka Consumer.
However, if you compressed the message before passing it through the Kafka Producer, then Tinybird cannot do post-Consumer processing to decompress the message.
For example, if you compressed a JSON message with gzip and produced it to a Kafka topic as a bytes message, it would be ingested by Tinybird as bytes. If you produced a JSON message to a Kafka topic with the Kafka Producer setting compression.type=gzip, it would be stored in Kafka as compressed bytes, but it would be decoded on ingestion and arrive in Tinybird as JSON.