Kafka Sink¶
Kafka Sinks are currently in private beta. If you have any feedback or suggestions, contact Tinybird at support@tinybird.co or in the Community Slack.
Tinybird's Kafka Sink allows you to push the results of a query to a Kafka topic. Queries can be executed on a defined schedule or on-demand.
Common uses for the Kafka Sink include:
- Pushing events to Kafka as part of an event-driven architecture.
- Exporting data to other systems that consume data from Kafka.
- Hydrating a data lake or data warehouse with real-time data.
Prerequisites¶
To use the Kafka Sink, you need to have a Kafka cluster that Tinybird can reach via the internet, or via private networking for Enterprise customers.
Configure using the UI¶
1. Create a Pipe and promote it to Sink Pipe¶
In the Tinybird UI, create a Pipe and write the query that produces the result you want to export. In the top right "Create API Endpoint" menu, select "Create Sink". In the modal, choose the destination (Kafka).
2. Choose the scheduling options¶
You can configure your Sink to run using a cron expression, so it runs automatically when needed.
3. Configure destination topic¶
Enter the Kafka topic where events are going to be pushed.
4. Preview and create¶
The final step is to check and confirm that the preview matches what you expect.
🎉 Congratulations! You've created your first Sink.
Configure using the CLI¶
1. Create the Kafka Connection¶
Run the tb connection create kafka command, and follow the instructions.
2. Create Kafka Sink Pipe¶
To create a Sink Pipe, create a regular .pipe file and, in the SQL section, select the data you want to export to your Kafka topic, as in any other Pipe. Then set the Pipe type to sink and add the required configuration. Your Pipe should have the following structure:

    NODE node_0
    SQL >
        SELECT *
        FROM events
        WHERE time >= toStartOfMinute(now()) - interval 30 minute

    TYPE sink
    EXPORT_SERVICE kafka
    EXPORT_CONNECTION_NAME "test_kafka"
    EXPORT_KAFKA_TOPIC "test_kafka_topic"
    EXPORT_SCHEDULE "*/5 * * * *"
Pipe parameters
For this step, you will need to configure the following Pipe parameters:
Key | Type | Description |
---|---|---|
EXPORT_CONNECTION_NAME | string | Required. The connection name to the destination service. This is the connection created in Step 1. |
EXPORT_KAFKA_TOPIC | string | Required. The desired topic for the export data. |
EXPORT_SCHEDULE | string | A crontab expression that sets the frequency of the Sink operation, or the string @on-demand. |
Once ready, push the datafile to your Workspace using tb push (or tb deploy if you are using version control) to create the Sink Pipe.
Scheduling¶
The schedule applied doesn't guarantee that the underlying job executes immediately at the configured time. The job is placed into a job queue when the configured time elapses. It is possible that, if the queue is busy, the job could be delayed and executed after the scheduled time.
To reduce the chances of a busy queue affecting your Sink Pipe execution schedule, we recommend distributing the jobs over a wider period of time rather than grouped close together.
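One way to spread jobs out is to give each Sink Pipe a different minute offset within the same interval. The following is a minimal sketch in Python, not a Tinybird tool: the helper name `staggered_schedules` is hypothetical, and it assumes standard five-field crontab expressions with explicit minute lists.

```python
def staggered_schedules(n_sinks: int, every_minutes: int) -> list[str]:
    """Return one five-field cron expression per Sink, each offset by one minute.

    Hypothetical helper for illustration. Assumes every_minutes divides 60
    so that each schedule repeats evenly within the hour.
    """
    if every_minutes <= 0 or 60 % every_minutes != 0:
        raise ValueError("every_minutes must be a positive divisor of 60")

    schedules = []
    for i in range(n_sinks):
        # Wrap around once there are more Sinks than available minute offsets.
        offset = i % every_minutes
        minutes = ",".join(str(m) for m in range(offset, 60, every_minutes))
        schedules.append(f"{minutes} * * * *")
    return schedules


if __name__ == "__main__":
    # Three Sinks that each run every 5 minutes, starting at minutes 0, 1, and 2.
    for expr in staggered_schedules(3, 5):
        print(f'EXPORT_SCHEDULE "{expr}"')
```

Each generated expression can be used as the EXPORT_SCHEDULE value of a different Sink Pipe, so the jobs enter the queue at different minutes rather than all at once.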
For Enterprise customers, these settings can be customized. Reach out to your Customer Success team or email us at support@tinybird.co.
Query parameters¶
You can add query parameters to your Sink, the same way you do in API Endpoints or Copy Pipes.
For scheduled executions, the default values for the parameters will be used when the Sink runs.
Iterating a Kafka Sink (Coming soon)¶
Iterating features for Kafka Sinks are not yet supported in the beta. They are documented here for future reference.
Sinks can be iterated using version control, similar to other resources in your project. When you create a Branch, resources are cloned from the main Branch.
However, there are two considerations to understand for Kafka Sinks:
1. Schedules
When you create a Branch with an existing Kafka Sink, the resource will be cloned into the new Branch. However, it will not be scheduled. This prevents Branches from running exports unintentionally and consuming resources, as it is common that development Branches do not need to export to external systems. If you want these queries to run in a Branch, you must recreate the Kafka Sink in the new Branch.
2. Connections
Connections are not cloned when you create a Branch. You need to create a new Kafka connection in the new Branch for the Kafka Sink.
Observability¶
Kafka Sink operations are logged in the tinybird.sinks_ops_log Service Data Source.
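As a rough illustration, the log can be inspected like any other Data Source, for example through the Query API. The sketch below only builds the HTTP request, using the Python standard library. The host, the token value, and the helper name `build_sinks_log_request` are placeholders and assumptions; the query selects all columns because this page does not list the log's schema.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sinks_log_request(host: str, token: str, limit: int = 10) -> Request:
    """Build (but do not send) a Query API request for recent Sink operations.

    Hypothetical helper: `host` is your region's API host and `token` is a
    token with read access; both are assumptions, not values from this page.
    """
    sql = f"SELECT * FROM tinybird.sinks_ops_log LIMIT {int(limit)} FORMAT JSON"
    url = f"{host}/v0/sql?{urlencode({'q': sql})}"
    return Request(url, headers={"Authorization": f"Bearer {token}"})


if __name__ == "__main__":
    req = build_sinks_log_request("https://api.tinybird.co", "<your-token>")
    print(req.full_url)
```

To run the query, pass the returned request to `urllib.request.urlopen` and parse the JSON response body.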
Limits & quotas¶
Check the limits page for limits on ingestion, queries, API Endpoints, and more.
Billing¶
Any Processed Data incurred by a Kafka Sink is charged at the standard rate for your account. The Processed Data is already included in your plan, and counts towards your commitment. If you're on an Enterprise plan, view your plan and commitment on the Organizations tab in the UI.
Next steps¶
- Get familiar with the Service Data Source and see what's going on in your account
- Deep dive on Tinybird's Pipes concept