PricingDocs
Bars

Data Platform

Managed ClickHouse
Production-ready with Tinybird's DX
Streaming ingestion
High-throughput streaming ingest
Schema iteration
Safe migrations with zero downtime
Connectors
Plug and play Kafka, S3, and GCS

Developer Experience

Instant SQL APIs
Turn SQL into an endpoint
BI & Tool Connections
Connect your BI tools and ORMs
Tinybird Code
Ingest and query from your terminal

Enterprise

Tinybird AI
AI resources for LLMs and agents
High availability
Fault-tolerance and auto failovers
Security and compliance
Certified SOC 2 Type II for enterprise
Sign inSign up
Resources []

Learn

Blog
Musings on transformations, tables and everything in between
Customer Stories
We help software teams ship features with massive data sets
Videos
Learn how to use Tinybird with our videos
ClickHouse for Developers
Understand ClickHouse with our video series

Build

Templates
Explore our collection of templates
Tinybird Builds
We build stuff live with Tinybird and our partners
Changelog
The latest updates to Tinybird

Community

Slack Community
Join our Slack community to get help and share your ideas
Open Source Program
Get help adding Tinybird to your open source project
Schema > Evolution
Join the most read technical biweekly engineering newsletter

Our Columns:

Skip the infra work. Deploy your first ClickHouse
project now

Get started for freeRead the docs
A geometric decoration with a matrix of rectangles.

Product /

ProductWatch the demoPricingSecurityRequest a demo

Company /

About UsPartnersShopCareers

Features /

Managed ClickHouseStreaming IngestionSchema IterationConnectorsInstant SQL APIsBI & Tool ConnectionsTinybird CodeTinybird AIHigh AvailabilitySecurity & Compliance

Support /

DocsSupportTroubleshootingCommunityChangelog

Resources /

ObservabilityBlogCustomer StoriesTemplatesTinybird BuildsTinybird for StartupsRSS FeedNewsletter

Integrations /

Apache KafkaConfluent CloudRedpandaGoogle BigQuerySnowflakePostgres Table FunctionAmazon DynamoDBAmazon S3

Use Cases /

User-facing dashboardsReal-time Change Data Capture (CDC)Gaming analyticsWeb analyticsReal-time personalizationUser-generated content (UGC) analyticsContent recommendation systemsVector search
All systems operational

Copyright © 2025 Tinybird. All rights reserved

|

Terms & conditionsCookiesTrust CenterCompliance Helpline
Tinybird wordmark

The perfect data ingestion API design

If you ask me, this is pretty much perfect.
Scalable Analytics Architecture
Javi Santana, Co-founder

The perfect data ingestion API design... does not exist 🙂.

I used the title to catch your attention, but I do think I’ve designed something close to perfect. Check it out and tell me what you'd change.

Easy to use

You can send data to it with any programming language in a few lines of code.
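As a rough idea of what "a few lines" looks like, here is a minimal Python sketch using only the standard library. The URL, the `name` query parameter, the token, and the helper names are placeholders I made up for illustration, not the real API surface.

```python
import json
import urllib.request

# Hypothetical endpoint and token, for illustration only.
INGEST_URL = "https://api.example.com/v0/events?name=page_views"
TOKEN = "YOUR_TOKEN"


def build_ingest_request(events, url=INGEST_URL, token=TOKEN):
    """Build a POST request whose body is one JSON object per line (NDJSON)."""
    body = "\n".join(json.dumps(e) for e in events).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/x-ndjson",
        },
    )


def send(events):
    """Send the events and return the HTTP status code."""
    with urllib.request.urlopen(build_ingest_request(events), timeout=10) as resp:
        return resp.status
```

Calling `send([{"event": "click", "user_id": 7}])` would post a single event; the same call works for a batch of thousands.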

A format for the web

It accepts NDJSON and JSON. Maybe I'd add support for Parquet, but I think compressed NDJSON is good enough. 

Being web-compatible allows you to connect almost any kind of webhook. Or send it from a JavaScript snippet.
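To make "compressed NDJSON" concrete, here is a small sketch that serializes events one JSON object per line and compresses the result. I'm assuming gzip as the codec here; the helper names are hypothetical.

```python
import gzip
import json


def to_compressed_ndjson(events):
    """Serialize events as NDJSON (one JSON object per line) and gzip the result."""
    ndjson = "".join(json.dumps(e, separators=(",", ":")) + "\n" for e in events)
    return gzip.compress(ndjson.encode("utf-8"))


def from_compressed_ndjson(payload):
    """Inverse of to_compressed_ndjson: gunzip and parse each non-empty line."""
    text = gzip.decompress(payload).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line]
```

NDJSON compresses well because the same keys repeat on every line, which is a big part of why it is good enough for most payloads.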

Schema >>> schemaless

When working with a lot of data, schemaless is a waste of money and resources, both on storage and processing. The API transforms the attributes into columns (stored with the right type in a columnar database), leading to 10x-100x improvements in both.
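The attribute-to-column step can be sketched like this. The schema, the column names, and the `to_columns` helper are assumptions for illustration; a real columnar database does this with its own type system, not Python lists.

```python
import json
from datetime import datetime

# Hypothetical schema: column name -> converter to the column's type.
SCHEMA = {
    "timestamp": datetime.fromisoformat,
    "user_id": int,
    "event": str,
    "price": float,
}


def to_columns(ndjson_text, schema=SCHEMA):
    """Parse NDJSON rows and pivot them into typed columns."""
    columns = {name: [] for name in schema}
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between rows
        row = json.loads(line)
        for name, convert in schema.items():
            value = row.get(name)
            # Missing attributes become None; present ones are cast to the column type.
            columns[name].append(None if value is None else convert(value))
    return columns
```

Once every value in a column has the same type, it can be stored and scanned far more compactly than a pile of JSON blobs.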

You can always save the raw data to process it later but, in general, it’s a bad idea.

ACK

The API sends you an ack when the data is received and safely stored. After that you can forget about it: you know it will eventually be written to the database.

Failing gracefully

Things fail, and this is the most interesting part. If an insert fails, you want to know with 100% certainty. If your app dies while you are pushing data, should you retry?

The API is idempotent. You can retry within a 5-hour window and, if the data was already inserted, it's not inserted again as long as you send the same data batch (it uses a hash of the data to know whether it was inserted).
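Here is a minimal in-memory sketch of hash-based deduplication within a time window. The class name, the use of SHA-256, and the in-memory dictionary are my assumptions for illustration; the text only says that a hash of the batch is used.

```python
import hashlib
import time

WINDOW_SECONDS = 5 * 60 * 60  # the 5-hour retry window


class IdempotentIngest:
    """Toy ingestion endpoint that ignores batches it has already accepted."""

    def __init__(self, clock=time.time):
        self._seen = {}  # batch hash -> time it was first accepted
        self._clock = clock
        self.stored = []  # stands in for the real storage layer

    def ingest(self, payload: bytes) -> bool:
        """Return True if the batch was stored now, False if it was a duplicate."""
        now = self._clock()
        # Forget hashes that have fallen out of the window.
        self._seen = {h: t for h, t in self._seen.items() if now - t < WINDOW_SECONDS}
        digest = hashlib.sha256(payload).hexdigest()
        if digest in self._seen:
            return False
        self._seen[digest] = now
        self.stored.append(payload)
        return True
```

The catch is that the retry must carry exactly the same bytes: re-serializing the batch with a different key order would produce a different hash.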

The first layer of the API is very simple, so if something does fail internally, in almost every case the data is at least buffered.
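On the client side, an idempotent API makes "should I retry?" an easy yes. A minimal sketch, assuming a `send` callable that raises `OSError` (which covers connection errors and timeouts in Python) when the request fails; the function name and the backoff values are hypothetical.

```python
import time


def send_with_retry(send, payload, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send(payload) until it succeeds, resending the exact same bytes."""
    for attempt in range(attempts):
        try:
            return send(payload)
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            # Exponential backoff: 0.5s, 1s, 2s, ...
            sleep(base_delay * 2 ** attempt)
```

The payload is built once and reused across attempts, so every retry sends the same bytes.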

Buffering

Speaking of buffering... the API does buffer data. This is generally good performance hygiene for an ingestion API, but it's also critical if you have an analytical database (as we do). These databases aren't built to accept streaming inserts; they need to insert data in batches, otherwise it's too expensive (both on CPU and S3 write operations).

This buffer layer also works as the safety net when things fail. For example, overloading a database is quite easy, and the buffer helps mitigate that without you even noticing.
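The batching idea can be sketched as a buffer that flushes when it has enough rows or when the oldest row has waited long enough. The class name, the thresholds, and the `flush` callback are assumptions for illustration, not how the actual service is built.

```python
import time


class BatchBuffer:
    """Collect rows and hand them to flush() in batches."""

    def __init__(self, flush, max_rows=1000, max_age_seconds=1.0, clock=time.monotonic):
        self._flush = flush  # e.g. a function doing one bulk insert
        self._max_rows = max_rows
        self._max_age = max_age_seconds
        self._clock = clock
        self._rows = []
        self._oldest = None  # arrival time of the oldest buffered row

    def add(self, row):
        if not self._rows:
            self._oldest = self._clock()
        self._rows.append(row)
        self.maybe_flush()

    def maybe_flush(self):
        """Flush if the batch is big enough or the oldest row is too old."""
        if not self._rows:
            return
        too_big = len(self._rows) >= self._max_rows
        too_old = self._clock() - self._oldest >= self._max_age
        if too_big or too_old:
            batch, self._rows = self._rows, []
            self._flush(batch)
```

In a real service something would call `maybe_flush()` on a timer so a quiet stream still gets written; here it only runs on `add` or when called explicitly.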

Scale

You can throw 1000 QPS with one event each or 200 QPS with a 50 MB payload. Even if you have a lot of data, that covers at least 99% of use cases.

Real time

Even with some buffering, it works in real time. It usually takes no more than 4 seconds for the data to be available to query from the database, but even that can be reduced to close to a second. 

And in general, it just works.

Try it

What do you think? Is it the perfect data ingestion API? Try it out and let me know.

