Tinybird Customer Story

LocalStack uses Tinybird to analyze 100,000+ instances in real time

“We had some audacious goals for our product which we thought were unattainable with our capacity at the time. Tinybird made it possible.”

Thomas Rausch, Co-founder of LocalStack

3.9B

Processed per month

~6.6 TB

Requests per day

1

About LocalStack

Tens of thousands of developers use LocalStack as part of their local development workflow or in automated CI pipelines. Using LocalStack, developers can spin up a lightweight but fully featured cloud stack on their local machine that provides the same functionality and APIs as a real AWS cloud environment. If you rely on real AWS accounts just for testing, you should definitely give LocalStack a try!

2

Big goals... and an infrastructure problem

LocalStack wanted to let their users see the real-time history of API calls made within their instance so that they can make sense of how their cloud applications interact with the AWS API. Likewise, LocalStack wanted to anonymize and aggregate that data so they could better understand how their own product was being used. To do so, they built a simple data pipeline on top of their client usage data using a self-hosted MySQL database.

In the early days, this worked fine. But as the company and its user base grew to their current size (10,000+ devs and 100,000+ instances), the database quickly became a serious bottleneck. The usage dashboard that customers relied on was taking several seconds to load, and the queries LocalStack used to aggregate data for internal goals started to max out their infrastructure.

On top of that, LocalStack had a huge goal to open up anonymized telemetry recording for their open source product, which meant adding data from hundreds of thousands more machines to their data pipeline. The volume of data they’d add from their open source product was orders of magnitude higher than what they were already dealing with, and it was clear that their existing MySQL infrastructure wasn't going to keep up. There was simply no room to scale.

3

All the products in one real-time view

Thomas Rausch, Co-Founder at LocalStack, rallied his team to find performant ways to build the next iteration of their data analytics infrastructure. They wanted a solution that would scale out of the box, serve aggregation queries at low latency, and, importantly, allow for an easy and painless migration off of their existing pipeline. This is when they found Tinybird. They were able to migrate from their old database to Tinybird simply by exporting CSV files. The data was in Tinybird in seconds. Literally. 

After the initial migration, Thomas and the team still had to reconcile data coming from older instances of LocalStack, which were still sending events in a format tailored to their old MySQL database. Using Tinybird, they built a managed connector that separated sessions and events, so that they could combine cold data and hot data, all in real time. Tinybird’s “ingest > transform > expose” data flow made it a breeze to develop these migrations, which worked for both batch and real-time data. Now, all events are formatted uniformly in a single view, so the team can see how instances across all their products perform.
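To make the idea of separating sessions from events concrete, here is a minimal Python sketch. The legacy record layout and every field name in it (session_id, version, started_at, events, and so on) are assumptions made for illustration; the story does not describe LocalStack's actual schema, and a transformation like this could equally be expressed in SQL.

```python
from typing import Dict, List, Tuple


def split_legacy_record(record: Dict) -> Tuple[Dict, List[Dict]]:
    """Split one legacy record into a session row and flat event rows.

    Assumed (hypothetical) legacy shape: session-level fields at the top
    level, plus a nested "events" list.
    """
    session_id = record["session_id"]

    # Session-level attributes are stored once per session.
    session = {
        "session_id": session_id,
        "version": record.get("version"),
        "started_at": record.get("started_at"),
    }

    # Each nested event becomes its own row, keyed back to the session.
    events = [
        {
            "session_id": session_id,
            "timestamp": event["timestamp"],
            "service": event["service"],
            "operation": event["operation"],
        }
        for event in record.get("events", [])
    ]
    return session, events
```

Keeping the session key on every event row means the two sets of rows can be joined back together later, whether they arrived in a batch export or as live events.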

4

Tinybird is a perfect tool for small, fast teams

The best part for Thomas? He was able to build scalable data pipelines using Tinybird on his own, before the company had even hired any dedicated data engineers. With Tinybird in place, he’s able to focus purely on improving the product. He has seen how Tinybird helps startups with small teams scale their data products without the typical complexity. Since implementing Tinybird, the team has combined product usage metrics from their internal operational data with GitHub issues raised by customers, for a better understanding of where to direct engineering resources. They’ve even taken it one step further and use that same analytics infrastructure to identify product errors in real time, before the customer even needs to report them.

Since implementation, the team has been exploring using Tinybird to provide more fine-grained in-product analytics to their customers. In the coming months, they expect to serve far more analytics data to customers through their dashboards, enabling tracing and observability of cloud applications that run on LocalStack and helping customers optimize applications before they are deployed to the cloud.

For developers

1

How is data ingested?

LocalStack uses a combination of Tinybird’s Data Sources and Events APIs and managed AWS services. Data is streamed into AWS Kinesis and then batched into Tinybird using AWS Lambda. Some of the data is materialized on ingest, and some of it is queried raw.
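As a rough illustration of the Kinesis to Lambda to Tinybird path, the sketch below decodes a batch of Kinesis records inside a Lambda handler and posts them as newline-delimited JSON to Tinybird's Events API. This is not LocalStack's actual code: the Data Source name telemetry_events, the TINYBIRD_TOKEN environment variable, and the assumption that each record carries a JSON payload are all hypothetical, and the API host may differ depending on the Tinybird region.

```python
import base64
import json
import os
import urllib.request

# Events API base URL; the host is an assumption and varies by region.
TINYBIRD_EVENTS_URL = "https://api.tinybird.co/v0/events"
DATA_SOURCE = "telemetry_events"  # hypothetical Data Source name


def records_to_ndjson(event: dict) -> str:
    """Decode the Kinesis records in a Lambda event into NDJSON."""
    lines = []
    for record in event.get("Records", []):
        # Kinesis payloads arrive base64-encoded inside the Lambda event.
        payload = base64.b64decode(record["kinesis"]["data"])
        row = json.loads(payload)  # assumes each payload is one JSON object
        lines.append(json.dumps(row, separators=(",", ":")))
    return "\n".join(lines)


def build_request(ndjson: str, token: str,
                  data_source: str = DATA_SOURCE) -> urllib.request.Request:
    """Build one POST request carrying the whole batch."""
    return urllib.request.Request(
        f"{TINYBIRD_EVENTS_URL}?name={data_source}",
        data=ndjson.encode("utf-8"),
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )


def handler(event, context):
    """Lambda entry point: forward one Kinesis batch to Tinybird."""
    ndjson = records_to_ndjson(event)
    if not ndjson:
        return {"rows": 0}
    request = build_request(ndjson, os.environ["TINYBIRD_TOKEN"])
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()
    return {"rows": ndjson.count("\n") + 1}
```

Sending the whole Lambda batch in a single request keeps the number of HTTP calls proportional to the number of invocations rather than the number of events.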

2

How is data exposed?

The LocalStack team builds API endpoints in Tinybird using just SQL. They then consume these endpoints from both their user-facing application and their internal systems for operational intelligence. They also use Redash and the Tinybird plugin they implemented for observability.
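For a sense of what consuming such an endpoint can look like from an application, here is a small Python sketch that calls a published Tinybird pipe over HTTP and returns its rows. The pipe name and query parameters used in the usage note are hypothetical, and the API host is an assumption that depends on the Tinybird region; the story does not list LocalStack's real endpoints.

```python
import json
import urllib.parse
import urllib.request

# Base URL for published pipes; the host is an assumption and varies by region.
TINYBIRD_PIPES_URL = "https://api.tinybird.co/v0/pipes"


def endpoint_url(pipe_name: str, params: dict) -> str:
    """Build the JSON endpoint URL for a pipe, with optional query parameters."""
    url = f"{TINYBIRD_PIPES_URL}/{pipe_name}.json"
    query = urllib.parse.urlencode(params)
    return f"{url}?{query}" if query else url


def rows_from_response(body: str) -> list:
    """Extract result rows from the response body (rows live under "data")."""
    return json.loads(body).get("data", [])


def fetch_rows(pipe_name: str, params: dict, token: str) -> list:
    """Call the endpoint with a bearer token and return its rows."""
    request = urllib.request.Request(
        endpoint_url(pipe_name, params),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return rows_from_response(response.read().decode("utf-8"))
```

A dashboard backend might then call something like fetch_rows("api_calls_by_service", {"instance_id": "abc"}, token), where both the pipe name and the parameter are placeholders.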

3

How does LocalStack take advantage of real-time analytics?

They use Tinybird to collect telemetry data from their clients, operational data from their infrastructure, and business analytics data from their backends. Data can be processed and queried in real time, allowing them to react to critical changes in a matter of seconds or run long-running aggregations for monthly reports.