---
title: "Writing tests sucks. Use LLMs so it sucks less."
excerpt: "tb test validates your Tinybird pipelines before deployment. Catch errors locally instead of discovering them in production."
authors: "Javi Santana"
categories: "Product updates"
createdOn: "2025-03-19 00:00:00"
publishedOn: "2025-03-20 00:00:00"
updatedOn: "2025-03-20 00:00:00"
status: "published"
---

There are a few different places to find problems in software products:

While developing: Using unit, integration, and end-to-end tests
In CI: Using the same tests above, but automatically
In code reviews
While operating: Monitoring, alerting, observability, etc.

Everybody does this in software development. Nobody would deploy to production without testing. It’s a fundamental software development practice.

Sadly, these practices haven't been widely adopted in the data and analytics world. I'll try to expand on why below, but let me summarize:

It’s boring, and
You usually need production data to test.

(In case you're interested, I asked Grok why, and it made some interesting points. Anyway…)

The testing gap in data engineering

Data presents some unique challenges when it comes to testing:

Data variability: Real-world data is messy, inconsistent, and constantly changing. This is especially true when you work with billions of records. If you have that much data, there's a high probability of unexpected data.
Complex transformations: It's hard to thoroughly test SQL queries and complex data pipelines. And SQL never had a proper testing framework.
Environment dependencies: Testing often requires production-like data volumes and structures. SQL is probably the language with the simplest runtime setup, but people still don't test it.
Lack of tooling: Again, the data ecosystem hasn't had the same robust testing frameworks as software development.

Because of these things, many data teams rely on manual validation, spot checks, or simply hope that their transformations work as intended. This leads to brittle pipelines, unexpected results, and a lack of confidence when making changes.

So, we tried to solve this by:

Making it easy to generate test data,
Providing a basic but powerful test framework integrated with the product, and
Allowing LLMs to handle all the mundane bits of generating tests and test data

Tinybird Forward is here.

Last week, we launched Forward, a new Tinybird UX that makes shipping software with big data requirements faster and more intuitive.

Start using it by installing the new CLI: curl https://tinybird.co | sh

Generating realistic test data with `tb mock`

The tb mock command uses an LLM to analyze the SQL queries and data schemas in your data project and automatically generate realistic test data that covers your logic paths.

This command:

Analyzes the table schema of the supplied data source to understand the columns and types
Analyzes the SQL structure of the pipes that select from this data source (including the template logic)
Generates synthetic data that exercises different code paths (for example different values for query params or if/else logic in the templating language)
Creates a mock data file (fixture) that you can use for testing

That solves most of the boring part of generating fixtures, which is handling all the type matching and the logic.

But it doesn't solve everything. As I mentioned, production data is messy, and it turns out that generating production-like data is pretty hard. You have to account for all kinds of details like seasonality (your website has less traffic on weekends and during the holidays), state (a shopper on your e-commerce store shouldn't produce a checkout event if they haven't first produced an add_to_cart event), and reasonable values (temperatures can't drop below 0K).
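To make those constraints concrete, here's a minimal hand-rolled sketch of a fixture generator that respects state and a crude form of seasonality. It is not what tb mock does internally; the function names, field names (timestamp, session_id, event), and probabilities are all made up for illustration.

```python
import json
import random
from datetime import datetime, timedelta


def generate_events(n_sessions, start, days, seed=0):
    """Yield shopper events where a checkout never precedes an add_to_cart.

    All field names here are hypothetical, not a Tinybird schema.
    """
    rng = random.Random(seed)  # seeded so the fixture is reproducible
    for session in range(n_sessions):
        day = start + timedelta(days=rng.randrange(days))
        # Crude seasonality: drop roughly half of the weekend sessions.
        if day.weekday() >= 5 and rng.random() < 0.5:
            continue
        ts = day + timedelta(seconds=rng.randrange(86_400))
        yield {"timestamp": ts.isoformat(), "session_id": session, "event": "add_to_cart"}
        # State: a checkout can only follow an add_to_cart in the same session.
        if rng.random() < 0.3:
            later = ts + timedelta(seconds=rng.randrange(1, 600))
            yield {"timestamp": later.isoformat(), "session_id": session, "event": "checkout"}


def to_ndjson(events):
    """Serialize events as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(event) for event in events)


if __name__ == "__main__":
    print(to_ndjson(generate_events(5, datetime(2025, 3, 3), days=14)))
```

Even this toy version needs explicit rules for ordering and weekday effects, which is why generic generated data rarely looks like production.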

If you want to test an edge case, it’s unlikely the LLM running under the hood will spot it on its own, so we added a way to tell it what you want with a prompt.

You can get as complex as you want with the prompt. Tinybird saves the prompt in the fixtures/ folder so you can reuse it and modify it if you want.

The fixtures generated are just .ndjson files, easy to modify by hand (or with an LLM) if needed.

Last but not least, we use a nice trick to generate the data: instead of generating just the ndjson, we generate the SQL that runs on the Tinybird Local database to generate the data. This way, you can change the SQL and the data will be regenerated based on your query. And you can generate as much data as you want (without killing your cloud bill on LLM tokens).
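Here's a rough sketch of the "generate SQL, not rows" idea. It uses Python's built-in sqlite3 as a stand-in for the Tinybird Local database, so the SQL dialect, the query, and the column names are assumptions for illustration only. The point is that the row count is just a parameter and the data shape lives in a query you can edit.

```python
import json
import sqlite3

# Hypothetical generator query: a recursive CTE emits :n rows, and every
# column is derived from the row number. Editing this SQL changes the data.
GENERATOR_SQL = """
WITH RECURSIVE seq(i) AS (
    SELECT 0 WHERE :n > 0
    UNION ALL
    SELECT i + 1 FROM seq WHERE i + 1 < :n
)
SELECT
    i AS event_id,
    'user_' || (i % 7) AS user_id,
    CASE WHEN i % 5 = 0 THEN 'checkout' ELSE 'page_view' END AS event
FROM seq
ORDER BY i
"""


def generate_fixture(n):
    """Run the generator SQL in an in-memory database and return NDJSON."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # lets us turn each row into a dict
    try:
        rows = [dict(row) for row in conn.execute(GENERATOR_SQL, {"n": n})]
    finally:
        conn.close()
    return "\n".join(json.dumps(row) for row in rows)


if __name__ == "__main__":
    print(generate_fixture(3))
```

Asking for ten rows or ten million costs the same number of LLM tokens, because the model only ever has to write the query.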

Testing SQL logic with `tb test`

Once you have mock data, the tb test command provides a framework for validating your data transformations. But first, you need to generate the tests.

Test generation analyzes the SQL of the supplied endpoint pipe and produces a set of tests that cover all the possible paths based on the parameters (well, maybe not all of them, but at least the ones the underlying LLMs can detect).

It then generates some YAML files in the tests/ directory that define these tests.

Then, you can run the tests with tb test run.

One of the pain points when working with data tests is that when the logic or data changes, you need to update the expected result. That's covered, too:

This reruns the tests and updates the expected results to match the current implementation. Of course, you need to be careful with this, but it's super handy when you actually do expect something different than what you're currently testing for.
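If you haven't used this kind of snapshot-style workflow before, the idea can be sketched in a few lines. This is not how tb test is implemented; run_tests, the test dict layout, and the top_pages endpoint are hypothetical names I'm using to show the compare-then-overwrite logic.

```python
def run_tests(tests, endpoint, update=False):
    """Compare each test's expected result with the endpoint's current output.

    `tests` is a list of dicts with `name`, `parameters`, and `expected_result`.
    With update=True, mismatched expectations are overwritten in place.
    Returns the names of the tests that did not match.
    """
    mismatched = []
    for test in tests:
        actual = endpoint(**test["parameters"])
        if actual != test["expected_result"]:
            mismatched.append(test["name"])
            if update:
                # Accept the current behavior as the new expectation.
                test["expected_result"] = actual
    return mismatched


def top_pages(limit=2):
    """Hypothetical endpoint standing in for a real pipe."""
    rows = [{"page": "/home", "hits": 3}, {"page": "/docs", "hits": 1}]
    return rows[:limit]


if __name__ == "__main__":
    tests = [
        {
            "name": "limit_1",
            "parameters": {"limit": 1},
            "expected_result": [{"page": "/home", "hits": 2}],  # stale value
        }
    ]
    print(run_tests(tests, top_pages))               # the stale test is reported
    print(run_tests(tests, top_pages, update=True))  # reported again, then rewritten
    print(run_tests(tests, top_pages))               # nothing left to report
```

The update path accepts whatever the endpoint returns right now, bugs included, which is why it deserves a careful look at the diff before you commit it.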

Did we solve the testing problem?

To be honest, I don’t know. We've provided a set of tools to make it less boring to run tests on your data, but you still need a testing culture, and that's something data engineering doesn't have yet. Our goal when we designed this was to give you at least some basic tests without investing a lot of time. So, you can go to the project and run...

… and some tests will run in your CI (because we also generate CI workflow config files for GitHub/GitLab that automatically call tb test run).

So really, there's no excuse not to run tests, because we've made it super easy. We'll see if people actually do it.

Get started with `tb test`

If you already have an existing data project, just create tests for your endpoints:

Then just push the generated project to your git provider (don't forget to add secrets for host and token to authorize your CI to run against your project).

You can find more info in the docs on creating mock data and tests, and learn more about testing and deploying in CI/CD.

If you don't have a data project to test, install the CLI: curl https://tinybird.co | sh

And just follow the prompts to create an API in a few minutes.
