Versioning your Pipes

Intermediate

With Tinybird you can analyze and transform your data through Pipes in three ways:

  • Using the UI. This is the preferred option when you want to do some data exploration and solution discovery, measuring performance and getting results in an interactive way.

  • Using the CLI. This is the preferred option when you are collaborating on a Data Project, working in production, or with Environments connected to a source control repository. CLI commands are used for automations like CI/CD.

  • Using the API. Both the UI and the CLI use the Pipes API behind the scenes, although you will rarely use it directly unless you have a use case where you need to automate the publication of API Endpoints.

In this section we’ll focus on how to version your Pipes using the CLI through a CI/CD pipeline.

Versioning

All the Data Sources and Pipes in your account have the following naming convention:

{datasource|pipe}__v{version}

Use VERSION numbers when:

  • You change the schema of a Data Source

  • You change the output of an API Endpoint

  • You change the parameters of an API Endpoint

  • You make any other change to your Data Sources or Pipes that is not backwards compatible

When working on a Data Project, you can version data files by including a VERSION <number> tag at the top of the file, where <number> is an incremental integer.
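
As a rough illustration of how the VERSION tag relates to the naming convention above, the shell sketch below reads the tag from a datafile and builds the name the resource would get. The versioned_name helper and the scratch files are hypothetical and not part of the Tinybird CLI. The sketch also assumes that a file without a VERSION tag keeps its plain name.

```shell
# Hypothetical helper (not part of the Tinybird CLI): prints the name a
# resource would get under the {datasource|pipe}__v{version} convention.
versioned_name() {
  resource="$1"   # resource name, for example top_products
  datafile="$2"   # path to the .datasource or .pipe file
  # Take the number from the first "VERSION <number>" line, if there is one.
  version="$(awk '$1 == "VERSION" { print $2; exit }' "$datafile")"
  if [ -n "$version" ]; then
    printf '%s__v%s\n' "$resource" "$version"
  else
    # Assumption: a file with no VERSION tag keeps its plain name.
    printf '%s\n' "$resource"
  fi
}

# Scratch files that stand in for real datafiles.
scratch="$(mktemp -d)"
printf 'VERSION 1\n\nNODE endpoint\nSQL >\n    SELECT 1\n' > "$scratch/top_products.pipe"
printf 'NODE endpoint\nSQL >\n    SELECT 1\n' > "$scratch/sales.pipe"

versioned_name top_products "$scratch/top_products.pipe"
versioned_name sales "$scratch/sales.pipe"
```

The first call prints a name with the __v1 suffix because its file starts with VERSION 1, while the second prints the plain name because its file has no tag.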

Guide preparation

You can follow along using the ecommerce_data_project.

Download the project by running:

Git clone the project
git clone https://github.com/tinybirdco/ecommerce_data_project
cd ecommerce_data_project

Then create a new Workspace and authenticate using your user admin token (the one listed as admin user@domain.com, with your own email address). If you don’t know how to authenticate or use the CLI, check out the CLI Quick Start.

Authenticating to EU
tb auth -i

** List of available regions:
   [1] us-east (https://ui.us-east.tinybird.co)
   [2] eu (https://ui.tinybird.co)
   [0] Cancel

Use region [1]: 2

Copy the admin token from https://ui.tinybird.co/tokens and paste it here :

Finally, push the Data Project to Tinybird:

Recreating the project
tb push --push-deps --fixtures

** Processing ./datasources/events.datasource
** Processing ./datasources/top_products_view.datasource
** Processing ./datasources/products.datasource
** Processing ./datasources/current_events.datasource
** Processing ./pipes/events_current_date_pipe.pipe
** Processing ./pipes/top_product_per_day.pipe
** Processing ./endpoints/top_products.pipe
** Processing ./endpoints/sales.pipe
** Processing ./endpoints/top_products_params.pipe
** Processing ./endpoints/top_products_agg.pipe
** Building dependencies
** Running products_join_by_id
** 'products_join_by_id' created
** Running current_events
** 'current_events' created
** Running events
** 'events' created
** Running products
** 'products' created
** Running top_products_view
** 'top_products_view' created
** Running products_join_by_id_pipe
** Materialized pipe 'products_join_by_id_pipe' using the Data Source 'products_join_by_id'
** 'products_join_by_id_pipe' created
** Running top_product_per_day
** Materialized pipe 'top_product_per_day' using the Data Source 'top_products_view'
** 'top_product_per_day' created
** Running events_current_date_pipe
** Materialized pipe 'events_current_date_pipe' using the Data Source 'current_events'
** 'events_current_date_pipe' created
** Running sales
** => Test endpoint at https://api.tinybird.co/v0/pipes/sales.json
** 'sales' created
** Running top_products_agg
** => Test endpoint at https://api.tinybird.co/v0/pipes/top_products_agg.json
** 'top_products_agg' created
** Running top_products_params
** => Test endpoint at https://api.tinybird.co/v0/pipes/top_products_params.json
** 'top_products_params' created
** Running top_products
** => Test endpoint at https://api.tinybird.co/v0/pipes/top_products.json
** 'top_products' created
** Pushing fixtures
** Warning: datasources/fixtures/products_join_by_id.ndjson file not found
** Warning: datasources/fixtures/current_events.ndjson file not found
** Checking ./datasources/events.datasource (appending 544.0 b)
**  OK
** Checking ./datasources/products.datasource (appending 134.0 b)
**  OK
** Warning: datasources/fixtures/top_products_view.ndjson file not found

Once the Data Project is deployed to a Workspace, connect it to Git and push the CI/CD pipelines to the repository.

How to do versioning

Now imagine you are asked to change the output of the top_products API Endpoint. Instead of returning an array with the top 10 products grouped by day, the team integrating the API would prefer the arrays unnested, so that there is one row for each date and product.

In this case, you don’t want to force push the API Endpoint, because that would break the application, the dashboard, or any other tool that integrates it. Instead, increase the version number by setting the VERSION tag at the top of endpoints/top_products.pipe.
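
A minimal sketch of what the versioned file could look like, reconstructed from the dry-run output shown later in this guide (node name, description, SQL, and version 1). It is written with a shell heredoc only so that the whole file is visible. The exact layout of the file in the repository may differ, and the sketch assumes it is run from the root of the Data Project.

```shell
# Sketch: write endpoints/top_products.pipe with a VERSION tag on the first line.
# Assumptions: run from the root of the Data Project; the node name, description
# and SQL are copied from the dry-run output in this guide.
mkdir -p endpoints
cat > endpoints/top_products.pipe <<'EOF'
VERSION 1

NODE endpoint
DESCRIPTION >
    returns top 10 products for the last month

SQL >
    SELECT
        date,
        arrayJoin(topKMerge(10)(top_10)) AS product
    FROM top_product_per_day
    WHERE date > (today() - toIntervalDay(30))
    GROUP BY date
EOF

# Show the first line to confirm the tag is at the top of the file.
head -n 1 endpoints/top_products.pipe
```

In a real project you would normally edit the existing file in place rather than overwrite it, so that any other settings it contains are kept.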

Once the API Endpoint is validated, use this command to see what will be pushed to your account:

tb push endpoints/top_products.pipe --debug --dry-run
** Processing endpoints/top_products.pipe
{
   'top_products': {
      'deps': ['top_product_per_day'],
      'description': '',
      'filename': 'top_products.pipe',
      'name': 'top_products__v1',
      'nodes': [{
         'params': {
            'description': 'returns top 10 '
            'products for the '
            'last month',
            'name': 'endpoint',
            'type': 'standard'
         },
         'sql': 'SELECT\n'
         '    date,\n'
         '    arrayJoin(topKMerge(10)(top_10)) '
         'AS product\n'
         'FROM top_product_per_day\n'
         'WHERE date > (today() - '
         'toIntervalDay(30))\n'
         'GROUP BY date'
      }],
      'resource': 'pipes',
      'resource_name': 'top_products',
      'tokens': [],
      'version': 1
   }
}
** Building dependencies
** [DRY RUN] Creating top_products

The output shows that the push will use the latest available version of the Data Source top_product_per_day (in this case, one with no version) and will set the version suffix of the top_products endpoint to __v1.

To deploy the changes to the main Environment, create a new Git branch and a Pull Request.

The Continuous Integration pipeline creates a new test Environment and runs tb deploy, which examines the changes in the Pull Request and deploys them to that Environment. When you deploy a new version of a Pipe, you should cover the new API Endpoint with tests. Because this is a new version of the API Endpoint, regression testing is not needed and won’t work. Instead, create some fixture tests as described in the implementing test strategies guide.

Once Continuous Integration passes (✅), merge the Pull Request to run the tb deploy command in the main Environment.