Data Projects

What is a Data Project?

A Data Project is a set of plain-text files (Datafiles) that describes your Tinybird resources. In other words, Datafiles are the source code of your Data Project.

The Data Project makes it possible to apply to real-time data products the same techniques and principles that software engineering teams use to collaborate on and deploy software.

What should I use Data Projects for?

You use Data Projects to:

  • Push Datafiles to a Git repository for version control and to collaborate more effectively with your team.

  • Incorporate development best practices like testing, code reviews, and CI/CD processes.

  • Standardize your data team's workflows.

  • Consolidate all your Datafiles into a single source of truth. The Datafiles in the main branch of your Data Project’s Git repository will match the main Environment in your production Workspace.

Data Projects and Workspaces help you organize your work depending on your team and project structures. A Data Project can be deployed in different Workspaces using different data (production, staging, etc.).

Creating Data Projects

A Data Project consists of a set of Datafiles and a folder structure. You work on the Data Project with your preferred text editor and the CLI.
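To make the idea of "Datafiles plus a folder structure" concrete, here is a minimal sketch that builds a tiny Data Project by hand. The project name, the folder names (datasources, pipes, endpoints), the file names, and the columns are all assumptions chosen for illustration; check the Datafile reference for the exact syntax your resources need.

```shell
set -eu

# Hypothetical project name; any folder works.
project_dir="my_data_project"

# Assumed layout: one folder per resource type.
mkdir -p "$project_dir/datasources" "$project_dir/pipes" "$project_dir/endpoints"

# A minimal Data Source Datafile: a schema plus engine settings.
# Column names are made up for this example.
cat > "$project_dir/datasources/events.datasource" <<'EOF'
SCHEMA >
    `timestamp` DateTime,
    `user_id` String,
    `event` String

ENGINE "MergeTree"
ENGINE_SORTING_KEY "timestamp"
EOF

# A minimal Pipe Datafile: one SQL node that reads from the Data Source above.
cat > "$project_dir/endpoints/events_per_user.pipe" <<'EOF'
NODE count_events
SQL >
    SELECT user_id, count() AS total
    FROM events
    GROUP BY user_id
EOF

# List the Datafiles that now make up the project.
find "$project_dir" -type f | sort
```

Because every resource is a plain-text file, the whole folder can be reviewed, diffed, and committed like any other source code.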

Creating a Data Project from an existing Workspace

When you start using Tinybird for the first time, you generally start from the UI to get familiar with how Tinybird works. From there, you create your first Workspace and start defining your data pipelines: how the data should flow from the start (the Data Sources and their schemas) to the end (the APIs).

Once the Workspace is ready for deployment, or you want more people to collaborate on it, sync your Workspace to Git and start working with the Data Project by editing the Datafiles as code.

To create a new Data Project from an existing Workspace, use the CLI. Authenticate with tb auth using the Workspace admin token and the host for your region, then run:

tb pull --auto --force
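The full sequence (authenticate, pull, and record the result in Git) might be scripted roughly as follows. This is a sketch, not an official workflow: pull_workspace is a hypothetical helper, TB_ADMIN_TOKEN and TB_HOST_URL are illustrative variable names you would set yourself, and the .gitignore step assumes the CLI stores its credentials in a local .tinyb file.

```shell
set -eu

# Hypothetical helper: pull a Workspace into a folder and make the first commit.
# TB_ADMIN_TOKEN and TB_HOST_URL are illustrative names for this sketch.
pull_workspace() (
  project_dir="$1"

  # Stop before touching the filesystem if a required tool is missing.
  for tool in tb git; do
    command -v "$tool" >/dev/null 2>&1 || {
      echo "missing required tool: $tool" >&2
      exit 1
    }
  done

  mkdir -p "$project_dir"
  cd "$project_dir"

  # Authenticate against the Workspace, then download its resources as Datafiles.
  tb auth --token "$TB_ADMIN_TOKEN" --host "$TB_HOST_URL"
  tb pull --auto --force

  # Keep the local credentials file out of version control
  # (assumes credentials live in .tinyb).
  echo ".tinyb" >> .gitignore

  git init
  git add .
  git commit -m "Import Datafiles from existing Workspace"
)
```

With both variables exported, running pull_workspace my_data_project would leave a Git repository whose first commit holds the Datafiles pulled from the Workspace. The function body runs in a subshell, so the caller's working directory is not changed.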

Creating a Data Project from scratch

Once you are familiar with Tinybird, you may prefer to start new Data Projects from scratch and work directly from Git using the CLI or an IDE.

To create a new Data Project and integrate it with Git, you can use the following command:

tb init --git

The command above will guide you through the process of syncing your Data Project with a Git repository.

Read the working with Git guide for more info about how to implement a Git workflow with a Data Project.

Managing the Data Project

To work effectively with a Data Project, you need to:

  • Learn the Datafile reference to define Data Sources, Pipes, Connectors, Tokens and any other Tinybird resource.

  • Learn some CLI commands to perform state operations on the Workspace, such as Data Operations, adding members to Workspaces, or deleting resources.

Read the CLI documentation to learn how to define Data Sources, Pipes, and Endpoints, and consult the Datafile and CLI command references for the details.