Intro to the CLI and Docker

After following the three previous guides, you now know how to import and explore data and create dynamic endpoints with Tinybird, using both the UI and the REST API. In this guide, you’ll learn how we develop real-life data projects with Tinybird, keeping the Data Sources, Pipes and Endpoints defined in a code repository, using a very important tool for this: our CLI.

Creating the project directory and a virtual environment in it

First, create the directory where our data project will live:
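For example, using the project name referenced later in this guide:

```bash
# create the project directory and move into it
mkdir ecommerce_guides_project
cd ecommerce_guides_project
```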

Now, create a virtual environment so that the packages we install stay isolated from the system Python installation and don’t interfere with it. You could use virtualenv, venv, Pipenv or other packaging tools; we’ll use venv here:
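A minimal sketch, creating the environment in a directory called .e (the name used in the rest of this guide):

```bash
# create the virtual environment in .e and activate it
python3 -m venv .e
. .e/bin/activate
```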

Note that . is an alias to source, which in this case reads and executes the content of .e/bin/activate into the current bash process.

Create a git repository and a .gitignore file, and add .e to it so that the virtual environment’s third-party code doesn’t get tracked:
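For example:

```bash
# start the repository and ignore the virtual environment folder
git init
echo ".e" >> .gitignore
```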

Installing the CLI and using Docker

If you follow the CLI docs, you’ll see that there are two options for installing the CLI. If installing it with pip works for you, you can skip the rest of this section and simply run:
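```bash
pip install tinybird-cli
```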

Otherwise, this is how to run it using Docker. You need to have Docker installed and running: download it from here and then start it. You should see something like this:

Docker Desktop must be running

Once it’s running, navigate to your project folder (or do nothing if you’re in it already), and run:
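The exact invocation depends on your setup, but based on the image name and mount target described below, it should look something like this:

```bash
# mount the current directory into the container and start an interactive session
docker run -v $(pwd):/mnt/data -it tinybirdco/tinybird-cli-docker
```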

If you’re new to Docker, this does two things:

  • Downloads the latest version of the Docker image named tinybird-cli-docker from the tinybirdco user on Docker Hub

  • Mounts a volume, setting the current directory (with $(pwd)) as the source and /mnt/data as the target. In Docker’s words, volumes are the “preferred mechanism for persisting data generated by and used by Docker containers”. This will keep data in sync between the local directory (ecommerce_guides_project, in our case) and everything under /mnt/data in the container

Lastly, within the container, run cd /mnt/data to navigate to the target folder, where we’ll have a copy of our local project files.

Mounting two volumes:

If you want to keep your datasets and the Tinybird project files in different folders, you can mount both of them as volumes so that Docker can access them. For example, with the project files in a tb_project folder and the data files in a datasets folder:
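```bash
# mount the project folder and the datasets folder as two separate volumes
docker run -v $(pwd)/tb_project:/mnt/data -v $(pwd)/datasets:/mnt/datasets -it tinybirdco/tinybird-cli-docker
```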

Using the CLI

We’ll only go over the basics here, as you’ll see all the CLI functionality in detail in the next guides. To see all the available commands, run tb --help:
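```bash
# list all the available CLI commands
tb --help
```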

All commands also have their own help, so if you run, for example, tb datasource --help, you’ll see the options available for that command:
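```bash
# show the options for the datasource subcommand
tb datasource --help
```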

Authenticating with your Tinybird account

To be able to use the CLI, run tb auth first:
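Run without extra flags, it should ask you for your admin token (the exact prompt may vary by CLI version):

```bash
tb auth
```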

If you’ve followed the previous steps, a .tinyb file containing your admin token and the host will be created and will appear in your local directory as well.

If you’re on a Pro or Enterprise plan and your Tinybird account runs on dedicated machines, your host will be different. You can provide it with the --host flag, or change the .tinyb file directly.
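For example (the URL below is just a placeholder; use the host of your dedicated cluster):

```bash
tb auth --host https://<your-dedicated-host>
```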

Initializing the folder layout

Running tb init will create these folders to keep your Data Sources and Pipes organized.

The idea is that:

  • datasources contains all your Data Source definitions

  • pipes contains Pipes where you define transformations for materialized views, etc.

  • endpoints contains Pipes where the last node is exposed as an API endpoint
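A quick sketch of what to expect after running it (the exact set of folders and files may vary with the CLI version):

```bash
tb init
# the project folder should now contain, at least:
#   datasources/
#   endpoints/
#   pipes/
```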

Downloading existing Data Sources and Pipes

This can be done with the tb pull command. You can check its available options with tb pull --help:
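```bash
tb pull --help
```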

Let’s download the Data Source definition (the schema) of the events Data Source with the CLI, and save it in the datasources folder:
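```bash
# download the schema of the events Data Source into the datasources folder
tb pull --match events.datasource --folder datasources
```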

After running it, you should see a new file in the specified folder

You can also download the events.datasource file by clicking on the “Download schema” button through the UI

And the same can be done for the ecommerce_example Pipe and endpoint we created in the previous two guides.
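For example, something along these lines (the target folder is an assumption; use whichever folder you keep your endpoint Pipes in, and adjust the match pattern if your file is named differently):

```bash
# download the ecommerce_example Pipe definition into the endpoints folder
tb pull --match ecommerce_example --folder endpoints
```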