Tinybird CLI¶
The Tinybird CLI allows you to use all the Tinybird functionality directly from the command line. Additionally, it includes several functions to create and manage Data Projects easily. It is also used behind the scenes in Git and CI/CD workflows.
How to install¶
Option 1: Install it locally¶
You need Python 3 and pip installed:
Supported Python versions: 3.8, 3.9, 3.10, 3.11
python3 -m venv .venv
source .venv/bin/activate
If you are not used to Python virtual environments, you can read this guide about venv.
pip install tinybird-cli
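Once installed, you can verify the CLI is available by printing its version (just a quick sanity check):
tb --version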
Option 2: Use a prebuilt docker image¶
Let’s say your project is in the ~/projects/data path:
docker run -v ~/projects/data:/mnt/data -it tinybirdco/tinybird-cli-docker
cd mnt/data
Authenticate¶
The first step is to check everything works correctly and that you’re able to authenticate:
tb auth -i
** List of available regions:
[1] us-east (https://ui.us-east.tinybird.co)
[2] eu (https://ui.tinybird.co)
[0] Cancel
Use region [1]:
Copy the admin token from https://ui.us-east.tinybird.co/tokens and paste it here: <pasted token>
** Auth successful!
** Configuration written to .tinyb file, consider adding it to .gitignore
First, choose the Tinybird region. Note that dedicated regions do not appear in the list.
It’ll ask for the admin token associated with your account (user@domain.com); copy it from the workspace tokens page and paste it.

Note you can also pass the token directly with the --token flag:
tb auth --token <your token>
** Auth successful!
** Configuration written to .tinyb file, consider adding it to .gitignore
It saves your credentials in the .tinyb file in your current directory. Please add it to .gitignore (or the ignore list in the SCM you use) because it contains Tinybird credentials.
The CLI tries its best to authenticate you in the proper region for your account, but you may want to override this behaviour. In that case, provide the --host flag with the corresponding URL for your region.
We currently have:
https://api.us-east.tinybird.co: For US accounts
https://api.tinybird.co: Rest of the world accounts
You can always view the up-to-date list of available regions within the tool using the tb auth ls command.
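For example, if your Workspace lives in the US region, you could authenticate explicitly against that host. This is just a sketch, assuming your CLI version accepts --host on the auth command the same way it accepts --token above:
tb auth --token <your token> --host https://api.us-east.tinybird.co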
Quick intro¶
The CLI works with CSV, NDJSON, and Parquet files alike.
Create a new project
tb init
Generate a Data Source file (we will explain this later) based on a sample CSV file with a few lines of data
$ tb datasource generate /tmp/sample.csv
** Generated datasources/sample.datasource
** => Run `tb push datasources/sample.datasource` to create it on the server
** => Add data with `tb datasource append sample /tmp/sample.csv`
** => Generated fixture datasources/fixtures/sample.csv
You can also generate a Data Source file from an NDJSON or Parquet file. It uses the Analyze API behind the scenes, so it will guess and apply the jsonpath for each column in the schema.
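For example, with an NDJSON file (the /tmp/events.ndjson path below is just a placeholder for your own file):
$ tb datasource generate /tmp/events.ndjson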
Push it to Tinybird
$ tb push datasources/sample.datasource
** Processing datasources/sample.datasource
** Building dependencies
** Creating sample
** not pushing fixtures
Append some data
$ tb datasource append sample datasources/fixtures/sample.csv
🥚 starting import process
🐥 done
Query the data
$ tb sql "select count() from sample"
Query took 0.000475 seconds, read 1 rows // 4.1 KB
-----------
| count() |
-----------
| 384 |
-----------
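tb sql runs any query over your Data Sources and pipes, so you could also inspect a few rows of the sample Data Source, for instance:
$ tb sql "select * from sample limit 5"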
Check the Data Source is in the Data Sources list
$ tb datasource ls
name row_count size created at updated at
------------------------- ----------- ----------- -------------------------- --------------------------
sample 384 20k 2020-06-24 15:09:00.409266 2020-06-24 15:09:00.409266
madrid_traffic 87123456 1.5Gb 2019-07-02 10:40:03.840151 2019-07-02 10:40:03.840152
...
Go to your Tinybird dashboard to check the Data Source is present there
Data projects¶
A Data Project is a set of files that describes how your data should be stored, processed, and exposed through APIs.
Just as you maintain source code files in a repository, use CI, make deployments, run tests, and so on, Tinybird provides a set of tools to follow a similar pattern with data pipelines. In other words, the data files in Tinybird are the source code of your project.
Following this approach, any Data Project can be managed with a list of text-based files that allow you to:
Define how the data should flow, from the start (the schemas) to the end (the API)
Manage your data files under version control
Use branches in your data files
Run tests
Deploy a Data Project like you’d deploy any other software application
Let’s see an example. Imagine an e-commerce site where we have events from users and a list of products with their attributes. Our purpose is to expose several API endpoints to return sales per day and top product per day.
The Data Project would look like this:
ecommerce_data_project/
datasources/
events.datasource
products.datasource
fixtures/
events.csv
products.csv
pipes/
top_product_per_day.pipe
endpoints/
sales.pipe
top_products.pipe
Every file in this folder maps to a Data Source or a Pipe in Tinybird. You can create a project from scratch with tb init, but in this case let’s assume it’s already created and stored in a GitHub repository.
Uploading the project¶
git clone https://github.com/tinybirdco/ecommerce_data_project.git
cd ecommerce_data_project
Refer to the how to install section to connect the ecommerce_data_project
with your Tinybird account.
You can push the whole project to your Tinybird account to check everything is fine. The tb push command uploads the project to Tinybird, but first it checks the project dependencies and the SQL syntax, among other things. In this case, we use the --push-deps flag to push everything:
$ tb push --push-deps
** Processing ./datasources/events.datasource
** Processing ./datasources/products.datasource
** Processing ./pipes/top_product_per_day.pipe
** Processing ./endpoints/top_products_params.pipe
** Processing ./endpoints/sales.pipe
** Processing ./endpoints/top_products.pipe
** Building dependencies
** Creating products
** Creating events
** Creating products_join_by_id
** Creating top_product_per_day
** Creating sales
** => Test endpoint at https://api.tinybird.co/v0/pipes/sales.json
** Creating products_join_by_id_pipe
** Creating top_products_params
** => Test endpoint at https://api.tinybird.co/v0/pipes/top_products_params.json
** Creating top_products
** => Test endpoint at https://api.tinybird.co/v0/pipes/top_products.json
** not pushing fixtures
Once it finishes, the endpoints defined in our project (sales
and top_products
) will be available and we can start pushing data to the different Data Sources. The project is ready.
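For instance, one quick way to test the endpoints is to append the sample fixtures shipped with the repository (any other ingestion method works too):
$ tb datasource append events datasources/fixtures/events.csv
$ tb datasource append products datasources/fixtures/products.csv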
You can read the Versions guides to learn how to work with your Data Project as any other software project by connecting it to Git and automating CI/CD pipelines.
Now, let’s go through the different files in the project in order to understand how to deal with them individually.
Define Data Sources¶
Data Sources define how your data is going to be stored. You can add data to these Data Sources using the Data Sources API.
Each Data Source is defined by a schema and other properties we will explain later (more on this in the Datafile reference). Let’s look at events.datasource:
DESCRIPTION >
# Events from users
This contains all the events produced by Kafka. There are 4 fixed columns,
plus a `json` column which contains the rest of the data for that event.
See [documentation](url_for_docs) for the different events.
SCHEMA >
timestamp DateTime,
product String,
user_id String,
action String,
json String
ENGINE MergeTree
ENGINE_SORTING_KEY timestamp
As we can see, there are three main sections:
A general description (using markdown in this case)
The schema
How the data is sorted. In this case, the access pattern is most of the time by the timestamp column. If no ENGINE_SORTING_KEY is set, Tinybird picks one by default, usually a date or datetime column.
Now, let’s push the Data Source:
$ tb push datasources/events.datasource
** Processing datasources/events.datasource
** Building dependencies
** Creating events
** not pushing fixtures
You cannot override Data Sources. If you try to push a Data Source that already exists in your account, you’ll get an output like this: events already exists, skipping. If you actually need to override the Data Source, you can first remove it or just upload a new version.
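As a sketch, recreating the events Data Source from scratch would look like this (note that removing a Data Source also deletes its data):
$ tb datasource rm events
$ tb push datasources/events.datasource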
Define data pipes¶
You usually don’t use the data as it comes in. For example, in this project we are dealing with Kafka events, so we could query the events Data Source directly, but generating a live materialized view of that table is better.
For this purpose, we have pipes. Let’s see how to create a data pipe that transforms the data as it’s inserted. This is the content of pipes/top_product_per_day.pipe:
NODE only_buy_events
DESCRIPTION >
filters all the buy events
SQL >
SELECT
toDate(timestamp) date,
product,
JSONExtractFloat(json, 'price') AS price
FROM events
WHERE action = 'buy'
NODE top_per_day
SQL >
SELECT date,
topKState(10)(product) top_10,
sumState(price) total_sales
FROM only_buy_events
GROUP BY date
TYPE materialized
DATASOURCE top_per_day_mv
ENGINE AggregatingMergeTree
ENGINE_SORTING_KEY date
Each pipe can have one or more nodes. In this pipe, as we can see, we’re defining two nodes: only_buy_events
and top_per_day
.
The first one filters “buy” events and extracts some data from the json column.
The second one runs the aggregation.
The pattern to define a pipeline is simple: use NODE to start a new node and then use SQL > to define the SQL for that node. Notice you can use other nodes inside the SQL. In this case, the second node uses the first one, only_buy_events.
Pushing a pipe is the same as pushing a Data Source:
$ tb push pipes/top_product_per_day.pipe --populate
** Processing pipes/top_product_per_day.pipe
** Building dependencies
** Creating top_product_per_day
** Populate job url https://api.tinybird.co/v0/jobs/c7819921-aca0-4424-98c5-9223ca2475c3
** not pushing fixtures
In this case, it’s a materialized node. If you want to populate it with the existing data in the events table, you can use the --populate flag.
When using the --populate flag you get a job URL. Data population is done in the background, so you can check the status of the job at the URL provided.
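For example, you could poll the job URL returned above with curl, assuming $TB_TOKEN holds a token with the right scope:
curl -H "Authorization: Bearer $TB_TOKEN" https://api.tinybird.co/v0/jobs/c7819921-aca0-4424-98c5-9223ca2475c3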
Define endpoints¶
Endpoints are the way you expose the data to be consumed. They look pretty similar to pipes and, well, they are actually pipes that transform the data but add an extra step that exposes the data.
Let’s look into endpoints/top_products.pipe
NODE endpoint
DESCRIPTION >
returns top 10 products for the last week
SQL >
SELECT
date,
topKMerge(10)(top_10) AS top_10
FROM top_per_day
WHERE date > today() - interval 7 day
GROUP BY date
The syntax is exactly the same as in the data transformation pipes, but now the results can be accessed through the endpoint https://api.tinybird.co/v0/pipes/top_products.json?token=TOKEN
When you push an endpoint, a TOKEN with PIPE:READ permissions is automatically created. You can see it in the tokens UI or retrieve it directly from the CLI with the tb pipe token_read <endpoint_name> command.
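For instance, for the top_products endpoint pushed above:
$ tb pipe token_read top_products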
Alternatively, you can add a TOKEN token_name READ line to the pipe file to automatically create a token named token_name with READ permissions over the endpoint, or to add READ permissions over the endpoint to an existing token_name. This is a very convenient way of handling tokens in your Data Project.
TOKEN public_read_token READ
NODE endpoint
DESCRIPTION >
returns top 10 products for the last week
SQL >
SELECT
date,
topKMerge(10)(top_10) AS top_10
FROM top_per_day
WHERE date > today() - interval 7 day
GROUP BY date
Let’s push it now:
$ tb push endpoints/top_products.pipe
** Processing endpoints/top_products.pipe
** Token public_read_token not found, creating one
** Building dependencies
** Creating top_products
** => Test endpoint at https://api.tinybird.co/v0/pipes/top_products.json?token=******
** not pushing fixtures
Note the token public_read_token was created automatically and is provided in the test URL.
It’s possible to add parameters to any endpoint. For example, let’s parametrize the dates to be able to filter the data between two dates:
NODE endpoint
DESCRIPTION >
returns top 10 products for the last week
SQL >
%
SELECT
date,
topKMerge(10)(top_10) AS top_10
FROM top_per_day
WHERE date between {{Date(start)}} AND {{Date(end)}}
GROUP BY date
Now, the endpoint can receive start and end parameters: https://api.tinybird.co/v0/pipes/top_products.json?start=2018-09-07&end=2018-09-17&token=TOKEN
You can print the results from the CLI using the pipe data
command. For instance, for the example above:
$ tb pipe data top_products --start '2018-09-07' --end '2018-09-17' --format CSV
"date","top_10"
"2021-04-28","['sku_0001','sku_0004','sku_0003','sku_0002']"
Check tb pipe data --help
for more options.
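You can also call the endpoint directly over HTTP, for example with curl (assuming $TOKEN holds a token with READ permissions over the pipe):
curl "https://api.tinybird.co/v0/pipes/top_products.json?start=2018-09-07&end=2018-09-17&token=$TOKEN"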
The supported types for the parameters are: Boolean, DateTime64, DateTime, Date, Float32, Float64, Int, Integer, Int8, Int16, UInt8, UInt16, UInt32, Int32, Int64, UInt64, Int128, UInt128, Int256, UInt256, Symbol, String.
Note that for the parameter templating to work, you need to start your node’s SQL definition with the % character.
Overriding an endpoint or a data pipe¶
When working on a project, you usually need to push several versions of the same file. You can override a pipe that has already been pushed using the --force
flag.
$ tb push endpoints/top_products_params.pipe --force
** Processing endpoints/top_products_params.pipe
** building dependencies
** Creating top_products_params
current https://api.tinybird.co/v0/pipes/top_products_params.json?start=2020-01-01&end=2010-01-01
new https://api.tinybird.co/v0/pipes/top_products_params__checker.json?start=2020-01-01&end=2010-01-01 ... ok
current https://api.tinybird.co/v0/pipes/top_products_params.json?start=2010-01-01&end=2021-01-01
new https://api.tinybird.co/v0/pipes/top_products_params__checker.json?start=2010-01-01&end=2021-01-01 ... ok
** => Test endpoint at https://api.tinybird.co/v0/pipes/top_products_params.json
It will override the endpoint. If the endpoint has been called before, it runs regression tests with the most frequent requests. If the new version doesn’t return the same data, it’s not pushed. In the example above you can see all the requests that were tested (up to 10).
However, it’s possible to force the push without running the checks using the --no-check
flag:
$ tb push endpoints/top_products_params.pipe --force --no-check
** Processing endpoints/top_products_params.pipe
** Building dependencies
** Creating top_products_params
** => Test endpoint at https://api.tinybird.co/v0/pipes/top_products_params.json
This is a security check to avoid breaking production environments. It’s better to add an extra parameter than to be sorry.
Downloading data files from Tinybird¶
Sometimes you use the user interface to create pipes, and then you want to store them in your Data Project. It’s possible to download data files using the pull
command:
$ tb pull --match endpoint_im_working_on
It will download the endpoint_im_working_on.pipe
directly to the current folder.
Working with versions¶
Data sources, endpoints, and pipes change over time. Versions are a good way to organize these changes.
The version system is simple:
Each resource might have a version, specified with a VERSION <number> line in the project file.
When a resource is pushed, it uses the version of the dependencies found in the local files. For example, if a pipe uses a Data Source and both files have VERSION 1 locally, when you push the pipe it will use version 1 of the Data Source even if the server has other versions.
You can check which version is set for each resource with the tb datasource ls or tb pipe ls commands.
An example of a Data Source with a defined version:
# this Data Source is in version 3
VERSION 3
DESCRIPTION generated from /Users/username/tmp/sample.csv
SCHEMA >
`d` DateTime,
`total` Int32,
`from_novoa` Int16
Versions are optional: there can be resources without any version, but we encourage you to use them for all the resources, even if you only need to version some of them.
Versions start to pay off when you put your Data Sources and endpoints in a production environment (that is, when they are integrated into an application or other workflow). At that point you want to keep working on your endpoints without disrupting the applications that use them, so you create a new version and work on it until it’s ready to be published.
Naming conventions¶
The VERSION system uses this convention to rename your pipes and datasources:
{datasource|pipe}__{version}
This is important to note because for certain operations, such as running a SQL query or removing a datasource, you need to provide the full name, including the version.
For instance, if you created the VERSION 0 of a datasource like this:
$ tb push datasources/event.datasource
When you want to remove it you do it like this:
$ tb datasource rm event__v0
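The same applies when querying from the CLI; for instance, a count over the versioned Data Source above references its full name:
$ tb sql "select count() from event__v0"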
Members management¶
You can manage workspace members using the Web UI or the CLI. For the latter, just use the workspace members
commands.
You can add members:
$ tb workspace members add "user1@example.com,user2@example.com,user3@example.com"
Remove members:
$ tb workspace members rm user3@example.com
And list them:
$ tb workspace members ls
---------------------
| email |
---------------------
| user1@example.com |
| user2@example.com |
---------------------
You can also manage roles:
$ tb workspace members set-role admin user@example.com
$ tb workspace members set-role guest user@example.com
Integrated help¶
Once you’ve installed the CLI you can access the integrated help:
$ tb --help
Usage: tb [OPTIONS] COMMAND [ARGS]...
Options:
--debug / --no-debug Prints internal representation, can be
combined with any command to get more
information.
--token TEXT Use auth token, defaults to TB_TOKEN envvar,
then to the .tinyb file
--host TEXT Use custom host, defaults to TB_HOST envvar,
then to https://api.tinybird.co
--version-warning / --no-version-warning
Don't print version warning message if
there's a new available version. You can use
TB_VERSION_WARNING envar
--hide-tokens Disable the output of tokens
--version Show the version and exit.
-h, --help Show this message and exit.
Commands:
auth Configure auth
check Check file syntax
connection Connection commands
datasource Data sources commands
dependencies Print all data sources dependencies
diff Diffs a local datafiles to the corresponding remote files in
the workspace.
fmt Formats a .datasource, .pipe or .incl file
init Initialize folder layout
job Jobs commands
materialize Given a local Pipe datafile (.pipe) and a node name it
generates the target Data Source and materialized Pipe ready
to be pushed and guides you through the process to create the
materialized view
pipe Pipes commands
prompt Learn how to include info about the CLI in your shell PROMPT
pull Retrieve latest versions for project files from Tinybird
push Push files to Tinybird
sql Run SQL query over data sources and pipes
test Test commands
workspace Workspace commands
And you can do the same for every available command, so you don’t need to know every detail for every command:
$ tb datasource --help
Usage: tb datasource [OPTIONS] COMMAND [ARGS]...
Data sources commands
Options:
--help Show this message and exit.
Commands:
analyze Analyze a URL or a file before creating a new data source
append Create a data source from a URL, local file or a connector
connect Create a new datasource from an existing connection
delete Delete rows from a datasource
generate Generates a data source file based on a sample CSV, NDJSON or
Parquet file from local disk or url
ls List data sources
replace Replaces the data in a data source from a URL, local file or...
rm Delete a data source
share Share a datasource
truncate Truncate a data source
Full Command list¶
auth¶
Configure auth
check¶
Check file syntax
datasource¶
Data sources commands
datasource analyze¶
Analyze a URL before creating a new Data Source
datasource append¶
Create a data source from a URL, local file or a connector
datasource generate¶
Generates a Data Source file based on a sample CSV, NDJSON, or Parquet file from local disk or url
datasource ls¶
List Data Sources
datasource replace¶
Replaces the data in a Data Source from a URL, local file or a connector
datasource rm¶
Delete a Data Source
datasource truncate¶
Truncates a Data Source
dependencies¶
Print all Data Sources dependencies
init¶
Initializes folder layout
materialize¶
Analyzes the node_name SQL query to generate the .datasource and .pipe files needed to push a new materialized view.
pipe¶
Pipes commands
pipe append¶
Append a node to a pipe
pipe data¶
Print data returned by a pipe
pipe generate¶
Generates a pipe file based on a sql query
pipe ls¶
List pipes
pipe rm¶
Delete a pipe
pipe set_endpoint¶
Change the published node of a pipe
pipe token_read¶
Retrieve a token to read a pipe
workspace ls¶
List all the workspaces you have access to in the account you’re currently authenticated to
workspace use¶
Switch to another workspace
workspace current¶
Show the workspace you’re currently authenticated to
workspace clear¶
Drop all the resources inside a project. This command is dangerous because it removes everything, use with care
workspace create¶
Create a new workspace.
Allows the creation of a workspace from a starter kit. Currently only our Web Analytics starter kit (web-analytics) is supported.
workspace delete¶
Delete a workspace where you are admin
workspace members¶
Workspace members management commands
workspace members ls¶
List all members in the current workspace
workspace members add¶
Adds members to the current workspace
workspace members rm¶
Removes members from the current workspace
pull¶
Retrieve latest versions for project files from Tinybird
push¶
Push files to Tinybird
sql¶
Run SQL query over Data Sources and pipes
diff¶
Diffs local datafiles to the corresponding remote files in the workspace.
It works as a regular diff
command, useful to know if the remote resources have been changed. Some caveats:
Resources in the workspace might mismatch due to having slightly different SQL syntax, for instance some parenthesis mismatches, INTERVAL expressions, or changes in the schema definitions.
If you didn’t specify an ENGINE_PARTITION_KEY and ENGINE_SORTING_KEY, resources in the workspace might have some default ones.
The recommendation in these cases is to use tb pull to keep your local files in sync.
Remote files are downloaded and stored locally in a .diff_tmp directory. If working with git, you can add it to .gitignore.
fmt¶
Formats a .datasource, .pipe or .incl file
The implementation is based on the ClickHouse dialect of shandy-sqlfmt, adapted to Tinybird datafiles.
You can add tb fmt
to your git pre-commit
hook to have your files properly formatted.
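A minimal sketch of such a hook (.git/hooks/pre-commit), assuming tb fmt accepts a file path as described in the command list above; check tb fmt --help for the exact behaviour and flags in your installed version:
#!/bin/sh
# Format every staged Tinybird datafile before the commit is created
for f in $(git diff --cached --name-only | grep -E '\.(datasource|pipe|incl)$'); do
  tb fmt "$f"
done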
If the SQL formatting results are not the ones you expect, you can disable it just for the blocks that need it. Read how to disable fmt.
Supported platforms¶
The CLI supports Linux and macOS versions later than 10.14.
Configure the shell PROMPT¶
When working with the Tinybird CLI from the command line it’s useful to have the current workspace in the command line PROMPT, in the same way you have your active git branch for instance.
The Tinybird CLI stores the credentials in a local file called .tinyb, so it’s relatively easy to extract the information needed for the PROMPT from there and customize it to your needs.
You can copy this function to your shell config file (~/.zshrc, ~/.bashrc, etc.) and include it in your PROMPT:
prompt_tb() {
if [ -e ".tinyb" ]; then
TB_CHAR=$'\U1F423'
branch_name=`grep '"name":' .tinyb | cut -d : -f 2 | cut -d '"' -f 2`
region=`grep '"host":' .tinyb | cut -d / -f 3 | cut -d . -f 2 | cut -d : -f 1`
if [ "$region" = "tinybird" ]; then
region=`grep '"host":' .tinyb | cut -d / -f 3 | cut -d . -f 1`
fi
TB_BRANCH="${TB_CHAR}tb:${region}=>${branch_name}"
else
TB_BRANCH=''
fi
echo $TB_BRANCH
}
Once the function is available, making the output visible in the PROMPT depends on your shell installation; for instance, for zsh this should work in most cases:
echo 'export PROMPT="' $PS1 ' $(prompt_tb)"' >> ~/.zshrc
Once properly configured, and you are in the root directory of a Data Project (the one with the .tinyb file), you’ll see the Tinybird region and workspace in your PROMPT:

CLI telemetry¶
Since version 1.0.0b272, the Tinybird CLI includes telemetry. The feature collects the use of the CLI commands and information about exceptions and crashes anonymously and sends it only to Tinybird. Telemetry data helps Tinybird understand how the commands are used so we can improve our command line experience. Information on undesired outputs helps the team resolve potential issues and fix bugs.
On each tb
execution, we collect information about your system, your Python environment, the CLI version installed and the command you ran.
How to opt out¶
The CLI telemetry feature is enabled by default. To opt out, set the TB_CLI_TELEMETRY_OPTOUT environment variable to 1 or true.
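For example, in your shell or shell configuration file:
export TB_CLI_TELEMETRY_OPTOUT=1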