Intro to ingesting data with Tinybird

Learn the different ways to ingest data into Tinybird, so you can go from large amounts of raw data to ultra-fast queries over your Data Sources in seconds.

Tinybird Analytics is a powerful tool with which you can ingest, transform and expose large amounts of CSV or NDJSON data in real time. You can query the data using SQL, so you don't have to learn a new query language, and you can create secure API endpoints to consume your data in a matter of seconds instead of days or weeks.

In these Getting Started guides, we'll explore a common use case: real-time analytics on events from an ecommerce store. After reading them, you'll have a much better understanding of how to use Tinybird to analyze large amounts of real-time data in other areas, such as SaaS, marketplaces and media.

The data

In Tinybird, the bigger your data, the better. Although all the guides explain common concepts that can be applied to most datasets out there, we've generated two sample ecommerce datasets that we'll use in the guides:

  • One with data for 2.4M products, split into two files
  • Another with 100M rows of website events, also split into two files

Ingesting products data via the User Interface

This is the simplest way to get your data into Tinybird. For each dataset, you need to create a new Data Source, using your existing CSV or NDJSON files. After creating a Data Source you can always continue to append or replace your data.

Creating a Data Source

Go to your dashboard and click on the "Add Data Source" button. Download the sample file to your computer and select it, or just paste the file URL in the input.

By default, Tinybird guesses the type of every column present in your file. After clicking on "Add", you'll be taken to a screen where you'll see a preview of your data, and from there you'll be able to change the schema if desired.

Change your column names and types in the Data Source preview modal window

For now, this is the only point at which you can change the schema of the Data Source. After making sure everything looks OK, click on "Continue" and, as soon as your data starts getting ingested (we ingest data by streaming, even if it sits in big files), you'll see it in the Data Source modal window.

The Data Source modal window shows everything related to your Data Source

As we will see later, from this modal window you can rename your Data Source, append new data to it, truncate it or delete it. In the following guides, we will cover how to do all these operations programmatically.

{% tip-box title="VIEWING YOUR DATA SOURCE INFORMATION" %}You can always bring back this view by clicking on the Data Source name in the side panel.{% tip-box-end %}

Append data to an existing Data Source

Once you've created a Data Source, you can easily append more data to it through the User Interface, the REST API or the Command Line Interface (CLI). In the UI, just click on "Options" and then on "Append data".

A modal window similar to the one you saw when creating the Data Source will appear. Try using this other file this time.

As you can see, when appending data you can't change your Data Source schema. Once your data has been correctly appended you will see a new entry in your Operations Log.

{% tip-box title="WHAT IF IT FAILS?" %}If any rows fail to ingest, you will see them in the quarantine view. This is especially useful for fixing your data at ingestion time.{% tip-box-end %}

Ingesting events data programmatically

Most developers will want to upload data to Tinybird programmatically. In that case, our REST API is the way to go. It also provides access to some features that aren't available via the UI and that let you fine-tune Tinybird to be faster.

To install and use the CLI, check out the guide.

Creating Data Sources via the REST API or the CLI

Let's add the events data now, directly from its URL {% code-line %}https://storage.googleapis.com/tinybird-assets/datasets/guides/events_50M_1.csv{% code-line-end %}. First, make sure you have a token with access to the "Data Sources management" scope. A quick way to create one is from the "Manage Auth tokens" section, accessible from the sidebar of your Tinybird dashboard. You can always use your admin token, but do so with care.

Then, a single request reads the events CSV, creates a new Data Source with the guessed schema and ingests the data into it, through either the REST API or the CLI.
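As a rough illustration, here is a minimal Python sketch of what such a request could look like. It only builds the request; sending it is left as a commented-out line. The host {% code-line %}https://api.tinybird.co{% code-line-end %}, the {% code-line %}/v0/datasources{% code-line-end %} path, the {% code-line %}name{% code-line-end %} and {% code-line %}url{% code-line-end %} query parameters and the Bearer header are assumptions that this guide does not state, so check them against the API reference before relying on them.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumptions (not stated in this guide): API host, endpoint path,
# Bearer authentication, and the `name` / `url` parameter names.
API_HOST = "https://api.tinybird.co"
TOKEN = "<your_token>"
EVENTS_URL = "https://storage.googleapis.com/tinybird-assets/datasets/guides/events_50M_1.csv"


def build_create_request(name: str, csv_url: str, token: str = TOKEN) -> Request:
    """Build a POST request that creates a Data Source from a remote CSV."""
    # `mode` defaults to `create`; it is spelled out here for clarity.
    query = urlencode({"name": name, "mode": "create", "url": csv_url})
    return Request(
        f"{API_HOST}/v0/datasources?{query}",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


create_request = build_create_request("events", EVENTS_URL)
print(create_request.get_method(), create_request.full_url)

# To actually send the request:
# from urllib.request import urlopen
# print(urlopen(create_request).read().decode())
```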

{% tip-box title="Use a token with the right scope" %}Replace {% code-line %}<your_token>{% code-line-end %} with a token whose scope is {% code-line %}DATASOURCES:CREATE{% code-line-end %} or {% code-line %}ADMIN{% code-line-end %}.{% tip-box-end %}

You can also create a Data Source programmatically from a local CSV file.
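A local upload might be sketched in Python as a multipart form request. The form field name {% code-line %}csv{% code-line-end %}, the host and the endpoint path are assumptions not confirmed by this guide, and the inline sample rows are hypothetical stand-ins for the contents of a real file.

```python
import uuid
from urllib.parse import urlencode
from urllib.request import Request

# Assumptions (not stated in this guide): API host, endpoint path,
# Bearer authentication, and the `csv` form field name.
API_HOST = "https://api.tinybird.co"
TOKEN = "<your_token>"


def build_upload_request(name: str, filename: str, content: bytes,
                         token: str = TOKEN) -> Request:
    """Build a multipart POST request that creates a Data Source from CSV bytes."""
    boundary = uuid.uuid4().hex
    # A single-part multipart/form-data body carrying the file contents.
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="csv"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: text/csv\r\n\r\n",
        content,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    query = urlencode({"name": name, "mode": "create"})
    return Request(
        f"{API_HOST}/v0/datasources?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Hypothetical sample rows; in practice read the bytes from your file,
# e.g. pathlib.Path("local_file.csv").read_bytes().
sample_csv = b"event,user_id\nview,1\nbuy,2\n"
upload_request = build_upload_request("events", "local_file.csv", sample_csv)
print(upload_request.get_method(), len(upload_request.data), "bytes")
```

For very large files, building the whole body in memory is not ideal; this sketch favors brevity over streaming.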

Appending data via the REST API or the CLI

To append new data to an existing Data Source, you just need to specify the {% code-line %}mode{% code-line-end %} parameter and set it to {% code-line %}append{% code-line-end %} ({% code-line %}mode{% code-line-end %} is {% code-line %}create{% code-line-end %} by default).
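The difference from creating a Data Source can be sketched in Python as a single changed parameter. As before, the host, the endpoint path, the Bearer header and the {% code-line %}name{% code-line-end %} and {% code-line %}url{% code-line-end %} parameters are assumptions, and the file URL below is a hypothetical placeholder rather than one of the sample datasets.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumptions (not stated in this guide): API host, endpoint path,
# Bearer authentication, and the `name` / `url` parameter names.
API_HOST = "https://api.tinybird.co"
TOKEN = "<your_token>"


def build_append_request(name: str, csv_url: str, token: str = TOKEN) -> Request:
    """Build a POST request that appends a remote CSV to an existing Data Source."""
    # Setting mode=append is what distinguishes this from a create request.
    query = urlencode({"name": name, "mode": "append", "url": csv_url})
    return Request(
        f"{API_HOST}/v0/datasources?{query}",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


# Hypothetical URL; replace it with the file you want to append.
append_request = build_append_request("events", "https://example.com/more_events.csv")
print(append_request.get_method(), append_request.full_url)

# To actually send the request:
# from urllib.request import urlopen
# print(urlopen(append_request).read().decode())
```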

{% tip-box title="Use a token with the right scope" %}Replace {% code-line %}<your_token>{% code-line-end %} with a token whose scope is {% code-line %}DATASOURCES:CREATE{% code-line-end %}, {% code-line %}ADMIN{% code-line-end %} or {% code-line %}DATASOURCES:APPEND:events{% code-line-end %}.{% tip-box-end %}

{% tip-box title="APPEND OR REPLACE DATA USING THE API" %}Read more about the different modes in our API reference or check out our guide on replacing or deleting data selectively.{% tip-box-end %}
