Changelog

New updates and improvements in Tinybird

Parquet ingestion via URL

Parquet file ingestion via URL is now 10 times faster for large Parquet files (hundreds of megabytes). Performance gains may vary based on compression levels and the number of row groups. This optimization significantly enhances the process of backfilling historical data.

Support for Alter API to make a column Nullable and to drop columns

We've improved the Alter endpoint of the Data Sources API: you can now make a column Nullable and drop columns. This is particularly useful when you need to change the schema of an existing Data Source.

  • To make a column nullable using the CLI, wrap the column's existing type in Nullable in the datafile (for example, String becomes Nullable(String)), then push it using the --force option.
  • To drop a column, simply remove it from the schema definition. You can't remove columns that are part of the primary or partition key.

Support for DEFAULT when creating a new Data Source from a schema

When a new Data Source is generated by defining the schema, you can now add DEFAULT values to the fields:

SCHEMA >
    `name` String `json:$.name`,
    `city` String `json:$.city` DEFAULT 'New York',
    `number` Int32 `json:$.number` DEFAULT 8

Once created, you can view the DEFAULT value from the Schema tab in the Data Source details view.

Populate a Data Source from multiple materializations in the UI

There are some cases where you need to populate a Data Source from multiple materializations. Now, you can do this directly from the UI! When you select Populate from the Data Source options, you can select more than one materialization to populate the Data Source.

No more gibberish aggregated state!

We no longer show the intermediate state of aggregated columns in materialized Data Sources. This data is stored in a format unsuitable for human reading. While we haven't checked with extraterrestrial beings from outer worlds, we suspect they wouldn't be able to decipher it either. This was creating confusion among our users, especially those of you seeing these sets of random characters for the first time. Sorry about that.

Now, we show a label that better conveys that the aggregated state needs to be compacted later at query time. Hover over the label, and you'll find a hint about the function you should use at query time, and a link to our guide explaining how state works in Materialized Views.

Export CSV using custom parameter values

You can now export CSV using custom values for your Pipe parameters. To do this, enter the "Test new values" mode, apply the value you want to explore, and select "Export CSV". ✨

🐞 Small improvements or bug fixes

We've improved the observability of Copy Pipe and API Endpoint query errors. Processed bytes from failed operations with code 4xx, such as timeouts or memory usage limits, are now also included in the datasources_ops_log and pipe_stats Service Data Sources.

Grant organization admin permissions

Now you can grant and revoke admin permissions for your whole organization using the app. Before this change, you had to contact our support.

Also, we've added 3 new organization Service Data Sources: organization.sinks_ops_log, organization.bi_stats_rt, and organization.bi_stats. They're equivalent to their tinybird.* counterparts, and include data about all Workspaces in an organization. You can read about organization Service Data Sources on the Organizations page.

Amazon MSK available

You can now ingest data from MSK (Amazon Managed Streaming for Apache Kafka) using the UI. If your cluster doesn't have public access, please reach out to us for help with the underlying connectivity setup.

🐞 Small improvements or bug fixes

  • Now you can sort your branches by name in the CLI using tb branch ls --sort. This will help you find the branch you are looking for faster.
  • We have added FixedString(N) to our list of supported data types. Previously, it only worked for CSV Data Sources.
  • You can choose the replace mode for any Copy Pipe from the UI. Last week it was available from the API and CLI.
  • New Decimal and Bool options are available in the Column Type dropdown when creating a new Data Source in the UI.

Tinybird React library for in-product analytics

BETA

We are excited to announce a major upgrade to our Tinybird React library @tinybirdco/charts, designed to empower frontend developers with robust tools for building in-product analytics. This update significantly enhances the library, offering everything you need to create rich charts and better user experiences with ease.

Key features:

  • Direct data querying: A new hook allows you to directly query data from Tinybird, streamlining the process of fetching and displaying data.
  • Third-party integration: Integration with any third-party chart library or custom component, offering great flexibility.
  • Ready-to-use components: 6 ready-to-use Tinybird Charts and a table component, enabling quick and easy data visualization.
  • Total customization control: Full control over Tinybird Charts customization to fit your unique needs.
  • Polling: Refresh your data with periodic updates for real-time data needs like trading charts.
  • ChartProvider: Share styling and query configurations across multiple charts for a consistent look and feel.
  • State control: Total control over the state of your charts (loaded, loading, error...).
  • Token management: An exposed fetcher simplifies token management.

Check out the Charts documentation to get started.

Create a Pipe from a Playground

Playgrounds are a great way to explore your data. You can do one-time queries and not mess up your data project. They're also used to develop Pipes in a sandbox way. Tinybird users often need them to become Pipes so they can grow into being an API Endpoint, a Materialized View... and fulfill their real calling.

Now, you can duplicate any Playground into a Pipe with just a couple of clicks.

Specify rate limit for JWT tokens

You can now specify a rate limit of maximum requests per second when defining a JWT token. This is a particularly useful safety net to stop any published Endpoints from accidentally blowing up your Workspace usage! After this limit is reached, any new requests receive a 429 response code.

Read more about it in the "Rate limits for JWTs" docs.
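
As a rough illustration of where the rate limit lives, here is a minimal sketch that signs a JWT with only the Python standard library. The payload field names (workspace_id, name, scopes, limits.rps), the PIPES:READ scope, the hypothetical top_products Pipe, and the use of an admin token as the HS256 signing key are all assumptions; check them against the "Rate limits for JWTs" docs before relying on them.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_jwt(payload: dict, secret: str) -> str:
    """Sign a payload as an HS256 JWT using only the standard library."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{b64url(signature)}"


# Every value below is a placeholder, and the field names are assumptions.
payload = {
    "workspace_id": "<workspace_id>",
    "name": "frontend_jwt",
    "exp": int(time.time()) + 3600,  # token expires in one hour
    "scopes": [{"type": "PIPES:READ", "resource": "top_products"}],
    "limits": {"rps": 10},  # assumed shape: at most 10 requests per second
}

token = make_jwt(payload, secret="<admin_token_used_as_signing_key>")
```

With a limit like this in place, a client that exceeds it should expect the 429 responses described above and back off before retrying.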

Copy Pipe mode update

Previously, a Copy Pipe was always created in "append" mode, adding new results to the existing Data Source. Now, during the creation process, you can choose between "Append only new data" and "Replace all data".

Additionally, you can change the mode from the Copy Pipe options, even for Copy Pipes created previously.

Bool and Decimal types are now supported

We've added support for the following types in our Data Sources:

  • Bool: a field with true/false as possible values.
  • Decimal(P,S)/Decimal32(S)/Decimal64(S)/Decimal128(S)/Decimal256(S): these fields can precisely store signed decimal values with up to 76 digits, including fractional ones. The P parameter defines the total number of digits, while S sets the number of digits for the fractional part.

Schema example using Bool and Decimal types

SCHEMA >
    `bool_value` Bool `json:$.bool_value`,
    `decimal_value` Decimal(20,9) `json:$.decimal_value`

See the "Supported data types" docs for more information and limitations to be aware of.
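
To make the roles of P and S concrete, the following sketch checks whether a literal fits Decimal(P,S) without losing digits: at most S fractional digits and at most P - S integer digits. The helper name fits_decimal is hypothetical, and this is only an illustration of the two parameters, not a description of how ingestion validates or rounds values.

```python
from decimal import Decimal


def fits_decimal(value: str, precision: int, scale: int) -> bool:
    """Return True if `value` fits Decimal(precision, scale) exactly."""
    number = Decimal(value)
    if not number.is_finite():
        return False  # NaN and Infinity are not representable
    if number == 0:
        return True
    # normalize() drops trailing zeros so "0.50" counts one fractional digit
    _, digits, exponent = number.normalize().as_tuple()
    fractional_digits = max(0, -exponent)
    integer_digits = max(0, len(digits) + exponent)
    return fractional_digits <= scale and integer_digits <= precision - scale


# Decimal(20,9) leaves 11 digits for the integer part and 9 for the fraction.
print(fits_decimal("12345678901.123456789", 20, 9))
print(fits_decimal("123456789012.5", 20, 9))
```

The first call fits exactly (11 integer digits and 9 fractional digits), while the second has 12 integer digits and does not.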

Support for default values in NDJSON Data Sources

To improve the ingestion of NDJSON Data Sources, we've added support for default values, in the same way we already support them for CSV Data Sources.

To define a default value for a column in an NDJSON Data Source, use the DEFAULT keyword in the schema definition after the JSONPath. Here's an example:

SCHEMA >
    `timestamp` DateTime `json:$.timestamp` DEFAULT now(),
    `string_value` String `json:$.string_value` DEFAULT '-',
    `int_value` Int32 `json:$.int_value` DEFAULT 1

If a column has a default value defined, the row won’t be sent to quarantine if that field is missing in the JSON object or it has a null value.
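
The rule above can be sketched in a few lines of Python. This models the described behavior only and is not Tinybird's implementation; the column list, the hypothetical id column without a default, and the resolve_row helper are assumptions made for illustration.

```python
from typing import Any

# Hypothetical schema: "id" has no default, the other two columns do.
COLUMNS = ["id", "string_value", "int_value"]
DEFAULTS = {"string_value": "-", "int_value": 1}


def resolve_row(event: dict[str, Any]) -> tuple[dict[str, Any], bool]:
    """Return (row, quarantined) for one parsed NDJSON event."""
    row: dict[str, Any] = {}
    quarantined = False
    for column in COLUMNS:
        value = event.get(column)  # a missing key and a JSON null both give None
        if value is None:
            if column in DEFAULTS:
                value = DEFAULTS[column]  # default fills the gap
            else:
                quarantined = True  # assumed: non-nullable column, no default
        row[column] = value
    return row, quarantined


row, quarantined = resolve_row({"id": "a1", "int_value": None})
```

Here the event is missing string_value and carries a null int_value, yet both columns receive their defaults and the row is not flagged for quarantine.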

It's now also possible to alter the schema of an existing NDJSON Data Source to add default values: add them to the datafile and run tb push --force.

Confirm before saving in Time Series

To avoid accidental changes in your Time Series, we've added a confirmation banner when saving changes in the configuration. This way, you can review the changes before applying them or duplicate the Time Series to create a new one.

Support for append and replace modes in Copy Pipes

It's now possible to specify the insertion mode for Copy Pipes. This is done by setting the new attribute COPY_MODE in the datafile, with the possible values:

  • append: Every execution will add the rows extracted by the Copy Pipe to the destination Data Source, incrementally.
  • replace: Every run completely replaces the destination Data Source content with the rows generated in the Copy Pipe.

Here's an example:

Example Copy Pipe with insertion mode REPLACE
NODE all_orders
SQL >
    select * from orders

TYPE copy
TARGET_DATASOURCE orders
COPY_MODE replace
COPY_SCHEDULE @on-demand

Tinybird Charts: New Table component

We're excited to introduce a powerful new addition to Tinybird Charts: The Table component!

This new component enhances your data visualization capabilities alongside the existing Chart components.

Snapshots deprecation

The Snapshots feature will be deprecated next week, hiding the option to create new ones and removing the access to public ones. Remember to use the Playground instead.

Improvements

  • You can now select the compression type for a Sink export from the UI.
  • You can now create a Copy Pipe from a Node even when the Node query fails because of the 10-second timeout.
  • Fixed a problem when querying the organization Service Data Sources in a Branch.

Copy your Node results

We've improved the UI with a feature that makes it really easy to copy the output of any Node. Individual cell, row, column, the whole table: It's all ready to be copied immediately!

Just right-click (or equivalent) in any Node results table and copy whatever you need.

New icons in the Tinybird header

Also, we've included two new icons in the top-level header bar in the Tinybird app:

  • We've moved your user info and options to a more discoverable place: The new user icon at the top right of the Tinybird UI.
  • Next to the user icon, the latest changelog entries now appear as a bell icon. If the bell has an additional blue dot, it means there's been a new update to the product and corresponding entry. Check it out, we'll have something exciting to tell you!

Unified banner styles

We have unified how banners are shown across the whole platform.

All notifications, no matter how many, are now visible and ordered by priority. The priority is:

  1. Error (Red)
  2. Quarantine (Orange)
  3. Warning (Yellow)
  4. Warning/Info (Blue)

As long as a banner is not dismissed, it will continue to appear in the dropdown options. Once a banner is dismissed, the next banner in order of priority is shown automatically.

From each banner, you can explore further details like context and links.

TimeSeries bug fixed

We've fixed a bug in the Time Series feature, so now it displays null values better.

Adding new columns to S3 Connector Data Sources

You can now add new columns to your S3 Connector Data Sources. You can do so with the CLI, the same way you do with other types of Data Sources.

insertion_date deprecation

We're deprecating the automatically-generated insertion_date column. Here's why:

Every Data Source in Tinybird needs a Sorting Key. You can set it explicitly with the optional ENGINE_SORTING_KEY setting; if you don't, Tinybird chooses a safe default from the defined columns. Only non-nullable columns can be used for the Sorting Key, so when all columns in the schema were nullable, Tinybird created a new non-nullable column called insertion_date and used that as the Sorting Key.

This automatically-created column, which isn't defined by the user, was causing issues when iterating a Data Source. It's not explicitly listed in the .datasource file but still exists, creating an inconsistency. To address this, we've deprecated this behavior. Now, ENGINE_SORTING_KEY will default to tuple() if there are no suitable columns in the schema definition.

This new behavior ensures that the .datasource files in your data project are 100% consistent with your Workspace resources.
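
The selection logic described above can be sketched as follows. The changelog only says that Tinybird picks "a safe default" from the defined columns, so choosing the first non-nullable column here is an assumption, and the default_sorting_key helper is hypothetical.

```python
from typing import Optional


def default_sorting_key(
    columns: list[tuple[str, str]], explicit: Optional[str] = None
) -> str:
    """Pick a sorting key from (name, type) pairs, following the new behavior."""
    if explicit:
        return explicit  # ENGINE_SORTING_KEY set by the user always wins
    for name, column_type in columns:
        if not column_type.startswith("Nullable("):
            return name  # assumption: the first non-nullable column is chosen
    return "tuple()"  # no suitable column: no implicit insertion_date anymore


all_nullable = [("name", "Nullable(String)"), ("city", "Nullable(String)")]
print(default_sorting_key(all_nullable))
```

With an all-nullable schema, the function falls through to tuple() instead of inventing an extra column, which is what keeps the datafile and the Workspace in sync.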

How to prepare for the deprecation

Update your Tinybird CLI to > 5.0.0 and run tb pull --force. This will make insertion_date explicit in your datafiles.

If you use Branches and version control, we have added an example in our Use Case repository that demonstrates how to integrate these changes.

You can use a CLI version prior to 5.0.0 if you need to continue using the insertion_date column implicitly.

Tinybird Charts: Real-Time Embeddable Charts

We've just released Tinybird Charts to turn your API Endpoints into real-time visualizations with just a few clicks.

Choose between a set of built-in visualization components, like lines, bars, stacked, areas, and more, that enable you to turn your real-time data into fast charts. Tinybird Charts is available to all Tinybird customers across all Tinybird pricing plans.

Build your user-facing analytics using Tinybird Charts

Tinybird Charts are highly customizable, and you can save style presets to apply them in a single click across any new Chart you create. Once your Chart is ready, copy the React or HTML code and embed it in your application. Remember that you can pass parameters to the Endpoint if you want to add filters like date or customer_id.

You can learn more about Tinybird Charts in our docs or in our blog.

Enhancing Tinybird observability

Having a complete set of observability features in Tinybird is crucial for early issue detection, fast resolution, performance optimizations, and maintaining a reliable service. The better and more self-explanatory these features are, the easier any user can figure things out on their own.

That's why we've released the following enhancements to our observability features - so you can find and fix problems, fast.

Jobs Log

We’re excited to announce the release of our new feature: Jobs Log ✨

This new Service Data Source allows you to track jobs run across Workspaces, overcoming the limitations of the current Jobs API. With Jobs Log, you can access logs of jobs run over the past 12 months, without being restricted to 100 records or the past 48 hours. It's available in all Tinybird Workspaces as tinybird.jobs_log.

This new Service Data Source includes records for all Copy, Sink, Populate, Import, and Delete jobs.
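
Because tinybird.jobs_log is a regular Service Data Source, you can query it with SQL. The sketch below only builds a Query API URL for a weekly summary; the column names (job_type, status, created_at) and the API host are assumptions to check against the Service Data Sources reference and your region.

```python
from urllib.parse import urlencode

API_HOST = "https://api.tinybird.co"  # assumption: the host depends on your region

# Assumed column names; adjust them to the actual jobs_log schema.
SQL = """
SELECT job_type, status, count() AS jobs
FROM tinybird.jobs_log
WHERE created_at > now() - INTERVAL 7 DAY
GROUP BY job_type, status
ORDER BY jobs DESC
FORMAT JSON
"""


def build_query_url(sql: str, host: str = API_HOST) -> str:
    """Return a Query API URL with the SQL passed in the q parameter."""
    return f"{host}/v0/sql?{urlencode({'q': sql.strip()})}"


url = build_query_url(SQL)
```

Send the request with an Authorization: Bearer header carrying a token that can read the Service Data Source, rather than putting the token in the URL.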

More detailed error messages when using Connectors

Error messages now include more detailed information when using the Snowflake, BigQuery, or S3 Connector. You can check these records in the datasources_ops_log Service Data Source, and under the Logs tab in the UI. These improvements make it easier for you to understand your errors and fix them on your own.

You can now configure an Endpoint to monitor Data Sources that have not been successfully updated in the last hour; see Detect out-of-sync Data Sources.

Connectors bugfixes 🐞

The following issues around the S3 Connector have been addressed:

  • The file path expression is now fixed in the Preview UI. Files are detected correctly, and performance has improved significantly.
  • Fixed Parquet support.
  • Fixed compressed files support.
  • Fixed the issue causing the same files to get appended multiple times.

Deprecation of Kafka non-binary headers

To improve how we process Kafka headers, we are deprecating the option of storing them as a JSON string field. This change will take effect from September 1st 2024.

We recently changed how we process Kafka headers: since April 23rd 2024 we treat them as a binary map, instead of assuming they can be stored as JSON (which was occasionally error-prone). This new change provides you with a more stable Tinybird platform.

How to prepare for the deprecation

You don't have to do anything. We will remind you in advance, and upgrade your Data Source from the Tinybird side.

If you are importing Kafka headers in any of your Data Sources from before April 23rd 2024, you are already getting them as a JSON string in the __headers field. Don't worry - we will move any existing headers to the __old_headers field. Then Tinybird will start importing the new headers in __headers as a Map(String,String). As a reminder, String is the ClickHouse type for storing binary data.

Snapshots deprecation

Tinybird's Snapshots feature is being deprecated on July 1st, 2024. Use the Playground instead, which lets you share Pipe results with any member of your Workspace.

This deprecation means:

  • The tinybird.snapshot_views Service Data Source won't be accessible any more.
  • Any published snapshots will stop working, and you won't be able to access them.
  • Any mention of snapshots in the documentation will be removed.

Announcing app.tinybird.co

We've moved! 📦

🎉 It's official! From today, ui.tinybird.co has become app.tinybird.co.

This change is an improvement to the Tinybird platform. It provides a single, simple URL for all our users, no matter where they are in the world or what region they're using under the hood. This means Tinybird will redirect to the proper region where you already have a Workspace, you can easily see and navigate between Workspaces using the breadcrumbs, and you can also now have Workspaces in different regions open at the same time in different tabs!

Remember to change any bookmarks now that we have one application to rule them all (...the regions, that is).

Releases and GitHub UI deprecation

To improve Tinybird Versions, we are deprecating a handful of features to simplify the experience of iterating data projects:

  • Releases: Understanding which of the different deployment strategies to use based on the change you were trying to deploy increased the learning curve too much. Most changes were just deployed as post-releases. We are removing them until we can introduce them better.
  • GitHub integration from the UI: The gap between working from the UI and waiting for an external CI/CD to run was confusing and reduced speed. We are keeping the more flexible connection to Git from the CLI.
  • CI/CD default actions: Providing out-of-the-box complete CI/CD actions was a problem once something unexpected happened. We are improving our CI/CD guides to help you build the workflow and actions that better fit your needs with the same tools.

We have updated the docs to reflect this change and the best way to work with CI/CD and version control in Tinybird.

The deprecated features will continue working until July 1st, 2024. The latest CLI release, 4.0.0, hides the --semver option of tb deploy and makes it optional, and the latest CI template skips Releases. On July 1st we will deactivate the feature flag that allows using Releases, so deploys that use a semver will fail and Releases will be removed from the Workspaces.

How to prepare for the deprecation

Ensure you don’t have any Preview Releases with work you want to keep. Once Releases are deprecated, your Workspace will only keep the active Live Release.

If you are using the GitHub integration from the UI, you will have to disconnect your Workspace and connect again using tb init --git. You can read more about it in the docs. Make sure you remove the default deploy actions or upgrade to 4.0.1.

Tinybird hits the West Coast 🌴🌞🌊

AWS US-WEST-2 now GA

Tinybird is now available in the US-WEST-2 region of AWS.

This is Tinybird's first West Coast region in the USA and it's available to all Build, Pro and Enterprise users.

Export Nodes to CSV

You can now export the result of an individual Node to a CSV file.

This is particularly useful when you want to see the full results of a Node, but the Node itself is not published as an API Endpoint.

Other notable changes

  • Kafka headers are now stored as binary data using the Map(String,String) type.

All the DX!

Ask AI

Ask AI for help from the docs or community Slack.

Visit the docs or community Slack and click on the Ask AI button in the bottom right corner. The AI has been trained on public Tinybird and ClickHouse content to provide relevant answers, and will provide sources so you can learn more. This is experimental, so please let us know if you have any feedback!

(To keep with the AI theme, today's changelog image has been drawn by a talented robot.)

Create blank Data Sources in the UI

Create blank Data Sources in the UI without sending any data or using the CLI.

Add a new Data Source using the normal UI flow and select the 'Write schema' option. You can write the schema manually in the text editor modal that appears.

Update your Kafka Connector settings in the UI

Update the settings for an existing Kafka connection without leaving the UI.

Open the list of Kafka connections, click on the 3-dot menu for the connection you want to update, and select 'Edit settings'. You can update the connection name, Kafka brokers, authentication credentials, SASL mechanism, and schema registry settings.

Error logging for BigQuery & Snowflake Connectors

Use datasources_ops_log to find errors for any BigQuery or Snowflake jobs that go wrong.

You can query datasources_ops_log like any other Service Data Source to find information about your Tinybird usage. It now includes detailed error messages for any BigQuery or Snowflake jobs that fail.

See when your MVs have errors

If a Materialized View has an error, you'll see a red dot next to it in the sidebar.

If you see a red dot next to a Materialized View in the sidebar, click on the MV to see more details about the error. To dismiss the notice, you'll need to open the MV and click the 'Dismiss' button.

Multiple Auth Tokens in .datasource files

Use multiple TOKEN lines at the top of a .datasource file to create multiple Auth Tokens.

Previously, you could only use one TOKEN line at the top of a .datasource file to set the Auth Token. You can now add as many TOKEN lines as you need to create multiple Auth Tokens.

Other notable changes

  • Better Kafka Connector error messages
  • Fixed various formatting errors when displaying data in the UI
  • Show external ID in the S3 Connector CLI output

Testing Pipe parameters

You can now easily test your Pipes with custom parameters from inside the UI.

Tinybird Pipes allow you to extend your SQL with templating, making it possible to use conditional logic, or add dynamic parameters. When you add dynamic parameters, you often want to test the execution of the query with different parameter values.

Previously, this would usually require publishing the query as an API Endpoint, and then modifying the URL query parameters.

With this new feature, you can now supply custom values and test query execution without leaving the UI.

Happy new year!

Before we kick into the first changelog of 2024, let us wish you a Happy New Year!

Looking back, 2023 has been an explosive year for Tinybird and our users. Together, we served over 45 billion API requests to power some amazing applications. In our creator community, our friends at Dub.co tracked over 18 million link clicks, while OpenStatus hit 1 million requests per day!

We can’t wait to see what people will build with Tinybird in 2024 ❤️

Compression details

You can now view the detailed breakdown of compression statistics per-column for every Data Source.

Tinybird automatically compresses ingested data, achieving significant reductions in data storage. The exact compression ratio depends on the data and column type, but until now, it hasn’t been as easy as it should be to identify opportunities to optimize storage. With this new feature, you can now see the compression ratio for each column, and exactly how much storage the column consumes, for every Data Source. You can access this in the UI on the Schema tab of the Data Source details page.

Viewer role for read-only access

A new ‘Viewer’ role has been added for read-only permissions on a Workspace.

The new ‘Viewer’ role allows users to join a Workspace in a read-only capacity: they can view queries and their results, but can't edit resources. We had several requests from teams that wanted to give users access to Tinybird for visibility, without the need for them to make any changes.

AWS EU Central is now public!

Tinybird is now available in the AWS eu-central-1 (Frankfurt, Germany) region for all Free and Pro users!

In December last year, we launched our first public AWS region in the US East, and promised that we’d follow up with an EU region shortly. Well, here it is! This complements our availability in GCP, giving Tinybird an EU and US region in both AWS and GCP.

Other notable changes

  • v2.1.0 of the CLI fixes several issues, improves performance of various operations and adds some new safety checks for dangerous commands
  • Fix tb auth unexpected behavior when switching between Workspaces in different cloud providers

CLI 2.0.0

The CLI 2.0.0 release brings some important breaking changes: consistent error codes, and the removal of prefixes and the tb pipe create command.

You can find the release on PyPI.

if(failure) return 1;

All CLI commands now return status code 1 on failure.

Several CLI commands didn't return 1 on failure, which meant that they could silently fail while reporting success. This added some friction to building great CI processes, so we're ensuring that, going forward, all commands are consistent in their return values.
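
In a CI script, consistent exit codes mean a single check is enough to stop the pipeline. The sketch below shows that pattern with a hypothetical step_succeeded helper; a Python one-liner stands in for a failing command so the example runs without the Tinybird CLI installed.

```python
import subprocess
import sys


def step_succeeded(command: list[str]) -> bool:
    """Run one CI step and report whether it exited with status 0."""
    return subprocess.run(command).returncode == 0


# Stand-in for a failing CLI command: any process that exits with status 1.
failing_step = [sys.executable, "-c", "import sys; sys.exit(1)"]

if not step_succeeded(failing_step):
    print("step failed, stopping the pipeline")
```

In a real pipeline you would pass the tb command you want to guard instead of the stand-in, and exit the script when the helper returns False.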

Removed prefixes

All prefix commands have been removed from the CLI.

We first announced the deprecation of prefixes in October. Since then, we've been in contact with all active prefix users to help them migrate off.

Prefixes were the original way to manage different projects in Tinybird, before it was possible to create multiple Workspaces. Using Workspaces has been the preferred workflow since their release, but prefixes continued to be supported for legacy users. From this release, we are saying goodbye to prefixes for good.

Removed tb pipe create

The tb pipe create command has been removed from the CLI.

This command was deprecated & hidden from the CLI some time ago, as it added no benefit over simply creating a .pipe file yourself. Also, no one was using it 🤷‍♀️

AWS US East is now public!

We are excited to announce that Tinybird is now available on Amazon Web Services (AWS) for self-service Free and Pro plans. Customers can now choose to build real-time data products with Tinybird in AWS US East (us-east-1), with EU regions coming soon.

Read more.

EXPLAIN goes GA

Every query Node within a Pipe now has an Explain link, giving you access to the full EXPLAIN plan of your query.

Tinybird already exposes a lot of information to help engineers tweak and optimize their queries, like showing you how much data was processed, automatically detecting full scan queries and providing the full observability log. However, when you want to tease out the last few percentage points of optimization, nothing beats the EXPLAIN plan. The EXPLAIN plan shows exactly how the underlying database is building and executing the query.

This feature is now generally available in the Tinybird UI, and should be expanded to the API and CLI soon.

Other notable changes

  • Added Map column type for NDJSON ingest
  • Improved error messages when calling the wrong API base URL for your Auth Token
  • Improved documentation for region-specific API base URLs

Updated summary page for Materialized Views

Published Materialized Views now have a summary page showing various details and metrics, such as their processed data, duration and errors, as well as a visual display of the Materialized View's data flow lineage.

When you publish an API Endpoint in Tinybird, you can access a summary page via the UI to see the API Endpoint’s requests, latency, errors and access its logs. It’s a great way for an engineer to get quick insights into an API Endpoint. However, when publishing a Materialized View, these insights were only available by analyzing the data in the Service Data Sources, which takes a little more effort. The new Materialized View summary page makes the most important stats available immediately for quick monitoring.

Other notable changes

  • The Tokens page has been updated to show more details and now supports bulk actions and filters
  • -SimpleState has been removed from the automatic query optimizer for Materialized Views. You can still use this modifier when manually defining Materialized View queries.
  • When using Versions, Guest users are now more restricted in read-only Environments
  • Improved some UI elements where long resource names could break formatting
  • Improved handling of optional API parameters in auto-generated OpenAPI specs
  • On the Data Source details page, renamed the Graph tab to Data flow to match the side nav
  • Pipe queries that include full scans over a Data Source continue to be a common anti-pattern that affects the performance of API Endpoints. We’ve improved the design and added more information to the tooltip shown when we detect a query with a full scan
  • Lots of improved error messages, modals and tooltips

Full screen query editor

You can now expand the Node query editor into a full screen view, making it easier to work with longer queries.

Tinybird Pipes allow you to break your queries down into small, chainable chunks that are much easier to build, share and maintain. However, there is a balance: no one wants 100 single-line queries chained together. There are no rules on how long your queries should be, or how many Nodes you must use, but we’ve found many users still have queries of between 10 and 50 lines in a single Node. The way Nodes are displayed today means that these longer queries are cut off and require scrolling, which can disrupt the development flow. The new full screen editor gives you much more space to work in, can be split horizontally or vertically, and should make longer queries much easier to work with.

New sidebar navigation

The sidebar has been redesigned to provide better organization of your Tinybird resources.

As data projects grow, the number of Pipes and Data Sources can become overwhelming. In the previous design, it could become hard to find the resources you wanted as there was too much noise in the sidebar. Resources are now collapsed under their grouping (e.g., Pipes, Data Sources) and can be expanded into a table view in the main UI area. This new view also includes search and stats, making it easier to filter for the resources you care about right now.

Other notable changes

  • Better handling for query templating syntax errors, with error messages that now point to the Pipe and Node that contains the error
  • Fixed a bug when sending Int types over the Events API where the field name and JSONPath don’t match
  • The rows_before_limit_at_least field in an API Endpoint’s JSON response is now marked as optional in the OpenAPI spec

CLI 1.0.0 Stable

The Tinybird CLI is an essential tool for our customers. It has been the preferred way to interact with Tinybird APIs for many of our customers. With our recent Versions release, it has become the indispensable “command center” for version control and CI/CD operations. The importance of the CLI in our users’ daily operations cannot be overstated, and we recognize that bugs and instability in the tool can limit their productivity in our platform.

We are excited to announce our release of a stable version of the Tinybird CLI. The stable version is live on PyPI, and you can upgrade with pip:

pip install tinybird-cli==1.0.0
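If you want to confirm which version ended up in your environment after running the pip command above, a minimal sketch using only the Python standard library could look like this. The helper name `installed_version` is made up for illustration; the package name `tinybird-cli` is taken from the pip command.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "tinybird-cli" is the package name used in the pip command above.
    found = installed_version("tinybird-cli")
    print(found if found else "tinybird-cli is not installed in this environment")
```

Run it with the same interpreter you used for pip, otherwise it may inspect a different environment.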

Other notable changes

  • Query parameters are now fully supported when using the Query API
  • Lists can now be used as default values in the templating Array function
  • You can no longer use an Endpoint Pipe for a Copy operation; use Copy Pipes instead
  • Auto-generated OpenAPI specs now also include POST endpoints
  • The TOO_MANY_SIMULTANEOUS_QUERIES error has been replaced with a more useful & actionable error message

Versions

The way you work with data is getting an upgrade. Iterating real-time data pipelines is very, very hard. And we’ve learned through trial, error, and long hours of customer support the patterns and practices that make it easier. Now, we’re enforcing those patterns, because the same techniques that software engineering teams use to collaborate on and deploy software products must also apply to real-time data products.

Read more.

Playground

The Playground provides a scratchpad for exploring and prototyping new queries. When an Environment is marked as read-only, the Playground can be used to safely write ad-hoc queries without requiring a new development Environment or impacting production.

EXPLAINs

EXPLAIN queries are now supported, allowing you to view the detailed query execution plan for your Pipe queries.

Command palette with CMD+K

A new command palette is accessible via the cmd+k hotkey. Use this to access a brand new search engine for your Tinybird resources, and navigate through the Tinybird UI using only the keyboard.

Region selector

The region selector has been extended. You can now select a cloud provider, see more region options, and request new regions without contacting support.

Deprecations 🚨

Python 3.7

Python 3.7 has been officially deprecated and is no longer supported by the Tinybird CLI.

Prefixes

Prefixes have been discouraged since the introduction of Workspaces. They are now officially deprecated and will no longer be supported. Most users moved away from prefixes a long time ago, so the impact should be minimal, but if you are still using prefixes and need support, please reach out in the Slack community!

Other notable changes

  • Ingesting CSVs now supports GZIP compression. Files must have the .gz extension.
  • The date and timestamp reserved words are now allowed as column names when using the Snowflake connector
  • Fixed a bug that caused Node names not to re-render after the Node name was changed
  • Improved the error message when pushing a Copy Pipe from the CLI that contained invalid SQL
  • Introduced a banner to remind users to save changes when modifying Auth Tokens
  • Added the ability to duplicate an Auth Token
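For the GZIP item above, the one stated requirement is that the compressed file carries the .gz extension. A minimal sketch of producing such a file with the Python standard library follows; the helper name `gzip_csv` and the file name in the usage note are placeholders, not part of Tinybird.

```python
import gzip
import shutil
from pathlib import Path


def gzip_csv(csv_path: str) -> Path:
    """Compress csv_path into a sibling file whose name ends in .gz."""
    source = Path(csv_path)
    # events.csv -> events.csv.gz, so the original extension stays visible.
    target = source.with_name(source.name + ".gz")
    with source.open("rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

Calling `gzip_csv("events.csv")` would write `events.csv.gz` next to the original file, ready to be ingested like any other CSV.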

S3 connector

To bridge the gap between Amazon S3 and real-time applications, we’ve launched the Amazon S3 Connector. This first-party connector simplifies the process of ingesting data from files in Amazon S3 into Tinybird. Once configured through either the UI or the CLI, Tinybird can connect directly to your S3 bucket, discover files and ingest them into a Data Source, so you can publish APIs from your files.

Other notable changes

  • Accessing observability data from internal Service Data Sources is now free of charge
  • Brand new Workspace settings modal, which now has much more information and is better organized for clarity
  • Bug Fix - sometimes the UI would show the wrong chart for the BigQuery connector
  • Bug Fix - BigQuery previews would sometimes fail
  • Bug Fix - changing the name of a Pipe to use the same name as another Pipe stopped displaying an error in the UI
  • Bug Fix - some edge cases where Copy Pipes would not detect compatible schemas with a Data Source
  • Quality of Life - the Token API /delete endpoint can now delete tokens by name or ID, not just by the token content
  • Quality of Life - the Tinybird region (EU/US) has been added to the Auth Token payload, so you can now use tb auth from the CLI and it will auto-detect which region to authenticate against

Copy Pipes and lots of QOL

After two back to back launch weeks, we took some time off from publishing a changelog to catch our breath - but we’re back! And we have some amazing things to share with you.

In June, we shipped some major new features, some QOL improvements, and over 100 minor bug fixes, tech debt resolutions, and security upgrades.

Copy Pipes

Copy Pipes let you sink the results of a Pipe into a Data Source either on demand or on a schedule. You can use them to create point-in-time snapshots, consolidate the results of CDC, finalize deduplication and more. Read the announcement here.

Other notable changes

  • Although you might not notice it directly, we rebuilt a core piece of our architecture that handles the high-availability & failover capabilities for our Events API and the API Endpoints you publish. The old architecture was still letting us achieve over 99.9% availability, but this change will see us into the future. You can always monitor our availability at status.tinybird.co
  • The Tinybird VS Code extension has had a huge upgrade.
  • Our ClickHouse team has been working hard to improve support for Parallel Replicas which will make Tinybird scale effortlessly into a trillion-row future.
  • When using the Kafka connector, you can now choose to include any Kafka headers along with the event message.
  • Any API Endpoint can now be explored in the Time Series UI
  • We added support for Map types, making it easier to ingest data with complex dictionary-style data.
  • The email notifications about Workspace consumption have been improved, to make it easier to be alerted about changes in your Workspace without needing to log in.
  • We’ve improved a wide range of error messages in the platform to be more consistent & helpful, particularly around Kafka & Parquet files.
  • For our Enterprise customers with dedicated infrastructure, we’ve merged support for isolated Kafka consumers to allow for more secure & predictable access to disparate Kafka clusters.
  • Tinybird’s color palette has been updated! Our green & blue hues in the UI have been updated with richer colors that improve contrast and make the UI more accessible.
  • You can now drag & drop your Tinybird resource files (.pipe, .datasource, etc.) onto the UI to create them without the CLI.
  • From the CLI, you can also delete Tinybird resources using the resource files, e.g. tb pipe rm my_pipe.pipe.

Massive update after a busy launch week #2

Here are the 3 things you can't miss

Other notable changes

  • Made it easier to connect your Confluent Kafka Streams using the Confluent connector.
  • Data Copy operations are now atomic which ensures data is copied when it should be copied.
  • Improved error feedback on the CLI when creating a new Workspace fails and when pushing a Pipe with an empty SQL Node.
  • Ensured that the OpenAPI schema we generate when creating an API Endpoint is now valid as an OpenAPI 3.0 schema.
  • We added the option to select advanced settings like the engine or the sorting key when creating a Data Source from the UI.
  • Destructive actions in the UI now require an extra confirmation step. No more unintentional breaks.
  • We made some usability improvements in the Auth Tokens page.
  • We fixed some inconsistencies in the number of Workspaces a Data Source is being shared with.
  • We fixed the date filter in the Time Series public view.

Au revoir Recently Used

Other notable changes

  • Added a new API method to add/modify schedule operations to Copy Pipes.
  • Improved UX when refreshing a token to avoid undesired changes.
  • Fixed a bug that made datasources_ops_log inaccessible from the UI under some circumstances.
  • It is now possible to unshare a Data Source from the CLI.
  • Fixed cURL issues in the Organizations Monitoring and API Endpoint sample usage.
  • Fixed a bug that made the Data Sources modal total storage calculation wrong when having Shared Data Sources.

Tinyfixes all around

Notable fixes and improvements

  • Added support for UUID and Datetime Avro logical types in the Kafka Connector.
  • Fixed a bug that made it impossible to delete a published API Endpoint Node with the CLI.
  • Fixed the Node fullscreen view within the Pipe editor.
  • Added a link to the BigQuery Connector guide from the UI.
  • Added Table metadata to the BigQuery Connector UI so you can see how big a table is before connecting it.
  • Fixed a bug that prevented the API Endpoint view from being opened in split screen.
  • Fixed the styles of the invite actions dropdown within the Settings modal.
  • Renaming a Node in the UI erroneously triggered a warning saying the name already existed in the Pipe. We’ve fixed this too.
  • Fixed the links on the pro tips (how did we miss this one?)
  • Reversed the order of the jobs on the BigQuery Data Source UI.
  • Fixed the styles of the Data Source type change actions dropdowns.
  • Scheduling metadata is now being sent via the Pipe API.
  • The platform now runs on Python 3.11. You should expect better performance. The CLI is already compatible with it too.

Launch week’s hangover!

Our new BigQuery connector is in GA

Google BigQuery users now have an easy and reliable method to bring their data to Tinybird where they can query and shape it using SQL, and publish it as low-latency APIs to be consumed by their applications.

Tinybird Organizations is available for all enterprise customers

Our newest feature for enterprise customers lets you monitor usage on multiple Workspaces and manage their projects across Tinybird in a single dashboard.

Scheduled Copy API is already available in private beta

Tinybird has always made it possible to ingest relentless streams of data, query and shape them using SQL, and publish them as high-concurrency, low-latency APIs for use in your applications. With Scheduled Copy, you can sink the results of your queries into another Tinybird Data Source.

The Tinybird x Vercel integration is live

Use it to sync your Tinybird Workspaces with your Vercel projects to build the real-time, serverless analytics you've always wanted.

The Tinybird Grafana plugin is here

Just build fast charts, faster.

A new way to copy data within Tinybird is in beta preview

The Scheduled Data Copy API is more powerful than you’d think

A few customers are already using the beta version of the Scheduled Data Copy API to generate data snapshots, do deduplication at scale, and perform other operational duties, such as copying the last month of data to a development Data Source, or moving data out of quarantine. If this rings a bell, let us know and we will give you access to the beta preview.

Other notable changes

  • The guided tour can now be resumed if cancelled.
  • It is possible for Organizations to authenticate with more than one email domain.
  • When working on a Pipe, you can now see the output of other Nodes in split screen mode.
  • Workspace guests can no longer share Data Sources with other Workspaces. Only Workspace admins can do this now.
  • API Endpoint statistics can be analyzed within the Time Series UI with a single click.

Welcome home Organizations

Get a bird's-eye view of your global usage

Track your usage commitments and manage your Workspaces and Users in a single place. Also get global observability metrics for all your Workspaces at once. If you use the Enterprise edition of Tinybird, let us know and we will activate it for you :)

Extra ball! Supercharge your CLI workflows with tb diff

The diff command allows you to compare files in your local environment with the remote files in your Tinybird Workspace. Find out more in the CLI docs, and remember to update to the latest CLI version before using it.

Other notable changes

  • We’re including a confirmation modal when changing the TTL. We want to avoid unnecessary data loss.
  • User tokens can be refreshed (finally!).
  • You can now explore specific API Endpoint metrics in the Time Series directly from the API Endpoint page (just clicking on the metric’s name).
  • We’ve fixed a bug that allowed you to edit a query in a Node while the query was running.
  • Improved the experience when the creation of a Materialized View from the UI was giving a timeout.

New BigQuery connector in Private Beta

Start accelerating your BigQuery data now

Some of our users are already using the new BigQuery connector for their production applications. Request access to the private beta to try it:

  • It’s configurable via the CLI and the UI.
  • It allows you to keep your data in sync.
  • It gives you a detailed log of every sync operation.
  • Your data remains secure.
  • It is fast.
  • It comes at no extra cost.

We'd love to hear which connectors you would like to have available in Tinybird. Just let us know!

Enterprise clients are getting a new home (soon)

Companies with many Workspaces and many users are going to have a better way to manage their permissions and track their usage commitments with our new Organization UI. Interested in learning more? Let us know!

Other notable changes

  • You can now specify the ENGINE settings in Kafka Data Sources allowing you to set your own Sorting Keys or other advanced parameters.
  • We’ve added new functions to calculate differences between dates. Now you can calculate differences in seconds, minutes, hours and days.
  • All users within a Workspace should be able to see the Workspace consumption stats.
  • We’ve done some optimizations that should make the Data Sources List view load faster.

Introducing the new Data Ingestion Experience

Ingestion UI enhancements

  • We've added easier access to all supported ingestion formats.
  • We’ve revamped the different ingestion options UI to make it easier to follow.
  • We’ve added a shortcut for ingesting data streams from Confluent cloud.
  • Users can now suggest new connectors through the UI.
  • The new BigQuery connector is available in private beta. Let us know if you want to give it a try.

Other notable changes

  • Changes in the Data Source schema when previewing a newly created Data Source were being ignored under some circumstances. It shouldn’t happen again.
  • Fixed a problem with datasources_ops_log not being updated after re-pushing a Materialized View.
  • Fixed an issue in the Auth tokens modal that prevented the Read / Write options from being displayed correctly.
  • Fixed a couple of bugs in the Data Flow view related to wrong rendering of Data Sources metadata.
  • Fixed a bug that apparently allowed the creation of tokens with no scopes but raised a silent error.
  • When using tb fmt to format your .datasource file we had a bug that removed the Data Source description. Now, the description is kept.

Notable fixes and improvements

CLI

  • New guided process to create a Workspace using the CLI. Also improved inline help for Workspace creation. Go try ‘tb workspace create --help’
  • We’ve fixed an issue with the ‘tb fmt’ CLI command to format queries. If your Pipe had versioning, it was removing the VERSION 0 line on formatting.

Data Sources and Data Sources UI

  • It is now easier to inspect the columns of the Service Data Sources when working from the Pipe editor or Time Series editor.
  • We are now displaying the pulse graph also for materializations coming from Kafka Data Sources.
  • Users won’t be offered the option to delete Shared data sources under the "Clean up" section.
  • Now you can modify existing JSONPaths in data source columns, or add new columns with JSONPaths using the data source alter API.
  • Improved error messages when problems arise adding a column to the Data Source via the UI.
  • The UUID type is now available to choose when defining a new Schema via the UI.

Other notable changes

  • We’ve introduced a 30-day option in the Time Series granularity selector.
  • Added support for arbitrary (non UTF-8) bytes in Kafka message keys when using Schema Registry.

We are SOC 2 Type II compliant

Tinybird has successfully completed its SOC 2 Type II audit, conducted by Ernst & Young Global Limited, affirming the effectiveness of our security processes and controls.

This is one of the many measures we are taking to ensure the continuous security, integrity and availability of the Tinybird platform. We will be audited again in early Q2 2023 to ensure our security procedures continue to evolve as we evolve our platform.

Learn about security and compliance at Tinybird.

Other notable changes

  • Tinybird Docs continue to improve! Revamped our Introduction to Tinybird along with a Main Concepts section, quickstarts for Tinybird’s CLI and UI, a revamped structure for Guides as well as this new Changelog that you are reading :-)

  • We’ve updated the editor keyboard shortcuts and fixed some problems when using the shortcuts to query data from a Node.

Tinybird Shortcuts

  • We have completely revamped how Data Replace jobs work internally, which brings improved stability and reliability to the process.

  • Added a new check that prevents the creation of a Materialized View that produces more than 10 partitions. This significantly reduces the risk of getting a TOO_MANY_PARTS exception at ingestion time when creating a Materialized View with a high cardinality partition key.

  • Solved an issue that prevented deletion or renaming of Data Sources/Pipes with GUID format.
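To get a feel for why the partition check mentioned above matters, the sketch below counts how many distinct partitions different partition keys would produce over the same data. It is only an illustration of key cardinality, not how Tinybird evaluates the check internally; the sample dates and the helper names `count_partitions` and `exceeds_limit` are hypothetical, and only the limit of 10 comes from the entry above.

```python
from datetime import date, timedelta

MAX_PARTITIONS = 10  # limit quoted in the changelog entry above


def count_partitions(days, key):
    """Count the distinct partition values that `key` produces for the given dates."""
    return len({key(d) for d in days})


def exceeds_limit(partitions, limit=MAX_PARTITIONS):
    """True when the partition count is above the limit."""
    return partitions > limit


# Hypothetical sample: one row per day for the whole of 2022.
days = [date(2022, 1, 1) + timedelta(days=i) for i in range(365)]

by_year = count_partitions(days, lambda d: d.year)              # yearly key
by_month = count_partitions(days, lambda d: (d.year, d.month))  # monthly key
by_day = count_partitions(days, lambda d: d)                    # daily key

for name, total in [("year", by_year), ("month", by_month), ("day", by_day)]:
    print(f"partition by {name}: {total} partitions, over limit: {exceeds_limit(total)}")
```

The finer the key, the more partitions a single populate or insert has to touch, which is the situation the new check is meant to catch early.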

Observability for the BI connector

Tinybird’s BI Connector is a PostgreSQL-compatible interface to data in Tinybird.

All Data Sources and published Pipes created in Tinybird are available as standard Postgres tables when you connect a BI tool to Tinybird, such as Tableau, Apache Superset or Grafana.

Until recently, BI usage was not available to track. We have now made two new Service Data Sources available that show stats about the BI Connector consumption: bi_stats_rt (real time) and bi_stats (aggregated per day and per query).

This information is also visible in a new information module in the dashboard for the BI Connector (only visible when active) and the Query API (visible to everyone).

Other notable changes

Increased observability capabilities in Service Data Sources

For API Endpoint requests as well as for calls to the Query API you can access the amount of read data and rows, duration of the API call and error information (if relevant) in the pipe_stats_rt Service Data Source, as well as what parameters were used on each request. Now, you can do the same thing to monitor ingest and materialization usage.

Read and written bytes and rows are now also available in the datasources_ops_log Service Data Source for every ingestion operation, whether that’s new data ingested, materializations or replaces.

And lastly, it’s now possible to track your total storage in the new datasources_storage Service Data Source.

Remember that these service Data Sources can be queried from your Pipes so you can easily create API Endpoints or Time Series visualizations to analyze your own usage.
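As a rough sketch of analyzing your own usage, the snippet below builds a Query API URL for an aggregate over pipe_stats_rt. Several things here are assumptions for illustration: the host `https://api.tinybird.co` (your region may differ), the column names `pipe_name` and `duration`, and the helper name `query_api_url`. The snippet only builds the URL and does not send any request.

```python
from urllib.parse import urlencode

# Assumed base URL; your Workspace may live on a different regional host.
API_HOST = "https://api.tinybird.co"

# Column names are assumptions for illustration; check the Service Data
# Sources documentation for the actual schema before relying on them.
USAGE_SQL = """
SELECT pipe_name, count() AS requests, avg(duration) AS avg_duration
FROM tinybird.pipe_stats_rt
GROUP BY pipe_name
ORDER BY requests DESC
""".strip()


def query_api_url(sql: str, host: str = API_HOST) -> str:
    """Build a Query API URL that asks for the result of `sql` as JSON."""
    return f"{host}/v0/sql?{urlencode({'q': sql + ' FORMAT JSON'})}"


if __name__ == "__main__":
    # Send this URL with an Authorization: Bearer <token> header.
    print(query_api_url(USAGE_SQL))
```

The same SQL could instead live in a Pipe Node and be published as an API Endpoint, which is the approach the paragraph above suggests.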

Other notable changes

  • The Kafka connector now supports ingestion of messages encoded with non UTF-8 message keys. Until now, we were assuming that everyone would use UTF-8 keys in their messages but… apparently not!
  • Improved performance for ingestion of large JSON and Parquet files, specifically for files with a large number of columns.
  • Also fixed schema detection issues when importing large Parquet files from a URL. Now, we parse the whole file and use the actual schema defined in the footer, instead of trying to infer the schema as we do with other schemaless formats. As a result of this change, the whole import process will be more accurate and predictable.
  • Added support for "bytes" type in Parquet files.
  • Fixed certain situations that were causing Materialized View populate jobs to appear stuck, along with various improvements to progress tracking.
  • If the underlying ClickHouse process was restarted while a Populate job was waiting for a query to finish, it would cause the job to get stuck and never finish. Now solved.
  • populateview entries in the Service Data Source tinybird.datasources_ops_log were not being properly registered when pushing a Pipe with force=true. They are now.
  • When creating a Materialized View, the columns in the GROUP BY need to match those in the Sorting Key of the destination data source. Tinybird validates that this is the case when you create a Materialized View, but the validation was being triggered incorrectly in some instances.
  • Fixed an issue that was exposing some internal error information when some operations on the Events API failed. Now we return an error code and a generic "Internal Tinybird error" message without leaking any unnecessary internal information.
  • Now, the CLI shows warnings with tips when uploading a materialization Pipe. These warnings link to documentation where you can learn how to improve your queries. Right now, there are only a few tips, but we will be adding more.
  • We've released a new homepage for Tinybird’s Documentation.

Notable fixes and improvements

  • The “pulse” graph, the chart that displays real-time ingestion stats for every data source, is now also available for Materialized Views. This helps get a sense of how many rows are being materialized at any given time.
  • We’ve introduced brushing for date range selection in the Time Series UI.
  • Previously, when adding new columns to existing data sources in the UI, rows not containing them used to go into quarantine. Now, when new columns are added, the suggested type by default is ‘Nullable(String)’.
  • We’ve fixed an issue where a "Data Source not found" state would prevent you from closing the right panel for the Data Source preview view.
  • Also fixed an issue that was causing the Data Source preview to close when hitting escape within the code editor.
  • Made all buttons highlight states consistent across the platform.

Buttons highlight states

Events API

While Kafka remains a popular way to ingest events data, sometimes our users just want a super simple way to send an events stream to Tinybird. And there’s nothing simpler than sending events through HTTPS via our events API. Just copy the snippet, run the script, and see your data instantly show up in the Workspace for you to start querying.
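The snippet offered in the UI is the quickest route, but for reference, here is a minimal sketch of what sending events over HTTPS can look like from Python. The host, the Data Source name `page_views`, the token value and the helper name `build_events_request` are all placeholders; the sketch assumes events are posted as newline-delimited JSON to `/v0/events` with the Data Source passed as the `name` parameter.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_events_request(events, datasource, token, host="https://api.tinybird.co"):
    """Build (but do not send) a POST request carrying events as NDJSON."""
    # One JSON document per line, which is what NDJSON means.
    body = "\n".join(json.dumps(event) for event in events).encode("utf-8")
    url = f"{host}/v0/events?{urlencode({'name': datasource})}"
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


if __name__ == "__main__":
    request = build_events_request(
        [{"path": "/home", "user": "a"}, {"path": "/pricing", "user": "b"}],
        datasource="page_views",  # placeholder Data Source name
        token="<your token>",     # placeholder token
    )
    print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would actually send the events; it is left out here so the sketch runs without credentials.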

CLI updates

One of our core beliefs at Tinybird is that you should be able to build with data like you build with code. And that means being able to run tests. We’re excited to announce our first few feature launches to enable testing in Tinybird data projects. In particular, the CLI has new commands plus some added functionality, including:

  • tb test to add tests to files
  • New metrics about API Endpoint response times (max, min, mean, median and p90) on the Pipe command tb pipe regression-test
  • tb workspace clear now deletes all files in the Workspace. Given that this command drops all the resources inside a project, please use it with care!
  • tb pipe publish to change which Node of the Pipe is published as an API Endpoint.
  • tb check to verify query syntax.

You can find all the latest changes in the command-line updates.
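To make the response-time metrics listed for tb pipe regression-test concrete (max, min, mean, median and p90), here is a small sketch that computes them from a list of timings. It is not the CLI's implementation: the helper name `summarize_response_times` and the sample timings are made up, and the p90 uses the nearest-rank method, which is only one of several common percentile definitions.

```python
import statistics


def summarize_response_times(times_ms):
    """Return max, min, mean, median and p90 for a list of response times."""
    if not times_ms:
        raise ValueError("times_ms must not be empty")
    ordered = sorted(times_ms)
    n = len(ordered)
    # Nearest-rank p90: smallest value with at least 90% of samples at or below it.
    p90_rank = (9 * n + 9) // 10  # integer form of ceil(0.9 * n)
    return {
        "max": ordered[-1],
        "min": ordered[0],
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p90": ordered[p90_rank - 1],
    }


if __name__ == "__main__":
    # Hypothetical timings in milliseconds.
    sample = [120, 95, 110, 300, 105, 98, 102, 130, 115, 101]
    for metric, value in summarize_response_times(sample).items():
        print(f"{metric}: {value}")
```

In this sample a single slow request of 300 ms pulls the mean well above the median, which is why looking at several of these metrics together is more informative than any one of them.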

ClickHouse

Our work to allow decimal values in seconds (for example, max_execution_time=0.5) was merged.

Data Source descriptions and beta testing of Parquet ingestion

Data Source descriptions

In a large Workspace that contains many Data Sources, you may want more information than just the name of the Data Source. Documentation matters. Now you can add a description to a Data Source like you already do with Pipes, Nodes and API Endpoints.

This feature is available through the UI and the CLI. Any new descriptions will propagate to shared sources.

Early beta support for Parquet files

At Tinybird we aim to capture and transform large amounts of data whatever the origin of the data or format. In addition to CSV and NDJSON, we're working on accepting Parquet format files. Parquet is an open-source, column-oriented data file format designed for efficient data storage and retrieval. It is also commonly used as an interchange format between data tools.

Our team is now testing ingesting data to Tinybird from Parquet. After further testing, this new format for ingestion will be included in our docs. If you'd like to join the beta, reach out to us on Slack.

CLI updates!

tb push --subset - We've added the tb push --subset option to be used with --populate, so you can populate using only a subset of your data (between 0% and 10%). Now you can quickly validate a Materialized View with just a subset of your total dataset. You can check everything is working with a single month’s worth of data even if you have several years’ worth of data.

Data Source description - We've added the option of adding a description for a Data Source, as you already could for Pipes, thereby improving the documentation of your data project.

Endpoints from materialized Data Sources - We've fixed the code so that Nodes whose type is materialized can no longer be published as API Endpoints. Logically, the API Endpoint should depend on the target Data Source of the materialized Node, not the Node itself.

Check out the latest command-line updates in the changelog.

ClickHouse improvements

groupSortedArray - This new aggregation function was added to ClickHouse. groupSortedArray(n)(param1) returns an array with the n first values from a field param1 sorted by itself. groupSortedArray(n)(param1, param2) returns an array with the n first values from a field param1 sorted by param2 (field or expression). This aggregation function is useful if, for example, you have a Materialized View with the two most recent values in a field.
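The two call forms described above can be mimicked in a few lines of Python, which may help when reasoning about what the aggregate returns. This is only an emulation of the described behavior, assuming ascending sort order; the function name `group_sorted_array` and the sample data are made up, and it says nothing about how ClickHouse implements the function.

```python
import heapq


def group_sorted_array(n, values, sort_keys=None):
    """Return the first n values, sorted by themselves or by parallel sort_keys."""
    values = list(values)
    if sort_keys is None:
        # One-argument form: values sorted by themselves.
        return heapq.nsmallest(n, values)
    # Two-argument form: values ordered by a second field or expression.
    pairs = zip(sort_keys, values)
    return [value for _, value in heapq.nsmallest(n, pairs, key=lambda pair: pair[0])]


if __name__ == "__main__":
    print(group_sorted_array(2, [5, 1, 4, 2]))
    print(group_sorted_array(2, ["a", "b", "c", "d"], sort_keys=[30, 10, 40, 20]))

    # "Two most recent values": sort by negated timestamp so newest comes first.
    readings = ["v1", "v2", "v3"]
    timestamps = [100, 300, 200]
    print(group_sorted_array(2, readings, sort_keys=[-t for t in timestamps]))
```

The last call shows the "most recent values" use case from the paragraph above: under the ascending-order assumption, negating the timestamp is one way to put the newest entries first.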

ASOF join - The ASOF join performance improvement was included in Tinybird. This join was improved by the ClickHouse community to be twice as fast.