---
title: Evolve data sources
meta:
  description: Evolve your data sources in Tinybird.
---

# Evolve data sources

After you've deployed your project, you can evolve your data sources. For example, you might need to add a new column, change the data type of a column, or change the sorting key. Tinybird handles the data migration.

## Types of changes

You can evolve your data sources by editing one or more of the following:

- Landing Data Source schema.
- Landing Data Source engine settings.
- Materialized Data Source schema.
- Materialized Data Source engine settings.
- Materialized View query logic. See [Altering Materialized
Views](/forward/work-with-data/optimize/materialized-views#altering-materialized-views).

## How Tinybird evolves data sources

Tinybird uses one of two methods to evolve a data source during deployment:

  - Rewrite and backfill. Tinybird creates a new version of the Data Source with the updated schema or settings and backfills data from the live deployment using one of the following methods:                   
    - **Materialized View query**: Re-runs the Materialized View query that writes to the Data Source to repopulate it from the upstream Data Source.                                                               
    - **Forward query**: Uses a [`FORWARD_QUERY`](#forward-query) instruction you provide to transform and 
  repopulate the data.
- `ALTER`. Tinybird applies the change in place without rebuilding and repopulating the data source.

Tinybird automatically chooses the method based on the changes in your `.datasource` files.

### Automatic ALTER operations

The currently supported automatic `ALTER` operations are:

- Adding a new column to a data source.
- Removing a column from a data source.
- Making a column nullable.
- Adding a data source TTL.
- Modifying a data source TTL.
- Removing a data source TTL.
- Adding an index to a data source.
- Dropping an index from a data source.

Data source changes evolved with `ALTER` are applied only when the deployment is promoted to live. They aren't applied to the staging deployment, so you won't see those changes while testing in staging.

## Landing data source schema

When you make changes to the schema of a landing data source, such as adding or editing columns or changing a data type, you can follow these steps:

1. Check that Tinybird Local is running and your Workspace has the project deployed. If not, `tb local start && tb deploy`.
2. Edit the .datasource file to add the changes. See [SCHEMA instructions](/forward/dev-reference/datafiles/datasource-files#schema).
3. For [automatic `ALTER` operations](#automatic-alter-operations), Tinybird applies the change automatically. For changes that require a rewrite and backfill, add a [forward query](#forward-query) instruction to tell Tinybird how to migrate your data.
4. Run `tb deploy --check` to validate the deployment before creating it. This is a good way of catching potential breaking changes.
5. Deploy and promote your changes in Tinybird Cloud using `tb --cloud deploy --check`.

If the change requires a rewrite, Tinybird Cloud automatically populates the new table following the updated schema.

If a deployment fails, Tinybird automatically discards the staging deployment and maintains the live version.

If the change can be applied with `ALTER`, Tinybird uses that method instead of rewriting and backfilling.

- You don't need to add a `FORWARD_QUERY`.
- Tinybird doesn't rebuild and repopulate the data source.
- The changes become available only after promotion, not in staging.

If you add a `FORWARD_QUERY` to a data source that could otherwise be evolved with `ALTER`, Tinybird performs a full rewrite and backfill instead. This is useful when you want to validate the schema change in staging before promoting, since `ALTER` changes aren't applied in staging.

{% callout type="warning" %}
If a `FORWARD_QUERY` in your `.datasource` file can overwrite existing column values during a backfill, for example by using default values for columns that already contain data, Tinybird shows a warning. This can happen when a `FORWARD_QUERY` from a previous deployment is still present in the file. Remove the `FORWARD_QUERY` after promotion if it's no longer needed to avoid unexpected data overwrites in future deployments.
{% /callout %}

### Forward query

If you make changes to a .datasource file that are incompatible with the live version, Tinybird needs a forward query to transform data from the live schema to the new one.                                                                              
For most changes, Tinybird auto-generates the forward query. Run `tb build` to generate the query and copy it into your `.datasource` file. Auto-generated forward queries handle the following changes:
  - Add or remove columns.
  - Change column type, nullable, or codec.
  - Change column comment, default expression, or column-level TTL.
  - Change sorting key, partition key, sampling key, or primary key.
  - Change engine type, engine settings, or engine arguments.
  - Add, remove, or modify indexes.

For changes where the auto-generated mapping can't express the transformation, you must provide an explicit `FORWARD_QUERY`. For example:
  - **Incompatible type conversions**: Changing a `String` containing `"hello"` to `Int64`. Use a custom
  query to handle the transformation with functions like `coalesce`, `accurateCastOrDefault`, or a default
  value.
  - **Column renames**: Tinybird sees a removed column and a new column. Without a `FORWARD_QUERY`, the old
  column's data is lost and the new column gets a default value.
  - **Split or merge columns**: For example, splitting `full_name` into `first_name` and `last_name`.
  - **Computed columns from existing data**: For example, a new `total` column that should be `price *
  quantity` instead of a default value.

The `FORWARD_QUERY` instruction is a `SELECT` query executed on the live data source. The query must include the column selection part of the query, for example `SELECT a, b, c` or `SELECT * except 'guid', toUUID(guid) AS guid`. The `FROM` and `WHERE` clauses aren't supported.

The following is an example of a forward query that changes the `session_id` column from a `String` to a `UUID` type:

```tb {% title="tinybird/datasources/forward-query.datasource - data source with a FORWARD_QUERY declaration" %}
DESCRIPTION >
    Analytics events landing data source

SCHEMA >
    `timestamp` DateTime `json:$.timestamp`,
    `session_id` UUID `json:$.session_id`,
    `action` String `json:$.action`,
    `version` String `json:$.version`,
    `payload` String `json:$.payload`

ENGINE "MergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(timestamp)"
ENGINE_SORTING_KEY "timestamp"
ENGINE_TTL "timestamp + toIntervalDay(60)"

FORWARD_QUERY >
    SELECT timestamp, CAST(session_id, 'UUID') as session_id, action, version, payload
```

Tinybird runs a backfill to migrate the data to the new schema. These backfills are logged in `datasources_ops_log` with the `event_type` set to `deployment_backfill`.

For large-scale populates during data migrations (exceeding 50GB or 100 million rows), Tinybird automatically uses on-demand compute to run the populate operations on dedicated instances, avoiding resource contention with your production workloads. See [on-demand compute for deployment populates](/forward/test-and-deploy/deployments#on-demand-compute-for-deployment-populates).

If the existing data is incompatible with the schema change, the staging deployment fails and is discarded. For example, if you change a data type from `String` to `UUID`, but the existing data contains invalid values like `'abc'`, the deployment fails with this error:

```bash
» tb --cloud deploy
...

✓ Deployment submitted successfully
Deployment failed
* Error on datasource '<datasource_name>': Error migrating data: Populate job <job_id> failed, status: error
Rolling back deployment
Previous deployment is already live
Removing current deployment
Discard process successfully started
Discard process successfully completed
```

If you're willing to accept data loss or default values for incompatible records, you can make the deployment succeed by using the [accurateCastOrDefault](/sql-reference/functions/type-conversion-functions#accuratecastordefaultx-t-default-value) function in your forward query:

```tb
FORWARD_QUERY >
    SELECT timestamp, accurateCastOrDefault(session_id, 'UUID') as session_id, action, version, payload
```

After changes have been deployed and promoted, if you want to deploy other changes that don't affect that data source, you can safely remove the forward query.

## Landing data source engine settings

When you make changes to the engine settings of a landing data source, such as changing the sorting or partition key, you can follow these steps:

1. In Tinybird Local, be sure you have your project ready. If not, `tb deploy`.

2. Edit the .datasource file to add the changes. No forward query is required. See [engine settings](/forward/dev-reference/datafiles/datasource-files#engine-settings).

3. Run `tb deploy --check` to validate the deployment before creating it. This is a good way of catching potential breaking changes.

4. Deploy and promote your changes in Tinybird Cloud using `tb --cloud deploy`.

If the change requires a rewrite, Tinybird Cloud automatically populates the new table following the new settings.

If the only engine change is adding, modifying, or removing `ENGINE_TTL`, Tinybird applies it with `ALTER`:

- You don't need to add a `FORWARD_QUERY`.
- Tinybird doesn't rebuild and repopulate the data source.
- The TTL change is applied only after promotion, not in staging.

Other engine changes, such as changing the sorting key or partition key, still use the regular evolution flow and create a rewritten copy of the data source.

## Materialized data sources

When editing materialized data sources, you need to consider the settings of the landing data sources that feed into them, especially the TTL (Time To Live) settings.

[Forward queries](#forward-query) are essential when evolving materialized data sources, both schema and engine settings, to retain historical data.

{% callout type="warning" %}
You can't use `ALTER` to change the source of a Materialized View. To do so, you must drop the Materialized View and recreate it with the new source. To prevent data loss during this process, add a `SELECT *` `FORWARD_QUERY` to the destination `.datasource` file.
{% /callout %}

{% callout type="caution" %}
If your landing data source has a shorter TTL than your materialized data source, you will get a warning when you deploy your changes.
You will need to add a forward query to prevent data loss or, if you accept loss of historical data, add the `--allow-destructive-operations` flag to your deployment command.
{% /callout %}

For example, consider this scenario:

- Landing data source has a 7-day TTL.
- Materialized data source has no TTL (keeps data indefinitely).
- You want to change the data type of a column in the materialized data source.

Without a forward query, recalculating the materialized data source would only process the last 7 days of data due to the landing source's TTL, causing you to lose historical data beyond that period. To retain all historical data, use a forward query to transform the data from the live schema to the new one.

Here's an example materialized data source that uses a forward query to transform the data type of the `visits` column from `AggregateFunction(count, UInt16)` to `AggregateFunction(count, UInt64)`:

```tb
DESCRIPTION >
    Materialized data source for daily page visits aggregation

SCHEMA >
    `date` Date,
    `page_url` String,
    `visits` AggregateFunction(count, UInt64)

ENGINE "AggregatingMergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(date)"
ENGINE_SORTING_KEY "date, page_url"

FORWARD_QUERY >
    SELECT date, page_url, CAST(visits, 'AggregateFunction(count, UInt64)') AS visits

```

Omitting the forward query instruction fully recalculates the materialized data source.

You can omit the forward query when:

- Landing data source has a longer TTL than the materialized data source, or no TTL.
- Making non-backward compatible changes, like adding a new group by column.
- Accepting loss of historical data.

### Materialized data source engine settings

You can safely change engine settings on a Materialized Data Source, including the engine type, without risk of data loss. You don't need to create a new Data Source.

When you change engine settings on a Materialized Data Source, Tinybird backfills it by re-running the Materialized View query by default. You can see the backfill type when you run `tb --cloud deploy --check`.

If you don't want to reprocess all historical data through the Materialized View query, add a `SELECT *` `FORWARD_QUERY` to the `.datasource` file. Tinybird copies data from the live deployment instead of re-running the Materialized View. See [Unchanged downstream Data Sources](#unchanged-downstream-data-sources).

## Backfill strategies

When you deploy changes, Tinybird automatically decides whether affected Data Sources can be evolved with `ALTER` or need to be backfilled. The backfill strategy depends on the type of Data Source and whether it has a forward query.

### Preview the backfill strategy

Run `tb deploy --check` to see the deployment strategy before creating a deployment. The output shows which Data Sources will be evolved with `ALTER`, which ones will be rewritten, and whether rewritten Data Sources will be backfilled using forward queries or Materialized View queries.

```shell
tb deploy --check
```

You can then adjust the backfill strategy by adding or removing forward queries before deploying.

### Landing Data Sources

If a landing Data Source has no Materialized Views writing to it, Tinybird backfills it using a `FORWARD_QUERY`:

- If the schema is fully compatible, for example you're only changing the sorting key, Tinybird automatically applies a `SELECT *` `FORWARD_QUERY`. 
- If the columns aren't identical, you must provide a `FORWARD_QUERY` instruction to tell Tinybird how to transform the data.

### Data Sources with Materialized Views

If a Data Source is the destination of one or more Materialized Views, Tinybird determines the backfill strategy as follows:

- If the `.datasource` file has a `FORWARD_QUERY` instruction, Tinybird uses it for the backfill. You can force this behavior by adding a forward query.
- If there's no `FORWARD_QUERY`, Tinybird backfills the Data Source by re-running the Materialized View queries that write to it.
- If the `.datasource` file has a `BACKFILL skip` instruction, Tinybird skips the backfill entirely.

Use `BACKFILL skip` to create a new Data Source without backfilling it with historical data from the Materialized View query. Add it at the end of the `.datasource` file:

```tb {% title="tinybird/datasources/my_materialized.datasource" %}
DESCRIPTION >
    Materialized data source

SCHEMA >
    `date` Date,
    `page_url` String,
    `visits` UInt64

ENGINE "SummingMergeTree"
ENGINE_SORTING_KEY "date, page_url"

BACKFILL skip
```

`BACKFILL skip` only applies when the Data Source is first created and has no effect if it already exists.

### Unchanged downstream Data Sources

Data Sources that haven't changed but are downstream of other changes in your project are also backfilled:

- By default, they're backfilled using Materialized View queries.
- If they have an explicit `FORWARD_QUERY`, Tinybird uses it instead. Since these Data Sources haven't changed, the forward query must be `SELECT *`. This will select all data from the current live deployment and repopulate it in the datasource. If you're adding a new Materialized View upstream of an existing Data Source, add an explicit `SELECT *` `FORWARD_QUERY` to the downstream Data Source to avoid re-running the Materialized View query for the backfill.

### Materialized View backfill warnings

When Tinybird detects potential issues with Materialized View query backfills, it outputs a warning. For example:

- The Materialized View reads from a [Null engine](/sql-reference/engines/null) Data Source, which doesn't store data.
- The source Data Source has a shorter TTL than the Materialized View target, which can result in data loss.

Tinybird can't detect all potential issues. For example, filters in the Materialized View query might cause data loss depending on your use case. Use `tb deploy --check` to review the backfill strategy and add forward queries where needed.

## Next steps

- Learn more about [deployments](/forward/test-and-deploy/deployments).
- Learn about datafiles, like .datasource and .pipe files. See [Datafiles](/forward/dev-reference/datafiles).
