Migrate from Shared to Dedicated Infrastructure

This guide covers migrating data from shared to dedicated infrastructure without disrupting live services.

Pre-Migration Checklist

Before starting, verify the following:

  • Workspace setup: Your dedicated Workspace is set up and accessible. You can create it yourself if:
    • The creating user's email matches your company domain (e.g. @example.com)
    • The Workspace is created in the region of your dedicated cluster
  • Access rights: Your team can access both the old and new Workspaces
  • Downtime planning: The process minimizes disruption but can fail — plan accordingly
  • Customer Success: Coordinate with your Customer Success Data Engineer so they can assist if needed

Migration Overview

  1. Replicate the project in the new Workspace
  2. Recreate access Tokens
  3. Export and load dimensional tables; duplicate any periodic update processes
  4. Duplicate real-time ingestion, then backfill historical data for fact/event tables
  5. Point your application to the new Endpoints and Tokens
  6. Stop ingestion to the old Workspace

Step 1: Replicate the Project

  • If your project resources (.datasource, .pipe, .connection files) are already in a Git repository, clone the repo locally and skip to the deploy step below
  • If your resources are not in a Git repo, pull them from the existing Workspace first:
tb pull
  • If you use Connectors (Kafka, S3, DynamoDB, Snowflake, etc.), ensure your .connection files use the same connection names as your current Workspace
  • This is a good opportunity to clean up old versions and unnecessary resources before deploying
  • Log in to the new Workspace, validate, then deploy:
tb login
tb deploy --check
tb deploy
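Before deploying, a quick local check can help confirm that your datafiles reference the connection names you expect. This is only a sketch: the grep pattern covers common connector settings (such as `KAFKA_CONNECTION_NAME` or `IMPORT_CONNECTION_NAME`), and your project layout may differ.

```shell
# Sketch: list the connection names referenced in local datafiles so you can
# compare them with the source Workspace. Adjust the pattern to your
# connector types.
grep -H "CONNECTION_NAME" ./*.connection ./*.datasource 2>/dev/null || true

# Count the datafiles found, as a quick sanity check that you are in the
# project directory before running tb deploy:
DATAFILE_COUNT=$(ls ./*.connection ./*.datasource 2>/dev/null | wc -l | tr -d ' ')
echo "Datafiles found: ${DATAFILE_COUNT}"
```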

Step 2: Recreate Access Tokens

Tokens cannot be migrated — recreate them in the new Workspace with the same permissions. You'll need these Tokens for configuring ingestion and consumption later.

See Tokens and the Tokens API reference for details.
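As a sketch, a Token can be recreated through the Tokens API. The host, admin Token, Token name, and scope below are placeholders for your own values; the request is left commented out so you can review it before sending.

```shell
# Sketch: recreate a Token in the new Workspace via the Tokens API.
# ADMIN_TOKEN, TOKEN_NAME, and SCOPE are placeholders, not real values.
NEW_HOST="https://api.tinybird.co"        # base URL of the new region
ADMIN_TOKEN="<new Workspace admin token>"
TOKEN_NAME="app_read_token"               # hypothetical name
SCOPE="PIPES:READ:my_endpoint"            # hypothetical scope

# Once ADMIN_TOKEN is filled in, send the request:
# curl -X POST "${NEW_HOST}/v0/tokens" \
#   -H "Authorization: Bearer ${ADMIN_TOKEN}" \
#   -d "name=${TOKEN_NAME}" \
#   -d "scope=${SCOPE}"
echo "Will create '${TOKEN_NAME}' with scope '${SCOPE}' at ${NEW_HOST}/v0/tokens"
```

Repeat for each Token, keeping names and scopes identical to the source Workspace so the switch-over in Step 5 is a drop-in change.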

If you use a Connector and can reingest all data from the original source, skip Steps 3 and 4.

Step 3: Export Dimensional Tables

Dimensional tables are smaller and updated periodically, making them simpler to migrate.

If you have a periodic update process

  • Identify the Token used for updates in the source Workspace
  • Create a matching Token in the destination Workspace
  • Duplicate the update process, pointing to the new Workspace URL and Token:
    • Same region, same cloud provider: only the Token changes
    • Different region or cloud provider: update both the Token and the URL
  • Keep both processes running until migration is fully validated

If you don't have a periodic update process

Export via Sinks, then ingest into the new Workspace. Repeat for every dimensional Data Source.

Export the data:

  • Create a Pipe selecting all columns explicitly (avoid SELECT *)
  • Create an on-demand Sink to write results to S3
  • Split output files to stay within ingestion limits
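As a sketch, an on-demand Sink Pipe datafile could look like the following. The Data Source, connection, bucket, and column names are hypothetical, and the `EXPORT_*` setting names should be verified against the current Sink Pipe datafile reference; no schedule is set so the Sink can be triggered on demand.

```
NODE dim_export
SQL >
    SELECT id, name, updated_at
    FROM dim_products

TYPE sink
EXPORT_SERVICE s3_iamrole
EXPORT_CONNECTION_NAME "my_s3_connection"
EXPORT_BUCKET_URI "s3://my-bucket/migration/dim_products"
EXPORT_FILE_TEMPLATE "dim_products_export"
EXPORT_FORMAT "csv"
```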

Ingest the data — choose one approach:

Option A: Direct URL ingestion (few files)

curl \
  -H "Authorization: Bearer <DATASOURCES:CREATE token>" \
  -X POST "https://api.tinybird.co/v0/datasources" \
  -d "name=stocks" \
  -d "url=https://.../data.csv"

Option B: S3 connector (many files)

  • Configure an auxiliary Data Source to ingest from S3 on-demand with the file name pattern from your export
  • Configure bucket access policies
  • Monitor ingestion:
SELECT * FROM tinybird.sinks_ops_log ORDER BY timestamp DESC
  • Copy data to the destination Data Source using a copy Pipe (select columns explicitly, match types)
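A copy Pipe for the auxiliary-to-destination step might look like this sketch. All names are hypothetical; list every column explicitly and cast wherever the auxiliary and destination types differ.

```
NODE copy_to_destination
SQL >
    SELECT id, name, updated_at
    FROM aux_dim_products_s3

TYPE copy
TARGET_DATASOURCE dim_products
COPY_SCHEDULE @on-demand
```

After triggering the copy, check `tinybird.datasources_ops_log` to confirm the row counts match the export.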

Step 4: Migrate Fact/Event Tables

Fact tables require maintaining ingestion continuity — no gaps or duplicates.

4a. Duplicate ingestion

Duplicate your ingestion process so that it also writes to the new Workspace, keeping the original ingestion into the shared Workspace running. At this point you'll have two parallel ingestion processes active: one continuing to write to the shared Workspace, and one writing to the dedicated Workspace.

To configure the duplicate ingestion:

  • Same region, same cloud provider: change only the Token
  • Different region or cloud provider: change Token and URL. See API base URLs.
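If you ingest through the Events API, duplicating ingestion can be as simple as sending each batch twice. The following is a sketch with placeholder hosts, Tokens, and Data Source name; the requests are commented out so nothing is sent until you fill in real values.

```shell
# Sketch: dual-write one NDJSON batch to both Workspaces via the Events API.
# Hosts, tokens, and the Data Source name are placeholders.
OLD_HOST="https://api.tinybird.co"          # shared-infrastructure region
NEW_HOST="https://api.tinybird.co"          # dedicated region (may differ)
OLD_TOKEN="<shared Workspace append token>"
NEW_TOKEN="<dedicated Workspace append token>"
DATASOURCE="events"

BATCH='{"timestamp":"2024-01-01T00:00:00Z","event":"page_view"}'

# Once the tokens are filled in, send the same batch to both Workspaces:
# curl -X POST "${OLD_HOST}/v0/events?name=${DATASOURCE}" \
#   -H "Authorization: Bearer ${OLD_TOKEN}" -d "${BATCH}"
# curl -X POST "${NEW_HOST}/v0/events?name=${DATASOURCE}" \
#   -H "Authorization: Bearer ${NEW_TOKEN}" -d "${BATCH}"
echo "Batch would be written to '${DATASOURCE}' in both Workspaces"
```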

4b. Validate ingestion

Check ingestion via the service Data Sources or the UI, and verify that no rows have landed in quarantine:

SELECT * FROM tinybird.datasources_ops_log
WHERE event_type = 'append'
ORDER BY timestamp DESC

4c. Identify the ingestion start point

Query the Data Source for the first ingested event (typically using a timestamp field). Use this timestamp as the cutoff for backfill.
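For example, assuming a `timestamp` column and a fact Data Source named `events` (both hypothetical), the cutoff would be:

```sql
SELECT min(timestamp) AS backfill_cutoff
FROM events
```

Everything strictly before `backfill_cutoff` is exported in Step 4d; everything at or after it is already covered by the duplicated ingestion.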

4d. Export historical data

  • Create a Pipe querying the fact table up to the ingestion start point
  • Create an on-demand Sink to export to S3
  • Use Parquet format for large tables
  • Partition into files < 1 GB
  • Monitor the export:
SELECT * FROM tinybird.sinks_ops_log ORDER BY timestamp DESC
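The export Pipe has the same shape as in Step 3, plus the cutoff filter and Parquet output. This is a sketch with hypothetical names: replace the literal timestamp with your Step 4c cutoff and verify the `EXPORT_*` settings against the current Sink Pipe reference.

```
NODE events_backfill_export
SQL >
    SELECT timestamp, user_id, event
    FROM events
    WHERE timestamp < '2024-01-01 00:00:00'

TYPE sink
EXPORT_SERVICE s3_iamrole
EXPORT_CONNECTION_NAME "my_s3_connection"
EXPORT_BUCKET_URI "s3://my-bucket/migration/events"
EXPORT_FILE_TEMPLATE "events_backfill"
EXPORT_FORMAT "parquet"
```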

4e. Ingest backfill data

  • Use the S3 connector to ingest into an auxiliary Data Source
  • Copy to the destination Data Source using a copy Pipe (select columns explicitly, match types including LowCardinality, nullability, etc.)
  • Monitor and validate as in Step 3
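The auxiliary Data Source can pull the exported files with the S3 connector. A sketch with hypothetical names and settings follows; the schema must match your export exactly, and the `IMPORT_*` setting names should be checked against the current S3 connector reference.

```
SCHEMA >
    `timestamp` DateTime,
    `user_id` String,
    `event` LowCardinality(String)

IMPORT_SERVICE s3_iamrole
IMPORT_CONNECTION_NAME "my_s3_connection"
IMPORT_BUCKET_URI "s3://my-bucket/migration/events/*.parquet"
IMPORT_SCHEDULE "@on-demand"
```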

4f. Validate Endpoints

Confirm all Endpoints return accurate data. Materialized Views are populated as the backfill copy runs, so the data should be ready for consumption.

Step 5: Switch Over

  • Point your applications to the new Workspace Endpoints
    • Same region, same cloud provider: change only the access Token
    • Different region or cloud provider: update both Token and URL
  • Use the Tokens created in Step 2 with matching permissions
  • Stop ingestion to the old Workspace once everything is confirmed working
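A quick way to validate the switch is to call the same Endpoint in both Workspaces and compare the results. The sketch below uses placeholder hosts, Tokens, and Pipe name; the comparison commands are left commented out until real values are filled in.

```shell
# Sketch: compare an Endpoint's output between the old and new Workspaces.
# Hosts, tokens, and the Pipe name are placeholders.
OLD_HOST="https://api.tinybird.co"
NEW_HOST="https://api.tinybird.co"    # may differ by region or provider
OLD_TOKEN="<old Workspace read token>"
NEW_TOKEN="<new Workspace read token from Step 2>"
PIPE="my_endpoint"

# Once the tokens are filled in:
# curl -s "${OLD_HOST}/v0/pipes/${PIPE}.json?token=${OLD_TOKEN}" > old.json
# curl -s "${NEW_HOST}/v0/pipes/${PIPE}.json?token=${NEW_TOKEN}" > new.json
# diff <(jq -S .data old.json) <(jq -S .data new.json)
echo "Comparing '${PIPE}' between ${OLD_HOST} and ${NEW_HOST}"
```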

FAQ

I'm hitting memory limit or timeout errors during migration

Contact us through your support or community Slack channel.