URL table function¶
The Tinybird url() table function allows you to read data from an existing URL into Tinybird, then schedule a regular copy pipe to orchestrate synchronization. You can load full tables, and every run performs a full replace on the data source.
To use it, define a node using standard SQL and the url function keyword, then publish the node as a copy pipe that does a sync on every run. See Table functions for general information and tips.
Syntax¶
Create a new pipe node. Call the url table function and pass the URL. Optionally, pass the format and the structure:
Example query logic
SELECT
JSONExtractString(data, 'article') AS article,
JSONExtractInt(data, 'views') AS views,
JSONExtractInt(data, 'rank') AS rank
FROM
(
SELECT toJSONString(arrayJoin(items.articles)) AS data
FROM
url(
'https://wikimedia.org/api/rest_v1/metrics/pageviews/top/en.wikipedia.org/all-access/2024/03/all-days',
'JSONColumns',
'items Tuple(access Nullable(String), articles Array(Tuple(article Nullable(String), rank Nullable(Int64), views Nullable(Int64))), day Nullable(String), month Nullable(String), project Nullable(String), year Nullable(String))'
)
)
Publish this node as a copy pipe. You can choose to append only new data or replace all data.
Example: sync page view rankings from a URL¶
The following example reads the Wikimedia pageviews API, extracts the article rankings, and replaces the target Data Source on demand.
First, define the target Data Source. Then define the copy pipe.
datasources/wiki_pageviews.datasource
SCHEMA >
`article` String,
`views` UInt64,
`rank` UInt32,
`day` Date
ENGINE "MergeTree"
ENGINE_SORTING_KEY "day, rank"
pipes/wiki_pageviews_sync.pipe
NODE pageviews
SQL >
%
SELECT
JSONExtractString(data, 'article') AS article,
toUInt64(JSONExtractInt(data, 'views')) AS views,
toUInt32(JSONExtractInt(data, 'rank')) AS rank,
toDate('2024-03-01') AS day
FROM (
SELECT toJSONString(arrayJoin(items.articles)) AS data
FROM url(
'https://wikimedia.org/api/rest_v1/metrics/pageviews/top/en.wikipedia.org/all-access/2024/03/01',
'JSONColumns',
'items Tuple(access Nullable(String), articles Array(Tuple(article Nullable(String), rank Nullable(Int64), views Nullable(Int64))), day Nullable(String), month Nullable(String), project Nullable(String), year Nullable(String))'
)
)
TYPE copy
TARGET_DATASOURCE wiki_pageviews
COPY_MODE replace
COPY_SCHEDULE @on-demand
Run the copy pipe on demand when you want to refresh the snapshot.