Data Source health
Once you have fixed the errors in your source files, matched the Data Source schema to your needs, and applied any on-the-fly transformations you need, you will start ingesting data periodically. At that point, knowing the status of your ingestion processes becomes key.
Data Sources Log
From the ‘Data Sources log’ in your Dashboard, you can check whether there are new rows in quarantine, whether jobs are failing, or whether there is any other problem.
Beyond the tools in the User Interface (UI), you can use Service Data Sources for more advanced monitoring.
By clicking on an individual Data Source in the left-hand panel, you can see the size of the Data Source, the number of rows, the number of rows in the quarantine Data Source (if any), and when it was last updated. The Operations log lists the events for the Data Source, displayed as the results of a query.
Service Data Sources for continuous monitoring
Service Data Sources can help you with ingestion health checks. They can be used like any other Data Source in your Workspace, which means you can create API endpoints to monitor your ingestion processes.
By querying ‘tinybird.datasources_ops_log’ directly you can, for example, list your ingestion processes during the last week.
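As an illustration of the kind of query described above, here is a minimal Python sketch that builds the SQL text for a given Data Source. The column names (`timestamp`, `event_type`, `result`, `error`, `datasource_name`) and the helper `last_week_ops_query` are assumptions made for this example, not taken from this page; check them against the Service Data Source schema in your Workspace.

```python
# Sketch only: the column names below (timestamp, event_type, result, error,
# datasource_name) are assumptions; verify them in your own Workspace.
def last_week_ops_query(datasource_name: str) -> str:
    """Build a SQL query listing last week's ingestion events for one Data Source."""
    # Escape backslashes and single quotes so the name is a safe SQL string literal.
    safe_name = datasource_name.replace("\\", "\\\\").replace("'", "\\'")
    return (
        "SELECT timestamp, event_type, result, error\n"
        "FROM tinybird.datasources_ops_log\n"
        f"WHERE datasource_name = '{safe_name}'\n"
        "  AND timestamp > now() - INTERVAL 7 DAY\n"
        "ORDER BY timestamp DESC"
    )


# "events" is a hypothetical Data Source name.
print(last_week_ops_query("events"))
```

The resulting text can be pasted into a Pipe node or sent through whatever query mechanism you already use.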
The same Service Data Source lets you calculate the percentage of quarantined rows for a given period of time.
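The arithmetic behind a quarantine percentage can be sketched in Python over rows already fetched from the log. This sketch assumes each log entry exposes `rows` (rows that reached the Data Source) and `rows_quarantine` (rows sent to quarantine), and it defines the percentage relative to all processed rows. Both the field names and that definition are assumptions, so adjust them to what your log actually contains.

```python
def quarantine_percentage(ops):
    """Percentage of processed rows that ended up in quarantine.

    Assumes each entry has 'rows' (rows that reached the Data Source) and
    'rows_quarantine' (rows sent to quarantine). Both names are assumptions.
    """
    ingested = sum(op["rows"] for op in ops)
    quarantined = sum(op["rows_quarantine"] for op in ops)
    processed = ingested + quarantined
    if processed == 0:
        return 0.0  # nothing was processed in the period
    return quarantined / processed * 100


# Hypothetical log entries for one period.
sample_ops = [
    {"rows": 90, "rows_quarantine": 10},
    {"rows": 100, "rows_quarantine": 0},
]
print(quarantine_percentage(sample_ops))
```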
You can also monitor the average duration of your periodic ingestion processes for a given Data Source.
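A minimal sketch of the average-duration calculation, again over log entries already fetched into Python. The field names `elapsed_time` and `datasource_name` are assumptions for illustration, and the unit of the result is whatever unit the log uses.

```python
from statistics import mean


def average_duration(ops, datasource_name):
    """Average ingestion duration for one Data Source, or None if it has no entries.

    'elapsed_time' and 'datasource_name' are assumed field names.
    """
    durations = [
        op["elapsed_time"] for op in ops if op["datasource_name"] == datasource_name
    ]
    return mean(durations) if durations else None


# Hypothetical log entries; "events" and "users" are made-up Data Source names.
sample_ops = [
    {"datasource_name": "events", "elapsed_time": 2.0},
    {"datasource_name": "events", "elapsed_time": 4.0},
    {"datasource_name": "users", "elapsed_time": 10.0},
]
print(average_duration(sample_ops, "events"))
```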
If you want to configure or build an external service that monitors these metrics, you just need to create an API endpoint and raise an alert when a metric passes a threshold. When you receive an alert, you can check the Quarantine Data Source or the Operations log to see what’s going on and fix your source files or ingestion processes.
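The threshold check in such an external service could look like the following sketch. It assumes the metrics have already been fetched from your API endpoint into a dictionary; the metric names and limits are hypothetical, and how the alert is delivered (email, chat, pager) is left out.

```python
def find_alerts(metrics, thresholds):
    """Return one message per metric whose value exceeds its threshold."""
    alerts = []
    for name, limit in thresholds.items():
        value = metrics.get(name)
        # Metrics missing from the response are skipped rather than treated as failures.
        if value is not None and value > limit:
            alerts.append(f"{name} is {value}, above the threshold of {limit}")
    return alerts


# Hypothetical metric names and limits.
metrics = {"quarantined_pct": 7.5, "avg_duration_s": 1.2}
thresholds = {"quarantined_pct": 5.0, "avg_duration_s": 3.0}
for message in find_alerts(metrics, thresholds):
    print(message)
```

Keeping the comparison separate from fetching and notification makes it easy to run the same check on a schedule against several endpoints.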
Monitoring API Endpoints
You can use the Service Data Sources ‘pipe_stats’ and ‘pipe_stats_rt’ to monitor the performance of your endpoints.
Every request to a Pipe is logged to ‘tinybird.pipe_stats_rt’ and kept in this Data Source for the last week. An API endpoint built on it can, for example, aggregate the statistics per hour for a selected Pipe.
‘pipe_stats’ contains statistics about the API calls to your Pipe endpoints, aggregated per day using intermediate states.
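To make the hourly aggregation concrete, here is a small Python sketch of the same grouping applied to per-request log entries. The field names (`start_datetime`, `duration`, `error`) are assumptions for this example, and in practice the aggregation would normally run as SQL inside the Pipe rather than in client code.

```python
from collections import defaultdict
from datetime import datetime


def hourly_stats(requests):
    """Group per-request log entries by hour: request count, error count, average duration.

    'start_datetime', 'duration' and 'error' are assumed field names.
    """
    buckets = defaultdict(list)
    for req in requests:
        # Truncate the timestamp to the start of its hour.
        hour = req["start_datetime"].replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(req)

    stats = {}
    for hour in sorted(buckets):
        entries = buckets[hour]
        stats[hour] = {
            "requests": len(entries),
            "errors": sum(1 for e in entries if e["error"]),
            "avg_duration": sum(e["duration"] for e in entries) / len(entries),
        }
    return stats


# Hypothetical request log entries for one Pipe.
sample_requests = [
    {"start_datetime": datetime(2024, 1, 1, 10, 5), "duration": 0.5, "error": False},
    {"start_datetime": datetime(2024, 1, 1, 10, 40), "duration": 1.5, "error": True},
    {"start_datetime": datetime(2024, 1, 1, 11, 2), "duration": 2.0, "error": False},
]
for hour, values in hourly_stats(sample_requests).items():
    print(hour, values)
```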
API endpoints such as these can be used to raise alerts for further investigation whenever statistics pass certain thresholds.
To see how the health of Pipes and Data Sources can be monitored in a dashboard, have a look at the blog post Operational Analytics in Real Time with Tinybird and Retool.