Keyrock is one of the key market makers and liquidity providers in the global cryptocurrency space. They work with more than 85 of the world’s biggest marketplaces to improve liquidity, enabling institutions and traders alike to execute trades and investments on both centralized and decentralized trading platforms. With a focus on scalable infrastructure, they’re able to provide liquidity to cryptocurrency markets at an exceptionally low marginal cost.
Being a cryptocurrency market maker means handling a lot of data. For Juan Villegas, one of the co-founders at Keyrock, the challenge was that not all trades were saved into their existing data lake. Some trades were missing, which meant Keyrock didn’t have a complete picture of cryptocurrency market movements. But missing trades were just the beginning. Trading data was siloed, so there was no single source of truth, and different teams at Keyrock had begun developing their own mechanisms for handling trading data. They needed all trades saved in a single source that Keyrock’s analysts could access and use to make better decisions, and in turn outperform the markets.
On top of the challenge of siloed and missing data, the sheer volume of data meant it was taking hours to prototype new models. In a time-critical industry, the inability to rapidly create, test, and refine prototypes to validate an assumption means you’re losing out to the competition. The data analyst team also wanted the freedom to experiment with that trading data in a safe environment and to propose changes that could easily be rolled back.
To address these challenges, the data engineering team had custom-built their own data lake, which fed into Grafana. A data engineer had been working full-time for six months to make the trading data real-time and accessible, but it still wasn’t enough. Juan came to the realization that in order to build new data pipelines and iterate at the speed they needed, he'd need to replatform. He made it a core company OKR to have a single centralized source of data across the entire company by the end of the quarter.
The search for a new solution began with Databricks and Snowflake, and the team ran several POC trials. Snowflake wasn’t quite fit for purpose, and Databricks brought a steep learning curve, a lack of flexibility, and difficult integration with Grafana. Grafana remained the observability tool of choice for the data analyst team, and so the search continued.
When the team came across Tinybird, they fell in love with it instantly. Within a single quarter they had centralized their trading data and made it all accessible in real time. Juan has seen a massive gain in efficiency, and it’s much easier to experiment with their trading data now. Tinybird gave analysts the freedom to build their own pipelines to validate an assumption, while the data engineering team still maintains governance and oversight over what goes into production. It’s also fast. All trades are now materialized at ingestion and instantly available for the data analyst team to experiment with.
The data analysts are happy, as Tinybird also works with Grafana and their existing tools. Even though Tinybird was a more compact solution than Snowflake or Databricks, it seamlessly connected with Keyrock’s existing tools: all the benefits of a relational database without the complexity of other solutions, and without having to worry about maintaining and scaling infrastructure. Beyond that, they saw Tinybird as a future-proof solution. Juan isn’t sure what other use cases they’ll want to pursue with their data in the future, but it’s reassuring to know they can add other tools without impacting their processes, as well as tackle new use cases simply with SQL.
1. Initially, Keyrock uploaded data in CSV format through Tinybird’s /datasources endpoint using a Lambda function, but they have since moved to High Frequency Ingestion via the /events endpoint for all their data sources.
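As a rough illustration, here is a minimal Python sketch of what streaming a single trade into the /events endpoint could look like. The data source name, environment variable, and trade fields are illustrative assumptions, not Keyrock’s actual setup.

```python
import json
import os

import requests

TINYBIRD_TOKEN = os.environ["TINYBIRD_TOKEN"]  # assumed env var holding an ingest token
DATASOURCE = "trades"  # hypothetical data source name


def send_trade(trade: dict) -> None:
    """Send one trade as an NDJSON row to Tinybird's high-frequency Events API."""
    response = requests.post(
        "https://api.tinybird.co/v0/events",
        params={"name": DATASOURCE},
        headers={"Authorization": f"Bearer {TINYBIRD_TOKEN}"},
        data=json.dumps(trade),
    )
    response.raise_for_status()


# Example call with made-up fields
send_trade({"timestamp": "2023-01-01T00:00:00Z", "pair": "BTC-USDT", "price": 16500.0, "size": 0.25})
```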
2. Keyrock uses API endpoints and the BI Connector to power their Grafana dashboards.
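On the API endpoint side, a dashboard panel is effectively an HTTP request against a published pipe; the BI Connector takes the other route, exposing the same data over a PostgreSQL-compatible interface that Grafana can query like a regular Postgres data source. Below is a minimal sketch of the endpoint path in Python, assuming a hypothetical pipe named trades_per_minute with a market parameter.

```python
import os

import requests

TINYBIRD_TOKEN = os.environ["TINYBIRD_TOKEN"]  # assumed env var holding a read token

# Query a published Tinybird API endpoint (pipe) as JSON, the way a dashboard might.
response = requests.get(
    "https://api.tinybird.co/v0/pipes/trades_per_minute.json",  # hypothetical pipe name
    params={"token": TINYBIRD_TOKEN, "market": "BTC-USDT"},     # hypothetical query parameter
)
response.raise_for_status()

for row in response.json()["data"]:
    print(row)
```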
3. Keyrock ingests data in their production workspace. The data sources are then shared with development workspaces where their teams develop new features. Once a new use case is validated, each developer is responsible for testing the quality and performance of their own data sources, pipes, and endpoints, and then pushing the changes to the production workspace.