
How Fever Made Real-Time Partner Reporting Reliable Under Load with Tinybird

Learn how one of the world's largest live entertainment platforms replaced a fragile real-time ingestion pipeline with Tinybird, eliminating the inverse scaling problem where better business meant worse reporting.

About the company

Fever is a leading global live-entertainment discovery platform that helps millions of people find unique experiences such as cultural events, concerts, exhibitions, and local activities in hundreds of cities worldwide.

300TB
processed per month
1.5M
requests per month
<50ms
p99 latency

Before, adding a new column to a materialized view or changing one in production was terrifying; it could break everything… now Tinybird does it all automatically. You add the column, move the data with a forward query, everything works, nothing stops, it simply flows.

José Virtudes

Data Platform Engineer at Fever

Problem

Fever's data platform team had been chasing real-time reporting for years. They tried shrinking Snowflake batch windows, then moved to ClickHouse® Cloud with CDC via Kafka — but under load during popular event sales, latency ballooned from 3-4 seconds to 2+ minutes.

The system was also fragile: a single malformed record could halt ingestion entirely, and schema changes were "terrifying." The platform team became a bottleneck. As Carlos Sánchez Páez put it: "The better the business is doing, the worse the sales reporting gets."

Why Tinybird

Two capabilities drove Fever's decision:

  • Kafka ingestion that actually scales. Tinybird handles CDC consumption directly with managed connectors. No more wrestling with MSK configurations or inverse performance under load.
  • Built-in API layer. Beyond dashboards, Fever wanted to expose real-time data through APIs so partners could integrate directly into their own systems.

Additional factors: Tinybird's quarantine system isolates bad records automatically instead of killing the pipeline, and the git-based workflow enables PR-based schema changes with CI validation, eliminating manual DDL work.

Results

Even without hard metrics, the state change is clear:

  • Latency stability under load. No longer degrades during peak on-sales. Partners see fresh data exactly when it matters most.
  • Failure mode changed. Bad rows get quarantined instead of killing the pipeline. No more 3am scrambles to restart ingestion.
  • Cost scaling behavior changed. No per-topic connector tax. Infrastructure complexity and cost don't grow linearly with the data model.
  • Developer workflow changed. PR-based schema changes with CI validation. Less manual DDL and materialized view babysitting.
  • Roadmap unlocked. Partner-facing APIs, automation webhooks, and federation across internal teams all become feasible.

Tinybird x Fever

Partners refresh dashboards during on-sales and immediately distrust stale numbers

Fever is the world’s leading tech platform for discovering culture and live entertainment, inspiring over 300 million people last year to discover the best experiences in over 40 countries.

A core part of Fever's business is partner reporting. Event organizers, venues, and promoters need visibility into ticket sales, attendance, and event performance. During an on-sale for a popular event, partners are refreshing dashboards constantly. If the numbers look stale or lag behind what they're hearing from other channels, trust erodes immediately. Real-time isn't a technical preference; it's table stakes for partner confidence.

Fever's data platform team, led by Carlos Sánchez Páez, had been chasing real-time reporting for years. They started with Snowflake batch processing, shrinking batch windows from hourly to 30 minutes to 15 minutes to 5 minutes. But batch is batch. You're never going to hit real-time by making batches smaller.

The approach also had two fatal flaws. First, when events sold well, the batch jobs took longer to process, limiting the frequency and increasing the processing cost. Second, any schema change to the transactional database required careful coordination: someone had to update the extraction logic, modify the Snowflake tables, adjust the transformations, and hope nothing broke. Fear of breaking the pipeline meant changes moved slowly, and the platform team became a bottleneck.

Fever website

Before Tinybird, the better the business was doing, the worse the sales reporting got.

Carlos Sánchez Páez

Data Platform Engineer at Fever

The team moved to ClickHouse® Cloud with Change Data Capture (CDC) via Kafka, specifically Amazon MSK. Initial results looked promising. Latency dropped to 3-4 seconds between the transactional database and the analytics layer.

Then a popular event started selling.

The bottleneck wasn't ClickHouse's query engine; it was the ingestion path

Under load, the 3-4 second latency ballooned to 2 minutes or more. The problem wasn't ClickHouse itself. It was the MSK connector sitting between Kafka and ClickHouse, and how it behaved under Fever's throughput patterns.

Carlos describes the core issue: "The better the business is doing, the worse the sales reporting gets." During peak on-sales, exactly when partners most needed fresh data, the connector tasks fell behind. Lag accumulated. Adding capacity didn't reduce lag fast enough to matter. The auto-scaling that was supposed to handle spikes didn't keep up with peak throughput.

The system was also fragile. If the connector’s error handling and DLQ were misconfigured, a single null value in a non-nullable field could crash it and halt ingestion for every table feeding through it. Carlos puts it simply: "You have all of ingestion down. At what time are you going to fix that?" The on-call scramble to diagnose which record caused the failure, fix the data, and restart ingestion was exactly the kind of 3am incident that erodes team morale.

To address both the throughput bottleneck and the fault isolation problem, they found themselves needing to scale the architecture by creating isolated connectors for their most critical tables. That limited the blast radius when something went wrong, but it also meant that costs and operational complexity grew with their data model. Every new table that needed real-time data meant another connector to configure, monitor, and pay for.

The MSK connector also wasn't Fever's area of expertise. They spent time tuning configurations, but the fundamental scaling behavior remained problematic.

Schema changes carried the same fear. Adding a column meant manually creating tables, setting up materialized views, triggering initial loads, and hoping the ingestion didn't break during the transition. In Carlos's words, it was "terrifying." The platform team became a chokepoint for any data model evolution.

Fever ran this architecture for roughly two years. It worked for one or two use cases with careful attention, but expanding real-time reporting across the organization wasn't going to happen with infrastructure that failed at the worst possible moments.

Two things brought Fever to Tinybird: Kafka ingestion and APIs

Carlos is direct about what drove the decision: "The main reasons we contacted you were: 1) being able to consume from Kafka properly, with auto-scaling that actually works, and 2) we need APIs."

The Kafka piece addressed the core operational pain. Tinybird handles the CDC consumption directly, with connectors managed and tuned by Tinybird's team. No more wrestling with MSK connector configurations. No more linear cost scaling as the data model grows. No more inverse performance under load.

The API layer unlocked Fever's product roadmap. Beyond partner dashboards, they want to expose the same real-time data through APIs so partners can integrate directly into their own systems. A partner triggering automations based on ticket sales, or pulling live event data into their own analytics, becomes possible when your data platform can serve low-latency APIs at scale.
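As a rough illustration of what partner-facing integration looks like, the sketch below builds a request against a published Tinybird pipe endpoint using only the Python standard library. The pipe name (`ticket_sales_live`), token, and query parameters are hypothetical; Fever's actual endpoints are not public. Published Tinybird pipes are served under `/v0/pipes/<name>.json` and accept their template parameters as query-string arguments.

```python
from urllib.parse import urlencode
from urllib.request import Request

TINYBIRD_HOST = "https://api.tinybird.co"
PIPE_NAME = "ticket_sales_live"  # hypothetical pipe name for illustration

def pipe_request(token: str, **params: str) -> Request:
    """Build a GET request for a published Tinybird pipe endpoint.

    The pipe's template parameters travel as query-string arguments
    alongside the auth token.
    """
    query = urlencode({"token": token, **params})
    return Request(f"{TINYBIRD_HOST}/v0/pipes/{PIPE_NAME}.json?{query}")

# A partner could issue this request on a schedule, or wire the same
# data into their own dashboards and automations.
req = pipe_request("p.XXXX", event_id="42", date_from="2025-06-01")
print(req.full_url)
```

In practice a partner would send this request with any HTTP client and get JSON back with sub-second freshness, which is what makes direct integration (rather than screen-watching a dashboard) viable.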

Fever diagram

Before: null kills ingestion. After: quarantine isolates bad rows

One of the biggest operational improvements is how Tinybird handles data quality issues.

With the MSK Connector, a single malformed record could halt the entire ingestion pipeline. The system would stop until someone diagnosed the problem, found the offending record, fixed or removed it, and restarted ingestion. During that window, partner dashboards showed stale data, and the on-call engineer was under pressure to restore service.

Tinybird's quarantine system changes that equation. Bad records get isolated automatically. The system keeps running. The team gets visibility into what failed and why. They can fix the issue on their own timeline, decide whether to discard or repair the records, and move on. No production outages from data quality edge cases.
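The behavioral difference can be sketched in a few lines: validate each row against the expected schema, set failures aside with a reason attached, and keep ingesting the rest. This is a minimal simulation of the quarantine idea, not Tinybird's implementation, and the schema and field names are invented for the example.

```python
# Illustrative schema; Fever's real data model is not public.
SCHEMA = {"ticket_id": int, "event_id": int, "amount": float}

def ingest(rows):
    """Route rows that fail validation to a quarantine list
    instead of raising and halting the whole pipeline."""
    accepted, quarantined = [], []
    for row in rows:
        try:
            # Missing fields or un-coercible values (e.g. a null in a
            # required field) fail here for this one row only.
            typed = {key: cast(row[key]) for key, cast in SCHEMA.items()}
        except (KeyError, TypeError, ValueError) as err:
            quarantined.append({"row": row, "error": repr(err)})
            continue  # ingestion keeps flowing
        accepted.append(typed)
    return accepted, quarantined

rows = [
    {"ticket_id": 1, "event_id": 10, "amount": "25.0"},
    {"ticket_id": 2, "event_id": None, "amount": "25.0"},  # null kills this row only
    {"ticket_id": 3, "event_id": 11, "amount": "30.0"},
]
good, bad = ingest(rows)
print(len(good), len(bad))  # 2 1
```

The contrast with the old failure mode is the `continue`: one bad record becomes a quarantined row with an attached error, not a stopped pipeline and a paged engineer.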

Carlos and the team built observability around their Tinybird deployment using Datadog. They monitor lag per Kafka topic, track how latency changes over days and weeks, watch requests per token, and run queries comparing their transactional database timestamps against Tinybird table timestamps to measure end-to-end delay. Combined with traffic-based alarms and query volume monitoring, they can quickly distinguish between "it's quiet" and "something is broken."
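The end-to-end delay check described above reduces to a timestamp comparison: take the most recent write timestamp in the transactional database and the most recent one that has landed in the analytics layer, and alert when the gap grows. A minimal sketch, with synthetic timestamps standing in for the two real queries:

```python
from datetime import datetime, timezone

def end_to_end_lag_seconds(source_latest: datetime,
                           analytics_latest: datetime) -> float:
    """Seconds between what the source database has written and what
    the analytics layer has ingested; near zero means caught up."""
    return max(0.0, (source_latest - analytics_latest).total_seconds())

# Synthetic values; in production these would come from querying each system.
source = datetime(2025, 6, 1, 12, 0, 5, tzinfo=timezone.utc)
analytics = datetime(2025, 6, 1, 12, 0, 2, tzinfo=timezone.utc)
lag = end_to_end_lag_seconds(source, analytics)
print(f"lag: {lag:.1f}s")  # lag: 3.0s
```

Emitting this number as a gauge metric (to Datadog, in Fever's case) is what lets the team distinguish a quiet period from a pipeline that has silently fallen behind.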

Fever UI

Before: schema change blocks teams. After: PR-based deploys

With ClickHouse Cloud, every schema change required manual intervention from the platform team. Someone had to run CREATE TABLE, set up materialized views, trigger initial loads, and validate nothing broke. The platform team reviewed every change, ran manual validation, and managed the deployment. Teams waited. Changes moved slowly.

With Tinybird's git-based workflow, schema changes flow through pull requests. The CI pipeline validates changes automatically and provides information about what each PR will affect. Developers can see exactly what will change before it hits production.

A new team member can create a Jira ticket, submit a PR with a new data source, get it reviewed, and deploy without the platform team doing manual work. The platform team approves changes based on CI output rather than running their own validation.

Before, adding a new column to a materialized view or changing one in production was terrifying; it could break everything… now Tinybird does it all automatically. You add the column, move the data with a forward query, everything works, nothing stops, it simply flows.

José Virtudes

Data Platform Engineer at Fever

What changed

Even without hard metrics, the state change is clear:

  • Latency stability under load. No longer degrades during peak on-sales. Partners see fresh data exactly when it matters most.
  • Failure mode changed. Bad rows get quarantined instead of killing the pipeline. No more 3am scrambles to restart ingestion.
  • Cost scaling behavior changed. No per-topic connector tax. Infrastructure complexity and cost don't grow linearly with the data model.
  • Developer workflow changed. PR-based schema changes with CI validation. Less manual DDL and materialized view babysitting.
  • Roadmap unlocked. Partner-facing APIs, automation webhooks, and federation across internal teams all become feasible.

Building toward federation

Fever's 2026 roadmap centers on what they call "federation": opening up the real-time data platform so teams across the organization can build their own use cases without the data platform team being a bottleneck.

The old model was a funnel. You want real-time data? Submit a ticket. Wait for the platform team to review. Wait for them to build it. Wait for deployment. That model doesn't scale when you want dozens of teams experimenting with real-time capabilities.

The new model is self-service. Developers get access to the platform. They can explore data, build queries, submit PRs, and deploy. The platform team shifts from gatekeeper to enabler.

The main blocker right now is workload isolation. With everyone working against the same production system, there's the risk that someone runs an expensive query and impacts the real-time APIs that partners depend on. Once workload scheduling capabilities allow development workloads to run on isolated resources, the gates open. Any team at Fever could spin up a real-time use case without risking production performance.

The vision extends to partners as well. Today, partners view dashboards. Tomorrow, they could consume APIs directly, receiving webhooks when tickets sell, pulling live event data into their own systems, and automating their operations based on real-time Fever data.

From inverse scaling to infrastructure that keeps up

Fever's journey is a common one: start with batch, shrink the windows, hit the wall, move to streaming, struggle with infrastructure complexity, and finally land on a managed platform that handles the hard parts.

The difference with Tinybird wasn't just performance. It was the combination of managed Kafka ingestion that scales under load, data quality handling that isolates failures instead of amplifying them, a CI/CD workflow that developers want to use, and the API layer that unlocks new product possibilities. Together, these let Fever stop fighting their infrastructure and start building the real-time partner experience they envisioned.

For a global entertainment platform where ticket sales happen 24/7 across dozens of markets, real-time reporting isn't a nice-to-have. When business gets better, reporting should get better too.

