We're excited to announce our partnership with Ghost, an independent and open-source publishing platform used by both independent bloggers and some of the world's biggest publishers. Combined, publishers have generated over $100M in revenue via Ghost's open-source platform.
With the release of Ghost 6.0, publishers now get access to real-time, multi-channel web analytics powered by Ghost's Tinybird integration. We've partnered with Ghost to offer this for both Ghost(Pro) and self-host users, for free.
How Ghost built real-time analytics for publishers with Tinybird
Ghost 6.0 is packed with many new features for publishers. From our perspective, the most notable of these is Ghost Analytics - detailed, first-party analytics directly in the product.
While Ghost users will appreciate the fully integrated analytics experience, with no setup required, we want to dig into the technical details of how Ghost used Tinybird to deliver powerful, real-time analytics in a large-scale, distributed open source app (even when you choose to self-host).
Scalable and portable analytics for open source projects
Building analytics for an open source platform like Ghost presents several technical constraints that standard analytics providers don't address. Ghost's requirements included:
- Scale: Ghost processes hundreds of millions of pageviews every month across thousands of publications
- Deployment flexibility: Support both Ghost(Pro)'s managed hosting and self-hosted installations
- Performance: Real-time analytics that don't impact core publishing functionality
Most analytics solutions require choosing between hosted SaaS (simple integration, vendor lock-in) or self-hosted infrastructure (data control, operational complexity). Ghost needed to support both deployment models with the same feature set.
Tinybird's architecture addresses this by providing identical APIs and functionality whether running in the cloud or self-hosted via Docker containers. It also gave Ghost a fast analytics database with a strong developer experience, both for their own developers and for future self-hosters.
Ghost's hybrid architecture approach
Ghost stores site, member, and post metadata in MySQL. Tinybird doesn't replace Ghost's MySQL infrastructure; rather, it fills a need that MySQL can't address at scale as a part of a hybrid architecture:
- MySQL: Handles content, members, payments, and business logic with ACID compliance
- Tinybird: Processes high-volume page view events and provides real-time analytics
This approach allows Ghost to maintain their existing data models while adding analytics capabilities. The systems remain loosely coupled, with data correlation handled through UUIDs.
// Ghost correlates data across systems using UUIDs
{
// Tinybird: Real-time page views
post_uuid: "post-456-uuid",
member_uuid: "member-123-uuid",
pageviews: 1247,
// MySQL: Member attribution and business data
attribution_id: "post_456", // posts.id in MySQL
member_status: "paid",
signup_date: "2024-01-01"
}
This design means analytics can be added incrementally without requiring changes to existing data models or application logic.
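As a rough illustration of that loose coupling, the sketch below joins analytics rows with post records in application code, keyed on the shared UUID. The row shapes, the `title` field, and the `mergeByPostUuid` helper are assumptions made for this example, not Ghost's actual code.

```javascript
// Hypothetical helper: joins analytics rows (as they might come back from
// Tinybird) with post records (as they might come back from MySQL) on post_uuid.
function mergeByPostUuid(analyticsRows, postRecords) {
  // Index the MySQL records once so each lookup is O(1)
  const postsByUuid = new Map(postRecords.map(post => [post.uuid, post]));

  return analyticsRows.map(row => {
    const post = postsByUuid.get(row.post_uuid);
    return {
      post_uuid: row.post_uuid,
      pageviews: row.pageviews,
      // Null when MySQL has no matching post (for example, a non-content page)
      attribution_id: post ? post.id : null,
      title: post ? post.title : null
    };
  });
}

const merged = mergeByPostUuid(
  [
    { post_uuid: 'post-456-uuid', pageviews: 1247 },
    { post_uuid: 'unknown-uuid', pageviews: 3 }
  ],
  [{ uuid: 'post-456-uuid', id: 'post_456', title: 'Hello' }]
);
console.log(merged);
```

Because the join happens in the application, neither system needs to know about the other's schema; only the UUID has to be stable.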
Privacy-first data collection without streaming infrastructure
Ghost's analytics implementation starts with a client-side tracking script that captures page views and user interactions. These events are then processed server-side before being forwarded to the Tinybird /events API endpoint, keeping data collection entirely first-party.
// Core tracking event structure
{
"timestamp": "2024-01-01T12:00:00Z",
"action": "page_hit",
"payload": {
"site_uuid": "ghost-site-uuid",
"member_uuid": "member-uuid-if-logged-in",
"member_status": "free|paid|comped",
"post_uuid": "post-uuid-if-content-page",
"location": "US", // Country from timezone, not IP
"parsedReferrer": {
"source": "twitter.com"
}
}
}
Data ingestion uses a direct HTTP POST to Tinybird's Events endpoint. This avoids the need for message queues or complex ETL pipelines:
// Direct ingestion to Tinybird
fetch(`${tinybird_host}?name=analytics_events&token=${token}`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(eventData)
});
Tinybird handles data validation, partitioning, and indexing automatically. Events become available for querying within milliseconds of ingestion.
This approach reduces infrastructure complexity compared to typical analytics architectures that require multiple components for data collection, processing, and storage. Since event capture is initiated by the client, it's easily portable to self-hosted instances without the need for additional infrastructure.
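A slightly fuller version of the ingestion call might look like the sketch below. It assumes the Events API lives at `/v0/events`, accepts newline-delimited JSON, and authenticates with a bearer token; the `buildEventsRequest` and `sendEvents` helpers are hypothetical names introduced here, so check the details against Tinybird's documentation before relying on them.

```javascript
// Builds the URL and fetch options for one batch of events.
// The body is NDJSON: one JSON object per line.
function buildEventsRequest({ host, datasource, token, events }) {
  const url = new URL('/v0/events', host);
  url.searchParams.set('name', datasource);

  return {
    url: url.toString(),
    options: {
      method: 'POST',
      // Assumption: the token is sent as a bearer token instead of a query parameter
      headers: { Authorization: `Bearer ${token}` },
      body: events.map(event => JSON.stringify(event)).join('\n')
    }
  };
}

// Sends the batch and surfaces HTTP failures instead of silently dropping events.
async function sendEvents(params) {
  const { url, options } = buildEventsRequest(params);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Ingestion failed with HTTP ${response.status}`);
  }
  return response;
}
```

Separating request construction from the network call keeps the payload format easy to unit test without a running Tinybird instance.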
Real-time enrichment with materialized views
Once data reaches Tinybird, it's immediately available for querying. However, raw event data requires processing for meaningful analytics: sessions, traffic performance, sources, and geographic insights.
Ghost defines these transformations as materialized views that update automatically as new data arrives:
-- Transform raw events into structured analytics
SELECT
timestamp,
site_uuid,
session_id,
JSONExtractString(payload, 'member_uuid') as member_uuid,
JSONExtractString(payload, 'post_uuid') as post_uuid,
-- Device classification
CASE
WHEN user_agent LIKE '%Mobile%' THEN 'Mobile'
WHEN user_agent LIKE '%Tablet%' THEN 'Tablet'
ELSE 'Desktop'
END as device,
-- Referrer normalization
CASE
WHEN referrer IN ('x.com', 'twitter.com') THEN 'Twitter'
WHEN referrer LIKE '%facebook%' THEN 'Facebook'
ELSE domainWithoutWWW(referrer)
END as source
FROM analytics_events
WHERE action = 'page_hit'
This materialized view processes each incoming event, adding device classifications, normalized referrer sources, and structured data optimized for analytics queries within Ghost's specific domain.
Ghost can query session metrics, device breakdowns, and traffic source analysis directly from this processed view without managing stream processing infrastructure. The materialized view approach replaces what would typically require multiple processing components.
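For unit tests or fixtures, it can help to mirror these classification rules outside the database. The sketch below restates the two CASE expressions in JavaScript; `classifyDevice` and `normalizeSource` are hypothetical helpers, and the `www.` stripping is only an approximation of `domainWithoutWWW` for bare hostnames.

```javascript
// Mirrors the device CASE expression: case-sensitive substring checks,
// evaluated in the same order as the SQL.
function classifyDevice(userAgent) {
  if (userAgent.includes('Mobile')) return 'Mobile';
  if (userAgent.includes('Tablet')) return 'Tablet';
  return 'Desktop';
}

// Mirrors the referrer CASE expression. Assumes `referrer` is already a
// hostname, as the IN ('x.com', 'twitter.com') comparison in the SQL implies.
function normalizeSource(referrer) {
  if (referrer === 'x.com' || referrer === 'twitter.com') return 'Twitter';
  if (referrer.includes('facebook')) return 'Facebook';
  // Approximation of domainWithoutWWW for bare hostnames
  return referrer.replace(/^www\./, '');
}

console.log(classifyDevice('Mozilla/5.0 (iPhone) Mobile/15E148'));
console.log(normalizeSource('www.example.com'));
```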
API integration and authentication in multi-tenant environment
Ghost needed to expose analytics data through their existing API while maintaining security and performance. Rather than exposing Tinybird tokens directly, Ghost generates scoped JWT tokens that limit access to specific analytics endpoints for specific sites:
// Ghost generates scoped tokens for each publication
const payload = {
workspace_id: workspaceId,
exp: Math.floor(Date.now() / 1000) + (180 * 60), // 3 hours
scopes: TINYBIRD_PIPES.map(pipe => ({
type: 'PIPES:READ',
resource: pipe,
fixed_params: {
site_uuid: siteUuid // Locked to this publication only
}
}))
};
This token scoping ensures each Ghost publication can only access its own analytics data. The JWT approach provides fine-grained access control in a multi-tenant system while allowing Tinybird to handle query execution and performance optimization.
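To show how a payload like this could become a token, here is a minimal sketch that signs it as an HS256 JWT using only Node's built-in crypto module. It assumes HS256 signing with a workspace-level secret (such as the admin token), which should be confirmed against Tinybird's JWT documentation; `signJwt`, the pipe name `api_top_pages`, and the example values are illustrative only.

```javascript
const { createHmac } = require('node:crypto');

// Encodes a string as unpadded base64url, as the JWT format requires
function base64url(value) {
  return Buffer.from(value).toString('base64url');
}

// Signs a payload as an HS256 JWT: header.payload.signature
function signJwt(payload, secret) {
  const header = base64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const body = base64url(JSON.stringify(payload));
  const signature = createHmac('sha256', secret)
    .update(`${header}.${body}`)
    .digest()
    .toString('base64url');
  return `${header}.${body}.${signature}`;
}

// Example values are placeholders, not real credentials
const exampleToken = signJwt(
  {
    workspace_id: 'tb_workspace_123',
    exp: Math.floor(Date.now() / 1000) + 180 * 60, // 3 hours
    scopes: [
      {
        type: 'PIPES:READ',
        resource: 'api_top_pages', // hypothetical pipe name
        fixed_params: { site_uuid: 'ghost-site-uuid' }
      }
    ]
  },
  'tb_admin_token_xyz'
);
console.log(exampleToken.split('.').length);
```

In production code a maintained JWT library is usually the safer choice; the manual version here only makes the token structure visible.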
Ghost built specialized API endpoints for different analytics use cases:
- Site-wide KPIs: Visits, pageviews, bounce rate, session duration
- Real-time data: Current active visitors
- Breakdowns: Top browsers, devices, locations, pages, referrer sources
- Post analytics: Performance metrics for individual articles
Each endpoint maps to a Tinybird pipe (SQL query) that Ghost exposes through their API. This eliminates the need for complex aggregation logic in the application layer. Endpoints are dynamic - defined in SQL with query parameters - allowing flexible query patterns with minimal configuration:
TOKEN "stats_page" READ
NODE _top_pages_0
SQL >
%
select
case when post_uuid = 'undefined' then '' else post_uuid end as post_uuid,
pathname,
uniqExact(session_id) as visits
from _mv_hits h
inner join filtered_sessions fs
on fs.session_id = h.session_id
where
site_uuid = {{String(site_uuid, 'mock_site_uuid', description="Tenant ID", required=True)}}
{% if defined(date_from) %}
and toDate(toTimezone(timestamp, {{String(timezone, 'Etc/UTC', description="Site timezone", required=True)}}))
>=
{{ Date(date_from, description="Starting day for filtering a date range", required=False) }}
{% else %}
and toDate(toTimezone(timestamp, {{String(timezone, 'Etc/UTC', description="Site timezone", required=True)}}))
>=
timestampAdd(today(), interval -7 day)
{% end %}
{% if defined(date_to) %}
and toDate(toTimezone(timestamp, {{String(timezone, 'Etc/UTC', description="Site timezone", required=True)}}))
<=
{{ Date(date_to, description="Finishing day for filtering a date range", required=False) }}
{% else %}
and toDate(toTimezone(timestamp, {{String(timezone, 'Etc/UTC', description="Site timezone", required=True)}}))
<=
today()
{% end %}
{% if defined(member_status) %}
and member_status IN (
select arrayJoin(
{{ Array(member_status, "'undefined', 'free', 'paid'", description="Member status to filter on", required=False) }}
|| if('paid' IN {{ Array(member_status) }}, ['comped'], [])
)
)
{% end %}
{% if defined(device) %} and device = {{ String(device, description="Device to filter on", required=False) }} {% end %}
{% if defined(browser) %} and browser = {{ String(browser, description="Browser to filter on", required=False) }} {% end %}
{% if defined(os) %} and os = {{ String(os, description="Operating system to filter on", required=False) }} {% end %}
{% if defined(location) %} and location = {{ String(location, description="Location to filter on", required=False) }} {% end %}
{% if defined(pathname) %} and pathname = {{ String(pathname, description="Pathname to filter on", required=False) }} {% end %}
{% if defined(post_uuid) %} and post_uuid = {{ String(post_uuid, description="Post UUID to filter on", required=False) }} {% end %}
{% if defined(post_type) %}
{% if post_type == 'post' %}
and post_type = 'post'
{% else %}
and (post_type != 'post' or post_type is null)
{% end %}
{% end %}
group by post_uuid, pathname
order by visits desc
limit {{ Int32(skip, 0) }},{{ Int32(limit, 50) }}
TYPE ENDPOINT
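On the consuming side, a pipe like this is queried over HTTP with its template parameters passed as query-string values. The sketch below builds such a URL, assuming the pipe is published at `/v0/pipes/<name>.json` and that the scoped JWT is passed as a `token` parameter; the pipe name `api_top_pages` and the `buildPipeUrl` helper are hypothetical.

```javascript
// Builds the query URL for a published pipe endpoint.
// Parameters left undefined or null are omitted, so the pipe's
// `{% if defined(...) %}` branches fall back to their defaults.
function buildPipeUrl(host, pipeName, params, token) {
  const url = new URL(`/v0/pipes/${pipeName}.json`, host);
  for (const [key, value] of Object.entries(params)) {
    if (value !== undefined && value !== null) {
      url.searchParams.set(key, String(value));
    }
  }
  url.searchParams.set('token', token);
  return url.toString();
}

const topPagesUrl = buildPipeUrl(
  'https://api.tinybird.co',
  'api_top_pages', // hypothetical pipe name
  { timezone: 'Etc/UTC', date_from: '2024-01-01', device: undefined },
  'scoped-jwt-goes-here'
);
console.log(topPagesUrl);
```

Note that `site_uuid` is not passed here: with the scoped token described earlier, that value is fixed by the token rather than chosen by the caller.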
Supporting both hosted and self-hosted deployments
Ghost needed analytics that worked identically whether users chose Ghost(Pro)'s managed hosting or decided to self-host on their own infrastructure.
Tinybird provides both deployment models with the same API, features, and performance characteristics.
For Ghost(Pro) users: Analytics work out of the box with Tinybird's managed cloud, integrated into the Ghost admin interface.
For self-hosters: Ghost includes everything needed to run analytics locally:
# Clone the Ghost repo
git clone https://github.com/TryGhost/ghost-docker.git /opt/ghost
cd /opt/ghost
cp .env.example .env
# Set up Tinybird: login, sync resources, deploy, and get environment tokens
docker compose run --rm tinybird-login
docker compose run --rm tinybird-sync
docker compose run --rm tinybird-deploy
docker compose run --rm tinybird-login get-tokens # -> add tokens to .env
# Install Ghost with Tinybird
docker compose pull
docker compose up -d
You can learn more about enabling web analytics for Ghost 6.0 via Docker in Ghost's documentation.
The self-hosted setup includes the same materialized views, API endpoints, and real-time processing as the cloud version. This provides feature parity between deployment models. Ghost self-hosters can choose to use Tinybird's managed cloud infrastructure or use Tinybird self-managed regions for complete independence.
This approach allows open source projects to offer analytics to both hosted and self-hosted users using the same underlying technology stack.
Configuration flexibility for development and production
Ghost's analytics configuration demonstrates how to build systems that work across different environments without code changes. The configuration structure supports both cloud and local Tinybird instances:
// ghost/core/config.local.json
{
"tinybird": {
"workspaceId": "tb_workspace_123",
"adminToken": "tb_admin_token_xyz",
"tracker": {
"endpoint": "http://localhost:3000/tb/web_analytics"
},
"stats": {
"id": "custom-site-uuid-override", // Optional for testing
"endpoint": "https://api.tinybird.co",
"local": {
"enabled": true,
"token": "local-stats-token",
"endpoint": "http://localhost:8123"
}
}
}
}
This configuration structure provides several capabilities:
- Environment switching: Toggle between cloud and local Tinybird instances
- Development overrides: Use different site UUIDs for testing with isolated datasets
- Token hierarchy: JWT tokens for production, fallback to stats tokens for development
- Endpoint flexibility: Easy switching between production and development environments
Ghost's service layer automatically detects the configuration and routes requests appropriately. This means developers can work with local Tinybird instances during development, then deploy to production using cloud Tinybird without any code changes.
The configuration also supports mixed deployments where some Ghost instances use cloud Tinybird while others use self-hosted instances, all managed through the same codebase.
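The routing decision described above could be as small as the following sketch. It is a guess at the shape of that logic based only on the configuration shown here, not Ghost's actual service code; `resolveStatsTarget` and the `mode` field are names invented for the example.

```javascript
// Picks the stats endpoint and token source from a `tinybird` config block.
// Local mode uses the static stats token; cloud mode expects a scoped JWT
// to be generated per request, so no static token is returned.
function resolveStatsTarget(tinybird) {
  const stats = tinybird.stats || {};
  const local = stats.local || {};

  if (local.enabled) {
    return { mode: 'local', endpoint: local.endpoint, token: local.token };
  }
  return { mode: 'cloud', endpoint: stats.endpoint, token: null };
}

const exampleConfig = {
  workspaceId: 'tb_workspace_123',
  stats: {
    endpoint: 'https://api.tinybird.co',
    local: {
      enabled: true,
      token: 'local-stats-token',
      endpoint: 'http://localhost:8123'
    }
  }
};
console.log(resolveStatsTarget(exampleConfig));
```

Keeping the decision in one function means the rest of the service only ever sees an endpoint and a token source, regardless of where Tinybird is running.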
Results and performance at scale
Ghost now processes hundreds of millions of pageviews every month through this Tinybird-powered analytics system, delivering real-time insights to thousands of publications.
Key results include:
- Real-time analytics for sites processing millions of monthly pageviews
- Sub-second query response times for analytics breakdowns
- Minimal infrastructure overhead for self-hosted installations
- Data portability with the ability to migrate or export data
- Consistent scaling from small blogs to major publications
The system handles Ghost's scale while remaining operable on modest hardware for self-hosted installations. Tinybird's ClickHouse-based architecture provides this scaling flexibility.
Implementation patterns for developers
Ghost's implementation with Tinybird demonstrates several useful patterns for applications that need analytics:
- Start simple: HTTP POST for ingestion, SQL for transformations, APIs for consumption
- Scale gradually: Begin with hosted services, migrate to self-hosted when needed
- Maintain compatibility: Same APIs work whether hosted or self-hosted
- Focus on features: Minimize infrastructure complexity for analytics
This architecture allows applications to provide analytics capabilities without choosing between ease of use and data ownership. Both hosted and self-hosted deployments can use the same feature set.
The analytics implementation is open source and available in the Ghost repository on GitHub. Developers interested in similar implementations can examine Ghost's analytics integration or explore Tinybird's documentation for their own projects.
If you want to get started self-hosting Ghost with Tinybird, check out the Ghost Docker install guide.