In this guide, we'll show you how to build a production-ready analytics agent using the Agno framework and Tinybird's real-time analytics platform. By the end, you'll have an agent that can answer complex data questions, investigate performance issues, and deliver its findings through multiple channels.
Why Agno and Tinybird?
Agno is a Python framework designed for building AI agents with specific performance characteristics:
- Performance: Agents instantiate in approximately 3 microseconds and use 50x less memory than alternatives like LangGraph
- Model flexibility: Works with 23+ LLM providers including Claude, Gemini, and OpenAI
- Multi-modal support: Native support for text, images, audio, and video inputs/outputs
- Built-in reasoning: First-class support for reasoning models and chain-of-thought approaches
- Production features: Includes memory, storage, structured outputs, and monitoring capabilities
Tinybird provides the data infrastructure for agents with:
- Query performance: Sub-second query responses at scale
- MCP Server: Remote, hosted Model Context Protocol server that securely exposes your workspace resources as AI-accessible tools
- SQL compatibility: Full SQL support with advanced analytics functions
- API generation: Every query becomes a documented REST API endpoint automatically. Natural language descriptions help agents discover these APIs as tools and use them to fetch data.
Together, they create a robust foundation for building analytics agents that can explore data, detect anomalies, and provide intelligent insights.
When building AI agents, you have several options for connecting to external services and data sources. For analytics use cases, it's important to differentiate between traditional text-to-SQL MCP tools and API endpoint tools. Each involves tradeoffs for agent-to-data connectivity. Read more about MCPs vs APIs for agentic analytics.
Step 1: Set up a Tinybird workspace
First, let's set up your data infrastructure. Tinybird will serve as the analytics backend that your agent queries.
If you already have data in Tinybird that you want to work with, you can skip this section. If you have data in other places that you want to get into Tinybird, check out the ingest guides for more details on how to get data into Tinybird.
If you don't have data in Tinybird and want to work with an example, follow along.
Create your workspace
First, create a Tinybird workspace.
Install the Tinybird CLI:
curl https://tinybird.co | sh
Authenticate with your workspace (or create a new one).
tb login # A browser window will open for you to login and select/create a workspace
Start the local Tinybird development server (note you'll need a Docker runtime).
tb local start
Initialize an empty workspace.
tb create
You're now ready to start building a data project.
Create a data source
Tinybird data sources define the schema and source connections (if applicable) for your data.
For this example, we'll create a data source to store web analytics events. Create a file called events.datasource in the datasources directory:
cd datasources && touch events.datasource
Paste the contents below into the file and save:
SCHEMA >
`timestamp` DateTime,
`session_id` String,
`user_id` String,
`event_type` String,
`page_url` String,
`referrer` String,
`duration_ms` Int32,
`country` String
ENGINE "MergeTree"
ENGINE_PARTITION_KEY "toYYYYMM(timestamp)"
ENGINE_SORTING_KEY "timestamp, session_id"
Deploy the data source to your local environment:
tb deploy
Generate some data
Use the Tinybird CLI's mock data generation feature to populate your data sources with realistic sample data:
tb mock events --rows 100000 --prompt "Generate timestamps within the last 30 days. All pages should have a tinybird.co domain. Use a mix of referrers including Google, X/Twitter, LinkedIn, HackerNews, ChatGPT, and Perplexity. Use pageview, click, or form_submit as event types."
The tb mock command automatically generates sample data that matches your data source schema. This is much faster than writing custom data generation scripts and ensures the data types and formats are correct. Use the --prompt flag to further refine the mock data, for example:
- Realistic timestamps distributed over recent time periods
- Common page URLs and referrer patterns
- Geographic distribution of users
- Varied session durations and page view counts
The command will generate two files in the fixtures/ folder:
- A .sql file which defines the data generation
- A .ndjson file containing the produced mock data
Append the generated mock data to your data source:
tb datasource append events fixtures/events.ndjson
Create API endpoints
You now have data in Tinybird. The next step is to create a few SQL-based API endpoints that your agent can use.
Now, in theory, you could proceed without creating API endpoints and simply allow the agent to generate its own SQL queries. However, predefined API endpoints can help reduce latency and make responses to common prompts more deterministic, so let's create a couple of analytical endpoints that the agent can query.
Create top_pages.pipe:
DESCRIPTION >
'Returns the top N pages by page_view count within a specific date range'
NODE top_pages_node
SQL >
SELECT
page_url,
COUNT() as visits
FROM events
WHERE timestamp >= {{DateTime(date_from, '2024-01-01')}}
AND timestamp < {{DateTime(date_to, '2024-01-02')}}
AND event_type = 'pageview'
GROUP BY page_url
ORDER BY visits DESC
LIMIT {{Int32(N, 10)}}
Note the plain text description; use this to help agents discover the APIs as tools. The more descriptive, the better.
Feel free to create additional API endpoints based on your use case. You can create them manually or use Tinybird's CLI:
tb create --prompt "Create an API endpoint to return the top pages by clicks. Include date_from and date_to parameters, as well as an N parameter which limits the number of results."
tb create --prompt "Create a materialized view to calculate session metrics based on session_id. The resulting data source should store session_id, referrer, utm parameters for the session, session duration, total pageviews, and arrays containing all pages viewed and all actions taken"
Deploy your resources
Deploy your data sources and endpoints to Tinybird local:
tb deploy
You now have an active Tinybird workspace with data and API endpoints running on your localhost.
For production workloads with larger datasets, consider reviewing the Tinybird optimization guide to ensure optimal query performance.
Step 2: Instantiate an agent with Agno
Now let's set up the Agno framework to create our analytics agent.
Install dependencies
Create a new Python project and install the required packages:
python -m venv venv
source venv/bin/activate
pip install -U agno python-dotenv
Set up a basic agent
Create your first analytics agent in analytics_agent.py
:
from agno.agent import Agent
from agno.models.anthropic import Claude
from agno.models.google import Gemini
from agno.tools.mcp import MCPTools
from dotenv import load_dotenv
import os
import asyncio
load_dotenv()
async def create_analytics_agent():
# Choose your model (Gemini for larger context, Claude for reasoning)
model = Gemini(
id="gemini-2.5-flash",
vertexai=True,
project_id=os.getenv("GOOGLE_CLOUD_PROJECT"),
location=os.getenv("GOOGLE_CLOUD_LOCATION", "us-central1")
)
# Alternative: Use Claude for strong reasoning
# model = Claude(id="claude-4-sonnet-20250514")
agent = Agent(
model=model,
role="Analytics Data Analyst",
description="""You are an expert data analyst with access to web analytics data.
You can explore data, identify trends, and provide actionable insights.""",
instructions=[
"Always provide specific metrics and timeframes in your analysis",
"When exploring data, explain what the numbers mean for the business",
"Include relevant context about data patterns and anomalies"
],
markdown=True
)
return agent
async def main():
agent = await create_analytics_agent()
# Test the agent
await agent.aprint_response(
"Hi, I'm Cameron. How are you?",
stream=True
)
if __name__ == "__main__":
asyncio.run(main())
Make sure to create a .env file with your API keys for whatever AI model providers you're using.
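For example, a minimal .env for the setup above might look like the following. The Google Cloud variables match what the code reads with os.getenv; ANTHROPIC_API_KEY is the standard variable the Anthropic SDK reads if you switch to Claude, and TINYBIRD_TOKEN will be used in the next step. The values are illustrative placeholders.
# .env (illustrative values)
GOOGLE_CLOUD_PROJECT=your-gcp-project
GOOGLE_CLOUD_LOCATION=us-central1
ANTHROPIC_API_KEY=sk-ant-...
TINYBIRD_TOKEN=p.your_tinybird_token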
This is a very basic agent without any tool access. We need to give it access to our Tinybird MCP Server so it has the tools it needs to explore the data, generate SQL queries, execute those queries, and call our published API endpoints.
Step 3: Connect to the Tinybird MCP server
The Tinybird MCP Server automatically exposes your workspace resources as tools that your agent can use.
The MCP server provides several key capabilities:
- Role-based access control: Access to the MCP Server is secured with a Tinybird token, so agents can only access data within the token scopes. You can use JWTs to offer fine-grained, role-based access to the Tinybird MCP Server.
- Automatic tool discovery: Your Tinybird endpoints become callable functions that agents can use to fetch pre-calculated metrics.
- Data exploration: The built-in explore_data tool lets the agent spawn a server-side subagent that understands the data and develops queries.
- Text-to-SQL: Generate SQL queries with schema context.
- SQL execution: Directly query the data sources.
- Observability: List and query service data sources (those that contain your workspace logs) for debugging and performance monitoring.
You can learn more about available tools in the Tinybird MCP Server documentation.
Configure the Tinybird MCP Server
Let's give our agent access to the tools made available via the Tinybird MCP Server.
from agno.agent import Agent
from agno.models.anthropic import Claude
from agno.models.google import Gemini
from agno.tools.mcp import MCPTools
from agno.tools.slack import SlackTools
from agno.tools.resend import ResendTools
import os
import json
from datetime import datetime
async def create_advanced_analytics_agent():
# Tinybird MCP configuration
tinybird_api_key = os.getenv("TINYBIRD_TOKEN")
server_url = f"https://mcp.tinybird.co?token={tinybird_api_key}"
mcp_tools = MCPTools(
transport="streamable-http",
url=server_url,
timeout_seconds=300
)
# Choose your model (Gemini for larger context, Claude for reasoning)
model = Gemini(
id="gemini-2.5-flash",
vertexai=True,
project_id=os.getenv("GOOGLE_CLOUD_PROJECT"),
location=os.getenv("GOOGLE_CLOUD_LOCATION", "us-central1")
)
# Alternative: Use Claude for strong reasoning
# model = Claude(id="claude-4-sonnet-20250514")
agent = Agent(
model=model,
tools=[mcp_tools],
role="Senior Analytics Agent",
name="DataExplorer",
description="""You are a senior data analyst with deep expertise in web analytics,
user behavior analysis, and business intelligence. You have access to real-time
analytics data through Tinybird and can explore data to answer complex questions.""",
instructions=[
"Use the explore_data tool for complex analytical questions",
"Always include timeframes and specific metrics in your analysis. Default to the last 24 hours unless a time frame is provided.",
"Provide business context and actionable recommendations.",
"When detecting anomalies, investigate potential root causes.",
"Format responses clearly with key findings highlighted."
],
markdown=True,
show_tool_calls=True,
debug_mode=True
)
return agent, mcp_tools
# Usage example
async def main():
agent, mcp_tools = await create_advanced_analytics_agent()
async with mcp_tools:
await agent.aprint_response(
"""Investigate any unusual traffic patterns in the last 24 hours.
Look for spikes in page views, unusual referrer patterns, or
geographic anomalies. Provide a summary with key metrics.""",
stream=True
)
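As with the basic agent, this snippet needs an asyncio entry point to run. A minimal one, assuming the code above lives in the same file:
if __name__ == "__main__":
    import asyncio

    # Run the advanced agent example end to end
    asyncio.run(main())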
Understand MCP tool capabilities
The Tinybird MCP Server provides several tools automatically:
- explore_data: Natural language data exploration
- list_endpoints: See available API endpoints
- execute_query: Run direct SQL queries
- [endpoint_name]: Direct access to each of your Tinybird endpoints
Example of using specific endpoint tools:
# The agent can call your endpoints directly
await agent.aprint_response(
"Use the top_pages endpoint to show me the most popular pages today"
)
# Or use natural language exploration
await agent.aprint_response(
"Explore user session data to find patterns in user engagement by country"
)
Step 4: Design system prompts and instructions
Effective prompt design is crucial for analytics agents (or any agent, for that matter). Here's how we recommend structuring prompts for different use cases.
Different language models have varying capabilities for SQL generation and analytical reasoning. If you're curious about which LLMs perform best for analytical SQL queries, this comparison can help inform your model selection.
Structure system prompts
Create a generalized system prompt to help the agent understand its core role:
SYSTEM_PROMPT = """
You are an expert data analyst for Tinybird metrics. You have MCP tools to get schemas, endpoints and data. The explore_data tool is an agent capable of exploring data autonomously.
"""
You can then provide further instructional prompts, or "missions", depending on the contents of your Tinybird workspace and how you want the agent to analyze your data, for example:
MISSION_PROMPT = """
You are an expert data analyst for web analytics. You have access to real-time data
through Tinybird MCP tools including page views, user sessions, and behavioral data.
<role_definition>
Your role is to:
- Analyze web analytics data to provide actionable insights
- Investigate performance anomalies and traffic patterns
- Monitor key business metrics and identify trends
- Provide clear, data-driven recommendations
</role_definition>
<data_access_rules>
- Use explore_data for complex analytical questions requiring data exploration
- Use specific endpoint tools when you know the exact data needed
- Always include timeframes in your analysis (default to last 24 hours if not specified)
- Provide both raw metrics and business context
</data_access_rules>
<response_formatting>
- Structure responses with clear sections: Summary, Key Findings, Metrics, Recommendations
- Use tables for comparative data
- Highlight significant changes or anomalies
- Include data confidence levels when relevant
</response_formatting>
Current date and time: {current_date}
"""
Here's how you might define an agent with a static system prompt and flexible missions:
from agno.agent import Agent
from agno.models.anthropic import Claude
from agno.models.google import Gemini
from agno.tools.mcp import MCPTools
from agno.tools.slack import SlackTools
from agno.tools.resend import ResendTools
import os
import json
from datetime import datetime
from textwrap import dedent
from prompts import SYSTEM_PROMPT
async def create_advanced_analytics_agent():
# Tinybird MCP configuration
tinybird_api_key = os.getenv("TINYBIRD_TOKEN")
server_url = f"https://mcp.tinybird.co?token={tinybird_api_key}"
mcp_tools = MCPTools(
transport="streamable-http",
url=server_url,
timeout_seconds=300
)
# Choose your model (Gemini for larger context, Claude for reasoning)
model = Gemini(
id="gemini-2.5-flash",
vertexai=True,
project_id=os.getenv("GOOGLE_CLOUD_PROJECT"),
location=os.getenv("GOOGLE_CLOUD_LOCATION", "us-central1")
)
mission="""
Your job is to analyze data in the Tinybird workspace to identify performance issues and make recommendations for how to fix them. Your goal is to effectively answer the user request.
<exploration_instructions>
- If list_service_datasources returns organization data sources, you must append "use organization service data sources" in the explore_data tool call, otherwise answer with an error message
- You MUST include a time filter in every call to the explore_data tool if not provided by the user in the prompt
- You MUST do one call to the explore_data tool per data source requested by the user
- Do not ask follow-up questions; make a best effort to answer the user request, and if you make any assumptions, report them in the response.
</exploration_instructions>
<output_instructions>
- Format output using Markdown
- Use clearly structured headers, text formatting, and lists
</output_instructions>
"""
agent = Agent(
model=model,
tools=[mcp_tools],
role="Senior Analytics Agent",
name="PerformanceInvestigator",
description=dedent(SYSTEM_PROMPT),
instructions=mission,
markdown=True,
show_tool_calls=True,
debug_mode=True
)
return agent, mcp_tools
# Usage example
async def investigate_traffic_spike():
agent, mcp_tools = await create_advanced_analytics_agent()
async with mcp_tools:
await agent.aprint_response(
"""Investigate any unusual traffic patterns in the last 24 hours.
Look for spikes in page views, unusual referrer patterns, or
geographic anomalies. Provide a summary with key metrics.""",
stream=True
)
Finally, user prompts and conversations direct the actions of the analytics agent:
Show me the top 5 pages by pageviews in the last 30 days
Which of those pages has the highest form submission conversion rate?
Which referrers get me the most traffic to the top converting page?
And so on. The Agno framework allows the agent to retain memory and context from a user session. It responds to each user prompt within the context of its system prompt, mission prompt, and the tools available to it.
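For example, the follow-up questions above could be sent as consecutive turns of the same session. A sketch, assuming the agent and mcp_tools from Step 3 and illustrative user_id and session_id values:
async with mcp_tools:
    questions = [
        "Show me the top 5 pages by pageviews in the last 30 days",
        "Which of those pages has the highest form submission conversion rate?",
        "Which referrers get me the most traffic to the top converting page?",
    ]
    for question in questions:
        # Reusing the same session keeps earlier turns in the agent's context
        await agent.aprint_response(
            question,
            stream=True,
            user_id="analyst",
            session_id="weekly-review",
        )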
Step 5: Configure agent output options
Your analytics agent can deliver insights through multiple channels and even trigger new agents to process its findings. Here are a few options for output with implementation examples.
CLI output
The simplest option for development and testing:
import asyncio
from agno.agent import Agent
async def cli_analytics_agent():
agent, mcp_tools = await create_advanced_analytics_agent()
print("🤖 Analytics Agent CLI")
print("Ask questions about your data or type 'exit' to quit")
print("-" * 50)
async with mcp_tools:
while True:
try:
user_input = input("\n💬 Your question: ").strip()
if user_input.lower() == "exit":
print("👋 Goodbye!")
break
await agent.aprint_response(
user_input,
stream=True,
user_id="analyst"
)
except KeyboardInterrupt:
print("\n👋 Goodbye!")
break
except Exception as e:
print(f"❌ Error: {e}")
# Run with: python analytics_agent.py
if __name__ == "__main__":
asyncio.run(cli_analytics_agent())
Email reports
You can use an email provider like Resend to generate and send automated email reports:
from agno.tools.resend import ResendTools
async def email_analytics_report():
agent = Agent(
model=Claude(id="claude-4-sonnet-20250514"),
tools=[
MCPTools(transport="streamable-http", url=server_url),
ResendTools(from_email="analytics@company.com")
],
instructions=[
"Generate comprehensive HTML email reports",
"Include executive summary and key metrics",
"Use tables and visual formatting"
]
)
async with agent.tools[0]: # MCP tools
response = await agent.arun(
"""Generate a weekly analytics report with:
1. Traffic overview and trends
2. Top performing content
3. User engagement metrics
4. Key insights and recommendations
Send this as an HTML email to team@company.com with subject
'Weekly Analytics Report - [Current Date]'""",
user_id="system"
)
return response
# Schedule with cron or GitHub Actions
# 0 9 * * 1 cd /path/to/agent && python -c "import asyncio; asyncio.run(email_analytics_report())"
Slack integration
You can even set up real-time notifications and interactive queries within Slack. For a more comprehensive implementation, check out how to chat with your data using a Birdwatcher Slack App:
from agno.tools.slack import SlackTools
async def slack_analytics_bot():
agent = Agent(
model=Claude(id="claude-4-sonnet-20250514"),
tools=[
MCPTools(transport="streamable-http", url=server_url),
SlackTools(token=os.getenv("SLACK_BOT_TOKEN"))
],
instructions=[
"Respond in Slack-friendly format with clear sections",
"Use code blocks for data tables",
"Include relevant metrics in thread responses",
"Tag relevant team members for critical alerts"
]
)
# Alert example
await agent.arun(
"""Check for any traffic anomalies in the last hour.
If significant issues are found, send an alert to #analytics-alerts
channel with details and suggested actions."""
)
# Interactive query handling
await agent.arun(
"""A user in #general asked: 'How is our mobile traffic trending this week?'
Analyze mobile traffic patterns and respond in the thread with insights."""
)
# Webhook handler for Slack events
async def handle_slack_message(event):
if "analytics" in event.get("text", "").lower():
await slack_analytics_bot()
Multi-agent workflows
For complex analytics scenarios or when you want other agents to take action on the insights found by your analytics agent, orchestrate multiple specialized agents:
from agno.agent import Agent
async def multi_agent_analysis():
# Data Explorer Agent
explorer = Agent(
model=model,
tools=[mcp_tools],
role="Data Explorer",
instructions=["Focus on data discovery and pattern identification"]
)
# Insights Analyst Agent
analyst = Agent(
model=model,
tools=[mcp_tools],
role="Insights Analyst",
instructions=["Focus on business interpretation and recommendations"]
)
# Report Writer Agent
writer = Agent(
model=model,
tools=[slack_tools, email_tools],
role="Report Writer",
instructions=["Create clear, actionable reports for stakeholders"]
)
async with mcp_tools:
# Step 1: Explore data
exploration = await explorer.arun(
"Identify interesting patterns in user behavior this week"
)
# Step 2: Analyze insights
analysis = await analyst.arun(
f"Based on this data exploration: {exploration.content}, "
"what are the key business insights and recommendations?"
)
# Step 3: Create and distribute report
await writer.arun(
f"Create a summary report based on this analysis: {analysis.content}. "
"Send to #analytics channel and email to stakeholders@company.com"
)
# Schedule for comprehensive weekly analysis
Step 6: Add advanced features and production considerations
Manage memory and context
Implement persistent memory for better contextual understanding:
from agno.memory.v2.db.postgres import PostgresMemoryDb
from agno.memory.v2.memory import Memory
from agno.storage.postgres import PostgresStorage
async def create_persistent_agent():
# Configure memory
memory = Memory(
model=Claude(id="claude-4-sonnet-20250514"),
db=PostgresMemoryDb(
table_name="analytics_memories",
db_url=os.getenv("DATABASE_URL")
)
)
# Configure storage for session history
storage = PostgresStorage(
table_name="analytics_sessions",
db_url=os.getenv("DATABASE_URL")
)
agent = Agent(
model=model,
tools=[mcp_tools],
memory=memory,
storage=storage,
enable_agentic_memory=True,
add_history_to_messages=True,
num_history_runs=10,
instructions=[
"Remember insights from previous analyses",
"Build context over time about data patterns",
"Reference historical findings when relevant"
]
)
return agent
Handle errors and build resilience
Implement robust error handling:
import logging
from tenacity import retry, stop_after_attempt, wait_exponential
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
@retry(
stop=stop_after_attempt(3),
wait=wait_exponential(multiplier=1, min=4, max=10)
)
async def resilient_query(agent, query, user_id="system"):
try:
async with agent.tools[0]: # MCP tools context
response = await agent.arun(query, user_id=user_id)
logger.info(f"Query successful: {query[:50]}...")
return response
except Exception as e:
logger.error(f"Query failed: {query[:50]}... - Error: {e}")
raise
async def safe_analytics_run(query):
try:
agent, mcp_tools = await create_advanced_analytics_agent()
return await resilient_query(agent, query)
except Exception as e:
logger.error(f"Analytics run failed completely: {e}")
return {"error": str(e), "success": False}
Deploy to production
You can easily deploy your Tinybird resources to a production cloud environment:
tb --cloud deploy
Deploy your agent using Docker and container orchestration:
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "analytics_agent.py"]
# docker-compose.yml
version: '3.8'
services:
analytics-agent:
build: .
environment:
- TINYBIRD_TOKEN=${TINYBIRD_TOKEN}
- SLACK_BOT_TOKEN=${SLACK_BOT_TOKEN}
- DATABASE_URL=${DATABASE_URL}
restart: unless-stopped
postgres:
image: postgres:15
environment:
- POSTGRES_DB=analytics_agent
- POSTGRES_USER=agent
- POSTGRES_PASSWORD=${DB_PASSWORD}
volumes:
- postgres_data:/var/lib/postgresql/data
volumes:
postgres_data:
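The Dockerfile above copies a requirements.txt that we haven't created yet. A minimal version based on the packages used in this guide might look like the following; add the SDKs for whichever model providers and tool integrations you actually use.
# requirements.txt (illustrative; pin versions for production)
agno
python-dotenv
tenacity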
Conclusion
Analytics agents like those you can build with Agno and Tinybird can change how organizations and their customers interact with data. By combining Agno's performance characteristics and tooling with Tinybird's high-performance data infrastructure and AI tooling, you can create intelligent systems that not only answer questions quickly but also proactively monitor, investigate, and report on your data.
Key advantages of this approach:
- Performance metrics: Agno's 3-microsecond instantiation time combined with Tinybird's sub-second queries
- Natural language interface: Eliminates the need for complex SQL queries or dashboard navigation
- Proactive monitoring: Agents that detect and investigate anomalies automatically
- Multiple delivery channels: Insights delivered via CLI, email, Slack, or APIs
- Production scalability: Both platforms designed for production workloads with measurable performance characteristics
What makes Tinybird essential for analytics agents
- Query performance: Sub-second response times for analytical queries
- MCP integration: Seamless tool discovery and automatic API generation
- SQL compatibility: Full analytical capabilities without compromise
- Developer workflow: From prototype to production deployment
- Transparent pricing: Pay-per-compute model with clear cost structure
As AI agents become central to data operations, having performant data infrastructure becomes crucial. Tinybird provides the foundation that makes your agents not just intelligent, but measurably fast, reliable, and scalable.
For more inspiration on data-driven applications you can build, explore examples like building a Datadog alternative, creating natural language filters for analytics dashboards, or using LLMs to generate user-defined data visualizations.
Start building
Sign up for Tinybird at tinybird.co and start building agents that transform how your team works with data. With Tinybird's free tier, you can prototype and deploy your first analytics agent within an hour.
Need inspiration? Check out Tinybird's AI agent templates - including one built with Agno - in the Tinybird AI repository.
The future of data analysis is conversational, proactive, and intelligent. Start building it today.