Build a content recommendation API using vector search

Read on to learn how to calculate vector embeddings using HuggingFace models and use Tinybird to perform vector search to find similar content based on vector distances.

GitHub Repository

Real-World Example: The 'Related posts' section of the Tinybird blog uses a similar vector search recommendation algorithm.

Vector search recommendation

In this tutorial, you learn how to:

  1. Use Python or Node.js to calculate vector embeddings on blog posts using HuggingFace models
  2. Post vector embeddings to a Tinybird Data Source using the Tinybird Events API
  3. Write a dynamic SQL query to calculate the closest content matches to a given blog post based on vector distances
  4. Publish your query as an API and integrate it into a frontend application

Prerequisites

To complete this tutorial, you need the following:

  1. A free Tinybird account
  2. An empty Tinybird Workspace
  3. Python 3.8+ or Node.js 18+ (for running example scripts)

This tutorial doesn't include a frontend. An example snippet is provided to show how you can integrate the published API into a React frontend.

Step 1: Setup

  1. Clone the demo_vector_search_recommendation repository.

  2. Install the Tinybird CLI (if not already installed):

curl https://tinybird.co | sh

  3. Authenticate with your Tinybird account:

cd demo_vector_search_recommendation
tb login

This opens your browser, where you can create a new Workspace or select an existing one.

  4. Deploy the Tinybird resources:

# Build the project (builds all datasources and pipes)
tb build

# Deploy to Tinybird Cloud
tb --cloud deploy

Step 2: Install Dependencies

This tutorial uses HuggingFace all-MiniLM-L6-v2 (384 dimensions), which runs locally and doesn't require API keys.

For Python:

cd scripts/python
pip install -r requirements.txt

For Node.js:

cd scripts/node
npm install
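
Optionally, once the Python dependencies are installed, you can sanity-check the embedding size locally. This is a minimal sketch (not part of the example scripts); all-MiniLM-L6-v2 should produce 384-dimensional vectors:

from sentence_transformers import SentenceTransformer

# Load the same model the example scripts use
model = SentenceTransformer('all-MiniLM-L6-v2')

# Encode a short string and check the vector length (expected: 384)
print(len(model.encode("hello world")))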

Step 3: Set Environment Variables

You'll need tokens with the appropriate scopes:

  • DATASOURCES:WRITE scope to send events to Tinybird
  • PIPES:READ scope to query the pipe endpoint

See the Tinybird Tokens documentation for instructions on creating tokens.

export TB_HOST=https://api.tinybird.co  # or your Tinybird host
export TB_TOKEN=your_tinybird_token_here

Step 4: Calculate embeddings and post to Tinybird

This tutorial uses standalone scripts that generate embeddings locally and send them directly to Tinybird.

The scripts automatically load posts from sample-data/posts.json by default. You can customize the source using the POSTS_SOURCE environment variable to point to a different file or URL.

Posts JSON file format:

The scripts expect a JSON file with a direct array of posts:

[
  {
    "slug": "my-post",
    "title": "My Post Title",
    "excerpt": "Post excerpt...",
    "content": "Full content...",
    "categories": ["tech"],
    "published_on": "2025-01-15",
    "status": "published",
    "updated_at": "2025-01-20"
  }
]
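
For reference, here is a minimal Python sketch of how loading from POSTS_SOURCE might look. The repository's scripts expose this as load_posts_from_source; the standalone version below is an approximation for illustration only:

import json
from urllib.request import urlopen

def load_posts(source):
    """Load the posts array from a local JSON file or an HTTP(S) URL."""
    if source.startswith(("http://", "https://")):
        with urlopen(source) as response:
            return json.loads(response.read())
    with open(source, encoding="utf-8") as f:
        return json.load(f)

posts = load_posts("sample-data/posts.json")
print(f"Loaded {len(posts)} posts")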

Using Python

Run with default sample data

cd scripts/python
python generate_embeddings.py

Run with custom posts file or URL

# From a local file
POSTS_SOURCE=../custom/posts.json python generate_embeddings.py

# From a URL (e.g., your CMS API)
POSTS_SOURCE=https://your-cms.com/api/posts python generate_embeddings.py

Use as a library

The Python script uses sentence-transformers to generate embeddings. You can also import it as a library:

import sys
sys.path.append('scripts/python')
from generate_embeddings import send_posts_to_tinybird, get_related_posts, load_posts_from_source
from sentence_transformers import SentenceTransformer

# Load model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Load posts from file or URL
posts = load_posts_from_source('sample-data/posts.json')
# Or from URL:
# posts = load_posts_from_source('https://your-cms.com/api/posts')

# Generate embeddings and send to Tinybird
send_posts_to_tinybird(posts, model)

# Get related posts
related = get_related_posts("my-post", limit=10)

You can find the complete example script in scripts/python/generate_embeddings.py.

Using Node.js

Run with default sample data

cd scripts/node
node generate_embeddings.js

Run with custom posts file or URL

# From a local file
POSTS_SOURCE=../custom/posts.json node generate_embeddings.js

# From a URL (e.g., your CMS API)
POSTS_SOURCE=https://your-cms.com/api/posts node generate_embeddings.js

Use as a library

The Node.js script uses @xenova/transformers to generate embeddings. You can also require it as a library:

const { sendPostsToTinybird, getRelatedPosts, loadPostsFromSource } = require('./scripts/node/generate_embeddings');
const { pipeline } = require('@xenova/transformers');

// Load model
const model = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');

// Load posts from file or URL
const posts = await loadPostsFromSource('sample-data/posts.json');
// Or from URL:
// const posts = await loadPostsFromSource('https://your-cms.com/api/posts');

// Generate embeddings and send to Tinybird
await sendPostsToTinybird(posts, model);

// Get related posts
const related = await getRelatedPosts("my-post", 10);

You can find the complete example script in scripts/node/generate_embeddings.js.

Step 5: Post content metadata and embeddings to Tinybird

After calculating the embeddings, you can push them along with the content metadata to Tinybird using the Events API.

The Data Source schema is defined in datasources/posts.datasource:

SCHEMA >
    `slug` String `json:$.slug`,
    `embedding` Array(Float32) `json:$.embedding[:]`,
    `status` String `json:$.status`,
    `timestamp` DateTime `json:$.timestamp` DEFAULT now()

ENGINE "ReplacingMergeTree"
ENGINE_VER "timestamp"
ENGINE_SORTING_KEY "slug"

This Data Source receives the post metadata and calculated embeddings and deduplicates rows so that only the most up-to-date entry for each post survives. The ReplacingMergeTree engine handles the deduplication using the ENGINE_VER setting, which points to the timestamp column: for each slug, only the row with the latest timestamp is kept in the Data Source.

The Data Source has the slug column as its primary sorting key, because you filter by slug to retrieve the embedding for the current post. Having slug as the primary sorting key makes that filter more performant.
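
To make the ingestion step concrete, here is a hedged Python sketch of sending a single row to the posts Data Source through the Events API (newline-delimited JSON over HTTP). The example scripts' send_posts_to_tinybird function does something equivalent, though the exact payload construction may differ:

import json
import os

import requests

TB_HOST = os.environ["TB_HOST"]    # e.g. https://api.tinybird.co
TB_TOKEN = os.environ["TB_TOKEN"]  # token with DATASOURCES:WRITE scope

event = {
    "slug": "my-post",
    "embedding": [0.01] * 384,  # replace with the real 384-dimension vector
    "status": "published",
    # timestamp is omitted here; the schema defaults it to now()
}

# The Events API appends NDJSON rows to the Data Source named in the query string
response = requests.post(
    f"{TB_HOST}/v0/events",
    params={"name": "posts"},
    headers={"Authorization": f"Bearer {TB_TOKEN}"},
    data=json.dumps(event),
)
response.raise_for_status()
print(response.json())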

Step 6: Calculate distances in SQL using Tinybird Pipes

If you've completed the previous steps, you should have a posts Data Source in your Tinybird Workspace containing the timestamp, slug, embedding, and status for each blog post.

You can verify that you have data from the Tinybird CLI with:

tb sql 'SELECT * FROM posts LIMIT 1'

This tutorial includes a multi-node SQL Pipe to calculate the vector distance of each post to a specific post supplied as a query parameter. The Pipe config is contained in the pipes/similar_posts.pipe file, and the SQL is explained below:

NODE get_target_embedding
SQL >
    %
    SELECT embedding AS target_embedding
    FROM posts
    WHERE slug = {{ String(slug, required=True) }}
      AND status = 'published'
    ORDER BY timestamp DESC
    LIMIT 1

NODE aggregated_posts
SQL >
    %
    SELECT
        slug,
        argMax(status, timestamp) AS status,
        argMax(embedding, timestamp) AS embedding
    FROM posts
    WHERE slug != {{ String(slug, required=True) }}
    GROUP BY slug
    HAVING status = 'published' AND length(embedding) = 384

NODE find_similar_posts
SQL >
    %
    SELECT
        p.slug,
        p.status,
        1 - cosineDistance(t.target_embedding, p.embedding) AS similarity
    FROM aggregated_posts AS p
    CROSS JOIN get_target_embedding AS t
    WHERE length(t.target_embedding) = 384
    HAVING similarity >= {{ Float32(min_similarity, 0.1, description='Minimum similarity threshold') }}
    ORDER BY similarity DESC
    LIMIT {{ Int32(limit, 10, description='Number of related posts to return') }}

This query:

  1. Fetches the target embedding: Gets the embedding for the requested post by slug
  2. Aggregates other posts: Groups all other published posts by slug, keeping the latest version of each
  3. Calculates similarity: Uses cosine distance (1 - cosineDistance()) to find similar posts, ensuring all embeddings are 384 dimensions (HuggingFace all-MiniLM-L6-v2); see the short numeric sketch after this list
  4. Filters and sorts: Filters by minimum similarity threshold and returns the top N results
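
As a quick intuition check: cosineDistance returns 1 minus the cosine of the angle between two vectors, so 1 - cosineDistance is plain cosine similarity (1.0 for identical directions, 0.0 for orthogonal vectors). The short sketch below reproduces that locally with numpy, purely for illustration:

import numpy as np

def cosine_similarity(a, b):
    # Equivalent to 1 - cosineDistance(a, b) in ClickHouse
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 0.0, 0.0])
print(cosine_similarity(a, np.array([1.0, 0.0, 0.0])))  # 1.0 (identical)
print(cosine_similarity(a, np.array([0.0, 1.0, 0.0])))  # 0.0 (orthogonal)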

You can deploy this Pipe to your Tinybird server with:

tb --cloud deploy

When you deploy it, Tinybird automatically publishes it as a scalable, dynamic REST API Endpoint that accepts a slug query parameter.

Step 7: Query the API

Once embeddings are in Tinybird, you can query the pipe endpoint:

curl --compressed \
  -H "Authorization: Bearer $TB_TOKEN" \
  "https://<your_host>/v0/pipes/similar_posts.json?slug=my-post&limit=10"

Query Parameters:

  • slug (required): Post slug to find related posts for
  • limit (optional): Maximum number of results (default: 10)
  • min_similarity (optional): Minimum similarity threshold (default: 0.1)

Response:

{
  "data": [
    {
      "slug": "related-post",
      "status": "published",
      "similarity": 0.85
    }
  ]
}
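
If you prefer calling the endpoint from Python rather than curl, a minimal sketch (reusing the TB_HOST and TB_TOKEN environment variables from earlier) could look like this:

import os

import requests

TB_HOST = os.environ["TB_HOST"]    # e.g. https://api.tinybird.co
TB_TOKEN = os.environ["TB_TOKEN"]  # token with PIPES:READ scope

response = requests.get(
    f"{TB_HOST}/v0/pipes/similar_posts.json",
    headers={"Authorization": f"Bearer {TB_TOKEN}"},
    params={"slug": "my-post", "limit": 10, "min_similarity": 0.1},
)
response.raise_for_status()

for row in response.json()["data"]:
    print(row["slug"], row["similarity"])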

Step 8: Integrate into the frontend

Integrating your vector search API into the frontend is straightforward. Here's an example implementation:

// Assumes `host` and `token` hold your Tinybird host and a token with PIPES:READ scope,
// and that `getPost(slug)` is your own helper that loads a full post from your CMS.
export async function getRelatedPosts(slug: string) {
  const recommendationsUrl = `${host}/v0/pipes/similar_posts.json?token=${token}&slug=${slug}&limit=10`;
  const recommendationsResponse = await fetch(recommendationsUrl).then(
    (response) => response.json()
  );

  if (!recommendationsResponse.data) return [];

  // Resolve each recommended slug to a full post, dropping any that can't be found
  return Promise.all(
    recommendationsResponse.data.map(async ({ slug }: { slug: string }) => {
      return await getPost(slug);
    })
  ).then((data) => data.filter(Boolean));
}

Step 9: See it in action

You can see how this looks by checking out any blog post in the Tinybird Blog. At the bottom of each post, you can find a Related Posts section that's powered by a real Tinybird API using a similar implementation.

Important considerations

Model Consistency

Critical: All embeddings must use the same model and dimensions for accurate similarity comparison. This tutorial uses HuggingFace all-MiniLM-L6-v2 (384 dimensions).

Best Practice: Choose one embedding model for your entire dataset and stick with it. If you need to change models, regenerate all embeddings.

Embedding Dimensions

This example uses 384 dimensions (HuggingFace all-MiniLM-L6-v2). The pipe checks that all embeddings are exactly 384 dimensions. If you modify the implementation to use a different model, make sure to:

  1. Update the pipe to check for the correct dimensions (currently length(embedding) = 384)
  2. Regenerate all embeddings with the new model
  3. Ensure all embeddings use the same dimensions

Alternative Models

You can use any embedding model, whether from HuggingFace or another source, by updating the script to load and apply your preferred model. Just make sure to adjust the processing code and the Pipe to accommodate the dimensions of your chosen model's output.
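
For example, here is a minimal sketch using a different sentence-transformers model (all-mpnet-base-v2, which outputs 768-dimensional vectors). If you switch models like this, update the pipe's dimension check and regenerate every embedding:

from sentence_transformers import SentenceTransformer

# all-mpnet-base-v2 produces 768-dimensional embeddings (vs. 384 for all-MiniLM-L6-v2)
model = SentenceTransformer('all-mpnet-base-v2')

embedding = model.encode("My Post Title Post excerpt... Full content...")
print(len(embedding))  # 768, so the pipe check becomes length(embedding) = 768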

Example: Using OpenAI Embeddings (Node.js)

You can also use OpenAI's API for higher-dimensional embeddings (e.g., text-embedding-3-small, 1536 dims). Note that this requires an OPENAI_API_KEY and may incur API costs.

Here's a basic example (JavaScript / Node.js, using openai npm package):

import { OpenAI } from "openai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function getOpenAIEmbedding(text) {
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: text,
  });
  // Returns array of floats (length 1536)
  return response.data[0].embedding;
}

// Example usage:
const post = {
  slug: "my-post",
  title: "My Post Title",
  excerpt: "Post excerpt...",
  content: "Full content...",
  categories: ["tech"],
  published_on: "2025-01-15",
  status: "published",
};

const textToEmbed = `${post.title} ${post.excerpt} ${post.content}`;
const embedding = await getOpenAIEmbedding(textToEmbed);
// Now use `embedding` in the event you send to Tinybird

If using OpenAI, ensure your Tinybird pipes and queries expect 1536 dimensions: update any checks from length(embedding) = 384 to length(embedding) = 1536, and regenerate all embeddings accordingly.
