Back to blog
Tutorial

Embeddings + pgvector: semantic search on your site (real case)

By Flávio Emanuel · · 9 min read

A client of mine runs a dental clinic. Offers 34 different types of procedures. Patients come saying: “I have a baby tooth that didn’t fall out, what do I do?” or “I want to improve my bite”.

Keyword search doesn’t work. If the patient types “wrong tooth”, they won’t find “Orthodontics”. If they type “tartar cleaning”, they won’t find “Periodontal Scaling” (which is basically the same thing).

I solved it with semantic search using embeddings + pgvector. Now the patient types anything and finds the right procedure. Click rate on results: 65% (was 18% with keyword search).

I’ll show how I did it.

Keyword search works like this:

You type: “teeth cleaning” System looks for posts/pages containing the words “cleaning” + “teeth” Returns exact matches

Problem: synonyms. “Cleaning” and “Scaling” are the same thing to a dentist. But to the keyword system, they’re different.

Another real example:

  • Patient: “I want my teeth less yellow”
  • Keyword search: no results (“yellow” isn’t on any page)
  • Semantic search: finds “Teeth Whitening” (because it understands it’s about tooth color)

With 34 procedures, I was missing about 40% of searches due to synonymy.

How embeddings solve it

Embeddings transform text into numbers. Vectors in 1536-dimensional space (using OpenAI).

The beauty: texts with similar meaning end up close in this space.

Example:

  • “teeth cleaning” becomes [0.002, -0.15, 0.98, …]
  • “periodontal scaling” becomes [0.005, -0.14, 0.97, …]
  • These vectors are close (same significance)

When a patient searches something, you:

  1. Convert the search to an embedding
  2. Compare with embeddings of all procedures
  3. Return the closest ones

And everything becomes semantic, not based on exact words.

Complete pipeline I built

1. Prepare the data

You start with a list of procedures. Each with name + description:

id | name | description
1  | Professional Cleaning | Removes tartar and plaque. Done every 6 months.
2  | Periodontal Scaling | Treats gum inflammation. Deep and painless.
3  | Whitening | Makes teeth whiter. Safe and effective.
4  | Orthodontics | Aligns teeth. Fixes bite.
...

2. Generate embeddings (once)

You don’t regenerate embeddings every time. Do it once and save in the database.

npm install openai
import { OpenAI } from "openai";
import { createClient } from "@supabase/supabase-js";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_KEY!
);

async function generateEmbeddings() {
  const { data: procedimentos } = await supabase
    .from("procedimentos")
    .select("id, nome, descricao");

  for (const proc of procedimentos) {
    const text = `${proc.nome}. ${proc.descricao}`;

    const response = await openai.embeddings.create({
      model: "text-embedding-3-small",
      input: text,
    });

    const embedding = response.data[0].embedding;

    await supabase.from("procedimentos").update({ embedding }).eq("id", proc.id);

    console.log(`Embedding generated for ${proc.nome}`);
  }
}

generateEmbeddings();

This generates a 1536-number embedding for each procedure and saves it to Supabase.

Cost: about $0.01 to generate embeddings for 34 procedures. You only do this once (or when adding new procedures).

3. Install pgvector on Supabase

PostgreSQL needs the pgvector extension for efficient semantic search.

If you use Supabase, it’s already included. If you use PostgreSQL directly, run:

CREATE EXTENSION IF NOT EXISTS vector;

4. Create table with vector column

CREATE TABLE procedimentos (
  id SERIAL PRIMARY KEY,
  nome TEXT NOT NULL,
  descricao TEXT NOT NULL,
  preco DECIMAL,
  embedding vector(1536) -- OpenAI embedding size
);

CREATE INDEX ON procedimentos USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

That INDEX is important for performance. Without it, each search scans all rows (slow).

5. Search by similarity

When the patient types something, you:

  1. Generate the search embedding
  2. Query using cosine distance
  3. Return top 5
async function searchProcedimentos(query: string) {
  const queryEmbedding = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: query,
  });

  const embedding = queryEmbedding.data[0].embedding;

  const { data } = await supabase.rpc("match_procedimentos", {
    query_embedding: embedding,
    match_threshold: 0.7,
    match_count: 5,
  });

  return data;
}

And on Supabase, you create a function:

CREATE OR REPLACE FUNCTION match_procedimentos (
  query_embedding vector(1536),
  match_threshold float,
  match_count int
) RETURNS TABLE (
  id integer,
  nome text,
  descricao text,
  similarity float
) LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  SELECT
    procedimentos.id,
    procedimentos.nome,
    procedimentos.descricao,
    1 - (procedimentos.embedding <=> query_embedding) as similarity
  FROM procedimentos
  WHERE 1 - (procedimentos.embedding <=> query_embedding) > match_threshold
  ORDER BY procedimentos.embedding <=> query_embedding
  LIMIT match_count;
END;
$$;

The <=> operator is cosine distance in pgvector.

6. Put it on a page

import { useState } from "react";

export default function SearchProcedimentos() {
  const [query, setQuery] = useState("");
  const [results, setResults] = useState([]);

  async function handleSearch(e: React.ChangeEvent<HTMLInputElement>) {
    const q = e.target.value;
    setQuery(q);

    if (q.length < 2) return;

    const response = await fetch("/api/search-procedimentos", {
      method: "POST",
      body: JSON.stringify({ query: q }),
    });

    const data = await response.json();
    setResults(data);
  }

  return (
    <div>
      <input
        type="text"
        placeholder="Search for procedure..."
        value={query}
        onChange={handleSearch}
      />

      <ul>
        {results.map((r) => (
          <li key={r.id}>
            <h3>{r.nome}</h3>
            <p>{r.descricao}</p>
            <small>Relevance: {(r.similarity * 100).toFixed(0)}%</small>
          </li>
        ))}
      </ul>
    </div>
  );
}

Practical results

Before with keyword search:

  • Patient: “I want to improve my bite”
  • Result: nothing (because “bite” isn’t on any page)

After with semantic search:

  • Patient: “I want to improve my bite”
  • Result: Orthodontics (first result, 92% relevance)

Another example:

  • Patient: “tooth hurting”
  • Before: nothing
  • After: Root Canal Treatment (88% relevance), Dental Emergency (85% relevance)

Click rate went from 18% to 65%. That means people find what they’re looking for. Less bounce.

Comparison: keyword vs semantic

AspectKeywordSemantic
”teeth cleaning”Finds “Professional Cleaning”Finds “Cleaning” + “Periodontal Scaling"
"tooth pain”NothingFinds “Root Canal”, “Emergency"
"improve appearance”NothingFinds “Whitening”, “Veneers”
Speed<100ms50-150ms
Maintenance costZero~$0.01 per new item
Precision30-40%70-80%

Semantic is slower (1 query to vector index), but much more accurate.

Monthly costs

Supabase pgvector: $0 (included in plan) OpenAI embeddings: $0.01-0.05 per month (only when adding procedures) Storage: basically zero (1536 floats = about 6KB per procedure)

Total: about $5-10/month (if heavy volume). Clinic case: about $2/month.

Semantic search is overkill for some cases:

  1. Small catalog (<50 items). Keyword search works fine.
  2. Precise searches by ID (like: specific SKU). Use keyword.
  3. Very structured data (filter by price, date). Use pure SQL.

Combine: use semantic for discovery, pure SQL for filters.

Optimizations I made

Smart threshold

Started with threshold of 0.7 (only return 70%+ relevance). But noticed some legitimate searches fell below that. Changed to dynamic:

const threshold = query.length > 10 ? 0.6 : 0.75;

Short searches need to be more specific. Long searches can be more generous.

Patients search the same things. “cleaning”, “implant”, “whitening”. Cache those embeddings:

const cache = new Map();

async function searchWithCache(query: string) {
  if (cache.has(query)) {
    return cache.get(query);
  }

  const result = await searchProcedimentos(query);
  cache.set(query, result);

  return result;
}

Reduces OpenAI calls by 60%.

Combine semantic + keyword:

SELECT * FROM (
  SELECT * FROM procedimentos
  WHERE 1 - (embedding <=> query_embedding) > 0.7 -- semantic
  UNION
  SELECT * FROM procedimentos
  WHERE nome ILIKE '%' || query_text || '%' -- keyword
  UNION
  SELECT * FROM procedimentos
  WHERE descricao ILIKE '%' || query_text || '%'
) results
ORDER BY similarity DESC
LIMIT 5;

Best of both worlds. Semantic for discovery, keyword as fallback.

Next steps

  1. Add feedback: when patient clicks a result, mark as “relevant”. Use later to train internal model.

  2. Personalization: save patient search history and prioritize categories they’ve already searched.

  3. Analytics: which search gets most clicks? Which ones return no results? These insights help improve the catalog.

  • Supabase (or PostgreSQL) with pgvector installed?
  • Data table structured (id, name, description)?
  • OpenAI API key configured?
  • Embedding generation script tested?
  • SQL function match_procedimentos created?
  • /api/search-procedimentos endpoint works?
  • Frontend renders results with similarity score?
  • Similarity threshold tested (0.7 is good start)?
  • Cache implemented for popular searches?
  • Analytics setup to track zero-result searches?

Scaling embeddings

When you hit 1 million embeddings, things slow down. Queries that took 100ms now take 500ms. Two solutions:

First: partition data. Don’t search all 1 million embeddings at once. If you’re searching customer support tickets from 2026, only search 2026 tickets (50k embeddings instead of 1 million).

Second: use approximate nearest neighbor search. Exact search is accurate but slow. Approximate search is 10x faster but might miss 1-2% of results. For recommendation systems, 1-2% miss rate is acceptable.

Both approaches scale to millions of embeddings.

When to use embeddings

Don’t use embeddings everywhere. Use them when:

  1. You need semantic meaning, not keyword matching. “Best implants” should match “dental implants” even if exact words don’t match. Embeddings do this.

  2. You’re searching unstructured data (text). Embeddings work great. Searching structured data (database records with specific fields), regular databases are better.

  3. You have 1,000+ documents. Below that, brute force semantic search (checking every document) is fast enough.

If you don’t meet these criteria, keep it simple. Embeddings add complexity. Only add complexity when needed.

Read also: Supabase Auth: login to RLS | PostgreSQL multi-tenant RLS | Supabase + React

Next step

Need a dev who truly delivers?

Whether it's a one-time project, team reinforcement, or a long-term partnership. Let's talk.

Chat on WhatsApp

I reply within 2 hours during business hours.