Embeddings + pgvector: semantic search on your site (real case)
A client of mine runs a dental clinic. Offers 34 different types of procedures. Patients come saying: “I have a baby tooth that didn’t fall out, what do I do?” or “I want to improve my bite”.
Keyword search doesn’t work. If the patient types “wrong tooth”, they won’t find “Orthodontics”. If they type “tartar cleaning”, they won’t find “Periodontal Scaling” (which is basically the same thing).
I solved it with semantic search using embeddings + pgvector. Now the patient types anything and finds the right procedure. Click rate on results: 65% (was 18% with keyword search).
I’ll show how I did it.
The problem with keyword search
Keyword search works like this:
You type: “teeth cleaning” System looks for posts/pages containing the words “cleaning” + “teeth” Returns exact matches
Problem: synonyms. “Cleaning” and “Scaling” are the same thing to a dentist. But to the keyword system, they’re different.
Another real example:
- Patient: “I want my teeth less yellow”
- Keyword search: no results (“yellow” isn’t on any page)
- Semantic search: finds “Teeth Whitening” (because it understands it’s about tooth color)
With 34 procedures, I was missing about 40% of searches due to synonymy.
How embeddings solve it
Embeddings transform text into numbers. Vectors in 1536-dimensional space (using OpenAI).
The beauty: texts with similar meaning end up close in this space.
Example:
- “teeth cleaning” becomes [0.002, -0.15, 0.98, …]
- “periodontal scaling” becomes [0.005, -0.14, 0.97, …]
- These vectors are close (same significance)
When a patient searches something, you:
- Convert the search to an embedding
- Compare with embeddings of all procedures
- Return the closest ones
And everything becomes semantic, not based on exact words.
Complete pipeline I built
1. Prepare the data
You start with a list of procedures. Each with name + description:
id | name | description
1 | Professional Cleaning | Removes tartar and plaque. Done every 6 months.
2 | Periodontal Scaling | Treats gum inflammation. Deep and painless.
3 | Whitening | Makes teeth whiter. Safe and effective.
4 | Orthodontics | Aligns teeth. Fixes bite.
...
2. Generate embeddings (once)
You don’t regenerate embeddings every time. Do it once and save in the database.
npm install openai
import { OpenAI } from "openai";
import { createClient } from "@supabase/supabase-js";
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const supabase = createClient(
process.env.SUPABASE_URL!,
process.env.SUPABASE_SERVICE_KEY!
);
async function generateEmbeddings() {
const { data: procedimentos } = await supabase
.from("procedimentos")
.select("id, nome, descricao");
for (const proc of procedimentos) {
const text = `${proc.nome}. ${proc.descricao}`;
const response = await openai.embeddings.create({
model: "text-embedding-3-small",
input: text,
});
const embedding = response.data[0].embedding;
await supabase.from("procedimentos").update({ embedding }).eq("id", proc.id);
console.log(`Embedding generated for ${proc.nome}`);
}
}
generateEmbeddings();
This generates a 1536-number embedding for each procedure and saves it to Supabase.
Cost: about $0.01 to generate embeddings for 34 procedures. You only do this once (or when adding new procedures).
3. Install pgvector on Supabase
PostgreSQL needs the pgvector extension for efficient semantic search.
If you use Supabase, it’s already included. If you use PostgreSQL directly, run:
CREATE EXTENSION IF NOT EXISTS vector;
4. Create table with vector column
CREATE TABLE procedimentos (
id SERIAL PRIMARY KEY,
nome TEXT NOT NULL,
descricao TEXT NOT NULL,
preco DECIMAL,
embedding vector(1536) -- OpenAI embedding size
);
CREATE INDEX ON procedimentos USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
That INDEX is important for performance. Without it, each search scans all rows (slow).
5. Search by similarity
When the patient types something, you:
- Generate the search embedding
- Query using cosine distance
- Return top 5
async function searchProcedimentos(query: string) {
const queryEmbedding = await openai.embeddings.create({
model: "text-embedding-3-small",
input: query,
});
const embedding = queryEmbedding.data[0].embedding;
const { data } = await supabase.rpc("match_procedimentos", {
query_embedding: embedding,
match_threshold: 0.7,
match_count: 5,
});
return data;
}
And on Supabase, you create a function:
CREATE OR REPLACE FUNCTION match_procedimentos (
query_embedding vector(1536),
match_threshold float,
match_count int
) RETURNS TABLE (
id integer,
nome text,
descricao text,
similarity float
) LANGUAGE plpgsql
AS $$
BEGIN
RETURN QUERY
SELECT
procedimentos.id,
procedimentos.nome,
procedimentos.descricao,
1 - (procedimentos.embedding <=> query_embedding) as similarity
FROM procedimentos
WHERE 1 - (procedimentos.embedding <=> query_embedding) > match_threshold
ORDER BY procedimentos.embedding <=> query_embedding
LIMIT match_count;
END;
$$;
The <=> operator is cosine distance in pgvector.
6. Put it on a page
import { useState } from "react";
export default function SearchProcedimentos() {
const [query, setQuery] = useState("");
const [results, setResults] = useState([]);
async function handleSearch(e: React.ChangeEvent<HTMLInputElement>) {
const q = e.target.value;
setQuery(q);
if (q.length < 2) return;
const response = await fetch("/api/search-procedimentos", {
method: "POST",
body: JSON.stringify({ query: q }),
});
const data = await response.json();
setResults(data);
}
return (
<div>
<input
type="text"
placeholder="Search for procedure..."
value={query}
onChange={handleSearch}
/>
<ul>
{results.map((r) => (
<li key={r.id}>
<h3>{r.nome}</h3>
<p>{r.descricao}</p>
<small>Relevance: {(r.similarity * 100).toFixed(0)}%</small>
</li>
))}
</ul>
</div>
);
}
Practical results
Before with keyword search:
- Patient: “I want to improve my bite”
- Result: nothing (because “bite” isn’t on any page)
After with semantic search:
- Patient: “I want to improve my bite”
- Result: Orthodontics (first result, 92% relevance)
Another example:
- Patient: “tooth hurting”
- Before: nothing
- After: Root Canal Treatment (88% relevance), Dental Emergency (85% relevance)
Click rate went from 18% to 65%. That means people find what they’re looking for. Less bounce.
Comparison: keyword vs semantic
| Aspect | Keyword | Semantic |
|---|---|---|
| ”teeth cleaning” | Finds “Professional Cleaning” | Finds “Cleaning” + “Periodontal Scaling" |
| "tooth pain” | Nothing | Finds “Root Canal”, “Emergency" |
| "improve appearance” | Nothing | Finds “Whitening”, “Veneers” |
| Speed | <100ms | 50-150ms |
| Maintenance cost | Zero | ~$0.01 per new item |
| Precision | 30-40% | 70-80% |
Semantic is slower (1 query to vector index), but much more accurate.
Monthly costs
Supabase pgvector: $0 (included in plan) OpenAI embeddings: $0.01-0.05 per month (only when adding procedures) Storage: basically zero (1536 floats = about 6KB per procedure)
Total: about $5-10/month (if heavy volume). Clinic case: about $2/month.
When not to use semantic search
Semantic search is overkill for some cases:
- Small catalog (<50 items). Keyword search works fine.
- Precise searches by ID (like: specific SKU). Use keyword.
- Very structured data (filter by price, date). Use pure SQL.
Combine: use semantic for discovery, pure SQL for filters.
Optimizations I made
Smart threshold
Started with threshold of 0.7 (only return 70%+ relevance). But noticed some legitimate searches fell below that. Changed to dynamic:
const threshold = query.length > 10 ? 0.6 : 0.75;
Short searches need to be more specific. Long searches can be more generous.
Cache popular searches
Patients search the same things. “cleaning”, “implant”, “whitening”. Cache those embeddings:
const cache = new Map();
async function searchWithCache(query: string) {
if (cache.has(query)) {
return cache.get(query);
}
const result = await searchProcedimentos(query);
cache.set(query, result);
return result;
}
Reduces OpenAI calls by 60%.
Hybrid search
Combine semantic + keyword:
SELECT * FROM (
SELECT * FROM procedimentos
WHERE 1 - (embedding <=> query_embedding) > 0.7 -- semantic
UNION
SELECT * FROM procedimentos
WHERE nome ILIKE '%' || query_text || '%' -- keyword
UNION
SELECT * FROM procedimentos
WHERE descricao ILIKE '%' || query_text || '%'
) results
ORDER BY similarity DESC
LIMIT 5;
Best of both worlds. Semantic for discovery, keyword as fallback.
Next steps
-
Add feedback: when patient clicks a result, mark as “relevant”. Use later to train internal model.
-
Personalization: save patient search history and prioritize categories they’ve already searched.
-
Analytics: which search gets most clicks? Which ones return no results? These insights help improve the catalog.
Checklist: implement semantic search
- Supabase (or PostgreSQL) with pgvector installed?
- Data table structured (id, name, description)?
- OpenAI API key configured?
- Embedding generation script tested?
- SQL function match_procedimentos created?
-
/api/search-procedimentosendpoint works? - Frontend renders results with similarity score?
- Similarity threshold tested (0.7 is good start)?
- Cache implemented for popular searches?
- Analytics setup to track zero-result searches?
Scaling embeddings
When you hit 1 million embeddings, things slow down. Queries that took 100ms now take 500ms. Two solutions:
First: partition data. Don’t search all 1 million embeddings at once. If you’re searching customer support tickets from 2026, only search 2026 tickets (50k embeddings instead of 1 million).
Second: use approximate nearest neighbor search. Exact search is accurate but slow. Approximate search is 10x faster but might miss 1-2% of results. For recommendation systems, 1-2% miss rate is acceptable.
Both approaches scale to millions of embeddings.
When to use embeddings
Don’t use embeddings everywhere. Use them when:
-
You need semantic meaning, not keyword matching. “Best implants” should match “dental implants” even if exact words don’t match. Embeddings do this.
-
You’re searching unstructured data (text). Embeddings work great. Searching structured data (database records with specific fields), regular databases are better.
-
You have 1,000+ documents. Below that, brute force semantic search (checking every document) is fast enough.
If you don’t meet these criteria, keep it simple. Embeddings add complexity. Only add complexity when needed.
Read also: Supabase Auth: login to RLS | PostgreSQL multi-tenant RLS | Supabase + React