How does semantic search work with embeddings and PGVector?

Text converts to embedding (vector of 1536 numbers representing meaning). Stores in Supabase with pgvector. Query converts to embedding, compares vectors (cosine similarity). Returns semantically similar results, not just keywords.

When should I use embeddings instead of keyword search?

Keyword: 'dentist' returns only pages with 'dentist'. Embeddings: 'teeth doctor' also returns dentist content. Embeddings work better for natural language (conversations). Keyword faster for exact matches. Combine both.

Which model should I use for generating embeddings?

OpenAI text-embedding-3-small: US$ 0.01 per 1M tokens (cheap), good quality. OpenAI text-embedding-3-large: US$ 0.13 per 1M tokens, better quality. Open source (MiniLM): free, 70% of OpenAI quality. For MVP: free. Production: OpenAI small.

Is Supabase + PGVector enough for complete RAG?

Yes for basic RAG (document retrieval + prompt). For advanced RAG (reranking, query expansion), add tools. Supabase PGVector: 80% of solution. Rest: Claude API for generation. Minimal working stack.

How long to implement semantic search with pgvector?

Setup: 2 hours (create table, install extension, generate embeddings). Code integration: 3 hours. Testing: 1 hour. Total: half a week. After: maintain by adding new embeddings when content changes.

Embeddings + pgvector: semantic search on your site (real case)

A client of mine runs a dental clinic. Offers 34 different types of procedures. Patients come saying: “I have a baby tooth that didn’t fall out, what do I do?” or “I want to improve my bite”.

Keyword search doesn’t work. If the patient types “wrong tooth”, they won’t find “Orthodontics”. If they type “tartar cleaning”, they won’t find “Periodontal Scaling” (which is basically the same thing).

I solved it with semantic search using embeddings + pgvector. Now the patient types anything and finds the right procedure. Click rate on results: 65% (was 18% with keyword search).

I’ll show how I did it.

The problem with keyword search

Keyword search works like this:

You type: “teeth cleaning” System looks for posts/pages containing the words “cleaning” + “teeth” Returns exact matches

Problem: synonyms. “Cleaning” and “Scaling” are the same thing to a dentist. But to the keyword system, they’re different.

Another real example:

Patient: “I want my teeth less yellow”
Keyword search: no results (“yellow” isn’t on any page)
Semantic search: finds “Teeth Whitening” (because it understands it’s about tooth color)

With 34 procedures, I was missing about 40% of searches due to synonymy.

How embeddings solve it

Embeddings transform text into numbers. Vectors in 1536-dimensional space (using OpenAI).

The beauty: texts with similar meaning end up close in this space.

Example:

“teeth cleaning” becomes [0.002, -0.15, 0.98, …]
“periodontal scaling” becomes [0.005, -0.14, 0.97, …]
These vectors are close (same significance)

When a patient searches something, you:

Convert the search to an embedding
Compare with embeddings of all procedures
Return the closest ones

And everything becomes semantic, not based on exact words.

Complete pipeline I built

1. Prepare the data

You start with a list of procedures. Each with name + description:

id | name | description
1  | Professional Cleaning | Removes tartar and plaque. Done every 6 months.
2  | Periodontal Scaling | Treats gum inflammation. Deep and painless.
3  | Whitening | Makes teeth whiter. Safe and effective.
4  | Orthodontics | Aligns teeth. Fixes bite.
...

2. Generate embeddings (once)

You don’t regenerate embeddings every time. Do it once and save in the database.

npm install openai

import { OpenAI } from "openai";
import { createClient } from "@supabase/supabase-js";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_SERVICE_KEY!
);

async function generateEmbeddings() {
  const { data: procedimentos } = await supabase
    .from("procedimentos")
    .select("id, nome, descricao");

  for (const proc of procedimentos) {
    const text = `${proc.nome}. ${proc.descricao}`;

    const response = await openai.embeddings.create({
      model: "text-embedding-3-small",
      input: text,
    });

    const embedding = response.data[0].embedding;

    await supabase.from("procedimentos").update({ embedding }).eq("id", proc.id);

    console.log(`Embedding generated for ${proc.nome}`);
  }
}

generateEmbeddings();

This generates a 1536-number embedding for each procedure and saves it to Supabase.

Cost: about $0.01 to generate embeddings for 34 procedures. You only do this once (or when adding new procedures).

3. Install pgvector on Supabase

PostgreSQL needs the pgvector extension for efficient semantic search.

If you use Supabase, it’s already included. If you use PostgreSQL directly, run:

CREATE EXTENSION IF NOT EXISTS vector;

4. Create table with vector column

CREATE TABLE procedimentos (
  id SERIAL PRIMARY KEY,
  nome TEXT NOT NULL,
  descricao TEXT NOT NULL,
  preco DECIMAL,
  embedding vector(1536) -- OpenAI embedding size
);

CREATE INDEX ON procedimentos USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

That INDEX is important for performance. Without it, each search scans all rows (slow).

5. Search by similarity

When the patient types something, you:

Generate the search embedding
Query using cosine distance
Return top 5

async function searchProcedimentos(query: string) {
  const queryEmbedding = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: query,
  });

  const embedding = queryEmbedding.data[0].embedding;

  const { data } = await supabase.rpc("match_procedimentos", {
    query_embedding: embedding,
    match_threshold: 0.7,
    match_count: 5,
  });

  return data;
}

And on Supabase, you create a function:

CREATE OR REPLACE FUNCTION match_procedimentos (
  query_embedding vector(1536),
  match_threshold float,
  match_count int
) RETURNS TABLE (
  id integer,
  nome text,
  descricao text,
  similarity float
) LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  SELECT
    procedimentos.id,
    procedimentos.nome,
    procedimentos.descricao,
    1 - (procedimentos.embedding <=> query_embedding) as similarity
  FROM procedimentos
  WHERE 1 - (procedimentos.embedding <=> query_embedding) > match_threshold
  ORDER BY procedimentos.embedding <=> query_embedding
  LIMIT match_count;
END;
$$;

The <=> operator is cosine distance in pgvector.

6. Put it on a page

import { useState } from "react";

export default function SearchProcedimentos() {
  const [query, setQuery] = useState("");
  const [results, setResults] = useState([]);

  async function handleSearch(e: React.ChangeEvent<HTMLInputElement>) {
    const q = e.target.value;
    setQuery(q);

    if (q.length < 2) return;

    const response = await fetch("/api/search-procedimentos", {
      method: "POST",
      body: JSON.stringify({ query: q }),
    });

    const data = await response.json();
    setResults(data);
  }

  return (
    <div>
      <input
        type="text"
        placeholder="Search for procedure..."
        value={query}
        onChange={handleSearch}
      />

      <ul>
        {results.map((r) => (
          <li key={r.id}>
            <h3>{r.nome}</h3>
            <p>{r.descricao}</p>
            <small>Relevance: {(r.similarity * 100).toFixed(0)}%</small>
          </li>
        ))}
      </ul>
    </div>
  );
}

Practical results

Before with keyword search:

Patient: “I want to improve my bite”
Result: nothing (because “bite” isn’t on any page)

After with semantic search:

Patient: “I want to improve my bite”
Result: Orthodontics (first result, 92% relevance)

Another example:

Patient: “tooth hurting”
Before: nothing
After: Root Canal Treatment (88% relevance), Dental Emergency (85% relevance)

Click rate went from 18% to 65%. That means people find what they’re looking for. Less bounce.

Comparison: keyword vs semantic

Aspect	Keyword	Semantic
”teeth cleaning”	Finds “Professional Cleaning”	Finds “Cleaning” + “Periodontal Scaling"
"tooth pain”	Nothing	Finds “Root Canal”, “Emergency"
"improve appearance”	Nothing	Finds “Whitening”, “Veneers”
Speed	<100ms	50-150ms
Maintenance cost	Zero	~$0.01 per new item
Precision	30-40%	70-80%

Semantic is slower (1 query to vector index), but much more accurate.

Monthly costs

Supabase pgvector: $0 (included in plan) OpenAI embeddings: $0.01-0.05 per month (only when adding procedures) Storage: basically zero (1536 floats = about 6KB per procedure)

Total: about $5-10/month (if heavy volume). Clinic case: about $2/month.

When not to use semantic search

Semantic search is overkill for some cases:

Small catalog (<50 items). Keyword search works fine.
Precise searches by ID (like: specific SKU). Use keyword.
Very structured data (filter by price, date). Use pure SQL.

Combine: use semantic for discovery, pure SQL for filters.

Optimizations I made

Smart threshold

Started with threshold of 0.7 (only return 70%+ relevance). But noticed some legitimate searches fell below that. Changed to dynamic:

const threshold = query.length > 10 ? 0.6 : 0.75;

Short searches need to be more specific. Long searches can be more generous.

Cache popular searches

Patients search the same things. “cleaning”, “implant”, “whitening”. Cache those embeddings:

const cache = new Map();

async function searchWithCache(query: string) {
  if (cache.has(query)) {
    return cache.get(query);
  }

  const result = await searchProcedimentos(query);
  cache.set(query, result);

  return result;
}

Reduces OpenAI calls by 60%.

Hybrid search

Combine semantic + keyword:

SELECT * FROM (
  SELECT * FROM procedimentos
  WHERE 1 - (embedding <=> query_embedding) > 0.7 -- semantic
  UNION
  SELECT * FROM procedimentos
  WHERE nome ILIKE '%' || query_text || '%' -- keyword
  UNION
  SELECT * FROM procedimentos
  WHERE descricao ILIKE '%' || query_text || '%'
) results
ORDER BY similarity DESC
LIMIT 5;

Best of both worlds. Semantic for discovery, keyword as fallback.

Next steps

Add feedback: when patient clicks a result, mark as “relevant”. Use later to train internal model.
Personalization: save patient search history and prioritize categories they’ve already searched.
Analytics: which search gets most clicks? Which ones return no results? These insights help improve the catalog.

Checklist: implement semantic search

Scaling embeddings

When you hit 1 million embeddings, things slow down. Queries that took 100ms now take 500ms. Two solutions:

First: partition data. Don’t search all 1 million embeddings at once. If you’re searching customer support tickets from 2026, only search 2026 tickets (50k embeddings instead of 1 million).

Second: use approximate nearest neighbor search. Exact search is accurate but slow. Approximate search is 10x faster but might miss 1-2% of results. For recommendation systems, 1-2% miss rate is acceptable.

Both approaches scale to millions of embeddings.

When to use embeddings

Don’t use embeddings everywhere. Use them when:

You need semantic meaning, not keyword matching. “Best implants” should match “dental implants” even if exact words don’t match. Embeddings do this.
You’re searching unstructured data (text). Embeddings work great. Searching structured data (database records with specific fields), regular databases are better.
You have 1,000+ documents. Below that, brute force semantic search (checking every document) is fast enough.

If you don’t meet these criteria, keep it simple. Embeddings add complexity. Only add complexity when needed.