Embeddings vs Postgres FTS for Internal Knowledge — A Real Bench
Most teams reach for vector embeddings when they want internal knowledge search. It feels modern. It feels AI-shaped. It's often the wrong tool.
I benchmarked embeddings against Postgres full-text search (FTS) on a real client corpus. The results were not what I expected: they pushed me toward FTS as the default. Here are the numbers.
The corpus
8,400 internal documents from a B2B SaaS company. A mix of:
- Engineering wikis (~2,200 docs)
- Product specifications (~900 docs)
- Customer support runbooks (~1,800 docs)
- Onboarding materials (~600 docs)
- Meeting notes (~2,900 docs)
Average doc length: 1,200 words.
The queries
47 real questions from staff members across the company. Things like:
- "How do we handle a refund request over $5,000?"
- "What's our policy on production data in dev environments?"
- "Who owns the billing service?"
- "How do I deploy a hotfix?"
47 questions × 2 systems = 94 trials. Each system returned top-5 results. I manually scored relevance on a 0-3 scale (3 = exact answer, 2 = correct doc but not exact, 1 = related but tangential, 0 = irrelevant).
The setups
Postgres FTS: Standard tsvector + GIN index on doc title and body. Used `websearch_to_tsquery` for query parsing.
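For concreteness, here's roughly what that setup looks like. The `docs` table and its column names are illustrative, not the client's schema, and the title-over-body weighting is my own choice for the sketch:

```sql
-- Illustrative schema; 'docs' and its columns are hypothetical names.
CREATE TABLE docs (
    id    bigserial PRIMARY KEY,
    title text NOT NULL,
    body  text NOT NULL,
    -- Generated tsvector column; title weighted above body.
    search_vec tsvector GENERATED ALWAYS AS (
        setweight(to_tsvector('english', title), 'A') ||
        setweight(to_tsvector('english', body), 'B')
    ) STORED
);

CREATE INDEX docs_search_idx ON docs USING GIN (search_vec);

-- Top-5 query, parsed with websearch_to_tsquery and ranked.
SELECT id, title,
       ts_rank(search_vec, websearch_to_tsquery('english', 'deploy a hotfix')) AS score
FROM docs
WHERE search_vec @@ websearch_to_tsquery('english', 'deploy a hotfix')
ORDER BY score DESC
LIMIT 5;
```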
Embeddings: OpenAI `text-embedding-3-small`. pgvector for storage. Cosine similarity for retrieval.
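And a sketch of the pgvector side. The bench didn't hinge on a particular index type; HNSW with cosine ops is one reasonable option. The query vector is computed application-side with `text-embedding-3-small`:

```sql
-- pgvector setup; 1536 dims matches text-embedding-3-small.
CREATE EXTENSION IF NOT EXISTS vector;

ALTER TABLE docs ADD COLUMN embedding vector(1536);

-- Approximate-nearest-neighbor index using cosine distance.
CREATE INDEX docs_embedding_idx ON docs
    USING hnsw (embedding vector_cosine_ops);

-- $1 is the query embedding, computed by the application.
-- <=> is pgvector's cosine-distance operator.
SELECT id, title, 1 - (embedding <=> $1::vector) AS cosine_similarity
FROM docs
ORDER BY embedding <=> $1::vector
LIMIT 5;
```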
Both ran on the same Postgres instance. Same hardware.
The results
| Metric | Postgres FTS | Embeddings |
|---|---|---|
| Average relevance (0-3) | 2.31 | 2.18 |
| Latency, p50 | 12 ms | 84 ms (includes embedding the query) |
| Latency, p95 | 24 ms | 142 ms |
| Index size | 41 MB | 218 MB (1536-dim vectors) |
| Setup cost | 30 minutes (mostly tsvector setup) | 4 hours (embedding the corpus + pgvector setup + tuning) |
| Ongoing cost | zero marginal (free with Postgres) | ~$4/month re-embedding new docs (small corpus) |
Where embeddings won
For 14 of 47 queries, embeddings outscored FTS. The pattern: when the query and answer didn't share vocabulary.
Example: query "How do we handle a customer who's about to churn?" — FTS missed because the relevant docs talked about "retention," "save calls," "customer success interventions." Embeddings caught the semantic similarity.
This is the use case where vector search earns its complexity.
Where FTS won
For 22 of 47 queries, FTS outscored embeddings. The pattern: when the user's terminology matched the doc's terminology.
Example: query "deploy a hotfix" — FTS scored 3/3 by matching "hotfix" exactly in the relevant runbook. Embeddings scored 1/3 because the top match was a doc about "deployment procedures" that was less specifically helpful.
This is the majority case for internal knowledge.
Where both lost
For 11 of 47 queries, neither system scored above 1. The pattern: the answer didn't exist in the corpus. Both systems returned the closest-but-wrong doc.
This is the case where any search is worse than "write the doc first." Search can't surface what isn't written.
What this means
For most internal knowledge bases, Postgres FTS is the right default. It's faster, cheaper, and equally accurate or better for the majority of queries.
Embeddings earn their cost when:
- Queries and docs regularly use different vocabulary
- The corpus is large enough that re-indexing for FTS becomes expensive
- You need cross-language search (FTS handles one language well; embeddings are multilingual)
For a B2B SaaS internal knowledge base, FTS first. Add embeddings as a second-layer fallback if FTS returns nothing useful.
The hybrid pattern that won
After the bench, I built a hybrid:
1. Run FTS first.
2. If the top-5 FTS results have a top score below a threshold, fall back to embeddings.
3. If embeddings also return weak matches, surface "no good match" rather than fake confidence.
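Here's a minimal sketch of that flow against the same illustrative schema as above. The thresholds live in the application, and the values you'd pick depend on your own relevance scoring:

```sql
-- Step 1: FTS pass. $1 is the user's raw query string.
SELECT id, title,
       ts_rank(search_vec, websearch_to_tsquery('english', $1)) AS score
FROM docs
WHERE search_vec @@ websearch_to_tsquery('english', $1)
ORDER BY score DESC
LIMIT 5;

-- Step 2 (application logic): only if step 1 returned nothing, or its
-- best ts_rank fell below a tuned threshold, embed the query and fall
-- back to vector search. $1 here is the query embedding.
SELECT id, title, 1 - (embedding <=> $1::vector) AS cosine_similarity
FROM docs
ORDER BY embedding <=> $1::vector
LIMIT 5;

-- Step 3 (application logic): if the best cosine similarity is also
-- below a threshold, return "no good match" instead of weak results.
```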
In production this hybrid scored 2.52 / 3, better than either system alone. p95 latency stayed under 60 ms because most queries resolved at the FTS layer.
What this isn't
This isn't a claim that embeddings are bad. They're great for the right use cases (semantic similarity across vocabulary differences, recommendation systems, multilingual content).
For an internal knowledge base where most users search using words that match the docs, FTS is the cheaper, faster default. The "every AI app needs embeddings" reflex is wrong.
What to build first
If you're building internal search today:
1. Postgres FTS on title + body. 30 minutes of work. Ships immediately.
2. Measure for 2-4 weeks. Capture failed queries, where the user didn't click anything in the top 5 (a minimal logging sketch follows this list).
3. Only add embeddings if the failed queries cluster around vocabulary mismatch. Otherwise, the better fix is writing better docs.
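One lightweight way to capture that signal, assuming you log every search along with whether the user clicked a result (schema and names are illustrative):

```sql
-- Hypothetical search log; one row per search.
CREATE TABLE search_log (
    id          bigserial PRIMARY KEY,
    query       text        NOT NULL,
    clicked_doc bigint,              -- NULL when nothing in the top 5 was clicked
    searched_at timestamptz NOT NULL DEFAULT now()
);

-- Failed queries from the last 4 weeks, grouped for review.
SELECT query, count(*) AS misses
FROM search_log
WHERE clicked_doc IS NULL
  AND searched_at > now() - interval '28 days'
GROUP BY query
ORDER BY misses DESC;
```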
The bench taught me to use the simpler tool first. The complexity should follow the evidence, not the hype.