pgvector

pgvector adds the vector type and KNN search operators to Postgres. ide99 detects it automatically and activates a Power Pack: index wizard, KNN browser, hybrid search, 2D map of semantic clusters.

When you need it

When tables have embeddings — for:

RAG (Retrieval-Augmented Generation)
Semantic search ("find articles similar in meaning")
Recommendation systems
Anomaly detection

Without pgvector these are usually solved with Pinecone, Weaviate, Qdrant. With pgvector — everything stays in the same database as your regular data.

Activate

CREATE EXTENSION vector;

-- Embedding column:
ALTER TABLE articles ADD COLUMN embedding vector(1536);

Dimensionality (1536 for OpenAI ada-002, 768 for BAAI/bge-base) depends on your model.

What ide99 shows

In Schema Browser, every table with a vector column gets a Vector panel:

Index Wizard — pick index type and parameters
KNN Browse — find similar items for any query
Hybrid Search — combined full-text + vector
2D Map — project high-dimensional vectors to a plane via UMAP

Index Wizard

The big one. Without an index, the <-> operator does a Seq Scan over the whole table — doesn't scale.

Open: right-click a vector column → Create vector index.

Form:

Index type:
- HNSW — faster on query, slower on insert, more memory. For read-heavy.
- IVFFlat — slower on query, faster on insert, less memory. For write-heavy.
Distance:
- L2 (vector_l2_ops) — Euclidean
- Inner product (vector_ip_ops) — for normalized vectors
- Cosine (vector_cosine_ops) — for cosine similarity (popular for embeddings)
Parameters:
- HNSW: m (default 16), ef_construction (default 64)
- IVFFlat: lists (recommended sqrt(n_rows))

The wizard suggests defaults based on table size. SQL preview:

CREATE INDEX CONCURRENTLY idx_articles_embedding
ON articles USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

CONCURRENTLY means prod-safe (no table locking) but slower — you can watch progress in Live Ops.

KNN Browse

"Show N nearest to this vector".

UI:

Vector source:
- From row — pick a row in the table (then search for items similar to it)
- Paste vector — paste a JSON array manually
- Build from text — if you have an embedding function in the DB (openai_embed(text) wrapper), specify it — ide99 calls it
Distance — which operator to use
K — how many to return (default 10)
Filter — optional WHERE overlay (e.g., language = 'en')

Result — a table with distance and other columns. Double-click to open the row.

Hybrid Search

Semantic + full-text together. For "find articles about machine learning that mention BERT":

SELECT id, title,
       0.6 * (1 - (embedding <=> $query_vec)) +
       0.4 * ts_rank(to_tsvector('english', body), to_tsquery('BERT')) AS score
FROM articles
WHERE to_tsvector('english', body) @@ to_tsquery('BERT')
ORDER BY score DESC
LIMIT 20;

The wizard assembles this query: you provide the text column, the vector column, the weights (0.6/0.4); ide99 writes the SQL. Useful: correct FTS configs (russian, english), tsquery syntax — that's where people usually trip.

2D Map

Project vectors onto a plane via UMAP — see semantic clusters with your eyes.

Open: right-click a vector column → Show 2D map.

ide99 loads a sample (1000–10000 rows, configurable), runs UMAP in the browser (umap-js), renders a scatter plot. Each point is a row, hover shows other column values.

Useful for:

Anomalies: a lonely point far from clusters — outlier
Duplicates: dense cluster of identical embeddings — dedup candidate
Embedding quality: if clusters are smudged and don't separate — the model is off for the task

Tip: HNSW vs IVFFlat

Situation	Pick
< 100K rows	No index needed — Seq Scan is fast
100K–10M rows, few inserts	HNSW
10M+, constant inserts/deletes	IVFFlat
Memory-constrained	IVFFlat
Need maximum query speed	HNSW

Schema Browser — where the Vector panel appears.
PostGIS — another Power Pack for geodata.