Fix: pgvector Not Working — Extension Install, Index Not Used, and Dimension Errors
Part of: Python Errors
Quick Answer
How to fix pgvector errors — extension does not exist CREATE EXTENSION vector, dimension mismatch on insert, HNSW index not used slow queries, distance operator confusion, psycopg register adapter, and ivfflat vs hnsw selection.
The Error
You try to install the pgvector extension and Postgres refuses:
ERROR: could not open extension control file "/usr/share/postgresql/16/extension/vector.control":
No such file or directory
Or you insert vectors and the types clash:
ERROR: expected 1536 dimensions, not 384
Or your query planner ignores the HNSW index you created:
EXPLAIN ANALYZE SELECT ... ORDER BY embedding <-> $1 LIMIT 10;
# Seq Scan on documents (cost=... time=5234ms)
# NOT using the index, even though it exists
Or you query from Python and get lists instead of numpy arrays:
rows = cur.fetchall()
print(rows[0][1]) # '[0.123, 0.456, ...]' — a string, not a list
pgvector adds vector similarity search to Postgres. Unlike dedicated vector databases (Pinecone, Qdrant), it lets you combine vector search with regular SQL joins, transactions, and filters — in the database you already have. But the install process has subtleties, the distance operators are non-obvious, and the query planner’s heuristics sometimes skip your index. This guide covers each failure mode.
Why This Happens
pgvector is a Postgres extension — a binary shared library plus SQL scripts. Installing it requires OS-level package installation or compiling from source, not just pip install. On managed Postgres services (RDS, Cloud SQL, Supabase), the extension must be explicitly enabled in the admin console.
The Postgres query planner chooses between index and sequential scans based on estimated costs. With vector search, it often picks seq scan for small tables or when the filter is very selective — a frustrating surprise if you assumed the HNSW index would always be used.
Fix 1: Installing the Extension
CREATE EXTENSION IF NOT EXISTS vector;
-- ERROR: extension "vector" is not available
On Debian/Ubuntu:
# PostgreSQL 16
sudo apt install postgresql-16-pgvector
# PostgreSQL 15
sudo apt install postgresql-15-pgvector
On macOS (Homebrew):
brew install pgvector
Build from source (if no package for your distro):
git clone https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install # May need PG_CONFIG=/usr/pgsql-16/bin/pg_config make
Docker:
FROM postgres:16
RUN apt-get update && apt-get install -y postgresql-16-pgvector
Or use the official pgvector image:
docker run -d \
-e POSTGRES_PASSWORD=mysecretpassword \
-p 5432:5432 \
pgvector/pgvector:pg16
After install, enable per-database:
-- Connect to the target database first
\c mydb
CREATE EXTENSION IF NOT EXISTS vector;
-- Verify
SELECT extname, extversion FROM pg_extension WHERE extname = 'vector';
-- extname | extversion
-- vector | 0.7.0
On managed Postgres (AWS RDS, GCP Cloud SQL, Azure Database):
- RDS: Enable in the parameter group’s
shared_preload_libraries, thenCREATE EXTENSION vector; - Cloud SQL: Check the database flags in the console
- Supabase: Available by default — just
CREATE EXTENSION vector; - Neon: Available on all tiers —
CREATE EXTENSION vector;
Common Mistake: Running CREATE EXTENSION vector; as a non-superuser and getting a permission error. Most managed services give your user permission to create specific extensions via the console, not as a normal CREATE EXTENSION statement. Check your provider’s documentation for the exact enable procedure.
Fix 2: Creating Tables with Vector Columns
CREATE TABLE documents (
id BIGSERIAL PRIMARY KEY,
content TEXT NOT NULL,
embedding vector(1536), -- Dimension must match your embedding model
created_at TIMESTAMPTZ DEFAULT now()
);
Dimension table for common models:
| Model | Dimension |
|---|---|
text-embedding-3-small (OpenAI) | 1536 |
text-embedding-3-large (OpenAI) | 3072 |
text-embedding-ada-002 (OpenAI) | 1536 |
all-MiniLM-L6-v2 | 384 |
all-mpnet-base-v2 | 768 |
bge-large-en-v1.5 | 1024 |
Inserting vectors:
-- From literal array
INSERT INTO documents (content, embedding) VALUES
('First doc', '[0.1, 0.2, 0.3, ...]'::vector);
-- Bulk from query
INSERT INTO documents (content, embedding)
SELECT text, compute_embedding(text) FROM raw_docs;
From Python with psycopg:
import psycopg
from pgvector.psycopg import register_vector
# psycopg3
conn = psycopg.connect("postgresql://user:pass@localhost/mydb")
register_vector(conn) # Enables numpy array ↔ vector conversion
with conn.cursor() as cur:
# Insert from numpy array
import numpy as np
vec = np.array([0.1, 0.2, 0.3, ...], dtype=np.float32)
cur.execute(
"INSERT INTO documents (content, embedding) VALUES (%s, %s)",
("my document", vec),
)
conn.commit()
# Query returns numpy array (not string)
cur.execute("SELECT embedding FROM documents LIMIT 1")
row = cur.fetchone()
print(type(row[0])) # numpy.ndarray
From SQLAlchemy:
from sqlalchemy import Column, BigInteger, Text
from sqlalchemy.orm import DeclarativeBase
from pgvector.sqlalchemy import Vector
class Base(DeclarativeBase):
pass
class Document(Base):
__tablename__ = "documents"
id = Column(BigInteger, primary_key=True)
content = Column(Text)
embedding = Column(Vector(1536))
For SQLAlchemy-specific patterns that affect pgvector integration, see SQLAlchemy not working.
Fix 3: Distance Operators — <->, <#>, <=>
pgvector provides three distance operators, each with different meaning:
| Operator | Distance Type | Notes |
|---|---|---|
<-> | L2 (Euclidean) | Default; smaller = more similar |
<#> | Negative inner product | Smaller (more negative) = more similar |
<=> | Cosine distance | Smaller = more similar |
<+> | L1 (Manhattan) | pgvector 0.7+ |
Query examples:
-- Cosine similarity (most common for LLM embeddings)
SELECT id, content, embedding <=> '[0.1, 0.2, ...]'::vector AS distance
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 10;
-- Inner product (fastest; normalize vectors first for cosine equivalence)
SELECT id, content, (embedding <#> '[0.1, 0.2, ...]'::vector) * -1 AS similarity
FROM documents
ORDER BY embedding <#> '[0.1, 0.2, ...]'::vector
LIMIT 10;
-- L2 (Euclidean)
SELECT id, content, embedding <-> '[0.1, 0.2, ...]'::vector AS distance
FROM documents
ORDER BY embedding <-> '[0.1, 0.2, ...]'::vector
LIMIT 10;
Converting distance to similarity (for cosine):
SELECT id, content, 1 - (embedding <=> '[...]'::vector) AS similarity
FROM documents
ORDER BY embedding <=> '[...]'::vector
LIMIT 10;
Common Mistake: Using the wrong operator for your index — the index type must match the operator. An HNSW index built for vector_cosine_ops doesn’t accelerate queries using <-> (L2). You’d silently get sequential scan on large tables. Match operator to index class explicitly.
Fix 4: Indexes — HNSW vs IVFFlat
Without an index, every query is a sequential scan. Two index types:
HNSW (recommended for most use cases — pgvector 0.5+):
-- Cosine distance
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
-- L2 distance
CREATE INDEX ON documents USING hnsw (embedding vector_l2_ops);
-- Inner product
CREATE INDEX ON documents USING hnsw (embedding vector_ip_ops);
-- Tuned — higher m = more memory, better recall
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
IVFFlat (older, still useful for very large tables):
-- Requires data to exist BEFORE creating index (unlike HNSW)
-- lists parameter: sqrt(rows) is a common starting point
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 1000);
-- Query-time recall tuning
SET ivfflat.probes = 10; -- Higher = better recall, slower
SELECT id FROM documents ORDER BY embedding <=> '[...]'::vector LIMIT 10;
Comparison:
| Feature | HNSW | IVFFlat |
|---|---|---|
| Build time | Slower | Faster |
| Query speed | Fast | Slower than HNSW |
| Recall | Higher | Lower (tunable) |
| Memory | Higher | Lower |
| Build on empty table? | Yes | No — needs data first |
| Incremental inserts | Good | Degrades over time |
Pro Tip: Default to HNSW unless you have specific reasons. IVFFlat was the original pgvector index; HNSW (added in 0.5) is faster and higher quality for almost all workloads. Only consider IVFFlat if build time matters more than query speed, or you’re on an older pgvector version.
Fix 5: Query Planner Not Using the Index
EXPLAIN ANALYZE
SELECT id, embedding <=> '[...]'::vector AS dist
FROM documents
ORDER BY embedding <=> '[...]'::vector
LIMIT 10;
-- Seq Scan on documents (cost=0..12543 rows=50000 time=...)
-- ^ NOT using index
Common causes:
Cause 1: Too few rows. Postgres uses seq scan when a small table is faster to scan than to index-lookup. If you only have 1000 rows, don’t worry about it.
Cause 2: Missing ORDER BY or LIMIT. pgvector indexes only accelerate ORDER BY distance LIMIT n queries — not full distance calculations:
-- WRONG — no ORDER BY/LIMIT, index can't help
SELECT id, embedding <=> '[...]'::vector AS dist FROM documents WHERE dist < 0.5;
-- CORRECT — ORDER BY + LIMIT uses the index
SELECT id, embedding <=> '[...]'::vector AS dist
FROM documents
ORDER BY embedding <=> '[...]'::vector
LIMIT 100;
-- If you need score threshold, do ORDER BY + LIMIT + filter in subquery
SELECT * FROM (
SELECT id, embedding <=> '[...]'::vector AS dist
FROM documents
ORDER BY embedding <=> '[...]'::vector
LIMIT 100
) sub WHERE dist < 0.5;
Cause 3: Index and operator mismatch. An hnsw (vector_cosine_ops) index only accelerates <=>. If you query with <->, the planner falls back to seq scan.
Cause 4: Filter selectivity. When combined with a restrictive WHERE clause, the planner may decide the filter is selective enough that seq scan + filter is faster than index scan + filter:
-- Vector index MIGHT not be used if WHERE is selective
SELECT id FROM documents
WHERE category = 'rare_category' -- Only 100 rows match
ORDER BY embedding <=> '[...]'::vector
LIMIT 10;
Force the planner (pgvector 0.5+):
SET hnsw.iterative_scan = strict_order; -- Use index but maintain exact ordering
For filtered queries, create a composite strategy — pgvector 0.8+ supports pre-filtering via WHERE columns being indexed separately:
CREATE INDEX ON documents (category); -- B-tree on filter column
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
-- Planner combines both indexes
SELECT id FROM documents
WHERE category = 'news'
ORDER BY embedding <=> '[...]'::vector
LIMIT 10;
For Postgres query planning patterns beyond vector search, see Postgres index not used.
Fix 6: Dimension Mismatch on Insert
ERROR: expected 1536 dimensions, not 384
The table’s column type (vector(1536)) enforces the dimension. Inserting a 384-dim vector fails.
Check your schema:
SELECT column_name, data_type, udt_name, atttypmod
FROM information_schema.columns c
JOIN pg_attribute a ON a.attname = c.column_name
WHERE table_name = 'documents' AND column_name = 'embedding';
Change dimension — requires dropping and recreating the column (Postgres can’t alter vector dimensions in place):
ALTER TABLE documents DROP COLUMN embedding;
ALTER TABLE documents ADD COLUMN embedding vector(1024); -- New dimension
-- Re-compute and re-insert all embeddings
Support multiple embedding models in the same table:
CREATE TABLE documents (
id BIGSERIAL PRIMARY KEY,
content TEXT,
embedding_openai vector(1536), -- OpenAI small
embedding_openai_large vector(3072), -- OpenAI large
embedding_local vector(768), -- mpnet
created_at TIMESTAMPTZ DEFAULT now()
);
Untyped vector column (allows varying dimensions — use with caution):
CREATE TABLE documents (
id BIGSERIAL PRIMARY KEY,
content TEXT,
embedding vector -- No dimension specified
);
-- Each row can have different dimension
-- Can't create HNSW index without a fixed dimension though
Fix 7: Hybrid Search — Vector + Full-Text
One of pgvector’s strengths: combining vector similarity with Postgres full-text search for hybrid retrieval.
-- Add tsvector column for full-text search
ALTER TABLE documents
ADD COLUMN search_vector tsvector
GENERATED ALWAYS AS (to_tsvector('english', content)) STORED;
CREATE INDEX ON documents USING gin(search_vector);
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
Weighted hybrid query using Reciprocal Rank Fusion:
WITH semantic_search AS (
SELECT id, RANK() OVER (ORDER BY embedding <=> '[...]'::vector) AS rank
FROM documents
ORDER BY embedding <=> '[...]'::vector
LIMIT 20
),
keyword_search AS (
SELECT id, RANK() OVER (ORDER BY ts_rank(search_vector, query) DESC) AS rank
FROM documents, plainto_tsquery('english', 'search terms') query
WHERE search_vector @@ query
ORDER BY ts_rank(search_vector, query) DESC
LIMIT 20
)
SELECT
COALESCE(s.id, k.id) AS id,
COALESCE(1.0 / (60 + s.rank), 0) + COALESCE(1.0 / (60 + k.rank), 0) AS rrf_score
FROM semantic_search s
FULL OUTER JOIN keyword_search k ON s.id = k.id
ORDER BY rrf_score DESC
LIMIT 10;
This hybrid approach catches both semantic matches (embeddings) and exact keyword matches (full-text) — typically better than either alone.
Fix 8: Working with Large Datasets
Batch insert with COPY — 10–100x faster than INSERT for bulk loads:
import psycopg
from pgvector.psycopg import register_vector
conn = psycopg.connect("...")
register_vector(conn)
with conn.cursor() as cur:
with cur.copy("COPY documents (content, embedding) FROM STDIN") as copy:
for text, vec in all_docs:
copy.write_row((text, vec))
conn.commit()
Create index AFTER bulk insert — dramatically faster:
-- 1. Bulk load into table WITHOUT index
COPY documents FROM '/tmp/docs.csv' WITH CSV;
-- 2. Create index on loaded data
CREATE INDEX CONCURRENTLY ON documents USING hnsw (embedding vector_cosine_ops);
CONCURRENTLY keeps the table writable during index creation — important for production.
Parallel index build (pgvector 0.6+):
SET max_parallel_maintenance_workers = 4; -- Use 4 workers for index build
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
Shard large tables with partitioning:
CREATE TABLE documents (
id BIGSERIAL,
embedding vector(1536),
created_at TIMESTAMPTZ
) PARTITION BY RANGE (created_at);
CREATE TABLE documents_2024 PARTITION OF documents
FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
CREATE TABLE documents_2025 PARTITION OF documents
FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');
-- Index per partition
CREATE INDEX ON documents_2025 USING hnsw (embedding vector_cosine_ops);
Partitioning keeps individual indexes smaller and enables partition pruning on time-filtered queries.
Still Not Working?
pgvector vs Dedicated Vector Databases
- pgvector — Best when you already run Postgres. Great for combining vector search with SQL joins and transactions. Simpler ops.
- Pinecone — Fully managed, scales automatically. See Pinecone not working.
- Qdrant — Self-hosted or managed, rich filtering and payload schemas.
- Chroma — Local-first, simplest for prototypes.
Monitoring and Maintenance
-- Check index size and usage
SELECT
indexrelname,
pg_size_pretty(pg_relation_size(indexrelid)) AS size,
idx_scan AS scans
FROM pg_stat_user_indexes
WHERE relname = 'documents';
-- Reindex if the HNSW index degrades (rare, but after many updates)
REINDEX INDEX CONCURRENTLY documents_embedding_idx;
-- Analyze to update planner statistics
ANALYZE documents;
Connection Pooling
For web apps making many queries, use a connection pool. Without it, each query opens a new connection, which is slow:
from psycopg_pool import ConnectionPool
from pgvector.psycopg import register_vector
pool = ConnectionPool("postgresql://...", max_size=20, configure=register_vector)
with pool.connection() as conn:
with conn.cursor() as cur:
cur.execute("SELECT ...")
For general Postgres connection issues, check pg_hba.conf rules and that the user role has CONNECT privilege on the database before opening the pool.
LangChain / LlamaIndex Integration
# LangChain
from langchain_postgres import PGVector
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vector_store = PGVector(
embeddings=embeddings,
collection_name="my_docs",
connection="postgresql://user:pass@localhost/mydb",
use_jsonb=True,
)
vector_store.add_texts(texts=["doc 1", "doc 2"])
results = vector_store.similarity_search("query", k=5)
ChromaDB is a simpler alternative when you don’t need SQL joins or transactional guarantees — it runs as a single-binary local server and trades operational features for setup speed.
HNSW Recall Drops After Heavy Updates
HNSW graphs degrade when you delete or update many rows because deleted nodes remain in the graph as tombstones until VACUUM removes them. If recall drops from 0.95 to 0.7 after a large reindex job, you have not actually rebuilt the index — only the visible heap was cleaned. Run REINDEX INDEX CONCURRENTLY on the HNSW index after any operation that touches more than ~10% of rows, and schedule a weekly job for write-heavy tables.
Query Returns Same Top Result Every Time
If every query returns the same row regardless of input, your vectors are probably zero or unnormalized. Confirm with SELECT vector_norm(embedding) FROM documents LIMIT 5; — anything returning 0 or NaN means the embedding pipeline broke (often an OpenAI rate-limit retry that returned an empty array silently). Add a CHECK (vector_norm(embedding) > 0) constraint to catch this at insert time. For cosine distance specifically, all vectors must be L2-normalized or the math is meaningless — verify normalization happens client-side before the insert.
register_vector Has No Effect in Async Code
pgvector.psycopg.register_vector patches the connection adapter map. In psycopg.AsyncConnection, the equivalent is register_vector_async — calling the sync version on an async connection silently does nothing, and queries return str instead of numpy.ndarray. The same trap exists in asyncpg: use pgvector.asyncpg.register_vector(conn) after every new connection, including those handed out by a pool’s setup callback.
Platform-Specific Differences
pgvector behaves the same in the SQL layer everywhere, but everything around it — install path, dimension limits, write throughput, replication — varies dramatically by host. Choosing the wrong platform is the most common reason “pgvector is slow” in production.
pgvector vs pgvecto.rs vs hnswlib-in-DB
Two newer extensions reuse the vector type but replace the index engine. pgvecto.rs (Rust) supports higher dimensions, online index updates, and quantization. The trade-off is that it is not available on most managed Postgres providers — you need self-hosted Postgres or a partner image. hnswlib-in-DB through PL/Python is a hack people sometimes try; it works on a single node but loses transactional consistency on the index because the binary state lives outside Postgres. Stick with stock pgvector unless you have measured a specific bottleneck — the 0.7+ HNSW implementation is already very close to specialized engines.
RDS vs Aurora vs Supabase vs Neon vs Cloudflare D1
- AWS RDS for PostgreSQL:
pgvectorships preinstalled on PG 15.2+. Enable via parameter groupshared_preload_librariesandCREATE EXTENSION vector;. The version is tied to the RDS minor version, so you cannot upgradepgvectorwithout an RDS minor-version upgrade — plan accordingly. - Aurora PostgreSQL: Same install path as RDS, but Aurora’s storage engine writes pages differently. HNSW index builds are noticeably slower than vanilla RDS because of how the shared storage layer handles random writes. Use
max_parallel_maintenance_workersaggressively to compensate. - Supabase:
pgvectoris the default vector store and is exposed throughvecsand the REST API. Supabase pins a specific version per project — check the dashboard before assuming<+>(0.7+) is available. For Supabase auth/RLS issues that often block vector inserts via the API, see Supabase not working. - Neon: Supports pgvector on all tiers. Branching means you can clone a vector-indexed database in seconds for evaluation — useful when tuning
mandef_construction. The autoscaler pauses idle databases, which causes the first query after a cold start to take several seconds. - Cloudflare D1: D1 is SQLite, not Postgres — it does not support pgvector. For vector search on Cloudflare, use Vectorize (Cloudflare’s managed vector index) alongside D1, not inside it.
Embedding Dimension Caps
The vector type allows up to 16,000 dimensions. HNSW and IVFFlat indexes, however, are limited to 2,000 dimensions on pgvector and 4,096 with halfvec (half-precision, pgvector 0.7+). Models like text-embedding-3-large (3,072 dim) cannot be indexed directly with HNSW — either downcast to halfvec(3072), project down to 1,536 with the OpenAI dimensions parameter, or use a binary quantization extension. Hitting the cap silently falls back to sequential scan, which is the most common cause of “my index is huge but queries are still slow.”
Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.
Was this article helpful?
Related Articles
Fix: Milvus Not Working — Connection Errors, Schema Setup, and Index Build Failures
How to fix Milvus errors — pymilvus connection refused localhost 19530, collection schema mismatch, index not built before search, partition not found, embedded vs standalone vs cluster, and flush before search.
Fix: Weaviate Not Working — Client v4 Migration, Schema Setup, and Vectorizer Errors
How to fix Weaviate errors — client v3 to v4 migration breaking imports, schema creation property mismatch, vectorizer module not loaded, connection refused localhost 8080, batch import errors, and hybrid search alpha tuning.
Fix: ChromaDB Not Working — Persistent Client, Collection Errors, and Embedding Function Issues
How to fix ChromaDB errors — persistent client not saving data, collection already exists error, dimension mismatch in embeddings, embedding function required, HTTP client connection refused, and memory growing unbounded.
Fix: Pinecone Not Working — Index Creation, Serverless vs Pod, and Python SDK v3 Migration
How to fix Pinecone errors — ApiException 401 unauthorized, index not found, dimension mismatch, serverless spec required, Python SDK v3 breaking changes, namespace confusion, and upsert rate limit 429.