Fix: PostgreSQL Index Not Being Used — Query Planner Ignores Index
Part of: Database Errors
Quick Answer
How to fix PostgreSQL indexes not being used — EXPLAIN ANALYZE output, function on indexed column, type mismatches, statistics staleness, partial indexes, and query planner costs.
The Problem
A PostgreSQL query is slow despite having an index on the relevant column:
-- Index exists
CREATE INDEX idx_orders_status ON orders(status);
-- Query is still doing a sequential scan
EXPLAIN ANALYZE
SELECT * FROM orders WHERE status = 'pending';
-- Output shows:
-- Seq Scan on orders (cost=0.00..15420.00 rows=50 width=200)
-- (actual time=0.012..89.432 rows=50 loops=1)
-- Filter: ((status)::text = 'pending'::text)Or an index exists but EXPLAIN shows a different index being chosen:
-- Index on (user_id, created_at) exists
-- Query filters on user_id but still does a sequence scan
SELECT * FROM events WHERE user_id = 42 ORDER BY created_at DESC LIMIT 10;Or an index was working but stopped being used after a data load:
-- After bulk insert of 500K rows, queries stopped using the index
-- Table now has 2M rows but planner still uses old statisticsIn an incident, this manifests as p99 query latency spiking after a table crosses a statistics threshold, with every API endpoint depending on that table slowing in unison. Connection pools saturate, downstream services time out, and the on-call rotation gets paged for “the database is slow” without a clear culprit until someone runs EXPLAIN ANALYZE.
Why This Happens
PostgreSQL’s query planner uses cost-based optimization. It estimates the cost of every available plan and chooses the cheapest. An index isn’t always cheaper — the planner may legitimately prefer a sequential scan. But sometimes its estimate is wrong:
- Low selectivity — if
status = 'pending'matches 40% of rows, a sequential scan is often cheaper than random index lookups. The planner is usually right to skip the index. - Function applied to the indexed column —
WHERE LOWER(email) = '[email protected]'can’t use an index onemail. The index stores the original values, not the function’s output. - Type mismatch —
WHERE user_id = '42'(string literal) doesn’t use an index on integeruser_idin some cases. PostgreSQL must cast one side. - Stale statistics — after bulk inserts or deletes,
pg_statisticmay not reflect the new data distribution.ANALYZEupdates statistics. - Small table — for tables with fewer than ~1000 rows, a sequential scan is often faster than index lookups. The planner correctly ignores the index.
- High fill factor / bloat — index bloat from many updates makes the index larger and more expensive to scan.
There is also a class of “phantom” index-not-used incidents driven by infrastructure changes. After a failover to a read replica, statistics may not have been propagated. After a pg_dump/pg_restore migration, ANALYZE was forgotten and the new database has zero statistics — so every query plans against default assumptions until autovacuum catches up.
A subtle root cause: PostgreSQL only re-plans queries when the parameter values change enough to invalidate the cached plan. With prepared statements, the planner sometimes uses a “generic plan” after the fifth execution, and that generic plan can avoid a perfectly good index because it’s pessimistically conservative across the full domain of parameter values.
Fix 1: Read EXPLAIN ANALYZE Output
Understand what the planner is doing before tuning:
-- Always use EXPLAIN ANALYZE (executes the query and shows real timings)
-- EXPLAIN alone shows estimated costs without running
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders
WHERE user_id = 42 AND created_at > NOW() - INTERVAL '30 days';Key output fields:
Index Scan using idx_orders_user_created on orders
(cost=0.43..128.50 rows=47 width=200)
(actual time=0.083..1.234 rows=52 loops=1)
Index Cond: (user_id = 42)
Filter: (created_at > (now() - '30 days'::interval))
Rows Removed by Filter: 3
Buffers: shared hit=45 read=12
Planning Time: 0.412 ms
Execution Time: 1.298 mscost=start..total— planner’s estimated cost (lower is better; relative units)rows=— estimated vs actual row count. Large discrepancies → stale statisticsBuffers: shared hit=X read=Y—hit= cached pages (fast),read= disk reads (slow)Seq Scan— sequential scan (reading entire table)Index Scan— uses index, fetches rows from heapIndex Only Scan— uses index alone, no heap access (fastest for covered queries)Bitmap Index Scan+Bitmap Heap Scan— for high-selectivity scans returning many rows
Check for large estimated vs actual row count discrepancies:
-- Estimated 50 rows, actual 50000 → stale statistics or skewed distribution
-- This causes the planner to underestimate index scan cost
-- Fix: ANALYZE the table
ANALYZE orders;Fix 2: Fix Functions on Indexed Columns
A function applied to a column prevents index use — create a functional index instead:
-- Problem: index on email doesn't help LOWER() queries
CREATE INDEX idx_users_email ON users(email);
-- Query can't use the index:
SELECT * FROM users WHERE LOWER(email) = '[email protected]';
-- Planner must evaluate LOWER(email) for every row
-- Fix 1: Create a functional index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
-- Now this query uses the index:
SELECT * FROM users WHERE LOWER(email) = '[email protected]';Common function patterns that break indexes:
-- Date/time functions
-- WRONG — can't use index on created_at
WHERE DATE(created_at) = '2024-01-15'
WHERE EXTRACT(YEAR FROM created_at) = 2024
-- CORRECT — use range query on the original column
WHERE created_at >= '2024-01-15' AND created_at < '2024-01-16'
WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01'
-- String functions
-- WRONG
WHERE UPPER(last_name) = 'SMITH'
-- CORRECT — functional index
CREATE INDEX idx_users_last_name_upper ON users(UPPER(last_name));
-- WRONG — LIKE with leading wildcard (index unusable)
WHERE name LIKE '%smith%'
-- CORRECT — pg_trgm for substring search
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_users_name_trgm ON users USING GIN(name gin_trgm_ops);
WHERE name ILIKE '%smith%' -- Now uses GIN indexFix 3: Fix Type Mismatches
Implicit type casts prevent index use in some cases:
-- Table definition
CREATE TABLE events (
id BIGSERIAL PRIMARY KEY,
user_id INTEGER, -- Integer column
status VARCHAR(20)
);
CREATE INDEX idx_events_user_id ON events(user_id);
-- PROBLEM — string literal '42' vs integer column
-- PostgreSQL may cast the column instead of the literal
EXPLAIN SELECT * FROM events WHERE user_id = '42';
-- May show Seq Scan due to implicit cast
-- FIX — use the correct type
SELECT * FROM events WHERE user_id = 42; -- Integer literal
-- Verify column type
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'events' AND column_name = 'user_id';UUID columns — common type mismatch:
-- UUID stored as text vs uuid type
-- A text index on a uuid column (or vice versa) won't be used with uuid literals
-- Check actual column type
SELECT pg_typeof(user_id) FROM events LIMIT 1;
-- If you must compare text to uuid, cast explicitly:
SELECT * FROM events WHERE user_id = '550e8400-e29b-41d4-a716-446655440000'::uuid;Fix 4: Update Statistics After Bulk Operations
After bulk inserts, updates, or deletes, statistics may be stale:
-- After loading 1M rows
ANALYZE orders; -- Update statistics for the orders table
-- Or analyze specific columns that are used in WHERE clauses
ANALYZE orders(status, created_at, user_id);
-- Check when statistics were last updated
SELECT
schemaname,
tablename,
last_analyze,
last_autoanalyze,
n_live_tup,
n_dead_tup
FROM pg_stat_user_tables
WHERE tablename = 'orders';
-- Check if autovacuum is keeping up
SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;Increase statistics target for columns with skewed distributions:
-- Default statistics target is 100 samples
-- Increase for columns with highly non-uniform distribution
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;
-- Then analyze to regenerate with more samples
ANALYZE orders;
-- Check current statistics target
SELECT attname, attstattarget
FROM pg_attribute
WHERE attrelid = 'orders'::regclass AND attnum > 0;Fix 5: Create the Right Index Type
Different query patterns need different index types:
-- B-tree (default) — equality, range, ORDER BY, LIKE 'prefix%'
CREATE INDEX idx_orders_created ON orders(created_at);
-- Range query — B-tree handles this well
SELECT * FROM orders WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31';
-- GIN — full-text search, JSONB, arrays
CREATE INDEX idx_products_tags ON products USING GIN(tags); -- tags is JSONB or array
SELECT * FROM products WHERE tags @> ARRAY['electronics'];
-- GiST — geometric types, full-text search
CREATE INDEX idx_locations_point ON locations USING GIST(coordinates);
-- BRIN (Block Range INdex) — huge tables with naturally ordered data (logs, time-series)
-- Very small index size, works by tracking min/max per block
CREATE INDEX idx_events_created_brin ON events USING BRIN(created_at);
-- 100-1000x smaller than B-tree, but only useful for naturally ordered data
-- Hash — only equality (=) queries, not ranges
CREATE INDEX idx_users_token ON users USING HASH(session_token);
SELECT * FROM users WHERE session_token = 'abc123';
-- Composite index — column order matters
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
-- Usable for: WHERE user_id = 42
-- Usable for: WHERE user_id = 42 AND status = 'pending'
-- NOT usable for: WHERE status = 'pending' (leading column missing)Partial indexes — index only the rows you query:
-- Only 5% of orders are 'pending' — index just those rows
CREATE INDEX idx_orders_pending ON orders(created_at)
WHERE status = 'pending';
-- Query must include the WHERE condition to use this index
SELECT * FROM orders
WHERE status = 'pending' AND created_at > NOW() - INTERVAL '7 days';
-- MUCH smaller index — faster to scan, less memory needed
-- Check size comparison:
SELECT
indexname,
pg_size_pretty(pg_relation_size(indexname::regclass)) AS size
FROM pg_indexes
WHERE tablename = 'orders';Fix 6: Force Index Use for Testing
Use enable_seqscan = off to test whether an index would help:
-- Temporarily disable sequential scans to force index use
-- Use only for testing — never in production
SET enable_seqscan = off;
EXPLAIN ANALYZE
SELECT * FROM orders WHERE status = 'pending';
-- Compare:
-- Sequential scan: actual time=89.432ms
-- Index scan (forced): actual time=2.341ms
-- If index scan is much faster, the planner has wrong cost estimates
-- Reset after testing
SET enable_seqscan = on;
-- If forcing the index IS faster, diagnose why planner avoids it:
-- 1. Run ANALYZE to update statistics
-- 2. Check effective_cache_size and random_page_cost settings
-- 3. Consider increasing statistics targetTune planner cost parameters:
-- Check current settings
SHOW random_page_cost; -- Default: 4.0 (SSD should be ~1.1)
SHOW effective_cache_size; -- Default: 4GB (set to available RAM)
-- For SSD-backed databases — index scans are less "expensive"
-- Set random_page_cost closer to seq_page_cost (1.0)
ALTER SYSTEM SET random_page_cost = 1.1;
ALTER SYSTEM SET effective_cache_size = '8GB'; -- Set to actual available RAM
SELECT pg_reload_conf();
-- Verify the planner now prefers the index
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'pending';Fix 7: Monitor Index Usage
Find unused indexes wasting space and slowing writes:
-- Indexes never (or rarely) used in scans
SELECT
schemaname,
tablename,
indexname,
idx_scan, -- Number of index scans
idx_tup_read, -- Rows returned by index scans
idx_tup_fetch, -- Rows fetched from table via index
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0 -- Never used since last stats reset
ORDER BY pg_relation_size(indexrelid) DESC;
-- Reset stats (do this periodically to get current usage)
-- SELECT pg_stat_reset(); -- Resets ALL stats — use carefully
-- Find duplicate indexes (same columns, different names)
SELECT
indrelid::regclass AS table_name,
array_agg(indexrelid::regclass) AS indexes,
array_agg(indkey) AS keys
FROM pg_index
GROUP BY indrelid, indkey
HAVING COUNT(*) > 1;
-- Check index bloat
SELECT
tablename,
indexname,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
pg_size_pretty(pg_relation_size(indrelid)) AS table_size
FROM pg_stat_user_indexes
JOIN pg_index USING (indexrelid)
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 20;Fix 8: The Production Incident Playbook
When p99 latency spikes and you suspect an index regression, work through this checklist in order:
Step 1: Identify the slow query. Use pg_stat_statements to find which queries jumped in mean time:
SELECT
query,
calls,
total_exec_time / calls AS avg_ms,
rows / calls AS avg_rows
FROM pg_stat_statements
WHERE total_exec_time > 1000
ORDER BY total_exec_time DESC
LIMIT 20;Step 2: Capture the current plan. Run EXPLAIN (ANALYZE, BUFFERS) on a representative parameter set. Compare against a baseline plan from a healthy snapshot — most teams should keep a daily archive of EXPLAIN output for top queries.
Step 3: Check statistics freshness. Query pg_stat_user_tables.last_analyze for the affected table. If it’s hours old after a recent bulk operation, run ANALYZE table_name immediately. This is the single fastest mitigation and resolves a large fraction of “index suddenly not used” incidents.
Step 4: Watch pg_stat_user_indexes.idx_scan. If idx_scan = 0 for an index that should be hot, the planner is not using it at all. If idx_scan is incrementing but the query is still slow, the planner is using a different (worse) index.
Step 5: Confirm with a forced plan. In a non-production session, SET enable_seqscan = off and re-run the query. If forcing the index produces dramatically better timings, you have planner cost-estimation drift — usually fixed by ANALYZE, tuning random_page_cost, or increasing the statistics target.
Blast radius: A single mis-planned query on a hot table can cascade. Each slow query holds a connection, which exhausts the pool, which forces application servers to queue requests, which times out upstream calls. Within minutes, an apparently isolated database issue presents as a site-wide outage.
Recovery: Run ANALYZE first (it’s safe, non-blocking, and resolves stale-statistics issues immediately). If that doesn’t fix it, look for missing or wrong-type indexes, then planner cost tuning. Avoid REINDEX during an active incident unless you have confirmed bloat — REINDEX without CONCURRENTLY takes a write lock.
Still Not Working?
Covering indexes eliminate heap access — include all columns needed by the query in the index to enable Index Only Scans:
-- Query fetches user_id, email, created_at — all in the index
CREATE INDEX idx_users_covering ON users(user_id) INCLUDE (email, created_at);
SELECT user_id, email, created_at FROM users WHERE user_id = 42;
-- Index Only Scan — never touches the heap (fastest possible)HOT updates and index bloat — PostgreSQL’s HOT (Heap Only Tuple) optimization avoids updating indexes during updates when only non-indexed columns change. If indexed columns are frequently updated, index bloat grows. Use REINDEX CONCURRENTLY to rebuild without blocking:
REINDEX INDEX CONCURRENTLY idx_orders_status;Indexes on very wide columns — B-tree indexes on columns wider than ~2700 bytes fail silently. Use expression indexes or hash indexes for very long strings.
Generic plan vs custom plan with prepared statements — PostgreSQL switches from custom plans to a generic plan after five executions of a prepared statement. The generic plan can avoid an index that the custom plan would use. Override with plan_cache_mode = force_custom_plan per session if you see prepared statements regressing after the fifth call.
Read replica lagging on statistics — on replicas, ANALYZE runs only on the primary and is replicated as part of the catalog. If you see a primary using an index but the replica doing a seq scan, check pg_stat_user_tables.last_analyze on both. Some replication topologies need an explicit ANALYZE push.
Parallel query disabled — for very large tables, the planner may choose a parallel sequential scan over an index scan because the parallel workers make seq scan competitive. Check max_parallel_workers_per_gather. If you want to suppress parallel for a query, set SET max_parallel_workers_per_gather = 0.
For related database issues, see Fix: MySQL Index Not Being Used, Fix: PostgreSQL Deadlock Detected, Fix: Postgres Slow Query, and Fix: Postgres Max Connections Exceeded.
Solo developer based in Japan. Every solution is cross-referenced with official documentation and tested before publishing.
Was this article helpful?
Related Articles
Fix: PostgreSQL Slow Query — Finding and Optimizing with EXPLAIN ANALYZE
How to diagnose and fix slow PostgreSQL queries — reading EXPLAIN ANALYZE output, adding the right indexes, fixing N+1 queries, optimizing joins, and using pg_stat_statements.
Fix: PostgreSQL JSONB Query Not Working — Operator Errors, Wrong Results, or Slow Queries
How to fix PostgreSQL JSONB query issues — -> vs ->> operators, @> containment, GIN indexes, type casting, array queries, and indexing strategies for JSONB columns.
Fix: PostgreSQL Row Level Security Not Working — Policy Not Applied, All Rows Visible, or Permission Denied
How to fix PostgreSQL Row Level Security (RLS) issues — enabling RLS, policy expressions, BYPASSRLS role, SET ROLE, current_user vs session_user, and Supabase auth.uid() patterns.
Fix: MySQL Full-Text Search Not Working — MATCH AGAINST Returns No Results
How to fix MySQL full-text search issues — FULLTEXT index creation, minimum word length, stopwords, boolean mode vs natural language mode, InnoDB vs MyISAM, and LIKE fallback.