Fix: PostgreSQL Index Not Being Used — Query Planner Ignores Index

Q: How do I fix "PostgreSQL Index Not Being Used — Query Planner Ignores Index"?

How to fix PostgreSQL indexes not being used — EXPLAIN ANALYZE output, function on indexed column, type mismatches, statistics staleness, partial indexes, and query planner costs.

The Problem

A PostgreSQL query is slow despite having an index on the relevant column:

-- Index exists
CREATE INDEX idx_orders_status ON orders(status);

-- Query is still doing a sequential scan
EXPLAIN ANALYZE
SELECT * FROM orders WHERE status = 'pending';

-- Output shows:
-- Seq Scan on orders  (cost=0.00..15420.00 rows=50 width=200)
--                     (actual time=0.012..89.432 rows=50 loops=1)
-- Filter: ((status)::text = 'pending'::text)

Or an index exists but EXPLAIN shows a different index being chosen:

-- Index on (user_id, created_at) exists
-- Query filters on user_id but still does a sequence scan
SELECT * FROM events WHERE user_id = 42 ORDER BY created_at DESC LIMIT 10;

Or an index was working but stopped being used after a data load:

-- After bulk insert of 500K rows, queries stopped using the index
-- Table now has 2M rows but planner still uses old statistics

In an incident, this manifests as p99 query latency spiking after a table crosses a statistics threshold, with every API endpoint depending on that table slowing in unison. Connection pools saturate, downstream services time out, and the on-call rotation gets paged for “the database is slow” without a clear culprit until someone runs EXPLAIN ANALYZE.

Why This Happens

PostgreSQL’s query planner uses cost-based optimization. It estimates the cost of every available plan and chooses the cheapest. An index isn’t always cheaper — the planner may legitimately prefer a sequential scan. But sometimes its estimate is wrong:

Low selectivity — if status = 'pending' matches 40% of rows, a sequential scan is often cheaper than random index lookups. The planner is usually right to skip the index.
Function applied to the indexed column — WHERE LOWER(email) = '[email protected]' can’t use an index on email. The index stores the original values, not the function’s output.
Type mismatch — WHERE user_id = '42' (string literal) doesn’t use an index on integer user_id in some cases. PostgreSQL must cast one side.
Stale statistics — after bulk inserts or deletes, pg_statistic may not reflect the new data distribution. ANALYZE updates statistics.
Small table — for tables with fewer than ~1000 rows, a sequential scan is often faster than index lookups. The planner correctly ignores the index.
High fill factor / bloat — index bloat from many updates makes the index larger and more expensive to scan.

There is also a class of “phantom” index-not-used incidents driven by infrastructure changes. After a failover to a read replica, statistics may not have been propagated. After a pg_dump/pg_restore migration, ANALYZE was forgotten and the new database has zero statistics — so every query plans against default assumptions until autovacuum catches up.

A subtle root cause: PostgreSQL only re-plans queries when the parameter values change enough to invalidate the cached plan. With prepared statements, the planner sometimes uses a “generic plan” after the fifth execution, and that generic plan can avoid a perfectly good index because it’s pessimistically conservative across the full domain of parameter values.

Fix 1: Read EXPLAIN ANALYZE Output

Understand what the planner is doing before tuning:

-- Always use EXPLAIN ANALYZE (executes the query and shows real timings)
-- EXPLAIN alone shows estimated costs without running
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders
WHERE user_id = 42 AND created_at > NOW() - INTERVAL '30 days';

Key output fields:

Index Scan using idx_orders_user_created on orders
  (cost=0.43..128.50 rows=47 width=200)
  (actual time=0.083..1.234 rows=52 loops=1)
  Index Cond: (user_id = 42)
  Filter: (created_at > (now() - '30 days'::interval))
  Rows Removed by Filter: 3
Buffers: shared hit=45 read=12
Planning Time: 0.412 ms
Execution Time: 1.298 ms

cost=start..total — planner’s estimated cost (lower is better; relative units)
rows= — estimated vs actual row count. Large discrepancies → stale statistics
Buffers: shared hit=X read=Y — hit = cached pages (fast), read = disk reads (slow)
Seq Scan — sequential scan (reading entire table)
Index Scan — uses index, fetches rows from heap
Index Only Scan — uses index alone, no heap access (fastest for covered queries)
Bitmap Index Scan + Bitmap Heap Scan — for high-selectivity scans returning many rows

Check for large estimated vs actual row count discrepancies:

-- Estimated 50 rows, actual 50000 → stale statistics or skewed distribution
-- This causes the planner to underestimate index scan cost
-- Fix: ANALYZE the table
ANALYZE orders;

Fix 2: Fix Functions on Indexed Columns

A function applied to a column prevents index use — create a functional index instead:

-- Problem: index on email doesn't help LOWER() queries
CREATE INDEX idx_users_email ON users(email);

-- Query can't use the index:
SELECT * FROM users WHERE LOWER(email) = '[email protected]';
-- Planner must evaluate LOWER(email) for every row

-- Fix 1: Create a functional index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));

-- Now this query uses the index:
SELECT * FROM users WHERE LOWER(email) = '[email protected]';

Common function patterns that break indexes:

-- Date/time functions
-- WRONG — can't use index on created_at
WHERE DATE(created_at) = '2024-01-15'
WHERE EXTRACT(YEAR FROM created_at) = 2024

-- CORRECT — use range query on the original column
WHERE created_at >= '2024-01-15' AND created_at < '2024-01-16'
WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01'

-- String functions
-- WRONG
WHERE UPPER(last_name) = 'SMITH'

-- CORRECT — functional index
CREATE INDEX idx_users_last_name_upper ON users(UPPER(last_name));

-- WRONG — LIKE with leading wildcard (index unusable)
WHERE name LIKE '%smith%'

-- CORRECT — pg_trgm for substring search
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_users_name_trgm ON users USING GIN(name gin_trgm_ops);
WHERE name ILIKE '%smith%'  -- Now uses GIN index

Fix 3: Fix Type Mismatches

Implicit type casts prevent index use in some cases:

-- Table definition
CREATE TABLE events (
  id BIGSERIAL PRIMARY KEY,
  user_id INTEGER,   -- Integer column
  status VARCHAR(20)
);

CREATE INDEX idx_events_user_id ON events(user_id);

-- PROBLEM — string literal '42' vs integer column
-- PostgreSQL may cast the column instead of the literal
EXPLAIN SELECT * FROM events WHERE user_id = '42';
-- May show Seq Scan due to implicit cast

-- FIX — use the correct type
SELECT * FROM events WHERE user_id = 42;   -- Integer literal

-- Verify column type
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'events' AND column_name = 'user_id';

UUID columns — common type mismatch:

-- UUID stored as text vs uuid type
-- A text index on a uuid column (or vice versa) won't be used with uuid literals

-- Check actual column type
SELECT pg_typeof(user_id) FROM events LIMIT 1;

-- If you must compare text to uuid, cast explicitly:
SELECT * FROM events WHERE user_id = '550e8400-e29b-41d4-a716-446655440000'::uuid;

Fix 4: Update Statistics After Bulk Operations

After bulk inserts, updates, or deletes, statistics may be stale:

-- After loading 1M rows
ANALYZE orders;     -- Update statistics for the orders table

-- Or analyze specific columns that are used in WHERE clauses
ANALYZE orders(status, created_at, user_id);

-- Check when statistics were last updated
SELECT
  schemaname,
  tablename,
  last_analyze,
  last_autoanalyze,
  n_live_tup,
  n_dead_tup
FROM pg_stat_user_tables
WHERE tablename = 'orders';

-- Check if autovacuum is keeping up
SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;

Increase statistics target for columns with skewed distributions:

-- Default statistics target is 100 samples
-- Increase for columns with highly non-uniform distribution
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;

-- Then analyze to regenerate with more samples
ANALYZE orders;

-- Check current statistics target
SELECT attname, attstattarget
FROM pg_attribute
WHERE attrelid = 'orders'::regclass AND attnum > 0;

Fix 5: Create the Right Index Type

Different query patterns need different index types:

-- B-tree (default) — equality, range, ORDER BY, LIKE 'prefix%'
CREATE INDEX idx_orders_created ON orders(created_at);

-- Range query — B-tree handles this well
SELECT * FROM orders WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31';

-- GIN — full-text search, JSONB, arrays
CREATE INDEX idx_products_tags ON products USING GIN(tags);  -- tags is JSONB or array
SELECT * FROM products WHERE tags @> ARRAY['electronics'];

-- GiST — geometric types, full-text search
CREATE INDEX idx_locations_point ON locations USING GIST(coordinates);

-- BRIN (Block Range INdex) — huge tables with naturally ordered data (logs, time-series)
-- Very small index size, works by tracking min/max per block
CREATE INDEX idx_events_created_brin ON events USING BRIN(created_at);
-- 100-1000x smaller than B-tree, but only useful for naturally ordered data

-- Hash — only equality (=) queries, not ranges
CREATE INDEX idx_users_token ON users USING HASH(session_token);
SELECT * FROM users WHERE session_token = 'abc123';

-- Composite index — column order matters
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
-- Usable for: WHERE user_id = 42
-- Usable for: WHERE user_id = 42 AND status = 'pending'
-- NOT usable for: WHERE status = 'pending' (leading column missing)

Partial indexes — index only the rows you query:

-- Only 5% of orders are 'pending' — index just those rows
CREATE INDEX idx_orders_pending ON orders(created_at)
WHERE status = 'pending';

-- Query must include the WHERE condition to use this index
SELECT * FROM orders
WHERE status = 'pending' AND created_at > NOW() - INTERVAL '7 days';

-- MUCH smaller index — faster to scan, less memory needed
-- Check size comparison:
SELECT
  indexname,
  pg_size_pretty(pg_relation_size(indexname::regclass)) AS size
FROM pg_indexes
WHERE tablename = 'orders';

Fix 6: Force Index Use for Testing

Use enable_seqscan = off to test whether an index would help:

-- Temporarily disable sequential scans to force index use
-- Use only for testing — never in production
SET enable_seqscan = off;

EXPLAIN ANALYZE
SELECT * FROM orders WHERE status = 'pending';

-- Compare:
-- Sequential scan: actual time=89.432ms
-- Index scan (forced): actual time=2.341ms
-- If index scan is much faster, the planner has wrong cost estimates

-- Reset after testing
SET enable_seqscan = on;

-- If forcing the index IS faster, diagnose why planner avoids it:
-- 1. Run ANALYZE to update statistics
-- 2. Check effective_cache_size and random_page_cost settings
-- 3. Consider increasing statistics target

Tune planner cost parameters:

-- Check current settings
SHOW random_page_cost;   -- Default: 4.0 (SSD should be ~1.1)
SHOW effective_cache_size;  -- Default: 4GB (set to available RAM)

-- For SSD-backed databases — index scans are less "expensive"
-- Set random_page_cost closer to seq_page_cost (1.0)
ALTER SYSTEM SET random_page_cost = 1.1;
ALTER SYSTEM SET effective_cache_size = '8GB';  -- Set to actual available RAM
SELECT pg_reload_conf();

-- Verify the planner now prefers the index
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'pending';

Fix 7: Monitor Index Usage

Find unused indexes wasting space and slowing writes:

-- Indexes never (or rarely) used in scans
SELECT
  schemaname,
  tablename,
  indexname,
  idx_scan,           -- Number of index scans
  idx_tup_read,       -- Rows returned by index scans
  idx_tup_fetch,      -- Rows fetched from table via index
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0    -- Never used since last stats reset
ORDER BY pg_relation_size(indexrelid) DESC;

-- Reset stats (do this periodically to get current usage)
-- SELECT pg_stat_reset();  -- Resets ALL stats — use carefully

-- Find duplicate indexes (same columns, different names)
SELECT
  indrelid::regclass AS table_name,
  array_agg(indexrelid::regclass) AS indexes,
  array_agg(indkey) AS keys
FROM pg_index
GROUP BY indrelid, indkey
HAVING COUNT(*) > 1;

-- Check index bloat
SELECT
  tablename,
  indexname,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
  pg_size_pretty(pg_relation_size(indrelid)) AS table_size
FROM pg_stat_user_indexes
JOIN pg_index USING (indexrelid)
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 20;

Fix 8: The Production Incident Playbook

When p99 latency spikes and you suspect an index regression, work through this checklist in order:

Step 1: Identify the slow query. Use pg_stat_statements to find which queries jumped in mean time:

SELECT
  query,
  calls,
  total_exec_time / calls AS avg_ms,
  rows / calls AS avg_rows
FROM pg_stat_statements
WHERE total_exec_time > 1000
ORDER BY total_exec_time DESC
LIMIT 20;

Step 2: Capture the current plan. Run EXPLAIN (ANALYZE, BUFFERS) on a representative parameter set. Compare against a baseline plan from a healthy snapshot — most teams should keep a daily archive of EXPLAIN output for top queries.

Step 3: Check statistics freshness. Query pg_stat_user_tables.last_analyze for the affected table. If it’s hours old after a recent bulk operation, run ANALYZE table_name immediately. This is the single fastest mitigation and resolves a large fraction of “index suddenly not used” incidents.

Step 4: Watch pg_stat_user_indexes.idx_scan. If idx_scan = 0 for an index that should be hot, the planner is not using it at all. If idx_scan is incrementing but the query is still slow, the planner is using a different (worse) index.

Step 5: Confirm with a forced plan. In a non-production session, SET enable_seqscan = off and re-run the query. If forcing the index produces dramatically better timings, you have planner cost-estimation drift — usually fixed by ANALYZE, tuning random_page_cost, or increasing the statistics target.

Blast radius: A single mis-planned query on a hot table can cascade. Each slow query holds a connection, which exhausts the pool, which forces application servers to queue requests, which times out upstream calls. Within minutes, an apparently isolated database issue presents as a site-wide outage.

Recovery: Run ANALYZE first (it’s safe, non-blocking, and resolves stale-statistics issues immediately). If that doesn’t fix it, look for missing or wrong-type indexes, then planner cost tuning. Avoid REINDEX during an active incident unless you have confirmed bloat — REINDEX without CONCURRENTLY takes a write lock.

Still Not Working?

Covering indexes eliminate heap access — include all columns needed by the query in the index to enable Index Only Scans:

-- Query fetches user_id, email, created_at — all in the index
CREATE INDEX idx_users_covering ON users(user_id) INCLUDE (email, created_at);

SELECT user_id, email, created_at FROM users WHERE user_id = 42;
-- Index Only Scan — never touches the heap (fastest possible)

HOT updates and index bloat — PostgreSQL’s HOT (Heap Only Tuple) optimization avoids updating indexes during updates when only non-indexed columns change. If indexed columns are frequently updated, index bloat grows. Use REINDEX CONCURRENTLY to rebuild without blocking:

REINDEX INDEX CONCURRENTLY idx_orders_status;

Indexes on very wide columns — B-tree indexes on columns wider than ~2700 bytes fail silently. Use expression indexes or hash indexes for very long strings.

Generic plan vs custom plan with prepared statements — PostgreSQL switches from custom plans to a generic plan after five executions of a prepared statement. The generic plan can avoid an index that the custom plan would use. Override with plan_cache_mode = force_custom_plan per session if you see prepared statements regressing after the fifth call.

Read replica lagging on statistics — on replicas, ANALYZE runs only on the primary and is replicated as part of the catalog. If you see a primary using an index but the replica doing a seq scan, check pg_stat_user_tables.last_analyze on both. Some replication topologies need an explicit ANALYZE push.

Parallel query disabled — for very large tables, the planner may choose a parallel sequential scan over an index scan because the parallel workers make seq scan competitive. Check max_parallel_workers_per_gather. If you want to suppress parallel for a query, set SET max_parallel_workers_per_gather = 0.