What Is a Database Index?
An index is a separate data structure — usually a B-tree — that the database maintains alongside your table to speed up lookups. Think of it like a book's index: instead of reading every page to find "authentication," you jump straight to the page number listed in the index.
Without an index, a query like SELECT * FROM users WHERE email = 'alex@example.com' scans every row in the table — an O(n) operation. With an index on email, it's O(log n).
How Indexes Work Internally
PostgreSQL (and most databases) use a B-tree (balanced tree) as the default index structure:
- All leaf nodes are at the same depth
- Each lookup traverses from root to leaf in O(log n) time
- The tree automatically stays balanced on inserts/updates
For a 10-million row table, that's about 23 comparisons instead of up to 10,000,000.
Creating Indexes
-- Single column index
CREATE INDEX idx_users_email ON users(email);
-- Composite index (column order matters!)
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);
-- Unique index (also enforces constraint)
CREATE UNIQUE INDEX idx_users_email_unique ON users(email);
-- Partial index (only indexes rows matching condition)
CREATE INDEX idx_active_users ON users(email) WHERE active = true;
When to Add an Index
Good candidates:
- Columns in
WHEREclauses you run frequently - Columns used in
JOINconditions - Columns in
ORDER BYwhen sorting large result sets - Columns with high cardinality (many unique values)
Poor candidates:
- Low cardinality columns (
gender,statuswith 2-3 values) — full scan is often faster - Columns on small tables (< 1,000 rows) — full scans are fast enough
- Columns that are rarely queried
Composite Indexes: Column Order Matters
CREATE INDEX idx_orders ON orders(user_id, status, created_at);
This index can satisfy:
WHERE user_id = 5✓WHERE user_id = 5 AND status = 'pending'✓WHERE user_id = 5 AND status = 'pending' AND created_at > '2025-01-01'✓WHERE status = 'pending'✗ (doesn't start with user_id)WHERE created_at > '2025-01-01'✗
The leftmost prefix rule: the index is usable for any query that specifies the leading columns in order.
Using EXPLAIN ANALYZE
Always verify your indexes are being used:
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE user_id = 123 AND status = 'pending'
ORDER BY created_at DESC;
Look for:
- Index Scan — using the index ✓
- Seq Scan — full table scan (check if you need an index)
- Bitmap Heap Scan — partial index use ✓
- cost and actual time numbers
Common Indexing Mistakes
1. Over-indexing
Every index speeds up reads but slows down writes. Each INSERT, UPDATE, DELETE must update all indexes on that table. A table with 15 indexes on a high-write workload will be slower than one with 5 well-chosen indexes.
2. Using functions on indexed columns
-- This WON'T use the index on email:
WHERE LOWER(email) = 'alex@example.com'
-- Fix: create a functional index
CREATE INDEX idx_users_lower_email ON users(LOWER(email));
-- Or store emails pre-lowercased
3. Implicit type casting
-- email is VARCHAR, this won't use the index:
WHERE email = 12345
-- Fix: match types
WHERE email = '12345'
Monitoring Index Usage
-- Find unused indexes (PostgreSQL)
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY tablename;
Drop indexes that never get used — they only cost write performance.
Conclusion
Indexes are one of the highest-leverage performance tools in a developer's toolkit. The formula is simple: identify slow queries with EXPLAIN ANALYZE, add indexes on the columns being filtered/joined/sorted, verify the index is used, and monitor for unused indexes over time. Most performance problems in database-backed apps are solved with the right indexing strategy.