Back to Blog
DatabaseIndexingPerformanceSQL

Database Index Optimization Techniques

Master database indexing. From choosing the right index type to analyzing query plans to maintaining index health.

B
Bootspring Team
Engineering
April 28, 2023
6 min read

Proper indexing can make your database queries 100x faster. Poor indexing wastes storage and slows down writes. Here's how to get it right.

Understanding Query Plans#

1-- Analyze query execution 2EXPLAIN ANALYZE SELECT * FROM orders WHERE user_id = 123; 3 4-- Output tells you: 5-- Seq Scan = No index used (bad for large tables) 6-- Index Scan = Index used efficiently 7-- Index Only Scan = All data from index (best) 8-- Bitmap Scan = Index used for filtering 9 10-- Check which indexes exist 11SELECT indexname, indexdef 12FROM pg_indexes 13WHERE tablename = 'orders';

Index Selection Guidelines#

1// Analyze your queries to determine indexes 2const queryPatterns = { 3 // Exact match - single column index 4 findByEmail: 'SELECT * FROM users WHERE email = $1', 5 // Index: CREATE INDEX idx_users_email ON users(email) 6 7 // Range query - index on range column 8 recentOrders: 'SELECT * FROM orders WHERE created_at > $1', 9 // Index: CREATE INDEX idx_orders_created ON orders(created_at) 10 11 // Multiple conditions - composite index 12 userOrders: 'SELECT * FROM orders WHERE user_id = $1 AND status = $2', 13 // Index: CREATE INDEX idx_orders_user_status ON orders(user_id, status) 14 15 // Sort with filter - include sort column 16 sortedOrders: 'SELECT * FROM orders WHERE user_id = $1 ORDER BY created_at DESC', 17 // Index: CREATE INDEX idx_orders_user_created ON orders(user_id, created_at DESC) 18};

Composite Index Ordering#

1-- The order of columns matters! 2 3-- Index: (user_id, status, created_at) 4 5-- ✅ Uses full index 6SELECT * FROM orders WHERE user_id = 1 AND status = 'pending' AND created_at > '2024-01-01'; 7 8-- ✅ Uses first two columns 9SELECT * FROM orders WHERE user_id = 1 AND status = 'pending'; 10 11-- ✅ Uses first column only 12SELECT * FROM orders WHERE user_id = 1; 13 14-- ❌ Cannot use index efficiently (skips first column) 15SELECT * FROM orders WHERE status = 'pending'; 16 17-- ❌ Cannot use index (skips first two columns) 18SELECT * FROM orders WHERE created_at > '2024-01-01'; 19 20-- Rule: Put equality conditions first, then range conditions 21CREATE INDEX idx_orders_optimized ON orders(user_id, status, created_at);

Covering Indexes#

1-- Covering index includes all columns needed by query 2-- Query doesn't need to access table data 3 4-- Query 5SELECT email, name FROM users WHERE status = 'active'; 6 7-- Covering index 8CREATE INDEX idx_users_status_covering ON users(status) INCLUDE (email, name); 9 10-- Now query uses "Index Only Scan" - fastest possible 11 12-- PostgreSQL 11+ 13CREATE INDEX idx_orders_covering ON orders(user_id) 14INCLUDE (total, status, created_at); 15 16-- This query won't access the table at all 17SELECT total, status FROM orders WHERE user_id = 123;

Partial Indexes#

1-- Index only relevant rows 2-- Smaller index = faster queries and less storage 3 4-- Only index active users 5CREATE INDEX idx_users_active_email ON users(email) WHERE status = 'active'; 6 7-- Only index recent orders 8CREATE INDEX idx_orders_recent ON orders(user_id, created_at) 9WHERE created_at > '2024-01-01'; 10 11-- Only index non-null values 12CREATE INDEX idx_users_phone ON users(phone) WHERE phone IS NOT NULL; 13 14-- Query must match WHERE clause to use partial index 15SELECT * FROM users WHERE email = 'test@example.com' AND status = 'active'; -- ✅ Uses index 16SELECT * FROM users WHERE email = 'test@example.com'; -- ❌ Cannot use partial index

Expression Indexes#

1-- Index on computed values 2 3-- Case-insensitive email search 4CREATE INDEX idx_users_email_lower ON users(LOWER(email)); 5-- Query must use same expression 6SELECT * FROM users WHERE LOWER(email) = 'test@example.com'; 7 8-- Date extraction 9CREATE INDEX idx_orders_month ON orders(DATE_TRUNC('month', created_at)); 10SELECT * FROM orders WHERE DATE_TRUNC('month', created_at) = '2024-01-01'; 11 12-- JSON field 13CREATE INDEX idx_users_plan ON users((metadata->>'plan')); 14SELECT * FROM users WHERE metadata->>'plan' = 'pro';

Index Maintenance#

1-- Find unused indexes 2SELECT 3 schemaname || '.' || relname AS table, 4 indexrelname AS index, 5 pg_size_pretty(pg_relation_size(i.indexrelid)) AS size, 6 idx_scan as index_scans 7FROM pg_stat_user_indexes ui 8JOIN pg_index i ON ui.indexrelid = i.indexrelid 9WHERE idx_scan = 0 10 AND NOT indisunique 11ORDER BY pg_relation_size(i.indexrelid) DESC; 12 13-- Find duplicate indexes 14SELECT 15 a.indrelid::regclass AS table_name, 16 a.indexrelid::regclass AS index1, 17 b.indexrelid::regclass AS index2 18FROM pg_index a 19JOIN pg_index b ON a.indrelid = b.indrelid 20 AND a.indexrelid < b.indexrelid 21 AND a.indkey = b.indkey; 22 23-- Check index bloat 24SELECT 25 tablename, 26 indexname, 27 pg_size_pretty(pg_relation_size(indexname::regclass)) as index_size, 28 idx_scan as times_used 29FROM pg_stat_user_indexes 30ORDER BY pg_relation_size(indexname::regclass) DESC 31LIMIT 20; 32 33-- Reindex to reduce bloat (use CONCURRENTLY to avoid locks) 34REINDEX INDEX CONCURRENTLY idx_orders_user_id; 35 36-- Update statistics 37ANALYZE orders;

Index for Sorting#

1-- Index can eliminate sort operations 2 3-- Without proper index: Sort step needed 4EXPLAIN SELECT * FROM orders WHERE user_id = 1 ORDER BY created_at DESC; 5-- -> Sort (expensive for large result sets) 6 7-- With matching index 8CREATE INDEX idx_orders_user_created_desc ON orders(user_id, created_at DESC); 9 10-- Now: Index Scan without Sort step 11EXPLAIN SELECT * FROM orders WHERE user_id = 1 ORDER BY created_at DESC; 12 13-- Multi-column sort 14CREATE INDEX idx_orders_user_status_date ON orders(user_id, status, created_at DESC); 15-- Supports: ORDER BY status ASC, created_at DESC

Monitoring Index Usage#

1// Track slow queries 2import { Pool } from 'pg'; 3 4const pool = new Pool(); 5 6// Log slow queries 7pool.on('query', (query) => { 8 const start = Date.now(); 9 10 query.on('end', () => { 11 const duration = Date.now() - start; 12 if (duration > 100) { 13 console.warn({ 14 query: query.text, 15 duration, 16 rows: query.rows?.length, 17 }); 18 } 19 }); 20}); 21 22// Periodic index analysis 23async function analyzeIndexUsage(): Promise<void> { 24 const result = await pool.query(` 25 SELECT 26 schemaname, 27 relname as table_name, 28 indexrelname as index_name, 29 idx_scan as times_used, 30 pg_size_pretty(pg_relation_size(indexrelid)) as size 31 FROM pg_stat_user_indexes 32 ORDER BY idx_scan ASC 33 LIMIT 20 34 `); 35 36 console.log('Least used indexes:', result.rows); 37}

Best Practices#

Creating Indexes: ✓ Index columns in WHERE clauses ✓ Index foreign keys ✓ Use composite indexes for multi-column queries ✓ Order: equality columns, then range, then sort Avoiding Problems: ✗ Don't index low-cardinality columns alone ✗ Don't create redundant indexes ✗ Don't ignore write performance impact ✗ Don't forget to ANALYZE after bulk changes Maintenance: ✓ Monitor index usage regularly ✓ Remove unused indexes ✓ Reindex periodically ✓ Update statistics

Conclusion#

Effective indexing requires understanding your query patterns. Use EXPLAIN ANALYZE to verify index usage, create targeted indexes, and regularly audit for unused or duplicate indexes. The right indexes can transform query performance from seconds to milliseconds.

Share this article

Help spread the word about Bootspring