Back to Blog
DatabaseIndexingPerformancePostgreSQL

Database Indexing Strategies for Better Performance

Master database indexing to speed up queries. From B-tree basics to composite indexes to avoiding common pitfalls.

B
Bootspring Team
Engineering
January 18, 2025
6 min read

Indexes are the most important tool for query optimization. A well-designed index can turn a query from seconds to milliseconds. Here's how to use them effectively.

How Indexes Work#

Without index (table scan): Query: SELECT * FROM users WHERE email = 'john@example.com' Table: 1,000,000 rows Operation: Scan every row → 1,000,000 comparisons Time: O(n) With index (index lookup): B-tree index on email column Operation: Tree traversal → ~20 comparisons Time: O(log n)

Index Types#

B-Tree (Default)#

1-- Most common, works for equality and range queries 2CREATE INDEX idx_users_email ON users(email); 3 4-- Good for: 5SELECT * FROM users WHERE email = 'john@example.com'; 6SELECT * FROM users WHERE created_at > '2024-01-01'; 7SELECT * FROM users WHERE name LIKE 'John%'; -- prefix search 8SELECT * FROM users ORDER BY created_at DESC;

Hash Index#

1-- Only for equality comparisons, faster than B-tree for exact matches 2CREATE INDEX idx_users_email_hash ON users USING HASH (email); 3 4-- Good for: 5SELECT * FROM users WHERE email = 'john@example.com'; 6 7-- NOT good for: 8SELECT * FROM users WHERE email LIKE 'john%'; -- Can't use hash 9SELECT * FROM users ORDER BY email; -- Can't use for sorting

GIN (Generalized Inverted Index)#

1-- For array, JSONB, and full-text search 2CREATE INDEX idx_users_tags ON users USING GIN (tags); 3CREATE INDEX idx_products_data ON products USING GIN (metadata); 4 5-- Good for: 6SELECT * FROM users WHERE tags @> ARRAY['developer']; 7SELECT * FROM products WHERE metadata @> '{"featured": true}';

GiST (Generalized Search Tree)#

1-- For geometric and full-text data 2CREATE INDEX idx_locations_point ON locations USING GIST (coordinates); 3CREATE INDEX idx_documents_search ON documents USING GIST (search_vector); 4 5-- Good for: 6SELECT * FROM locations 7WHERE coordinates <@ box '((0,0),(100,100))';

Composite Indexes#

Column Order Matters#

1-- Index on (a, b, c) 2CREATE INDEX idx_composite ON orders(customer_id, status, created_at); 3 4-- Can use index: 5WHERE customer_id = 123 6WHERE customer_id = 123 AND status = 'pending' 7WHERE customer_id = 123 AND status = 'pending' AND created_at > '2024-01-01' 8 9-- Cannot use index efficiently: 10WHERE status = 'pending' -- Skips first column 11WHERE customer_id = 123 AND created_at > '2024-01-01' -- Skips middle column

Design Principles#

1-- Put equality conditions first, range conditions last 2-- Good 3CREATE INDEX idx_orders ON orders(status, customer_id, created_at); 4WHERE status = 'pending' AND customer_id = 123 AND created_at > '2024-01-01' 5 6-- Put high-selectivity columns first 7-- If status has 5 values but customer_id has 100,000 values: 8CREATE INDEX idx_orders ON orders(customer_id, status);

Covering Indexes (Index-Only Scans)#

1-- Include all columns needed by query 2CREATE INDEX idx_users_email_name ON users(email) INCLUDE (name, created_at); 3 4-- Query satisfied entirely from index, no table access 5SELECT name, created_at FROM users WHERE email = 'john@example.com'; 6 7-- Check if using index-only scan 8EXPLAIN SELECT name, created_at FROM users WHERE email = 'john@example.com'; 9-- Look for "Index Only Scan"

Partial Indexes#

1-- Index only subset of rows 2CREATE INDEX idx_active_users ON users(email) WHERE status = 'active'; 3CREATE INDEX idx_recent_orders ON orders(customer_id) WHERE created_at > '2024-01-01'; 4 5-- Much smaller than full index 6-- Only used when query matches WHERE clause 7 8-- Good for: 9SELECT * FROM users WHERE email = 'john@example.com' AND status = 'active';

Expression Indexes#

1-- Index on computed values 2CREATE INDEX idx_users_lower_email ON users(LOWER(email)); 3CREATE INDEX idx_orders_year ON orders(EXTRACT(YEAR FROM created_at)); 4CREATE INDEX idx_products_price_cents ON products((price * 100)::integer); 5 6-- Query must match expression exactly 7SELECT * FROM users WHERE LOWER(email) = 'john@example.com'; -- Uses index 8SELECT * FROM users WHERE email = 'john@example.com'; -- Doesn't use index

Analyzing Index Usage#

EXPLAIN ANALYZE#

1EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'john@example.com'; 2 3-- Output: 4Index Scan using idx_users_email on users (cost=0.42..8.44 rows=1 width=128) (actual time=0.028..0.029 rows=1 loops=1) 5 Index Cond: (email = 'john@example.com'::text) 6Planning Time: 0.081 ms 7Execution Time: 0.048 ms 8 9-- Key things to look for: 10-- - "Index Scan" or "Index Only Scan" = using index 11-- - "Seq Scan" = table scan (might need index) 12-- - "Bitmap Index Scan" = using index, then fetching rows 13-- - actual time vs estimated cost

Finding Unused Indexes#

1-- PostgreSQL: Check index usage stats 2SELECT 3 schemaname || '.' || relname AS table, 4 indexrelname AS index, 5 idx_scan AS times_used, 6 pg_size_pretty(pg_relation_size(indexrelid)) AS size 7FROM pg_stat_user_indexes 8WHERE idx_scan = 0 9ORDER BY pg_relation_size(indexrelid) DESC;

Finding Missing Indexes#

1-- Check for sequential scans on large tables 2SELECT 3 schemaname || '.' || relname AS table, 4 seq_scan, 5 seq_tup_read, 6 idx_scan, 7 n_live_tup AS row_count 8FROM pg_stat_user_tables 9WHERE seq_scan > 0 10 AND n_live_tup > 10000 11ORDER BY seq_tup_read DESC;

Common Patterns#

Foreign Key Indexes#

1-- Always index foreign keys (not automatic in PostgreSQL) 2CREATE TABLE orders ( 3 id SERIAL PRIMARY KEY, 4 customer_id INTEGER REFERENCES customers(id), 5 product_id INTEGER REFERENCES products(id) 6); 7 8CREATE INDEX idx_orders_customer ON orders(customer_id); 9CREATE INDEX idx_orders_product ON orders(product_id);

Pagination#

1-- Keyset pagination is faster than OFFSET 2CREATE INDEX idx_posts_created ON posts(created_at DESC, id DESC); 3 4-- Instead of: 5SELECT * FROM posts ORDER BY created_at DESC LIMIT 20 OFFSET 1000; 6 7-- Use: 8SELECT * FROM posts 9WHERE (created_at, id) < ('2024-01-15', 12345) 10ORDER BY created_at DESC, id DESC 11LIMIT 20;

Soft Deletes#

-- Partial index for non-deleted rows CREATE INDEX idx_users_active_email ON users(email) WHERE deleted_at IS NULL; -- Most queries only need active records SELECT * FROM users WHERE email = 'john@example.com' AND deleted_at IS NULL;

Anti-Patterns#

Over-Indexing#

1-- Don't index every column 2-- Each index: 3-- - Uses disk space 4-- - Slows down writes (INSERT, UPDATE, DELETE) 5-- - Requires maintenance 6 7-- Only index columns that are: 8-- - Used in WHERE clauses 9-- - Used in JOIN conditions 10-- - Used in ORDER BY

Low Selectivity Indexes#

-- Boolean columns often not worth indexing CREATE INDEX idx_users_active ON users(is_active); -- Usually not helpful -- Better: partial index CREATE INDEX idx_users_inactive ON users(email) WHERE is_active = false;

Functions Preventing Index Use#

1-- Index on email won't help: 2SELECT * FROM users WHERE LOWER(email) = 'john@example.com'; 3SELECT * FROM users WHERE email::text = 'john@example.com'; 4SELECT * FROM users WHERE COALESCE(email, '') = 'john@example.com'; 5 6-- Either create expression index or rewrite query

Maintenance#

Monitoring Index Bloat#

1-- Check index bloat (PostgreSQL) 2SELECT 3 schemaname || '.' || relname AS table, 4 indexrelname AS index, 5 pg_size_pretty(pg_relation_size(indexrelid)) AS size, 6 idx_scan AS scans 7FROM pg_stat_user_indexes 8ORDER BY pg_relation_size(indexrelid) DESC;

Rebuilding Indexes#

1-- Rebuild without blocking (PostgreSQL) 2REINDEX INDEX CONCURRENTLY idx_users_email; 3 4-- Or create new and swap 5CREATE INDEX CONCURRENTLY idx_users_email_new ON users(email); 6DROP INDEX idx_users_email; 7ALTER INDEX idx_users_email_new RENAME TO idx_users_email;

Conclusion#

Indexes are essential for query performance but come with trade-offs. Start with indexes on foreign keys and frequently-queried columns. Use EXPLAIN ANALYZE to verify index usage. Monitor for unused indexes and remove them.

Remember: the best index is one that helps your actual queries. Profile first, then optimize.

Share this article

Help spread the word about Bootspring