Slow queries impact user experience and server resources. Learn to analyze and optimize SQL for better database performance.
## Understanding EXPLAIN
```sql
-- PostgreSQL EXPLAIN
EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2024-01-01'
GROUP BY u.id;

-- Output interpretation
Aggregate  (cost=1234.56..1234.78 rows=100 width=40)
           (actual time=45.123..45.234 rows=100 loops=1)
  ->  Hash Join  (cost=100.00..1200.00 rows=1000 width=24)
                 (actual time=5.123..40.456 rows=1000 loops=1)
        Hash Cond: (o.user_id = u.id)
        ->  Seq Scan on orders o  -- Sequential scan (slow!)
        ->  Hash
              ->  Index Scan on users u
                    Filter: (created_at > '2024-01-01')
Planning Time: 0.5 ms
Execution Time: 45.5 ms
```

## Common Performance Issues
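The first anti-pattern below, N+1, usually originates in application code: a loop that issues one query per row. Here is what it looks like, and its single-query fix, sketched with Python's stdlib sqlite3 as a stand-in database (the schema and data are invented for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
    INSERT INTO users VALUES (1, 'Ada'), (2, 'Linus');
    INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 20.0), (3, 2, 5.0);
""")

# ❌ N+1: one query (and, over a network, one round trip) per user
def order_counts_n_plus_one(conn):
    counts = {}
    for (user_id,) in conn.execute("SELECT id FROM users"):
        (n,) = conn.execute(
            "SELECT COUNT(*) FROM orders WHERE user_id = ?", (user_id,)
        ).fetchone()
        counts[user_id] = n
    return counts

# ✅ One query with a join
def order_counts_joined(conn):
    rows = conn.execute("""
        SELECT u.id, COUNT(o.id)
        FROM users u
        LEFT JOIN orders o ON o.user_id = u.id
        GROUP BY u.id
    """)
    return dict(rows)

assert order_counts_n_plus_one(conn) == order_counts_joined(conn) == {1: 2, 2: 1}
```

Both functions return the same counts, but the joined version does it in one statement regardless of how many users exist.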
```sql
-- ❌ N+1 problem
SELECT * FROM users;
-- Then for each user:
SELECT * FROM orders WHERE user_id = ?;

-- ✅ Join in a single query
SELECT u.*, o.*
FROM users u
LEFT JOIN orders o ON o.user_id = u.id;

-- ❌ SELECT *
SELECT * FROM users WHERE id = 1;

-- ✅ Select only needed columns
SELECT id, name, email FROM users WHERE id = 1;

-- ❌ Functions on indexed columns
-- (YEAR() is MySQL; the PostgreSQL equivalent is EXTRACT(YEAR FROM created_at))
SELECT * FROM users WHERE YEAR(created_at) = 2024;

-- ✅ Use a range instead
SELECT * FROM users
WHERE created_at >= '2024-01-01'
  AND created_at < '2025-01-01';
```

## Index Optimization
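Before writing an index migration, it is worth watching the plan actually change. A small experiment using Python's stdlib sqlite3, whose `EXPLAIN QUERY PLAN` is a rough analog of PostgreSQL's `EXPLAIN` (table and index names are invented for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, status TEXT)")

def plan(sql):
    # EXPLAIN QUERY PLAN rows describe each step in their last column
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT id FROM users WHERE email = 'a@b.com'"
before = plan(query)   # expect a full-table SCAN step

conn.execute("CREATE INDEX idx_users_email ON users(email)")
after = plan(query)    # expect a SEARCH step using the new index

assert "SCAN" in before and "SEARCH" in after
```

The same workflow applies in PostgreSQL: run `EXPLAIN` on the query, create the index, and confirm the plan switches from a sequential scan to an index scan.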
```sql
-- Create indexes for common queries
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_created_at ON orders(created_at);

-- Composite indexes for multi-column queries
-- Order matters! Put columns used with equality filters first;
-- a query filtering on status alone cannot use this index.
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

-- Partial indexes for filtered queries
CREATE INDEX idx_active_users ON users(email)
WHERE status = 'active';

-- INCLUDE columns to enable index-only scans (PostgreSQL 11+)
CREATE INDEX idx_orders_user_cover ON orders(user_id)
INCLUDE (total, status);
```

## Query Refactoring
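Whenever you refactor a query, verify the rewrite returns the same rows as the original before comparing plans. A sketch checking the correlated-subquery-to-CTE rewrite shown below, using stdlib sqlite3 with invented demo data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, category_id INTEGER, price REAL);
    INSERT INTO products VALUES
        (1, 1, 10), (2, 1, 30),   -- category 1 average price = 20
        (3, 2, 5),  (4, 2, 15);   -- category 2 average price = 10
""")

correlated = """
    SELECT id FROM products p
    WHERE price > (SELECT AVG(price) FROM products WHERE category_id = p.category_id)
"""
cte = """
    WITH category_avg AS (
        SELECT category_id, AVG(price) AS avg_price FROM products GROUP BY category_id
    )
    SELECT p.id FROM products p
    JOIN category_avg ca ON ca.category_id = p.category_id
    WHERE p.price > ca.avg_price
"""

# Both forms must return the same products: 2 and 4 exceed their category average
assert sorted(conn.execute(correlated)) == sorted(conn.execute(cte)) == [(2,), (4,)]
```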
```sql
-- ❌ Correlated subquery (re-evaluated for every outer row)
SELECT *
FROM products p
WHERE price > (
    SELECT AVG(price)
    FROM products
    WHERE category_id = p.category_id
);

-- ✅ JOIN against a CTE computed once
WITH category_avg AS (
    SELECT category_id, AVG(price) AS avg_price
    FROM products
    GROUP BY category_id
)
SELECT p.*
FROM products p
JOIN category_avg ca ON ca.category_id = p.category_id
WHERE p.price > ca.avg_price;

-- ❌ OR across different columns (often prevents index usage)
SELECT * FROM users
WHERE email = 'a@b.com' OR phone = '123';

-- ✅ UNION lets each branch use its own index
SELECT * FROM users WHERE email = 'a@b.com'
UNION
SELECT * FROM users WHERE phone = '123';

-- ❌ LIKE with a leading wildcard (cannot use a B-tree index)
SELECT * FROM products WHERE name LIKE '%widget%';

-- ✅ Use full-text search
CREATE INDEX idx_products_name_fts ON products
USING gin(to_tsvector('english', name));

SELECT * FROM products
WHERE to_tsvector('english', name) @@ to_tsquery('widget');
```

## Pagination Optimization
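The keyset (cursor-based) technique shown in SQL below is easy to exercise end to end. A sketch with stdlib sqlite3, which supports the same row-value comparison `(created_at, id) < (?, ?)` as PostgreSQL (the 30-row dataset is invented for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(i, f"2024-01-{i:02d}") for i in range(1, 31)],  # 30 rows, one per day
)

def fetch_page(cursor_key=None, page_size=10):
    """Keyset pagination: the (created_at, id) of the last row seen is the cursor."""
    if cursor_key is None:
        sql = "SELECT id, created_at FROM products ORDER BY created_at DESC, id DESC LIMIT ?"
        return conn.execute(sql, (page_size,)).fetchall()
    sql = """
        SELECT id, created_at FROM products
        WHERE (created_at, id) < (?, ?)
        ORDER BY created_at DESC, id DESC LIMIT ?
    """
    return conn.execute(sql, (*cursor_key, page_size)).fetchall()

page1 = fetch_page()
page2 = fetch_page(cursor_key=(page1[-1][1], page1[-1][0]))

assert [r[0] for r in page1] == list(range(30, 20, -1))  # newest 10 rows
assert [r[0] for r in page2] == list(range(20, 10, -1))  # the next 10, no overlap
```

Each page is fetched with an index-friendly range predicate; cost stays flat no matter how deep the reader paginates, unlike OFFSET.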
```sql
-- ❌ Slow for large offsets
SELECT * FROM products
ORDER BY created_at DESC
LIMIT 20 OFFSET 10000;
-- Has to scan 10,020 rows

-- ✅ Keyset pagination (cursor-based)
SELECT * FROM products
WHERE created_at < '2024-01-15T10:30:00Z'
ORDER BY created_at DESC
LIMIT 20;

-- ✅ With a tie-breaker for non-unique columns
SELECT * FROM products
WHERE (created_at, id) < ('2024-01-15T10:30:00Z', 'last-id')
ORDER BY created_at DESC, id DESC
LIMIT 20;
```

## Aggregation Optimization
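The FILTER clause below is standard PostgreSQL; the CASE variant runs on virtually any engine. A quick check of the single-scan CASE approach using stdlib sqlite3 (the orders data is invented for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO orders (status) VALUES (?)",
    [("pending",)] * 3 + [("completed",)] * 5,
)

# One pass over the table instead of one subquery (and one scan) per status
row = conn.execute("""
    SELECT
        SUM(CASE WHEN status = 'pending' THEN 1 ELSE 0 END),
        SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END)
    FROM orders
""").fetchone()

assert row == (3, 5)
```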
```sql
-- ❌ One subquery (and one scan) per count
SELECT
    (SELECT COUNT(*) FROM orders WHERE status = 'pending') AS pending,
    (SELECT COUNT(*) FROM orders WHERE status = 'completed') AS completed;

-- ✅ Single scan with FILTER
SELECT
    COUNT(*) FILTER (WHERE status = 'pending') AS pending,
    COUNT(*) FILTER (WHERE status = 'completed') AS completed
FROM orders;

-- ✅ Or use CASE
SELECT
    SUM(CASE WHEN status = 'pending' THEN 1 ELSE 0 END) AS pending,
    SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) AS completed
FROM orders;
```

## Join Optimization
```sql
-- Prefer explicit JOINs over subqueries
-- ❌ Subquery
SELECT * FROM orders
WHERE user_id IN (
    SELECT id FROM users WHERE country = 'US'
);

-- ✅ JOIN
SELECT o.*
FROM orders o
JOIN users u ON u.id = o.user_id
WHERE u.country = 'US';

-- Use EXISTS for existence checks
-- ❌ IN with large results
SELECT * FROM users
WHERE id IN (SELECT user_id FROM orders);

-- ✅ EXISTS (can stop at the first matching row)
SELECT * FROM users u
WHERE EXISTS (
    SELECT 1 FROM orders o WHERE o.user_id = u.id
);
```

## Batch Operations
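Batching applies at the driver level too. In Python's stdlib sqlite3, `executemany` submits the whole batch in one call; against a networked database the savings are far larger, since each individual statement would cost a round trip (table and data are invented for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, message TEXT)")

messages = [("msg1",), ("msg2",), ("msg3",)]

# ❌ One statement per row:
# for m in messages:
#     conn.execute("INSERT INTO logs (message) VALUES (?)", m)

# ✅ One batched call; the statement is prepared once and reused per row
conn.executemany("INSERT INTO logs (message) VALUES (?)", messages)
conn.commit()

(count,) = conn.execute("SELECT COUNT(*) FROM logs").fetchone()
assert count == 3
```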
```sql
-- ❌ Multiple individual inserts
INSERT INTO logs (message) VALUES ('msg1');
INSERT INTO logs (message) VALUES ('msg2');
INSERT INTO logs (message) VALUES ('msg3');

-- ✅ Batch insert
INSERT INTO logs (message) VALUES
    ('msg1'),
    ('msg2'),
    ('msg3');

-- ❌ Update one by one
UPDATE products SET price = 100 WHERE id = 1;
UPDATE products SET price = 200 WHERE id = 2;

-- ✅ Batch update
UPDATE products
SET price = CASE id
    WHEN 1 THEN 100
    WHEN 2 THEN 200
END
WHERE id IN (1, 2);

-- ✅ Or use VALUES (PostgreSQL)
UPDATE products p
SET price = v.new_price
FROM (VALUES (1, 100), (2, 200)) AS v(id, new_price)
WHERE p.id = v.id;
```

## Monitoring Queries
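The views below aggregate timings on the server. You can complement them with application-side instrumentation; here is a hedged sketch (the `timed_execute` helper and `SLOW_QUERY_MS` threshold are hypothetical, not a standard API, and the threshold is set to 0 so the demo records something):

```python
import sqlite3
import time

SLOW_QUERY_MS = 0.0  # hypothetical threshold; in practice use e.g. 100 ms

slow_log = []

def timed_execute(conn, sql, params=()):
    """Run a query and record it if it exceeds the slow-query threshold."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    elapsed_ms = (time.perf_counter() - start) * 1000
    if elapsed_ms >= SLOW_QUERY_MS:
        slow_log.append((sql, elapsed_ms))
    return rows

conn = sqlite3.connect(":memory:")
timed_execute(conn, "SELECT 1")

assert slow_log and slow_log[0][0] == "SELECT 1"
```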
```sql
-- PostgreSQL: find slow queries (requires the pg_stat_statements extension)
SELECT
    query,
    calls,
    mean_exec_time,
    total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

-- Find tables that may be missing indexes
SELECT
    schemaname,
    relname,
    seq_scan,
    seq_tup_read,
    idx_scan
FROM pg_stat_user_tables
WHERE seq_scan > idx_scan
ORDER BY seq_tup_read DESC;

-- Find unused indexes
SELECT
    indexrelname,
    idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0;
```

## Best Practices
**DO:**

- ✓ Always check EXPLAIN before shipping a query to production
- ✓ Index columns used in WHERE, JOIN, and ORDER BY
- ✓ Use covering indexes for frequent queries
- ✓ Batch operations when possible
- ✓ Use connection pooling
- ✓ Monitor slow query logs

**DON'T:**

- ✗ Select more columns than needed
- ✗ Use functions on indexed columns in WHERE clauses
- ✗ Ignore query plans
- ✗ Create indexes on low-cardinality columns
- ✗ Use OFFSET for deep pagination
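Two of the DO items, batching and connection pooling, live in application code rather than SQL. A minimal pooling sketch (the `ConnectionPool` class is hypothetical, for illustration only; in production use your driver's or framework's pool, such as psycopg_pool or SQLAlchemy's built-in pooling):

```python
import queue
import sqlite3
from contextlib import contextmanager

class ConnectionPool:
    """Hypothetical minimal pool: hand out pre-opened connections, return on release."""
    def __init__(self, factory, size=5):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(factory())  # pay the connection cost once, up front

    @contextmanager
    def connection(self):
        conn = self._pool.get()        # blocks if every connection is checked out
        try:
            yield conn
        finally:
            self._pool.put(conn)       # return to the pool instead of closing

pool = ConnectionPool(lambda: sqlite3.connect(":memory:"), size=2)
with pool.connection() as conn:
    assert conn.execute("SELECT 1").fetchone() == (1,)
```

The point of pooling is amortizing connection setup (TCP handshake, auth, backend process spawn in PostgreSQL) across many queries instead of paying it per request.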
## Conclusion
Query optimization is iterative. Use EXPLAIN to understand execution plans, add appropriate indexes, and refactor queries that cause full table scans.
Monitor your production queries regularly—performance degrades as data grows.