SQL · Performance · Database Optimization

SQL Query Optimization Techniques

Write faster SQL queries. From EXPLAIN analysis to index usage to query refactoring for better performance.

Bootspring Team
Engineering
February 20, 2024
5 min read

Slow queries hurt user experience and waste server resources. This post walks through how to analyze and optimize SQL for better database performance.

Understanding EXPLAIN

```sql
-- PostgreSQL EXPLAIN
EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id) AS order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2024-01-01'
GROUP BY u.id;

-- Output interpretation
Aggregate  (cost=1234.56..1234.78 rows=100 width=40)
           (actual time=45.123..45.234 rows=100 loops=1)
  ->  Hash Join  (cost=100.00..1200.00 rows=1000 width=24)
                 (actual time=5.123..40.456 rows=1000 loops=1)
        Hash Cond: (o.user_id = u.id)
        ->  Seq Scan on orders o   -- Sequential scan (slow!)
        ->  Hash
              ->  Index Scan on users u
                    Filter: (created_at > '2024-01-01')
Planning Time: 0.5 ms
Execution Time: 45.5 ms
```
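Reading plans isn't PostgreSQL-specific. As a minimal, runnable sketch of the same idea, here is SQLite's `EXPLAIN QUERY PLAN` driven from Python. The schema mirrors the sample tables above; the exact plan strings vary by SQLite version, but the scan-vs-index difference is visible either way:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, created_at TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER);
""")

QUERY = """
EXPLAIN QUERY PLAN
SELECT u.name, COUNT(o.id)
FROM users u LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id
"""

# Without an index on orders.user_id, the join's inner side is a full scan.
plan_before = " ".join(row[-1] for row in conn.execute(QUERY))
print(plan_before)

# After adding the index, the planner can search it instead of scanning.
conn.execute("CREATE INDEX idx_orders_user_id ON orders(user_id)")
plan_after = " ".join(row[-1] for row in conn.execute(QUERY))
print(plan_after)
```

The "after" plan mentions `idx_orders_user_id`, confirming the index is actually used — the same check you would do with `EXPLAIN ANALYZE` in PostgreSQL.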

Common Performance Issues

```sql
-- ❌ N+1 problem
SELECT * FROM users;
-- Then for each user:
SELECT * FROM orders WHERE user_id = ?;

-- ✅ Join in a single query
SELECT u.*, o.*
FROM users u
LEFT JOIN orders o ON o.user_id = u.id;

-- ❌ SELECT *
SELECT * FROM users WHERE id = 1;

-- ✅ Select only needed columns
SELECT id, name, email FROM users WHERE id = 1;

-- ❌ Functions on indexed columns
SELECT * FROM users WHERE YEAR(created_at) = 2024;

-- ✅ Use a range instead
SELECT * FROM users
WHERE created_at >= '2024-01-01'
  AND created_at < '2025-01-01';
```
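The N+1 problem usually hides in application code, not in any single SQL statement. A small sketch with `sqlite3` and made-up sample rows shows both shapes producing the same result — one with N+1 round trips, one with a single joined query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER);
""")
conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)",
                 [(1, "Ada"), (2, "Lin")])
conn.executemany("INSERT INTO orders (user_id) VALUES (?)", [(1,), (1,), (2,)])

# ❌ N+1: one query for users, then one query per user
counts_n_plus_1 = {}
for user_id, name in conn.execute("SELECT id, name FROM users"):
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE user_id = ?", (user_id,)
    ).fetchone()
    counts_n_plus_1[name] = n

# ✅ One joined query answers the same question in a single round trip
counts_join = dict(conn.execute("""
    SELECT u.name, COUNT(o.id)
    FROM users u LEFT JOIN orders o ON o.user_id = u.id
    GROUP BY u.id
"""))

print(counts_n_plus_1 == counts_join)  # True: same answer, 1 query vs. N+1
```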

Index Optimization

```sql
-- Create indexes for common queries
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_created_at ON orders(created_at);

-- Composite indexes for multi-column queries
-- Column order matters: put equality-filtered columns first,
-- and match the order to the queries you run most often
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

-- Partial indexes for filtered queries
CREATE INDEX idx_active_users ON users(email)
WHERE status = 'active';

-- INCLUDE extra columns so the index covers the query (PostgreSQL 11+)
CREATE INDEX idx_orders_user_cover ON orders(user_id)
INCLUDE (total, status);
```
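Partial indexes are easy to test in miniature. SQLite also supports them (though not `INCLUDE`, which is PostgreSQL/SQL Server), so a quick Python check can confirm the planner picks up a partial index when the query's filter matches the index predicate:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, status TEXT)")
# Partial index: only rows with status = 'active' are indexed
conn.execute(
    "CREATE INDEX idx_active_users ON users(email) WHERE status = 'active'")

# The query's WHERE clause implies the index predicate, so the
# planner is allowed to use idx_active_users.
plan = " ".join(row[-1] for row in conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT id FROM users WHERE email = 'a@b.com' AND status = 'active'
"""))
print("idx_active_users" in plan)  # True
```

The payoff in production is a smaller index that still fully serves the hot query; a query without the `status = 'active'` condition cannot use it.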

Query Refactoring

```sql
-- ❌ Correlated subquery (re-evaluated for each outer row)
SELECT *
FROM products p
WHERE price > (
  SELECT AVG(price)
  FROM products
  WHERE category_id = p.category_id
);

-- ✅ JOIN against a CTE (each average computed once)
WITH category_avg AS (
  SELECT category_id, AVG(price) AS avg_price
  FROM products
  GROUP BY category_id
)
SELECT p.*
FROM products p
JOIN category_avg ca ON ca.category_id = p.category_id
WHERE p.price > ca.avg_price;

-- ❌ OR conditions (often prevent index usage)
SELECT * FROM users
WHERE email = 'a@b.com' OR phone = '123';

-- ✅ Use UNION (each branch can use its own index)
SELECT * FROM users WHERE email = 'a@b.com'
UNION
SELECT * FROM users WHERE phone = '123';

-- ❌ LIKE with a leading wildcard
SELECT * FROM products WHERE name LIKE '%widget%';

-- ✅ Use full-text search
CREATE INDEX idx_products_name_fts ON products
USING gin(to_tsvector('english', name));

SELECT * FROM products
WHERE to_tsvector('english', name) @@ to_tsquery('widget');
```
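When refactoring, it's worth verifying the rewrite is actually equivalent before comparing speed. A quick equivalence check for the correlated-subquery-to-CTE rewrite, run against `sqlite3` with throwaway sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, category_id INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO products (category_id, price) VALUES (?, ?)",
    [(1, 10), (1, 30), (2, 5), (2, 5), (2, 20)],
)

# Correlated subquery: AVG recomputed per row
correlated = conn.execute("""
    SELECT id FROM products p
    WHERE price > (SELECT AVG(price) FROM products
                   WHERE category_id = p.category_id)
    ORDER BY id
""").fetchall()

# CTE rewrite: one AVG per category, then a join
cte = conn.execute("""
    WITH category_avg AS (
        SELECT category_id, AVG(price) AS avg_price
        FROM products GROUP BY category_id
    )
    SELECT p.id FROM products p
    JOIN category_avg ca ON ca.category_id = p.category_id
    WHERE p.price > ca.avg_price
    ORDER BY p.id
""").fetchall()

print(correlated == cte)  # True: identical rows from both forms
```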

Pagination Optimization

```sql
-- ❌ Slow for large offsets: has to scan and discard 10,020 rows
SELECT * FROM products
ORDER BY created_at DESC
LIMIT 20 OFFSET 10000;

-- ✅ Keyset pagination (cursor-based)
SELECT * FROM products
WHERE created_at < '2024-01-15T10:30:00Z'
ORDER BY created_at DESC
LIMIT 20;

-- ✅ With a tie-breaker for non-unique columns
SELECT * FROM products
WHERE (created_at, id) < ('2024-01-15T10:30:00Z', 'last-id')
ORDER BY created_at DESC, id DESC
LIMIT 20;
```
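In application code, keyset pagination is a loop that carries the last-seen `(created_at, id)` cursor forward. A sketch with `sqlite3` and deliberately identical timestamps to exercise the tie-breaker; the row-value comparison from the SQL above is spelled out as `OR`/`AND` here for portability:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany("INSERT INTO products (id, created_at) VALUES (?, ?)",
                 [(i, "2024-01-01") for i in range(1, 8)])

def fetch_page(cursor, limit=3):
    # cursor = (created_at, id) of the last row on the previous page
    if cursor is None:
        return conn.execute(
            "SELECT id, created_at FROM products "
            "ORDER BY created_at DESC, id DESC LIMIT ?", (limit,)).fetchall()
    last_ts, last_id = cursor
    return conn.execute(
        "SELECT id, created_at FROM products "
        "WHERE created_at < ? OR (created_at = ? AND id < ?) "
        "ORDER BY created_at DESC, id DESC LIMIT ?",
        (last_ts, last_ts, last_id, limit)).fetchall()

seen, cursor = [], None
while True:
    page = fetch_page(cursor)
    if not page:
        break
    seen += [row[0] for row in page]
    cursor = (page[-1][1], page[-1][0])

print(seen)  # [7, 6, 5, 4, 3, 2, 1] -- every row exactly once, no OFFSET
```

Because every timestamp ties, the `id` tie-breaker is doing all the work here — without it, pages would skip or repeat rows.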

Aggregation Optimization

```sql
-- ❌ Counts with subqueries (one scan each)
SELECT
  (SELECT COUNT(*) FROM orders WHERE status = 'pending') AS pending,
  (SELECT COUNT(*) FROM orders WHERE status = 'completed') AS completed;

-- ✅ Single scan with FILTER
SELECT
  COUNT(*) FILTER (WHERE status = 'pending') AS pending,
  COUNT(*) FILTER (WHERE status = 'completed') AS completed
FROM orders;

-- ✅ Or use CASE (portable to databases without FILTER)
SELECT
  SUM(CASE WHEN status = 'pending' THEN 1 ELSE 0 END) AS pending,
  SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) AS completed
FROM orders;
```
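Both single-scan forms are easy to try locally. SQLite has supported aggregate `FILTER` since 3.30, so this sketch (with tiny made-up data) runs both variants and shows they agree:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO orders (status) VALUES (?)",
                 [("pending",), ("pending",), ("completed",)])

# Single scan with FILTER (SQLite >= 3.30; standard in PostgreSQL)
row = conn.execute("""
    SELECT COUNT(*) FILTER (WHERE status = 'pending'),
           COUNT(*) FILTER (WHERE status = 'completed')
    FROM orders
""").fetchone()
print(row)   # (2, 1)

# Portable equivalent with CASE
row2 = conn.execute("""
    SELECT SUM(CASE WHEN status = 'pending' THEN 1 ELSE 0 END),
           SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END)
    FROM orders
""").fetchone()
print(row2)  # (2, 1)
```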

Join Optimization

```sql
-- Prefer explicit JOINs over subqueries
-- ❌ Subquery
SELECT * FROM orders
WHERE user_id IN (
  SELECT id FROM users WHERE country = 'US'
);

-- ✅ JOIN
SELECT o.*
FROM orders o
JOIN users u ON u.id = o.user_id
WHERE u.country = 'US';

-- Use EXISTS for existence checks
-- ❌ IN with large result sets
SELECT * FROM users
WHERE id IN (SELECT user_id FROM orders);

-- ✅ EXISTS
SELECT * FROM users u
WHERE EXISTS (
  SELECT 1 FROM orders o WHERE o.user_id = u.id
);
```

Batch Operations

```sql
-- ❌ Multiple individual inserts
INSERT INTO logs (message) VALUES ('msg1');
INSERT INTO logs (message) VALUES ('msg2');
INSERT INTO logs (message) VALUES ('msg3');

-- ✅ Batch insert
INSERT INTO logs (message) VALUES
  ('msg1'),
  ('msg2'),
  ('msg3');

-- ❌ Update one row at a time
UPDATE products SET price = 100 WHERE id = 1;
UPDATE products SET price = 200 WHERE id = 2;

-- ✅ Batch update with CASE
UPDATE products
SET price = CASE id
  WHEN 1 THEN 100
  WHEN 2 THEN 200
END
WHERE id IN (1, 2);

-- ✅ Or join against VALUES (PostgreSQL)
UPDATE products p
SET price = v.new_price
FROM (VALUES (1, 100), (2, 200)) AS v(id, new_price)
WHERE p.id = v.id;
```
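From application code, the batch-insert pattern usually goes through the driver's bulk API rather than a hand-built multi-row `VALUES` string. A sketch with `sqlite3`'s `executemany` and a single commit:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, message TEXT)")

# ✅ One prepared statement, many parameter rows, one transaction --
# instead of 1000 individual INSERT round trips.
conn.executemany("INSERT INTO logs (message) VALUES (?)",
                 [(f"msg{i}",) for i in range(1000)])
conn.commit()

(count,) = conn.execute("SELECT COUNT(*) FROM logs").fetchone()
print(count)  # 1000
```

Keeping the batch inside one transaction matters as much as the batching itself: a commit per row forces a durability sync per row.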

Monitoring Queries

```sql
-- PostgreSQL: find slow queries (requires the pg_stat_statements extension)
SELECT
  query,
  calls,
  mean_exec_time,
  total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

-- Find tables that may be missing indexes
SELECT
  schemaname,
  relname,
  seq_scan,
  seq_tup_read,
  idx_scan
FROM pg_stat_user_tables
WHERE seq_scan > idx_scan
ORDER BY seq_tup_read DESC;

-- Find unused indexes
SELECT
  indexrelname,
  idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0;
```

Best Practices

DO:

✓ Always check EXPLAIN before production
✓ Index columns used in WHERE, JOIN, and ORDER BY
✓ Use covering indexes for frequent queries
✓ Batch operations when possible
✓ Use connection pooling
✓ Monitor slow query logs

DON'T:

✗ Select more columns than needed
✗ Use functions on indexed columns
✗ Ignore query plans
✗ Create indexes on low-cardinality columns
✗ Use OFFSET for deep pagination

Conclusion

Query optimization is iterative. Use EXPLAIN to understand execution plans, add appropriate indexes, and refactor queries that cause full table scans.

Monitor your production queries regularly—performance degrades as data grows.
