
SQL Query Optimization: Writing Faster Database Queries

Optimize SQL queries for better performance. Learn query analysis, execution plans, and techniques to speed up your database operations.

Bootspring Team
Engineering
February 26, 2026
7 min read

Slow queries can cripple application performance. This guide covers techniques to analyze, understand, and optimize SQL queries for PostgreSQL and other databases.

Understanding Query Execution#

EXPLAIN ANALYZE#

Always start with understanding how your query executes:

```sql
EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2024-01-01'
GROUP BY u.id;

-- Output:
-- HashAggregate  (cost=1234.56..1234.78 rows=100 width=40) (actual time=45.123..45.456 rows=98 loops=1)
--   Group Key: u.id
--   ->  Hash Right Join  (cost=123.45..1200.00 rows=5000 width=36) (actual time=1.234..40.567 rows=4532 loops=1)
--         Hash Cond: (o.user_id = u.id)
--         ->  Seq Scan on orders o  (cost=0.00..800.00 rows=50000 width=8) (actual time=0.012..20.345 rows=50000 loops=1)
--         ->  Hash  (cost=100.00..100.00 rows=1000 width=36) (actual time=1.000..1.000 rows=1000 loops=1)
--               ->  Seq Scan on users u  (cost=0.00..100.00 rows=1000 width=36) (actual time=0.010..0.800 rows=1000 loops=1)
--                     Filter: (created_at > '2024-01-01'::date)
-- Planning Time: 0.234 ms
-- Execution Time: 45.789 ms
```
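EXPLAIN accepts more options than ANALYZE alone. Two variants worth knowing (standard PostgreSQL syntax, shown against the same query): BUFFERS reveals how much I/O each plan node performed, and FORMAT JSON produces machine-readable output for plan visualizers. Note that ANALYZE actually executes the query, so wrap data-modifying statements in a transaction you roll back.

```sql
-- BUFFERS adds shared-buffer hit/read counts per plan node,
-- distinguishing cache hits from real disk reads
EXPLAIN (ANALYZE, BUFFERS)
SELECT u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > '2024-01-01'
GROUP BY u.id;

-- FORMAT JSON is convenient for tooling and plan visualizers
EXPLAIN (ANALYZE, FORMAT JSON) SELECT * FROM users WHERE id = 1;

-- ANALYZE runs the statement for real; protect writes like this:
BEGIN;
EXPLAIN ANALYZE UPDATE users SET name = 'x' WHERE id = 1;
ROLLBACK;
```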

Key Metrics to Watch#

```sql
-- Cost:  Estimated work (arbitrary units)
-- Rows:  Estimated vs actual row counts
-- Time:  Planning + execution time
-- Loops: Number of times the operation executed

-- Look for:
-- 1. Seq Scan on large tables (missing index?)
-- 2. Large difference between estimated and actual rows
-- 3. Nested loops with high loop counts
-- 4. Sort operations on large datasets
```
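When estimated and actual row counts diverge badly, the planner's statistics are usually stale or too coarse. A quick sketch of refreshing them (standard PostgreSQL commands, using the article's tables):

```sql
-- Recompute planner statistics for one table
ANALYZE users;

-- For columns with skewed distributions, collect a larger sample
-- (the default statistics target is 100)
ALTER TABLE users ALTER COLUMN created_at SET STATISTICS 500;
ANALYZE users;
```

Autovacuum normally runs ANALYZE for you, but bulk loads and freshly created tables often need a manual pass before the planner makes good choices.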

Common Optimizations#

1. Use Appropriate Indexes#

```sql
-- Before: Seq Scan
SELECT * FROM orders WHERE customer_email = 'john@example.com';

-- Add index
CREATE INDEX idx_orders_customer_email ON orders(customer_email);

-- After: Index Scan (much faster)

-- Composite index for multiple columns
CREATE INDEX idx_orders_status_date ON orders(status, created_at);

-- Covers queries like:
SELECT * FROM orders WHERE status = 'pending' AND created_at > '2024-01-01';
```

2. Avoid SELECT *#

```sql
-- Bad: Fetches all columns
SELECT * FROM users WHERE id = 123;

-- Good: Only fetch what you need
SELECT id, name, email FROM users WHERE id = 123;

-- Even better with a covering index
CREATE INDEX idx_users_email_name ON users(id) INCLUDE (name, email);
-- Enables an index-only scan
```

3. Optimize JOINs#

```sql
-- Ensure join columns are indexed
CREATE INDEX idx_orders_user_id ON orders(user_id);

-- Use the appropriate join type
-- INNER JOIN when you need matches in both tables
SELECT u.name, o.total
FROM users u
INNER JOIN orders o ON o.user_id = u.id;

-- LEFT JOIN when you need all rows from the left table
-- (COUNT(o.id) is already 0 for users with no orders)
SELECT u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id;
```

4. Limit Result Sets#

```sql
-- Bad: Fetch everything, filter in the app
SELECT * FROM products WHERE category = 'electronics';

-- Good: Paginate in the database
SELECT * FROM products
WHERE category = 'electronics'
ORDER BY created_at DESC
LIMIT 20 OFFSET 0;

-- Better: Keyset pagination for large datasets
SELECT * FROM products
WHERE category = 'electronics'
  AND created_at < '2024-01-15 10:30:00'
ORDER BY created_at DESC
LIMIT 20;
```
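One caveat with the keyset query above: if several products share the same created_at, rows can be skipped or repeated between pages. A common fix (a sketch against the same hypothetical products table; the cursor values 8841 and the timestamp stand in for the last row of the previous page) adds the primary key as a tie-breaker and uses a row-value comparison:

```sql
-- Paginate on the unique pair (created_at, id) instead of created_at alone
SELECT * FROM products
WHERE category = 'electronics'
  AND (created_at, id) < ('2024-01-15 10:30:00', 8841)
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- Matching index so the sort is satisfied directly
CREATE INDEX idx_products_cat_created_id
    ON products(category, created_at DESC, id DESC);
```

PostgreSQL compares the row values lexicographically, so the predicate lines up exactly with the ORDER BY and the index.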

5. Optimize Subqueries#

```sql
-- Bad: Correlated subquery (runs for each row)
SELECT u.name,
       (SELECT COUNT(*) FROM orders WHERE user_id = u.id) as order_count
FROM users u;

-- Good: JOIN with aggregation
SELECT u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id, u.name;

-- Alternative: LATERAL join for complex subqueries
SELECT u.name, recent.total
FROM users u
LEFT JOIN LATERAL (
    SELECT SUM(total) as total
    FROM orders
    WHERE user_id = u.id
      AND created_at > NOW() - INTERVAL '30 days'
) recent ON true;
```

6. Use EXISTS Instead of IN#

```sql
-- Slower with large subquery results
SELECT * FROM users
WHERE id IN (SELECT user_id FROM orders WHERE total > 1000);

-- Faster: EXISTS stops at the first match
SELECT * FROM users u
WHERE EXISTS (
    SELECT 1 FROM orders o
    WHERE o.user_id = u.id AND o.total > 1000
);
```
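Recent PostgreSQL versions often plan IN and EXISTS identically (both become semi-joins), so measure before rewriting. The negated forms genuinely differ, though: because of SQL's three-valued NULL logic, NOT IN returns no rows at all if the subquery produces even one NULL. A sketch against the same tables:

```sql
-- Risky: if any orders.user_id is NULL, this returns zero rows
SELECT * FROM users
WHERE id NOT IN (SELECT user_id FROM orders);

-- Safe, and plannable as an efficient anti-join
SELECT * FROM users u
WHERE NOT EXISTS (
    SELECT 1 FROM orders o WHERE o.user_id = u.id
);
```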

7. Optimize GROUP BY#

```sql
-- Add index on grouped columns
CREATE INDEX idx_orders_status ON orders(status);

-- Filter before grouping
SELECT status, COUNT(*)
FROM orders
WHERE created_at > '2024-01-01'  -- Filter first
GROUP BY status;

-- Bucket by day for time-series aggregations
SELECT
    date_trunc('day', created_at) as day,
    COUNT(*)
FROM orders
WHERE created_at > NOW() - INTERVAL '30 days'
GROUP BY date_trunc('day', created_at)
ORDER BY day;
```

Advanced Techniques#

Materialized Views#

```sql
-- Create materialized view for expensive aggregations
CREATE MATERIALIZED VIEW daily_sales AS
SELECT
    date_trunc('day', created_at) as day,
    COUNT(*) as order_count,
    SUM(total) as revenue
FROM orders
GROUP BY date_trunc('day', created_at);

-- REFRESH ... CONCURRENTLY requires a UNIQUE index on the view
CREATE UNIQUE INDEX idx_daily_sales_day ON daily_sales(day);

-- Refresh periodically without blocking readers
REFRESH MATERIALIZED VIEW CONCURRENTLY daily_sales;

-- Query the view instead
SELECT * FROM daily_sales WHERE day > '2024-01-01';
```

Partitioning#

```sql
-- Partition large tables by time
CREATE TABLE orders (
    id SERIAL,
    user_id INTEGER,
    total DECIMAL,
    created_at TIMESTAMP
) PARTITION BY RANGE (created_at);

-- Create partitions
CREATE TABLE orders_2024_q1 PARTITION OF orders
    FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');

CREATE TABLE orders_2024_q2 PARTITION OF orders
    FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');

-- Queries automatically use partition pruning
SELECT * FROM orders WHERE created_at > '2024-03-01';
-- Only scans orders_2024_q1 and later partitions
```
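Inserts whose created_at falls outside every defined range fail with an error. Since PostgreSQL 11 a DEFAULT partition can catch them, and an index created on the parent table is automatically created on every partition:

```sql
-- Catch-all for rows outside the defined ranges
CREATE TABLE orders_default PARTITION OF orders DEFAULT;

-- Indexing the parent cascades to all partitions, present and future
CREATE INDEX idx_orders_created_at ON orders (created_at);
```

Keep the DEFAULT partition small; a large one weakens partition pruning, since the planner must scan it for most queries.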

Common Table Expressions (CTEs)#

```sql
-- Use CTEs for readability and reuse
WITH active_users AS (
    SELECT id, name
    FROM users
    WHERE last_login > NOW() - INTERVAL '30 days'
),
user_orders AS (
    SELECT user_id, SUM(total) as total_spent
    FROM orders
    WHERE created_at > NOW() - INTERVAL '30 days'
    GROUP BY user_id
)
SELECT
    au.name,
    COALESCE(uo.total_spent, 0) as total_spent
FROM active_users au
LEFT JOIN user_orders uo ON uo.user_id = au.id
ORDER BY total_spent DESC;
```
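A performance note: before PostgreSQL 12, CTEs were always materialized, which blocked predicate pushdown into the CTE body; since 12 the planner inlines them when safe. You can control the behavior explicitly:

```sql
-- Force one-time evaluation: compute the CTE once, reuse the result
WITH spenders AS MATERIALIZED (
    SELECT user_id, SUM(total) as total_spent
    FROM orders
    GROUP BY user_id
)
SELECT * FROM spenders WHERE total_spent > 1000;

-- Force inlining: lets the planner push the outer WHERE into the CTE
WITH recent AS NOT MATERIALIZED (
    SELECT * FROM orders WHERE created_at > NOW() - INTERVAL '7 days'
)
SELECT * FROM recent WHERE user_id = 42;
```

MATERIALIZED helps when a CTE is referenced several times; NOT MATERIALIZED helps when the outer query filters the CTE down to a few rows.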

Query Anti-Patterns#

1. Functions on Indexed Columns#

```sql
-- Bad: Can't use the index on email
SELECT * FROM users WHERE LOWER(email) = 'john@example.com';

-- Good: Expression index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));

-- Or store normalized data
ALTER TABLE users ADD COLUMN email_normalized VARCHAR(255);
CREATE INDEX idx_users_email_normalized ON users(email_normalized);
```

2. Implicit Type Conversions#

```sql
-- Bad: user_id is INTEGER, compared against a string
SELECT * FROM orders WHERE user_id = '123';

-- Good: Use the correct type
SELECT * FROM orders WHERE user_id = 123;
```

3. OR Conditions#

```sql
-- Bad: OR can prevent index usage
SELECT * FROM orders
WHERE status = 'pending' OR status = 'processing';

-- Good: Use IN
SELECT * FROM orders
WHERE status IN ('pending', 'processing');

-- Or UNION for conditions on different columns
-- (UNION deduplicates, matching OR semantics; UNION ALL would
-- return rows matching both conditions twice)
SELECT * FROM orders WHERE status = 'pending'
UNION
SELECT * FROM orders WHERE customer_id = 123;
```

4. LIKE with Leading Wildcard#

```sql
-- Bad: A leading wildcard can't use a B-tree index
SELECT * FROM products WHERE name LIKE '%phone%';

-- Good: Full-text search
CREATE INDEX idx_products_search ON products USING GIN(to_tsvector('english', name));

SELECT * FROM products
WHERE to_tsvector('english', name) @@ to_tsquery('phone');
```
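Full-text search matches whole words. When you genuinely need substring matching, the pg_trgm extension lets a GIN index serve leading-wildcard LIKE and ILIKE queries:

```sql
-- Trigram index: makes '%phone%' patterns indexable
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX idx_products_name_trgm
    ON products USING GIN (name gin_trgm_ops);

-- Can now use a bitmap index scan instead of a sequential scan
SELECT * FROM products WHERE name ILIKE '%phone%';
```

Trigram indexes are larger and slower to update than B-trees, so reserve them for columns that actually need substring search.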

Monitoring Query Performance#

```sql
-- Enable query statistics
-- (requires pg_stat_statements in shared_preload_libraries, then a restart)
CREATE EXTENSION pg_stat_statements;

-- Find the slowest queries
SELECT
    query,
    calls,
    total_exec_time / 1000 as total_seconds,
    mean_exec_time / 1000 as avg_seconds,
    rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;

-- Find queries reading the most blocks
SELECT
    query,
    shared_blks_read + shared_blks_hit as total_blocks,
    rows
FROM pg_stat_statements
ORDER BY shared_blks_read DESC
LIMIT 20;
```
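pg_stat_statements accumulates counters from the last reset onward, so a one-off spike from last month can dominate the rankings. Reset before measuring a new release:

```sql
-- Clear all accumulated statistics (requires appropriate privileges)
SELECT pg_stat_statements_reset();
```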

Conclusion#

Query optimization is iterative. Start with EXPLAIN ANALYZE, identify bottlenecks, add appropriate indexes, and restructure queries. Monitor production queries continuously to catch regressions early.
