When vertical scaling hits limits, horizontal sharding distributes data across multiple databases. Here's how to shard effectively.
When to Shard#
Consider sharding when:
- Single database can't handle write load
- Data size exceeds single server capacity
- Read replicas aren't enough
- Geographic distribution needed
Before sharding, try:
- Query optimization
- Caching
- Read replicas
- Vertical scaling
Sharding Strategies#
Range-Based:
- Shard by value ranges (user_id 1-1M, 1M-2M)
- Simple to implement
- Can cause hotspots
Hash-Based:
- Shard by hash(key) % num_shards
- Even distribution
- Harder to range query
Directory-Based:
- Lookup table maps keys to shards
- Most flexible
- Additional lookup overhead
Geographic:
- Shard by region/location
- Reduces latency
- Compliance friendly
Shard Key Selection#
Hash-Based Sharding#
Consistent Hashing#
Directory-Based Sharding#
Cross-Shard Queries#
Global Tables#
Shard Rebalancing#
Schema Management#
Best Practices#
Design:
✓ Choose shard key carefully
✓ Plan for cross-shard queries
✓ Keep global data separate
✓ Design for rebalancing
Operations:
✓ Monitor shard sizes
✓ Automate migrations
✓ Test failover
✓ Document shard topology
Avoid:
✗ Sharding too early
✗ Cross-shard transactions
✗ Hotspot shard keys
✗ Manual shard management
Conclusion#
Sharding adds complexity but enables horizontal scale. Choose the right shard key, plan for cross-shard operations, and automate management. Consider managed solutions like CockroachDB or Vitess that handle sharding transparently.