Database Sharding Strategies
Master horizontal database sharding strategies — range, hash, and directory-based — to scale databases beyond a single machine.
srikanthtelkalapally888@gmail.com
Sharding splits a large database into smaller, faster, distributed pieces called shards.
Why Shard?
- A single server has finite storage
- Write throughput can exceed what one machine can handle
- Smaller per-shard datasets reduce query latency
Sharding Strategies
Range-Based Sharding
Shard 1: user_id 1 – 1,000,000
Shard 2: user_id 1,000,001 – 2,000,000
Shard 3: user_id 2,000,001 – 3,000,000
Pros: Simple, good for range queries
Cons: Hotspot problem (when IDs are assigned sequentially, new users always hit the last shard)
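A minimal Python sketch of range-based routing, assuming the three user_id ranges listed above. The names `UPPER_BOUNDS` and `range_shard` are illustrative, not part of any particular database's API.

```python
from bisect import bisect_left

# Inclusive upper bound of each shard's user_id range, in ascending order.
# Values mirror the example ranges above; a real deployment would load
# them from configuration.
UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]


def range_shard(user_id: int) -> int:
    """Return the 1-based shard number whose range contains user_id."""
    if user_id < 1:
        raise ValueError(f"user_id must be >= 1, got {user_id}")
    # bisect_left finds the first bound that is >= user_id.
    idx = bisect_left(UPPER_BOUNDS, user_id)
    if idx == len(UPPER_BOUNDS):
        raise ValueError(f"user_id {user_id} is beyond the last shard")
    return idx + 1


print(range_shard(1_000_000))  # boundary value stays on shard 1
print(range_shard(1_000_001))  # first id on shard 2
```

Because adjacent IDs land on the same shard, a range scan such as "users 500 to 900" touches only one shard. That same property is what concentrates newly created IDs on the last shard.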
Hash-Based Sharding
Shard = hash(user_id) % num_shards
Pros: Even distribution
Cons: Range queries span all shards
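The formula above can be sketched in Python as follows. The shard count of 4 and the choice of SHA-256 are assumptions made for illustration; any hash that is stable across processes and machines would serve. Python's built-in `hash()` is avoided here because it is randomized per process for strings and maps small integers to themselves.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # assumed shard count, for illustration only


def hash_shard(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Return a 0-based shard index computed as hash(user_id) % num_shards."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards


# Count how 10,000 sequential ids spread across the shards.
counts = Counter(hash_shard(uid) for uid in range(1, 10_001))
print(sorted(counts.items()))
```

Sequential IDs end up scattered roughly evenly, which removes the hotspot but also means a scan over a range of IDs has to visit every shard.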
Directory-Based Sharding
A lookup table maps each key to its shard.
Pros: Flexible shard assignment
Cons: The lookup table is a single point of failure
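A minimal sketch of a directory in Python, using an in-memory dict as the lookup table. The class `ShardDirectory` and the shard names are hypothetical; in practice the table would live in a separate, highly available store precisely because every request depends on it.

```python
class ShardDirectory:
    """In-memory lookup table mapping each key to the shard that holds it."""

    def __init__(self):
        self._key_to_shard = {}  # key -> shard name

    def assign(self, key, shard):
        """Place a key on a shard, or move it if it is already assigned."""
        self._key_to_shard[key] = shard

    def lookup(self, key):
        """Return the shard for a key; raise KeyError if it is unassigned."""
        return self._key_to_shard[key]


directory = ShardDirectory()
directory.assign(42, "shard-a")
directory.assign(43, "shard-b")
print(directory.lookup(42))

# Flexible assignment: a single key can be moved without touching others.
directory.assign(42, "shard-b")
print(directory.lookup(42))
```

Moving a key is only a table update here; a real system would also have to copy the underlying rows to the new shard.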
Resharding
When a shard is too large:
- Split shard into two
- Update routing logic
- Migrate data
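The steps above can be sketched for a range-sharded layout as follows. Everything here is an in-memory stand-in: `routing`, `data`, `route`, and `split_shard` are hypothetical names, and the sample rows are made up for illustration. The sketch copies rows before switching the routing table so that lookups never reach an empty shard; a live system would additionally have to handle writes that arrive during the move.

```python
from bisect import bisect_left

# Routing table: (inclusive upper bound, shard name), sorted by bound.
routing = [(1_000_000, "shard-1"), (2_000_000, "shard-2")]

# Stand-in for each shard's storage: shard name -> {user_id: row}.
data = {
    "shard-1": {10: "alice", 600_000: "bob"},
    "shard-2": {1_500_000: "carol"},
}


def route(user_id):
    """Return the name of the shard whose range contains user_id."""
    idx = bisect_left([bound for bound, _ in routing], user_id)
    if idx == len(routing):
        raise KeyError(f"no shard covers user_id {user_id}")
    return routing[idx][1]


def split_shard(name, split_at, new_name):
    """Split shard `name`: keys above split_at move to shard `new_name`."""
    idx = next(i for i, (_, n) in enumerate(routing) if n == name)
    upper = routing[idx][0]
    lower = routing[idx - 1][0] if idx > 0 else 0
    if not lower < split_at < upper:
        raise ValueError("split point must fall inside the shard's range")
    # 1. Split: create the new shard.
    data[new_name] = {}
    # 2. Migrate data: move every key above the split point.
    for key in [k for k in data[name] if k > split_at]:
        data[new_name][key] = data[name].pop(key)
    # 3. Update routing: the old range becomes two adjacent ranges.
    routing[idx:idx + 1] = [(split_at, name), (upper, new_name)]


split_shard("shard-1", 500_000, "shard-1b")
print(route(10), route(600_000), route(1_500_000))
```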
Challenges
- Cross-shard joins are expensive
- Distributed transactions are complex
- Rebalancing requires data migration
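To illustrate why cross-shard operations cost more, here is a small scatter-gather sketch in Python. It assumes users are sharded by user_id and that the query filters on a different field (`age`), so no single shard can answer it. The shard contents and function names are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# In-memory stand-ins for three shards of a users table, sharded by user_id.
shards = [
    [{"user_id": 3, "age": 31}, {"user_id": 6, "age": 22}],
    [{"user_id": 1, "age": 45}, {"user_id": 4, "age": 19}],
    [{"user_id": 2, "age": 30}, {"user_id": 5, "age": 28}],
]


def query_shard(rows, min_age):
    """Run the filter locally on one shard."""
    return [row for row in rows if row["age"] >= min_age]


def scatter_gather(min_age):
    """Send the query to every shard in parallel, then merge the results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda rows: query_shard(rows, min_age), shards))
    merged = [row for part in partials for row in part]
    # The coordinator must re-sort, since each shard only orders its own rows.
    return sorted(merged, key=lambda row: row["user_id"])


result = scatter_gather(30)
print([row["user_id"] for row in result])  # [1, 2, 3]
```

Every shard does work for a single query, and the coordinator pays for the merge. A join adds a further step, because matching rows may sit on different shards and have to be shipped between them first.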
Conclusion
Hash-based sharding is the most common choice when even data distribution is the priority. Combine sharding with replication for high availability.