MongoDB
Consistent Hashing Explained
Understand consistent hashing — the algorithm that powers distributed caching and load balancing in systems like DynamoDB and Cassandra.
S
srikanthtelkalapally888@gmail.com
Consistent Hashing Explained
Consistent hashing is a distributed hashing technique that minimizes data reshuffling when nodes are added or removed.
The Problem
With traditional hashing key % N, when N changes (node added/removed), most keys get remapped — causing massive cache invalidation.
How Consistent Hashing Works
- Map both nodes and keys to a circular ring (0 to 2^32)
- Each key is assigned to the next clockwise node
- When a node is added/removed, only its neighboring keys are remapped
Ring:
0 ──── Node A ──── Node B ──── Node C ──── 2^32
Virtual Nodes
To avoid uneven distribution, each physical node maps to multiple virtual nodes on the ring.
Node A → VNode A1, A2, A3
Node B → VNode B1, B2, B3
Benefits
- Only K/N keys remapped when a node joins/leaves
- Even load distribution with virtual nodes
- Used in Cassandra, DynamoDB, Riak
Real-World Use
- CDN: Route users to nearest edge server
- Distributed Cache: Decide which cache node stores a key
- Load Balancers: Sticky sessions
Conclusion
Consistent hashing is foundational in distributed systems to achieve horizontal scalability with minimal disruption.