MongoDB

Consistent Hashing Explained

Understand consistent hashing — the algorithm that powers distributed caching and load balancing in systems like DynamoDB and Cassandra.

S

srikanthtelkalapally888@gmail.com

Consistent Hashing Explained

Consistent hashing is a distributed hashing technique that minimizes data reshuffling when nodes are added or removed.

The Problem

With traditional hashing key % N, when N changes (node added/removed), most keys get remapped — causing massive cache invalidation.

How Consistent Hashing Works

  1. Map both nodes and keys to a circular ring (0 to 2^32)
  2. Each key is assigned to the next clockwise node
  3. When a node is added/removed, only its neighboring keys are remapped
Ring:
0 ──── Node A ──── Node B ──── Node C ──── 2^32

Virtual Nodes

To avoid uneven distribution, each physical node maps to multiple virtual nodes on the ring.

Node A → VNode A1, A2, A3
Node B → VNode B1, B2, B3

Benefits

  • Only K/N keys remapped when a node joins/leaves
  • Even load distribution with virtual nodes
  • Used in Cassandra, DynamoDB, Riak

Real-World Use

  • CDN: Route users to nearest edge server
  • Distributed Cache: Decide which cache node stores a key
  • Load Balancers: Sticky sessions

Conclusion

Consistent hashing is foundational in distributed systems to achieve horizontal scalability with minimal disruption.

Share this article