💍 Key Takeaways
- Consistent hashing maps both keys and nodes onto a ring (0 to 2^32). Each key is assigned to the next node clockwise on the ring
- When a node is added or removed, only ~K/N keys need to move (vs. nearly all keys with modular hashing)
- Virtual nodes (vnodes) solve the non-uniform distribution problem: each physical node gets 100-200 points on the ring
- Used in DynamoDB, Cassandra, Akamai's CDN, Memcached clients, and Discord for server routing
The Problem with Simple Hashing
Suppose you have N cache servers. The simplest approach: `server = hash(key) % N`. This works until N changes. If you add or remove a server, almost **all keys get remapped** — causing a cache storm (every key misses simultaneously).
Example: With 4 servers, `hash("user:123") % 4 = 2` → server 2. Add a 5th server, now `hash("user:123") % 5 = 3` → different server. For a cache system with millions of keys, this means millions of simultaneous cache misses — a potential cascading failure.
**Consistent hashing solves this:** when a node is added or removed, only 1/N of the keys need to move.
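The remapping cost of modular hashing is easy to measure. A minimal sketch (the key names and counts are illustrative):

```python
import hashlib

def bucket(key: str, n: int) -> int:
    """Simple modular placement: hash the key, take it mod the server count."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % n

keys = [f"user:{i}" for i in range(10_000)]

before = {k: bucket(k, 4) for k in keys}   # 4 servers
after = {k: bucket(k, 5) for k in keys}    # a 5th server joins

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of keys changed servers")  # roughly 80%
```

A key keeps its server only when `hash % 4 == hash % 5`, which requires `hash % 20 < 4 `, so about 80% of keys relocate here, and the fraction approaches 100% as N grows.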
How It Works — Step by Step
The hash space is [0, 2^32 - 1], arranged as a circle. The hash function (e.g., MD5, SHA-1, xxHash) maps both node identifiers and keys to positions on this ring. The ring is conceptual — implemented as a sorted array or balanced BST for O(log N) lookups.
Each node (server) is hashed: position = hash(node_name). For example, hash("cache-server-1") → position 1234567. Multiple nodes are placed around the ring at their hash positions.
For a key, compute hash(key) and walk clockwise on the ring until you find the first node. That node is responsible for the key. Implementation: binary search on the sorted node positions array.
**Node added:** Only keys between the new node and its predecessor need to move. All other mappings stay the same. **Node removed:** Only keys that were on the removed node need to move to the next node clockwise. Minimal disruption. With N nodes, adding/removing one affects only ~1/N of the keys.
**Virtual nodes:** With just 3 physical nodes, the ring often has very uneven segments. The fix: give each physical node V virtual nodes (e.g., V = 150). hash("nodeA-0"), hash("nodeA-1"), ..., hash("nodeA-149") all map to physical node A. This yields a near-uniform distribution even with few physical nodes, and nodes with more capacity can simply be given more virtual nodes.
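The balancing effect of virtual nodes can be sketched directly. This compares per-node load with 1 point versus 150 points per node (node names and the 20,000-key sample are illustrative):

```python
import hashlib
from bisect import bisect_right

def h32(s: str) -> int:
    """Hash a string to a position on the [0, 2^32) ring."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2**32)

def build_ring(nodes, vnodes):
    """Place `vnodes` points per node on the ring, sorted by position."""
    points = sorted((h32(f"{n}-{i}"), n) for n in nodes for i in range(vnodes))
    positions = [p for p, _ in points]
    owners = [n for _, n in points]
    return positions, owners

def owner(positions, owners, key):
    """First point clockwise from hash(key), wrapping past the top."""
    return owners[bisect_right(positions, h32(key)) % len(positions)]

nodes = ["nodeA", "nodeB", "nodeC"]
keys = [f"k{i}" for i in range(20_000)]

for v in (1, 150):
    positions, owner_names = build_ring(nodes, v)
    counts = {n: 0 for n in nodes}
    for k in keys:
        counts[owner(positions, owner_names, k)] += 1
    shares = {n: round(c / len(keys), 3) for n, c in counts.items()}
    print(f"vnodes={v}: {shares}")
```

With a single point per node the three shares are typically far from 1/3 each; with 150 points they cluster tightly around it.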
```python
import hashlib
from bisect import bisect_right

class ConsistentHashRing:
    def __init__(self, vnodes=150):
        self.vnodes = vnodes
        self.ring = {}          # hash_value -> node
        self.sorted_keys = []   # sorted hash values

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)

    def add_node(self, node):
        for i in range(self.vnodes):
            vnode_key = f"{node}-vnode-{i}"
            h = self._hash(vnode_key)
            self.ring[h] = node
            self.sorted_keys.append(h)
        self.sorted_keys.sort()

    def remove_node(self, node):
        for i in range(self.vnodes):
            vnode_key = f"{node}-vnode-{i}"
            h = self._hash(vnode_key)
            del self.ring[h]
            self.sorted_keys.remove(h)

    def get_node(self, key):
        if not self.sorted_keys:
            return None
        h = self._hash(key)
        # First vnode clockwise from the key, wrapping around the ring
        idx = bisect_right(self.sorted_keys, h) % len(self.sorted_keys)
        return self.ring[self.sorted_keys[idx]]

# Usage
ring = ConsistentHashRing(vnodes=150)
ring.add_node("cache-1")
ring.add_node("cache-2")
ring.add_node("cache-3")
print(ring.get_node("user:42"))     # → "cache-2"
print(ring.get_node("session:99"))  # → "cache-1"

# Adding a 4th node only moves ~25% of keys
ring.add_node("cache-4")
```
Modular Hashing vs Consistent Hashing
| Property | Modular (hash % N) | Consistent Hashing |
|---|---|---|
| Key redistribution on node add | ~100% of keys move | ~1/N keys move |
| Key redistribution on node remove | ~100% of keys move | ~1/N keys move |
| Lookup complexity | O(1) | O(log N) with sorted array |
| Load balance | Uniform if hash is good | Needs virtual nodes for uniformity |
| Implementation complexity | Trivial | Moderate |
| Best for | Fixed cluster size | Dynamic clusters, caches, CDNs |
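The ~1/N figure in the table can be checked empirically. This sketch (node and key names are illustrative) maps every key before and after adding a fifth node and counts relocations:

```python
import hashlib
from bisect import bisect_right

def h32(s: str) -> int:
    """Hash a string to a position on the [0, 2^32) ring."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2**32)

def assign(nodes, keys, vnodes=150):
    """Map every key to its owning node on a freshly built ring."""
    points = sorted((h32(f"{n}-{i}"), n) for n in nodes for i in range(vnodes))
    positions = [p for p, _ in points]
    owners = [n for _, n in points]
    return {k: owners[bisect_right(positions, h32(k)) % len(positions)]
            for k in keys}

keys = [f"user:{i}" for i in range(10_000)]
old_nodes = ["cache-1", "cache-2", "cache-3", "cache-4"]

before = assign(old_nodes, keys)
after = assign(old_nodes + ["cache-5"], keys)

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved) / len(keys):.0%} of keys moved")  # close to 1/5
```

Note the stronger property hiding here: every relocated key moves *to* the new node, never between the existing nodes, because adding cache-5's points can only shorten some arcs, not reassign the rest of the ring.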
Real-World Implementations
Amazon DynamoDB uses consistent hashing with virtual nodes for partition assignment. Each partition is responsible for a range of hash values. When load increases, partitions split automatically. The partition key is hashed to determine which partition handles the data. This enables DynamoDB's auto-scaling without downtime.
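DynamoDB's internals aren't public in detail, but the general idea of hash-range partitions that split under load can be sketched as follows. The partition names, boundaries, and split logic here are illustrative assumptions, not DynamoDB's actual implementation:

```python
import hashlib
from bisect import bisect_left

SPACE = 2**32

def h32(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % SPACE

# Each partition owns a contiguous hash range, identified by an inclusive
# upper bound. Four equal ranges to start with (illustrative).
upper_bounds = [SPACE // 4, SPACE // 2, 3 * SPACE // 4, SPACE - 1]
partitions = ["p0", "p1", "p2", "p3"]

def find_partition(key: str) -> str:
    """Hash the partition key, then binary-search for the owning range."""
    return partitions[bisect_left(upper_bounds, h32(key))]

def split_partition(i: int) -> None:
    """Split partition i at its midpoint; every other range is untouched."""
    lo = upper_bounds[i - 1] + 1 if i > 0 else 0
    mid = (lo + upper_bounds[i]) // 2
    upper_bounds.insert(i, mid)
    partitions.insert(i, partitions[i] + "-low")

keys = [f"item:{n}" for n in range(2_000)]
before = {k: find_partition(k) for k in keys}
split_partition(partitions.index("p1"))
after = {k: find_partition(k) for k in keys}

# Only keys inside the split range are affected; all others stay put.
print(sum(before[k] != after[k] for k in keys), "keys changed partition")
```

The key property this illustrates: splitting one hot partition redistributes only that partition's keys, which is what allows scaling without a cluster-wide rehash.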
Advantages
- Minimal key redistribution when nodes change: only ~K/N keys move
- No central coordinator needed: lookups are stateless and deterministic
- Virtual nodes enable fine-grained load balancing and heterogeneous node capacity
- Battle-tested at massive scale: DynamoDB, Cassandra, Akamai, Discord
Disadvantages
- More complex to implement than modular hashing
- Virtual nodes increase the memory overhead of the ring metadata
- Removing a node briefly creates a hot spot on its successor (roughly 2x load until rebalancing completes)
- Rebalancing after a topology change still requires migrating the affected data
🧪 Test Your Understanding
What happens when you add a 5th server using hash(key) % N?