💍 Key Takeaways
- Consistent hashing maps both keys and nodes onto a ring (0 to 2^32). Each key is assigned to the next node clockwise on the ring
- When a node is added or removed, only ~K/N keys need to move (vs. nearly all keys with modular hashing)
- Virtual nodes (vnodes) solve the non-uniform distribution problem: each physical node gets 100-200 points on the ring
- Used in DynamoDB, Cassandra, Akamai's CDN, Memcached clients, and Discord for server routing
The Problem with Simple Hashing
Suppose you have N cache servers. The simplest approach: `server = hash(key) % N`. This works until N changes. If you add or remove a server, almost **all keys get remapped** — causing a cache storm (every key misses simultaneously).
Example: With 4 servers, `hash("user:123") % 4 = 2` → server 2. Add a 5th server, now `hash("user:123") % 5 = 3` → different server. For a cache system with millions of keys, this means millions of simultaneous cache misses — a potential cascading failure.
**Consistent hashing solves this:** when a node is added or removed, only 1/N of the keys need to move.
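The remapping cost of modular hashing is easy to measure. A minimal sketch (the key names and counts are illustrative):

```python
import hashlib

def bucket(key: str, n: int) -> int:
    """Simple modular placement: hash the key, take it mod the server count."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % n

keys = [f"user:{i}" for i in range(10_000)]

before = {k: bucket(k, 4) for k in keys}   # 4 servers
after = {k: bucket(k, 5) for k in keys}    # a 5th server joins

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of keys changed servers")  # roughly 80%
```

A key keeps its server only when `hash % 4 == hash % 5`, which requires `hash % 20 < 4 `, so about 80% of keys relocate here, and the fraction approaches 100% as N grows.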
How It Works — Step by Step
The hash space is [0, 2^32 - 1], arranged as a circle. The hash function (e.g., MD5, SHA-1, xxHash) maps both node identifiers and keys to positions on this ring. The ring is conceptual — implemented as a sorted array or balanced BST for O(log N) lookups.
Each node (server) is hashed: position = hash(node_name). For example, hash("cache-server-1") → position 1234567. Multiple nodes are placed around the ring at their hash positions.
For a key, compute hash(key) and walk clockwise on the ring until you find the first node. That node is responsible for the key. Implementation: binary search on the sorted node positions array.
**Node added:** Only keys between the new node and its predecessor need to move. All other mappings stay the same. **Node removed:** Only keys that were on the removed node need to move to the next node clockwise. Minimal disruption. With N nodes, adding/removing one affects only ~1/N of the keys.
**Virtual nodes:** With just 3 physical nodes, the ring often has very uneven segments. The fix: give each physical node V virtual nodes (e.g., V = 150). hash("nodeA-0"), hash("nodeA-1"), ..., hash("nodeA-149") all map to physical node A. This yields a near-uniform distribution even with few physical nodes, and nodes with more capacity can simply be given more virtual nodes.
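The balancing effect of virtual nodes can be sketched directly. This compares per-node load with 1 point versus 150 points per node (node names and the 20,000-key sample are illustrative):

```python
import hashlib
from bisect import bisect_right

def h32(s: str) -> int:
    """Hash a string to a position on the [0, 2^32) ring."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2**32)

def build_ring(nodes, vnodes):
    """Place `vnodes` points per node on the ring, sorted by position."""
    points = sorted((h32(f"{n}-{i}"), n) for n in nodes for i in range(vnodes))
    positions = [p for p, _ in points]
    owners = [n for _, n in points]
    return positions, owners

def owner(positions, owners, key):
    """First point clockwise from hash(key), wrapping past the top."""
    return owners[bisect_right(positions, h32(key)) % len(positions)]

nodes = ["nodeA", "nodeB", "nodeC"]
keys = [f"k{i}" for i in range(20_000)]

for v in (1, 150):
    positions, owner_names = build_ring(nodes, v)
    counts = {n: 0 for n in nodes}
    for k in keys:
        counts[owner(positions, owner_names, k)] += 1
    shares = {n: round(c / len(keys), 3) for n, c in counts.items()}
    print(f"vnodes={v}: {shares}")
```

With a single point per node the three shares are typically far from 1/3 each; with 150 points they cluster tightly around it.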
```python
import hashlib
from bisect import bisect_right

class ConsistentHashRing:
    def __init__(self, vnodes=150):
        self.vnodes = vnodes
        self.ring = {}          # hash_value -> node
        self.sorted_keys = []   # sorted hash values

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16) % (2**32)

    def add_node(self, node):
        for i in range(self.vnodes):
            vnode_key = f"{node}-vnode-{i}"
            h = self._hash(vnode_key)
            self.ring[h] = node
            self.sorted_keys.append(h)
        self.sorted_keys.sort()

    def remove_node(self, node):
        for i in range(self.vnodes):
            vnode_key = f"{node}-vnode-{i}"
            h = self._hash(vnode_key)
            del self.ring[h]
            self.sorted_keys.remove(h)

    def get_node(self, key):
        if not self.sorted_keys:
            return None
        h = self._hash(key)
        # First vnode clockwise from the key, wrapping around the ring
        idx = bisect_right(self.sorted_keys, h) % len(self.sorted_keys)
        return self.ring[self.sorted_keys[idx]]

# Usage
ring = ConsistentHashRing(vnodes=150)
ring.add_node("cache-1")
ring.add_node("cache-2")
ring.add_node("cache-3")
print(ring.get_node("user:42"))     # → "cache-2"
print(ring.get_node("session:99"))  # → "cache-1"

# Adding a 4th node only moves ~25% of keys
ring.add_node("cache-4")
```
Modular Hashing vs Consistent Hashing
| Property | Modular (hash % N) | Consistent Hashing |
|---|---|---|
| Key redistribution on node add | ~100% of keys move | ~1/N keys move |
| Key redistribution on node remove | ~100% of keys move | ~1/N keys move |
| Lookup complexity | O(1) | O(log N) with sorted array |
| Load balance | Uniform if hash is good | Needs virtual nodes for uniformity |
| Implementation complexity | Trivial | Moderate |
| Best for | Fixed cluster size | Dynamic clusters, caches, CDNs |
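The ~1/N figure in the table can be checked empirically. This sketch (node and key names are illustrative) maps every key before and after adding a fifth node and counts relocations:

```python
import hashlib
from bisect import bisect_right

def h32(s: str) -> int:
    """Hash a string to a position on the [0, 2^32) ring."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2**32)

def assign(nodes, keys, vnodes=150):
    """Map every key to its owning node on a freshly built ring."""
    points = sorted((h32(f"{n}-{i}"), n) for n in nodes for i in range(vnodes))
    positions = [p for p, _ in points]
    owners = [n for _, n in points]
    return {k: owners[bisect_right(positions, h32(k)) % len(positions)]
            for k in keys}

keys = [f"user:{i}" for i in range(10_000)]
old_nodes = ["cache-1", "cache-2", "cache-3", "cache-4"]

before = assign(old_nodes, keys)
after = assign(old_nodes + ["cache-5"], keys)

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved) / len(keys):.0%} of keys moved")  # close to 1/5
```

Note the stronger property hiding here: every relocated key moves *to* the new node, never between the existing nodes, because adding cache-5's points can only shorten some arcs, not reassign the rest of the ring.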
Real-World Implementations
Amazon DynamoDB uses consistent hashing with virtual nodes for partition assignment. Each partition is responsible for a range of hash values. When load increases, partitions split automatically. The partition key is hashed to determine which partition handles the data. This enables DynamoDB's auto-scaling without downtime.
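DynamoDB's internals aren't public in detail, but the general idea of hash-range partitions that split under load can be sketched as follows. The partition names, boundaries, and split logic here are illustrative assumptions, not DynamoDB's actual implementation:

```python
import hashlib
from bisect import bisect_left

SPACE = 2**32

def h32(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % SPACE

# Each partition owns a contiguous hash range, identified by an inclusive
# upper bound. Four equal ranges to start with (illustrative).
upper_bounds = [SPACE // 4, SPACE // 2, 3 * SPACE // 4, SPACE - 1]
partitions = ["p0", "p1", "p2", "p3"]

def find_partition(key: str) -> str:
    """Hash the partition key, then binary-search for the owning range."""
    return partitions[bisect_left(upper_bounds, h32(key))]

def split_partition(i: int) -> None:
    """Split partition i at its midpoint; every other range is untouched."""
    lo = upper_bounds[i - 1] + 1 if i > 0 else 0
    mid = (lo + upper_bounds[i]) // 2
    upper_bounds.insert(i, mid)
    partitions.insert(i, partitions[i] + "-low")

keys = [f"item:{n}" for n in range(2_000)]
before = {k: find_partition(k) for k in keys}
split_partition(partitions.index("p1"))
after = {k: find_partition(k) for k in keys}

# Only keys inside the split range are affected; all others stay put.
print(sum(before[k] != after[k] for k in keys), "keys changed partition")
```

The key property this illustrates: splitting one hot partition redistributes only that partition's keys, which is what allows scaling without a cluster-wide rehash.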
Advantages
- Minimal key redistribution when nodes change: only ~K/N keys move
- No central coordinator needed: lookups are stateless and deterministic
- Virtual nodes enable fine-grained load balancing and heterogeneous node capacity
- Battle-tested at massive scale: DynamoDB, Cassandra, Akamai, Discord
Disadvantages
- More complex to implement than modular hashing
- Virtual nodes increase the memory overhead of the ring metadata
- Removing a node briefly creates a hot spot on its successor (roughly 2x load until rebalancing completes)
- Rebalancing after a topology change still requires migrating the affected data
🧪 Test Your Understanding
What happens when you add a 5th server using hash(key) % N?