Key Takeaways
- Leader election: one node coordinates work and all others are followers. If the leader fails, a new one is elected.
- ZooKeeper/etcd provide distributed coordination primitives: locks, leader election, configuration
- Fencing tokens prevent 'split brain' writes: each token is monotonically increasing, so stale leaders can't write
- Distributed locks are inherently dangerous; prefer lease-based locks with TTL and fencing tokens
Coordinating Distributed Nodes
Many distributed systems need a single coordinator: one node that assigns work, holds a lock, or acts as the single writer. Leader election algorithms aim to ensure that at most one node acts as leader at any time, and that a new leader is elected quickly if the current one fails.
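To make the idea concrete, here is a minimal single-process sketch of lease-based leader election. It is an illustration under stated assumptions, not how any particular service implements it: `LeaderLease`, `campaign`, and the explicit `now` argument are hypothetical names chosen for this example, and a real system would keep the lease in a replicated store rather than in memory.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    holder: str
    expires_at: float
    term: int


class LeaderLease:
    """In-memory stand-in for a coordination service (hypothetical)."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        self.lease: Optional[Lease] = None
        self.term = 0

    def campaign(self, node: str, now: float) -> bool:
        """Try to become (or stay) leader at time `now`; True on success."""
        current = self.lease
        if current is not None and now < current.expires_at:
            if current.holder != node:
                return False  # a live leader exists; remain a follower
            current.expires_at = now + self.ttl  # heartbeat: renew own lease
            return True
        # No leader, or the previous lease expired: start a new term.
        self.term += 1
        self.lease = Lease(holder=node, expires_at=now + self.ttl, term=self.term)
        return True

    def leader(self, now: float) -> Optional[str]:
        """Return the current leader, or None if the lease has lapsed."""
        if self.lease is not None and now < self.lease.expires_at:
            return self.lease.holder
        return None


election = LeaderLease(ttl=5.0)
first = election.campaign("node-a", now=0.0)     # node-a becomes leader, term 1
rejected = election.campaign("node-b", now=1.0)  # node-a's lease is still live
renewed = election.campaign("node-a", now=4.0)   # heartbeat extends lease to t=9
takeover = election.campaign("node-b", now=10.0) # lease lapsed; node-b wins term 2
```

Passing the clock in explicitly keeps the sketch deterministic. In practice, each node reads its own clock, which is one reason a deposed leader can briefly still believe it is in charge.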
Coordination Services
| Service | Consensus | Use Case | API Style |
|---|---|---|---|
| ZooKeeper | Zab (atomic broadcast, similar to Paxos) | Leader election, config, distributed locks | Hierarchical znodes, watchers |
| etcd | Raft | K8s config store, leader election, service discovery | Key-value with watch, lease, transactions |
| Consul | Raft | Service discovery, health checks, KV, leader election | HTTP API, DNS interface |
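One common way to build leader election on a hierarchical store such as ZooKeeper is for each candidate to create an ephemeral, sequentially numbered entry and treat the lowest surviving number as the leader. The toy model below sketches only that rule. `ElectionDir`, `join`, and `session_expired` are hypothetical names for this example, not part of any client library, and the model has no networking or watches.

```python
from typing import Dict, Optional


class ElectionDir:
    """Toy model of a directory of ephemeral sequential entries (hypothetical)."""

    def __init__(self) -> None:
        self._next_seq = 0
        self._members: Dict[int, str] = {}  # sequence number -> node name

    def join(self, node: str) -> int:
        """Register a candidate and return its sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._members[seq] = node
        return seq

    def session_expired(self, seq: int) -> None:
        """Drop an entry, as would happen when its owner's session ends."""
        self._members.pop(seq, None)

    def leader(self) -> Optional[str]:
        """The candidate with the lowest surviving sequence number leads."""
        if not self._members:
            return None
        return self._members[min(self._members)]


directory = ElectionDir()
seq_a = directory.join("node-a")
seq_b = directory.join("node-b")
seq_c = directory.join("node-c")
leader_before = directory.leader()  # node-a holds the lowest number
directory.session_expired(seq_a)    # node-a crashes or is partitioned away
leader_after = directory.leader()   # node-b is next in line
```

Because sequence numbers are never reused, a node that rejoins after a failure goes to the back of the queue instead of reclaiming leadership.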
The Distributed Lock Problem
A lock acquired from Redis/etcd can fail silently: a network partition can cut the lock holder off from the lock service, so the lock may expire and be granted to another client while the original holder still believes it holds it. Always use fencing tokens: a monotonically increasing number attached to each lock acquisition. Resource servers reject requests whose token is older than the latest one they have seen.
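The following is a minimal sketch of that check, assuming an in-memory lock service and storage server. `LockService`, `Storage`, and their methods are hypothetical names for illustration only, and time is passed in explicitly so the scenario is reproducible.

```python
from typing import Dict, Optional


class LockService:
    """Hypothetical lock service issuing a fencing token with each grant."""

    def __init__(self, ttl: float) -> None:
        self.ttl = ttl
        self.holder: Optional[str] = None
        self.expires_at = 0.0
        self.token = 0

    def acquire(self, client: str, now: float) -> Optional[int]:
        """Grant the lock if it is free or expired; return its fencing token."""
        if self.holder is not None and now < self.expires_at:
            return None  # someone else still holds a live lease
        self.token += 1  # tokens only ever increase
        self.holder = client
        self.expires_at = now + self.ttl
        return self.token


class Storage:
    """Resource server that rejects writes carrying a stale token."""

    def __init__(self) -> None:
        self.highest_token = 0
        self.data: Dict[str, str] = {}

    def write(self, key: str, value: str, token: int) -> bool:
        if token < self.highest_token:
            return False  # token is older than the latest seen: reject
        self.highest_token = token
        self.data[key] = value
        return True


locks = LockService(ttl=10.0)
storage = Storage()

token_a = locks.acquire("client-a", now=0.0)   # client-a gets token 1
# client-a is now partitioned or paused; its lease expires at t=10.
token_b = locks.acquire("client-b", now=11.0)  # client-b gets token 2
ok_b = storage.write("config", "from-b", token_b)  # accepted
ok_a = storage.write("config", "from-a", token_a)  # rejected: stale token 1
```

The important design point is that the resource server enforces the check. The stale client cannot be trusted to notice that its lock has expired.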
Advantages
- Leader election provides a clear coordination point
- ZooKeeper/etcd are battle-tested for coordination
- Fencing tokens prevent split-brain writes
Disadvantages
- Leader is a bottleneck and single point of failure (need fast re-election)
- Distributed locks are complex to use correctly
- Coordination services add infrastructure overhead
Test Your Understanding
What is a fencing token used for?