Last Updated: July 15, 2025
Replication Topologies
| Topology | Writes | Reads | Best For |
|---|---|---|---|
| Leader-Follower | Leader only | Any replica | Read-heavy, simple ops |
| Multi-Leader | Any leader | Any replica | Multi-DC, offline clients |
| Leaderless | Any node (quorum) | Any node (quorum) | High availability, Dynamo-style |
Sync vs Async Replication
| Mode | Pros | Cons |
|---|---|---|
| Synchronous | No data loss — follower always up-to-date | Slow writes — waits for follower ACK; follower down = writes blocked |
| Asynchronous | Fast writes: the leader does not wait for follower ACKs | Stale reads possible; a leader crash can lose writes that were acknowledged but not yet replicated |
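| Semi-Synchronous | Durability close to sync with throughput close to async: one follower is sync, the rest async | Write latency still includes one follower ACK; the async followers can serve stale reads |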
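
The three modes differ mainly in how many follower ACKs the leader waits for before confirming a write. The following is a minimal sketch of that rule, not a real replication protocol: `Follower`, `Mode`, `required_acks`, and `leader_write` are hypothetical names, and replication is simulated in-process rather than over a network.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    SYNC = "sync"            # wait for every follower
    ASYNC = "async"          # wait for no follower
    SEMI_SYNC = "semi_sync"  # wait for one follower


@dataclass
class Follower:
    name: str
    up: bool = True
    log: list = field(default_factory=list)

    def replicate(self, entry) -> bool:
        """Append the entry and return True as an ACK; return False if down."""
        if self.up:
            self.log.append(entry)
        return self.up


def required_acks(mode: Mode, follower_count: int) -> int:
    """Number of follower ACKs the leader waits for in each mode."""
    if mode is Mode.SYNC:
        return follower_count
    if mode is Mode.SEMI_SYNC:
        return min(1, follower_count)
    return 0


def leader_write(entry, followers, mode: Mode) -> bool:
    """Return True if the leader may confirm the write to the client.

    Assumption: replication runs inline here. A real async leader would
    confirm first and replicate in the background.
    """
    acks = sum(f.replicate(entry) for f in followers)
    return acks >= required_acks(mode, len(followers))
```

With one of two followers down, this sketch rejects the write under `Mode.SYNC` but accepts it under `Mode.SEMI_SYNC` and `Mode.ASYNC`, which mirrors the "follower down = writes blocked" cost in the table.
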
Quorum Reads & Writes (Leaderless)
| Concept | Definition / Condition | Example (N=3) |
|---|---|---|
| N | Total replicas | 3 nodes |
| W | Write quorum | W=2 (wait for 2 ACKs) |
| R | Read quorum | R=2 (read from 2, pick newest) |
| Consistency | W + R > N | 2+2>3 ✓ (every read quorum overlaps the latest write quorum, so a read sees the newest acknowledged write) |
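
The overlap condition can be checked directly, and a quorum read reduces to picking the newest version among the responses. This is a minimal sketch under simplifying assumptions: versions are plain integers, and `Versioned`, `quorums_overlap`, and `quorum_read` are illustrative names rather than any particular database's API.

```python
from typing import NamedTuple


class Versioned(NamedTuple):
    version: int  # assumption: a single monotonically increasing version number
    value: str


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """True when every read quorum shares at least one node with every write quorum."""
    return w + r > n


def quorum_read(responses: list, r: int) -> Versioned:
    """Return the newest value once at least r replicas have responded."""
    if len(responses) < r:
        raise RuntimeError(f"read quorum not met: {len(responses)} of {r} responses")
    return max(responses, key=lambda v: v.version)
```

For N=3 and W=2, suppose a write at version 2 reached two replicas while the third still holds version 1. Any two replicas chosen for a read with R=2 include at least one that has version 2, so `quorum_read` returns the new value. With W=1 and R=1, `quorums_overlap(3, 1, 1)` is false and a read may land only on a stale replica.
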
Conflict Resolution
| Strategy | How It Works |
|---|---|
| Last Write Wins | Timestamp-based — simplest, can lose data |
| CRDTs | Conflict-free data types — counters, sets, maps that merge automatically |
| Version Vectors | Track causal history — detect concurrent writes precisely |
| Application-Level | Custom merge logic — Amazon cart: items never deleted, just merged |
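
Two of these strategies are small enough to sketch. The block below compares version vectors to detect concurrent writes, and merges a grow-only counter (G-Counter), one of the simplest CRDTs. Both are represented as plain dicts from node name to count. The function names are hypothetical, and this is an illustration rather than a production implementation.

```python
def compare(a: dict, b: dict) -> str:
    """Compare version vector a to b.

    Returns 'equal', 'before' (a happened before b), 'after', or
    'concurrent' (neither dominates, so the writes conflict).
    Missing nodes count as 0.
    """
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


def merge_gcounter(a: dict, b: dict) -> dict:
    """Merge two G-Counter states by taking the per-node maximum."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def gcounter_value(counter: dict) -> int:
    """The counter's value is the sum of all per-node counts."""
    return sum(counter.values())
```

Because the merge is an element-wise maximum, it gives the same result regardless of the order or number of times states are merged, which is why replicas converge without coordination. By contrast, `compare` only detects the conflict; what to do with two `'concurrent'` versions is left to one of the other strategies in the table.
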
Failover Process
1. Detect failure: heartbeat timeout, usually 30-60 seconds
2. Elect new leader: the node with the most up-to-date log is preferred (Raft: voters compare the candidate's last log term, then last log index)
3. Reconfigure: clients and followers are pointed to the new leader
4. Old leader returns: it steps down to follower and catches up via the replication log
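
The detection and election steps above can be sketched as follows. This is a simplified illustration, not Raft: it skips voting and majority agreement and only shows the "most up-to-date log" comparison. `Replica`, `is_failed`, `pick_new_leader`, and the 30-second default timeout are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Replica:
    name: str
    last_log_term: int
    last_log_index: int
    seconds_since_heartbeat: float = 0.0


def is_failed(leader: Replica, timeout_s: float = 30.0) -> bool:
    """Step 1: treat the leader as failed once its heartbeat is overdue."""
    return leader.seconds_since_heartbeat > timeout_s


def pick_new_leader(followers: list) -> Replica:
    """Step 2: prefer the highest last log term, then the highest last log index."""
    return max(followers, key=lambda r: (r.last_log_term, r.last_log_index))


def failover(leader: Replica, followers: list) -> Optional[Replica]:
    """Return the replacement leader, or None if the current leader is healthy."""
    if not is_failed(leader):
        return None
    return pick_new_leader(followers)
```

Comparing the term before the index matters: a follower with a longer log from an older term is considered less up to date than one with a shorter log from a newer term. Steps 3 and 4 (repointing clients and demoting the returning leader) depend on the deployment and are left out of the sketch.
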
Pro Tip: Choose async replication for performance (risk: data loss on leader crash), sync for durability (cost: higher write latency). Semi-sync is a common middle ground: one follower is sync, the rest async.