Consensus Algorithms Cheat Sheet

Complete reference for distributed consensus algorithms — Raft, Paxos, and PBFT explained with leader election, log replication, safety guarantees, and real-world.

Last Updated: May 1, 2025

Raft — Core Mechanisms

ItemDescription
Leader ElectionOne leader handles all client requests. Term numbers increase monotonically. Candidate requests votes; majority wins.
Log ReplicationLeader appends entries to its log, sends AppendEntries RPCs to followers. Entry is committed once replicated to a majority.
SafetyLeader never overwrites committed entries. At most one leader per term. Election restriction: candidate's log must be at least as up-to-date as voter's.
Membership ChangesJoint consensus: transition through intermediate configuration that includes both old and new servers. Two-phase approach prevents split-brain.
Log CompactionSnapshotting: periodically write entire state to disk, discard old log entries. InstallSnapshot RPC for slow followers.
Heartbeat TimerElection timeout randomized (150-300 ms typical). Leader sends heartbeats every `timeout/2` to prevent spurious elections.

Paxos — The Classic

ItemDescription
Basic PaxosThree roles: Proposers, Acceptors, Learners. Two-phase protocol: Prepare/Promise, then Accept/Accepted. Quorum size = majority (2f+1 for f failures).
Multi-PaxosElect distinguished proposer (leader) to skip Prepare phase for most commands. Leader issues Accept directly with monotonically increasing instance IDs.
Phase 1: PrepareProposer picks proposal number `n` higher than any it has used. Sends Prepare(n) to all acceptors. Each acceptor promises to reject proposals < n.
Phase 2: AcceptIf proposer receives promises from majority, sends Accept(n, value). Value is either its own or the highest-numbered value from Promise replies.
Liveness vs SafetyPaxos guarantees safety (no two values chosen for same instance) even with failures. Liveness requires distinguished leader to prevent dueling proposers.
Practical VariantsCheap Paxos (fewer acceptors), Fast Paxos (bypass leader for uncontended writes), EPaxos (leaderless, low-latency across WAN).

Consensus Algorithm Comparison

AlgorithmFault ModelLeader?PerformanceComplexityUse Cases
RaftCrash-faultYes — strongHigh throughputModerateetcd, Consul, TiKV, CockroachDB
Multi-PaxosCrash-faultYes — weakHigh throughputHighChubby, Spanner, Google File System
Basic PaxosCrash-faultNoLow (2 rounds)HighTheoretical foundation, teaching
PBFTByzantine (f < n/3)RotatingModerate (3 phases)Very HighHyperledger Fabric, Zyzzyva
Viewstamped ReplicationCrash-faultYesModerateModerateNuoDB, academic systems
EPaxosCrash-faultNo (leaderless)Low latency WANHighGeo-distributed databases

Byzantine Fault Tolerance

ItemDescription
PBFT (Practical BFT)Tolerates up to f < n/3 Byzantine nodes. Three phases per request: Pre-Prepare, Prepare, Commit. Each requires 2f+1 matching messages.
View ChangeIf primary is suspected faulty, replicas initiate view change. New primary = v+1 mod n. Must carry over prepared requests to new view.
Tendermint / HotStuffModern BFT: leaders propose blocks, validators vote. Pipelined consensus for high throughput. Used by Cosmos blockchain and Facebook's Libra/Diem.
Blockchain ConsensusNakamoto consensus (PoW): probabilistic finality, open membership. PoS (Ethereum 2.0): validators stake, slashing conditions for safety.
BFT Limitations3x replication minimum. All-to-all communication O(n²) per round. Latency-sensitive — single slow node delays the whole system.
Asynchronous BFTFLP impossibility: deterministic consensus impossible with even one faulty node in async model. Practical systems use partial synchrony (eventual message delivery).
Pro Tip: Raft was designed for understandability — if you're learning consensus from scratch, start there. Paxos is the theoretical foundation, Raft is the practical implementation.