Last Updated: May 1, 2025
Consistency Models
| Model | Guarantee | Performance | Complexity | When to Use |
|---|---|---|---|---|
| ACID (2PC/XA) | Strong — all nodes commit or all rollback. Linearizable. | Low — locks held for entire transaction. Blocks on coordinator failure. | High — need transaction manager, XA drivers. | Payment systems, financial ledgers, inventory count. |
| BASE (Saga) | Eventually consistent. Each step commits locally. | High — no distributed locks. Steps process independently. | Moderate — need compensating transactions for rollback. | Order management, reservations, multi-service workflows. |
| TCC (Try-Confirm-Cancel) | Resource-level reservation (Try), then Confirm or Cancel. | High — resources reserved, not locked. | High — two-phase at resource level. | Seat booking, warehouse allocation, hotel reservations. |
| Outbox Pattern | Atomic write of state + event in local DB. Guaranteed publish. | High — no distributed coordination. | Moderate — needs outbox processor (polling or CDC). | Domain event publication, event-driven microservices. |
Two-Phase Commit (2PC / XA)
| Item | Description |
|---|---|
Phase 1: Prepare | Coordinator asks ALL participants: 'Can you commit?' Each participant writes undo+redo logs to disk, votes YES/NO. Locks held on resources until phase 2. |
Phase 2: Commit | If ALL vote YES: coordinator writes commit decision to log, sends COMMIT to all. If ANY vote NO: coordinator sends ROLLBACK to all. Participants release locks. |
Blocking Problem | Coordinator failure after sending PREPARE but before COMMIT → participants block forever holding locks. The single-point-of-failure that makes 2PC impractical at scale. |
XA Protocol | Standard 2PC implementation: XA interface between transaction manager and resource managers (databases). javax.transaction.xa in Java. Most RDBMS support XA. |
3PC (Three-Phase Commit) | Adds pre-commit phase and timeouts to avoid indefinite blocking. In theory, non-blocking. In practice, rarely used — added complexity and message overhead. |
When 2PC Makes Sense | Single-digit number of participants. Short-lived transactions (milliseconds). High cost of inconsistency (financial transfers). Within a single cloud region. |
Saga Pattern
| Item | Description |
|---|---|
Choreographed Saga | Each service publishes events after local transaction. Next service subscribes and reacts. No central coordinator. Low coupling, harder to reason about end-to-end flow. |
Orchestrated Saga | Central saga orchestrator sends commands and listens for outcomes. Explicit workflow: step 1 → step 2 → step 3. Easier to test, debug, and version control. |
Compensating Transactions | Undo already-committed steps when saga fails. Semantic undo — NOT a database rollback. Example: 'cancel order' email to warehouse, not 'DELETE FROM orders'. |
Saga Implementation States | Pending (in progress), Completed (all steps succeeded), Compensating (failure detected, undoing steps), Compensated (all undo steps done), Failed (compensation also failed = manual intervention). |
Idempotency in Sagas | Every saga step must be idempotent: retry on timeout must not double-charge. Each step stores saga_id + step_id for dedup. |
Failure Modes | Lost response (step succeeded but orchestrator didn't get ack), duplicate execution, out-of-order delivery. Handle all three or your saga will fail in production. |
Practical Patterns
Outbox PatternUPDATE orders SET status='paid', outbox(event_type='OrderPaid', payload=json) — one DB transaction. Outbox poller publishes to Kafka.
Transactional Outbox with CDCDebezium tails MySQL binlog / Postgres WAL → emits changes to Kafka. Zero application code for outbox polling. Handles exactly-once with log position.
Idempotent ReceiverOn incoming message: check `processed_messages` table by message_id. If exists → ACK and skip. If not → process + insert processed row + ACK in one tx.
Saga Orchestrator State MachineAWS Step Functions, Camunda, Temporal.io, or custom: state machine with retry policies, timeouts, and compensation steps per state.
Pro Tip: ACID is for monoliths, BASE is for microservices. In distributed systems, strong consistency is expensive and slow. Choose eventual consistency with compensating transactions unless you absolutely need 2PC.