Circuit Breaker Pattern Cheat Sheet

The circuit breaker pattern for resilient distributed systems: prevent cascading failures with closed/open/half-open states, thresholds, fallbacks, and real-world implementations.

Last Updated: May 1, 2025

Circuit Breaker States

| State | Behavior | Transitions To | Typical Action |
| --- | --- | --- | --- |
| CLOSED | Requests flow normally. Failure count tracked. | OPEN (when failures exceed threshold) | Normal operation: call downstream. |
| OPEN | Requests fail immediately. No calls to downstream. Timer counts down. | HALF-OPEN (after timeout expires) | Fast-fail with exception or fallback response. |
| HALF-OPEN | Limited probe requests allowed through. | CLOSED (if probes succeed) or OPEN (if probes fail) | Test if downstream recovered; gate remaining requests. |
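
The transitions in the table can be sketched as a small state machine. This is a minimal single-threaded illustration, not any library's implementation: the names `CircuitBreaker`, `CircuitOpenError`, and `permitted_probes` are hypothetical, the clock is injectable so the timeout can be tested, and the sketch does not limit how many probes run concurrently.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitOpenError(Exception):
    """Raised when a call is rejected without reaching the downstream."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, wait_duration=30.0,
                 permitted_probes=3, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.wait_duration = wait_duration
        self.permitted_probes = permitted_probes
        self.clock = clock
        self.state = State.CLOSED
        self.failures = 0          # consecutive failures while CLOSED
        self.probe_successes = 0   # successful probes while HALF_OPEN
        self.opened_at = 0.0

    def _trip(self):
        self.state = State.OPEN
        self.opened_at = self.clock()

    def call(self, fn, *args, **kwargs):
        if self.state is State.OPEN:
            if self.clock() - self.opened_at < self.wait_duration:
                raise CircuitOpenError("circuit is open")
            # Timer expired: start letting probe requests through.
            self.state = State.HALF_OPEN
            self.probe_successes = 0
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self):
        if self.state is State.HALF_OPEN:
            self._trip()           # any failed probe reopens the circuit
            return
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self._trip()

    def _on_success(self):
        if self.state is State.HALF_OPEN:
            self.probe_successes += 1
            if self.probe_successes >= self.permitted_probes:
                self.state = State.CLOSED
                self.failures = 0
        else:
            self.failures = 0      # a success resets the consecutive count
```

Wrapping a downstream call looks like `breaker.call(fetch_price, sku)`, where `fetch_price` stands for any function that talks to the dependency; callers catch `CircuitOpenError` to return a fallback.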

Configuration Parameters

| Item | Description |
| --- | --- |
| failureThreshold | Number of consecutive failures before opening the circuit (e.g., 5). Sliding-window implementations use a failure-rate percentage instead, under names such as `errorThresholdPercentage` or `failureRateThreshold`. |
| slowCallThreshold | Maximum duration before a call is considered slow (e.g., 1000 ms). Slow calls count toward opening the circuit when a slow-call threshold is configured. |
| waitDurationInOpen | Time the circuit stays OPEN before transitioning to HALF-OPEN (e.g., 30s). Must be long enough for downstream to recover. |
| permittedCallsInHalfOpen | Maximum number of probe calls allowed through in HALF-OPEN state (e.g., 3). If they succeed → CLOSED; under the simplest policy, if any fail → back to OPEN. |
| slidingWindowSize | Number of calls to evaluate when computing failure rate. Count-based (last N calls) or time-based (last N seconds). |
| recordExceptions | List of exceptions that count as failures (e.g., TimeoutException, ConnectException). Business exceptions should not trip the breaker. |
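
To make the interaction between these parameters concrete, here is a minimal sketch of a count-based sliding window. It assumes a simplified policy in which slow calls are counted the same as failures (some libraries track a separate slow-call rate), and the class name `SlidingWindow` and its method names are hypothetical. Python's built-in `TimeoutError` and `ConnectionError` stand in for the recorded exception list.

```python
from collections import deque


class SlidingWindow:
    """Count-based window over the outcomes of the last `size` calls."""

    def __init__(self, size=10, failure_rate_threshold=50.0,
                 slow_call_threshold_ms=1000.0,
                 record_exceptions=(TimeoutError, ConnectionError)):
        self.size = size
        self.failure_rate_threshold = failure_rate_threshold  # percent
        self.slow_call_threshold_ms = slow_call_threshold_ms
        self.record_exceptions = record_exceptions
        # True means the call counted against the dependency.
        self.outcomes = deque(maxlen=size)

    def record(self, duration_ms, error=None):
        # Exceptions outside record_exceptions (business errors) are ignored.
        failed = isinstance(error, self.record_exceptions)
        slow = duration_ms > self.slow_call_threshold_ms
        self.outcomes.append(failed or slow)

    def failure_rate(self):
        if not self.outcomes:
            return 0.0
        return 100.0 * sum(self.outcomes) / len(self.outcomes)

    def should_open(self):
        # Require a full window so one early failure cannot trip the breaker.
        return (len(self.outcomes) == self.size
                and self.failure_rate() >= self.failure_rate_threshold)
```

Requiring a full window before evaluating is one possible design choice; it plays the role of a minimum-number-of-calls setting.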

Implementation Patterns

Resilience4j CircuitBreaker (Java)
`@CircuitBreaker(name = "paymentService", fallbackMethod = "fallback")` Example config: slidingWindowSize=10, failureRateThreshold=50 (percent). The library's own default slidingWindowSize is 100.
Polly Circuit Breaker (C# .NET)
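`Policy.Handle<TException>().CircuitBreakerAsync(exceptionsAllowedBeforeBreaking: 3, durationOfBreak: TimeSpan.FromSeconds(30))` (Polly v7 syntax), where TException is the exception type that should count toward breaking, e.g. HttpRequestException.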
Go Circuit Breaker
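Use sony/gobreaker: Settings with MaxRequests=3 (requests allowed through in half-open), Interval=30s (period after which counts are cleared while closed), Timeout=60s (time spent open before moving to half-open)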
Envoy Proxy Circuit Breaker
Per-cluster limits on connections and requests: max_connections, max_pending_requests, max_requests, max_retries. Requests over a limit are rejected immediately.
Istio DestinationRule
trafficPolicy.connectionPool sets TCP and HTTP connection limits; outlierDetection ejects failing hosts: consecutive5xxErrors=5 (replaces the deprecated consecutiveErrors), baseEjectionTime=30s
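
Proxy-level circuit breaking of the kind Envoy's `max_requests` provides is limit-based rather than a closed/open/half-open state machine: once the number of in-flight requests reaches the limit, further requests are rejected instead of queued. The idea can be sketched in a few lines. This is only an illustration of the concept, not Envoy's implementation, and `ConcurrencyLimit` and `LimitExceeded` are hypothetical names.

```python
import threading
from contextlib import contextmanager


class LimitExceeded(Exception):
    """Raised when the in-flight limit is already reached."""


class ConcurrencyLimit:
    def __init__(self, max_requests):
        self.max_requests = max_requests
        self.in_flight = 0
        self._lock = threading.Lock()

    @contextmanager
    def slot(self):
        # Reserve a slot or reject immediately; never wait for one.
        with self._lock:
            if self.in_flight >= self.max_requests:
                raise LimitExceeded("max_requests reached")
            self.in_flight += 1
        try:
            yield
        finally:
            # Release the slot whether the request succeeded or failed.
            with self._lock:
                self.in_flight -= 1
```

A caller wraps each outbound request in `with limit.slot():` and treats `LimitExceeded` like any other fast-fail, typically by returning a fallback.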

Best Practices & Anti-Patterns

| Item | Description |
| --- | --- |
| Always provide fallback | Degrade gracefully: cached data, default response, or queued retry. Never return raw errors to users without a plan. |
| Don't circuit-break business errors | Only trip on integration failures (timeouts, connection refused). HTTP 404 or 422 are valid responses, not failures. |
| Log every state transition | Audit every CLOSED→OPEN, OPEN→HALF-OPEN, HALF-OPEN→CLOSED, and HALF-OPEN→OPEN. Essential for debugging cascading failures across services. |
| Tune thresholds per-dependency | Database calls need tighter thresholds than cache calls. A fast cache such as Redis can tolerate far more retries than a slow downstream API. |
| Combine with retries | Retry with exponential backoff inside CLOSED state. The circuit breaker catches what retries can't handle: persistent failures. |
| Monitor breaker metrics | Expose via /actuator/health or Prometheus metrics: state, failure rate, not-permitted count. Alert on OPEN state > 5 min. |
| Don't share breakers across unrelated dependencies | One slow endpoint shouldn't break the circuit for all calls to a service. Use per-endpoint or per-operation granularity. |
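
Several of these practices compose naturally: one breaker per dependency with its own thresholds, a retry loop with exponential backoff inside the breaker, logged state transitions, and a fallback when the call is rejected or fails. The sketch below is one possible arrangement under stated assumptions, not a reference implementation. `SimpleBreaker`, `with_retries`, `guarded_call`, and the dependency names are all hypothetical, and only `TimeoutError` and `ConnectionError` are treated as integration failures.

```python
import logging
import time

log = logging.getLogger("breaker")


class CircuitOpenError(Exception):
    pass


class SimpleBreaker:
    """Consecutive-failure breaker, trimmed to what this example needs."""

    def __init__(self, name, failure_threshold, wait_duration, clock=time.monotonic):
        self.name, self.failure_threshold = name, failure_threshold
        self.wait_duration, self.clock = wait_duration, clock
        self.state, self.failures, self.opened_at = "CLOSED", 0, 0.0

    def _transition(self, new_state):
        log.warning("breaker %s: %s -> %s", self.name, self.state, new_state)
        self.state = new_state

    def call(self, fn):
        if self.state == "OPEN":
            if self.clock() - self.opened_at < self.wait_duration:
                raise CircuitOpenError(self.name)
            self._transition("HALF_OPEN")
        try:
            result = fn()
        except (TimeoutError, ConnectionError):
            # Only integration failures count; other exceptions pass through.
            self.failures += 1
            if self.state == "HALF_OPEN" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
                self._transition("OPEN")
            raise
        if self.state == "HALF_OPEN":
            self._transition("CLOSED")
        self.failures = 0
        return result


def with_retries(fn, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Retry transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except (TimeoutError, ConnectionError):
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# One breaker per dependency, each with its own thresholds.
breakers = {
    "inventory-api": SimpleBreaker("inventory-api", failure_threshold=2, wait_duration=30.0),
    "cache": SimpleBreaker("cache", failure_threshold=10, wait_duration=5.0),
}


def guarded_call(dependency, fn, fallback, sleep=time.sleep):
    """The breaker wraps the retries, so one exhausted retry loop is one failure."""
    try:
        return breakers[dependency].call(lambda: with_retries(fn, sleep=sleep))
    except (CircuitOpenError, TimeoutError, ConnectionError):
        return fallback()
```

Placing the breaker outside the retry loop means the breaker only sees failures that retries could not absorb. A fuller version might skip retries for HALF-OPEN probes so that a recovering dependency receives a single request per probe.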
Pro Tip: Circuit breakers protect downstream services from overload and keep callers from tying up threads and connections on requests that are likely to fail. The key insight: failing fast is better than making users wait. Always pair the breaker with fallback behavior; a degraded experience beats no experience.