Distributed Systems

Distributed Transactions

Цели урока

  • Understand the atomicity problem across independent services and the 4 partial failure scenarios
  • Know the 2PC protocol, its guarantees, and key limitations (blocking, coordinator failure)
  • Apply the Saga Pattern with compensating transactions for eventual consistency
  • Choose between Orchestration and Choreography based on flow complexity and requirements

Предварительные знания

  • ACID transactions in a single database (atomicity, isolation, durability)
  • Basic understanding of microservices architecture
  • WAL (Write-Ahead Log) concept and crash recovery
  • Eventual consistency vs strong consistency

eBay in 2004 was losing USD 1.5B per year on partial service failures - payment went through, order was never created. This is not a code bug, it is a fundamental problem of distributed systems without the right protocol.

  • **Uber Payments** - Saga with Orchestration for ride payment: 5 steps (auth, hold, trip, release, charge), compensate on any failure
  • **Amazon Order Pipeline** - Choreography via EventBridge: 100+ steps, each service is autonomous, no single coordinator
  • **Stripe Payments** - 2PC for internal DB + idempotency keys for API - clients can safely retry any request
  • **Google Spanner** - distributed transactions via 2PC over Paxos: coordinator is the Paxos group leader
  • **Airbnb Booking** - Saga + Outbox Pattern: event written to the same DB as business data - atomicity via one local transaction

2PC and the history of distributed transactions

Two-Phase Commit was invented by Jim Gray in the late 1970s at IBM Research. In 1979 he published its formal description in 'Notes on Database Operating Systems'. Gray also introduced ACID, WAL, and concurrency control - the foundation of all modern databases. He received the Turing Award in 1998. In 2007 Gray disappeared at sea on his yacht and was never found.

Atomicity Across Service Boundaries

**2004, eBay. The payment service charged PayPal - confirmed. But the order record in eBay's DB was never saved: the server crashed 80ms later. Customer paid, item not reserved. Over a year, such partial failures cost the company USD 1.5B in losses and refunds.** This is the distributed transaction problem: an operation spans multiple nodes, but ACID guarantees only apply within a single database. Related to the CAP theorem choice between C and A.

**Three partial failure scenarios:** 1. first node succeeded, second crashed - data lost 2. both succeeded but the acknowledgment was lost in the network - duplicate on retry 3. first succeeded, second failed on business logic - need to roll back the first. Each scenario requires a separate strategy.

ScenarioProblemRequired strategy
Node B crashed after A succeededMoney debited, not credited2PC rollback or compensation
A's response lost in networkClient retries - double chargeIdempotency key
B failed on business logicA done, B not - need to undo ASaga compensating transaction
Coordinator crashed mid-flowParticipants blocked in PREPARED2PC timeout + 3PC or Saga

A transaction in microservices works just like BEGIN/COMMIT in PostgreSQL

Each service commits its local transaction independently - there is no shared ROLLBACK across service boundaries

PostgreSQL ACID is one process, one WAL, one lock manager. In a distributed system each service has its own DB, its own isolation, its own commit point. Coordination requires protocols (2PC) or patterns (Saga) - and neither provides full ACID.

Client called a money transfer: service A (debit) replied OK, service B (credit) crashed. What is the system state?

Two-Phase Commit (2PC)

**Two-Phase Commit** is the classic protocol for atomic distributed transactions. One node becomes the Coordinator, others are Participants. The protocol works by locking resources until a final decision is made: either everyone commits or everyone aborts.

2PC ProblemDescriptionImpact
BlockingResources locked from Phase 1 until Phase 2Under high RPS - 5-10x throughput degradation
Coordinator failureParticipants stuck in PREPARED indefinitelyDeadlock until coordinator recovers
Network partitionCoordinator cannot reach some participantsTimeout forces ABORT even if all were ready
Does not scaleSynchronous wait for all participantsLatency = max(latency of all nodes)

**2PC is a blocking protocol.** If the Coordinator crashes after Phase 1 (all replied YES) but before Phase 2 - participants hold locks indefinitely. Three-Phase Commit (3PC) partially solves this, but has its own trade-offs. In production microservices, 2PC is rarely used. Covered in detail in the 2PC lesson.

The Coordinator in 2PC wrote COMMIT to its WAL, then crashed before sending COMMIT to participants. What happens?

Saga Pattern: Chain of Compensations

**Uber, 2016. When launching their microservices architecture they hit a wall: 2PC does not work across 50+ independent services (trips, payments, drivers, surge, promotions). Even a Paxos-replicated coordinator could not help - a different pattern was needed. Saga Pattern: a long transaction is split into a chain of local transactions. Each step has a compensating transaction. If step N fails - compensations for steps N-1, N-2, ... run in reverse order.**

**Key difference between compensation and rollback:** a normal ROLLBACK erases the operation as if it never happened. A compensating transaction is a new business operation (issue a refund, release a reservation). It appears in logs, it can partially fail, and it must be idempotent. This is eventual consistency, not ACID.

Saga guarantees ACID the same way as a local database transaction

Saga provides eventual consistency - between steps the system is in an intermediate state

After step 1 (debit) and before step 2 (credit), real time passes - tens of milliseconds. Other clients may see debited money and an unreserved item. This is called ACI without D, or BASE (Basic Availability, Soft state, Eventual consistency). For most business workflows this is acceptable.

In a Saga, steps 1 and 2 completed successfully, step 3 failed. What should happen?

Choreography vs Orchestration

**Saga is implemented in two ways.** Orchestration: one central service (Orchestrator) calls all steps in order and manages compensations. Choreography: each service listens to events and reacts independently - no central coordinator. Amazon Order Service, Uber Dispatch, Netflix Watch Party - all use a hybrid depending on flow complexity.

ApproachHow it worksProsCons
OrchestrationCentral service calls steps and compensationsFlow readable as code, easy debug, explicit rollbackSingle point of failure, coupling through orchestrator
ChoreographyServices react to events from a queueLoose coupling, no SPOF, easy to add a stepImplicit flow, hard to debug, compensation needs planning

**Selection rule:** if the flow is linear and has 3-5 steps - use Orchestration (easier to debug). For 10+ steps or maximum service autonomy - use Choreography. Uber uses Orchestration for critical financial flows (ride payment) and Choreography for analytics pipelines.

SystemApproachWhy
Uber PaymentsOrchestrationFinancial atomicity, explicit audit trail required
Amazon OrderChoreography100+ steps, independent teams, loose coupling
Netflix BillingOrchestrationCompliance requires predictable rollback
Airbnb BookingSaga + OutboxOutbox pattern for reliable event publishing

Which approach is better for a financial process with 4 steps where a clear audit log and predictable rollback are critical?

Вопросы для размышления

  • A bank implements an account transfer via Saga (debit one account, credit another). Between steps 1 and 2 there is a 200ms gap. What issues can arise and how should they be addressed?

Связанные уроки

  • ds-03-consensus — Atomic commit requires agreement across nodes
  • dist-07-2pc — 2PC is the standard distributed transaction commit protocol
  • dist-12-consistency — Transaction isolation level determines consistency guarantees
  • ds-11-distributed-locks — Pessimistic transactions use distributed locks
  • db-13-transactions
Distributed Transactions

0

1

Sign In