Distributed Systems

Distributed Transactions

Цели урока

Understand the atomicity problem across independent services and the 4 partial failure scenarios
Know the 2PC protocol, its guarantees, and key limitations (blocking, coordinator failure)
Apply the Saga Pattern with compensating transactions for eventual consistency
Choose between Orchestration and Choreography based on flow complexity and requirements

Предварительные знания

ACID transactions in a single database (atomicity, isolation, durability)
Basic understanding of microservices architecture
WAL (Write-Ahead Log) concept and crash recovery
Eventual consistency vs strong consistency

eBay in 2004 was losing USD 1.5B per year on partial service failures - payment went through, order was never created. This is not a code bug, it is a fundamental problem of distributed systems without the right protocol.

**Uber Payments** - Saga with Orchestration for ride payment: 5 steps (auth, hold, trip, release, charge), compensate on any failure
**Amazon Order Pipeline** - Choreography via EventBridge: 100+ steps, each service is autonomous, no single coordinator
**Stripe Payments** - 2PC for internal DB + idempotency keys for API - clients can safely retry any request
**Google Spanner** - distributed transactions via 2PC over Paxos: coordinator is the Paxos group leader
**Airbnb Booking** - Saga + Outbox Pattern: event written to the same DB as business data - atomicity via one local transaction

2PC and the history of distributed transactions

Two-Phase Commit was invented by Jim Gray in the late 1970s at IBM Research. In 1979 he published its formal description in 'Notes on Database Operating Systems'. Gray also introduced ACID, WAL, and concurrency control - the foundation of all modern databases. He received the Turing Award in 1998. In 2007 Gray disappeared at sea on his yacht and was never found.

Atomicity Across Service Boundaries

**2004, eBay. The payment service charged PayPal - confirmed. But the order record in eBay's DB was never saved: the server crashed 80ms later. Customer paid, item not reserved. Over a year, such partial failures cost the company USD 1.5B in losses and refunds.** This is the distributed transaction problem: an operation spans multiple nodes, but ACID guarantees only apply within a single database. Related to the CAP theorem choice between C and A.

**Three partial failure scenarios:** 1. first node succeeded, second crashed - data lost 2. both succeeded but the acknowledgment was lost in the network - duplicate on retry 3. first succeeded, second failed on business logic - need to roll back the first. Each scenario requires a separate strategy.

Scenario	Problem	Required strategy
Node B crashed after A succeeded	Money debited, not credited	2PC rollback or compensation
A's response lost in network	Client retries - double charge	Idempotency key
B failed on business logic	A done, B not - need to undo A	Saga compensating transaction
Coordinator crashed mid-flow	Participants blocked in PREPARED	2PC timeout + 3PC or Saga

A transaction in microservices works just like BEGIN/COMMIT in PostgreSQL

Each service commits its local transaction independently - there is no shared ROLLBACK across service boundaries

PostgreSQL ACID is one process, one WAL, one lock manager. In a distributed system each service has its own DB, its own isolation, its own commit point. Coordination requires protocols (2PC) or patterns (Saga) - and neither provides full ACID.

Client called a money transfer: service A (debit) replied OK, service B (credit) crashed. What is the system state?

Two-Phase Commit (2PC)

**Two-Phase Commit** is the classic protocol for atomic distributed transactions. One node becomes the Coordinator, others are Participants. The protocol works by locking resources until a final decision is made: either everyone commits or everyone aborts.

2PC Problem	Description	Impact
Blocking	Resources locked from Phase 1 until Phase 2	Under high RPS - 5-10x throughput degradation
Coordinator failure	Participants stuck in PREPARED indefinitely	Deadlock until coordinator recovers
Network partition	Coordinator cannot reach some participants	Timeout forces ABORT even if all were ready
Does not scale	Synchronous wait for all participants	Latency = max(latency of all nodes)

**2PC is a blocking protocol.** If the Coordinator crashes after Phase 1 (all replied YES) but before Phase 2 - participants hold locks indefinitely. Three-Phase Commit (3PC) partially solves this, but has its own trade-offs. In production microservices, 2PC is rarely used. Covered in detail in the 2PC lesson.

The Coordinator in 2PC wrote COMMIT to its WAL, then crashed before sending COMMIT to participants. What happens?

Saga Pattern: Chain of Compensations

**Uber, 2016. When launching their microservices architecture they hit a wall: 2PC does not work across 50+ independent services (trips, payments, drivers, surge, promotions). Even a Paxos-replicated coordinator could not help - a different pattern was needed. Saga Pattern: a long transaction is split into a chain of local transactions. Each step has a compensating transaction. If step N fails - compensations for steps N-1, N-2, ... run in reverse order.**

**Key difference between compensation and rollback:** a normal ROLLBACK erases the operation as if it never happened. A compensating transaction is a new business operation (issue a refund, release a reservation). It appears in logs, it can partially fail, and it must be idempotent. This is eventual consistency, not ACID.

Saga guarantees ACID the same way as a local database transaction

Saga provides eventual consistency - between steps the system is in an intermediate state

After step 1 (debit) and before step 2 (credit), real time passes - tens of milliseconds. Other clients may see debited money and an unreserved item. This is called ACI without D, or BASE (Basic Availability, Soft state, Eventual consistency). For most business workflows this is acceptable.

In a Saga, steps 1 and 2 completed successfully, step 3 failed. What should happen?

Choreography vs Orchestration

**Saga is implemented in two ways.** Orchestration: one central service (Orchestrator) calls all steps in order and manages compensations. Choreography: each service listens to events and reacts independently - no central coordinator. Amazon Order Service, Uber Dispatch, Netflix Watch Party - all use a hybrid depending on flow complexity.

Approach	How it works	Pros	Cons
Orchestration	Central service calls steps and compensations	Flow readable as code, easy debug, explicit rollback	Single point of failure, coupling through orchestrator
Choreography	Services react to events from a queue	Loose coupling, no SPOF, easy to add a step	Implicit flow, hard to debug, compensation needs planning

**Selection rule:** if the flow is linear and has 3-5 steps - use Orchestration (easier to debug). For 10+ steps or maximum service autonomy - use Choreography. Uber uses Orchestration for critical financial flows (ride payment) and Choreography for analytics pipelines.

System	Approach	Why
Uber Payments	Orchestration	Financial atomicity, explicit audit trail required
Amazon Order	Choreography	100+ steps, independent teams, loose coupling
Netflix Billing	Orchestration	Compliance requires predictable rollback
Airbnb Booking	Saga + Outbox	Outbox pattern for reliable event publishing

Which approach is better for a financial process with 4 steps where a clear audit log and predictable rollback are critical?

Вопросы для размышления

A bank implements an account transfer via Saga (debit one account, credit another). Between steps 1 and 2 there is a 200ms gap. What issues can arise and how should they be addressed?

Связанные уроки

ds-03-consensus — Atomic commit requires agreement across nodes
dist-07-2pc — 2PC is the standard distributed transaction commit protocol
dist-12-consistency — Transaction isolation level determines consistency guarantees
ds-11-distributed-locks — Pessimistic transactions use distributed locks
db-13-transactions