Parallel Computing

Synchronization

Цели урока

Understand mutex and the causes of deadlock (Coffman's 4 conditions)
Distinguish mutex from semaphore: ownership vs counting
Apply condition variable for event-driven waiting without busy spin
Build phase synchronization with barrier, understand load imbalance

Предварительные знания

Threads and Processes

1997. The Pathfinder rover reboots on Mars again and again. Root cause: priority inversion. A low-priority thread holds a mutex needed by a high-priority thread. A synchronization bug worth USD 265M, on another planet. NASA engineers patched it remotely. Understanding synchronization primitives literally saves missions.

**Databases:** every transaction uses locks for ACID guarantees; deadlock detection is a standard DBMS feature
**Operating systems:** all kernel resources (file system, network, memory) are protected by mutexes and semaphores
**Game engines:** the render thread and physics thread synchronize via a barrier every frame
**PyTorch DDP:** a barrier synchronizes gradients across GPUs after every backward pass

Dijkstra and Semaphores

Edsger Dijkstra invented semaphores in 1965 while building the THE operating system. The operations P (proberen = test, acquire) and V (verhogen = increment, release) are Dutch words. The same year Dijkstra formulated the Dining Philosophers problem - the classic illustration of deadlock and synchronization headaches. Both results landed in one paper, "Cooperating Sequential Processes".

Mutex: Mutual Exclusion

1997. The Pathfinder rover on Mars starts rebooting for no apparent reason. NASA engineers sit 300 million km away. Root cause: a low-priority thread holds a mutex that a high-priority thread needs. Priority inversion. A synchronization bug worth USD 265M, on another planet.

A **mutex** (mutual exclusion) is a lock that guarantees only one thread executes the critical section at a time. Picture a single-occupancy stall: door locked, wait; door open, enter and lock it behind.

**Deadlock** is the lethal failure mode of mutexes. Thread A holds lock1 and waits for lock2. Thread B holds lock2 and waits for lock1. Both wait forever. Fix: acquire mutexes in a consistent order (lock ordering), or use trylock with a timeout.

**Coffman's 4 conditions** for deadlock: 1. mutual exclusion 2. hold and wait 3. no preemption 4. circular wait. Break any one of them and deadlock cannot happen. Lock ordering breaks circular wait.

Two threads: A acquires lock1->lock2, B acquires lock2->lock1. What happens?

Semaphore: Generalized Counter

A mutex is binary: held or free. A **semaphore** generalizes the idea into a counter that admits N concurrent threads. Picture a parking lot with 5 spots: full, wait; someone leaves, drive in. Every production connection pool runs on this exact pattern.

Type	Counter	Use case
Binary Semaphore	0 or 1	Mutex analog (but without ownership!)
Counting Semaphore	0..N	Connection pool, thread pool, rate limiting
Mutex	0 or 1 + owner	Only the owner can unlock

**Key difference from mutex:** a semaphore has no owner. One thread can acquire it and another can release it. That is what makes the **producer-consumer** pattern work: the producer calls release ("produced an item"), the consumer calls acquire ("took an item").

A database connection pool has a maximum of 5 connections. 10 threads want to connect. Which primitive to use?

Condition Variable: Waiting for an Event

A mutex protects data. What about waiting for a specific data state? "Wait until the queue is non-empty". Spinning in a loop (busy waiting) torches CPU for nothing. Go runtime, Java thread pools, Rust Tokio all reach for the same primitive: a **condition variable**. Sleep, then wake when the condition holds.

**Spurious wakeups:** a thread can wake up without a notify. Always wrap wait() in `while (condition)`, never `if (condition)`. Not a bug - it is a property of the OS-level implementation (signals, scheduler).

**Pattern:** a condition variable always pairs with a mutex. The mutex protects shared state; the condition variable handles waiting for state changes. wait() **atomically** releases the mutex and puts the thread to sleep. On wakeup, the mutex is re-acquired.

Why does cond.wait() need while (condition), not if (condition)?

Barrier: Synchronization Point

Google MapReduce, Apache Spark, neural network training all run the same dance: chop work into independent chunks, compute in parallel, then **merge**. The moment when "everyone is done computing, now merge" is a barrier - a point where every thread waits for the rest before continuing.

Pattern	Primitive	Example
Mutual exclusion	Mutex	Protecting a shared counter
Resource limiting	Semaphore	Connection pool (max 5)
Event waiting	Condition Variable	Producer-consumer queue
Phase synchronization	Barrier	MapReduce, parallel simulation

A barrier can become the bottleneck. If one thread lags, every other thread sits idle. That is **load imbalance** - one of the headline problems in parallel systems. Fix: dynamic work distribution (work stealing). In GPU computing, CUDA's `__syncthreads()` synchronizes every thread in a block - without it, one warp can read data another warp has not yet written.

More mutexes = safer

More mutexes = higher deadlock risk and worse performance. The optimum is the minimum necessary number.

Each additional mutex increases deadlock risk (more acquisition-order combinations), contention (threads block each other more often), and overhead (every lock/unlock is an atomic operation). Rule: "Don't communicate by sharing memory; share memory by communicating" (Go proverb).

4 threads process a matrix. Threads 0-2 finish in 10ms, thread 3 takes 100ms. With a barrier for 4 threads, how long does the phase take?

Key Ideas

**Mutex** - one thread in the critical section; deadlock is possible when lock ordering is violated
**Semaphore** - counter for N concurrent accesses; no owner (enables producer/consumer pattern)
**Condition Variable** - "sleep until condition is met"; always with while (spurious wakeups break if)
**Barrier** - "wait for everyone, then continue"; phase time equals the slowest thread's time

Вопросы для размышления

Go promotes channels over mutexes: "Don't communicate by sharing memory; share memory by communicating". What is the advantage of this approach?
Is it possible to completely avoid mutexes by using only lock-free data structures? What are the trade-offs?
How did NASA engineers fix the priority inversion bug on Pathfinder while being on Earth with the rover on Mars?

Связанные уроки

par-02 — Threads and shared memory - the foundation for why primitives exist
par-04 — Lock-free structures are built on top of mutex understanding
rts-02 — Priority inversion in RT systems is a direct consequence of mutex ownership
ds-03 — Distributed consensus is a cluster-level mutex
par-08 — Parallel algorithms require a sound synchronization strategy
stream-03 — Message brokers as an alternative to shared memory via message passing
os-05-sync
os-03-threads

Parallel Computing

Synchronization

Цели урока

Understand mutex and the causes of deadlock (Coffman's 4 conditions)
Distinguish mutex from semaphore: ownership vs counting
Apply condition variable for event-driven waiting without busy spin
Build phase synchronization with barrier, understand load imbalance

Предварительные знания

Threads and Processes

**Databases:** every transaction uses locks for ACID guarantees; deadlock detection is a standard DBMS feature
**Operating systems:** all kernel resources (file system, network, memory) are protected by mutexes and semaphores
**Game engines:** the render thread and physics thread synchronize via a barrier every frame
**PyTorch DDP:** a barrier synchronizes gradients across GPUs after every backward pass

Dijkstra and Semaphores

Mutex: Mutual Exclusion

**Coffman's 4 conditions** for deadlock: 1. mutual exclusion 2. hold and wait 3. no preemption 4. circular wait. Break any one of them and deadlock cannot happen. Lock ordering breaks circular wait.

Two threads: A acquires lock1->lock2, B acquires lock2->lock1. What happens?

Semaphore: Generalized Counter

Type	Counter	Use case
Binary Semaphore	0 or 1	Mutex analog (but without ownership!)
Counting Semaphore	0..N	Connection pool, thread pool, rate limiting
Mutex	0 or 1 + owner	Only the owner can unlock

A database connection pool has a maximum of 5 connections. 10 threads want to connect. Which primitive to use?

Condition Variable: Waiting for an Event

Why does cond.wait() need while (condition), not if (condition)?

Barrier: Synchronization Point

Pattern	Primitive	Example
Mutual exclusion	Mutex	Protecting a shared counter
Resource limiting	Semaphore	Connection pool (max 5)
Event waiting	Condition Variable	Producer-consumer queue
Phase synchronization	Barrier	MapReduce, parallel simulation

More mutexes = safer

More mutexes = higher deadlock risk and worse performance. The optimum is the minimum necessary number.

4 threads process a matrix. Threads 0-2 finish in 10ms, thread 3 takes 100ms. With a barrier for 4 threads, how long does the phase take?

Key Ideas

**Mutex** - one thread in the critical section; deadlock is possible when lock ordering is violated
**Semaphore** - counter for N concurrent accesses; no owner (enables producer/consumer pattern)
**Condition Variable** - "sleep until condition is met"; always with while (spurious wakeups break if)
**Barrier** - "wait for everyone, then continue"; phase time equals the slowest thread's time

Вопросы для размышления

Go promotes channels over mutexes: "Don't communicate by sharing memory; share memory by communicating". What is the advantage of this approach?
Is it possible to completely avoid mutexes by using only lock-free data structures? What are the trade-offs?
How did NASA engineers fix the priority inversion bug on Pathfinder while being on Earth with the rover on Mars?

Связанные уроки

par-02 — Threads and shared memory - the foundation for why primitives exist
par-04 — Lock-free structures are built on top of mutex understanding
rts-02 — Priority inversion in RT systems is a direct consequence of mutex ownership
ds-03 — Distributed consensus is a cluster-level mutex
par-08 — Parallel algorithms require a sound synchronization strategy
stream-03 — Message brokers as an alternative to shared memory via message passing
os-05-sync
os-03-threads

Synchronization

Цели урока

Предварительные знания

Dijkstra and Semaphores

Mutex: Mutual Exclusion

Semaphore: Generalized Counter

Condition Variable: Waiting for an Event

Barrier: Synchronization Point

Key Ideas

Related Topics

Вопросы для размышления

Связанные уроки

Synchronization

Цели урока

Предварительные знания

Dijkstra and Semaphores

Mutex: Mutual Exclusion

Semaphore: Generalized Counter

Condition Variable: Waiting for an Event

Barrier: Synchronization Point

Key Ideas

Related Topics

Вопросы для размышления

Связанные уроки