Operating Systems

Inter-Process Communication (IPC)

Processes are isolated from each other - this is the foundation of OS security. But what if they need to cooperate? A browser passes a URL to the renderer process, a database sends query results to the executor, Docker CLI manages the daemon. All modern systems are ensembles of interacting processes. IPC (Inter-Process Communication) is the language they use to communicate. Understanding IPC is understanding the architecture of real software.

**Chrome multi-process architecture:** Each tab is a separate renderer process (isolation). The browser process coordinates through **shared memory** (for page bitmaps) and **message passing** (for commands). A crash of one tab does not crash the browser.
**Docker daemon communication:** `docker run` uses a **Unix domain socket** (/var/run/docker.sock) to send commands to the daemon. This is safer than TCP (file permissions), faster (no network stack), and a standard pattern for system services.
**PostgreSQL shared buffer pool:** All backend processes (serving clients) work with a single **shared memory** segment (gigabytes). One data page loaded from disk is instantly available to all - critical for performance.

Lesson objectives

Apply IPC mechanisms: pipes (anonymous/named), shared memory, message queues, sockets
Pick the right one: shared memory is fastest, sockets cross hosts, pipes glue shells
Know SysV vs POSIX IPC; ftok, shmget, shm_open
Estimate order-of-magnitude costs (hardware-dependent): pipe ~1µs, shared memory ~50ns, Unix domain socket ~5µs
Prefer Unix domain sockets over TCP loopback for local IPC

Pipes - data pipelines

A **pipe** is the simplest IPC mechanism: it transfers data between processes as a byte stream. A pipe is a unidirectional channel: one process writes, another reads. In UNIX, pipes are fundamental to the philosophy of program composition.

Shell pipes - command pipeline

The command `cat file.txt | grep error | wc -l` makes the shell create **three processes** and **two pipes**. The stdout of `cat` is connected by a pipe to the stdin of `grep`, and the stdout of `grep` is connected by a second pipe to the stdin of `wc`. Data flows through the stages as through a pipeline, and all three processes run concurrently.
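The same wiring can be reproduced by hand. The sketch below is illustrative only: it assumes a Unix host with `cat`, `grep`, and `wc` on the PATH, and the function name `count_matching_lines` is made up for this example. Besides the two pipes between the stages, the parent uses two more pipes to feed input to `cat` and to collect the output of `wc`.

```python
import subprocess


def count_matching_lines(text: bytes, pattern: str = "error") -> int:
    """Run the equivalent of `cat | grep <pattern> | wc -l` over `text`."""
    cat = subprocess.Popen(["cat"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    grep = subprocess.Popen(["grep", "-e", pattern],
                            stdin=cat.stdout, stdout=subprocess.PIPE)
    wc = subprocess.Popen(["wc", "-l"], stdin=grep.stdout, stdout=subprocess.PIPE)

    # Drop the parent's copies of the intermediate read ends so that an
    # upstream stage receives SIGPIPE if a downstream stage exits early.
    cat.stdout.close()
    grep.stdout.close()

    # Feed the first stage, then close its stdin so it sees EOF.
    cat.stdin.write(text)
    cat.stdin.close()

    output = wc.stdout.read()
    wc.stdout.close()
    for proc in (cat, grep, wc):
        proc.wait()
    return int(output.strip())
```

All three child processes exist at the same time; each one blocks in `read()` until the stage before it produces data.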

A pipe is created by the `pipe()` system call, which fills a two-element array with **two file descriptors**: index 0 is the read end and index 1 is the write end. After `fork()`, both processes hold copies of these descriptors and can communicate; each side normally closes the end it does not use.
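A minimal sketch of the `pipe()` plus `fork()` pattern, using Python's thin wrappers `os.pipe()` and `os.fork()` (Unix only). The function name `pipe_roundtrip` is invented for this illustration.

```python
import os


def pipe_roundtrip(message: bytes) -> bytes:
    """Child writes `message` into a pipe; parent reads it back until EOF."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: keeps only the write end.
        try:
            os.close(read_fd)
            view = memoryview(message)
            while view:
                written = os.write(write_fd, view)
                view = view[written:]
            os.close(write_fd)
        finally:
            os._exit(0)  # never fall through into the parent's code

    # Parent: keeps only the read end. Closing the write end here matters,
    # otherwise the read loop below would never see EOF.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:  # empty result means every write end is closed
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)
```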

**Important properties of pipes:**
- **FIFO order:** data is read in the same order it was written
- **Atomicity:** a write of at most PIPE_BUF bytes (4096 on Linux; POSIX guarantees at least 512) is atomic and will not interleave with other writers' data
- **Blocking I/O:** `read()` waits for data, `write()` waits for space in the buffer
- **Closure:** if all write ends are closed, `read()` returns EOF (0 bytes) once the buffered data has been drained
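Some of these properties can be observed in a single process. The sketch below (the function name `pipe_properties` is made up) shows FIFO ordering, the absence of message boundaries in a byte stream, the EOF result after the last write end is closed, and the platform's `PIPE_BUF` value as exposed by Python's `select` module.

```python
import os
import select


def pipe_properties():
    """Return (data, eof_marker, PIPE_BUF) observed on a fresh pipe."""
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"first")
    os.write(write_fd, b"second")
    os.close(write_fd)             # the only write end is now closed

    data = os.read(read_fd, 1024)  # buffered bytes arrive in write order
    eof = os.read(read_fd, 1024)   # nothing left and no writers: EOF
    os.close(read_fd)
    return data, eof, select.PIPE_BUF
```

The two writes come back as one run of bytes: a pipe preserves order but not the boundaries between individual `write()` calls.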

**Named pipes (FIFOs)** extend the concept. An anonymous pipe can only be shared by processes that inherit its descriptors (typically parent and child). A FIFO is a **special file in the file system** (created with `mkfifo`) that behaves like a pipe: any process with permission to open it can exchange data through it.
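A small sketch of a FIFO in use, assuming a Unix host. The file name `demo.fifo` and the function name are hypothetical. The example forks only to stay self-contained: the writer reaches the FIFO purely through its path, which is what an unrelated process would do.

```python
import os
import tempfile


def fifo_roundtrip(message: bytes) -> bytes:
    """Send `message` through a named pipe located by path."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.fifo")
        os.mkfifo(path, 0o600)  # special file; only the owner may open it

        pid = os.fork()
        if pid == 0:
            # Writer: open() blocks until some process opens the read side.
            try:
                with open(path, "wb") as fifo:
                    fifo.write(message)
            finally:
                os._exit(0)

        # Reader: open() blocks until some process opens the write side.
        with open(path, "rb") as fifo:
            data = fifo.read()  # returns once the writer has closed
        os.waitpid(pid, 0)
        return data
```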

Redis uses pipes for worker communication

In Redis, when the main thread receives a BGSAVE (background save) command, it creates a child process via `fork()`. The child writes the RDB snapshot to disk and reports status information back to the parent through a **pipe**. The main thread reads the pipe in non-blocking mode (O_NONBLOCK) and updates its statistics. This allows the data to be saved asynchronously without stopping the server.
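The non-blocking read is the part that keeps the main thread responsive. The sketch below is not Redis code: the names `poll_progress` and `nonblocking_demo` and the `saved=42` payload are invented to illustrate how a read on an O_NONBLOCK pipe returns immediately whether or not data is waiting.

```python
import os


def poll_progress(read_fd: int):
    """Return pending bytes from a non-blocking pipe, or None if none are ready."""
    try:
        return os.read(read_fd, 4096)
    except BlockingIOError:  # EAGAIN: pipe is empty but writers still exist
        return None


def nonblocking_demo():
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)     # same effect as setting O_NONBLOCK

    before = poll_progress(read_fd)     # nothing written yet
    os.write(write_fd, b"saved=42")     # hypothetical progress report
    after = poll_progress(read_fd)
    os.close(write_fd)
    at_eof = poll_progress(read_fd)     # no writers left: EOF, not "empty"
    os.close(read_fd)
    return before, after, at_eof
```

An event loop would call something like `poll_progress` whenever the descriptor is reported readable, and treat an empty bytes result as "the child has finished".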

What happens if a process tries to read from a pipe that no one writes to (all write ends are closed)?

Shared Memory

**Shared Memory** is the fastest IPC mechanism. The idea is simple: two processes **map the same physical memory area** into their virtual address spaces. Writing to this area by one process is instantly visible to the other - without copying through the kernel.

Unix systems offer two APIs for shared memory:
1. **System V IPC** (`shmget()`, `shmat()`) - the older interface, keyed by integers (often derived with `ftok()`), still found in legacy systems
2. **POSIX shared memory** (`shm_open()`, `mmap()`) - a modern, file-oriented approach
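A sketch of the POSIX-style flow using Python's `multiprocessing.shared_memory` module, which on Unix is built on `shm_open()` and `mmap()`. The function name is made up, the payload is assumed to be non-empty, and the only synchronization is that the parent waits for the child to exit before reading.

```python
import os
from multiprocessing import shared_memory


def shared_memory_roundtrip(payload: bytes) -> bytes:
    """Child attaches to a named segment and writes; parent reads the bytes."""
    segment = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        pid = os.fork()
        if pid == 0:
            try:
                # Attach by name, as an unrelated process would.
                peer = shared_memory.SharedMemory(name=segment.name)
                peer.buf[:len(payload)] = payload
                peer.close()
            finally:
                os._exit(0)

        os.waitpid(pid, 0)  # crude synchronization: wait for the writer to finish
        # The segment may be rounded up to a page, so slice to the payload size.
        return bytes(segment.buf[:len(payload)])
    finally:
        segment.close()   # unmap from this process
        segment.unlink()  # remove the name so the memory can be freed
```

The data never passes through a kernel buffer: the child's store and the parent's load touch the same physical pages.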

**Performance:** shared memory avoids copying data between user space and kernel space. In a pipe or socket, data is copied twice: from process A to the kernel buffer, then from the kernel buffer to process B. In shared memory - zero copies.

**Problem: synchronization!** Shared memory offers speed but does not protect against race conditions. If two processes write to the same location simultaneously, the result is unpredictable. Access must be coordinated with **mutexes (pthread_mutex)** placed in the shared segment and initialized with the `PTHREAD_PROCESS_SHARED` attribute, or with **semaphores (sem_t)**.
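A minimal sketch of guarding a shared counter, using Python stand-ins rather than `pthread_mutex` directly: `multiprocessing.Value` places a C int in shared memory and pairs it with a lock that works across processes. The function names are invented, and the `fork` start method is assumed to be available (a Unix host).

```python
import multiprocessing


def _add(counter, repeats):
    for _ in range(repeats):
        # counter.value += 1 is a read-modify-write; the lock makes it atomic.
        with counter.get_lock():
            counter.value += 1


def locked_increment(workers: int, repeats: int) -> int:
    """Have several processes increment one shared integer under a lock."""
    ctx = multiprocessing.get_context("fork")  # assumption: Unix with fork
    counter = ctx.Value("i", 0)                # C int in shared memory + lock
    procs = [ctx.Process(target=_add, args=(counter, repeats))
             for _ in range(workers)]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    return counter.value
```

Without the `with counter.get_lock()` line, two workers can read the same old value and both write back old + 1, so increments are silently lost.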

PostgreSQL uses shared memory for buffer cache

PostgreSQL creates a large shared memory segment (gigabytes) for the **shared buffer pool** - a cache of data pages. All backend processes (serving clients) map this segment into their memory. When one process loads a page from disk into the buffer, it is instantly available to all others. This provides a huge performance boost compared to per-process caches.

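Redis replication buffer (a contrast case)

In Redis replication, when the master executes a write command, it appends the command to a **replication backlog** held in the master's own private memory and streams it to each replica over a socket connection (normally TCP). Replicas are separate server processes, usually on other hosts, so they cannot map the master's memory. This is a useful contrast: shared memory only works between processes on the same machine, and anything that must cross hosts has to go through sockets.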

Why is shared memory considered the fastest IPC mechanism but requires additional synchronization?

Message Queues

A **message queue** is a structured IPC mechanism where processes exchange **messages** (not just bytes). Each message carries data plus a tag: a type in System V queues, a priority in POSIX queues. The queue is stored in the kernel, and processes can read/write messages asynchronously, independently of each other.

POSIX message queues (`mq_open()`, `mq_send()`, `mq_receive()`) provide a richer interface than System V. They support **message priorities** and **asynchronous notifications** (via signals or threads).

**Differences between message queues and pipes:**
- **Structure:** a message queue transmits discrete messages, a pipe transmits a byte stream
- **Priorities:** a message queue supports priorities, a pipe is strictly FIFO
- **Blocking:** a message queue can operate in non-blocking mode (O_NONBLOCK), like a pipe, but also supports timeouts (`mq_timedreceive`)
- **Size:** a message queue is limited by the number and size of messages, a pipe by the capacity of its kernel buffer (64 KiB by default on Linux)

**Asynchronous notifications** are a powerful feature of POSIX message queues. With `mq_notify()`, a process can ask to be notified, by a signal or by a function run in a new thread, when a message arrives in an empty queue. This eliminates the need for active polling.

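systemd: structured logging and message-queue activation

In systemd, each service can send structured messages (with metadata: timestamp, priority, service name) to the central journal daemon (systemd-journald). Journald receives these messages over Unix datagram sockets (such as `/run/systemd/journal/socket`), not POSIX message queues, and processes them asynchronously so that services normally do not block on logging. systemd does support POSIX message queues elsewhere: a `.socket` unit with `ListenMessageQueue=` lets a service be started when a message arrives on a queue.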

Real-world: task dispatcher pattern

Classic pattern: **producer-consumer with priorities**. A web server receives HTTP requests, each wrapped in a message with priority (depending on URL: /api/critical - high, /api/batch - low) and placed in a message queue. Worker processes read from the queue - critical tasks are processed first, even if they arrived later.
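The ordering rule behind this pattern can be modeled in a few lines. The sketch below is an in-process model built on `heapq`, not a real POSIX message queue (Python's standard library has no `mq_open` binding). It reproduces the documented receive order: highest priority first, oldest first within one priority. The class name, the priority values 9 and 1, and the example URLs are all illustrative assumptions.

```python
import heapq
import itertools


class PriorityMailbox:
    """In-process model of POSIX mq ordering (not a kernel queue)."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()

    def send(self, body: str, priority: int) -> None:
        # heapq is a min-heap, so the priority is negated; the sequence
        # number keeps FIFO order among messages of equal priority.
        heapq.heappush(self._heap, (-priority, next(self._seq), body))

    def receive(self) -> str:
        _, _, body = heapq.heappop(self._heap)
        return body


def priority_for(url: str) -> int:
    # Hypothetical policy: critical endpoints outrank everything else.
    return 9 if url.startswith("/api/critical") else 1


def dispatch(urls):
    """Enqueue every request, then return them in the order workers would see."""
    box = PriorityMailbox()
    for url in urls:
        box.send(url, priority_for(url))
    return [box.receive() for _ in range(len(urls))]
```

With a real queue, `send` and `receive` would map to `mq_send()` and `mq_receive()`, and the producer and the workers would be separate processes.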

What is the key advantage of POSIX message queues over regular pipes for implementing a task queue with priorities?

Unix Domain Sockets

**Unix Domain Sockets** are the most flexible IPC mechanism. These are sockets (like for TCP/IP) but operating **locally, within one machine**. They support both stream transmission (SOCK_STREAM, analogous to TCP) and datagrams (SOCK_DGRAM, analogous to UDP), but without the overhead of the network stack.

Unix domain sockets are addressed through **files in the file system** (e.g., `/var/run/docker.sock`). This provides file permissions for access control: `chmod 600 my.sock` - only the owner can connect.
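A minimal sketch of a stream-mode Unix domain socket with owner-only permissions. The socket lives in a temporary directory rather than under `/var/run`, and the function name is made up. The client is a forked child for self-containment, but it finds the server only through the socket's path.

```python
import os
import socket
import stat
import tempfile


def unix_socket_roundtrip(message: bytes):
    """Return (bytes received by the server, permission bits of the socket file)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "my.sock")
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)          # creates the socket file
        os.chmod(path, 0o600)      # only the owner may connect
        server.listen(1)
        mode = stat.S_IMODE(os.stat(path).st_mode)

        pid = os.fork()
        if pid == 0:
            # Client: addresses the server by file-system path.
            try:
                server.close()
                with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                    client.connect(path)
                    client.sendall(message)
            finally:
                os._exit(0)

        conn, _ = server.accept()
        chunks = []
        with conn:
            while True:
                chunk = conn.recv(4096)
                if not chunk:      # client closed its side
                    break
                chunks.append(chunk)
        server.close()
        os.waitpid(pid, 0)
        return b"".join(chunks), mode
```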

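**Why are Unix sockets faster than TCP loopback (127.0.0.1)?**
- **No network stack:** no TCP/IP headers, checksums, routing, ACKs, or congestion control
- **Simpler data path:** the kernel copies data from the sender straight into the receiver's socket buffer, without building and parsing packets
- **Less per-message kernel work:** a shorter code path for each send and receive

Benchmark: Unix sockets are often cited as roughly 2x faster than TCP loopback for local IPC, though the ratio depends on message size and kernel version.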

**File descriptor passing** - a unique feature of Unix domain sockets. A process can pass an open file descriptor to another process through a socket. This allows, for example, a master process to accept connections and then "pass" the socket to a worker process for handling.
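A sketch of descriptor passing with `socket.send_fds()` and `socket.recv_fds()` (Python 3.9+, Unix only), which wrap the `SCM_RIGHTS` ancillary message. The function name is invented and the message is assumed to be small. The pipe is deliberately created after `fork()`, so the child cannot have inherited it and can only reach it through the descriptor it receives.

```python
import os
import socket


def pass_fd_roundtrip(message: bytes) -> bytes:
    """Send a pipe's write end to a child over a socket; child writes into it."""
    parent_sock, child_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    pid = os.fork()
    if pid == 0:
        try:
            parent_sock.close()
            # Receive up to one descriptor along with a few bytes of payload.
            _, fds, _, _ = socket.recv_fds(child_sock, 16, 1)
            os.write(fds[0], message)
            os.close(fds[0])
        finally:
            os._exit(0)

    child_sock.close()
    read_fd, write_fd = os.pipe()  # created after fork: unknown to the child
    socket.send_fds(parent_sock, [b"fd"], [write_fd])
    os.close(write_fd)             # the in-flight copy keeps the pipe open

    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    parent_sock.close()
    os.waitpid(pid, 0)
    return b"".join(chunks)
```

The number the child receives may differ from the sender's number: the kernel installs a new descriptor in the receiving process that refers to the same open file.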

Docker uses Unix socket for API

The Docker daemon listens on `/var/run/docker.sock` (a Unix domain socket). When you run `docker run`, the CLI connects to this socket and sends HTTP requests with JSON bodies over it. This is safer than TCP (no external access), faster (no network overhead), and easier to control via file permissions.

Nginx master-worker architecture

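The Nginx master process creates a listening socket (e.g., on port 80), then calls `fork()` to create worker processes. All workers **inherit** this socket FD. When a connection arrives, the kernel wakes a worker blocked on the socket, and Nginx can serialize acceptance with its `accept_mutex` setting. Alternatively, with the `reuseport` listen option (`SO_REUSEPORT`), each worker gets its own listening socket on the same port and the kernel distributes incoming connections among them. Neither strategy passes connection FDs from the master to the workers; FD passing over a Unix domain socket is a separate design.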
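The inherited-listener pattern can be sketched in miniature. This is not Nginx code: the function name is invented, a Unix domain socket in a temporary directory stands in for port 80 so that no privileges or network access are needed, and each worker handles exactly one connection and then exits.

```python
import os
import socket
import tempfile


def prefork_demo(workers: int = 2):
    """Fork workers that accept on one inherited listener; report who served."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "listen.sock")
        listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        listener.bind(path)
        listener.listen(workers)

        pids = []
        for _ in range(workers):
            pid = os.fork()
            if pid == 0:
                # Worker: the listening socket was inherited across fork().
                try:
                    conn, _ = listener.accept()
                    with conn:
                        conn.sendall(str(os.getpid()).encode())
                finally:
                    os._exit(0)
            pids.append(pid)

        served_by = []
        for _ in range(workers):
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)
                reply = b""
                while True:
                    chunk = client.recv(64)
                    if not chunk:
                        break
                    reply += chunk
                served_by.append(int(reply))

        listener.close()
        for pid in pids:
            os.waitpid(pid, 0)
        return sorted(pids), sorted(served_by)
```

The master never calls `accept()` itself; the kernel hands each incoming connection to whichever worker is waiting on the shared socket.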

X11 window system and Wayland

The X11 graphical server listens on `/tmp/.X11-unix/X0` (Unix socket). Each GUI application connects to this socket, sends drawing commands ("draw window", "update pixel"), and receives events (mouse clicks, key presses). Wayland uses a similar approach through Unix sockets for compositor-client communication.

**Misconception:** Shared memory is always the best choice for IPC because it's the fastest, so it should be used everywhere.

**Reality:** The choice of IPC mechanism depends on requirements: speed vs structure vs reliability. There is no universal solution.

Shared memory is indeed the fastest (zero-copy), but it requires complex synchronization (mutexes, semaphores) and is prone to race conditions. For simple pipelines (cat | grep), pipes are ideal. For structured tasks with priorities - message queues. For client-server architecture - Unix sockets (+ file permissions). For large volumes of data between closely related processes - shared memory. **Practical example:** PostgreSQL uses shared memory for the buffer pool (millions of accesses per second), but Unix sockets for client connections (structured protocol, authentication). The right IPC is a trade-off between performance, simplicity, reliability, and security.

What unique capability of Unix domain sockets distinguishes them from all other IPC mechanisms (pipes, shared memory, message queues)?

Key Ideas

**Pipes - simplicity and composition.** Unidirectional byte stream, FIFO order. Ideal for pipelines (cat | grep | wc). Named pipes (FIFO) work between any processes. Basic block of UNIX philosophy.
**Shared Memory - maximum speed.** One physical memory region mapped into several process address spaces. Zero-copy, but requires synchronization (mutexes, semaphores). For high-load systems on a single machine (PostgreSQL buffer pool).
**Message Queues - structured communication.** Transmission of discrete messages with priorities. POSIX mq supports asynchronous notifications. Producer-consumer pattern with task priorities.
**Unix Domain Sockets - flexibility and power.** Client-server architecture through the file system. File permissions for access control. Unique feature: passing file descriptors between processes. Often faster than TCP loopback (roughly 2x in typical benchmarks). Docker, X11, systemd - all use Unix sockets.

Questions for reflection

Designing a system for real-time video processing (low latency is critical): which IPC mechanism best suits transferring frames between the capture process and the encoder process? Why?
The Docker daemon can listen on a TCP port (tcp://0.0.0.0:2375) or a Unix socket (/var/run/docker.sock). In production, the Unix socket is almost always used. What security risks does the TCP option pose?
Chrome uses shared memory to transfer rendered bitmaps from the renderer process to the browser process (for display on the screen). Why can't pipes or sockets be used for this task? (Hint: data size, latency)
In what scenarios is a message queue with priorities better than a simple pipe? Provide an example of a system where this is critical.

Related lessons

os-03-threads — IPC addresses communication between processes, not threads
os-05-sync — Sync primitives are used inside IPC mechanisms
net-15-tcp-basics — Sockets are IPC over the network, same semantics
os-17-locks-advanced — Advanced lock mechanisms build on top of IPC primitives
net-54-rpc
net-55-message-queues
