Operating Systems
Signals in Unix
Every Ctrl+C that stops a program, every server that shuts down gracefully on SIGTERM, every "Segmentation fault" crash: signals are behind all of them. Signals are one of the oldest Unix mechanisms (dating back to the 1970s) and remain essential for system programming. A solid grasp of signals is part of what separates a junior from a senior backend engineer.
- **Graceful shutdown of services.** When systemd stops or restarts a service, it sends SIGTERM by default. A well-behaved server catches the signal, completes in-flight HTTP requests, closes connections, flushes logs, and exits. (nginx itself treats SIGTERM as a fast shutdown and reserves SIGQUIT for the graceful one.) Without such handling, active connections are cut off mid-request; behind a reverse proxy, clients typically see a 502 error.
- **Monitoring and health checks.** Kubernetes sends SIGTERM to a pod's containers during scale-down and waits for a grace period (30 seconds by default). If the process has not exited by then, SIGKILL follows. Production-ready applications handle SIGTERM so that long operations (DB transactions, network requests) complete cleanly.
- **Debugging and profiling.** GDB uses SIGTRAP for breakpoints. Sampling profilers such as gperftools use SIGPROF timer signals (perf, by contrast, samples through the kernel's perf_events interface rather than signals). strace intercepts syscalls via ptrace, which reports stops with SIGTRAP. Understanding signals is critical for debugging production issues.
Lesson Objectives
- Know the key signals: SIGTERM, SIGINT, SIGKILL, SIGSTOP, SIGSEGV, SIGCHLD
- Distinguish catchable (TERM, INT) from uncatchable (KILL, STOP)
- Understand signal-safe functions: the async-signal-safe list from man 7 signal-safety
- Apply sigaction (not signal); mask signals inside handlers
- Use real-time signals (SIGRTMIN-SIGRTMAX): queueing, payload, delivery guarantee
Signal Basics
**Signal** - a software interrupt sent to a process by the operating system or another process. It is a mechanism for asynchronous event notification: from user commands (Ctrl+C) to critical errors (segmentation fault).
Signal as a Telegram
Analogy: a process is an office worker. Regular data arrives by mail (pipes, sockets) and is read on the worker's own schedule. A **signal** is a courier who bursts into the office with an urgent message: "FIRE!", "THE BOSS DEMANDS A REPORT!", "LUNCH BREAK!". The worker **must** interrupt the current task and respond.
In Unix, there are about 30 standard signals. Each has a number and a symbolic name. The most important ones (numbers as on Linux x86 and ARM; some, such as SIGCHLD, differ on other architectures and systems):
- **SIGINT (2)** - Ctrl+C, keyboard interrupt
- **SIGTERM (15)** - polite request to terminate
- **SIGKILL (9)** - immediate kill (cannot be caught)
- **SIGSEGV (11)** - segmentation fault
- **SIGCHLD (17)** - child process terminated
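The numbers above can be checked on the machine at hand. The following is a minimal sketch using Python's standard `signal` module; the helper name `signal_numbers` is made up for this illustration, and a Unix-like system is assumed.

```python
import signal


def signal_numbers(names):
    """Map symbolic signal names to the numbers used on this platform."""
    return {name: int(getattr(signal, name)) for name in names}


if __name__ == "__main__":
    wanted = ["SIGINT", "SIGTERM", "SIGKILL", "SIGSEGV", "SIGCHLD"]
    for name, number in signal_numbers(wanted).items():
        # strsignal() returns the system's short description of the signal.
        print(f"{name:8} {number:2}  {signal.strsignal(number)}")
```

Looking names up at run time instead of hard-coding numbers keeps code portable, because only the symbolic names are standardized.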
**Key property of signals - asynchronicity.** A signal can arrive at ANY moment during program execution: in the middle of a function, during a system call, even between two processor instructions. This creates unique safety challenges.
Each signal has a **default action**:
1. **Term** - terminate the process
2. **Ign** - ignore
3. **Core** - terminate + create core dump (memory snapshot)
4. **Stop** - suspend the process
5. **Cont** - resume the suspended process
How the kill Command Works
The `kill` command does NOT kill processes by itself - it simply sends signals! By default, it sends SIGTERM (15), which the process can catch or ignore. SIGKILL (9) is the only signal that guarantees termination because it **cannot** be caught, blocked, or ignored: the kernel terminates the process without ever running its code.
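The difference between the two signals can be observed with a child process that ignores SIGTERM. This is only a sketch under stated assumptions: the function name `terminate_stubborn_child`, the inline child script, and the 0.5 second wait are all choices made for this illustration, not part of any standard tool.

```python
import signal
import subprocess
import sys

# Hypothetical child program: ignores SIGTERM, reports readiness, then sleeps.
CHILD_SOURCE = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(30)\n"
)


def terminate_stubborn_child():
    """Send SIGTERM, then SIGKILL; return (survived_sigterm, exit_status)."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE], stdout=subprocess.PIPE, text=True
    )
    child.stdout.readline()  # wait until the child is ignoring SIGTERM
    child.send_signal(signal.SIGTERM)
    try:
        child.wait(timeout=0.5)
        survived_sigterm = False
    except subprocess.TimeoutExpired:
        survived_sigterm = True  # the polite request was ignored
    child.send_signal(signal.SIGKILL)  # cannot be caught or ignored
    status = child.wait(timeout=5)
    child.stdout.close()
    return survived_sigterm, status
```

`subprocess` reports death by signal as a negative return code, so the status here is the negated SIGKILL number.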
Why does SIGKILL guarantee to kill a process, while SIGTERM does not?
Signal Handling
A process can change its behavior upon receiving a signal by setting a **signal handler** - a function that is called when a signal is received. This is the basis for graceful shutdown, handling Ctrl+C, reacting to timeouts.
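**Problem with signal():** it is a legacy interface whose behavior varies between systems (for example, whether the handler is reset to the default after the first delivery, and whether interrupted system calls are restarted). Modern code uses **sigaction()** - a more powerful and predictable API.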
**sigaction flags (sa_flags):**
- **SA_SIGINFO** - use the extended handler that receives a siginfo_t parameter
- **SA_RESTART** - automatically restart interrupted syscalls (read, write)
- **SA_NODEFER** - do not block the signal while its handler is running
- **SA_RESETHAND** - reset the handler to the default after the first call
Graceful Shutdown in Production
On `systemctl restart`, systemd by default sends SIGTERM to the service's main process. A server that shuts down gracefully catches the signal, **stops accepting new connections**, finishes processing current requests, closes log files, releases resources, and **only then** terminates. nginx implements exactly this sequence for SIGQUIT (it treats SIGTERM as a fast shutdown). Without graceful shutdown, active connections are interrupted with an error.
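The shutdown sequence above can be sketched as a handler that only sets a flag, plus a work loop that checks it. This is a minimal illustration, not a production server: `ShutdownFlag` and `run_jobs` are hypothetical names, and jobs are plain callables standing in for requests. Python runs its signal handlers in the main thread between bytecode instructions, so this sketch must run in the main thread.

```python
import signal


class ShutdownFlag:
    """Records that a termination signal arrived; the main loop does the rest."""

    def __init__(self, signums=(signal.SIGTERM, signal.SIGINT)):
        self.requested = False
        # Remember the previous handlers so they can be restored later.
        self._previous = {s: signal.signal(s, self._on_signal) for s in signums}

    def _on_signal(self, signum, frame):
        self.requested = True  # keep the handler tiny: only set a flag

    def restore(self):
        """Put back whatever handlers were installed before."""
        for signum, handler in self._previous.items():
            signal.signal(signum, handler)


def run_jobs(jobs, flag):
    """Run callables in order; stop taking new ones once shutdown is requested."""
    finished = []
    for job in jobs:
        if flag.requested:
            break  # stop accepting new work
        finished.append(job())  # the job already in flight always completes
    return finished
```

The design choice is that the handler never does cleanup itself; it only tells the loop to stop, and the loop finishes the current job before returning.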
**Signal masking** - temporary blocking of signals during the execution of a critical section. Blocked signals become **pending** and are delivered after unblocking.
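Blocking and delayed delivery can be observed directly. The sketch below assumes Linux and the main thread; `deliver_after_unblock` is a hypothetical helper, and Python's `signal.pthread_sigmask()` stands in for the sigprocmask() call a C program would use.

```python
import signal


def deliver_after_unblock(signum=signal.SIGUSR1):
    """Block a signal, send it, then unblock; return the observed order of events."""
    events = []
    previous = signal.signal(signum, lambda s, f: events.append("handler ran"))
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    try:
        signal.raise_signal(signum)  # arrives while blocked, so it stays pending
        if signum in signal.sigpending():
            events.append("pending")
        events.append("critical section done")
    finally:
        # Restoring the old mask unblocks the signal; the handler runs now.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
        signal.signal(signum, previous)
    return events
```

The handler entry appears last in the returned list, after the critical section, even though the signal was sent before it.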
Example: Configuration Reload
Nginx uses the **SIGHUP** signal to reload configuration without stopping the server. On SIGHUP the master process re-reads the configuration, starts new worker processes with it, and tells the old workers to exit once they have finished their current requests. Zero-downtime reload!
What happens after blocking SIGTERM with sigprocmask(), if the process then receives kill -TERM?
Signal-Safe Programming
**Async-signal-safety** is the main challenge of programming with signals. A signal handler can interrupt a program at ANY moment, even in the middle of a malloc() or printf() call. Calling non-reentrant code from the handler at that point causes race conditions and deadlocks.
Why printf() is Dangerous in a Signal Handler
Scenario: the main program calls printf(), which internally locks a mutex on stdout. At this moment, a signal arrives and the handler is called, which ALSO tries to lock the mutex on stdout via printf(). **Deadlock!** The handler waits for the mutex held by the main program, but the main program is suspended until the handler returns. The wait never ends.
**Async-signal-safe functions (allowed in a handler):**
- **I/O:** read(), write(), close(), pipe()
- **Process:** _exit(), fork(), kill(), sigaction()
- **Memory:** mmap() and munmap() are plain system calls on Linux, but they are not on the POSIX list, so do not rely on them in portable code (and NEVER malloc/free!)
- **Sync:** sem_post() (BUT NOT mutexes!)

Full list: `man 7 signal-safety`

**Prohibited:** printf, malloc, free, pthread_mutex_*, fopen, exit(), and most library functions
**Classic pattern: self-pipe trick.** The handler does not perform heavy work, but only **signals** the main loop through a pipe. The main loop processes the signal in a safe context (not in the handler).
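A minimal sketch of the self-pipe trick follows. It is written in Python for brevity, where handlers already run in a safer context than in C, but the structure mirrors the C pattern: the handler performs a single write(), and the loop does the real work. The name `wait_for_signals` and the `trigger` parameter are assumptions made for this example.

```python
import os
import select
import signal


def wait_for_signals(signum=signal.SIGUSR1, timeout=1.0, trigger=None):
    """Wait on a pipe for signal notifications; return the signals seen."""
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)
    os.set_blocking(write_fd, False)  # a handler must never block

    def handler(received, frame):
        try:
            os.write(write_fd, bytes([received]))  # one byte: the signal number
        except BlockingIOError:
            pass  # pipe full: a wake-up is already queued

    previous = signal.signal(signum, handler)
    try:
        if trigger is not None:
            trigger()  # stands in for an external `kill`
        # Main loop side: the pipe becomes readable once the handler has written.
        ready, _, _ = select.select([read_fd], [], [], timeout)
        if not ready:
            return []
        return [signal.Signals(number) for number in os.read(read_fd, 64)]
    finally:
        signal.signal(signum, previous)
        os.close(read_fd)
        os.close(write_fd)
```

In a real event loop the read end of the pipe would be registered alongside sockets and timers, so signal handling becomes just another readable descriptor.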
Linux signalfd - Modern Approach
Linux offers **signalfd()** - converts signals into a file descriptor! Signals fit into an event loop (epoll, select) alongside sockets and timers. No handlers - just read signals as data from a file.
**Reentrancy** - a key requirement for async-signal-safe functions. A function must work correctly even if a signal interrupts it partway through and the handler calls the same function again before the first call has finished.
Why are malloc() and free() prohibited in a signal handler?
Real-Time Signals
**Real-time signals (POSIX.1b)** - an extension of standard signals with guaranteed delivery and queuing. Standard signals (SIGINT, SIGTERM) are not queued: if 3 SIGTERMs are sent before the process handles the first, it may receive only 1. Real-time signals solve this problem.
**Differences between real-time signals and standard signals:**
- **Guaranteed delivery** - every sent signal will be delivered (within the queue limit)
- **Ordering** - signals are delivered in order of priority and sending time
- **Payload** - can transmit data (int or pointer) with the signal
- **Range:** SIGRTMIN to SIGRTMAX (34 to 64 on Linux with glibc) - 30+ additional signals; always refer to them as SIGRTMIN+n rather than by number
Use Case: Asynchronous I/O
POSIX AIO (aio_read(), aio_write()) can report completion through a real-time signal. The program fills in the request's sigevent with SIGEV_SIGNAL, a signal number such as SIGRTMIN, and a payload, typically a pointer to the aiocb structure. When the read completes, the signal arrives carrying that payload, so the program learns of the completion without polling or blocking.
**Priorities of real-time signals:** Lower number = higher priority. SIGRTMIN has the highest priority, SIGRTMAX - the lowest. If multiple signals are pending, high-priority ones are delivered first.
**Limitations of real-time signals:** The kernel limits the number of queued signals per user (RLIMIT_SIGPENDING; the default varies from system to system, with values around 16000 being one common example). If the queue is full, sigqueue() fails with EAGAIN. Check the limit via `ulimit -i`.
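The same limit can be read from inside a program. A small sketch, assuming Linux (where Python's `resource` module exposes `RLIMIT_SIGPENDING`); the function name is made up for this illustration.

```python
import resource


def pending_signal_limit():
    """Return the (soft, hard) RLIMIT_SIGPENDING pair for this process (Linux)."""
    return resource.getrlimit(resource.RLIMIT_SIGPENDING)


if __name__ == "__main__":
    soft, hard = pending_signal_limit()
    # resource.RLIM_INFINITY means "no limit".
    print(f"queued-signal limit: soft={soft}, hard={hard}")
```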
Real-time vs Standard Signals
- **Standard:** Send 100 SIGUSR1 in a row - the process will receive 1 or 2 (signals merge). Use for events where the count doesn't matter: "reload config".
- **Real-time:** Send 100 SIGRTMIN - the process will receive all 100 in the order sent. Use for tasks where each one matters: "process transaction 42", "complete request 101".
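The merging and queuing behavior can be measured on Linux with a short experiment. This is a sketch under stated assumptions: `count_deliveries` is a hypothetical helper, the signal is kept blocked and consumed synchronously with `signal.sigtimedwait()` so no handler is involved, and signals are sent with `signal.raise_signal()` rather than sigqueue(), so no payload is attached.

```python
import signal


def count_deliveries(signum, sends=3):
    """Send a blocked signal several times; count how many instances were kept."""
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    try:
        for _ in range(sends):
            signal.raise_signal(signum)  # stays pending because it is blocked
        received = 0
        # A zero timeout means "take one pending instance or return None".
        while signal.sigtimedwait({signum}, 0) is not None:
            received += 1
        return received
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


if __name__ == "__main__":
    print("SIGUSR1 :", count_deliveries(signal.SIGUSR1))   # standard: merged
    print("SIGRTMIN:", count_deliveries(signal.SIGRTMIN))  # real-time: queued
```

With three sends each, the standard signal is counted once because only a single pending bit exists, while the real-time signal is counted three times.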
**Misconception:** Signals are just functions that are called on an event, working like regular callbacks
**Reality:** Signals are asynchronous interrupts that can arrive at ANY moment and interrupt the program between any two instructions, creating unique safety challenges
A regular callback is called at a specific point in the program (event loop, after an operation completes). A signal handler is called asynchronously: it can interrupt the program in the middle of malloc(), while holding a mutex, between reading and writing a global variable. This leads to race conditions, deadlocks, and data corruption. That's why there is a strict list of async-signal-safe functions and printf/malloc/mutex are prohibited in handlers. Signals are a low-level IPC mechanism requiring a deep understanding of the kernel and concurrency.
What is the key advantage of real-time signals (sigqueue) over standard ones (kill)?
Key Ideas
- **Signals - asynchronous software interrupts.** They can arrive at any moment during program execution. The kernel delivers signals when transitioning from kernel mode to user mode. SIGKILL and SIGSTOP cannot be caught, blocked, or ignored: the kernel terminates (SIGKILL) or suspends (SIGSTOP) the process unconditionally.
- **Signal handling: signal() is outdated, use sigaction().** Modern API with predictable behavior, support for flags (SA_RESTART, SA_SIGINFO), signal masking. Graceful shutdown through handling SIGTERM/SIGHUP is standard practice in production systems.
- **Async-signal-safety - the main challenge.** Only async-signal-safe functions (write, _exit, sigaction) are allowed in a signal handler. Prohibited: printf, malloc, mutex. Self-pipe trick or signalfd() for safe handling in the main loop.
- **Real-time signals - guaranteed delivery + payload.** Standard signals merge (pending bit), real-time are queued. Can pass data (int/pointer), have priorities. Used in AIO, task scheduling, priority queues.
Related Topics
Signals - a fundamental IPC mechanism related to processes, synchronization, and system programming:
- Processes and IPC — Signals are one of the IPC mechanisms. Others: pipes, shared memory, message queues. Signals are the only asynchronous event delivery mechanism
- System Calls — kill(), sigaction(), sigprocmask(), signalfd() - system calls for working with signals. Understanding syscalls is critical for effective signal use
- Concurrency and Synchronization — Signal handlers - a form of asynchronous code execution, similar to interrupt handlers. Same issues: race conditions, reentrancy, atomicity
- File Descriptors and I/O — signalfd() converts signals into FD, integrating them into event-driven architectures (epoll/select). Self-pipe trick uses pipe() for safe handling
Questions for Reflection
- Why do modern asynchronous frameworks (Node.js, tokio, asyncio) rarely use signals directly, preferring signalfd/event loops?
- How should a graceful shutdown be designed for a microservice with active HTTP connections, DB transactions, and background jobs? Which signals to use?
- What is the fundamental difference between signals (push notification from the kernel) and polling (the program checks flags itself)? When is polling preferable?
Related Lessons
- os-13-ipc — Signals are one IPC mechanism: lightweight async notification
- os-02-processes — Process lifecycle must be understood before signals
- net-14-udp — UDP and signals: fire-and-forget, no delivery guarantee
- os-15-syscalls — Signals are handled via syscalls (sigaction)
- rt-18