Real-Time Systems
RTOS Architecture: kernel, scheduling, IPC, memory
An ABS module on a car must respond to wheel slip in under 10 milliseconds, every time, on the worst day of its life. A general-purpose Linux kernel cannot promise that. An RTOS can - because every scheduling decision, every interrupt handler, every IPC primitive is bounded by design.
- **FreeRTOS**: ships in over a billion devices, from smart light bulbs to industrial controllers; Amazon Web Services took over stewardship of the project in 2017 and now maintains it, including the FreeRTOS LTS releases.
- **QNX**: powers infotainment and ADAS in over 200 million vehicles, plus medical devices certified to IEC 62304 Class C.
- **VxWorks**: flew on the Mars rovers Spirit, Opportunity, Curiosity, and Perseverance, each relying on the kernel's priority-based preemptive scheduler for the entire mission lifecycle.
Historical context
The Apollo Guidance Computer, developed at the MIT Instrumentation Laboratory under Charles Stark Draper, was one of the first embedded computers to run a priority-based real-time executive; it first carried a crew on Apollo 7 in 1968. The flight software, written by the team led by Margaret Hamilton, scheduled jobs by priority so that guidance, navigation, and abort logic could share one processor, which is the idea modern RTOS kernels build on. During Apollo 11's lunar descent the computer became overloaded and raised the 1201 and 1202 program alarms; its restart logic shed lower-priority jobs, kept the critical ones running, and allowed the landing to continue.
RTOS Kernel
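An **RTOS (Real-Time Operating System)** kernel provides deterministic task scheduling with guaranteed worst-case response times. Unlike general-purpose OSes (Linux, macOS), which optimize average throughput, an RTOS is built to bound **worst-case latency**: every kernel operation has a known upper limit on how long it can take. That bound is what makes **worst-case execution time (WCET)** analysis of the application tasks meaningful. The preemptive scheduler ensures that a high-priority task preempts a running lower-priority task within a bounded interval, typically microseconds, of becoming ready.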
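The preemption rule described above can be sketched as a small host-side model. This is an illustration under stated assumptions, not the scheduler of any particular kernel: the names `ready_bitmap`, `mark_ready`, `mark_blocked`, `highest_ready`, and `should_preempt` are hypothetical, and the model assumes one ready bit per priority level with larger numbers meaning higher priority.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PRIORITIES 32   /* hypothetical: one ready bit per priority level */

/* Bit p is set when at least one task at priority p is ready to run. */
static uint32_t ready_bitmap;

void mark_ready(unsigned prio)
{
    assert(prio < NUM_PRIORITIES);
    ready_bitmap |= (UINT32_C(1) << prio);
}

void mark_blocked(unsigned prio)
{
    assert(prio < NUM_PRIORITIES);
    ready_bitmap &= ~(UINT32_C(1) << prio);
}

/* Returns the highest ready priority, or -1 if nothing is ready (idle).
 * The loop runs at most NUM_PRIORITIES times, so its cost is bounded
 * no matter how many tasks exist. */
int highest_ready(void)
{
    for (int p = NUM_PRIORITIES - 1; p >= 0; --p)
        if (ready_bitmap & (UINT32_C(1) << p))
            return p;
    return -1;
}

/* Called whenever a task becomes ready: preempt only if the newly
 * ready work outranks the task that is currently running. */
int should_preempt(int running_prio)
{
    return highest_ready() > running_prio;
}
```

The point of the sketch is that the selection cost depends on the number of priority levels, not on the number of tasks, which is one way a kernel can keep scheduling decisions bounded.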
**Hard vs soft vs firm real-time**: Hard RT means a missed deadline is a system failure (ABS brakes, aircraft fly-by-wire, pacemakers). Soft RT means occasional deadline misses degrade quality but are tolerable (video streaming, audio). Firm RT means a missed deadline renders the result useless but is not catastrophic (stock trading: a stale quote is worthless but not dangerous). Linux with the PREEMPT_RT patch typically achieves worst-case latencies of roughly 50-100 microseconds, which suits soft real-time work; a dedicated RTOS on a microcontroller typically achieves latencies below 10 microseconds and is designed so that the bound can be verified, as hard RT requires. Both figures depend on the hardware and configuration.
An ABS (anti-lock braking system) must respond to wheel slip in < 10ms. Which platform is appropriate?
Real-Time Scheduling
**Priority inversion** occurs when a high-priority task waits for a resource held by a low-priority task while medium-priority tasks, which do not need the resource, preempt the low-priority holder. The blocking can be unbounded: the high-priority task stays blocked for as long as medium-priority tasks keep the CPU, so it effectively runs at the lowest priority. **Priority inheritance** solves this: a task holding a mutex temporarily inherits the priority of its highest-priority waiter until it releases the mutex, so medium-priority tasks can no longer preempt it. Mars Pathfinder (1997) suffered priority inversion on a mutex that did not have inheritance enabled, which triggered watchdog resets of the whole system.
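The effect of inheritance on the scheduler's choice can be shown with a small single-threaded model. This is a sketch under stated assumptions, not how any real kernel implements mutexes: `task_t`, `mutex_t`, `mutex_take`, `mutex_give`, and `pick_next` are hypothetical names, blocking is modeled by clearing a `ready` flag, and the mutex is not handed over to a waiter on release.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *name;
    int base_prio;       /* priority assigned by the designer */
    int effective_prio;  /* may be raised while holding a mutex */
    int ready;           /* 1 if runnable, 0 if blocked */
} task_t;

typedef struct {
    task_t *owner;       /* NULL when the mutex is free */
    int inherit;         /* 1 = priority inheritance enabled */
} mutex_t;

/* Try to take the mutex. On contention the caller blocks and, if
 * inheritance is on, the owner is boosted to the caller's priority. */
int mutex_take(mutex_t *m, task_t *caller)
{
    if (m->owner == NULL) {
        m->owner = caller;
        return 1;
    }
    caller->ready = 0;
    if (m->inherit && m->owner->effective_prio < caller->effective_prio)
        m->owner->effective_prio = caller->effective_prio;
    return 0;
}

/* Release: the owner drops back to its base priority. */
void mutex_give(mutex_t *m)
{
    assert(m->owner != NULL);
    m->owner->effective_prio = m->owner->base_prio;
    m->owner = NULL;
}

/* Fixed-priority scheduler: the ready task with the highest
 * effective priority runs next. */
task_t *pick_next(task_t *tasks[], size_t n)
{
    task_t *best = NULL;
    for (size_t i = 0; i < n; ++i)
        if (tasks[i]->ready &&
            (best == NULL || tasks[i]->effective_prio > best->effective_prio))
            best = tasks[i];
    return best;
}
```

With tasks at priorities 10, 5, and 1, and the priority-1 task holding the mutex, `pick_next` selects the priority-5 task when `inherit` is 0 and the boosted mutex holder when `inherit` is 1, so the holder can finish and release the resource.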
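**Rate Monotonic Scheduling (RMS)**: for periodic tasks with fixed priorities, RMS assigns priority by rate (shorter period = higher priority). Liu and Layland (1973) proved that RMS is optimal among fixed-priority preemptive policies for independent periodic tasks whose deadlines equal their periods, and that n such tasks are schedulable if total CPU utilization U <= n(2^(1/n) - 1). As n approaches infinity, the bound approaches ln(2) ≈ 0.693, so any task set using at most 69.3% of the CPU is guaranteed to meet all deadlines. The test is sufficient, not necessary: a task set above the bound may still be schedulable, but showing that requires an exact analysis.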
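A worked instance of the utilization test may help. The task set is hypothetical and chosen only for easy arithmetic: three periodic tasks with execution times C_i of 1, 2, and 3 ms and periods T_i of 4, 10, and 20 ms. The symbols C_i and T_i are introduced here for illustration.

```latex
\begin{align*}
U &= \sum_{i=1}^{n} \frac{C_i}{T_i}
   = \frac{1}{4} + \frac{2}{10} + \frac{3}{20} = 0.60 \\
U_{\max}(3) &= 3\left(2^{1/3} - 1\right) \approx 3 \times 0.2599 \approx 0.780 \\
U &\le U_{\max}(3) \quad\Rightarrow\quad \text{schedulable under RMS}
\end{align*}
```

For comparison, the bound is 1.0 for one task and about 0.828 for two, and it decreases toward ln(2) ≈ 0.693 as n grows.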
Task H (high priority=10) waits for a mutex held by Task L (low priority=1). Task M (medium priority=5) is also ready to run. What happens without priority inheritance?
IPC: queues, semaphores, mutexes
RTOS IPC primitives:
- **Queue**: FIFO of fixed-size items for passing data between tasks or from ISRs to tasks (ISR writes, task reads).
- **Mutex** (with priority inheritance): exclusive access to a shared resource; taken and released by tasks, not by ISRs.
- **Binary semaphore**: signaling (ISR signals task).
- **Counting semaphore**: resource pool management.
- **Task notifications**: lightweight alternative to semaphores for simple signaling (no separate queue or semaphore object).
**ISR-safe API**: FreeRTOS provides separate ISR-safe versions of the IPC calls that may be used from interrupts, such as `xQueueSendFromISR`, `xSemaphoreGiveFromISR`, and `vTaskNotifyGiveFromISR`. These versions never block (an ISR must return quickly) and take a `pxHigherPriorityTaskWoken` parameter. If the call unblocks a task with a higher priority than the interrupted task, the API sets this flag to `pdTRUE`; the ISR then passes the flag to `portYIELD_FROM_ISR()` at its end, which requests a context switch so that the woken task runs as soon as the ISR exits.
An ISR receives data from UART and needs to pass it to a processing task. Which IPC primitive is correct?
Memory Protection and MPU
An RTOS with **MPU (Memory Protection Unit)** support runs each task inside restricted memory regions. A task can read and write only its own stack and the memory and peripheral registers explicitly granted to it. Writing to another task's stack raises an immediate memory fault (on ARM Cortex-M, a MemManage fault where that exception exists and is enabled, otherwise a HardFault). Without an MPU, memory corruption propagates silently: the system keeps running on corrupted data and misbehaves in ways that are hard to trace back to the faulty write.
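The per-task region check can be pictured as a table lookup. The following is a software model of the idea only; a real MPU performs this check in hardware on every access, and `region_t` and `access_allowed` are hypothetical names that do not correspond to any vendor's register layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uintptr_t base;      /* first address of the region */
    size_t    size;      /* length in bytes */
    bool      writable;  /* false = read-only */
} region_t;

/* True if [addr, addr + len) lies entirely inside one granted region
 * and the region permits the requested kind of access. */
bool access_allowed(const region_t *regions, size_t n,
                    uintptr_t addr, size_t len, bool is_write)
{
    assert(regions != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        const region_t *r = &regions[i];
        /* Written to avoid overflow in addr + len. */
        bool inside = addr >= r->base &&
                      len <= r->size &&
                      addr - r->base <= r->size - len;
        if (inside && (!is_write || r->writable))
            return true;
    }
    return false;   /* real hardware would raise a memory fault here */
}
```

A task whose table lists only its own stack and a read-only code region is denied a write to a neighboring task's stack, which is the point at which the hardware would fault instead of letting the corruption spread.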
**Stack overflow detection**: even without full MPU protection, FreeRTOS provides `uxTaskGetStackHighWaterMark()`, which returns the minimum amount of stack that has remained free since the task started, measured in words rather than bytes. A value near zero means the task has come close to overflowing its stack. `configCHECK_FOR_STACK_OVERFLOW` enables runtime checking at context switches: method 1 checks that the stack pointer is still within the task's stack, and method 2 additionally verifies that a known fill pattern at the end of the stack is intact. When a check fails, the kernel calls `vApplicationStackOverflowHook`.
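Both the high-water mark and the pattern check rest on the same trick: fill the stack with a known byte before the task runs, then look at how much of the pattern survives. The sketch below models that on a plain byte array. It is an illustration, not FreeRTOS source: `stack_init`, `stack_high_water_mark`, and `stack_overflowed` are hypothetical names, the fill value and guard size are assumptions of the sketch, and the stack is assumed to grow downward toward index 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FILL_BYTE   0xA5u  /* pattern chosen for this sketch */
#define GUARD_BYTES 16u    /* bytes checked next to the stack limit */

/* Fill the whole stack with the pattern before the task first runs. */
void stack_init(uint8_t *stack, size_t size)
{
    memset(stack, FILL_BYTE, size);
}

/* High-water mark: number of bytes at the low end that still hold the
 * pattern, i.e. the smallest amount of free stack seen so far. */
size_t stack_high_water_mark(const uint8_t *stack, size_t size)
{
    size_t untouched = 0;
    while (untouched < size && stack[untouched] == FILL_BYTE)
        ++untouched;
    return untouched;
}

/* Pattern check: the GUARD_BYTES next to the stack limit must be intact. */
int stack_overflowed(const uint8_t *stack, size_t size)
{
    assert(size >= GUARD_BYTES);
    for (size_t i = 0; i < GUARD_BYTES; ++i)
        if (stack[i] != FILL_BYTE)
            return 1;
    return 0;
}
```

The check is heuristic: a task that happens to write the fill value itself makes the stack look emptier than it is, and an overflow that jumps past the guard bytes without touching them goes unnoticed, which is why an MPU-backed stack limit is the stronger mechanism.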
A buggy task writes data to the wrong address, corrupting another task's stack. Without MPU and without stack overflow detection, what happens?
Key ideas
- **RTOS kernel**: preemptive priority scheduling designed for bounded worst-case response time, not average throughput.
- **Priority inversion** (Mars Pathfinder, 1997): solved by priority inheritance on mutexes, which is the default behavior of FreeRTOS mutexes.
- **Rate Monotonic Scheduling**: shorter period gets higher priority; schedulability bound for n tasks is n(2^(1/n)-1), approaching 69.3%.
- **IPC primitives**: queues for ISR-to-task data transfer, mutexes with PI for shared resources, task notifications for lightweight signals.
- **MPU protection**: per-task memory regions catch corruption immediately rather than letting it propagate silently.
Questions for reflection
- Linux with the PREEMPT_RT patch achieves worst-case latency on the order of 50 µs. Which industries can adopt it as their RTOS replacement, and which still need a dedicated kernel like FreeRTOS or VxWorks?
- Mars Pathfinder's priority inversion went undiagnosed until ground engineers reproduced it. What test methodology would catch a priority inversion bug before launch?
- The Rate Monotonic utilization bound guarantees deadlines only up to 69.3% CPU utilization, whereas EDF can schedule task sets up to 100% but is rarely used. What practical concerns make RMS the default in production RTOS deployments?
Related lessons
- rts-03 - Interrupt service routines and timing fundamentals are prerequisites for RTOS scheduling
- arvr-04 - VR rendering shares the periodic-deadline structure of RTOS tasks, but the stakes differ: missing the 11 ms frame deadline causes visible artifacts, whereas missing an ABS deadline is catastrophic
- ds-04-clocks - Lamport logical clocks and RTOS system ticks both solve the problem of ordering events in time without a reliable external clock
- devops-04 - Docker containers run on Linux, which offers soft real-time scheduling (SCHED_FIFO); an RTOS provides hard real-time guarantees that Linux cannot match
- rts-05 - Real-time networking (EtherCAT, TSN) and safety certification (ISO 26262, DO-178C) build on the RTOS fundamentals covered here
- rts-01
- rts-02
- os-04-scheduling
- emb-01
- emb-04
- opt-04
- os-01-intro