Real-Time Systems
Schedulability Analysis
1996. Boeing 777 - the first commercial airliner with no mechanical backup for flight control. 100+ real-time tasks, hard deadlines, zero margin for error. Mathematics developed 23 years before the first flight became the only way to prove safety for 400 passengers.
- Boeing 777 and DO-178B Level A - formal schedulability proof as a mandatory certification requirement, not an option
- Mars Exploration Rover - VxWorks with RMS, WCET analysis of every manipulator control task before launch
- Medtronic pacemakers - utilization deliberately held below 50% as a safety margin buffer
- Tesla Autopilot - RMS for the safety-critical layer, EDF for perception and planning
- TCAS (Traffic Collision Avoidance System) - EDF for maximum utilization in critical alert scenarios
RMS and DMS: priorities from mathematics
1996. Boeing 777 enters production. The first commercial aircraft fully certified to DO-178B: fly-by-wire with no mechanical backup. The pilot moves the stick - a signal reaches the computer - the computer moves the control surfaces. 100+ real-time tasks, each with its own period and deadline. If even one flap-control task misses its 20 ms deadline, the system loses surface authority. The question is not 'does it usually make it' but 'does it always make it under any combination of load'. The mathematics that answered that question was published 23 years before the first flight.
Liu and Layland, 1973. 'Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment.' Two foundational results: an optimal static priority algorithm and an analytical schedulability bound. Aircraft fly, rovers drive, pacemakers beat on these two results.
**Rate Monotonic Scheduling (RMS)** - the rule is stark in its simplicity: the task with the higher frequency gets the higher priority. A task with a 10 ms period always outranks a task with a 50 ms period. Statically. Forever. No exceptions.
RMS is not just an algorithm - it is an optimality theorem: if any task set is schedulable by any fixed-priority algorithm, RMS will also schedule it. Among all static-priority algorithms, RMS dominates. No stronger static scheduler exists.
**Deadline Monotonic Scheduling (DMS)** closes the gap where deadline differs from period ($D_i < T_i$). An antenna control system: period 50 ms (the antenna moves slowly), but critical telemetry must arrive within 15 ms. DMS assigns priority by deadline, not by period. The task with a 15 ms deadline outranks one with a 50 ms deadline. DMS is proven optimal among all static-priority algorithms for the case $D_i \leq T_i$.
RMS is a special case of DMS: when $D_i = T_i$ for all tasks, both algorithms yield identical priorities. Because most real-time tasks have deadline equal to period, RMS appears more often in literature and textbooks.
Three tasks with periods 5 ms, 15 ms, and 30 ms are scheduled under RMS. What priority does the 15 ms task receive?
Utilization Bound: the Liu-Layland threshold
Liu and Layland delivered more than an algorithm - they derived a formula that answers 'is this task set schedulable?' with no simulation at all. The utilization of each task: $U_i = C_i / T_i$, where $C_i$ is the Worst Case Execution Time and $T_i$ is the period. Sum across all tasks. If the total $U$ stays below the threshold, the system is schedulable under RMS with a mathematical guarantee.
As $n \to \infty$, the right-hand side converges to $\ln 2 \approx 0.693$. The famous 69.3% - the absolute lower bound on the RMS utilization threshold for any number of tasks. Any system with $U \leq 0.693$ is schedulable under RMS regardless of task count.
The critical trap: the condition is **sufficient, not necessary**. If $U > n(2^{1/n}-1)$, that is not a verdict. The system may still be schedulable - it requires Response Time Analysis. But if $U \leq$ the threshold, the guarantee is absolute. Boeing 777 cleared DO-178B Level A certification not because engineers ran 10,000 load scenarios. Because they computed utilization, ran RTA, and mathematically proved schedulability for every critical partition. Testing does not give a guarantee. Analysis does.
The Liu-Layland formula assumes independent periodic tasks with no shared resources. With mutexes and shared state, blocking time must be factored in. Without that correction, the system may miss a deadline even at U < 0.693.
A system of 3 tasks has total utilization U = 0.82. The Liu-Layland threshold for n=3 is 0.780. What can be concluded?
EDF: the dynamic optimum
RMS leaves roughly 30% of the CPU unused when task count grows large. That is the price of static priorities. An algorithm exists without that price - and it is also proven optimal.
**Earliest Deadline First (EDF)** - at every instant, the processor goes to the task with the nearest absolute deadline. Priorities are not fixed: at every event (task activation, completion, interrupt) the scheduler recomputes the priority of every ready task. This is a dynamic algorithm.
The schedulability threshold for EDF is exactly 100%. If total utilization does not exceed one, EDF guarantees all deadlines. No algorithm can operate above 100% - that is a physical ceiling. EDF reaches it.
So why has aviation not switched to EDF? The cost is overload behavior. At $U > 1$ (transient overload from a hardware fault or anomalous WCET), RMS degrades predictably: low-priority tasks suffer, high-priority tasks survive. The flight control system keeps running, just with reduced non-critical features. EDF under overload degrades chaotically: any task can miss its deadline. In safety-critical systems, controlled degradation is worth more than theoretical utilization optimality.
Tesla Autopilot uses a hybrid: steering and braking tasks get static priorities (RMS-style), perception and path planning use dynamic scheduling. Two layers with different guarantees for different safety levels. This architecture keeps CPU utilization above 90% while preserving hard deadline guarantees for safety-critical work.
Response Time Analysis is the tool when the utilization test is inconclusive. For each task $i$, solve iteratively: $R_i^{(k+1)} = C_i + \sum_{j \in hp(i)} \lceil R_i^{(k)} / T_j \rceil C_j$, where $hp(i)$ is the set of higher-priority tasks. Iterate until convergence. If $R_i \leq D_i$ for every task, the system is schedulable. RTA is more expensive than the utilization test but more precise: it accepts schedulable systems that the utilization test rejects.
A system of 5 tasks has total utilization U = 0.95. Which algorithm guarantees all deadlines are met?
Key ideas
- RMS: static priorities by frequency ($1/T_i$) - optimal among all fixed-priority algorithms
- DMS: extension for $D_i < T_i$ - priority by deadline, not by period
- Utilization bound: $U \leq n(2^{1/n}-1) \to \ln 2 \approx 69.3\%$ - sufficient (not necessary) condition
- EDF: dynamic priorities by nearest deadline, schedulable at $U \leq 100\%$ - theoretical optimum
- RTA ($R_i = C_i + \sum_{j \in hp(i)} \lceil R_i/T_j \rceil C_j$) - precise analysis when the utilization test is insufficient
- In safety-critical systems, predictable degradation under RMS overload outweighs EDF's utilization optimality
Related topics
Schedulability analysis is the central tool of RTS engineering, connected to the scheduler, synchronization primitives, and system architecture:
- Priority Inversion — The next problem after proving schedulability - high-priority task blocked through a mutex
- RTOS Architecture — The RTOS scheduler implements RMS/EDF - theory meets code
- OS CPU Scheduling — A contrasting example: throughput optimization vs deadline guarantees
Вопросы для размышления
- DO-178B requires formal proof of schedulability, not testing. Why does 'we ran 10,000 tests without a single deadline miss' fail to satisfy the certification requirement for Level A avionics software?
Связанные уроки
- rts-04 — RTOS scheduler architecture is the foundation for schedulability analysis
- rts-07 — Priority Inversion - the next problem after proving schedulability
- os-04-scheduling — OS scheduling optimizes throughput; RTS scheduling proves deadline guarantees
- emb-05 — FreeRTOS implements RMS priorities - theory meets production code
- par-01 — Task parallelism and resource contention - adjacent context for interference analysis
- os-01-intro