Real-Time Systems
Task Scheduling: Rate Monotonic
Mars Pathfinder 1997: the rover started rebooting on Mars due to priority inversion. Toyota 2009: a recall of 4 million vehicles connected to RT scheduling problems. DO-178C requires mathematical schedulability proof for every aircraft. Formal task scheduling is not an academic discipline - it is an engineering tool with clear real-world consequences.
- **AUTOSAR (automotive):** the standard uses RM scheduling; schedulability is proved at design time
- **Mars Pathfinder (1997):** priority inversion nearly destroyed the rover; priority inheritance saved the mission
- **Avionics (DO-178C):** formal schedulability proof is a mandatory certification requirement
Предварительные знания
Liu & Layland and the birth of RT scheduling theory
The paper 'Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment' (1973) was written at MIT with NASA support. The authors were interested in onboard systems - tasks with hard deadlines on a single processor. The paper laid the mathematical foundation of RT scheduling: the utilization bound concept, the optimality proof for RM, schedulability analysis. Before publication, engineers used heuristics. After - a formal apparatus emerged that is today mandatory in automotive and aviation.
Rate Monotonic Scheduling
Toyota, 2009. Sudden acceleration in the Prius led to a recall of 4 million vehicles. One root cause hypothesis: the throttle control task missed its deadline due to incorrect priority assignment. In every modern car, 50-100 tasks run on one microcontroller: ABS (every 5 ms), engine control (every 20 ms), display update (every 100 ms). One core. **Rate Monotonic** offers a compact answer: the shorter the period, the higher the priority.
**Rate Monotonic** is optimal among algorithms with **fixed priorities**. Liu & Layland theorem (1973): if a task set is schedulable by any fixed-priority algorithm, it is also schedulable by RM. Priorities are assigned once and never change.
Liu & Layland, 1973
The paper 'Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment' is one of the most cited works in computer science. The authors proved the optimality of RM and provided a simple schedulability test. Nearly all modern RTOS systems use RM as their baseline algorithm. The paper was written at MIT with NASA support - the authors were interested in onboard systems for Apollo.
Tasks: T1(period=10, exec=3), T2(period=20, exec=5). U = 3/10 + 5/20 = 0.55. Bound for n=2: 0.828. Schedulable?
Earliest Deadline First
RM wastes up to 30% of the CPU (bound ~0.69 for many tasks). Can this be improved? **EDF** (Earliest Deadline First) is a dynamic algorithm: whoever has the nearest deadline runs next. EDF is **optimal** for uniprocessor systems: if a task set is schedulable at all, EDF will find a valid schedule. This is a theorem, not a heuristic.
| Property | Rate Monotonic (RM) | Earliest Deadline First (EDF) |
|---|---|---|
| Priorities | Static (fixed) | Dynamic (change over time) |
| Optimality | Among fixed-priority | Absolute for uniprocessor |
| Utilization bound | ~0.69 (ln 2) as n->inf | 1.0 (100% CPU!) |
| Implementation | Simple (priorities set once) | More complex (recomputed each tick) |
| Under overload (U>1) | Predictable: low priority misses | Unpredictable: domino effect |
| Usage | RTOS: FreeRTOS, VxWorks | Research, Linux SCHED_DEADLINE |
**EDF is optimal but dangerous under overload.** If U > 1, RM predictably drops low-priority tasks while high-priority ones keep running. EDF under overload can miss the deadline of **any** task - the domino effect. This is why safety-critical systems often prefer RM.
Linux supports EDF via SCHED_DEADLINE (since kernel 3.14). A task declares its runtime, deadline, and period, and the kernel guarantees CPU allocation. In industrial RTOS (VxWorks, QNX), fixed-priority scheduling dominates - predictability matters more than optimality.
U = 0.85. RM bound for 5 tasks = 0.743. Which scheduler guarantees schedulability?
Priority Inversion
July 4, 1997. The Mars Pathfinder rover started rebooting spontaneously on Mars. The cause: **priority inversion** - a low-priority task was blocking a high-priority task through a shared mutex. One of the most famous bugs in software engineering history. Fixed remotely from Earth - at a distance of 191 million kilometers.
Mars Pathfinder Bug (1997)
On Pathfinder, a low-priority meteorological task held the data bus mutex. A high-priority bus management task could not acquire the mutex. The watchdog timer detected the delay and rebooted the computer. JPL engineers fixed the bug remotely by enabling priority inheritance via a command sent from Earth. The bug existed in VxWorks but was not caught during testing - the timing scenario was not covered.
**Priority Inheritance** is reactive: it raises the priority when a conflict occurs. **Priority Ceiling** is proactive: it prevents the conflict in advance. Ceiling is harder to configure but guarantees freedom from deadlock and limits blocking time to a single critical section.
HIGH(priority 1) is waiting for a mutex held by LOW(priority 3). MED(priority 2) is ready to run. With priority inheritance, who runs?
Schedulability Test
Aviation standard DO-178C (Boeing, Airbus, all civil aircraft). Section 6.4: every task in an RT system must have a formal schedulability proof. Not 'testing showed' - a formal mathematical proof. Without it, no FAA certification. Without certification, no flights. **Schedulability analysis** is a mathematical proof that all deadlines will be met under any conditions.
| Test | Type | Complexity | Accuracy |
|---|---|---|---|
| Utilization bound (RM) | Sufficient | O(n) | Conservative (may reject schedulable sets) |
| Response Time Analysis (RM) | Exact (necessary & sufficient) | Pseudo-polynomial | Exact for RM |
| Utilization test (EDF) | Necessary & sufficient | O(n) | Exact for periodic tasks |
| Processor Demand Analysis (EDF) | Exact for sporadic | Exponential | Exact for EDF |
**Utilization bound** is fast but conservative: it may reject a schedulable task set. **Response Time Analysis** is exact: it computes the worst-case response time for each task. If R_i <= D_i, the task is schedulable. RTA is the standard method in automotive (AUTOSAR).
In certified systems (DO-178C for aviation, ISO 26262 for automotive), schedulability analysis is a **mandatory** step. Engineers must formally prove that all tasks meet their deadlines under worst-case conditions.
Schedulability tests assume an **ideal processor**: no context switch overhead, no cache misses, no interrupt jitter. In practice, margins must be added for these effects. Typical practice: U < 0.7 for hard RT, leaving a buffer for the unexpected.
EDF is always better than RM - it is optimal and uses 100% of the CPU
EDF is optimal for schedulability, but behaves unpredictably under overload (U > 1). RM is simpler, more predictable, and safer.
EDF is schedulable up to U <= 1.0 vs. RM's ~0.69 - EDF wins on utilization. But under brief overload, RM predictably sacrifices low-priority tasks, while EDF can miss the deadline of ANY task (domino effect). In safety-critical systems, predictable failure is more valuable than peak utilization. This is why VxWorks, QNX, and AUTOSAR use fixed-priority (RM) rather than EDF.
Three tasks: T1(5,1), T2(10,3), T3(20,4). U=0.20+0.30+0.20=0.70. RM bound for 3 = 0.780. Guaranteed schedulable?
Key Ideas
- **RM:** static priorities by period; optimal among fixed-priority; bound ~ln(2) approx 0.693
- **EDF:** dynamic priorities by deadline; schedulable up to U <= 1.0; optimal for uniprocessor
- **Priority Inversion:** LOW blocks HIGH through a mutex; solution: priority inheritance or priority ceiling
- **Schedulability analysis:** formal proof of deadline compliance; mandatory for certification
Related Topics
RT task scheduling is closely tied to execution time analysis:
- What Are RT Systems — Hard/soft/firm classification defines scheduling requirements
- WCET Analysis — Without knowing worst-case execution time, schedulability analysis is impossible
- Synchronization — Mutexes in RT systems require priority inheritance/ceiling protocols
Вопросы для размышления
- Why does automotive prefer RM (non-optimal) over EDF (optimal)?
- How did NASA engineers fix the priority inversion bug on Pathfinder from 191 million km away?
- If U = 0.70 and RM guarantees schedulability, should a margin be added? Why?
Связанные уроки
- rts-01 — Hard/soft/firm classification defines scheduling requirements
- rts-03 — WCET analysis - without it schedulability analysis is impossible
- par-03 — Mutexes in RT systems require priority inheritance/ceiling protocols
- emb-01 — Embedded microcontroller firmware is the primary deployment environment for RT schedulers
- os-04-scheduling — OS scheduling is the general case - RT scheduling adds hard deadline guarantees
- alg-01-big-o — Scheduling algorithms are a specialized class of resource distribution optimization