Real-Time Systems

Formal Methods for RT

Pentium FDIV bug, 1994: a mathematician was computing prime numbers and got a wrong result from the CPU. Intel spent USD 475 million replacing processors. After that, Intel applied formal verification. The question: can the correctness of a program be proven mathematically, not just by tests?

  • **Amazon AWS:** TLA+ is used to verify DynamoDB, S3, and EBS protocols. In 2014, Amazon found 10 critical bugs including potential data loss that was unreachable by ordinary testing.
  • **Airbus:** UPPAAL is applied to verify parts of flight management software. Timed Automata allow proving compliance with timing requirements without exhaustive testing of all scenarios.
  • **seL4 microkernel:** The first and only OS with a complete mathematical proof of implementation correctness. Used in Boeing military UAVs. 200000 lines of Isabelle proof for 8700 lines of C.

Model Checking: exhaustive state space exploration

In 1994, Intel shipped the Pentium with a bug in the floating-point division unit. A mathematician discovered it while computing prime numbers and got a wrong result. Intel's loss: USD 475 million and reputation. Intel subsequently applied formal verification to successive CPUs. Model checking is a method where the system is represented as a finite automaton and properties are expressed as temporal logic formulas. The algorithm automatically checks all reachable states - not running tests, but mathematically traversing the transition graph.

CTL (Computation Tree Logic) temporal logic expresses properties like: AG(request -> AF response) - 'always: if request, then eventually a response will come'. LTL (Linear Temporal Logic) works with paths: G(p -> F q). For real-time systems, TCTL (Timed CTL) is critical: formulas can contain timing constraints: AF<=10ms(response). A model checker produces a counterexample when a property is violated.

What does a model checker produce when a verified property is violated?

Timed Automata: time as a first-class citizen

Ordinary finite automata have no notion of how much time passes between transitions. For real-time systems this is catastrophic: 'the watchdog must fire within 10 ms' or 'the task must respond within 5 ms' cannot be expressed. Timed Automata, proposed by Alur and Dill in 1994, solve this by adding clocks - continuously increasing variables. Transitions can check clock values (guard: x < 10) and reset them (reset: x := 0). Locations can have invariants: x <= 10, which forbid the system from remaining in a state longer than allowed.

Timed Automata are verifiable: despite continuous time, the state space can be finitely partitioned into regions - equivalence classes by behavior. Two time points are equivalent if they are indistinguishable from the automaton's perspective. The number of regions grows exponentially with the number of clocks and numeric constants in guards. In practice, tools use more efficient zones (DBM - Difference Bound Matrices).

Why do Timed Automata need invariants on locations in addition to guards on transitions?

UPPAAL: practical verification tool

UPPAAL is a joint development of Uppsala University and Aalborg University, one of the most widely used tools for real-time system verification. It implements an extension of Timed Automata: multiple automata run in parallel, synchronizing through channels (!/?). The entire system is a network of automata. Swarms of satellites, ABS protocols, medical devices, avionics - all have been verified in UPPAAL. Airbus uses UPPAAL for verifying parts of its flight management software.

UPPAAL supports three modes: Simulator (step-by-step execution), Verifier (model checking with CTL formulas), Tracer (counterexample replay). Query language: A[] - for all paths always, E<> - there exists a path where eventually, A<> - for all paths eventually, E[] - there exists a path where always. Plus TCTL extensions: A[] (Clock <= deadline).

What does the query A[] not deadlock mean in UPPAAL?

Verification in practice: industry applications

Formal methods were long considered the domain of academics. Amazon AWS broke this stereotype: in 2014 the company published a paper on using TLA+ to verify distributed protocols (DynamoDB, S3, EBS). Engineers found 10 critical bugs, including potential data loss under a specific failure sequence. Without formal specification these scenarios would have passed all tests. For real-time systems, formal verification is not an academic exercise - it is a tool that replaces thousands of hours of testing.

Practical limitations: state space explosion limits model checking for large systems. Solutions: abstraction (remove irrelevant details), compositional verification (verify components separately), bounded model checking (check limited depth). For DO-178C DAL-A, formal methods are recognized as an alternative verification method that reduces some test coverage requirements.

Formal methods are only for academic tasks; in real projects they are too slow and expensive

Amazon, Airbus, and Microsoft apply formal methods in production. TLA+ found bugs in S3 and EBS; UPPAAL is used in avionics systems

The cost of formal verification is higher than ordinary testing. But the cost of an undetected bug in a safety-critical system is incomparably higher. Amazon calculated that TLA+ pays for itself on the first data-loss bug it finds.

Why is the seL4 microkernel considered a landmark achievement in software verification?

Key ideas

  • **Model checking** automatically traverses all system states and checks CTL/LTL formulas. On violation it produces a counterexample - a concrete execution path.
  • **Timed Automata** extend automata with clocks and invariants, enabling the expression and verification of timing requirements. UPPAAL implements this formalism for parallel systems.
  • **Formal methods in industry** are used by Amazon (TLA+), Airbus (UPPAAL), Microsoft (P language). DO-178C recognizes them as an alternative to some test coverage requirements for DAL-A.

Related topics

Formal methods intersect with safety standard requirements and task scheduling practice:

  • Safety Standards DO-178C, ISO 26262 — DO-178C DAL-A recognizes formal verification as an alternative method that reduces test coverage requirements
  • Schedulers and WCET — WCET analysis is a special case of formal verification: proving an upper bound on task execution time

Вопросы для размышления

  • Model checking suffers from state space explosion. How does compositional verification address this problem, and what assumptions must be made to apply it?
  • TLA+ allowed Amazon to find bugs in S3 and EBS that testing missed. What class of errors is hardest to find with tests but straightforward with model checking?
  • seL4 has 200000 lines of proof for 8700 lines of code. What happens to the proof when one line of implementation changes, and how does this affect maintainability?

Связанные уроки

  • fl-33-formal-verification
Formal Methods for RT

0

1

Sign In