Programming Language Theory
Ownership and lifetimes
In 2019 Microsoft published a statistic: 70% of security vulnerabilities in Windows over the previous 12 years were memory safety bugs. The same 70% in Chromium, in Android. C and C++ remained the systems-programming standard for decades, and the industry learned to treat use-after-free like the weather. Rust appeared with a thesis: these bugs are not inevitable - they can be eliminated through the type system, without a garbage collector and without losing performance.
- **Linux kernel**: accepts Rust drivers since version 6.1 (2022) - the first language added to the kernel in 30 years
- **Android**: since 2021 new native code is written in Rust; memory-safety vulnerabilities in native code dropped from 76% to 24%
- **Microsoft**: porting critical Windows components to Rust - DWriteCore, GDI - to eliminate memory bugs as a class
- **Cloudflare**: the Pingora proxy in Rust serves >1 trillion requests per day - replaced nginx with no runtime overhead
Linear types and move semantics
Jean-Yves Girard formulated linear logic in 1987: a resource cannot be silently duplicated or discarded - it must be consumed exactly once. Three decades later this idea became the foundation of Rust's type system. Every value has a single owner; assignment does not copy - it **moves** ownership. After a move the old variable becomes invalid and the compiler refuses to reference it. Drop at the end of scope is deterministic: memory is released at the same program point as in C, but without manual free calls.
A move is not a runtime operation but a static check. The number of bits actually copied during `let b = a` is the same as during a copy: the stack-slot bytes are bitwise duplicated. The difference is that the compiler marks `a` as consumed and no longer permits access to it. This is a zero-cost abstraction: the check lives in the type checker and produces no machine code.
Trait `Copy` marks types whose move can be replaced by a bitwise duplication. It can be auto-derived (`#[derive(Copy, Clone)]`) only when every field is itself Copy. Any type that owns heap memory (String, Vec, Box) cannot implement Copy - doing so would break the single-ownership invariant and cause a double-free at drop time.
Why does `String` not implement trait `Copy` while `i32` does?
Borrow checker: shared and exclusive references
Move is too rigid for everyday code: a function that only reads a value should not seize it. Rust adds **borrowing** - temporary references that do not transfer ownership. There are two flavours: `&T` (shared, immutable) and `&mut T` (exclusive, mutable). The central invariant of the borrow checker: at every program point, for every value, there exist either any number of `&T` or exactly one `&mut T` - never both. This invariant, known as aliasing XOR mutability, formally rules out data races at compile time.
Since 2018 the borrow checker uses **non-lexical lifetimes** (NLL): a reference's lifetime is no longer tied to its enclosing `{}` block but to its last use in the control-flow graph. Before NLL, code had to be reshuffled with artificial scopes; after NLL the compiler infers that a reference is no longer needed and permits new borrows to begin.
Aliasing XOR mutability is not merely a pragmatic rule; it is an invariant that validates a whole class of optimizations. The compiler may assume that data behind `&T` does not change under it, and that data behind `&mut T` is observed by no one else. This is equivalent to the C `restrict` keyword applied to every Rust program for free.
What does 'aliasing XOR mutability' prevent in a single-threaded program (no concurrency)?
Lifetimes and elision
Every reference carries a **lifetime** - the region of code over which it is guaranteed to be valid. A lifetime is an ordinary type parameter (like a generic), denoted with an apostrophe: `&'a T` reads as 'reference with lifetime 'a to T'. The compiler requires that the lifetime of a reference never exceed the lifetime of the data it points to. Returning a reference to a local variable is rejected: the local is destroyed at return, while the reference would outlive it - a dangling pointer.
Most lifetimes are inferred automatically by **lifetime elision** rules: (1) each reference parameter gets its own lifetime; (2) if there is a single input lifetime, it is propagated to every output reference; (3) if `&self` is present, its lifetime is propagated to the output. Explicit annotation is required only when elision is ambiguous - typically when returning a reference picked among several inputs.
A lifetime is not a runtime concept; nothing is allocated or freed under the hood. It is only a marker for the type checker. After monomorphization every lifetime parameter is erased and nothing about them remains in machine code. The cost of the system is zero bytes and zero cycles.
Why does the function `fn longest<'a>(x: &'a str, y: &'a str) -> &'a str` use a single lifetime for both parameters and the result?
Box, Rc, Arc and shared ownership
Single ownership describes ownership trees well, but real data structures are sometimes graphs. Rust handles this with **smart pointers** - types that own heap-allocated data and customise their cleanup via the Drop trait. `Box<T>` is the simplest: single owner, heap allocation, zero overhead compared to C++'s new. `Rc<T>` adds a non-atomic reference count; `clone()` increments it, the last drop releases the data. `Arc<T>` is the same with an atomic counter, which makes it usable across threads.
Rc/Arc provide shared ownership but still forbid mutation through a shared reference. Controlled mutation is enabled by **interior mutability** - types `Cell<T>`, `RefCell<T>` (single-threaded), `Mutex<T>`, `RwLock<T>` (thread-safe). RefCell defers the borrow check to runtime: an attempt to obtain &mut while & is live panics. This is a trade-off: flexibility in exchange for losing static guarantees.
Rc/Arc cycles cause leaks: the count never drops to zero. The remedy is **Weak** - a non-owning reference that does not contribute to the count and may become dangling without undefined behaviour (`Weak::upgrade` returns Option). The canonical tree pattern has parents holding Rc to children and children holding Weak back to the parent.
Ownership makes Rust unsuitable for graphs and cyclic data structures
Graphs are implemented via indices into a Vec (arena allocation) or via Rc<RefCell<T>> + Weak. The standard library and crates such as petgraph use these patterns extensively.
Single ownership is formally incompatible with arbitrary graphs, but the problem moves to the collection layer. Arena pattern gives O(1) indexed access, type safety, and avoids cyclic references - the arena is the single owner.
What is the practical difference between `Rc<T>` and `Arc<T>`?
Key ideas
- **Linear types**: every value has a single owner; assignment is a move, not a copy; drop is deterministic at scope exit
- **Borrow checker**: aliasing XOR mutability - at every point either any number of &T or exactly one &mut T; data races and iterator invalidation are impossible
- **Lifetimes**: a reference's lifetime is a type parameter checked statically; nothing remains at runtime, the system costs zero
- **Shared ownership**: Rc/Arc give reference counting; interior mutability via RefCell/Mutex; cycles are broken by Weak references
Related topics
Rust ownership is not an isolated feature but the intersection of several theoretical strands:
- Subtyping — Lifetimes form a subtype lattice: 'static <: 'a for every 'a. Lifetime-parameter variance (covariant, contravariant, invariant) applies the same rules as ordinary types
- Effect systems — Move semantics is a special case of linear typing; full effect systems in Koka and Frank generalise the idea to tracking arbitrary effects
- Memory management — RAII in C++ was the source of the idea; the difference is that C++ does not check the validity of shared references statically while Rust does
Вопросы для размышления
- If 70% of Windows vulnerabilities are memory safety, why did the industry accept that as normal for 30 years instead of switching to GC languages?
- Lifetimes are a static notion with no runtime footprint. What other runtime checks could be moved into the type system following the same principle?
- The borrow checker turns the design question of data ownership into a mandatory part of the architecture. Where does that help, and where does it become an obstacle?
Связанные уроки
- plt-10-linear-types — Ownership is a practical implementation of linear types
- plt-12-subtyping — Lifetime variance builds on subtyping rules
- os-07-memory — OS memory management is the same problem, but at runtime
- db-13-transactions — ACID transactions resolve the same access conflicts as borrow checker