Programming Language Theory
Garbage Collection
Every language has to decide: who frees memory? In C/C++, the programmer does (source of 70% of CVEs). Rust: the compile-time borrow checker. Java/Go/Python: a garbage collector. Each choice has a price: productivity, performance, pauses.
- **Go GC**: concurrent tri-color mark with sub-millisecond STW. Discord migrated from Python to Go for predictable latency: GC pauses were Python's biggest problem at 5M connections
- **Java ZGC**: Google uses ZGC in the Android runtime for smooth 120 fps. Sub-millisecond GC pauses are critical for gaming
- **Python ARC**: CPython uses reference counting plus a cyclic GC for cycles. This is why the GIL exists: thread-safe RC increment/decrement without a mutex requires the GIL
Mark-Sweep GC
Mark-sweep is the basic GC algorithm and has two phases. Mark: traverse the object graph from GC roots (stack, globals, registers) and mark every live object. Sweep: walk the entire heap and free every unmarked object.
Mark-sweep suffers from heap fragmentation. Mark-compact addresses this: after the sweep, live objects are compacted to the start of the heap. The cost: an extra pass and updating every reference.
What are GC roots, and why does traversal start from them?
Generational GC
The generational hypothesis: most objects die young. Generational GC splits the heap into generations. The young generation is small and collected often (minor GC). The old generation is large and collected rarely (major GC). The JVM, .NET, and Python all use generational GC.
Write barriers are the key mechanism: when a reference from Old to Young is written, the GC records it in the remembered set. Otherwise a minor GC would miss Young objects that are only reachable from Old.
Why is generational GC faster than simple mark-sweep?
Concurrent GC
Concurrent GC runs alongside the program, shortening pause times. The catch: the program mutates the object graph while marking. Tri-color marking (white/gray/black) plus write barriers solves this. ZGC (Java), Shenandoah, and the Go GC are examples.
Why does concurrent GC need write barriers?
Reference Counting
Reference counting (RC): each object holds a reference count. Adding a reference increments it; removing one decrements it. At zero, the object is freed. Deterministic release, no GC pauses. Python, Swift, and Rust's Arc<T> all use RC.
GC pauses are Java's main weakness. The JVM is unsuitable for real-time systems
ZGC (Java 15+) has sub-millisecond pauses regardless of heap size. Modern JVMs are usable for soft real-time
ZGC, Shenandoah, and G1 are the product of 15 years of work. The problem is not GC pauses but JVM tuning and choosing the right collector
Why can't plain reference counting collect cyclic structures?
Takeaways
- **Mark-sweep**: traverse from roots to mark, sweep the heap to free unmarked. Stop-the-world. Fragmentation handled by compaction
- **Generational**: young (frequent, fast) + old (rare, expensive). Write barriers maintain the remembered set. JVM/CLR/Python
- **Concurrent GC**: tri-color marking in parallel with the program. Write barriers preserve the invariant. ZGC: <1 ms on 16 TB
- **Reference counting**: deterministic release, no pauses, but cycles leak. Weak references break cycles. Swift ARC, Rust Arc
Related Topics
GC is closely tied to the runtime and the language:
- Memory models — Concurrent GC requires careful work with the memory model for write barriers
- WebAssembly — The Wasm GC proposal adds GC to WebAssembly, affecting compilation of managed languages to Wasm
Вопросы для размышления
- Rust uses the borrow checker instead of a GC and guarantees memory safety with no runtime overhead. Why don't all languages adopt this approach?
- Python's GIL exists in part because of its non-thread-safe reference counting. How does Py-nogil (PEP 703) plan to solve this without breaking compatibility?
- ZGC achieves sub-millisecond pauses via colored pointers and load barriers. What is the program's throughput cost for this?