Speculative optimizations
JavaScript `x + y` can run as a single machine ADD instruction, or as a chain of roughly 20 instructions of type checks. The difference is what the JIT has observed so far. Speculative optimizations are bets: 'every x seen so far was a number, so I emit fast code for numbers'. If the bet is wrong, the runtime rolls back to a slower, general path. V8 makes such bets constantly, and that is what lets JIT-compiled JavaScript approach, and on some workloads beat, naive C++.
- **V8 in Chrome**: Google Docs with 10,000 cells in a table feels smooth partly thanks to type specialization. Arithmetic on values the profile has only seen as small integers can compile to a single native instruction plus an overflow check
- **HotSpot JVM**: Elasticsearch under peak load handles 50k+ requests per second partly because C2 optimizes virtual calls through monomorphic guards
- **PyPy**: numeric loops written in pure Python can speed up 20-50x through type specialization in PyPy's tracing JIT (generated by the RPython toolchain), with no source changes
Type specialization
Type specialization is the generation of native code for the specific argument types observed in a profile. V8 sees that `add(x, y)` is always called with Smi numbers (Small Integers) and emits a single machine ADD, protected by Smi and overflow checks, instead of the generic path with its full chain of type checks. For arithmetic-heavy code the speedup can be on the order of 3-10x.
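The idea can be sketched as a toy model in Python. This is not how V8 is implemented: `ProfiledAdd`, `HOT_THRESHOLD`, and the `Deopt` exception are hypothetical names invented for illustration, and Python's `int` merely stands in for a Smi.

```python
from collections import Counter


class Deopt(Exception):
    """Raised when a speculative assumption fails (hypothetical name)."""


def generic_add(x, y):
    """Slow path: stands in for the generic '+' with its type checks."""
    if isinstance(x, str) or isinstance(y, str):
        return str(x) + str(y)
    return x + y


def make_int_add():
    """Specialized version: one type guard, then the bare addition."""
    def int_add(x, y):
        if type(x) is not int or type(y) is not int:
            raise Deopt("expected two ints")
        return x + y
    return int_add


class ProfiledAdd:
    HOT_THRESHOLD = 3  # assumption: real engines use their own budgets

    def __init__(self):
        self.seen = Counter()  # type feedback: (type(x), type(y)) -> count
        self.calls = 0
        self.fast = None       # specialized code, once the call is hot

    def __call__(self, x, y):
        if self.fast is not None:
            try:
                return self.fast(x, y)
            except Deopt:
                self.fast = None  # discard the specialized code
        # Generic path: record feedback, then maybe specialize.
        self.seen[(type(x), type(y))] += 1
        self.calls += 1
        if self.calls >= self.HOT_THRESHOLD and set(self.seen) == {(int, int)}:
            self.fast = make_int_add()
        return generic_add(x, y)
```

Once a non-integer argument has been recorded in the feedback, this sketch never specializes for integers again. That is a deliberate simplification of how real engines widen their assumptions.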
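HotSpot JVM specializes virtual calls. If profiling shows that `interface.method()` almost always reaches a single implementation (strictly, a monomorphic call site has exactly one observed receiver type), C2 inlines that target behind one type guard. A bimorphic site has two receiver types, and C2 can still inline both behind guards. With three or more receiver types and no dominant one, the site is usually called megamorphic and falls back to vtable or itable dispatch. V8 uses the same idea in its inline caches, which move from monomorphic to polymorphic to megamorphic as more object shapes appear at a site.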
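A minimal sketch of how a call site's state can change with the receiver types it sees. The `CallSite` class and the `MAX_INLINE = 2` cutoff are assumptions for illustration, not HotSpot or V8 internals. The label "megamorphic" is used here for the state with three or more receiver types.

```python
import math


class Square:
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Triangle:
    def __init__(self, base, height):
        self.base = base
        self.height = height

    def area(self):
        return 0.5 * self.base * self.height


class CallSite:
    """Toy inline cache keyed on the receiver's class (hypothetical)."""
    MAX_INLINE = 2  # assumption: only mono- and bimorphic sites are cached

    def __init__(self, method_name):
        self.method_name = method_name
        self.cache = {}           # receiver class -> resolved function
        self.megamorphic = False

    @property
    def state(self):
        if self.megamorphic:
            return "megamorphic"
        return ("uninitialized", "monomorphic", "bimorphic")[len(self.cache)]

    def call(self, receiver, *args):
        cls = type(receiver)
        target = self.cache.get(cls)
        if target is None:
            # Cache miss: a full lookup stands in for table dispatch.
            target = getattr(cls, self.method_name)
            if not self.megamorphic:
                if len(self.cache) < self.MAX_INLINE:
                    self.cache[cls] = target
                else:
                    self.megamorphic = True
                    self.cache.clear()
        return target(receiver, *args)
```

Calling `site.call(...)` with only `Square` receivers keeps the site monomorphic. Adding a `Circle` makes it bimorphic, and a third class pushes it to the slow lookup on every call.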
What does 'monomorphic call site' mean in JIT specialization?
Deoptimization
Deoptimization (bailout) is the rollback from optimized native code to the interpreter when a speculative assumption breaks. V8 must rebuild the interpreter's stack frame and register values from the optimized frame, in the format the Ignition interpreter expects. The transition is expensive compared with ordinary execution, and afterwards the function runs as slower bytecode until it is hot enough to be recompiled, so the JIT tries hard to avoid deoptimizations.
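The state reconstruction can be illustrated with a toy model. In this sketch the optimized code keeps its live values in local variables and, when a guard fails, hands the interpreter a frame it can resume from. The names `Deopt`, `optimized_sum`, `interpret_sum`, and the frame layout are all hypothetical.

```python
class Deopt(Exception):
    """Carries the interpreter-visible state at the bailout point."""

    def __init__(self, frame):
        super().__init__("guard failed")
        self.frame = frame


def interpret_sum(items, i=0, total=0):
    """Generic path: works for any numeric items, can resume mid-loop."""
    while i < len(items):
        total = total + items[i]
        i += 1
    return total


def optimized_sum(items):
    """Speculates that every item is an int."""
    i, total = 0, 0  # live values, held in 'registers'
    while i < len(items):
        x = items[i]
        if type(x) is not int:  # guard
            # Materialize the state the interpreter needs to continue.
            raise Deopt({"i": i, "total": total})
        total += x
        i += 1
    return total


def run(items):
    try:
        return optimized_sum(items)
    except Deopt as bailout:
        # Resume in the interpreter exactly where optimized code stopped.
        return interpret_sum(items, **bailout.frame)
```

The result of `run` matches what `interpret_sum` alone would return, because the frame captures every live value at the bailout point and no work is repeated or lost.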
V8 tracks how often a function deoptimizes. If a function keeps deoptimizing past a certain threshold, the engine can give up on optimizing it and leave it in the lower tiers. This protects against hot optimization-deoptimization loops (optimization bailout loops). Running `node` with the `--allow-natives-syntax` flag and calling `%GetOptimizationStatus` lets you diagnose this in development.
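A hedged sketch of the counter logic follows. `TieredFunction`, `MAX_DEOPTS`, and the eager "specialize for the current argument type" policy are assumptions made to keep the example short. Real engines warm up before optimizing and use their own thresholds.

```python
class Deopt(Exception):
    """Raised by specialized code when its type guard fails."""


def double(x):
    """Generic implementation."""
    return x * 2


def specialize_double(expected_type):
    """Build a fast version guarded on one argument type."""
    def fast(x):
        if type(x) is not expected_type:
            raise Deopt()
        return x * 2
    return fast


class TieredFunction:
    MAX_DEOPTS = 3  # hypothetical threshold

    def __init__(self, generic, specialize):
        self.generic = generic
        self.specialize = specialize
        self.optimized = None
        self.deopt_count = 0
        self.never_optimize = False

    def __call__(self, x):
        if self.optimized is None and not self.never_optimize:
            # Simplification: optimize immediately for the current type.
            self.optimized = self.specialize(type(x))
        if self.optimized is not None:
            try:
                return self.optimized(x)
            except Deopt:
                self.optimized = None
                self.deopt_count += 1
                if self.deopt_count >= self.MAX_DEOPTS:
                    self.never_optimize = True  # stop the bailout loop
        return self.generic(x)
```

Alternating `int` and `float` arguments makes this function deoptimize on every second call. After the third bailout it stays on the generic path.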
Why is deoptimization expensive?
Guards
A guard is a check inside optimized native code that verifies a speculative assumption still holds. If a guard fails, the runtime deoptimizes. TurboFan places a guard in front of each operation that relies on a speculated type, unless an earlier check already proves the assumption. The compiler's job is to minimize the number of guards and hoist them out of hot loops.
Loop-invariant code motion (LICM) is an important optimization for guards. If a guard does not depend on anything that changes inside the loop, the compiler can hoist it before the loop, so the hot loop body has no type checks. For example, a loop over a numeric array can check the array's element type once before entering, provided nothing in the loop body can change it.
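The effect of hoisting can be shown by counting guard executions. This is a sketch under stated assumptions: Python's `array` with typecode `"d"` stands in for a typed numeric array, and the two function names and the `checks` counter are illustrative only.

```python
from array import array


def is_double_array(values):
    """The guard: the elements are known to be unboxed doubles."""
    return isinstance(values, array) and values.typecode == "d"


def sum_guard_in_loop(values):
    """Naive code: the same invariant guard runs on every iteration."""
    checks = 0
    total = 0.0
    for i in range(len(values)):
        checks += 1
        if not is_double_array(values):
            raise TypeError("guard failed: deoptimize")
        total += values[i]
    return total, checks


def sum_guard_hoisted(values):
    """After LICM: the guard runs once, the loop body is check-free."""
    checks = 1
    if not is_double_array(values):
        raise TypeError("guard failed: deoptimize")
    total = 0.0
    for i in range(len(values)):
        total += values[i]
    return total, checks
```

Hoisting is only valid because the loop body cannot change what `values` is. If the body could replace or retype the array, the guard would have to stay inside the loop.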
Why does a JIT compiler apply Loop Invariant Code Motion to guards?
On-Stack Replacement (OSR)
On-Stack Replacement swaps a function for an optimized version while it is running, without waiting for the next call. The classic scenario: a function with a long loop starts executing in the interpreter, but the loop is so hot that the JIT wants to optimize it right away. OSR builds a new stack frame whose state matches the optimized code and 'jumps' into it.
HotSpot JVM has supported OSR since the J2SE 1.3 era (2000-2001). Without OSR, a method stuck in one long-running loop, such as a long initialization loop on a Java server, would stay interpreted until its next invocation. GraalVM Truffle uses OSR for language runtimes: Python/Ruby/JS methods with hot loops get OSR optimization without restarting the method.
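The OSR transition can be modeled as handing the live loop state to a second entry point. In this sketch `OSR_THRESHOLD`, `interpreted`, and `optimized_loop` are hypothetical names, and the `trace` list exists only to make the switch observable.

```python
OSR_THRESHOLD = 1000  # assumption: back-edge budget before OSR triggers


def optimized_loop(i, total, n):
    """OSR entry point: continues the loop from the given live state."""
    while i < n:
        total += i * i
        i += 1
    return total


def interpreted(n, trace):
    """Starts in the 'interpreter' and switches mid-loop once it is hot."""
    i, total, back_edges = 0, 0, 0
    while i < n:
        total += i * i
        i += 1
        back_edges += 1
        if back_edges >= OSR_THRESHOLD:
            # Transfer the live values (i, total) into the optimized frame.
            trace.append(("osr", i))
            return optimized_loop(i, total, n)
    return total
```

A short loop finishes in the interpreter. A long one switches after `OSR_THRESHOLD` iterations and returns the same result, because the optimized entry point starts from exactly the state the interpreter had reached.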
Misconception: speculative optimizations are dangerous because the program might return a wrong result during deoptimization.
In fact: deoptimization preserves semantics. The JIT must restore the exact program state and continue execution in the interpreter as if the optimized code had never run.
Correctness is a JIT invariant. Deoptimization is a rollback to a slower but correct path; if that were not so, browsers and JVMs would be unreliable. Barring engine bugs, deoptimization is a performance problem, not a correctness problem.
In which scenario is OSR especially important?
Key ideas
- Type specialization emits native code for the specific types observed in the profile. A monomorphic call site allows inlining without dispatch
- Guards are cheap checks in native code. LICM hoists them out of hot loops. A guard failure triggers deoptimization
- OSR (On-Stack Replacement) swaps a running function for the optimized version mid-execution. Critical for functions with long loops
Related topics
Speculative optimizations are a core mechanism of JIT compilation. They tie into profiling and security:
- JIT basics: speculative optimizations apply in method JIT and tiered compilation
- GraalVM: the Graal compiler combines aggressive speculative optimizations with Partial Escape Analysis
- Memory management: escape analysis lets the compiler keep non-escaping objects in registers or on the stack instead of allocating them on the heap
Questions for reflection
- Spectre and Meltdown (2018) exploited speculative execution in CPUs. Are there analogous security risks in JIT compilers with their speculative optimizations?
- If a function is first called with integers and then with floats, V8 deoptimizes and widens its type feedback for that operation (from Smi to general number). How can you rewrite the code so each call site keeps seeing a single type and stays specialized?
- OSR swaps the stack frame while the loop is running. Which invariants must the JIT guarantee during the OSR transition so it does not break correctness?