Node.js Internals

V8: JavaScript engine

Why can the same JavaScript code run an order of magnitude faster or slower depending on how it is written? The answer lies in V8, an engine that **does not just interpret code**: it starts in a bytecode interpreter and compiles frequently executed code into native CPU instructions. However, V8 is not magic: it makes aggressive assumptions about code structure (types, object shapes). Breaking these assumptions collapses performance. This topic is about how V8 works under the hood and how to write code that V8 **loves**.

  • **Spreadsheet with millions of cells (think Google Sheets):** V8 optimizes hot functions (formula recalculation) through TurboFan, turning them into machine code. If the data types in the cells are stable (all numbers), recalculation can approach native speed. If the types are mixed (numbers + strings), bailouts occur → formulas slow down.
  • **Real-time chat backend (think Discord):** Imagine a Node.js service that processes a very high volume of messages. Hidden Classes are critical: all message objects have the same structure (`{ id, userId, content, timestamp }`). This allows V8 to use inline caching, which makes property access much faster. If the structure changes (dynamic fields are added), the inline caches miss → throughput drops.
  • **Server-side rendering (think Netflix):** Node.js renders HTML on the server. GC is critically important: if the heap grows uncontrollably, Major GC pauses (50-100 ms) kill latency. Typical countermeasures are object pooling (object reuse) and tuning `--max-old-space-size`. The goal: a stable, low 99th percentile latency.

V8 Architecture

V8 is not just a JavaScript interpreter. It is a highly optimized engine that **compiles** JS into native machine code at runtime. Written in C++, V8 is used in Chrome, Node.js, Deno, and Electron. Its goal: to make a dynamic language fast enough that the gap to C++ becomes imperceptible for many tasks.

**Why is V8 special?** Before V8 (before 2008), JavaScript was slow. Browsers mostly interpreted the code step by step. Google created V8 for Chrome with one goal: to make web applications (Gmail, Google Maps) as fast as desktop applications. V8 was among the first engines to ship **JIT compilation** (Just-In-Time) for JavaScript, turning code into native processor instructions.

**Key difference:** V8 does not rely on interpretation alone. Ignition executes bytecode first, and code that runs often is compiled into x64/ARM64 machine code, which the CPU executes directly. For hot code this can make V8 dramatically faster than a traditional interpreter.

**V8 Components:**

- **Parser** - turns JavaScript code into an **AST** (Abstract Syntax Tree). V8 uses lazy parsing: functions that are not immediately called are parsed in a simplified mode (pre-parsing) to save time.
- **Ignition** (bytecode interpreter) - compiles the AST into **bytecode** and executes it. Bytecode is a compact intermediate representation (not machine code, but no longer JS). Ignition starts the code quickly while collecting feedback: which functions are called frequently (hot functions) and which types they see.
- **TurboFan** (optimizing compiler) - takes hot functions and compiles them into **highly optimized machine code**. TurboFan applies aggressive optimizations such as inlining, dead code elimination, and escape analysis, guided by the feedback that inline caches collected. But if its assumptions are not met (for example, a function suddenly receives an object of a different type), TurboFan performs **deoptimization** (bailout) - execution returns to bytecode.
- **Orinoco GC** (Garbage Collector) - a parallel and incremental garbage collector. It does much of its work in background threads, minimizing stop-the-world pauses.

Why V8 is faster than interpreters

```javascript
// Simple addition function
function add(a, b) {
  return a + b;
}

// Call a million times
for (let i = 0; i < 1_000_000; i++) {
  add(i, i + 1);
}
```

**Old interpreter:** evaluates `a + b` generically on every call: checks the types, then dispatches to the matching addition operation.

**V8 (TurboFan):** once the function is hot and the collected feedback shows that `a` and `b` are always numbers, compiles it to machine code along these lines:

```asm
; x64 assembly (simplified; real output also contains type and overflow checks)
mov rax, [rbp-8]    ; load a
add rax, [rbp-16]   ; add b
ret                 ; return result
```

Result: hot functions can run **one to two orders of magnitude faster** than in a simple interpreter.

**Why two tiers (Ignition + TurboFan)?** It is a trade-off between startup speed and execution speed. Ignition (the interpreter) provides a quick start, TurboFan (the optimizing compiler) provides maximum performance for long-lived code. Previously, V8 used Full-codegen (basic compiler) + Crankshaft (optimizer), but in 2017 it switched to Ignition + TurboFan, which saved memory and simplified the architecture. Newer V8 versions add intermediate tiers (Sparkplug, Maglev) between the two, but the principle is the same.

**History of V8:** Launched in 2008 along with Chrome. Created by the team of Lars Bak (who previously worked on virtual machines at Sun Microsystems). In 2009, Ryan Dahl used V8 as the basis for Node.js. Since then, V8 has shipped new versions in step with Chrome's release schedule (originally every 6 weeks). Each Node.js release (for example, Node 20) uses a specific version of V8.

**V8 vs JavaScriptCore vs SpiderMonkey:** V8 (Chrome/Node.js), JavaScriptCore (Safari), and SpiderMonkey (Firefox) differ in their JIT pipelines and heuristics, so code tuned for V8 may run slower in other engines. The core ideas are shared, though: all three describe object layouts with an internal structure (Maps or hidden classes in V8, Structures in JavaScriptCore, Shapes in SpiderMonkey) and use inline caching on top of it. General principles (avoiding type changes, using stable object structures) therefore work everywhere.

What is the main role of Ignition in the V8 pipeline?

JIT Compilation

**JIT (Just-In-Time) compilation** is the heart of V8's performance. Instead of executing JavaScript line by line (like an interpreter) or compiling all the code in advance (like AOT in C++), V8 does both: **it compiles code during execution**, adapting to actual usage.

**How the JIT pipeline works:**

1. **Quick Start (Ignition):** The code is compiled into bytecode and immediately begins execution. This is faster than generating optimized machine code.
2. **Profiling:** During bytecode execution, V8 collects statistics - **argument types**, call frequency, branch behavior (which `if`/`else` branch is executed more often).
3. **Optimization (TurboFan):** When a function becomes **hot**, TurboFan compiles it into native code with aggressive optimizations. For example, if a function always receives numbers, TurboFan generates code specialized for numbers (guarded by cheap type checks instead of fully generic handling).
4. **Deoptimization (Bailout):** If TurboFan's assumptions turn out to be incorrect (for example, the function suddenly receives a string instead of a number), a **bailout** occurs - V8 reverts to Ignition bytecode.
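The four steps above can be sketched as a toy model. This is not how V8 decides anything internally: the threshold `HOT_THRESHOLD`, the helper `makeTiered`, and the use of `typeof` as "type feedback" are all assumptions made for illustration.

```javascript
// Toy model of tiering: profile -> optimize -> deoptimize.
// HOT_THRESHOLD and the typeof-based feedback are illustrative assumptions,
// not V8's real heuristics.
const HOT_THRESHOLD = 3;

function makeTiered(fn) {
  const state = { tier: "bytecode", calls: 0, seenTypes: new Set(), deopts: 0 };

  function tiered(...args) {
    const signature = args.map((a) => typeof a).join(",");

    if (state.tier === "optimized" && !state.seenTypes.has(signature)) {
      // Assumption violated: fall back to the baseline tier (a "bailout").
      state.tier = "bytecode";
      state.calls = 0;
      state.deopts += 1;
    }

    // Profiling: record type feedback and call count.
    state.seenTypes.add(signature);
    state.calls += 1;

    if (state.tier === "bytecode" && state.calls >= HOT_THRESHOLD) {
      state.tier = "optimized"; // the function became "hot"
    }
    return fn(...args);
  }

  tiered.state = state;
  return tiered;
}

const add = makeTiered((a, b) => a + b);
add(1, 2);
add(3, 4);
add(5, 6);
console.log(add.state.tier); // "optimized"

add("7", 8); // a type signature the "optimized" code has never seen
console.log(add.state.tier, add.state.deopts); // "bytecode" 1
```

Note that after the bailout the model keeps the wider feedback, so a later re-optimization covers both signatures. That mirrors the idea that deoptimization is a one-time cost per new assumption, not a cost per call.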

**Inline Caching (IC)** - a key JIT technique. V8 caches the results of operations (e.g., object property access). On access to `obj.x`, V8 remembers: "object of type Shape1, property x offset = 8 bytes". The next access to `obj.x` is a direct read from memory (offset +8), without a name lookup. But if `obj` changes structure (a property is added), the IC is invalidated.
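To make the caching idea concrete, here is a minimal sketch of a single access site with a one-entry cache. It only imitates the mechanism: `shapeOf` (the ordered key list) stands in for a hidden class, and `makeLoadSite` is a hypothetical helper, not a V8 API.

```javascript
// Toy model of a monomorphic inline cache for one access site (`obj.<prop>`).
// shapeOf() is a stand-in for a hidden class: here, the ordered list of keys.
function shapeOf(obj) {
  return Object.keys(obj).join(",");
}

function makeLoadSite(prop) {
  const cache = { shape: null, index: -1 };
  const stats = { hits: 0, misses: 0 };

  function load(obj) {
    const shape = shapeOf(obj);
    if (shape === cache.shape) {
      // Fast path: known shape, so the slot index is already known.
      stats.hits += 1;
      return Object.values(obj)[cache.index];
    }
    // Slow path: look the property up by name and remember where it was.
    stats.misses += 1;
    cache.shape = shape;
    cache.index = Object.keys(obj).indexOf(prop);
    return obj[prop];
  }

  load.stats = stats;
  return load;
}

const loadX = makeLoadSite("x");
loadX({ x: 1, y: 2 });   // miss: first time this shape is seen
loadX({ x: 10, y: 20 }); // hit: same shape, cached slot reused
loadX({ y: 5, x: 7 });   // miss: different key order means a different shape
console.log(loadX.stats); // { hits: 1, misses: 2 }
```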

Deoptimization kills performance

```javascript
function processUser(user) {
  return user.name + " " + user.age;
}

// Variant 1: Stable types (FAST)
const users1 = [
  { name: "Alice", age: 25 },
  { name: "Bob", age: 30 },
  { name: "Charlie", age: 35 }
];
for (const user of users1) {
  processUser(user); // Type feedback stays consistent
}

// Variant 2: Changing types (SLOWER)
const users2 = [
  { name: "Alice", age: 25 },
  { name: "Bob", age: "30" }, // age is now a string!
  { name: "Charlie", age: 35 }
];
for (const user of users2) {
  processUser(user); // May bail out when the string shows up
}
```

**Benchmark:** In a hot loop (many thousands of calls rather than the three shown here), variant 1 is reported as ~5x faster; the exact factor depends on the V8 version and the workload. TurboFan generates code specialized for numbers, while in variant 2 the unexpected string can trigger deoptimization, after which V8 re-optimizes with more generic, slower code.

**How to check if a function is optimized?** Run Node.js with the flags:

```bash
node --trace-opt --trace-deopt script.js
```

The output logs (the exact wording varies between V8 versions):

- `[optimizing 0x... <multiply> ... ]` - the function (here one named `multiply`) is being optimized by TurboFan
- `[bailout ... reason: ...]` - deoptimization (for example, `Insufficient type feedback`)

**Why is deoptimization so expensive?** When a bailout occurs, V8 must:

1. Stop executing the optimized code
2. Restore the state (stack frame) for the bytecode
3. Continue execution in Ignition
4. Re-profile the function (if the types stabilize, TurboFan will try to optimize again)

This can take **microseconds** - not critical for a single call, but if it keeps happening in a function that is called millions of times, it adds up to seconds.

**Megamorphic call sites** are the worst-case scenario. Each property access or call site keeps its own inline cache. If that site sees more than four different object shapes, V8 treats it as megamorphic: the per-site cache is abandoned in favor of a slower shared lookup, and TurboFan can no longer specialize the code for specific shapes. Solution: keep hot code **monomorphic** (one shape) or at most **polymorphic** (2-4 shapes). Avoid functions that process numbers, strings, and objects through the same code path.
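The three states can be illustrated with a small tracker that counts how many distinct shapes one site has seen. This is a simplified sketch: `makeSiteTracker` is a hypothetical helper, the key-order string is only an approximation of a hidden class, and the limit of four is taken from the text above.

```javascript
// Toy classifier for one access site, based on how many shapes it has seen.
// The key-order string approximates a hidden class (an assumption).
const POLYMORPHIC_LIMIT = 4;

function makeSiteTracker() {
  const shapes = new Set();
  return {
    record(obj) {
      shapes.add(Object.keys(obj).join(","));
    },
    state() {
      if (shapes.size === 0) return "uninitialized";
      if (shapes.size === 1) return "monomorphic";
      if (shapes.size <= POLYMORPHIC_LIMIT) return "polymorphic";
      return "megamorphic";
    },
  };
}

const site = makeSiteTracker();
site.record({ x: 1 });
site.record({ x: 2 });
console.log(site.state()); // "monomorphic"

site.record({ x: 1, y: 2 });
site.record({ y: 2, x: 1 });
console.log(site.state()); // "polymorphic" (3 shapes)

site.record({ x: 1, z: 3 });
site.record({ x: 1, w: 4 });
console.log(site.state()); // "megamorphic" (5 shapes)
```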

**TurboFan Optimizations:**

- **Inlining** - embedding small functions directly into the caller (removes call overhead)
- **Escape Analysis** - if an object does not leave the function, V8 can avoid the heap allocation and keep its fields in registers or on the stack
- **Dead Code Elimination** - removal of code whose result is never used
- **Loop Unrolling** - repeating the loop body several times per iteration to reduce loop-control overhead
- **Type Specialization** - code generation for specific types (numbers, strings)

All these optimizations work well **only if types are predictable**.

What happens if a function optimized by TurboFan unexpectedly receives an argument of a different type?

Hidden Classes

JavaScript is a dynamic language. Properties can be added to objects at any time:

```javascript
const obj = {};
obj.x = 1;
obj.y = 2;
obj.z = 3;
```

In C++, the structure of an object is fixed at compile time, and the compiler knows: "property `x` is at offset +0 bytes, `y` at +4 bytes". How does V8 do the same for JavaScript? The answer: **Hidden Classes** (also called Shapes or Maps).

A **Hidden Class** is V8's internal representation of an object's structure. It's like a blueprint that describes:

- What properties the object has
- In what order they were added
- Where each property is located in memory (offset)

Creating `{ x: 1, y: 2 }` causes V8 to generate a Hidden Class: "an object with two properties: x at offset 0, y at offset 8". Creating another `{ x: 10, y: 20 }` reuses the same Hidden Class - they have the same structure.

**Why is this important?** Inline Caching (IC) works best when an access site always sees objects of the same Hidden Class. When V8 sees `obj.x`, it caches: "this is an object Shape1, property x at offset 0". The next access to `obj.x` is a direct read from memory (essentially a single load instruction). But if an object with a different Hidden Class arrives, the cache misses → V8 again performs a slow lookup by property name.
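One way to picture why property order matters is a transition tree: every added property moves the object from one shape to the next. The sketch below is a simplified model of that idea. The `Shape` class and its methods are hypothetical names; V8's real Maps carry far more information.

```javascript
// Toy model of hidden-class transitions (names are hypothetical).
class Shape {
  constructor(parent = null, key = null) {
    // A shape knows its properties in insertion order.
    this.keys = parent ? [...parent.keys, key] : [];
    this.transitions = new Map(); // property name -> next Shape
  }

  // Adding a property follows an existing transition or creates a new one.
  add(key) {
    if (!this.transitions.has(key)) {
      this.transitions.set(key, new Shape(this, key));
    }
    return this.transitions.get(key);
  }

  // The slot index plays the role of the memory offset.
  offsetOf(key) {
    return this.keys.indexOf(key);
  }
}

const emptyShape = new Shape();
const shapeA = emptyShape.add("name").add("age"); // { name, age }
const shapeB = emptyShape.add("name").add("age"); // same order again
const shapeC = emptyShape.add("age").add("name"); // { age, name }

console.log(shapeA === shapeB); // true: same transitions, shape is reused
console.log(shapeA === shapeC); // false: different order, different shape
console.log(shapeA.offsetOf("age"), shapeC.offsetOf("age")); // 1 0
```

The last line shows the practical consequence: the same property name lives at a different slot in the two shapes, so a cache entry recorded for one is useless for the other.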

Optimization through Hidden Classes

```javascript
// BAD: Different property order
const user1 = { name: "Alice", age: 25 }; // HiddenClass A
const user2 = { age: 30, name: "Bob" };   // HiddenClass B (different order!)
// V8 creates TWO different Hidden Classes - code that handles both is no longer monomorphic

// GOOD: Same order
const user3 = { name: "Alice", age: 25 }; // HiddenClass A
const user4 = { name: "Bob", age: 30 };   // HiddenClass A (reused!)
// V8 reuses the Hidden Class - IC works perfectly
```

**Benchmark:** The second variant is reported as ~2-3x faster for property access (especially in loops); the exact factor depends on the V8 version and the surrounding code.

**Performance Tip:** Initialize all object properties **in the constructor** and in the same order. Do not add properties dynamically after creating the object - this creates new Hidden Classes and breaks IC.

```javascript
// GOOD: every instance gets both properties, in the same order
class User {
  constructor(name, age) {
    this.name = name;
    this.age = age;
  }
}

// BAD: `age` appears only after setAge() is called
class LateUser {
  constructor(name) {
    this.name = name;
  }
  setAge(age) {
    this.age = age; // Adding property later → new Hidden Class
  }
}
```

**What Kills Hidden Classes Optimization:**

1. **`delete obj.property`** - deleting a property can switch the object to slow mode (dictionary mode). V8 switches from fast offset-based access to slow hash-table lookup. **Never use `delete` in a hot path!** Instead, assign `null` or `undefined`.
2. **Adding properties in different orders** - creates different Hidden Classes, breaks IC.
3. **Mixing numeric indices with named properties**:

```javascript
const obj = { x: 1 };
obj[0] = "value"; // V8 creates an elements backing store (a separate structure)
obj.y = 2;        // Named properties in another structure
```

V8 divides objects into two parts: **elements** (numeric indices) and **properties** (named properties). Prefer arrays for indexed data and plain objects for named fields instead of mixing both on one object.

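**Dictionary Mode (slow properties):** If an object accumulates a very large number of properties or properties are removed with `delete`, V8 may switch the object to dictionary mode. The exact thresholds are internal and vary between versions. Instead of a Hidden Class with fixed offsets, properties are stored in a hash table. Access can slow down by up to ~10x. The object usually stays in slow mode for the rest of its life. Check the mode with V8's natives syntax (an internal, unsupported debugging facility):

```bash
node --allow-natives-syntax script.js
```

```javascript
const obj = { x: 1, y: 2 };
delete obj.x; // delete a property that is NOT the most recently added one
console.log(%HasFastProperties(obj)); // expected: false (dictionary mode)
```

Recent V8 versions can keep fast properties when the most recently added property is deleted, which is why the example deletes the first one.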
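When keys really do come and go at runtime, a `Map` is usually a better fit than an object with `delete`, and for fixed-shape records the field can be cleared instead of removed. A minimal sketch of both patterns follows; the variable names and data are made up for illustration.

```javascript
// Dynamic key set: use a Map, which is designed for frequent adds and removes.
const sessions = new Map();
sessions.set("u1", { user: "alice" });
sessions.set("u2", { user: "bob" });
sessions.delete("u1"); // a normal Map operation, no object shape involved

console.log(sessions.size);      // 1
console.log(sessions.has("u1")); // false

// Fixed-shape record: clear the field instead of deleting it,
// so the set of properties (and their order) stays the same.
const user = { name: "Alice", token: "abc" };
user.token = undefined;

console.log(Object.keys(user)); // [ 'name', 'token' ]
```

The trade-off of the second pattern is that `"token" in user` stays `true`, so code that checks for presence must compare against `undefined` instead.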

**Hidden Classes and ES6 Classes:**

```javascript
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
}

const p1 = new Point(1, 2);
const p2 = new Point(3, 4);
```

Both objects `p1` and `p2` have the same Hidden Class, because the constructor adds the same properties in the same order. V8 works very efficiently with ES6 classes - this is the **best way** to create many objects with the same structure.

Why does using `delete obj.property` critically slow down access to object properties in V8?

Garbage Collection Basics

JavaScript is a language with automatic memory management. Memory does not need to be freed manually (as in C/C++). The **Garbage Collector (GC)** handles this. V8 uses **Orinoco** - a parallel, incremental, and generational garbage collector. Its task is to find objects that are no longer used and free memory, minimizing pauses (stop-the-world).

**Generational Hypothesis** - the foundation of modern GC. Observation: most objects die young (live for milliseconds). A small portion of objects live long (minutes, hours). Therefore, V8 divides the heap into **two generations**:

- **Young Generation (new objects):** Small size (~8-16 MB by default, tunable with `--max-semi-space-size`). Objects are created here. GC runs frequently (how often depends on the allocation rate) but operates quickly (typically 1-5 ms).
- **Old Generation (old objects):** Large size (hundreds of MB / several GB). Objects that survive several GC cycles are moved here. GC runs infrequently but operates longer (10-100 ms).

**Why two generations?** Scanning the entire heap every time is expensive. The Young Generation is small → GC runs quickly. The Old Generation is scanned infrequently, when a lot of garbage accumulates. This is a trade-off: frequent short pauses (Young GC) vs infrequent long pauses (Old GC).

**Scavenge (Young Generation GC):** New objects are allocated in **From-Space**. When From-Space fills up, V8 triggers **Scavenge** (based on Cheney's copying algorithm):

1. **Find live objects:** The GC starts from the roots (stack, global variables, references from the Old Generation) and follows references to every reachable object.
2. **Evacuation:** Live objects are copied to **To-Space**. Dead objects are simply ignored (their memory is reclaimed automatically).
3. **Swap:** From-Space and To-Space are swapped.

Scavenge speed: typically **1-5 ms**. This is almost imperceptible. If an object survives **2 Scavenge cycles**, it is considered long-lived and moved to the **Old Generation** (tenure/promotion).
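The copying step can be sketched in a few lines. This is a toy model of a Cheney-style semispace collector, not V8's Scavenger: objects are plain records `{ id, refs }`, the function name `scavenge` is hypothetical, and promotion to the Old Generation is left out.

```javascript
// Toy semispace copy: only objects reachable from the roots are copied.
// Objects are plain records { id, refs: [...] } (an assumption for this sketch).
function scavenge(roots) {
  const toSpace = [];
  const forwarding = new Map(); // old object -> its copy in to-space

  function copy(obj) {
    if (!forwarding.has(obj)) {
      // refs still point into from-space here; they are fixed up during the scan.
      const clone = { id: obj.id, refs: obj.refs };
      forwarding.set(obj, clone);
      toSpace.push(clone);
    }
    return forwarding.get(obj);
  }

  const newRoots = roots.map((obj) => copy(obj));

  // Cheney scan: to-space doubles as the work queue (breadth-first).
  for (let scan = 0; scan < toSpace.length; scan++) {
    toSpace[scan].refs = toSpace[scan].refs.map((ref) => copy(ref));
  }
  return { toSpace, roots: newRoots };
}

const leaf = { id: "leaf", refs: [] };
const root = { id: "root", refs: [leaf] };
leaf.refs.push(root); // a cycle is handled by the forwarding table
const garbage = { id: "garbage", refs: [leaf] }; // nothing points to it

const result = scavenge([root]);
console.log(result.toSpace.map((o) => o.id)); // [ 'root', 'leaf' ]
```

The cost of the collection depends on the number of live objects, not on the amount of garbage: `garbage` is never visited at all. That is why Scavenge is cheap when most young objects are already dead.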

**Mark-Sweep-Compact (Old Generation GC):** When the Old Generation fills up, V8 initiates a **Major GC** (full cycle):

1. **Marking:** The GC scans the entire heap, marking live objects (starting from the roots, traversing the object graph). This is an **incremental process** - performed in small steps between executing JS code.
2. **Sweeping:** The GC goes through the memory, freeing regions that hold unmarked (dead) objects.
3. **Compacting:** The GC moves live objects closer together, eliminating memory fragmentation. V8 does this selectively, for heavily fragmented pages. This allows new objects to be allocated faster (bump allocation).

Speed of Major GC: **10-100 ms** (depends on the size of the heap). This can cause noticeable pauses (lag in UI, delays in API).
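The three phases can be illustrated on a tiny array-based heap. This is a deliberately simplified sketch: the function name `markSweepCompact` and the record layout `{ id, refs, marked }` are assumptions, and sweep and compact collapse into a single `filter` because the toy heap is just an array.

```javascript
// Toy mark-sweep-compact over an array of { id, refs, marked } records.
function markSweepCompact(heap, roots) {
  // 1. Mark: traverse the object graph from the roots with an explicit stack.
  const stack = [...roots];
  while (stack.length > 0) {
    const obj = stack.pop();
    if (obj.marked) continue; // already visited (also breaks cycles)
    obj.marked = true;
    stack.push(...obj.refs);
  }

  // 2 + 3. Sweep and compact: drop unmarked objects and pack survivors together.
  const live = heap.filter((obj) => obj.marked);

  // Reset marks so the next cycle starts clean.
  for (const obj of live) obj.marked = false;
  return live;
}

const b = { id: "b", refs: [], marked: false };
const a = { id: "a", refs: [b], marked: false };
const c = { id: "c", refs: [], marked: false };
const d = { id: "d", refs: [c], marked: false };
c.refs.push(d); // c and d reference each other but are unreachable

const heapAfter = markSweepCompact([a, b, c, d], [a]);
console.log(heapAfter.map((o) => o.id)); // [ 'a', 'b' ]
```

The unreachable cycle `c <-> d` is collected, which shows why tracing from the roots is used rather than counting references.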

How to See GC Pauses

Run Node.js with the `--trace-gc` flag:

```bash
node --trace-gc app.js
```

Output:

```
[12345:0x...] Scavenge 2.3 (3.1) -> 1.8 (4.1) MB, 1.2 / 0.0 ms ...
[12345:0x...] Mark-sweep 45.2 (52.0) -> 38.1 (50.0) MB, 23.4 ms ...
```

- **Scavenge:** 1.2 ms (fast)
- **Mark-sweep:** 23.4 ms (noticeable pause!)

Frequent Major GC (>10 times per second) indicates either a memory leak or a heap that is too small.

**GC Optimization: Object Pooling** Creating millions of temporary objects (for example, in a game loop or parser) triggers GC very often. Solution: **Object Pooling** - reuse objects instead of creating new ones. The simplest form is reusing a single scratch object:

```javascript
// BAD: Creating a million objects
for (let i = 0; i < 1_000_000; i++) {
  const point = { x: i, y: i }; // a new allocation on every iteration
  process(point);
}

// GOOD: Reusing an object
const point = { x: 0, y: 0 };
for (let i = 0; i < 1_000_000; i++) {
  point.x = i;
  point.y = i;
  process(point);
}
```

The second option allocates nothing inside the loop, so the loop itself does not trigger Scavenge (as long as `process` does not allocate or keep a reference to `point`).
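When more than one object is in use at a time, a small pool with explicit acquire and release does the same job. The sketch below is one possible shape for such a pool; the class name `PointPool` and its fields are hypothetical.

```javascript
// Minimal object pool: released objects are handed out again instead of
// allocating new ones. Names are illustrative.
class PointPool {
  constructor() {
    this.free = [];   // objects ready for reuse
    this.created = 0; // how many objects were ever allocated
  }

  acquire(x, y) {
    let point = this.free.pop();
    if (point === undefined) {
      point = { x: 0, y: 0 }; // always the same shape
      this.created += 1;
    }
    point.x = x;
    point.y = y;
    return point;
  }

  release(point) {
    this.free.push(point);
  }
}

const pool = new PointPool();
let sum = 0;
for (let i = 0; i < 1000; i++) {
  const p = pool.acquire(i, i);
  sum += p.x + p.y;
  pool.release(p); // must not be used after this line
}

console.log(pool.created); // 1
console.log(sum);          // 999000
```

The cost of this design is manual lifetime management: using an object after `release` is a bug that the language will not catch, which is the code-complexity side of the trade-off.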

**Memory Leaks in JavaScript:** Despite the GC, memory leaks are possible. Typical causes:

1. **Global Variables:**

```javascript
function leak() {
  // Forgot let/const → implicit global in sloppy mode
  // (in strict mode or ES modules this line throws a ReferenceError instead)
  data = new Array(1_000_000);
}
leak(); // data lives forever
```

2. **Forgotten Event Handlers:**

```javascript
const button = document.getElementById("btn");
button.addEventListener("click", () => {
  // Handler holds a reference to the entire scope
});
// If the button is removed from the DOM but something still references it
// and the handler is never unsubscribed → leak
```

3. **Closures:**

```javascript
function createHandler() {
  const bigData = new Array(1_000_000);
  return () => {
    console.log(bigData[0]); // Closure holds a reference to bigData
  };
}
const handler = createHandler();
// bigData cannot be collected by GC while handler is reachable
```

Solution: remove listeners that are no longer needed, drop references to closures you are done with, and use WeakMap/WeakSet for data attached to objects that should be automatically removed.
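The WeakMap advice can be illustrated with per-object metadata. In this sketch the helpers `tag` and `infoFor` and the sample data are made up; the point is only that the map does not keep its keys alive.

```javascript
// Attach metadata to objects without preventing their collection.
// A WeakMap holds its keys weakly: once nothing else references the key,
// the entry becomes eligible for garbage collection.
const metadata = new WeakMap();

function tag(obj, info) {
  metadata.set(obj, info);
  return obj;
}

function infoFor(obj) {
  return metadata.get(obj);
}

let request = tag({ url: "/users" }, { startedAt: 123 });
console.log(infoFor(request)); // { startedAt: 123 }

// Dropping the last reference is enough; no manual cleanup of `metadata`.
request = null;
```

With a regular `Map`, the same code would keep every request object alive until its entry was deleted explicitly. When exactly the entry disappears is up to the GC and cannot be observed from this code.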

**Incremental and Concurrent GC:** Orinoco (the modern GC in V8) combines **incremental marking** with **concurrent marking and sweeping**:

- **Incremental:** Marking is broken into small steps (on the order of milliseconds). JavaScript code is executed between steps. This makes pauses less noticeable.
- **Concurrent:** Most marking and sweeping work is performed in **background threads** while the main thread executes JS. Compaction still requires a pause, but the work is split across several helper threads (parallel). This drastically reduces stop-the-world pauses.

Result: the main-thread pause of a Major GC in V8 can be as short as **5-10 ms** (instead of 100 ms in older versions).

**Memory Monitoring in Node.js:**

```javascript
const memUsage = process.memoryUsage();
console.log({
  rss: memUsage.rss / 1024 / 1024,             // Resident Set Size (total process memory), MB
  heapTotal: memUsage.heapTotal / 1024 / 1024, // Allocated heap, MB
  heapUsed: memUsage.heapUsed / 1024 / 1024,   // Used heap, MB
  external: memUsage.external / 1024 / 1024,   // C++ objects bound to JS (e.g. buffers), MB
});
```

If `heapUsed` constantly grows and does not decrease after GC → memory leak. Use `node --inspect` together with Chrome DevTools (Memory panel) for analysis.
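A single reading says little; the trend over time is what matters. The helpers below are a rough sketch of that idea. `toMB` and `looksLikeLeak` are hypothetical names, and the "strictly growing samples" rule is a deliberately naive heuristic, not an established detection method.

```javascript
// Convert bytes to megabytes, rounded to two decimals.
function toMB(bytes) {
  return Math.round((bytes / 1024 / 1024) * 100) / 100;
}

// Naive heuristic (an assumption of this sketch): treat a series of
// heapUsed samples as suspicious if every sample is higher than the previous.
function looksLikeLeak(samples) {
  if (samples.length < 3) return false; // too few points to judge
  return samples.every((value, i) => i === 0 || value > samples[i - 1]);
}

const snapshot = process.memoryUsage();
console.log(`heapUsed: ${toMB(snapshot.heapUsed)} MB`);

// Samples would normally be taken after GC cycles, e.g. once per minute.
console.log(looksLikeLeak([50, 61, 75, 90])); // true
console.log(looksLikeLeak([50, 61, 48, 52])); // false
```

In practice the samples should be compared after Major GC cycles, since `heapUsed` naturally rises between collections.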

True or false: the Garbage Collector always runs in a background thread and never stops the execution of JavaScript code.

False. GC sometimes requires stop-the-world pauses, although V8 minimizes them through incremental and concurrent algorithms.

Even the modern Orinoco GC makes short stop-the-world pauses (1-10 ms) for critical operations (e.g., stack scanning, finalizing marking). Incremental marking breaks the work into steps, and concurrent marking and sweeping run in the background, but **completely avoiding pauses is impossible** - V8 must ensure heap consistency. In Node.js applications with a large heap (>1 GB), Major GC can cause pauses of 20-50 ms, which is critical for low-latency services (e.g., real-time API). Solution: configure `--max-old-space-size`, use Worker Threads for heap isolation, avoid memory leaks.

What happens if an object in the Young Generation survives 2 Scavenge cycles?

Key Ideas

  • **V8 uses JIT compilation:** Ignition (quick start → bytecode) + TurboFan (optimization of hot functions → machine code). Stable types → large speedups for hot code. Mixed types → bailout → fallback to bytecode interpretation.
  • **Hidden Classes optimize property access:** V8 creates an internal schema of the object's structure. Objects with the same structure reuse the Hidden Class → inline caching works (direct offset access). Different property orders and `delete` → different Hidden Classes or dictionary mode → IC breaks.
  • **Garbage Collector (Orinoco) uses a generational approach:** Young Generation (frequent fast GCs) + Old Generation (rare long GCs). Create fewer temporary objects (use pooling) and avoid leaks (global variables, forgotten event handlers).
  • **V8 Optimization Requires Predictability:** Stable types, stable object structures, avoiding `delete`/`eval`/`arguments`. V8 rewards disciplined code with near-native performance but penalizes dynamism with deoptimizations.

Related topics

V8 is the performance foundation of Node.js. Understanding its operation is critical for the following topics:

  • Event Loop & Async I/O — The Event Loop operates in a single thread with V8. GC pauses block the event loop → delays in request processing. GC optimization directly affects latency.
  • Memory Management — Heap limits, buffer allocation, Worker Threads for heap isolation - everything is built on understanding GC and the V8 memory model.
  • Performance Profiling — Flame graphs, CPU profiling, heap snapshots - tools for analyzing V8 optimizations and finding bottlenecks.

Questions for reflection

  • Why does a function that processes both numbers and strings work slower than two separate functions (one for numbers, the other for strings)?
  • In which scenarios might the use of `delete obj.property` be justified, despite the transition to dictionary mode? (Hint: when performance is not critical, but semantic correctness is needed.)
  • How does object pooling help avoid GC pauses? What trade-off does pooling introduce? (Hint: code complexity vs performance.)

Related lessons

  • comp-32-jvm