Node.js Internals

Memory Management

Your server crashes every night at 3:47. The logs are clean. CPU is normal. Disk is fine. There is only one line before the crash: `FATAL ERROR: JavaScript heap out of memory`. Memory is leaking, but where? And more importantly, how do you find the hole in an abstraction that is supposed to "manage memory itself"?

  • **Production OOM crisis:** You deploy a new feature and everything works on staging. After 3 days in production, the heap is growing by 50MB/hour. After a week, the server crashes. Rollback is impossible because data already exists in the new format. A heap snapshot shows 100K `UserSession` objects in a Map. It turns out that nobody called `sessions.delete()` on logout. The fix is a single line of code, but the downtime cost $50K.
  • **WebSocket memory leak:** Real-time chat for 10K users. Each connection registers `eventBus.on('message')`, but `off()` is never called on disconnect. After a month of operation, 500K listeners are still held in memory. The event loop slows down and latency climbs to 5 seconds. Users leave. Finding the leak through heap snapshot comparison takes 2 hours; the fix takes 10 minutes.
  • **Streaming saves the server:** An API for generating reports. Initially, the entire CSV (500MB) was built in memory and then sent to the client. Two parallel requests = 1GB heap = OOM. After a rewrite to streaming with a `Transform` stream, heap usage dropped to 10MB, allowing 100 parallel requests. The payoff: one instance now handles the load instead of 10.

Memory as a Resource

**Memory is something you don't think about until it runs out.** In Node.js, memory is managed automatically through the V8 garbage collector, but that doesn't mean you can forget about it entirely. In production, memory leaks lead to server crashes, and suboptimal usage leads to lags and expensive instances.

Imagine: your API processes a million requests a day. Each request creates objects, closures, promises. If even 0.1% of objects are not released - within a week your server will consume all 2GB of heap and crash. **The garbage collector is not a magician** - it only frees unreachable objects. If you keep a reference in a global array "just in case" - memory leaks.

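**Default V8 heap limit:** roughly 2GB on 64-bit systems; the exact value depends on the Node.js version and the available system memory (older releases capped the old space at about 1.4GB). This is not a hard RAM limit but a limit on the managed heap. Exceeding it aborts the process with a fatal error such as `FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory` (the text before "Allocation failed" varies between versions).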

**Why roughly 2GB?** V8 was originally developed for browsers, where a single tab should not consume all the RAM. For servers this limit is often too conservative, but it remains the default for compatibility. Heavy production services commonly raise it with `--max-old-space-size=4096` (the value is in megabytes) or more.
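Because the default differs between versions and machines, it can help to read the effective limit from the running process instead of assuming a number. A minimal sketch using the built-in `v8` module; the helper name `heapLimitMB` and the file name `app.js` are only illustrative:

```javascript
const v8 = require('node:v8');

// Returns the effective V8 heap limit of this process in megabytes.
function heapLimitMB() {
  // heap_size_limit is reported in bytes.
  const { heap_size_limit: limitBytes } = v8.getHeapStatistics();
  return Math.round(limitBytes / 1024 / 1024);
}

console.log(`Heap limit: ${heapLimitMB()} MB`);

// To compare, start the process with a different limit, for example:
//   node --max-old-space-size=4096 app.js
```

Logging this value at startup makes it easy to confirm that a flag set in a deployment config actually reached the process.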

What happens if a Node.js process exceeds the heap limit?

V8 Heap Architecture

**The V8 heap is not a monolithic block of memory, but a hierarchical structure.** Understanding this architecture is critical for optimization because different parts of the heap are processed by different GC algorithms at different speeds.

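  • **New Space:** All new objects are created here. The small size (historically 1-8MB; the default varies by V8 version and can be tuned with `--max-semi-space-size`) keeps garbage collection (Scavenge) fast. Most objects, around 90% by the usual estimate, die young and never reach Old Space.
  • **Old Space:** Objects that survive 2+ Scavenge cycles end up here. Cleanup is slower but performed less frequently.
  • **Large Object Space:** Very large objects such as big arrays (above a size threshold that depends on the V8 version) are stored separately and never moved.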
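The spaces can be inspected at runtime with `v8.getHeapSpaceStatistics()`. The sketch below only reshapes that output for readability; the helper name `listHeapSpaces` is an illustrative choice, and the exact set of spaces reported depends on the V8 version:

```javascript
const v8 = require('node:v8');

const MB = 1024 * 1024;

// Summarize each V8 heap space: its name, reserved size, and used size in MB.
function listHeapSpaces() {
  return v8.getHeapSpaceStatistics().map((space) => ({
    name: space.space_name,
    sizeMB: Number((space.space_size / MB).toFixed(2)),
    usedMB: Number((space.space_used_size / MB).toFixed(2)),
  }));
}

console.table(listHeapSpaces());
```

Watching `new_space` versus `old_space` over time gives a rough idea of whether growth comes from short-lived allocations or from objects that are being retained.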

**Two Semispaces in New Space:** New Space uses a semispace architecture. There is always a From-space (active) and a To-space (reserve). During Scavenge, live objects are copied from From to To, then the two spaces swap roles. This is cheaper than compacting in place: the work is proportional to the number of live objects, and dead objects are never touched.
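To make the copying idea concrete, here is a toy model of a semispace collection. It is a sketch of the general Cheney-style technique, not V8's implementation: objects are plain `{ id, refs }` records invented for this example, and "To-space" is just an array.

```javascript
// Toy semispace copy: everything reachable from `roots` is copied into a
// fresh To-space; anything left behind in From-space is garbage.
function scavenge(roots) {
  const toSpace = [];
  const forwarded = new Map(); // original object -> its copy in To-space

  const copy = (obj) => {
    if (!forwarded.has(obj)) {
      // refs still point at originals here; they are fixed up in the scan loop.
      const clone = { id: obj.id, refs: obj.refs };
      forwarded.set(obj, clone);
      toSpace.push(clone);
    }
    return forwarded.get(obj);
  };

  const newRoots = roots.map((root) => copy(root));

  // Cheney scan: To-space doubles as the work queue, so no recursion is needed.
  for (let scan = 0; scan < toSpace.length; scan++) {
    toSpace[scan].refs = toSpace[scan].refs.map((child) => copy(child));
  }
  return { roots: newRoots, toSpace };
}

const a = { id: 'a', refs: [] };
const b = { id: 'b', refs: [a] };
const garbage = { id: 'garbage', refs: [a] }; // nothing points to this object
const fromSpace = [a, b, garbage];

const survivors = scavenge([b]).toSpace;
console.log(survivors.map((o) => o.id)); // [ 'b', 'a' ]
console.log(`reclaimed ${fromSpace.length - survivors.length} object(s)`); // reclaimed 1 object(s)
```

Note that `garbage` is never visited at all, which is why the cost depends only on the number of live objects.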

Why is New Space intentionally made small (1-8MB)?

Garbage Collection Algorithms

**V8 uses two fundamentally different GC algorithms:** Scavenge for New Space (fast, frequent) and Mark-Sweep-Compact for Old Space (slow, rare). This is a compromise between latency and throughput.

  • **Scavenge (Cheney's algorithm):** Copies live objects from From-space to To-space. Complexity is O(live objects); dead ones are ignored. Pause ~1-5ms.
  • **Mark-Sweep-Compact:** 1) Mark: marking reachable objects (graph traversal). 2) Sweep: freeing unmarked ones. 3) Compact: memory defragmentation. Pause ~100-500ms, but performed incrementally.

**Incremental Marking:** V8 does not stop the application for 500ms for Mark-Sweep. Instead, marking is performed in small portions (5-10ms) between event loop ticks. This relies on **tri-color marking:** objects are white (not visited yet), gray (discovered and waiting in the queue), or black (fully processed). When no gray objects remain, everything still white is unreachable.
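The three colors can be illustrated with a toy marker that does a bounded amount of work per step. This is a sketch of the general technique only: the `{ refs }` object shape and the `createMarker` helper are invented for the example, and a real incremental collector also needs a write barrier to stay correct when the program changes references between steps.

```javascript
const WHITE = 'white';
const GRAY = 'gray';
const BLACK = 'black';

function createMarker(roots) {
  const color = new Map(); // objects without an entry are white
  const worklist = [];
  for (const root of roots) {
    if (color.has(root)) continue;
    color.set(root, GRAY);
    worklist.push(root);
  }

  return {
    // Process at most `budget` gray objects, then hand control back.
    // Returns true once marking is complete.
    step(budget) {
      while (budget-- > 0 && worklist.length > 0) {
        const obj = worklist.pop();
        for (const child of obj.refs) {
          if (!color.has(child)) {
            color.set(child, GRAY);
            worklist.push(child);
          }
        }
        color.set(obj, BLACK);
      }
      return worklist.length === 0;
    },
    colorOf: (obj) => color.get(obj) ?? WHITE,
  };
}

const leaf = { refs: [] };
const middle = { refs: [leaf] };
const root = { refs: [middle] };
const orphan = { refs: [leaf] }; // not reachable from any root

const marker = createMarker([root]);
marker.step(1);
console.log(marker.colorOf(root), marker.colorOf(middle), marker.colorOf(leaf));
// black gray white

while (!marker.step(1)) {
  // In a real runtime, application code would run between steps.
}
console.log(marker.colorOf(leaf), marker.colorOf(orphan)); // black white
```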

**Concurrent and Parallel GC:** Modern versions of V8 use multithreading. **Parallel** means several GC threads work during a stop-the-world pause. **Concurrent** means the GC works in parallel with JS execution, without a pause. Marking is now largely performed concurrently, so the application is barely slowed down.
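Actual GC pauses in a running process can be observed through the `perf_hooks` module, which reports a performance entry for each collection. A minimal sketch, assuming a Node.js version that exposes `entry.detail.kind` (older versions expose `entry.kind` instead, which the code falls back to); the labels in `GC_KINDS` and the helper `describeGc` are illustrative:

```javascript
const { PerformanceObserver, constants } = require('node:perf_hooks');

// Human-readable labels for the GC kinds reported by perf_hooks.
const GC_KINDS = {
  [constants.NODE_PERFORMANCE_GC_MINOR]: 'minor (Scavenge)',
  [constants.NODE_PERFORMANCE_GC_MAJOR]: 'major (Mark-Sweep-Compact)',
  [constants.NODE_PERFORMANCE_GC_INCREMENTAL]: 'incremental marking',
  [constants.NODE_PERFORMANCE_GC_WEAKCB]: 'weak callbacks',
};

function describeGc(entry) {
  const kind = entry.detail ? entry.detail.kind : entry.kind;
  const label = GC_KINDS[kind] ?? `kind ${kind}`;
  return `${label}: ${entry.duration.toFixed(2)} ms`;
}

const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    console.log(describeGc(entry));
  }
});
observer.observe({ entryTypes: ['gc'] });
```

In a real service the durations would more likely go to a metrics system than to the console; a steady rise in major GC frequency is often the first visible symptom of heap pressure.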

Why does Scavenge GC work faster than Mark-Sweep?

Memory leak patterns

**A memory leak in JS is not a forgotten free(), but an object you no longer need that is still reachable through some reference.** The GC is not a mind reader: if you keep an array with a million objects in a global variable "just in case," it won't guess that you don't need it. The three main leak patterns are: **closures, event listeners, global variables**.

  • **Leak Pattern #1: Closures.** A function captures its parent's scope. If the function is long-lived (event handler, timer), the captured scope is not released either.
  • **Leak Pattern #2: Event Listeners.** A forgotten `addEventListener` without `removeEventListener` (or `on()` without `off()` on a Node.js `EventEmitter`) holds a reference to the callback and its scope.
  • **Leak Pattern #3: Global Variables.** `global.cache = []` is never released.
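The listener pattern is the easiest one to show in code. In the sketch below, `eventBus` and the `socket` object (assumed to be an `EventEmitter` with a `send` method that emits `'close'`) are hypothetical stand-ins for a real message bus and WebSocket connection:

```javascript
const { EventEmitter } = require('node:events');

const eventBus = new EventEmitter();

// Leaky: each connection adds a listener that is never removed, so the
// closure (and the socket it captures) stays reachable from eventBus forever.
function handleConnectionLeaky(socket) {
  eventBus.on('message', (msg) => socket.send(msg));
}

// Fixed: keep a reference to the handler and remove it when the socket closes.
function handleConnection(socket) {
  const onMessage = (msg) => socket.send(msg);
  eventBus.on('message', onMessage);
  socket.once('close', () => eventBus.off('message', onMessage));
}
```

Checking `eventBus.listenerCount('message')` against the number of open connections is a cheap way to catch this kind of leak before it shows up in a heap snapshot.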

**Leak detection in production:** Monitor `process.memoryUsage().heapUsed` every 10 seconds. If the heap grows linearly without a plateau - it's a leak. **Heap snapshot comparison:** take a snapshot before and after a load test, compare - the objects that have increased are the suspects.
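One way to turn "grows linearly without a plateau" into a number is to fit a line through recent `heapUsed` samples and look at its slope. The following is a sketch under assumptions: the helper names, the 10-second interval, and the one-hour window are illustrative, and what slope counts as a leak depends on the workload.

```javascript
// Least-squares slope of heapUsed over time, in bytes per second.
function heapGrowthRate(samples) {
  const n = samples.length;
  if (n < 2) return 0;
  const meanT = samples.reduce((sum, s) => sum + s.t, 0) / n;
  const meanH = samples.reduce((sum, s) => sum + s.heapUsed, 0) / n;
  let num = 0;
  let den = 0;
  for (const { t, heapUsed } of samples) {
    num += (t - meanT) * (heapUsed - meanH);
    den += (t - meanT) ** 2;
  }
  return den === 0 ? 0 : num / den;
}

// Samples the heap periodically and reports the growth rate to `onSample`.
// Returns a function that stops the monitor.
function startHeapMonitor({ intervalMs = 10_000, windowSize = 360, onSample } = {}) {
  const samples = [];
  const timer = setInterval(() => {
    samples.push({ t: Date.now() / 1000, heapUsed: process.memoryUsage().heapUsed });
    if (samples.length > windowSize) samples.shift(); // keep the window bounded
    onSample?.(heapGrowthRate(samples), samples);
  }, intervalMs);
  timer.unref(); // monitoring alone should not keep the process alive
  return () => clearInterval(timer);
}

// Example usage:
//   const stop = startHeapMonitor({
//     onSample: (rate) => console.log(`heap growth: ${(rate / 1024).toFixed(1)} KB/s`),
//   });
```

A slope that stays positive across several major GC cycles is a much stronger signal than a single high reading, since `heapUsed` naturally rises between collections.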

Why does a forgotten event listener cause a memory leak?

Heap Snapshots and Profiling

**A heap snapshot is a complete dump of the V8 heap at a point in time.** It contains all objects, their sizes, the references between them, and retention paths (who keeps whom alive). Chrome DevTools can visualize snapshots and compare them, which makes it the main tool for finding leaks.

**How a snapshot works:** V8 pauses execution, traverses the entire heap, and serializes the objects into a .heapsnapshot file (JSON). File size ~= heap size, so a 2GB heap produces a snapshot of about 2GB. **Retention path:** a chain of references from a GC root to the object. If a path exists → the object is reachable → GC will not delete it.
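Snapshots can be written from inside the process with `v8.writeHeapSnapshot()`. A minimal sketch; the helper `dumpHeap`, the file naming scheme, and the idea of triggering it from an admin-only hook are assumptions for illustration:

```javascript
const v8 = require('node:v8');
const path = require('node:path');

// Writes a heap snapshot into `dir` and returns the path of the created file.
function dumpHeap(dir, label) {
  const file = path.join(dir, `${label}-${process.pid}-${Date.now()}.heapsnapshot`);
  // Synchronous: the event loop is blocked until the whole heap is serialized.
  return v8.writeHeapSnapshot(file);
}

// Example usage (for instance from an admin-only endpoint or a signal handler):
//   const before = dumpHeap('/tmp', 'before');
//   ... run the load test ...
//   const after = dumpHeap('/tmp', 'after');
```

Because the call blocks the process and needs additional memory while it runs, it is usually taken on one instance that has been removed from the load balancer. Node.js also offers a `--heapsnapshot-signal` flag that writes a snapshot when the process receives the given signal.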

**Analysis in Chrome DevTools:** Open DevTools → Memory → Load Snapshot → select the file. Switch to **Comparison** mode → see the diff between snapshots. Columns: **# New** (new objects), **# Deleted** (deleted), **# Delta** (difference), **Retained Size** (how much memory the object holds with dependencies).

What does the retention path in a heap snapshot show?

Optimization and best practices

**Memory optimization is not about micro-optimizations, but about a systematic approach.** 80% of problems are solved with the right architecture: streaming instead of buffering, object pools instead of repeated allocations, bounded queues instead of unbounded ones. The remaining 20% is understanding V8 internals and fine-tuning the GC.
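As an illustration of streaming instead of buffering, the sketch below converts rows to CSV one at a time with a `Transform` stream, so only a small window of data is in memory at any moment. The row shape, the helper names, and the column list are hypothetical; in a real service the rows would typically come from a database cursor and the destination would be the HTTP response.

```javascript
const { Readable, Transform } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// Quote a value only when it contains a comma, a quote, or a newline.
function escapeCsv(value) {
  const s = String(value ?? '');
  return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

// Transform stream: row objects in, CSV text out, one row at a time.
function createCsvStringifier(columns) {
  let headerWritten = false;
  return new Transform({
    writableObjectMode: true,
    transform(row, _encoding, callback) {
      let out = '';
      if (!headerWritten) {
        out += `${columns.join(',')}\n`;
        headerWritten = true;
      }
      out += `${columns.map((c) => escapeCsv(row[c])).join(',')}\n`;
      callback(null, out);
    },
  });
}

// `rows` can be any iterable or async iterable; `destination` any writable stream.
async function sendReport(rows, columns, destination) {
  await pipeline(Readable.from(rows), createCsvStringifier(columns), destination);
}
```

`pipeline` propagates backpressure: if the client reads slowly, the source is paused instead of rows piling up in the heap.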

**Best Practices:**

  • **Streaming over buffering:** do not load a 100MB file into memory, stream it.
  • **Object pooling:** reuse objects instead of creating new ones (critical for hot paths).
  • **Bounded structures:** limit the size of a queue or cache through LRU or a max size.
  • **WeakMap/WeakRef:** for caches where objects can die independently.
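A bounded cache is often the simplest of these to adopt. Here is a minimal LRU sketch that relies on the fact that a `Map` iterates in insertion order; the class name and API are illustrative, and a maintained library may be a better fit for production use:

```javascript
class LruCache {
  constructor(maxEntries) {
    if (!Number.isInteger(maxEntries) || maxEntries < 1) {
      throw new RangeError('maxEntries must be a positive integer');
    }
    this.maxEntries = maxEntries;
    this.map = new Map(); // oldest entry first, most recently used last
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    // Re-insert to mark the entry as most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      // Evict the least recently used entry (the first key in iteration order).
      const oldestKey = this.map.keys().next().value;
      this.map.delete(oldestKey);
    }
  }

  get size() {
    return this.map.size;
  }
}

const cache = new LruCache(2);
cache.set('a', 1);
cache.set('b', 2);
cache.get('a');    // 'a' becomes the most recently used entry
cache.set('c', 3); // evicts 'b'
console.log(cache.get('b'), cache.size); // undefined 2
```

The important property is the hard upper bound: however long the process runs, the cache can never hold more than `maxEntries` values.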

**When to increase `--max-old-space-size`:** If you see frequent Major GC (every 10-30 seconds) and heap usage stays above 80% of the limit. A higher limit means less frequent GC and fewer pauses, but a full collection has more heap to process, and a real leak simply takes longer to reach OOM. **When to decrease:** If the application uses <1GB and the limit is 4GB, reduce it to 2GB. A smaller heap keeps Mark-Sweep faster.

**Myth:** GC automatically solves all memory problems, so there's no need to think about it.

**Reality:** GC only frees unreachable objects. If you hold references (in a Map, a closure, or listeners), GC will do nothing. Leaks are possible even with GC.

**Why:** Garbage collection automates free(), not architectural decisions. If your code creates an unbounded Map or forgets to call removeListener, memory will leak regardless of GC. The programmer is still responsible for the lifecycle of objects through reference management.
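One tool for managing references is `WeakMap`, which associates data with an object without keeping that object alive. A small sketch, where the `session` objects and the metadata shape are hypothetical:

```javascript
// Per-object metadata that does not prevent the key object from being collected.
const metadata = new WeakMap();

function getMetadata(session) {
  let meta = metadata.get(session);
  if (!meta) {
    meta = { createdAt: Date.now(), hits: 0 };
    metadata.set(session, meta);
  }
  meta.hits += 1;
  return meta;
}

const session = { userId: 42 };
getMetadata(session);
console.log(getMetadata(session).hits); // 2
```

Once nothing else references `session`, both it and its metadata entry become eligible for collection without an explicit `delete()`. With a regular `Map`, the map itself would keep every session reachable. When exactly the memory is reclaimed is up to the GC and cannot be observed deterministically.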

What is the advantage of WeakMap over a regular Map for caching?

Key Ideas

  • **The V8 heap consists of New Space (1-8MB, Scavenge GC ~1-5ms) and Old Space (~2GB, Mark-Sweep ~100-500ms).** Young objects die quickly in New Space, long-lived ones move to Old Space. This is a compromise between latency and throughput.
  • **GC only frees unreachable objects.** Leaks occur due to forgotten references: closures capture the entire scope, event listeners remain after disconnect, and global Map/Set structures grow indefinitely without an eviction policy. WeakMap/WeakRef solve some of these problems.
  • **Heap snapshots plus comparison in Chrome DevTools are the main diagnostic tool.** The retention path shows the chain of references from a GC root to the object. If you see a certain constructor growing in the comparison → look for where its objects are created and why they are not released. Streaming, object pooling, and bounded structures account for 80% of optimizations.

Related topics

Memory Management is related to everything that creates objects: event loop, streams, concurrency. Understanding the heap is critical for optimization:

  • Event Loop & Async: Asynchronous operations create promises, callbacks, and closures, all of which are objects in the heap. A promise that never settles can keep its entire chain of callbacks in memory.
  • Streams API: Streaming is a key technique for reducing heap usage. Streams apply backpressure, which limits how much data is buffered at any moment.
  • Child Processes: Memory isolation through worker_threads/child_process. Each child process (and each worker thread) has its own V8 heap → a leak in one does not fill the heap of another.

Questions for reflection

  • If your API creates 1000 objects per request and processes 100 req/sec, how many objects are created between Major GC cycles (once per minute)? How does this affect the heap size?
  • Why can't WeakMap use primitives (string, number) as keys, only objects? Hint: how does the GC know that an object is no longer needed?
  • You see 50K objects of type Promise in the heap snapshot with a retention path through global.pendingRequests. What code patterns could have led to this leak? How to fix it without changing the architecture?

Related lessons

  • arch-08-memory-hierarchy: Memory Management
