Node.js Internals

Worker Threads: Parallelism in Node.js

Node.js runs JavaScript on a single main thread. But what if a task requires 5 seconds of computation? The Event Loop freezes and the server stops responding. Worker Threads solve the problem: true parallelism, where JavaScript code runs on several CPU cores simultaneously.

  • **Password Hashing**: a synchronous bcrypt hash (for example `bcrypt.hashSync`) blocks the Event Loop for ~100ms. With a Worker Pool of 4 threads on 4 cores, the server handles about 40 registrations per second instead of 10.
  • **Image Processing**: the server generates previews for uploaded photos. Sharp (libvips) is fast, but a 4K image still costs ~200ms of CPU time. A Worker Pool lets several photos be processed in parallel, one per core, while the main thread keeps serving requests.
  • **Real-time analytics**: log aggregation in real time. Parsing and grouping a million records takes ~5 seconds; 8 workers split the data into chunks and deliver the result in ~700ms.

Parallelism vs Concurrency

Node.js is built on a **single-threaded model**: the Event Loop processes tasks sequentially, switching between them (concurrency). However, CPU-intensive tasks such as password hashing, image processing, or compilation block that thread until they finish.

**Worker Threads** introduce true parallelism: multiple threads execute code simultaneously on different CPU cores. This is not `child_process` (a separate, heavier OS process). Workers are threads inside one process: each has its own V8 isolate and Event Loop, and they can share memory explicitly through `SharedArrayBuffer`.

**Key Difference:**
  • **Concurrency (Event Loop)**: one chef juggling multiple dishes
  • **Parallelism (Worker Threads)**: several chefs cooking simultaneously

The Event Loop is ideal for I/O (network, disk); Worker Threads are for computations (cryptography, data processing).
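The effect of a CPU-bound task on the Event Loop can be observed directly. The following is a minimal sketch, not a benchmark: `blockFor` and `measureTimerDelay` are hypothetical helper names, and the busy loop is only a stand-in for real work such as hashing or parsing.

```javascript
const { performance } = require('node:perf_hooks');

// Busy-wait on the CPU for roughly `ms` milliseconds
// (an assumed stand-in for hashing, parsing, and similar work).
function blockFor(ms) {
  const start = performance.now();
  while (performance.now() - start < ms) {
    // spinning: nothing else on this thread can run
  }
}

// Schedule a 0 ms timer, then block the thread.
// Resolves with how long the timer actually had to wait.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduledAt = performance.now();
    setTimeout(() => resolve(performance.now() - scheduledAt), 0);
    blockFor(blockMs);
  });
}

measureTimerDelay(200).then((delay) => {
  console.log(`0 ms timer fired after ${Math.round(delay)} ms`);
});
```

The timer cannot fire until the busy loop returns, so the measured delay is at least the blocking time. An HTTP request arriving during that window would wait in the same way.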

The application processes 1000 HTTP requests per second (reading from the database). Are Worker Threads needed?

Creating a Worker Thread

A Worker Thread is created using the `Worker` class from the `worker_threads` module. The worker executes a separate file or inline code in an isolated JavaScript context with its own Event Loop.

**Worker Thread Lifecycle:**
  1. **Spawn**: creation of an OS thread, initialization of the V8 context
  2. **Execute**: execution of code in worker.js
  3. **Communicate**: message exchange via `postMessage`
  4. **Terminate**: termination via `worker.terminate()` or script exit

Each worker has roughly 2MB of overhead (V8 context + Event Loop). Creation takes roughly 10-50ms, depending on the machine and on what the worker loads.
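The lifecycle above can be sketched in a single file. This example assumes inline worker code passed with the `eval: true` option so that it is self-contained; in a real project the worker body would normally live in its own `worker.js`. The names `workerSource` and `sumInWorker` are illustrative.

```javascript
const { Worker } = require('node:worker_threads');

// Worker body as a string (assumption: kept inline only to make the sketch self-contained).
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  // Execute: CPU-bound stand-in, sum the integers 1..n
  let sum = 0;
  for (let i = 1; i <= workerData.n; i++) sum += i;
  // Communicate: send the result back to the parent
  parentPort.postMessage(sum);
  // Terminate: the script ends here, so the worker exits on its own
`;

function sumInWorker(n) {
  return new Promise((resolve, reject) => {
    // Spawn: a new OS thread with its own V8 context and Event Loop
    const worker = new Worker(workerSource, { eval: true, workerData: { n } });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker exited with code ${code}`));
    });
  });
}

sumInWorker(1_000_000).then((sum) => console.log('sum =', sum));
```

While the worker is summing, the main thread stays free to handle other events. `workerData` is copied into the worker at startup using structured clone.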

What is the overhead of creating a new Worker Thread in Node.js?

MessageChannel and MessagePort

Workers communicate via **postMessage**, which copies the data using the structured clone algorithm. For advanced scenarios, **MessageChannel** is used: a pair of connected ports for two-way communication between threads.

**Communication Hierarchy:**
  • **parentPort.postMessage()**: worker → parent (built-in channel)
  • **worker.postMessage()**: parent → worker (built-in channel)
  • **MessageChannel**: creating custom channels for worker ↔ worker
  • **transferList**: transferring buffer ownership (zero-copy)

Structured clone supports primitives, objects, arrays, Date, RegExp, and ArrayBuffer, but not functions or symbols.
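The difference between cloning and transferring can be seen without starting a worker, because a `MessageChannel` applies the same rules between its two ports. This is a small sketch; `sendThroughChannel` is a hypothetical helper, and the 1024-byte buffers stand in for larger payloads.

```javascript
const { MessageChannel } = require('node:worker_threads');

// Send `value` through a fresh MessageChannel and resolve with what arrives on the other port.
function sendThroughChannel(value, transferList = []) {
  return new Promise((resolve, reject) => {
    const { port1, port2 } = new MessageChannel();
    port2.once('message', (received) => {
      port1.close(); // closing one port closes the pair, so the process can exit
      resolve(received);
    });
    try {
      port1.postMessage(value, transferList);
    } catch (err) {
      port1.close();
      reject(err); // thrown when the value cannot be structured-cloned
    }
  });
}

async function demo() {
  // Clone: the receiver gets a copy, the sender keeps its buffer.
  const cloned = new ArrayBuffer(1024);
  await sendThroughChannel(cloned);
  console.log('sender byteLength after clone:', cloned.byteLength);

  // Transfer: ownership moves to the receiver, the sender's buffer is detached.
  const moved = new ArrayBuffer(1024);
  const received = await sendThroughChannel(moved, [moved]);
  console.log('sender byteLength after transfer:', moved.byteLength);
  console.log('receiver byteLength:', received.byteLength);

  // Functions are not supported by structured clone.
  try {
    await sendThroughChannel({ callback: () => {} });
  } catch (err) {
    console.log('cannot clone a function:', err.name);
  }
}

demo();
```

After a transfer the sender's `ArrayBuffer` is detached and reports a `byteLength` of 0, which is why a transferred buffer must not be used by the sending thread afterwards.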

A 100MB ArrayBuffer is passed to a worker via postMessage. What happens?

SharedArrayBuffer and Atomics

**SharedArrayBuffer** is memory shared between threads: multiple workers read and write a single buffer simultaneously. Without synchronization, this leads to data races. **Atomics** solves the problem with atomic operations and synchronization primitives.

**Atomics API:**
  • `Atomics.add/sub/and/or/xor`: atomic arithmetic and bitwise operations
  • `Atomics.load/store`: atomic read/write
  • `Atomics.compareExchange`: CAS (Compare-And-Swap)
  • `Atomics.wait/notify`: futex-like synchronization (thread blocking)

⚠️ **Spectre/Meltdown**: in browsers, SharedArrayBuffer was disabled in 2018 and later re-enabled only for cross-origin-isolated pages (Cross-Origin-Opener-Policy together with Cross-Origin-Embedder-Policy). In Node.js it is available without these restrictions.
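A shared counter is the simplest illustration of `Atomics.add`. The sketch below assumes inline worker code (`eval: true`) to stay in one file; `runWorker` and `countInParallel` are illustrative names.

```javascript
const { Worker } = require('node:worker_threads');

// Each worker increments the same shared Int32Array slot.
const workerSource = `
  const { workerData } = require('node:worker_threads');
  const counter = new Int32Array(workerData.shared);
  for (let i = 0; i < workerData.increments; i++) {
    // Atomic read-modify-write; a plain counter[0]++ could lose updates.
    Atomics.add(counter, 0, 1);
  }
`;

function runWorker(shared, increments) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { shared, increments }, // the SharedArrayBuffer is shared, not copied
    });
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code === 0) resolve();
      else reject(new Error(`Worker exited with code ${code}`));
    });
  });
}

async function countInParallel(workerCount, increments) {
  const shared = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
  const jobs = Array.from({ length: workerCount }, () => runWorker(shared, increments));
  await Promise.all(jobs);
  return Atomics.load(new Int32Array(shared), 0);
}

countInParallel(2, 1000).then((total) => console.log('total =', total));
```

With `Atomics.add` every increment is applied, so two workers doing 1000 increments each always produce 2000. A plain `counter[0]++` is a separate read, add, and write, and two threads can interleave those steps and overwrite each other's result.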

Two workers increment sharedArray[0] 1000 times. Without Atomics the result:

Worker Pool: Reusing Threads

Creating a worker for each task is expensive (~10-50ms + 2MB). **Worker Pool** is a fixed set of workers that are reused for tasks. It is analogous to a thread pool in other languages.

**Pool Strategies:**
  • **Fixed size**: N workers (usually = number of CPU cores)
  • **Dynamic**: grows under load, shrinks when idle
  • **Task queue**: tasks in a queue, workers take the next one
  • **Round-robin**: tasks are distributed in a circular manner

Libraries: **piscina** (recommended), **workerpool**, **worker-threads-pool**.
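The fixed-size and task-queue strategies can be combined in a few dozen lines. The following is a deliberately minimal sketch, not how piscina or any other library is implemented: `TinyPool` is a hypothetical class, the worker simply squares numbers, and a crashed worker is not replaced.

```javascript
const { Worker } = require('node:worker_threads');
const os = require('node:os');

// Long-lived worker (assumed task: square each number it receives).
const workerSource = `
  const { parentPort } = require('node:worker_threads');
  parentPort.on('message', (n) => parentPort.postMessage(n * n));
`;

class TinyPool {
  constructor(size = os.cpus().length) {
    this.queue = []; // pending tasks: { payload, resolve, reject }
    this.workers = Array.from({ length: size }, () => new Worker(workerSource, { eval: true }));
    this.idle = [...this.workers]; // workers ready to take a task
  }

  // Queue a task; resolves with the worker's reply.
  run(payload) {
    return new Promise((resolve, reject) => {
      this.queue.push({ payload, resolve, reject });
      this.dispatch();
    });
  }

  // Hand queued tasks to idle workers until one of the two runs out.
  dispatch() {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.pop();
      const { payload, resolve, reject } = this.queue.shift();
      const onMessage = (result) => {
        worker.off('error', onError);
        this.idle.push(worker); // the worker is reused, not recreated
        resolve(result);
        this.dispatch();
      };
      const onError = (err) => {
        worker.off('message', onMessage);
        reject(err); // sketch only: the failed worker is not replaced
      };
      worker.once('message', onMessage);
      worker.once('error', onError);
      worker.postMessage(payload);
    }
  }

  destroy() {
    return Promise.all(this.workers.map((worker) => worker.terminate()));
  }
}

async function demo() {
  const pool = new TinyPool(2);
  const results = await Promise.all([1, 2, 3, 4, 5].map((n) => pool.run(n)));
  console.log(results);
  await pool.destroy();
}

demo();
```

Five tasks are served by two workers: the first two start immediately and the rest wait in the queue. A production pool also needs worker replacement after crashes, timeouts, and backpressure, which is the main reason to prefer an established library.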

A server has a 4-core CPU. How many workers should be created in the pool for CPU-intensive tasks?

Usage Patterns of Worker Threads

Worker Threads solve specific problems: CPU-intensive tasks, parallel data processing, isolation of unreliable code. Let's consider usage patterns and antipatterns.

**When to use Worker Threads:**
  • ✅ **CPU-intensive**: cryptography (bcrypt, scrypt), compression, parsing large files
  • ✅ **Parallel computations**: image processing, ML inference, data processing
  • ✅ **Isolation**: executing user code (sandbox)

**When NOT to use:**
  • ❌ I/O operations (DB, API, files) → the Event Loop is more efficient
  • ❌ Short tasks (<10ms) → overhead of creating a worker
  • ❌ Frequent data exchange → serialization overhead
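The parallel data processing pattern usually means splitting the input into chunks, giving one chunk to each worker, and merging the partial results. Below is a rough sketch under simplifying assumptions: the "processing" is a sum of squares, a new worker is created per chunk instead of using a pool, and `splitIntoChunks`, `processChunk`, and `sumOfSquares` are hypothetical names.

```javascript
const { Worker } = require('node:worker_threads');

// Worker body: reduce one chunk to a partial result.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  let total = 0;
  for (const x of workerData.chunk) total += x * x;
  parentPort.postMessage(total);
`;

// Split `items` into at most `parts` chunks of nearly equal size.
function splitIntoChunks(items, parts) {
  const size = Math.ceil(items.length / parts);
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Run one chunk in its own worker; the chunk is copied via structured clone.
function processChunk(chunk) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { chunk } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

async function sumOfSquares(items, parts = 4) {
  const partials = await Promise.all(splitIntoChunks(items, parts).map(processChunk));
  return partials.reduce((a, b) => a + b, 0); // merge step runs on the main thread
}

sumOfSquares([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3).then((total) => console.log(total));
```

For an input this small the worker startup cost far outweighs the computation, which matches the "short tasks" antipattern above. The pattern pays off only when each chunk takes much longer to process than a worker takes to start and receive its data.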

**Myth:** Worker Threads speed up any asynchronous code (e.g., HTTP requests).

**Reality:** Worker Threads accelerate only CPU-bound tasks. For I/O (network, DB), they add overhead.

The Event Loop is already optimized for I/O: it does not block on network requests (handled by non-blocking sockets) or on file reads (handled by the libuv thread pool). Worker Threads are needed when the JavaScript code itself occupies the CPU: parsing, hashing, data processing. Otherwise, the cost of creating a worker (~10-50ms) yields no acceleration.

An API endpoint downloads a file from S3, parses the JSON, and saves it to the database. Is a Worker Thread needed?

Key Ideas

  • **Parallelism vs Concurrency**: The Event Loop switches between tasks (concurrency), Worker Threads execute simultaneously on different cores (parallelism). Workers are for CPU-bound, Event Loop for I/O.
  • **Communication**: postMessage (clone), transferList (zero-copy), MessageChannel (worker ↔ worker), SharedArrayBuffer (shared memory). The choice depends on the size of the data and the frequency of exchange.
  • **SharedArrayBuffer + Atomics**: shared memory without copying, but requires synchronization. Atomics.add for atomicity, Atomics.wait/notify for locks. Data race without Atomics → data loss.
  • **Worker Pool**: reuse of workers (creation is expensive: ~10-50ms + 2MB). Pool size = CPU cores for CPU-bound tasks. Piscina for production.
  • **Patterns**: offloading CPU-intensive tasks from API endpoints, parallel processing of arrays (chunks), long-running workers with bidirectional communication, sandbox for user code.

Related topics

Worker Threads fit into the architecture of Node.js applications alongside other mechanisms:

  • Event Loop: Worker Threads complement the Event Loop. I/O remains in the main thread, CPU-intensive tasks are offloaded to workers.
  • Streams: Worker Threads process chunks from streams in parallel (for example, parsing CSV in parts).
  • Cluster Module: Cluster for scaling I/O (multiple processes listen to one port), Worker Threads for CPU work within a process.

Questions for reflection

  • Why don't Worker Threads replace the Event Loop for I/O operations? What overhead do they add?
  • In which cases is transferList better than postMessage? When is SharedArrayBuffer more efficient than both?
  • How to protect against data race when working with SharedArrayBuffer? Why is Atomics.add atomic, but a regular increment is not?

Related lessons

  • os-03-threads