Node.js Internals

Worker Threads: Parallelism in Node.js

Node.js runs JavaScript on a single main thread. If a task needs 5 seconds of computation, the Event Loop freezes and the server stops responding. Worker Threads solve this problem with true parallelism: JavaScript code runs on several CPU cores simultaneously.

**Password Hashing**: a synchronous bcrypt hash (for example `bcrypt.hashSync`) blocks the Event Loop for ~100ms, which caps a single thread at about 10 registrations per second. With a Worker Pool of 4 threads, the server handles about 40.
**Image Processing**: the server generates previews for uploaded photos. Sharp (libvips) is fast, but a 4K image still costs ~200ms of CPU time. A Worker Pool processes several photos at once, up to one per core, so hundreds of uploads do not queue behind each other on a single thread.
**Real-time analytics**: log aggregation. Parsing and grouping a million records takes ~5 seconds on one thread. 8 workers split the data into chunks and return the result in ~700ms.

Parallelism vs Concurrency

Node.js is built on a **single-threaded model**: the Event Loop processes tasks sequentially, switching between them (concurrency). However, CPU-intensive tasks such as password hashing, image processing, or compilation block the thread until they finish.

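**Worker Threads** introduce true parallelism: multiple threads execute code simultaneously on different CPU cores. Unlike `child_process`, which starts a separate heavyweight OS process, workers are threads inside the same process. Each worker still has its own V8 instance and heap, but workers can share memory through `SharedArrayBuffer`.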

**Key Difference:**
- **Concurrency (Event Loop)**: one chef juggling multiple dishes
- **Parallelism (Worker Threads)**: several chefs cooking simultaneously

The Event Loop is ideal for I/O (network, disk); Worker Threads are for computations (cryptography, data processing).
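To make the blocking effect concrete, here is a minimal sketch in which a 10ms timer cannot fire until a synchronous loop releases the main thread. The helper `burnCpu` and the 200ms figure are illustrative assumptions, not part of any Node.js API.

```javascript
// Hypothetical helper: busy-wait for `ms` milliseconds to simulate CPU-bound work.
function burnCpu(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin: nothing else can run on this thread meanwhile
  }
}

const scheduledAt = Date.now();
let timerDelay = null;

// Ask for a callback in 10ms.
setTimeout(() => {
  timerDelay = Date.now() - scheduledAt;
  console.log(`timer fired after ${timerDelay}ms (requested 10ms)`);
}, 10);

// The Event Loop cannot run the timer callback until this call returns.
burnCpu(200);
```

The timer fires only after the loop ends, roughly 200ms late. An HTTP request arriving during that window would wait just as long.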

The application processes 1000 HTTP requests per second (reading from the database). Are Worker Threads needed?

Creating a Worker Thread

A Worker Thread is created using the `Worker` class from the `worker_threads` module. The worker executes a separate file or inline code in an isolated JavaScript context with its own Event Loop.

**Worker Thread Lifecycle:**
1. **Spawn**: creation of an OS thread, initialization of the V8 context
2. **Execute**: execution of code in worker.js
3. **Communicate**: message exchange via `postMessage`
4. **Terminate**: termination via `worker.terminate()` or script exit

Each worker has ~2MB overhead (V8 context + Event Loop). Creation takes ~10-50ms.
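The lifecycle above can be sketched in a single file. This is a minimal example, assuming the worker code is passed inline with `eval: true` so that no separate worker.js is needed; the helper name `sumInWorker` and the summation task are hypothetical placeholders for real CPU-bound work.

```javascript
const { Worker } = require('node:worker_threads');

// Worker source kept inline (eval: true) so the sketch fits in one file;
// in a real project this would normally live in its own worker.js.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  // Execute: CPU-bound placeholder, sum the integers 1..n
  let sum = 0;
  for (let i = 1; i <= workerData.n; i++) sum += i;
  // Communicate: send the result back to the parent
  parentPort.postMessage(sum);
  // Terminate: the worker exits once the script has nothing left to do
`;

// Hypothetical helper: spawn a worker and resolve with its first message.
function sumInWorker(n) {
  return new Promise((resolve, reject) => {
    // Spawn: a new OS thread with its own V8 context
    const worker = new Worker(workerSource, { eval: true, workerData: { n } });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}

sumInWorker(1_000_000).then((sum) => console.log('sum =', sum));
```

The main thread stays free to serve other requests while the worker computes, and the promise resolves when the `message` event arrives.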

What is the overhead of creating a new Worker Thread in Node.js?

MessageChannel and MessagePort

Workers communicate via **postMessage**, which copies the data using the structured clone algorithm. For advanced scenarios, a **MessageChannel** is used: a pair of connected ports for two-way communication between threads.

**Communication Hierarchy:**
- **parentPort.postMessage()**: worker → parent (built-in channel)
- **worker.postMessage()**: parent → worker (built-in channel)
- **MessageChannel**: creating custom channels for worker ↔ worker
- **transferList**: transferring buffer ownership (zero-copy)

Structured clone supports: primitives, objects, arrays, Date, RegExp, ArrayBuffer, but not functions and symbols.
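The difference between cloning and transferring can be observed without spawning a worker. In the sketch below, both ports of a `MessageChannel` live on the same thread purely for brevity; in practice one port would be handed to a worker. The variable names and the 1024-byte size are illustrative.

```javascript
const { MessageChannel } = require('node:worker_threads');

const { port1, port2 } = new MessageChannel();

const cloned = new ArrayBuffer(1024);
const moved = new ArrayBuffer(1024);

port1.postMessage(cloned);         // structured clone: the sender keeps its copy
port1.postMessage(moved, [moved]); // transferList: ownership moves to the receiver

console.log(cloned.byteLength); // 1024, still usable on the sending side
console.log(moved.byteLength);  // 0, the buffer is detached after the transfer

// The receiving port gets two full-size buffers either way.
const received = [];
port2.on('message', (buffer) => {
  received.push(buffer.byteLength);
  if (received.length === 2) port1.close(); // closing lets the process exit
});
```

Transferring avoids the copy, at the price that the sender can no longer touch the buffer.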

A 100MB ArrayBuffer is passed to a worker via postMessage. What happens?

SharedArrayBuffer and Atomics

**SharedArrayBuffer** is memory shared between threads: multiple workers read and write a single buffer simultaneously. Without synchronization, this leads to data races. **Atomics** solves the problem with atomic operations and synchronization primitives.

**Atomics API:**
- `Atomics.add/sub/and/or/xor`: atomic arithmetic and bitwise operations
- `Atomics.load/store`: atomic read/write
- `Atomics.compareExchange`: CAS (Compare-And-Swap)
- `Atomics.wait/notify`: futex-like synchronization (thread blocking)

⚠️ **Spectre/Meltdown**: browsers disabled SharedArrayBuffer in 2018 and later re-enabled it for cross-origin isolated pages (Cross-Origin-Opener-Policy together with Cross-Origin-Embedder-Policy). In Node.js it is available without restrictions.
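As an illustration of `Atomics.add`, the following sketch starts several workers that increment one shared counter. The helper `runIncrementers` and the inline worker source are assumptions made to keep the example in one file.

```javascript
const { Worker } = require('node:worker_threads');

// Each worker increments the same shared counter `iterations` times.
const workerSource = `
  const { workerData } = require('node:worker_threads');
  const counter = new Int32Array(workerData.shared);
  for (let i = 0; i < workerData.iterations; i++) {
    Atomics.add(counter, 0, 1); // indivisible read-modify-write
  }
`;

// Hypothetical helper: run `workerCount` incrementers and return the final value.
function runIncrementers(workerCount, iterations) {
  // A SharedArrayBuffer is shared with the workers, not copied.
  const shared = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
  const exits = [];
  for (let i = 0; i < workerCount; i++) {
    exits.push(new Promise((resolve, reject) => {
      const worker = new Worker(workerSource, {
        eval: true,
        workerData: { shared, iterations },
      });
      worker.once('error', reject);
      worker.once('exit', resolve);
    }));
  }
  // Read the counter only after every worker has exited.
  return Promise.all(exits).then(() => Atomics.load(new Int32Array(shared), 0));
}

runIncrementers(2, 1000).then((total) => console.log('total =', total)); // total = 2000
```

With a plain `counter[0]++` the read, add, and write are separate steps, so two threads can read the same old value and one update is lost; `Atomics.add` performs all three as one operation.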

Two workers each increment sharedArray[0] 1000 times. What is the result without Atomics?

Worker Pool: Reusing Threads

Creating a worker for each task is expensive (~10-50ms + 2MB). A **Worker Pool** is a fixed set of workers that are reused across tasks, analogous to a thread pool in other languages.

**Pool Strategies:**
- **Fixed size**: N workers (usually = number of CPU cores)
- **Dynamic**: grows under load, shrinks when idle
- **Task queue**: tasks in a queue, workers take the next one
- **Round-robin**: tasks are distributed in a circular manner

Libraries: **piscina** (recommended), **workerpool**, **worker-threads-pool**.
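A fixed-size pool with a task queue can be sketched in a few dozen lines. This is a teaching sketch under simplifying assumptions, not how piscina or workerpool are implemented: the class name `TinyPool` is hypothetical, every task is a number to sum up to, and crashed workers are not replaced.

```javascript
const { Worker } = require('node:worker_threads');
const os = require('node:os');

// Long-lived worker: handles one task per message and replies with the result.
const workerSource = `
  const { parentPort } = require('node:worker_threads');
  parentPort.on('message', (n) => {
    let sum = 0;
    for (let i = 1; i <= n; i++) sum += i; // CPU-bound placeholder
    parentPort.postMessage(sum);
  });
`;

class TinyPool {
  constructor(size = os.cpus().length) {
    this.queue = [];   // tasks waiting for a free worker
    this.idle = [];    // workers ready to take a task
    this.workers = [];
    for (let i = 0; i < size; i++) {
      const worker = new Worker(workerSource, { eval: true });
      this.workers.push(worker);
      this.idle.push(worker);
    }
  }

  // Queue a task; the promise settles when some worker has processed it.
  run(n) {
    return new Promise((resolve, reject) => {
      this.queue.push({ n, resolve, reject });
      this.dispatch();
    });
  }

  // Hand queued tasks to idle workers until one of the two lists is empty.
  dispatch() {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.pop();
      const task = this.queue.shift();
      const onMessage = (result) => {
        worker.off('error', onError);
        this.idle.push(worker); // the worker is reused, not recreated
        task.resolve(result);
        this.dispatch();
      };
      const onError = (err) => {
        worker.off('message', onMessage);
        task.reject(err); // a crashed worker is not replaced in this sketch
      };
      worker.once('message', onMessage);
      worker.once('error', onError);
      worker.postMessage(task.n);
    }
  }

  destroy() {
    return Promise.all(this.workers.map((worker) => worker.terminate()));
  }
}

const pool = new TinyPool(2);
Promise.all([10, 100, 1000].map((n) => pool.run(n)))
  .then((results) => console.log(results)) // [ 55, 5050, 500500 ]
  .finally(() => pool.destroy());
```

The creation cost is paid once per worker in the constructor; each later task costs only a message round trip. For production use, a maintained library such as piscina also covers error recovery, cancellation, and dynamic sizing.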

A server has a 4-core CPU. How many workers should be created in the pool for CPU-intensive tasks?

Usage Patterns of Worker Threads

Worker Threads solve specific problems: CPU-intensive tasks, parallel data processing, isolation of unreliable code. Let's consider usage patterns and antipatterns.

**When to use Worker Threads:**
- ✅ **CPU-intensive**: cryptography (bcrypt, scrypt), compression, parsing large files
- ✅ **Parallel computations**: image processing, ML inference, data processing
- ✅ **Isolation**: executing user code (sandbox)

❌ **When NOT to use:**
- I/O operations (DB, API, files) → Event Loop is more efficient
- Short tasks (<10ms) → overhead of creating a worker
- Frequent data exchange → serialization overhead
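The parallel data processing pattern usually means splitting an array into chunks, processing each chunk in its own worker, and merging the partial results. The sketch below counts records per key as a stand-in for a group-by; the helper names (`splitIntoChunks`, `countInWorker`, `parallelCount`) and the sample data are hypothetical, and one worker is spawned per chunk for brevity where a pool would normally be used.

```javascript
const { Worker } = require('node:worker_threads');

// Each worker counts the records per key in its own chunk.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  const counts = {};
  for (const key of workerData.chunk) counts[key] = (counts[key] || 0) + 1;
  parentPort.postMessage(counts);
`;

// Split `items` into at most `parts` chunks of near-equal size.
function splitIntoChunks(items, parts) {
  const size = Math.ceil(items.length / parts);
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Process one chunk in a fresh worker (the chunk is cloned into the worker).
function countInWorker(chunk) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { chunk } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// Fan out the chunks, then merge the partial counts on the main thread.
async function parallelCount(items, parts) {
  const partials = await Promise.all(splitIntoChunks(items, parts).map(countInWorker));
  const total = {};
  for (const partial of partials) {
    for (const [key, n] of Object.entries(partial)) {
      total[key] = (total[key] || 0) + n;
    }
  }
  return total;
}

parallelCount(['GET', 'POST', 'GET', 'PUT', 'GET', 'POST'], 3)
  .then((counts) => console.log(counts));
```

For an input this small the worker startup cost dominates; the pattern pays off only when each chunk takes much longer to process than a worker takes to start.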

**Myth:** Worker Threads speed up any asynchronous code (e.g., HTTP requests).

**Reality:** Worker Threads accelerate only CPU-bound tasks. For I/O (network, DB), they add overhead.

The Event Loop is already optimized for I/O: it does not block on network requests (non-blocking sockets) or file reads (libuv thread pool). Worker Threads are needed when the JavaScript code itself occupies the CPU: parsing, hashing, data processing. Otherwise, the cost of creating a worker (~10-50ms) buys no speedup.

An API endpoint downloads a file from S3, parses the JSON, and saves it to the database. Is a Worker Thread needed?

Key Ideas

**Parallelism vs Concurrency**: The Event Loop switches between tasks (concurrency), Worker Threads execute simultaneously on different cores (parallelism). Workers are for CPU-bound, Event Loop for I/O.
**Communication**: postMessage (clone), transferList (zero-copy), MessageChannel (worker ↔ worker), SharedArrayBuffer (shared memory). The choice depends on the size of the data and the frequency of exchange.
**SharedArrayBuffer + Atomics**: shared memory without copying, but it requires synchronization. Atomics.add for atomicity, Atomics.wait/notify for locks. A data race without Atomics leads to lost updates.
**Worker Pool**: reuse of workers (creation is expensive: ~10-50ms + 2MB). Pool size = CPU cores for CPU-bound tasks. Piscina for production.
**Patterns**: offloading CPU-intensive tasks from API endpoints, parallel processing of arrays (chunks), long-running workers with bidirectional communication, sandbox for user code.

Questions for Reflection

Why don't Worker Threads replace the Event Loop for I/O operations? What overhead do they add?
In which cases is transferList better than postMessage? When is SharedArrayBuffer more efficient than both?
How do you protect against data races when working with SharedArrayBuffer? Why is Atomics.add atomic, while a regular increment is not?

Related Lessons

os-03-threads

