Node.js Internals

libuv: Asynchronous Engine of Node.js

Node.js is often called single-threaded. Open `htop` while a server is running - 5+ threads appear. One is JavaScript. The others are the libuv thread pool, quietly handling files, DNS, and cryptography.

  • **Production server with high file load**: The default thread pool of 4 threads becomes the bottleneck. Raising it with `UV_THREADPOOL_SIZE=16` relieves the queue; in this scenario throughput roughly triples.
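  • **Slow start of a Node.js application**: Startup code issues thousands of asynchronous `fs.readFile()` calls (templates, configuration, cached data), and all of them compete for the 4 threads in the thread pool. Note that `require()` itself reads modules synchronously on the main thread and does not go through the pool. Solution: batch or preload the files, or increase the thread pool.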
  • **Debugging a native addon**: A C++ addon crashes periodically. Tracing the code around `uv_async_send()` reveals a race condition: the worker thread and the event loop thread access shared data without a mutex.

Intro

When they say that **Node.js is single-threaded**, they are lying. More precisely, they simplify it so much that it turns into a lie.

Node.js uses **libuv** - a C library that internally hides a **thread pool** of 4 threads (by default). While the JavaScript code runs in a single thread, libuv quietly distributes blocking operations across worker threads.

That's why `fs.readFileSync()` blocks the entire server, while `fs.readFile()` does not. The synchronous version runs in the main JavaScript thread. The asynchronous one sends the task to the libuv thread pool, freeing up the event loop for other tasks.
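The difference can be observed directly. The following is a minimal sketch, assuming a throwaway file in the OS temp directory (the file name and the `readOrder` helper are made up for illustration). It records the order in which the synchronous read, the code after the asynchronous call, and the asynchronous callback run.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical scratch file, created only so the example has something to read.
const demoFile = path.join(os.tmpdir(), `libuv-demo-${process.pid}.txt`);
fs.writeFileSync(demoFile, 'hello libuv');
process.on('exit', () => fs.rmSync(demoFile, { force: true }));

function readOrder() {
  return new Promise((resolve, reject) => {
    const order = [];

    // Synchronous: the JavaScript thread waits here until the read finishes.
    fs.readFileSync(demoFile);
    order.push('sync read done');

    // Asynchronous: the call returns at once; the callback runs later,
    // after the read has completed off the JavaScript thread.
    fs.readFile(demoFile, (err) => {
      if (err) return reject(err);
      order.push('async read done');
      resolve(order);
    });
    order.push('code after the async call');
  });
}

readOrder().then((order) => console.log(order));
// [ 'sync read done', 'code after the async call', 'async read done' ]
```

The line after `fs.readFile()` runs before the file has been read, which is exactly the window in which a server could be handling other requests.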

**libuv** is a cross-platform C library for asynchronous I/O. It provides Node.js access to file system operations, networking, timers, and IPC (inter-process communication) through a unified interface, hiding the differences between Linux (epoll), macOS (kqueue), and Windows (IOCP).

Without libuv, Node.js would not be cross-platform. Each OS has its own API for asynchronous I/O: Linux uses `epoll`, macOS - `kqueue`, Windows - `IOCP` (I/O Completion Ports). libuv abstracts these differences.

Why does Node.js need a C library at all? JavaScript cannot directly communicate with the OS kernel. A bridge is needed between V8 (JavaScript engine) and system calls. libuv is that bridge.

Why does `fs.readFile()` not block the event loop, but `fs.readFileSync()` does?

Architecture

The architecture of libuv is built on two key abstractions: **handles** (long-lived objects like TCP sockets, timers) and **requests** (one-time operations like file writing, DNS requests).

**Handle** is an object that exists for a long time and can generate events. Examples: `uv_tcp_t` (TCP connection), `uv_timer_t` (timer), `uv_fs_event_t` (file change tracking).

**Request** is a one-time operation. You create a request, libuv executes it (possibly in a thread pool), calls the callback, and the request is destroyed. Examples: `uv_fs_t` (file reading), `uv_getaddrinfo_t` (DNS resolving), `uv_write_t` (writing to a socket).

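**Event Loop in libuv** goes through six phases per iteration: timers → pending callbacks → idle/prepare → poll → check → close callbacks, and then the cycle repeats. Node.js runs its `process.nextTick()` queue and microtasks (Promise callbacks) after each callback, and therefore also between the phases.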
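The phase order becomes visible when callbacks are scheduled from inside an I/O callback, which itself runs in the poll phase. A small sketch, assuming `fs.stat()` on the Node.js binary as a convenient source of an I/O callback (the `phaseOrder` helper is illustrative, not part of any API):

```javascript
const fs = require('node:fs');

function phaseOrder() {
  return new Promise((resolve, reject) => {
    // Any asynchronous fs call would do; its callback runs in the poll phase.
    fs.stat(process.execPath, (err) => {
      if (err) return reject(err);
      const order = [];

      // Timers phase of the NEXT loop iteration.
      setTimeout(() => {
        order.push('timeout');
        resolve(order);
      }, 0);
      // Check phase of the CURRENT iteration (right after poll).
      setImmediate(() => order.push('immediate'));
      // Microtask: runs as soon as the current callback returns.
      Promise.resolve().then(() => order.push('promise'));
      // nextTick queue: drained before the microtask queue.
      process.nextTick(() => order.push('nextTick'));
    });
  });
}

phaseOrder().then((order) => console.log(order));
// [ 'nextTick', 'promise', 'immediate', 'timeout' ]
```

Outside an I/O callback, the relative order of `setTimeout(fn, 0)` and `setImmediate(fn)` is not guaranteed, because it depends on how long the process took to reach the first timers phase.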

Key difference: a handle lives until explicitly closed (`.close()`). A request lives only until the operation is completed. Memory leaks in Node.js are often associated with unclosed handles - the server continues to hold sockets, timers, file watchers.
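One way to see which resources are still keeping the event loop alive is `process.getActiveResourcesInfo()`. This is a sketch under assumptions: the API is marked experimental and is only available in newer Node.js releases (16.14+ / 17.3+), and the `countTimeouts` and `timerHandleDemo` helpers are invented for the example.

```javascript
// Counts pending timers among the resources that keep the event loop alive.
// process.getActiveResourcesInfo() is experimental; treat its output format
// as subject to change.
const countTimeouts = () =>
  process.getActiveResourcesInfo().filter((name) => name === 'Timeout').length;

function timerHandleDemo() {
  const before = countTimeouts();

  // Left alone, this timer would keep the process alive for a full minute.
  const timer = setTimeout(() => {}, 60000);
  const whileActive = countTimeouts();

  // Explicitly releasing the timer lets the loop exit again.
  clearTimeout(timer);
  const afterClear = countTimeouts();

  return { before, whileActive, afterClear };
}

console.log(timerHandleDemo());
```

The same kind of check helps when a process refuses to exit: whatever is still listed (a server, a socket, a watcher) was opened and never closed.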

What is the difference between handle and request in libuv?

Thread Pool

Here's the moment of truth: **Node.js is NOT single-threaded**. libuv creates a thread pool of 4 threads (by default), where blocking operations are executed.

What operations use a thread pool? Three main categories (the asynchronous `zlib` APIs use it as well):

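1. **File System** - all asynchronous `fs` operations (except the watcher APIs such as `fs.watch`): `readFile`, `writeFile`, `stat`, `readdir`. The `*Sync` variants run on the main thread instead. Reading a regular file is a blocking operation that epoll/kqueue cannot make asynchronous.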

2. **DNS** - `dns.lookup()` (but not `dns.resolve()`!). `lookup` calls the blocking system call `getaddrinfo()`, which requires a thread pool. `resolve` uses c-ares (an asynchronous library), working in the event loop.

3. **Crypto** - `crypto.pbkdf2()`, `crypto.scrypt()`, `crypto.randomBytes()` (only when called with a callback; without one it runs synchronously on the main thread). These operations are CPU-intensive and would block the event loop, so their asynchronous forms go into the thread pool.

**UV_THREADPOOL_SIZE** - environment variable for configuring the size of the thread pool (default is 4). Maximum is 1024. Increase it when the application actively uses `fs` or `crypto`.

**When to increase UV_THREADPOOL_SIZE?** When an application performs many parallel `fs` operations (for example, processing uploaded files) or `crypto` operations (password hashing), the standard 4 threads become a bottleneck.
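Pool saturation is easy to provoke with `crypto.pbkdf2()`. The sketch below starts several hashing jobs at once and records when each one finishes; `timePbkdf2`, the job count, and the iteration count are arbitrary choices for illustration, and the actual timings depend on the machine.

```javascript
const crypto = require('node:crypto');
const { performance } = require('node:perf_hooks');

// Starts `count` pbkdf2 jobs at the same moment and resolves with the
// completion time of each job in milliseconds, measured from the start.
function timePbkdf2(count, iterations) {
  const start = performance.now();
  const jobs = [];
  for (let i = 0; i < count; i++) {
    jobs.push(new Promise((resolve, reject) => {
      crypto.pbkdf2('password', 'salt', iterations, 64, 'sha512', (err) => {
        if (err) return reject(err);
        resolve(Math.round(performance.now() - start));
      });
    }));
  }
  return Promise.all(jobs);
}

timePbkdf2(8, 100000).then((ms) => console.log(ms));
```

With the default pool of 4 threads, one would typically expect the first four jobs to finish at about the same time and the remaining four roughly one job duration later, because they had to wait in the queue. Launching the same script with `UV_THREADPOOL_SIZE=8` set in the environment should flatten that step, provided the machine has enough cores.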

Which operation does NOT use the libuv thread pool?

I/O Polling

The main task of libuv is to abstract the differences in asynchronous I/O between operating systems. Linux uses **epoll**, macOS - **kqueue**, Windows - **IOCP** (I/O Completion Ports). libuv hides these details behind a unified API.

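**epoll (Linux)** - a mechanism for monitoring multiple file descriptors (sockets, pipes, but NOT regular files). It is level-triggered by default: it keeps reporting a descriptor for as long as it is ready (for example, while unread data remains in the socket). Edge-triggered mode, which reports only changes in state, is opt-in via the `EPOLLET` flag.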

**kqueue (macOS/BSD)** - an analog of epoll, but with support for a larger number of event types: file changes, signals, timers. It is level-triggered by default; the `EV_CLEAR` flag gives edge-triggered behavior.

**IOCP (Windows)** - a fundamentally different model: completion-based instead of readiness-based. You don't ask "is the socket ready for reading?" but receive a notification "the read operation is complete."

**Poll Phase** - the phase of the event loop where libuv calls the OS polling API (epoll_wait/kevent/GetQueuedCompletionStatus). Here, the event loop can block while waiting for I/O events, but only up to a timeout: the time remaining until the nearest timer, or zero if `setImmediate` callbacks are already pending.

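Why don't files use epoll/kqueue? Readiness notification is meaningful for sockets and pipes, but a regular file always counts as "ready", even when the read will have to wait for the disk (epoll refuses regular files altogether). Reading regular files is therefore blocking in the POSIX API, so libuv uses a thread pool.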

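**Edge-triggered vs Level-triggered**. In level-triggered mode (the default for both epoll and kqueue), the poller reports a descriptor on every call for as long as unread data remains. In edge-triggered mode, it notifies only when new data appears: if all the data is not read at once, the next notification will not come until more data arrives.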
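The two notification styles can be contrasted with a toy model. This is a plain JavaScript simulation, not real epoll or kqueue; `FakeSocket`, `poll`, and `simulate` are hypothetical names used only to show when each mode would report readiness.

```javascript
// Toy stand-in for a socket with a receive buffer.
class FakeSocket {
  constructor() {
    this.buffer = [];
    this.newDataSinceLastPoll = false;
  }
  receive(chunk) {
    this.buffer.push(chunk);
    this.newDataSinceLastPoll = true;
  }
  read() {
    return this.buffer.shift();
  }
}

// Returns true if the poller would report the socket as readable.
function poll(socket, mode) {
  if (mode === 'level') {
    // Level-triggered: report for as long as unread data remains.
    return socket.buffer.length > 0;
  }
  // Edge-triggered: report only if something arrived since the last poll.
  const fired = socket.newDataSinceLastPoll;
  socket.newDataSinceLastPoll = false;
  return fired;
}

function simulate(mode) {
  const socket = new FakeSocket();
  socket.receive('a');
  socket.receive('b');

  const notifications = [];
  notifications.push(poll(socket, mode)); // data just arrived
  socket.read();                          // consume only one of two chunks
  notifications.push(poll(socket, mode)); // 'b' is still unread
  return notifications;
}

console.log(simulate('level')); // [ true, true ]
console.log(simulate('edge'));  // [ true, false ]
```

In the edge-triggered run the second chunk is left without a notification, which is why code written for that mode has to read until the buffer is empty.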

**IOCP on Windows** - a special story. The completion-based model means a read request is sent, and the OS notifies when the read is complete (not when the data is ready to be read). libuv emulates a readiness model on top of IOCP for compatibility.

Why does network I/O (HTTP requests) not block the event loop, while file I/O (fs.readFile) uses a thread pool?

Timers

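Timers in libuv are implemented through a **min-heap** (binary heap). Each active timer is a node in the heap, where the key is the trigger time. The root of the heap is the nearest timer. (Node.js additionally groups `setTimeout`/`setInterval` calls by duration in its own JavaScript-side lists and backs them with a libuv timer, but the principle is the same.)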

Why min-heap? Because we need O(1) complexity for finding the nearest timer (the root of the heap), O(log n) for adding a new timer, and O(log n) for removing a triggered timer.

Event loop checks timers in the **timers phase**: it compares the current time with the root of the min-heap. If the time has come, it calls the callback and removes the timer from the heap. It repeats until the root of the heap > current time.
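The timers-phase logic described above can be sketched in a few lines. This is an illustrative JavaScript model, not libuv's C implementation; the `TimerHeap` class and its method names are assumptions made for the example.

```javascript
// Minimal binary min-heap keyed by trigger time.
class TimerHeap {
  constructor() {
    this.heap = [];
  }

  // O(1): the nearest timer is always at the root.
  peek() {
    return this.heap[0];
  }

  // O(log n): append, then sift up.
  insert(time, callback) {
    const heap = this.heap;
    heap.push({ time, callback });
    let i = heap.length - 1;
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (heap[parent].time <= heap[i].time) break;
      [heap[parent], heap[i]] = [heap[i], heap[parent]];
      i = parent;
    }
  }

  // O(log n): move the last node to the root, then sift down.
  removeMin() {
    const heap = this.heap;
    const min = heap[0];
    const last = heap.pop();
    if (heap.length > 0) {
      heap[0] = last;
      let i = 0;
      for (;;) {
        const left = 2 * i + 1;
        const right = left + 1;
        let smallest = i;
        if (left < heap.length && heap[left].time < heap[smallest].time) smallest = left;
        if (right < heap.length && heap[right].time < heap[smallest].time) smallest = right;
        if (smallest === i) break;
        [heap[smallest], heap[i]] = [heap[i], heap[smallest]];
        i = smallest;
      }
    }
    return min;
  }

  // Timers phase: run every timer whose time has come, nearest first.
  runExpired(now) {
    while (this.heap.length > 0 && this.peek().time <= now) {
      this.removeMin().callback();
    }
  }
}

const fired = [];
const timers = new TimerHeap();
timers.insert(300, () => fired.push('c'));
timers.insert(100, () => fired.push('a'));
timers.insert(200, () => fired.push('b'));
timers.runExpired(250);
console.log(fired); // [ 'a', 'b' ]
```

The timer due at 300 stays at the root of the heap, so the next check only has to compare one value against the current time.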

**Monotonic Time** - libuv uses monotonic time (not wall clock time), which does not change when the system clock is adjusted. The function `uv_now()` returns the loop's cached timestamp in milliseconds; it counts from an arbitrary starting point and is refreshed at the start of each loop iteration.

**Timer accuracy** depends on the system load. If the event loop is busy (for example, executing long synchronous code), timers will trigger with a delay. `setTimeout(fn, 100)` guarantees "not earlier than 100 ms," but not "exactly in 100 ms."
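The delay is easy to reproduce. In this sketch a timer is requested for a short interval while the JavaScript thread is deliberately kept busy for longer; `measureTimerDelay` is a made-up helper, and `performance.now()` is used because it is a monotonic clock.

```javascript
const { performance } = require('node:perf_hooks');

// Schedules a timer for `requestedMs`, then blocks the JavaScript thread
// for `busyMs`. Resolves with the time that actually passed before the
// timer callback ran.
function measureTimerDelay(requestedMs, busyMs) {
  return new Promise((resolve) => {
    const start = performance.now();
    setTimeout(() => resolve(performance.now() - start), requestedMs);

    // Long synchronous work: the event loop cannot reach the timers phase.
    while (performance.now() - start < busyMs) {
      // busy wait
    }
  });
}

measureTimerDelay(10, 100).then((actual) => {
  console.log(`requested 10 ms, fired after ${actual.toFixed(1)} ms`);
});
```

The callback cannot run before the busy loop returns, so the measured delay is at least the blocking time, however small the requested timeout.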

**Why monotonic time?** Consider a scenario where a timer is set with `setTimeout(fn, 1000)`, and the user then turns the system clock back by an hour. If libuv used wall clock time, the timer would trigger in 1 hour + 1 second. With monotonic time it triggers after about 1 second, regardless of clock changes.

Why does libuv use a min-heap for timers instead of an array or linked list?

Async Handles

**Async handles** (`uv_async_t`) are a mechanism for thread-safe communication between threads. They allow a worker thread (from a thread pool or `worker_threads`) to safely invoke a callback in the main event loop thread.

Why is a special mechanism needed? The event loop operates in a single thread. If the worker thread directly calls a JavaScript callback, a race condition will occur (two threads simultaneously modify the V8 heap).

**uv_async_send()** - a thread-safe call: the worker thread invokes it, libuv marks the async handle as "pending" and wakes up the event loop, which then runs the callback on the loop thread (in a safe context).

**Coalescing** - if `uv_async_send()` is called several times before the event loop processes the handle, the callback will be executed only once. Async handles do not guarantee "one send = one callback", only "at least one callback".

Inside Node.js bindings: when `parentPort.postMessage()` is called, the C++ code queues the message and calls `uv_async_send()` on the `uv_async_t` handle that belongs to the receiving port. The main thread's event loop receives the notification and invokes the JavaScript callback (the 'message' event).
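From the JavaScript side, this round trip looks as follows. A minimal sketch: the worker source is passed inline through the `eval: true` option to keep everything in one file, and the `askWorker` helper is an illustrative name.

```javascript
const { Worker } = require('node:worker_threads');

// Sends a number to a worker thread and resolves with the worker's reply.
function askWorker(value) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(
      `
      const { parentPort } = require('node:worker_threads');
      // Runs on the worker's own event loop.
      parentPort.once('message', (n) => parentPort.postMessage(n * 2));
      `,
      { eval: true }
    );

    // Runs on the main thread's event loop once the reply has been delivered.
    worker.once('message', (result) => {
      resolve(result);
      worker.terminate();
    });
    worker.once('error', reject);

    worker.postMessage(value);
  });
}

askWorker(21).then((result) => console.log(result)); // 42
```

Neither thread ever calls into the other's JavaScript directly; each side only posts a message and lets the receiving event loop pick it up.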

**Why coalescing?** Performance. If a worker thread sends thousands of events per second, the event loop should not handle each one separately - this would kill throughput. Coalescing allows for batch processing.
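Coalescing has a practical consequence: the wake-up signal cannot carry the data itself, so the data goes into a separate queue that the callback drains completely. The sketch below models this in plain JavaScript, with `setImmediate` standing in for the event loop's next pass; it is not libuv code, and `CoalescingSignal` and `demo` are hypothetical names.

```javascript
// Toy model of a coalescing wake-up signal.
class CoalescingSignal {
  constructor(callback) {
    this.callback = callback;
    this.pending = false;
    this.callbackRuns = 0;
  }

  send() {
    if (this.pending) return; // already flagged: this send is coalesced
    this.pending = true;
    setImmediate(() => {
      this.pending = false;
      this.callbackRuns++;
      this.callback();
    });
  }
}

function demo(sendCount) {
  return new Promise((resolve) => {
    const queue = [];
    let processed = 0;

    const signal = new CoalescingSignal(() => {
      // Drain everything queued so far, not just one item per wake-up.
      processed += queue.splice(0).length;
      resolve({ callbackRuns: signal.callbackRuns, processed });
    });

    for (let i = 0; i < sendCount; i++) {
      queue.push(i);
      signal.send();
    }
  });
}

demo(1000).then((result) => console.log(result));
// { callbackRuns: 1, processed: 1000 }
```

A callback that handled only one item per invocation would silently lose the other 999 here, which is a common source of bugs in native addons.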

**Native addons** actively use `uv_async_t`. When a C++ addon performs blocking work in a separate thread (for example, video processing), it calls `uv_async_send()` to notify JavaScript about the result.

**Myth:** Node.js is completely single-threaded, all code is executed in one thread.

**Reality:** JavaScript code is executed in a single thread, but libuv uses a thread pool (4+ threads) for blocking operations (fs, dns.lookup, crypto).

Confusion arises from simplified explanations like "Node.js is single-threaded." In reality, libuv hides multithreading: JavaScript runs in a single thread, but file operations, DNS requests, and cryptography are executed in parallel. UV_THREADPOOL_SIZE controls the number of these hidden threads.

What happens if a worker thread calls uv_async_send() 1000 times in a row?

Key Ideas

  • **libuv - cross-platform layer** between Node.js and OS (epoll/kqueue/IOCP). Abstracts differences in asynchronous I/O.
  • **Thread pool (4 threads by default)** performs blocking operations: fs, dns.lookup, crypto. Network I/O goes through the event loop (epoll/kqueue/IOCP). The pool size is configured via the UV_THREADPOOL_SIZE environment variable.
  • **Timers are implemented through a min-heap** with monotonic time. setTimeout(0) does not guarantee immediate execution - it depends on the event loop phase.
  • **uv_async_t** provides thread-safe communication between threads. worker_threads use this under the hood. Coalescing combines multiple sends into one callback.

Related topics

libuv - the foundation for understanding the entire asynchronous architecture of Node.js:

  • Event Loop - libuv implements an event loop with six phases (timers, poll, check, etc.). Understanding libuv explains why setTimeout and setImmediate can execute in a different order.
  • Worker Threads - worker_threads uses uv_async_t for inter-thread communication. Each worker has its own event loop (a separate uv_loop_t).
  • Async Patterns - All async patterns in Node.js (callbacks, promises, async/await) are built on top of libuv primitives. fs.promises.readFile returns a Promise, but under the hood, it's a uv_fs_t request.

Questions for reflection

  • When an application makes a lot of DNS requests, is it better to use dns.lookup() or dns.resolve4()? Why?
  • Is it possible to increase UV_THREADPOOL_SIZE to 1000, and would it be a good idea for production?
  • Why can't libuv use epoll for reading regular files (not pipes, not sockets)? What prevents truly asynchronous file I/O in POSIX?

Related lessons

  • os-04-scheduling