Node.js Internals

Event Loop: The Heart of Node.js

Why can Netflix stream video to millions of users simultaneously on Node.js servers? Why does Discord handle billions of messages a day with minimal latency? The secret is not in the power of the servers, but in understanding the Event Loop - the mechanism that makes Node.js one of the most efficient solutions for I/O-intensive applications.

  • **LinkedIn** migrated from Ruby on Rails to Node.js and reduced the number of servers from 30 to 3 while handling the same traffic. Reason: Event Loop efficiently uses a single thread instead of creating a new thread for each connection.
  • **PayPal** after switching to Node.js doubled requests/sec while reducing response time by 35%. The Event Loop allowed handling API calls to banks in parallel, instead of sequentially waiting for each response.
  • **Walmart** processes 500 million pageviews/month on Node.js, saving millions on infrastructure. The Event Loop allows a single server to maintain hundreds of thousands of WebSocket connections for real-time cart updates.

Loop Overview

Consider a waiter in a busy restaurant. Instead of standing at each table waiting for the customer to decide, the waiter moves continuously: taking an order here, delivering a dish there, picking up a check elsewhere. One person, dozens of tables served simultaneously. This is exactly how Node.js works.

The **Event Loop** is a loop that runs for as long as there is pending work, checking task queues and executing their callbacks one by one. Node.js runs JavaScript on a single thread, but thanks to the asynchronous model, it handles thousands of connections concurrently. The secret is that input-output (I/O) operations are performed in the background by the operating system or the libuv library, and JavaScript code is called only when the result is ready.
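A minimal sketch of this idea, assuming a hypothetical `fakeIo` helper that stands in for a database or network call by resolving after a delay. Three 100ms waits are started on one thread and finish in roughly 100ms total rather than 300ms, because the thread is free while the waits are pending.

```typescript
import { performance } from 'node:perf_hooks';

// Hypothetical stand-in for a database or HTTP call: resolves after `ms` milliseconds.
function fakeIo(label: string, ms: number): Promise<string> {
  return new Promise((resolve) => setTimeout(() => resolve(label), ms));
}

async function runConcurrently(): Promise<{ results: string[]; elapsedMs: number }> {
  const start = performance.now();
  // All three waits are started before any of them finishes.
  const results = await Promise.all([
    fakeIo('users', 100),
    fakeIo('orders', 100),
    fakeIo('payments', 100),
  ]);
  return { results, elapsedMs: performance.now() - start };
}

const concurrentRun = runConcurrently();
concurrentRun.then(({ results, elapsedMs }) => {
  console.log(results, `finished in ~${Math.round(elapsedMs)}ms`);
});
```

The exact elapsed time depends on the machine, so the sketch prints it instead of assuming a value.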

**Why is Node.js faster than traditional multithreaded servers for I/O-intensive tasks?** In a thread-per-connection server (the classic Apache or Tomcat model), each connection gets its own thread. 1000 connections = 1000 threads = about a gigabyte of memory for stacks + expensive context switches. Node.js uses a single thread for JS code and delegates I/O to the operating system. Result: a single server can keep hundreds of thousands of WebSocket connections open at once.

Real example: API server processes 1000 requests

**Multithreaded Server (Apache):**

- 1000 requests = 1000 threads
- Each thread: ~1MB stack = 1GB memory
- Context switch between threads: thousands of switches/sec
- CPU spends time managing threads rather than doing useful work

**Node.js:**

- 1000 requests = 1 JavaScript thread + libuv thread pool (4 threads by default, used for file system, DNS, and crypto work)
- Memory: ~50MB for JS heap + small overhead for callbacks
- While one request waits for the database, the Event Loop handles others
- CPU is busy only when there is real work (executing JS code)

That's why Node.js has become a popular choice for microservices and real-time applications.

**Main rule:** Never block the Event Loop! Any synchronous operation that lasts more than a few milliseconds (complex calculations, synchronous reading of large files, `JSON.parse()` on megabytes of data) will freeze the entire server. For CPU-intensive tasks, use Worker Threads or offload them to separate microservices.
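As an illustration of offloading CPU-heavy work, here is a small sketch using the built-in `worker_threads` module. The `sumInWorker` helper and the inline worker script are hypothetical names chosen for this example. The script is passed as a string with `eval: true` only to keep the example in one file.

```typescript
import { Worker } from 'node:worker_threads';

// Worker body as plain JavaScript; with `eval: true` it runs as a CommonJS script.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  let sum = 0;
  for (let i = 1; i <= workerData.n; i++) sum += i; // CPU-bound loop, off the main thread
  parentPort.postMessage(sum);
`;

// Runs the summation in a separate thread and resolves with the result.
function sumInWorker(n: number): Promise<number> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { n } });
    worker.once('message', (value: number) => resolve(value));
    worker.once('error', reject);
  });
}

const workerSum = sumInWorker(1_000_000);
workerSum.then((sum) => console.log('sum computed in worker:', sum));
```

While the worker is busy, the main thread's Event Loop stays free to serve other callbacks. In a real project the worker would more likely live in its own file.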

An API server on Node.js handles requests to a database. The average request takes: 5ms CPU + 45ms waiting for a database response. How many requests per second can one process theoretically handle?
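One way to reason about this question, as a back-of-the-envelope sketch: only the CPU portion occupies the JavaScript thread, so the theoretical ceiling is set by CPU time per request, while the waiting time determines how many requests are in flight at once. The function names below are hypothetical, and the model ignores GC pauses and other overhead.

```typescript
// Upper bound on throughput when each request needs `cpuMs` of main-thread time.
function maxRequestsPerSecond(cpuMs: number): number {
  return 1000 / cpuMs;
}

// Little's law: average number of requests in flight = throughput x latency.
function requestsInFlight(requestsPerSecond: number, latencyMs: number): number {
  return (requestsPerSecond * latencyMs) / 1000;
}

const ceiling = maxRequestsPerSecond(5);           // 5ms CPU per request
const inFlight = requestsInFlight(ceiling, 5 + 45); // 50ms total latency

console.log(`${ceiling} req/sec, ~${inFlight} requests in flight`); // 200 req/sec, ~10 requests in flight
```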

Loop Phases

Event Loop is not just `while(true)`. It is a strictly ordered cycle of **6 phases**, each of which processes its own type of tasks. Understanding the phases is critical for debugging: why `setImmediate()` sometimes executes before `setTimeout(0)`, why `process.nextTick()` can freeze the server, how the poll phase works, which takes up most of the time.

After **each phase** (and, since Node.js 11, after each individual callback within a phase), **microtasks** are executed: first the entire `process.nextTick()` queue, then `Promise.then()` / `queueMicrotask()`.

**Details of each phase:**

1. **Timers** - executes `setTimeout()` and `setInterval()` callbacks whose timers have expired. Important: timers do not guarantee exact execution time. `setTimeout(fn, 100)` means "execute no earlier than after 100ms," but it may be later if the Event Loop is busy.
2. **Pending callbacks** - executes I/O callbacks deferred from the previous cycle (e.g., TCP errors).
3. **Idle, prepare** - an internal libuv phase used for preparation for poll.
4. **Poll** - THE MOST IMPORTANT phase. Here the Event Loop receives new I/O events (incoming HTTP requests, database responses, data from sockets) and executes their callbacks. If the queue is empty, the Event Loop **blocks** here and waits for new events (but not longer than the nearest timer).
5. **Check** - executes `setImmediate()` callbacks. This phase exists to allow code execution immediately after the poll phase.
6. **Close callbacks** - executes connection closure callbacks (`socket.on('close')`, `server.close()`).
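The phase order can be observed with a short experiment. The sketch below schedules one callback of each kind from inside an I/O callback (a `stat()` on the current directory is used here only as a convenient source of an I/O completion) and records the order in which they run.

```typescript
import { stat } from 'node:fs';

const phaseOrder: string[] = [];

const phaseOrderDone = new Promise<string[]>((resolve) => {
  // The stat() callback runs in the poll phase, so everything below is scheduled from there.
  stat('.', () => {
    setTimeout(() => {
      phaseOrder.push('timers: setTimeout');
      resolve(phaseOrder);
    }, 0);
    setImmediate(() => phaseOrder.push('check: setImmediate'));
    Promise.resolve().then(() => phaseOrder.push('microtask: promise'));
    process.nextTick(() => phaseOrder.push('microtask: nextTick'));
  });
});

phaseOrderDone.then((order) => console.log(order.join(' -> ')));
// microtask: nextTick -> microtask: promise -> check: setImmediate -> timers: setTimeout
```

The microtasks run as soon as the I/O callback returns, the check phase follows poll in the same iteration, and the timer has to wait for the timers phase of the next iteration.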

Real Case: Slow Timers

You call `setInterval(() => sendMetrics(), 5000)` to send metrics every 5 seconds. But in production, metrics arrive every 10-15 seconds. Why?

**Reason:** The Event Loop is blocked by a CPU-intensive task. For example, `JSON.stringify()` on a large object takes 8 seconds. During this time:

- The timer expired after 5 seconds
- But the Event Loop is stuck in another callback (JSON serialization)
- Only after 8 seconds will the Event Loop reach the timers phase
- The timer will execute with a 3-second delay

**Solution:** Break heavy operations into chunks or use Worker Threads.
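The same effect can be reproduced at a smaller scale. In this sketch a hypothetical `blockFor` helper busy-waits to simulate a CPU-heavy callback, and a timer requested for 50ms fires only after the 200ms of blocking has ended.

```typescript
import { performance } from 'node:perf_hooks';

// Simulates CPU-heavy synchronous work by spinning for `ms` milliseconds.
function blockFor(ms: number): void {
  const end = performance.now() + ms;
  while (performance.now() < end) {
    // busy-wait: the Event Loop cannot advance while this runs
  }
}

const scheduledAt = performance.now();
const timerDelay = new Promise<number>((resolve) => {
  // Requested delay: 50ms.
  setTimeout(() => resolve(performance.now() - scheduledAt), 50);
});

blockFor(200); // the timers phase cannot be reached until this returns

timerDelay.then((actualMs) => {
  console.log(`requested 50ms, fired after ~${Math.round(actualMs)}ms`);
});
```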

**Danger of blocking operations:** A single `fs.readFileSync()` on a 100MB file will block the Event Loop for seconds. During this time:

- All new HTTP requests are queued by the OS (or receive ECONNREFUSED)
- All timers execute with a delay
- WebSocket connections may time out

In production, this means complete downtime of the service. ALWAYS use asynchronous versions: `fs.readFile()`, `crypto.pbkdf2()`, etc.
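For comparison, here is a sketch of the asynchronous variant. `crypto.pbkdf2()` runs on the libuv thread pool, so a timer on the main thread keeps ticking while the hash is being computed. The salt, iteration count, and the `hashWithoutBlocking` name are illustrative assumptions, not recommended production settings.

```typescript
import { pbkdf2 } from 'node:crypto';
import { promisify } from 'node:util';

const pbkdf2Async = promisify(pbkdf2);

// Hashes a password off the main thread and counts how often a 1ms timer fired meanwhile.
async function hashWithoutBlocking(
  password: string,
): Promise<{ keyLength: number; ticks: number }> {
  let ticks = 0;
  const ticker = setInterval(() => {
    ticks++;
  }, 1);
  try {
    const key = await pbkdf2Async(password, 'example-salt', 200_000, 32, 'sha256');
    return { keyLength: key.length, ticks };
  } finally {
    clearInterval(ticker);
  }
}

const hashRun = hashWithoutBlocking('correct horse battery staple');
hashRun.then(({ keyLength, ticks }) => {
  console.log(`derived ${keyLength} bytes; timer fired ${ticks} times during hashing`);
});
```

With `pbkdf2Sync()` in place of the asynchronous call, the interval callback could not run at all until hashing finished.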

An HTTP server is created. In each request handler, `crypto.pbkdf2Sync()` (a CPU-intensive password hash) runs and takes 500ms. The server receives 10 requests simultaneously. How many seconds does it take to process the last request?

Microtasks

Microtasks are a special queue that executes **between phases of the Event Loop** (and even between individual callbacks within a phase). In Node.js, there are two types of microtasks with different priorities: **`process.nextTick()`** (highest priority) and **Promise microtasks** (`Promise.then()`, `queueMicrotask()`).

Picture the Event Loop as a mailman visiting houses (phases) in sequence. Microtasks are urgent letters that **must** be delivered before moving to the next house. `process.nextTick()` letters are marked "open immediately" - they are processed BEFORE regular microtasks.

**Critical difference from macrotasks:** setTimeout, setImmediate, I/O callbacks are macrotasks. They are executed in their phases of the Event Loop. Microtasks are executed **between** phases and have priority. Even if 100 setTimeouts are waiting in the timers phase, ALL microtasks will be executed first.
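A small sketch of the "between individual callbacks" behavior, assuming Node.js 11 or later. Two `setImmediate` callbacks are queued for the same check phase, and each schedules its own microtasks.

```typescript
const log: string[] = [];

const logDone = new Promise<string[]>((resolve) => {
  setImmediate(() => {
    log.push('immediate 1');
    process.nextTick(() => log.push('nextTick from 1'));
    Promise.resolve().then(() => log.push('promise from 1'));
  });
  setImmediate(() => {
    log.push('immediate 2');
    process.nextTick(() => log.push('nextTick from 2'));
    Promise.resolve().then(() => {
      log.push('promise from 2');
      resolve(log);
    });
  });
});

logDone.then((entries) => console.log(entries.join(', ')));
// immediate 1, nextTick from 1, promise from 1, immediate 2, nextTick from 2, promise from 2
```

The microtasks scheduled by the first callback run before the second callback of the same phase, and `process.nextTick()` callbacks run before promise callbacks each time.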

**DANGER: process.nextTick() can freeze the Event Loop!** If each nextTick callback creates a new nextTick, an infinite chain is formed. The Event Loop will never reach the next phase because the nextTick queue is constantly being replenished. This is called **nextTick starvation**.
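The starvation effect can be shown safely with a bounded chain. In this sketch (the chain length of 1000 is arbitrary) a `setImmediate` callback is scheduled first, yet it runs only after every link of the `nextTick` chain has completed. An unbounded chain would keep it waiting forever.

```typescript
const CHAIN_LENGTH = 1000;
let ticksRun = 0;
let ticksSeenByImmediate = -1;

// Each nextTick callback schedules the next one, up to CHAIN_LENGTH.
function spin(): void {
  ticksRun++;
  if (ticksRun < CHAIN_LENGTH) process.nextTick(spin);
}

// Scheduled before the chain starts, but it cannot run until the nextTick queue is empty.
setImmediate(() => {
  ticksSeenByImmediate = ticksRun;
  console.log(`setImmediate ran after ${ticksSeenByImmediate} nextTick callbacks`);
});

process.nextTick(spin);
```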

Real bug: race condition caused by deferred initialization

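```typescript
import { EventEmitter } from 'node:events';

class Database extends EventEmitter {
  private connected = false;

  // Buggy API: returns before the connection is established
  connect(): void {
    // Emulation of async connection
    setImmediate(() => {
      this.connected = true;
      this.emit('ready');
    });
  }

  // Solution 1: Promise-based API
  connectAsync(): Promise<void> {
    return new Promise<void>((resolve) => {
      this.once('ready', () => resolve());
      this.connect();
    });
  }

  // Solution 2: callback
  connectWithCallback(onReady: () => void): void {
    this.once('ready', onReady);
    this.connect();
  }

  query(sql: string): void {
    if (!this.connected) throw new Error('Not connected!');
    // ...
  }
}

async function main(): Promise<void> {
  const db = new Database();
  db.connect();
  try {
    db.query('SELECT * FROM users'); // ERROR!
  } catch (err) {
    // Problem: query() executes synchronously,
    // while connect() completes only in the check phase of the Event Loop.
    console.error((err as Error).message);
  }

  const db1 = new Database();
  await db1.connectAsync();
  db1.query('SELECT * FROM users'); // OK

  const db2 = new Database();
  db2.connectWithCallback(() => {
    db2.query('SELECT * FROM users'); // OK
  });
}

main();
```

This is a classic example of why it's important to understand asynchrony at the Event Loop level, not just "async/await magic".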

**When to use each type:**

- **`process.nextTick()`** - for critical logic that must be executed BEFORE any I/O operations. For example, emitting an 'error' event only after the current function has returned, so the caller has a chance to attach a listener. Use VERY cautiously!
- **`Promise.then()` / `queueMicrotask()`** - the standard way for asynchronous logic. More predictable, less risk of starvation.
- **`setImmediate()`** - for deferring work to the check phase, after pending I/O has been polled. Ideal for breaking heavy tasks into chunks.
- **`setTimeout(fn, 0)`** - similar to setImmediate, but with a guarantee of "not earlier than in 1ms". Almost never needed in Node.js (there is setImmediate).
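As a sketch of the chunking technique mentioned for `setImmediate()`, the hypothetical `sumInChunks` function below processes a large array in slices and yields to the Event Loop between slices. The chunk size is an arbitrary assumption and would need tuning for real workloads.

```typescript
// Sums `values` in slices of `chunkSize`, yielding to the Event Loop between slices.
function sumInChunks(values: number[], chunkSize = 10_000): Promise<number> {
  return new Promise((resolve) => {
    let index = 0;
    let total = 0;

    const processChunk = (): void => {
      const end = Math.min(index + chunkSize, values.length);
      for (; index < end; index++) total += values[index];

      if (index < values.length) {
        // Let timers and pending I/O callbacks run before the next slice.
        setImmediate(processChunk);
      } else {
        resolve(total);
      }
    };

    processChunk();
  });
}

const numbers = Array.from({ length: 25_000 }, (_, i) => i + 1);
const chunkedSum = sumInChunks(numbers);
chunkedSum.then((total) => console.log('total:', total)); // total: 312512500
```

Each slice still blocks the loop for its own duration, so the chunk size is a trade-off between throughput and responsiveness.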

Given this code:

```typescript
setTimeout(() => console.log('A'), 0);

Promise.resolve().then(() => {
  console.log('B');
  process.nextTick(() => console.log('C'));
});

process.nextTick(() => console.log('D'));
```

What is the order of output?

Poll Phase

**Poll phase** is the heart of the Event Loop, the place where all the magic of Node.js asynchrony happens. It is here that the Event Loop receives new events from the operating system: incoming HTTP requests, responses from the database, data from files, socket events. The Poll phase is the only phase where the Event Loop can **block** and wait for new events.

Picture a waiter who has visited all tables (completed all Event Loop phases) and now stands at the entrance waiting for new customers. The wait is not indefinite - if an order is being prepared in the kitchen (a pending timer exists), the waiter will check the kitchen (return to the timers phase). The Poll phase works the same way: it blocks and waits for I/O events, but no longer than the nearest timer.

**How the poll phase works under the hood:** Node.js uses system calls like `epoll` (Linux), `kqueue` (macOS/BSD), `IOCP` (Windows) - these are OS kernel mechanisms for efficiently monitoring multiple file descriptors. Instead of polling each socket in a loop, the OS notifies Node.js when data appears on a descriptor. This operates at the kernel level without creating threads.

Real Case: Why the Server "Sleeps" When Idle

You launched an Express server and are looking at htop - the Node.js process shows 0% CPU. This is not a bug, it's a **feature**!

**What's happening:**

1. The server has processed all requests
2. The Event Loop reached the poll phase
3. The poll queue is empty, no pending timers
4. The Event Loop called `epoll_wait()` with timeout = ∞
5. The OS put the process into SLEEP state
6. The process does not consume CPU until an event arrives

**A new HTTP request arrives:**

1. The TCP packet hits the network card
2. The Linux kernel processes the TCP handshake
3. Data goes into the socket buffer
4. `epoll_wait()` returns control with event information
5. Node.js wakes up and processes the request
6. All this takes microseconds

That's why Node.js can handle thousands of connections with minimal resource consumption - most of the time it just sleeps, waiting for events from the OS.

**Danger: CPU-intensive task in I/O callback blocks poll phase**

```typescript
import { createServer } from 'node:http';

// Placeholder for a multi-megabyte JSON payload
const hugeString = JSON.stringify({ items: [] });

const server = createServer();
server.on('request', (req, res) => {
  // Parsing a huge JSON
  const data = JSON.parse(hugeString); // 2 seconds on the real payload
  res.end(JSON.stringify({ ok: true }));
});
```

While the first request is parsing JSON:

- The Event Loop is stuck in the poll phase callback
- New HTTP requests accumulate in the OS queue
- Other I/O events are not processed
- The server appears "frozen" to new clients

**Solution:** Break into chunks or use Worker Threads for heavy operations.

**Optimization of the poll phase for high-load applications:**

1. **UV_THREADPOOL_SIZE** - size of the libuv thread pool (default is 4). Increase it to the number of CPU cores for fs/crypto operations:

   ```bash
   UV_THREADPOOL_SIZE=16 node server.js
   ```

2. **Use streams** instead of buffering the entire file in memory:

   ```typescript
   fs.createReadStream('huge.json')
     .pipe(parser)
     .pipe(res);
   ```

   The Event Loop will process chunks without blocking on the entire file.

3. **Worker Threads** for CPU-intensive tasks - offload parsing, cryptography, compression to separate threads:

   ```typescript
   const { Worker } = require('worker_threads');
   const worker = new Worker('./heavy-task.js', { workerData: data });
   ```

4. **Graceful degradation** - if the Event Loop lag exceeds the threshold, reject new requests with 503:

   ```typescript
   const toobusy = require('toobusy-js');

   app.use((req, res, next) => {
     if (toobusy()) return res.status(503).send('Server too busy');
     next();
   });
   ```
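Event Loop lag itself can also be measured without third-party packages. The sketch below uses `monitorEventLoopDelay()` from the built-in `perf_hooks` module, with a hypothetical `blockFor` helper simulating a 100ms stall. The 10ms resolution and the timings are arbitrary choices for illustration.

```typescript
import { monitorEventLoopDelay, performance } from 'node:perf_hooks';

// Samples the loop every 10ms; recorded values are in nanoseconds.
const histogram = monitorEventLoopDelay({ resolution: 10 });
histogram.enable();

// Simulates CPU-heavy synchronous work by spinning for `ms` milliseconds.
function blockFor(ms: number): void {
  const end = performance.now() + ms;
  while (performance.now() < end) {
    // busy-wait
  }
}

const maxLagMs = new Promise<number>((resolve) => {
  setTimeout(() => {
    blockFor(100); // stall the loop once
    setTimeout(() => {
      histogram.disable();
      resolve(histogram.max / 1e6); // nanoseconds -> milliseconds
    }, 50);
  }, 50);
});

maxLagMs.then((lag) => console.log(`max observed loop delay: ~${Math.round(lag)}ms`));
```

A monitoring loop could compare a value such as `histogram.percentile(99)` against a threshold and start returning 503 when it is exceeded. The right threshold depends on the service.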

Key Ideas

  • **Event Loop consists of 6 phases:** timers, pending callbacks, idle/prepare, poll, check, close. Each phase processes its type of tasks in a strict order. Understanding the phases is critical for the predictable behavior of asynchronous code.
  • **Microtasks are executed between phases:** process.nextTick has the highest priority, then Promise.then/queueMicrotask, followed by macrotasks (setTimeout, setImmediate). Microtasks can cause starvation by blocking the Event Loop.
  • **Poll phase - the heart of asynchronicity:** this is where the Event Loop receives I/O events from the OS through epoll/kqueue and can block while waiting for new events. This allows Node.js to handle millions of connections with minimal CPU consumption.
  • **NEVER block the Event Loop:** any synchronous operation >10ms blocks ALL connections. Use asynchronous APIs, break heavy tasks into chunks, offload CPU-intensive work to Worker Threads.

Related topics

Event Loop is the foundation of Node.js's asynchronous nature. For a complete understanding, study the related concepts:

  • libuv and Thread Pool — libuv is a C library that implements an Event Loop. Understanding the thread pool (for fs/crypto operations) and async I/O (for network) explains why some operations are parallel and others are not.
  • Streams and Backpressure — Streams use the Event Loop to process data in chunks without blocking memory. The backpressure mechanism prevents memory overflow with a slow consumer.
  • Worker Threads — For CPU-intensive tasks, the Event Loop is not sufficient - separate threads are needed. Worker Threads allows executing JS code in parallel without blocking the main Event Loop.
  • Memory Management and Garbage Collection — GC is executed synchronously and blocks the Event Loop. Understanding V8 heap and GC patterns is critical for high-load applications.

Questions for reflection

  • A server processes 1000 req/sec with an avg latency of 50ms. After deploying a new feature, latency jumps to 500ms while CPU stays at 30%. How can Event Loop monitoring help diagnose the root cause?
  • When setImmediate is used to break a heavy task into chunks, HTTP requests still time out under load. Which phase of the Event Loop is being blocked and why?
  • In which scenarios is process.nextTick preferable to Promise.then, despite the risk of queue starvation? Provide a real-world example from a Node.js library (e.g., EventEmitter).

Related lessons

  • os-01-intro

**Myth:** `setTimeout(fn, 0)` and `setImmediate(fn)` are the same thing, just different names.

**Reality:** These are completely different mechanisms with different behaviors. `setTimeout` is executed in the timers phase (beginning of the cycle), `setImmediate` - in the check phase (after poll). Inside an I/O callback, `setImmediate` will ALWAYS execute before `setTimeout(0)`.

`setTimeout(fn, 0)` is actually `setTimeout(fn, 1)`: Node.js raises any delay below 1ms to 1ms. The Event Loop must reach the timers phase and check if the timer has expired. `setImmediate` is specifically designed to execute "right after the current poll phase." Inside the I/O callback, the order is deterministic: poll -> check (setImmediate) -> new cycle -> timers (setTimeout). That's why in production, `setImmediate` is used to defer work until right after the current poll phase, rather than `setTimeout(0)`.

An HTTP server is running. In the poll phase, the Event Loop receives 3 events simultaneously: 1. a new HTTP request 2. a response from PostgreSQL 3. data from an `fs.readFile()` call. In addition, a `setImmediate` callback is waiting in the check queue. What will be executed first?