Blockchain

DAG-based consensus: Avalanche, Narwhal

Bitcoin processes 7 transactions per second. Visa - 65,000. Can you catch up with Visa without sacrificing decentralization? Engineers struggled with this for decades until they asked the right question: why do blocks need to be in a single line at all? What if blocks could reference multiple previous ones - turning a chain into a web? The result - protocols with a throughput of 130,000 transactions per second, running today in Avalanche, Sui, and Aptos.

  • **Avalanche** processes ~4,500 TPS with sub-second finality for 1,800+ validators - the Snowball consensus lets each node query only 20 random neighbors instead of all others
  • **Sui** (built on a Narwhal variant) confirms simple transfers in ~400ms without consensus - the object-centric model automatically identifies which transactions are independent and can execute in parallel
  • **IOTA Tangle** - DAG for IoT devices: each transaction confirms two previous ones, and throughput grows with the number of participants instead of falling

Предварительные знания

  • The Consensus Problem: Why and How

DAG instead of a block chain

Blockchain in the classic sense is a **linear chain**: each block references exactly one previous block. This creates a strict ordering, but also a bottleneck: while the whole world waits for the next single block, throughput is physically limited.

**DAG (Directed Acyclic Graph)** is a graph where each vertex (block or transaction) can reference **multiple previous ones**. Edges are directed (from newer to older), cycles are forbidden. The result, instead of a single 'thread', is a 'web' - multiple transactions can be added **simultaneously** without waiting for each other.

Key properties of DAG structure: - **Directed** - edges have a direction: a new vertex references old ones, not vice versa - **Acyclic** - following edges, you can never return to the starting point (otherwise causality would break: a block cannot depend on itself) - **Multi-parent** - a vertex references multiple 'parents', encoding parallelism

**IOTA Tangle** - one of the first projects to use DAG. Each transaction confirms two previous ones (two-parent DAG). The idea: the more participants, the more transactions are confirmed in parallel - throughput **grows** with the number of users instead of dropping.

**BlockDAG** - a term for approaches where blocks form a DAG: PHANTOM, SPECTRE (from Hebrew University), Kaspa. Unlike IOTA (transaction DAG), here the DAG is made of **blocks** - each block contains a set of transactions and references several preceding blocks.

How does a DAG structure differ at the data-model level from a linear blockchain?

The Avalanche family: from Slush to Snowball

In 2018, an anonymous group 'Team Rocket' published a paper proposing a radically new approach to consensus. Instead of collecting signatures from all validators (as in PBFT), or mining blocks (as in PoW), they proposed **repeated random sub-sampling**.

The idea is simple: each node picks a **small random group** (sample) of other nodes and asks their opinion. If a majority in the sample considers a transaction valid - the node leans the same way. This process repeats, and the system reaches agreement through **metastability**: like a ball on a hilltop rolling into one of the valleys.

Why does this work? Metastability. Imagine 100 people in a room: each randomly asks 10 neighbors and joins the majority. Even with an initial split of 51/49 the system quickly converges to a single opinion - like a snowball (hence the name Snowball).

PropertyClassical BFT (PBFT, HotStuff)Avalanche (Snowball)
CommunicationO(n) or O(n^2) messages per blockO(k log n) - k is fixed (~20)
Scalability~100 nodes (communication bottleneck)~1000+ nodes (sample size k is independent of n)
FinalityDeterministic (after quorum - forever)Probabilistic (error < 10^-9 within seconds)
LeaderRequired (single point of failure / attack)Not required (leaderless)
Fault toleranceBFT: up to 1/3 maliciousParametric: depends on k, alpha

**O(1) communication per node** - each node in each round queries a fixed number k of neighbors (typically k=20). This is independent of the total number of nodes n. For comparison: in PBFT each node exchanges messages with **all** others - O(n) per node, O(n^2) total.

Why does Snowball add a confidence counter compared to Snowflake?

Narwhal + Tusk: separate data and ordering

Avalanche showed that DAG speeds up consensus. But researchers from Meta (formerly Facebook) went further: what if you **completely separate** two tasks - data dissemination and ordering? The result is **Narwhal** (DAG mempool) and **Tusk** (zero-message consensus).

In classical blockchain both tasks are intertwined: the leader collects transactions, packs them into a block, broadcasts the block, nodes vote on the block. If the leader is slow or malicious - everything stalls. Narwhal/Tusk cuts this knot: - **Narwhal** - each validator continuously disseminates its transactions as DAG vertices, without waiting for a leader - **Tusk** - determines the order of already-disseminated vertices using the DAG structure

**How Narwhal (DAG mempool) works:** 1. Time is divided into **rounds** 2. In each round every validator creates a **batch** (set of transactions) and broadcasts it 3. A batch from a new round contains **references** (hashes) to batches from the previous round from other validators 4. Before accepting a batch, a validator checks for a **certificate of availability** - confirmation that the data is available at 2f+1 nodes 5. Result: a DAG of batches where each batch is certified and available

**How Tusk (consensus) works:** Tusk sends **zero additional messages** - it uses the DAG structure already built by Narwhal as 'voting'. In even-numbered rounds an **anchor** vertex is chosen in a predetermined way. If the anchor received enough 'votes' (references from batches in the next round), it is **committed** together with all its ancestors.

**130,000 TPS** - benchmark result from Narwhal/Tusk paper (Danezis et al., 2022). For comparison: Bitcoin ~7 TPS, Ethereum ~30 TPS, Solana ~4,000 TPS in real conditions. The key to this performance is full bandwidth utilization: data is disseminated continuously without waiting for consensus.

**Sui** (by Mysten Labs, former Meta/Diem engineers) uses **Narwhal + Bullshark** (improved Tusk) as its consensus foundation. **Aptos** (also a Diem successor) uses **Quorum Store** - its own variation of Narwhal ideas for separating data dissemination and ordering. Both projects confirm: separating data and ordering is the architectural pattern Diem's heirs converged on.

What is the key architectural innovation of Narwhal/Tusk compared to classical BFT?

Parallel transaction processing in DAG

The DAG structure removes the constraint on sequential block addition. But there is another side to this: how to **execute** transactions in parallel? In a linear blockchain the order is simple - transaction 5 executes after transaction 4. In a DAG two branches can evolve simultaneously, and if both spend the same funds - a **conflict** arises.

Solution: split transactions into **conflicting** and **non-conflicting**: - **Non-conflicting** (independent): Alice pays Bob, Charlie pays Dave - different accounts, can execute in parallel - **Conflicting**: Alice pays Bob 10 ETH and simultaneously Charlie 10 ETH, but the balance is 15 ETH - one must be chosen In a well-designed system **>90% of transactions are independent** - parallelism provides a huge gain.

**Sui** implements this principle through an **object-centric model**: each object (coin, NFT, smart contract) has an owner. Transactions working with **different objects** are processed in parallel without consensus (fast path, ~400ms). Transactions touching **shared objects** go through the full Narwhal/Bullshark consensus.

ApproachThroughputLatencyExample
Linear chain (1 block at a time)1x (baseline)Block time (Bitcoin: 10 min)Bitcoin, classic Ethereum
Pipelining (overlapping phases)2-3xLower (phases in parallel)HotStuff
DAG + parallel execution10-100xSub-second for independent TXAvalanche (~4,500 TPS), Sui (~100K+ TPS)
DAG + data/ordering separation100x+~400ms for fast pathNarwhal/Tusk (130K TPS in benchmarks)

**TPS figures from benchmarks require context.** Narwhal/Tusk's 130K TPS is in controlled conditions (geo-distributed WAN, 50 nodes). Real throughput depends on: the share of conflicting transactions, smart contract complexity, network latency, number of validators. Solana claims 65K TPS but in practice shows ~4,000 TPS. DAG gives a multiplied gain, but not a magical one.

**Kaspa** (BlockDAG based on PHANTOM/GhostDAG) - one of the first projects to apply DAG to PoW: blocks are mined in parallel at ~1 block/sec (vs 1 block/10 min for Bitcoin), the DAG structure includes all parallel blocks instead of discarding forks.

DAG completely replaces blockchain and makes the linear chain obsolete

DAG and the linear chain solve different problems. DAG increases throughput through parallelism, but adds complexity to ordering and conflict resolution. Many DAG systems ultimately **linearize** the DAG to determine the final order (Narwhal/Tusk commits the anchor and all its ancestors in a linear sequence). DAG is an optimization of the **data layer**, not a replacement for **consensus logic**.

Linear ordering is unavoidable for conflicting transactions: if Alice can pay either Bob or Charlie - a choice must be made. DAG speeds up data dissemination and allows independent transactions to execute in parallel, but the final order of conflicting transactions is still determined linearly. Sui, for example, uses DAG consensus only for shared objects, and for owned objects bypasses consensus entirely.

Why can't DAG systems process ALL transactions in parallel?

Summary

  • **DAG (Directed Acyclic Graph)** replaces the linear chain with a 'web': each vertex references multiple previous ones, allowing transactions to be added **in parallel** instead of waiting sequentially for the next block
  • **Avalanche (Snowball)** - leaderless consensus through repeated random sub-sampling: each node queries k random neighbors, the system converges via **metastability** with O(k log n) communication instead of O(n^2) in classical BFT
  • **Narwhal/Tusk** separates **data dissemination** and **ordering** - all validators continuously disseminate data (Narwhal), while ordering is determined from the DAG structure without extra messages (Tusk), achieving 130K TPS in benchmarks
  • **Parallel processing** works for independent transactions (different accounts/objects), but **conflicting** transactions still require ordering - DAG speeds up the data layer but does not eliminate the need to linearize conflicts
  • Remember the opening question: can you catch up with Visa without sacrificing decentralization? DAG consensus shows that the bottleneck was not in consensus itself, but in the **linear data structure** - replacing the chain with a web gives a 10-100x speedup while maintaining BFT guarantees

Related topics

DAG consensus builds on classical BFT ideas and is connected to modern protocols:

  • The consensus problem: why and how — DAG protocols solve the same Byzantine generals problem, but through random sub-sampling (Avalanche) or DAG structure (Narwhal) instead of classical voting
  • BFT protocols — Classical BFT (PBFT) - O(n^2) communication, which limits scalability. DAG approaches reduce this to O(k log n) while maintaining BFT guarantees
  • HotStuff and linear BFT — HotStuff optimizes BFT to O(n) with pipelining; Narwhal/Tusk goes further by separating data dissemination and ordering
  • Solana — Solana uses its own approach to parallel processing (Sealevel) and PoH for ordering - an alternative to the DAG approach

Вопросы для размышления

  • Narwhal/Tusk shows 130K TPS in benchmarks, but real DAG networks (Sui, Avalanche) run an order of magnitude slower. What factors, in your opinion, create the gap between benchmark and production - and can it be closed?
  • Avalanche uses probabilistic finality through random sampling, while Tendermint uses deterministic finality through a 2/3 quorum. For what types of applications would you prefer each approach - and are there scenarios where probabilistic finality with error probability 10^-9 is practically equivalent to deterministic?
  • Sui separates transactions into 'owned objects' (no consensus, ~400ms) and 'shared objects' (full consensus). How would you design a DeFi application (e.g., a DEX) to maximize the share of transactions going through the fast path?

Связанные уроки

  • alg-13-dfs
DAG-based consensus: Avalanche, Narwhal

0

1

Sign In