Blockchain

BFT Consensus: PBFT and variants

Imagine you are a commander-in-chief whose generals must attack simultaneously. Communication goes only through messengers, messengers can be intercepted, and some generals are traitors. One will send you a false confirmation; another will deliberately attack too early. How do you coordinate an army when you don't know who the enemy is? Lamport, Shostak, and Pease formalized this as the Byzantine Generals Problem in 1982. For seventeen years, the known solutions were considered too expensive for practical systems. Then in 1999, Castro and Liskov showed that 3f+1 nodes and three phases of messages are enough to guarantee agreement even with f traitors present.

**Hyperledger Fabric** - IBM's enterprise blockchain, which in early versions used PBFT-like consensus between member organizations (banks, logistics, healthcare)
**Tendermint (Cosmos)** - a BFT engine powering 50+ blockchains (Cosmos Hub, Binance Chain, Terra). Every block is final immediately - no waiting for confirmations
**HotStuff (Diem/Libra)** - Facebook/Meta's protocol for their payment system. Linear complexity allowed scaling to hundreds of validators. A derivative is now used in Aptos
**Aviation and space** - Boeing 787 and SpaceX Dragon use BFT-like protocols to reconcile readings between redundant onboard computers

Castro & Liskov: BFT leaves the lab

In 1999, PhD student Miguel Castro and his advisor Barbara Liskov of MIT published "Practical Byzantine Fault Tolerance". Before this, BFT protocols were largely theoretical: the classic algorithms required an exponential number of messages. Castro and Liskov showed that BFT could be implemented with polynomial message complexity, O(n²), which is sufficient for dozens of nodes. Liskov later received the Turing Award (2008). She is probably best known for the Liskov Substitution Principle (LSP), but PBFT became one of the most cited papers in distributed systems.

Communication Complexity and BFT scaling

PBFT solved the BFT problem for practical systems, but it has a fundamental bottleneck - **the number of messages grows quadratically**.

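In the prepare and commit phases every replica broadcasts to every other replica, so a single request costs on the order of n² messages. This is why classic PBFT is practically unusable beyond n ≈ 20-30 nodes. Blockchains with hundreds of validators or more (Ethereum: ~900,000 validators) need different approaches.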
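The quadratic growth can be made concrete with a rough count. The sketch below assumes the normal-case PBFT pattern (the leader sends pre-prepare, backups broadcast prepare, all replicas broadcast commit) and ignores client traffic, checkpoints, and view changes; the function name `pbft_messages` is only illustrative.

```python
def pbft_messages(n: int) -> int:
    """Approximate message count for one request in normal-case PBFT.

    pre-prepare: the leader sends to the n-1 backups
    prepare:     each of the n-1 backups sends to the n-1 other replicas
    commit:      each of the n replicas sends to the n-1 other replicas
    """
    pre_prepare = n - 1
    prepare = (n - 1) * (n - 1)
    commit = n * (n - 1)
    return pre_prepare + prepare + commit  # equals 2 * n * (n - 1)


for n in (4, 16, 31, 100, 1000):
    print(f"n={n:>4}  messages per request = {pbft_messages(n):>9,}")
```

Under these assumptions, going from 4 to 100 replicas raises the cost of a single request from 24 to 19,800 messages.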

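**Threshold signatures** are the key optimization. Instead of every node sending a message to every other node, each node sends its signature share only to the leader. The leader aggregates a quorum of shares into a single compact signature and broadcasts it, which cuts the cost of a phase from O(n²) to O(n) messages.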
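The leader-side aggregation can be sketched as follows. This is a toy model under loud assumptions: `sign_share` uses a plain hash as a stand-in for a real threshold or BLS signature share (it is trivially forgeable and offers no security), the network size is assumed to be exactly n = 3f+1, and all names (`QuorumCertificate`, `aggregate`, and so on) are hypothetical. It only shows the message pattern and the quorum check.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class QuorumCertificate:
    block_hash: str
    signers: frozenset  # ids of the replicas whose shares were combined
    proof: str          # stand-in for one constant-size aggregate signature


def sign_share(replica_id: int, block_hash: str) -> str:
    # Placeholder for a signature share. NOT real cryptography.
    return hashlib.sha256(f"{replica_id}:{block_hash}".encode()).hexdigest()


def aggregate(block_hash: str, shares: Dict[int, str], f: int) -> Optional[QuorumCertificate]:
    """Leader side: combine shares once 2f+1 valid ones are present (n = 3f+1)."""
    quorum = 2 * f + 1
    valid = sorted(rid for rid, share in shares.items()
                   if share == sign_share(rid, block_hash))
    if len(valid) < quorum:
        return None  # not enough votes yet
    proof = hashlib.sha256("".join(shares[r] for r in valid).encode()).hexdigest()
    return QuorumCertificate(block_hash, frozenset(valid), proof)


def messages_all_to_all(n: int) -> int:
    return n * (n - 1)   # every replica sends to every other replica


def messages_via_leader(n: int) -> int:
    return 2 * (n - 1)   # shares to the leader, then one broadcast back


f = 1
n = 3 * f + 1
block_hash = hashlib.sha256(b"block 42").hexdigest()
shares = {rid: sign_share(rid, block_hash) for rid in (0, 1, 2)}

qc = aggregate(block_hash, shares, f)
too_few = aggregate(block_hash, {0: shares[0], 1: shares[1]}, f)
print(sorted(qc.signers), too_few)
print(messages_all_to_all(n), messages_via_leader(n))
```

In a real deployment the proof would be checked against a group public key, so every replica can verify the leader's claim with a single signature verification instead of 2f+1 of them.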

Comparison of BFT protocols by generation:

Protocol	Year	Complexity	Phases	Used in
PBFT	1999	O(n²)	3	Hyperledger (early versions)
Tendermint	2014	O(n²)		Cosmos, Binance Chain
HotStuff	2018	O(n)	3 (pipelined)	Diem (Libra), Flow
SBFT	2019	O(n)	2 (fast path)	JPMorgan Quorum
Narwhal+Tusk	2022	O(n)	DAG-based	Sui, Aptos (variant)

**HotStuff** was a breakthrough in 2018. It brought three key improvements:
1. Linear communication complexity via threshold signatures.
2. Pipelined phases: each new block starts prepare for itself and at the same time moves the blocks before it one phase closer to commit.
3. A simple view change (the hardest part of PBFT) in O(n).
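The pipelining idea can be illustrated with a simplified version of the chained commit rule. The sketch assumes that every block carries a quorum certificate for its direct parent and that a block is committed once it heads a chain of three certified blocks from consecutive views; `Block` and `newly_committed` are hypothetical names, and voting, signatures, and view changes are left out entirely.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:
    view: int                  # view (round) in which the block was proposed
    parent: Optional["Block"]  # block certified by the QC carried in this block


def newly_committed(proposal: Block) -> Optional[Block]:
    """Return the ancestor that becomes committed when `proposal` arrives."""
    b2 = proposal.parent
    b1 = b2.parent if b2 else None
    b0 = b1.parent if b1 else None
    if b0 is None:
        return None  # the chain is still too short
    # The three certified blocks must come from consecutive views.
    if b1.view == b0.view + 1 and b2.view == b1.view + 1:
        return b0
    return None


genesis = Block(view=0, parent=None)
tip = genesis
commits = []  # (view of the new proposal, view of the block it commits)
for view in range(1, 6):
    tip = Block(view=view, parent=tip)
    done = newly_committed(tip)
    commits.append((view, done.view if done else None))

print(commits)
```

Once the pipeline is full, every new proposal commits exactly one older block, so the cost of the three phases is shared across consecutive blocks instead of being paid separately for each.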

**Trade-off**: reducing complexity to O(n) typically requires trusting the leader for one round (optimistic) or using cryptographic primitives (threshold signatures, BLS aggregation) that are computationally more expensive than plain signatures.

**Myth**: BFT protocols are useless for blockchain because they require knowing all participants and don't scale.

**Reality**: Modern BFT protocols (HotStuff, Tendermint) are the foundation of most PoS blockchains. Consensus itself runs in a permissioned layer among a known validator set, while entry into that set is permissionless and managed through staking. Optimizations such as threshold signatures and pipelining solved the scaling problem for hundreds of nodes.

The confusion arises from conflating PBFT (1999, a research protocol) with modern BFT protocols. PBFT indeed does not scale, but its descendants - HotStuff (Diem), Tendermint (Cosmos), Gasper (Ethereum) - run in production with hundreds or thousands of validators.

Why is PBFT unsuitable for public blockchains with thousands of validators?

Property	PBFT	Nakamoto (PoW)
Finality	Instant	Probabilistic
Fault tolerance	f < n/3 Byzantine	< 50% hashrate
Participants	Known in advance (permissioned)	Anyone (permissionless)
Energy	Minimal	Enormous
Scale	Tens of nodes	Thousands of nodes
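The f < n/3 bound from the table translates into concrete cluster sizes. The sketch below is a plain arithmetic illustration; the helper names are hypothetical, and the quorum formula is the general one that reduces to 2f+1 when n = 3f+1.

```python
def max_faulty(n: int) -> int:
    """Largest f with 3f + 1 <= n, i.e. f < n/3."""
    return (n - 1) // 3


def quorum(n: int) -> int:
    """Smallest quorum such that any two quorums share at least f+1 replicas."""
    f = max_faulty(n)
    return (n + f) // 2 + 1


for n in (4, 7, 10, 100):
    f, q = max_faulty(n), quorum(n)
    # Two quorums overlap in at least 2q - n replicas; more than f of them
    # means at least one honest replica is in both.
    assert 2 * q - n >= f + 1
    print(f"n={n:>3}  tolerates f={f:>2}  quorum={q:>2}")
```

Adding a single node beyond 3f+1 does not raise f: five replicas still tolerate only one Byzantine fault, yet need a quorum of four.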
