Blockchain
KZG Commitments and Polynomials
Ethereum processes transactions from rollups - Arbitrum, Optimism, Base - but storing all their data on L1 is prohibitively expensive. In March 2024, a single upgrade (EIP-4844) cut the cost by 100x. At its core is a mathematical construction that compresses 128 KB of data into a 48-byte "fingerprint" while still allowing any fragment to be verified. That construction is the KZG commitment, and to ensure its security, 140,000 people from 177 countries participated in the largest cryptographic ceremony in history.
- **EIP-4844 (proto-danksharding)** - blob transactions with KZG commitments reduced L2 transaction costs by 10-100x starting March 2024
- **Ethereum KZG Ceremony** - 140,000+ participants generated trusted setup parameters for the entire ecosystem using lava lamps, radioactive decay, and atmospheric noise
- **Data Availability Sampling** - KZG lets validators verify data availability by downloading just a few random points instead of the entire blob
Предварительные знания
Polynomial Commitments: a fingerprint of a formula
In the lesson on commitment schemes we learned how to commit to **one value** - hide it, then prove it was never changed. But what if you need to commit to an **entire function** - a polynomial - and later prove values at individual points without revealing the polynomial itself?
A **polynomial commitment** is a cryptographic "fingerprint" of a polynomial. You publish a commitment `C`, and then you can prove that `f(z) = y` for any point `z` by providing a short proof `π`. The verifier checks the proof without ever knowing the polynomial.
Why does a blockchain need this? Polynomials can **encode data**. Think of 4096 rollup data values as points on a polynomial graph. Instead of storing all 4096 values, you only need to store one commitment - **48 bytes**. And if someone doubts a specific value, you show them a proof - also **48 bytes**.
**Connection to erasure coding:** a degree-`d` polynomial is uniquely determined by `d+1` points. If we know the polynomial's values at enough points, we can **reconstruct** all other values. This is the foundation of **data availability sampling** - you can verify data availability by requesting just a few random points.
A rollup publishes a 48-byte polynomial commitment for 4096 data values. What does an evaluation proof allow you to do?
KZG: commitment on elliptic curves
**KZG** (Kate-Zaverucha-Goldberg, 2010) is a concrete polynomial commitment scheme built on **elliptic curves** and **bilinear pairings**. It is the scheme Ethereum chose for proto-danksharding (EIP-4844).
The key idea: instead of evaluating the polynomial `f(x)` with numbers, we evaluate it **on elliptic curve points**. As we know from the ECC lesson, operations with curve points are "one-way". Knowing `f(s)·G` (a point on the curve), it is impossible to recover either `f(s)` or the secret value `s`.
Why is **bilinear pairing** the key ingredient? A pairing `e(A, B)` is a special function that lets you "multiply" two curve points and get an element in a target group. It is the only known way to verify a polynomial relationship "in encrypted form" - without knowing `s` or the polynomial itself.
**Sizes in Ethereum (BLS12-381):** commitment = **48 bytes**, proof = **48 bytes**, verification = **2 pairing operations** (~1 ms). For comparison: a Merkle proof for the same data is ~1 KB, and an FRI proof (in STARKs) is ~50 KB.
A KZG proof verifier checks via bilinear pairing. What key fact does it rely on?
Trusted Setup: the ceremony of secret destruction
KZG requires a secret value `s` to generate the public parameters `[G, s·G, s²·G, ...]`. If someone knows `s`, they can create **forged proofs** - prove that `f(z) = y` for any `y`, even if it is false. How can we be sure `s` was truly destroyed?
The answer: an **MPC ceremony** (Multi-Party Computation). Many participants take turns adding their own randomness. The final secret `s` is known to **nobody** - it only ever existed "inside" the protocol and was destroyed piece by piece.
**The Ethereum KZG Ceremony** (2023) became the largest trusted setup ceremony in cryptographic history. More than **140,000 participants** from 177 countries contributed their randomness. **Anyone** could participate - via a browser, CLI, or even dedicated hardware devices.
Creative sources of randomness
How participants in the Ethereum ceremony generated their entropy
Participants competed in generating randomness creatively: - Mouse movements along chaotic paths - Readings from a radioactive decay sensor - Recording atmospheric noise from an antenna - Data from lava lamps (like Cloudflare) - Hash of a random Bitcoin block at the moment of participation The more unpredictable the source, the better. Even if 139,999 participants were agents of an attacker, a single honest participant with a good entropy source is enough.
**Toxic waste** is not a metaphor. If someone kept their secret `τᵢ` AND all other participants were also compromised, the attacker could reconstruct `s` and forge invalid proofs. The term "toxic waste" underscores that these values **must be destroyed** - like radioactive waste.
**Alternatives to trusted setup:** STARKs and FRI-based schemes require no trusted setup at all (transparent setup). But their proofs are significantly larger (~50 KB vs 48 bytes for KZG). This is the classic trade-off: simplicity of setup vs proof size.
More than 140,000 people participated in the Ethereum KZG Ceremony. How many of them would need to be dishonest to compromise the system?
Danksharding: KZG scales Ethereum
Why did Ethereum invest so much effort in the KZG ceremony? For **danksharding** - a fundamental scalability upgrade named after researcher Dankrad Feist. KZG commitments are the key cryptographic primitive in this plan.
**The problem:** rollups (Arbitrum, Optimism, zkSync) publish transaction data to Ethereum L1 for data availability. This data is written to `calldata` - expensive storage that remains on the blockchain **forever**. Rollups spend ~90% of their gas on publishing data.
A **blob** is a Binary Large Object of 128 KB containing rollup data. Each blob is interpreted as a **degree-4095 polynomial** (4096 coefficients), and a **KZG commitment** is computed for it. This commitment is recorded in the block, while the blob itself is stored separately and **deleted after ~18 days** - it is not needed forever, only for disputing fraud proofs.
**Data Availability Sampling (DAS)** is the next step in full danksharding. Instead of every validator downloading all blobs, each one requests just **a few random points** and verifies them with KZG proofs. Thanks to the erasure coding properties of polynomials, if 50%+ of the points are available, the data is **guaranteed to be recoverable**. This allows scaling the number of blobs without increasing the load on an individual validator.
**Result of EIP-4844:** after activation in March 2024, the cost of data publication for rollups dropped by **10-100x**. Transactions on L2 (Arbitrum, Optimism, Base) became fractions of a cent. Full danksharding promises another order-of-magnitude reduction.
A KZG commitment is just a compact hash of the data - like a Merkle Root but smaller
A KZG commitment is fundamentally different from a hash: it enables **evaluation proofs** - proofs of polynomial values at specific points, without revealing the other data. A Merkle proof proves inclusion of an element; a KZG proof proves the value of a function. It is an algebraic structure, not hash-based, and that is exactly why it supports erasure coding and data availability sampling.
A hash function is a black box: input → output, no algebra. A polynomial commitment preserves the algebraic structure of the polynomial: linearity, divisibility, interpolation. This enables proofs that are impossible with regular hashes - for example, proving that data can be recovered from an incomplete set of points.
EIP-4844 introduced blob transactions. Why are blobs only stored for ~18 days instead of forever, like regular blockchain data?
Key ideas
- A **polynomial commitment** is a cryptographic "fingerprint" of a polynomial (48 bytes) that lets you prove values at individual points without revealing the polynomial. Data is encoded as points on a polynomial graph, giving erasure coding for free
- The **KZG scheme** uses elliptic curves and bilinear pairings: `C = f(s)·G`, proof `π = q(s)·G`, verification via pairing in O(1). The proof is **48 bytes** regardless of the data size
- **Trusted setup** (Powers of Tau) requires the secret `s` to be destroyed. An MPC ceremony with 140,000+ participants guarantees security under the "1-of-N honest" assumption - **one** honest participant is sufficient
- **EIP-4844** introduced blob transactions - 128 KB of data compressed into a 48-byte "fingerprint". KZG lets rollups store data in separate blobs (deleted after 18 days), cutting L2 transaction costs by 10-100x. Full danksharding with Data Availability Sampling is the next step
Related topics
KZG commitments connect elliptic curve cryptography with blockchain scaling:
- Elliptic Curves (ECC) — KZG uses BLS12-381 curve points and bilinear pairings for commitment and verification
- Commitment Schemes — KZG is a specific polynomial commitment scheme with hiding and binding properties
- Hash Structures: Patricia, Verkle Trees — Verkle Trees replace Merkle proofs with polynomial commitments (KZG/IPA) for compact state proofs
- Data Availability — KZG commitments are the foundation of Data Availability Sampling in full danksharding
Вопросы для размышления
- KZG requires a trusted setup while STARKs do not (transparent setup). Why did Ethereum still choose KZG for EIP-4844 despite the need for a ceremony?
- If blobs are deleted after 18 days, how can a new node sync the historical state of a rollup? Who stores that data after deletion?
- Data Availability Sampling lets each validator check just a few random points. What happens if an attacker hides exactly the points nobody requested?
Связанные уроки
- bc-08-commitment — Commitment schemes are the foundation of KZG
- bc-54-zk-intro — KZG is a key primitive for ZK proof systems
- bc-55-zk-snark — KZG commitments are used in PLONK/SNARK
- bc-57-plonk — PLONK is built on top of KZG polynomial commitments
- calc-16-taylor — Polynomials as functions - algebraic analogy
- aa-03-homomorphisms — Homomorphic binding in KZG via group properties
- crypto-34-zkp-basics