Blockchain
PLONK and Universal ZK Systems
Groth16 is the Ferrari of zk-SNARKs: the most compact proof, the fastest verification. But imagine that for every firmware update Ferrari required rebuilding the entire engine from scratch, calling thousands of engineers to a special ceremony. That is exactly what happens with a per-circuit trusted setup: any change to the ZK circuit requires a new MPC ceremony. In 2019 Gabizon, Williamson, and Ciobotaru published PLONK - a protocol with a universal setup where one ceremony serves all circuits. Today PLONKish systems (UltraPlonk, Halo2) run inside zkSync, Scroll, Aztec, and Polygon zkEVM, processing millions of transactions. All the power comes from four ideas: a flexible gate equation, universal SRS, custom gates, and lookup tables.
- **zkSync Era** - a zkRollup by Matter Labs using PLONKish arithmetization (boojum prover) to scale Ethereum: ~$0.1 per transaction instead of ~$5 on L1
- **Scroll** - a zkEVM based on Halo2 (PLONKish with custom gates and lookups), fully EVM-equivalent, allowing Solidity contracts to be deployed without changes
- **Aztec** - a privacy-focused zkRollup where UltraPlonk allows hiding not only amounts but also the called contract functions, creating fully private DeFi transactions
Предварительные знания
PLONKish Arithmetization: More Flexible Than R1CS
In the zk-SNARK lesson we looked at R1CS - a system where each constraint allows exactly one multiplication: `a * b = c`. This works, but creates limitations: for complex operations (range checks, bitwise operations, hashing) everything must be decomposed into elementary multiplications, generating thousands of constraints. **PLONK** (Permutations over Lagrange-bases for Oecumenical Noninteractive arguments of Knowledge, 2019) proposes a radically different approach - a **universal gate equation** with **selector vectors** that "switch on" and "switch off" the required parts.
But how do you connect wires between rows? In R1CS all constraints operate on the same shared witness vector. In PLONK each row has its own `a, b, c`, and there needs to be a way to say: "the output of row 3 is the input of row 7". For this PLONK uses **copy constraints** via a **permutation argument** - a powerful cryptographic mechanism that proves values at certain table positions are equal.
The PLONK paper was published in 2019 (authors: Ariel Gabizon, Zachary Williamson, Oana Ciobotaru). Williamson is a co-founder of **Aztec Protocol**, one of the leading zkRollups. The name stands for "Permutations over Lagrange-bases for Oecumenical Noninteractive arguments of Knowledge" - the capital letters were chosen to spell a memorable word. PLONK became the foundation for a whole family of protocols: TurboPlonk, UltraPlonk, Halo2, each of which extends the base system with custom gates and lookup tables.
**R1CS vs PLONK by the numbers.** For verifying an ECDSA signature: R1CS requires ~300,000 constraints, while PLONKish arithmetization with custom gates needs around ~15,000 rows. For Poseidon hash (a popular ZK-friendly hash): R1CS ~320 constraints per round, PLONKish ~35 rows per round using wide gates. The more complex the operation, the greater the benefit of PLONK's flexibility.
In the PLONK gate equation q_L·a + q_R·b + q_O·c + q_M·(a·b) + q_C = 0, you need to express the constraint "a equals the constant 7". Which selector values are correct?
Universal SRS: One Ceremony for All Circuits
Recall the main problem with Groth16: **per-circuit trusted setup**. Every time the circuit changes (a feature is added to a zkRollup, contract logic is updated) a new Phase 2 ceremony must be held. For evolving projects this is a disaster - an update requires weeks of coordination. PLONK solves this problem via a **Universal and Updateable Structured Reference String** (SRS) - parameters that are generated once and work for **any** circuit of a given size.
How does this work technically? PLONK uses **KZG polynomial commitments** (Kate-Zaverucha-Goldberg, 2010). KZG lets you "commit" to a polynomial and then prove its value at any point using only the SRS. The key property: the commitment does not depend on the specific polynomial - it depends only on the maximum degree. Therefore a single SRS (a set of `[τ⁰]₁, [τ¹]₁, ..., [τᴺ]₁` on the elliptic curve) is suitable for commitments to **any** polynomial of degree ≤ N.
Historically the first large-scale Powers of Tau ceremony was the **Perpetual Powers of Tau** project (started in 2019), open to any participant. The results of this ceremony are reused by many projects: **Hermez** (now Polygon zkEVM), **Loopring**, **Semaphore**, and others. Ethereum ran its own KZG ceremony for EIP-4844 (proto-danksharding) with a record **141,416 participants** - this SRS is now available to all ecosystem projects.
**SRS size and practical limits.** An SRS for N rows takes roughly `48·N` bytes (each G₁ element is 48 bytes in BLS12-381). For N = 2²⁰ (~1M rows): SRS ≈ 48 MB. For N = 2²⁸ (~256M rows): SRS ≈ 12 GB. The prover must load the SRS into memory, which caps the maximum circuit size. In practice, zkRollup projects split computations into chunks of ~2²⁰–2²² rows and aggregate the proofs.
What is the main practical advantage of a universal SRS in PLONK compared to the per-circuit trusted setup in Groth16?
Custom Gates: Beyond Addition and Multiplication
The basic PLONK gate equation `q_L·a + q_R·b + q_O·c + q_M·(a·b) + q_C = 0` is already more flexible than R1CS, but still limited: three wires, one multiplication, linear selectors. What if we need to check that a value falls in the range `[0, 2⁸)`? Or compute a Poseidon hash in the minimum number of rows? **Custom gates** extend the gate equation by adding more wires, high-degree monomials, and access to neighboring rows.
Why are custom gates so important for performance? In ZK systems the main bottleneck is the **number of rows**: the more rows, the longer it takes the prover to compute the proof (time ~ O(N log N) for FFT/NTT operations). A custom gate that encodes a complex operation in one row instead of ten speeds up proving by ~10x for that operation. But there is a trade-off: each new gate type increases the **degree** of the gate polynomial, which affects proof size and the number of protocol rounds.
The **Halo2** architecture deserves special attention. In it, a circuit is divided into **chips** - reusable modules with a specific custom gate configuration. A chip for ECC operations, a chip for Poseidon hash, a chip for range checks. This resembles the hardware design of an FPGA, where different blocks perform specialized operations. Halo2 is used in **Scroll** (zkEVM), **Taiko** (zkRollup), and **PSE** (Privacy and Scaling Explorations, an Ethereum Foundation division) projects. Each composes chips to suit its own needs.
**Degree vs efficiency trade-off.** A gate polynomial of degree d requires d+1 evaluation points in the protocol, which increases proof size. Vanilla PLONK (d=2) - the most compact proof. TurboPlonk with cubic gates (d=3) - a slightly larger proof, but 3-5x fewer constraints. UltraPlonk with degree-5 gates and lookups - an even larger proof, but the number of rows shrinks by 10-50x for typical operations. In practice, the gain from reducing the number of rows far outweighs the growth in proof size.
A range check x ∈ [0, 255] in Vanilla PLONK requires ~9 rows (8 boolean + 1 composition). With a custom gate this check takes 1 row. Why does this matter for the prover?
Lookup Tables: Plookup and LogUp
Custom gates speed up arithmetic operations, but some computations are fundamentally "non-arithmetic": bitwise XOR, range checks for arbitrary ranges, SHA-256 round constant tables, EVM opcode dispatch. Encoding them through gate equations is prohibitively expensive. **Plookup** (2020, Gabizon and Williamson) proposes a different approach: precompute a table of allowed values and prove that a value is **contained** in this table without revealing which one.
The cost of a lookup is amortized: the table is "paid" once (added to the prover's commitments), and each individual lookup costs roughly 1 row. For 10,000 XOR operations: via gates - 500,000 rows, via lookup into a table of 65,536 entries - ~75,000 rows (table + lookups). The more accesses to the same table, the more beneficial the lookup.
How does it all fit together? A modern zkEVM (Scroll, Polygon zkEVM, Taiko) runs on **PLONKish arithmetization** with a **universal SRS**, **custom gates** for arithmetic operations, and **lookup tables** for everything else. The prover takes a block of Ethereum transactions, "executes" them in a ZK circuit using dozens of specialized tables and custom gates, and generates a proof of ~400-700 bytes. The verifier on L1 checks this proof for ~300K gas (~$0.5 at 30 gwei). This was made possible precisely by combining all four techniques from this lesson.
**Benchmark.** Scroll zkEVM (Halo2-based) uses ~28 types of lookup tables and ~15 types of custom gates. Generating a proof for a block of ~100 transactions takes ~3-5 minutes on a GPU cluster. Without custom gates and lookups, proving time is estimated to be days, not minutes. Polygon zkEVM (PIL/eSTARK-based) uses a similar approach with ~50 state machines, each with their own lookup tables.
Lookup tables break zero-knowledge, because the prover reveals which table rows they access
A lookup argument proves only the **fact of membership** of a value in a table, without revealing which specific row. Plookup and LogUp use random challenges β and γ that "mix" the positional information via a grand product or a sum of fractions. The verifier is convinced that all witness values are contained in the table, but does not learn which value corresponds to which row. The zero-knowledge property is preserved via blinding factors - random polynomials added to the commitments.
The intuition suggests that "checking against a table" = "pointing to a row" (like an index into an array). But a lookup argument works differently: it proves multiset containment, not index-based access. It is like proving that a card is in the deck without showing its position - via a cryptographic protocol, not by enumeration.
A zkEVM must prove the correctness of 10,000 XOR (8-bit) operations in a single block. Via PLONKish gates each XOR costs ~50 rows. Via lookup into a precomputed XOR table (256×256 = 65,536 entries) each XOR costs ~1 row. What is the approximate row savings when using lookups?
Key Ideas
- **PLONKish arithmetization** uses the universal gate equation `q_L·a + q_R·b + q_O·c + q_M·(a·b) + q_C = 0` with selector vectors that configure each row. Copy constraints via a permutation argument link values between rows - all without being tied to a specific circuit
- **Universal SRS** (Structured Reference String) is generated by a single Powers of Tau ceremony and works for any circuit of up to N rows. KZG polynomial commitments make the proof ~400-700 bytes. Remember the "ceremony for every firmware update" from the start of the lesson? Universal SRS solves this problem: one ceremony, forever
- **Custom gates** extend the base gate equation: more wires, high-degree monomials, access to neighboring rows. TurboPlonk, UltraPlonk, Halo2 - each adds its own extensions, reducing row count by 5-50x for typical operations
- **Lookup tables** (Plookup, LogUp) allow proving membership of a value in a precomputed table for ~1 row. For zkEVM this is critical: dozens of tables (opcodes, memory, keccak) reduce constraints from billions to millions, turning proof of EVM execution from a theoretical idea into a working system
Related Topics
PLONK sits at the intersection of theoretical cryptography and practical zkRollup infrastructure:
- zk-SNARK: From Theory to Practice — R1CS and Groth16 are PLONK's predecessors. PLONKish arithmetization replaces R1CS with a more flexible system, and universal SRS solves Groth16's per-circuit trusted setup problem
- zk-STARK: Transparent Proofs — STARK and PLONK are the two dominant ZK approaches. STARK uses FRI (transparent, post-quantum), PLONK uses KZG (compact proofs, universal setup). Some systems combine both: PLONK+FRI (Plonky2)
- zkEVM: Proving EVM Execution — zkEVM is the primary consumer of PLONKish systems. The custom gates and lookup tables from this lesson are the building blocks from which ZK circuits for all 140+ EVM opcodes are assembled
- Recursive Proofs: Proofs of Proofs — PLONK with KZG allows efficiently verifying one proof inside another. Recursive composition is the key technique for aggregating proofs in a zkRollup and for IVC (Incrementally Verifiable Computation)
Вопросы для размышления
- An SRS of size 2²⁰ (~1M rows) takes ~48 MB, while one of 2²⁸ (~256M rows) takes ~12 GB. How does the SRS size limit affect zkRollup architecture? Why do projects split computations into blocks rather than generating one enormous SRS?
- Custom gates increase the degree of the gate polynomial, which increases proof size and the number of rounds. How do you find the balance between gate "specialization" and system universality? When is it better to use a lookup instead of a custom gate?
- PLONK with KZG requires a trusted setup, while PLONK with FRI (Plonky2) does not, but proofs are larger. If quantum computers become a reality, KZG-based systems will be broken. How is the industry preparing for this transition and what is the cost of migration?