Information Theory
LDPC Codes: Gallager's Construction
Цели урока
- Define an LDPC code through sparse matrix H and the Tanner graph
- Describe the belief propagation algorithm in LLR form for the AWGN channel
- Explain density evolution and the concept of the decoding threshold SNR
- Know LDPC applications in 5G NR, Wi-Fi 6, and NVMe storage
Предварительные знания
- Channel Capacity
- Error-Correcting Codes
How did a sparse matrix invented in 1960 turn out to be the best practical code 35 years later? Why is belief propagation optimal on trees?
- Wi-Fi 6 (802.11ax): LDPC mandatory for MCS >= 10, enabling 9.6 Gbps in 8x8 MIMO
- 5G NR: LDPC for PDSCH/PUSCH, blocks 500-8448 bits, rates 1/3 to 22/24
- NVMe SSD: LDPC adapts to NAND flash degradation, extending device life 3-5x
- QR codes: Reed-Solomon; its successor LDPC is used in NFC/Bluetooth
35 Years in a Drawer
Robert Gallager (MIT, 1960): doctoral dissertation 'Low-Density Parity-Check Codes'. Showed: with optimal degree distribution LDPC operates near the Shannon limit. Problem: for n=10000 BP needs millions of operations - impossible on 1960s computers. Work forgotten. 1993: Berrou's turbo codes - similar idea (iterative decoding), caused a revolution. 1995: David MacKay and Radford Neal rediscover LDPC from internet archives. Comparison with turbo codes: LDPC wins at long blocks. 2003: IEEE 802.3an (10GBase-T Ethernet). 2009: Wi-Fi 802.11n. 2018: 5G NR. Gallager received the IEEE Hamming Medal in 2000.
LDPC Codes: Sparse Matrices and Tanner Graphs
Robert Gallager developed LDPC codes at MIT in 1960 as part of his doctoral dissertation. The idea: a parity-check matrix with very few ones per row and column. Gallager showed these codes at n -> inf operate near the Shannon limit. The world ignored the work for 35 years - no computational power existed for BP decoding. In 1995 MacKay and Neal rediscovered LDPC, called it 'turbo codes in disguise', and showed: these are the best practical codes. Today LDPC sits in every Wi-Fi chipset and every NVMe SSD.
Wi-Fi 802.11n (2009): LDPC as an optional feature, block lengths 648-1944 bits, code rates 1/2, 2/3, 3/4, 5/6. IEEE 802.11ax (Wi-Fi 6): LDPC is mandatory for MCS >= 10. Every modern Wi-Fi router implements Tanner graphs in a hardware LDPC decoder running at several Gbps.
What does 'low density' in LDPC mean - and why is it the key property?
Sparse H -> sparse Tanner graph with low-degree nodes -> BP messages nearly independent (no short cycles) -> BP converges to optimal decoding. A dense H would create message dependencies and BP failures.
Belief Propagation Decoding
Belief Propagation is an MCMC-like algorithm invented for LDPC as 'sum-product'. But the same algorithm runs in Bayesian networks (Judea Pearl 1988), the Viterbi algorithm for HMMs, and modern neural belief propagation. One algorithm - everywhere. Gallager found it for codes 25 years before Pearl.
Min-sum approximation in BP: instead of exact tanh/atanh, use the min operation over LLRs: L_{c->v} ~ min_{v' != v}|L_{v'->c}| * sign(prod). Loss < 0.2 dB at 3-4x reduction in complexity. All real LDPC chipsets use min-sum.
Why does BP converge to optimal decoding on trees but not necessarily on graphs with cycles?
On a tree: each message carries new independent information -> BP = MAP decoding. With cycles: 'correlation through loops' - a node receives from neighbors information that already incorporates its own LLR.
Density Evolution and the Decoding Threshold
Density Evolution (Richardson-Urbanke, 2001) is a precise tool for analyzing BP in LDPC codes. Instead of tracking individual LLRs, it tracks their distribution as n grows to infinity. Result: for each (lambda, rho) there exists a threshold SNR - below it BP converges to zero error, above it does not.
NVMe SSD and LDPC
NVMe PCIe Gen4 SSD (Samsung 990 Pro): LDPC with 64KB block, code rate 14/15. NAND flash degrades: fresh BER 10^-4, after 3000 erase cycles 10^-2. LDPC adapts: at low BER uses hard-decision BP; on degraded flash switches to soft-decision BP reading 3 thresholds instead of 1. This is density evolution in production: the BP threshold adapts to the current flash error density.
What is the threshold SNR in density evolution for LDPC?
Threshold SNR: below it BP converges (PE -> 0 as n grows), above it does not. Goal of LDPC optimization: bring the threshold SNR as close as possible to the Shannon limit (capacity-approaching). Density evolution computes this threshold exactly for a given (lambda, rho).
LDPC in 5G NR, Wi-Fi 6, and Storage
LDPC codes are literally everywhere. Every Wi-Fi 6 device, every 5G phone, every NVMe SSD uses LDPC. This is the most widely deployed class of capacity-approaching codes.
5G NR LDPC performance at BLER=10^-2: on AWGN at R=0.5 and block length 6144 bits - SNR 1.1 dB from the Shannon bound. At shorter blocks (300 bits) the gap grows to 1.8 dB. Polar codes win at blocks < 512 bits - which is exactly why two code families coexist in 5G NR.
Why does 5G NR use two base graphs (BG1 and BG2) rather than one?
LDPC works better relative to the Shannon bound for large N. Shorter blocks need different (lambda, rho) parameters optimized via density evolution. BG1: 500-8448 information bits. BG2: 40-2560 information bits. Selection happens in real time based on packet length.
Connection to Other Topics
LDPC unifies: information theory (density evolution, capacity-approaching), graph theory (Tanner graph, girth), and algorithms (BP = message passing = sum-product). BP is a universal inference algorithm: the same algorithm runs in Bayesian networks (Pearl 1988), HMMs (Viterbi), and the Kalman filter. The Information Bottleneck (Tishby) connects LDPC to neural network training: minimizing I(X;Z) while maximizing I(Z;Y) is the problem of finding a good code.
- Polar Codes: Arikan's Construction — Co-deployed in 5G NR alongside LDPC for control and data channels respectively
- Information Theory in Deep Learning — Belief propagation on Tanner graphs is the same algorithm as message passing in graphical models
- Channel Capacity — LDPC density evolution computes the capacity threshold for the underlying channel
Итоги
- LDPC defined by sparse H: low row/column weight -> sparse Tanner graph with low-degree nodes
- BP decoding: iterative LLR message exchange along graph edges, O(n) per iteration
- Density evolution: threshold SNR for specific (lambda, rho); optimization via LP
- 5G NR: QC-LDPC BG1/BG2, blocks 500-8448 bits, within 1.1 dB of Shannon at large blocks
Вопросы для размышления
- Why did LDPC remain dormant for 35 years and how did the computational revolution change that?
- What is the difference between min-sum and exact sum-product BP and when does it matter?
- What is the deep connection between LDPC, Bayesian networks, and neural network training through BP?