Linear Algebra
Linear Algebra over Finite Fields
How can 1 petabyte be transmitted over a satellite link with 30% packet loss without retransmission? Linear algebra over finite fields is the only mathematically guaranteed answer.
- **5G NR:** LDPC codes with rate 8/9 and 10 Gbps decoding throughput on Qualcomm ASICs - the physical layer of 5G
- **QR codes:** Reed-Solomon over GF(2^8) corrects up to 30% lost data - that is why damaged QR codes still scan
- **RAID-6 disk arrays:** two redundant disks using Reed-Solomon guarantee recovery from any two simultaneous disk failures
- **Satellite communication:** DVB-S2X uses LDPC plus BCH codes to operate at SNR = -2 dB (signal below noise floor)
Предварительные знания
- Finite fields GF(q)
- Vector spaces
- Hamming distance
Vector spaces over GF(q)
Linear algebra over GF(2) underlies all linear error-correcting codes. The Hamming(7,4) code was the first systematic code (Hamming, 1950, Bell Labs). LDPC codes (Gallager, 1963) are used in 5G NR with code rates up to 8/9 and decoding throughput of 10 Gbps on Qualcomm ASIC chips.
Linear algebra over GF(q) differs from real linear algebra in one key aspect: the characteristic of the field is finite. GF(2): 1+1=0. GF(q) for prime q is the field Z/qZ. GF(q^m) is an extension via an irreducible polynomial over GF(q). All standard theorems (rank, kernel, image) carry over word for word.
Over GF(2) all arithmetic is modulo 2: addition = XOR, multiplication = AND. This makes hardware implementation extremely efficient - a complete Hamming(7,4) decoder on an FPGA requires only 12 logic elements.
How many vectors does a k-dimensional subspace over GF(q) contain?
A k-dim subspace over GF(q) has q^k elements: each of the k basis coordinates can take any of q field values. Over GF(2) this is 2^k - which links directly to coding theory.
Linear codes and the Singleton bound
A linear code [n, k, d] over GF(q) is a k-dim subspace in GF(q)^n. Parameters: n is codeword length, k is message dimension, d is minimum Hamming distance. The Reed-Solomon code attains the Singleton bound d = n - k + 1 and is used in QR codes, Blu-ray, and the Voyager probes.
What does the Singleton bound state for a linear code [n, k, d]?
The Singleton bound says minimum distance d of a linear code [n, k, d] satisfies d <= n - k + 1. Proof: puncturing n - d positions gives an injection GF(q)^k → GF(q)^(n-d), so k <= n - d. Reed-Solomon codes attain equality.
Cryptography applications
In cryptography, linear algebra over GF(2) and GF(2^k) underlies AES, McEliece, and Goldreich-Levin codes. AES SubBytes uses an affine transform over GF(2^8); MixColumns is multiplication by a 4x4 matrix over GF(2^8). Code-based cryptography rests on the NP-hardness of decoding a general linear code.
The Singleton bound d <= n-k+1 follows directly from the requirement that any k columns of H must be linearly independent to ensure unique error correction. MDS codes are those that achieve this bound.
Why does AES operate over GF(2^8) rather than GF(2)?
GF(2) only gives linear operations (XOR). GF(2^8) provides a multiplicative structure on byte data. The multiplicative inverse in GF(2^8) is non-linear and is the cryptographic core of the S-box. Without that structure AES would lack non-linearity.
Connections to algebra, computer science, and cryptography
Linear algebra over finite fields permeates all modern digital technology.
- Cryptography (AES, ECC) — Related topic
- Coding theory: LDPC — Related topic
- Algebraic geometry — Related topic
- Quantum error correction — Related topic
Итоги
- A linear code [n,k,d] is a k-dimensional subspace of GF(q)^n with minimum distance d
- Generator matrix G (k x n) and parity-check matrix H ((n-k) x n) satisfy H*G^T = 0 over GF(q)
- Syndrome s = H*r^T: s = 0 iff r is a codeword; s != 0 uniquely identifies errors when wt(e) <= t
- Singleton bound: d <= n-k+1; MDS codes attain it (Reed-Solomon codes over GF(q^m))
- RS[255,223] over GF(2^8) corrects 16 errors per 255-byte block - used in CDs, QR codes, RAID
- LDPC: sparse H over GF(2) plus Tanner graph plus BP decoding - the physical layer of 5G, Wi-Fi 6, DVB
The Hamming(7,4) code over GF(2) has parameters [n=7, k=4, d=3]. How many errors does it correct, and why is it not an MDS code?
t = floor((d-1)/2) = floor(2/2) = 1. The Singleton bound gives d <= n-k+1 = 7-4+1 = 4. Hamming(7,4) has d=3 < 4, so it is not MDS. An MDS code [7,4,4] exists only over a field with q >= 7 (for example GF(7) or GF(8)) but not over GF(2).