Information Theory
Communication Channel and Capacity
Every transmitted message travels through a physical medium-wire, air, fiber. That medium distorts and corrupts. Channel models let us reason about this mathematically.
- Wi-Fi signal degrades with distance and interference
- Deep-space probes like Voyager operate at SNR so low that without coding, communication would be impossible
- Mobile networks manage hundreds of overlapping channels simultaneously
The Theorem That Founded Digital Communication
In 1948, Shannon proved the noisy-channel coding theorem. Before this, engineers believed that reducing error required reducing speed. Shannon showed this was wrong: transmission at any rate below capacity is possible with arbitrarily small error probability.
Channel Model
A discrete memoryless channel (DMC) maps input X to output Y according to a conditional distribution p(y|x). The channel has no memory: each symbol is corrupted independently. This abstraction is surprisingly powerful.
The BSC models hardware bit-flips and random noise. The BEC models packet drops in networks and erasure codes. A third fundamental model is the AWGN (Additive White Gaussian Noise) channel used for analog transmission: Y = X + Z, where Z ~ N(0, σ²).
Real channels are often modeled as compositions: a wireless link is AWGN, then the receiver quantizes to bits, turning it effectively into a BSC. The gap between these models and reality drives decades of research.
In a Binary Erasure Channel with erasure probability ε = 0.1, what happens to a transmitted bit?
Channel Capacity
Channel capacity C is the maximum rate at which information can be transmitted reliably. Shannon defined it as the maximum mutual information over all input distributions:
For the BEC with erasure probability ε, each non-erased bit carries full information, so C = 1 - ε. This is intuitive: a fraction ε of bits vanishes, and the rest arrive perfectly.
Capacity is a property of the channel, not of the code. No matter how clever the encoding, the rate cannot exceed C. With the right code, the rate can get arbitrarily close.
A BSC has flip probability p = 0.5. What is its capacity?
Shannon's Noisy-Channel Theorem
The theorem has two directions: achievability and converse. Achievability says that for any rate R < C, there exists a code with vanishing error probability as block length n grows. The converse says that for R > C, no code can achieve vanishing error.
The random coding proof uses the concept of typical sequences: with high probability, a long sequence of i.i.d. symbols looks like it came from the true distribution. This is the Asymptotic Equipartition Property in action.
Shannon's theorem gives us an algorithm for building capacity-achieving codes
The original theorem is an existence proof. It says good codes exist, but doesn't construct them. Practical capacity-achieving codes (Polar codes, 2008) came 60 years later.
The random coding argument averages over all codes-it's not a construction. Polar codes by Arıkan were the first explicit capacity-achieving codes.
A channel has capacity C = 0.8 bits/use. A code is designed at rate R = 0.75 bits/use. What does Shannon's theorem guarantee as block length grows?
Shannon-Hartley Formula
For continuous channels with bandwidth B and additive Gaussian noise, Shannon derived the exact capacity. This formula unifies bandwidth and signal power into a single bound.
The Shannon-Hartley formula reveals a crucial asymmetry: capacity grows linearly with bandwidth but only logarithmically with SNR. Doubling bandwidth doubles capacity; doubling power adds just one bit per dimension.
This has profound engineering implications. When the channel is power-limited (SNR is hard to increase), the only path to higher throughput is more bandwidth-spread spectrum, MIMO, mmWave. When bandwidth is limited (licensed spectrum costs money), engineers push SNR via higher-order modulation (256-QAM, 1024-QAM).
| Scenario | B | SNR (dB) | Capacity |
|---|---|---|---|
| Telephone modem | 3.4 kHz | 38 dB | ~56 kbit/s |
| ADSL | 1.1 MHz | ~30 dB | ~8 Mbit/s |
| Wi-Fi 6 (80 MHz) | 80 MHz | 30 dB | ~800 Mbit/s |
| 5G mmWave (800 MHz) | 800 MHz | 20 dB | ~8 Gbit/s |
A channel has B = 10 MHz and SNR = 3 (linear). What is the Shannon capacity?
Channel Capacity: Key Takeaways
- BSC, BEC, and AWGN are the three canonical channel models
- Capacity C = max I(X;Y) over input distributions
- Shannon's theorem: reliable communication is possible iff R < C
- Shannon-Hartley: C = B·log₂(1+SNR) for Gaussian channels
- Bandwidth scales capacity linearly; power only logarithmically
From Channels to Codes
Knowing the capacity bound is one thing-achieving it is another. Error-correcting codes are the engineering bridge between Shannon's existence proof and practical systems.
Вопросы для размышления
- Why does capacity go to zero when the BSC flip probability is 0.5, even though all bits are received?
- If doubling power adds only log₂(2)=1 bit/use, why do engineers still chase SNR improvements?
- How to estimate the theoretical capacity of a home internet connection?