Number Theory
Number Theory in Cryptography
1976: Diffie and Hellman publish 'New Directions in Cryptography'. Before that, secret communication required a prior in-person key exchange. After: two strangers can agree on a secret over a public channel, transmitting nothing secret at all. The entire revolution rests on one mathematical asymmetry-discrete exponentiation is easy in one direction and hard to invert.
- **TLS 1.3:** ECDH (X25519) for key agreement + RSA/ECDSA for certificate authentication-happens on every HTTPS connection
- **SSH:** RSA or Ed25519 for host authentication; ECDH for session key derivation with perfect forward secrecy
- **Signal protocol:** Double Ratchet algorithm built on ECDH; provides both forward secrecy and break-in recovery
Предварительные знания
RSA: Complete Walkthrough
RSA is not magic-it is a direct consequence of Euler's theorem. Understanding RSA from the inside means knowing why 2048-bit keys are secure, how to choose e, why d must be large, and how CRT speeds up decryption by a factor of four.
**RSA step by step:** 1. Key gen: choose p, q (1024-bit primes), n=p·q, φ(n)=(p-1)(q-1) 2. Public exponent: e=65537, gcd(e, φ(n))=1 3. Private exponent: d = e⁻¹ mod φ(n) 4. Keys: public=(n,e), private=(n,d) **Encrypt:** c = mᵉ mod n **Decrypt:** m = cᵈ mod n **Correctness:** cᵈ = m^(ed) = m^(kφ(n)+1) ≡ m (mod n) by Euler's theorem. **Common attacks:** - Small e (e=3): cube-root attack if no padding - Common modulus: two keys with the same n → factorization via GCD - Timing: decryption time leaks d (Kocher 1996)
Real RSA ALWAYS uses OAEP padding (PKCS#1 v2.1). Without padding: with e=3, if m < n^(1/3), then c = m³ as an ordinary integer and the cube root reveals m directly. Never use textbook RSA to encrypt real data.
Why is e=65537 preferred over e=3 in RSA?
The Discrete Logarithm Problem
The discrete logarithm problem (DLP): given g, gˣ ≡ h (mod p), find x. Exponentiation is easy-O(log x) via square-and-multiply. The inverse direction is believed to be hard for large p, which is the security assumption behind Diffie-Hellman.
**Discrete logarithm problem (DLP):** Given: generator g, value h, prime p. Find: x ∈ {0,...,p−2} with gˣ ≡ h (mod p). **Algorithms and complexity:** - Brute force: O(p)-impossible for 2048-bit p - Baby-step giant-step: O(√p) time and space - Index calculus: sub-exponential L_p[1/2, c] - Shor's algorithm: O(log² p)-quantum only **Secure parameters:** p at least 2048 bits; g a primitive root or of large prime order q.
The elliptic-curve DLP (ECDLP) is harder: no sub-exponential index calculus attack is known. ECDH with 256-bit keys gives ~128-bit security-equivalent to 3072-bit classical DH. That's why TLS 1.3 defaults to X25519.
Baby-step giant-step solves DLP in O(√p). Why is this useless against p ≈ 2²⁰⁴⁸?
Diffie-Hellman and ElGamal
The Diffie-Hellman protocol (1976) was revolutionary: two parties establish a shared secret over a public channel without any prior shared secret. Security rests on the DLP. ElGamal generalization turns DH into a full public-key encryption scheme.
**Diffie-Hellman key exchange:** 1. Public parameters: prime p, generator g 2. Alice picks secret a, sends A = gᵃ mod p 3. Bob picks secret b, sends B = gᵇ mod p 4. Alice computes K = Bᵃ = g^(ab) mod p 5. Bob computes K = Aᵇ = g^(ab) mod p Both get K = g^(ab). An eavesdropper sees g, A=gᵃ, B=gᵇ, p-recovering K requires solving DLP (or CDH). **ElGamal encryption:** - Key pair: x (secret), h = gˣ (public) - Encrypt: choose random k; send (c₁, c₂) = (gᵏ, m·hᵏ) - Decrypt: m = c₂ · (c₁ˣ)⁻¹
Modern deployments use elliptic-curve DH (ECDH) instead of multiplicative-group DH. X25519 (Curve25519)-the TLS 1.3 default-gives 128-bit security with 255-bit keys, eight times more compact than equivalent classical DH while being faster to compute.
An eavesdropper sees p, g, A=g^a mod p, B=g^b mod p. What do they need to compute K=g^(ab) mod p?
Key Ideas
- **RSA:** correctness via Euler's theorem; e=65537 resists small-exponent attacks; CRT speeds decryption 4×
- **DLP:** g^x mod p is easy to compute, hard to invert; BSGS O(√p); no sub-exponential ECDLP algorithm known
- **Diffie-Hellman:** shared secret g^(ab) without transmitting secrets; security = hardness of CDH/DLP
- **ElGamal:** randomized encryption over DH; same message yields different ciphertexts each time
Related Topics
Everything before converges here-in practical cryptographic protocols:
- Fermat's and Euler's Theorems — RSA correctness is a direct consequence of Euler's theorem a^φ(n) ≡ 1 (mod n)
- Post-Quantum Cryptography — RSA and DH are broken by Shor's algorithm; Kyber and lattice schemes replace them
- Primality Tests — RSA key generation requires fast primality testing-Miller-Rabin with 40+ rounds
Вопросы для размышления
- Common-modulus attack: two users share n but have different exponents (e₁,d₁) and (e₂,d₂). If the same message m is encrypted as c₁ and c₂, how can an attacker recover m using only the public keys?
- Unauthenticated DH is vulnerable to a man-in-the-middle attack. How does TLS solve this using certificates, and what number-theoretic operation verifies the certificate signature?
- ECDLP is harder than DLP in ℤₚ*. Why does the absence of a sub-exponential algorithm mean ECC keys can be much shorter than RSA keys at equivalent security?