Backend Transport
DNS and TLS: addressing and security
Every HTTPS request begins with two invisible negotiations: a name lookup that may touch four servers across the globe, and a cryptographic handshake that proves the server is who it claims to be. Get either one wrong and the page never loads - or worse, loads from an attacker.
- **Cloudflare 1.1.1.1**: a recursive DNS resolver answering ~1 trillion queries per day, with DNS-over-HTTPS hiding queries from on-path observers.
- **Let's Encrypt**: 350+ million active certificates as of 2024 - free, 90-day, fully automated via the ACME protocol; broke the cost barrier that kept HTTPS optional for a decade.
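- **Service meshes (Istio, Linkerd)**: automatic mTLS between every microservice pair, with short-lived SPIFFE identities issued by an internal CA - certificates are reissued automatically on the order of hours (Istio's default workload certificate lifetime, for example, is 24 hours).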
Historical context
In 1983, the Internet still relied on a single HOSTS.TXT file maintained at SRI International and synced manually to all connected machines. Paul Mockapetris at USC Information Sciences Institute proposed a radical alternative in RFC 882 and 883: a distributed, hierarchical database of name records. His insight was to delegate authority so each zone manages its own records, resolvers cache answers by TTL, and no single server bears the full load - the design that still underlies every HTTP request today.
DNS Resolution
**DNS (Domain Name System)** is a hierarchical, distributed database that maps human-readable domain names to IP addresses. Without DNS, every request to `api.github.com` would require knowing the numeric IP in advance. DNS is the phone book of the Internet - but one whose entries are cached across millions of servers and only eventually consistent: a cached copy stays in use until its TTL expires.
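The hierarchy can be illustrated with a toy model of the walk a recursive resolver performs: root, then TLD, then the zone's authoritative server. This is a sketch only. The server names, the zone data, and the address (taken from the 192.0.2.0/24 documentation range) are all made up, and real resolvers speak the DNS wire protocol rather than reading dictionaries.

```python
# Toy delegation data. Every name and address here is hypothetical.
ROOT = {"com.": "tld-com"}                          # root knows who serves each TLD
TLD_COM = {"example.com.": "auth-example"}          # TLD server knows each zone's authoritative server
AUTH_EXAMPLE = {"api.example.com.": "192.0.2.10"}   # authoritative server holds the A record

SERVERS = {"root": ROOT, "tld-com": TLD_COM, "auth-example": AUTH_EXAMPLE}


def resolve(name):
    """Walk root -> TLD -> authoritative and return (address, servers_queried)."""
    labels = name.rstrip(".").split(".")
    server, path = "root", []
    # Ask about progressively longer suffixes: "com.", "example.com.", "api.example.com."
    for i in range(len(labels) - 1, -1, -1):
        suffix = ".".join(labels[i:]) + "."
        path.append(server)
        answer = SERVERS[server].get(suffix)
        if answer is None:
            raise LookupError(f"{server} has no data for {suffix}")
        if answer in SERVERS:
            server = answer          # a referral: continue one level down
        else:
            return answer, path      # a final answer: the address itself
    raise LookupError(f"no address found for {name}")


address, path = resolve("api.example.com")
print(address, path)
```

Each step returns either a referral to a server closer to the answer or the record itself, which is why no single server needs to know every name.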
**DNS TTL and caching**: TTL (Time To Live) defines how many seconds a record may be cached. GitHub uses TTL=60s for fast failover. Most sites use TTL=300-3600s. The catch: when an IP changes, clients that cached the old record keep using it until the TTL expires. Best practice: lower the TTL to 60s at least one full old-TTL period (commonly a day) before any planned IP change, so that caches already hold the short TTL when the switch happens.
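The stale-record effect can be reproduced with a minimal TTL cache. The sketch below assumes a hypothetical `TtlCache` class, an injected fake clock, and made-up documentation addresses. It is not how any particular resolver is implemented.

```python
class TtlCache:
    """Caches (address, expiry) pairs and refetches only after the TTL has elapsed."""

    def __init__(self, clock):
        self._clock = clock      # callable returning the current time in seconds
        self._entries = {}       # name -> (address, expires_at)

    def lookup(self, name, fetch):
        now = self._clock()
        cached = self._entries.get(name)
        if cached and cached[1] > now:
            return cached[0]                 # still fresh: the origin is never asked
        address, ttl = fetch(name)           # expired or missing: ask the origin
        self._entries[name] = (address, now + ttl)
        return address


now = [0]                                                   # fake clock, in seconds
authoritative = {"api.example.com": ("192.0.2.10", 3600)}   # hypothetical record, TTL 3600s
cache = TtlCache(clock=lambda: now[0])
fetch = lambda name: authoritative[name]

first = cache.lookup("api.example.com", fetch)              # cached at t=0
authoritative["api.example.com"] = ("192.0.2.20", 3600)     # the IP changes at the origin

now[0] = 1800
stale = cache.lookup("api.example.com", fetch)              # old address, TTL not yet expired
now[0] = 3600
fresh = cache.lookup("api.example.com", fetch)              # TTL expired, new address fetched
print(first, stale, fresh)
```

With a 3600-second TTL, a client that cached the record just before the change keeps the old address for up to an hour, which is the reason for lowering the TTL ahead of a migration.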
The A record for api.example.com is changed to a new IP. TTL = 3600 seconds. How long until all clients see the new IP?
TLS Handshake
**TLS (Transport Layer Security)** is a cryptographic protocol running over TCP that provides three guarantees: **confidentiality** (encryption), **integrity** (tamper detection), and **authentication** (server identity verification). HTTPS is HTTP over TLS. TLS 1.3 (RFC 8446, 2018) is the current standard - it completes a full handshake in 1 round-trip, versus 2 in TLS 1.2.
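The cost of the extra round-trip is easy to estimate with back-of-the-envelope arithmetic. The function below is a rough model under stated assumptions: one round-trip for the TCP handshake, one for the HTTP request and response, no DNS lookup, and no server processing time. The function name and the 100 ms RTT are illustrative.

```python
def time_to_first_byte_ms(rtt_ms, tls_handshake_rtts):
    """Rough latency model: TCP handshake + TLS handshake + one HTTP exchange."""
    tcp_rtts = 1    # SYN / SYN-ACK before the client can send any data
    http_rtts = 1   # request out, first response byte back
    return rtt_ms * (tcp_rtts + tls_handshake_rtts + http_rtts)


tls12 = time_to_first_byte_ms(rtt_ms=100, tls_handshake_rtts=2)
tls13 = time_to_first_byte_ms(rtt_ms=100, tls_handshake_rtts=1)
print(tls12, tls13)  # 400 300
```

On a 100 ms link the model gives 400 ms for TLS 1.2 and 300 ms for TLS 1.3, so the saved round-trip is a quarter of the connection setup time in this simplified picture.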
**Cipher suites in TLS 1.3**: the list was trimmed from 37 options in TLS 1.2 to just 5 in TLS 1.3, all using AEAD (Authenticated Encryption with Associated Data). TLS 1.3 also removed RSA key exchange, so every handshake uses an ephemeral Diffie-Hellman exchange and forward secrecy is mandatory - a server private key compromised later cannot be used to decrypt previously recorded sessions.
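A client can refuse anything older than TLS 1.3 with Python's standard `ssl` module. This is a minimal sketch, assuming Python 3.7+ linked against an OpenSSL build with TLS 1.3 support. The helper name `tls13_only_client_context` is hypothetical.

```python
import ssl


def tls13_only_client_context():
    """Client-side context that verifies the server and rejects TLS 1.2 and older."""
    ctx = ssl.create_default_context()             # certificate and hostname checks are on by default
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3   # the handshake fails if the server offers less
    return ctx


ctx = tls13_only_client_context()
print(ctx.minimum_version, ctx.verify_mode, ctx.check_hostname)
```

Wrapping a connected socket with `ctx.wrap_socket(sock, server_hostname=...)` performs the handshake under these constraints. Pinning the minimum version is a deliberate trade-off: it excludes the legacy cipher suites, at the cost of failing against servers that only support TLS 1.2.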
TLS 1.3 completes the handshake in 1-RTT, TLS 1.2 in 2-RTT. What made TLS 1.3 faster?
Certificates and PKI
A **TLS certificate** is a signed data structure binding a public key to a domain name. The signature comes from a **Certificate Authority (CA)** - an organization that browsers and OSes trust by default. The chain of trust: Root CA signs Intermediate CA, Intermediate CA signs leaf certificate. Browsers ship with ~150 trusted root CAs pre-installed.
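The chain of trust can be modelled as a walk from the leaf towards a trusted root. The sketch below is deliberately simplified: it checks only issuer-to-subject linkage, expiry, and membership in a trust store, and it omits the signature verification a real validator performs. The `Cert` type and all names and dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    not_after: datetime


def chain_is_trusted(chain, trusted_roots, now):
    """chain is ordered leaf first; its last entry must be issued by a trusted root.

    Simplified model: real validation also verifies each signature, name
    constraints, key usage, and revocation status.
    """
    if not chain:
        return False
    # Each certificate must be issued by the next one in the chain.
    for cert, parent in zip(chain, chain[1:]):
        if cert.issuer != parent.subject:
            return False
    # No certificate in the chain may be expired.
    if any(cert.not_after <= now for cert in chain):
        return False
    # The top of the presented chain must link to a root the client already trusts.
    return chain[-1].issuer in trusted_roots


utc = timezone.utc
now = datetime(2025, 1, 1, tzinfo=utc)
roots = {"Example Root CA"}
intermediate = Cert("Example Intermediate CA", "Example Root CA", datetime(2030, 1, 1, tzinfo=utc))
leaf = Cert("api.example.com", "Example Intermediate CA", datetime(2025, 3, 1, tzinfo=utc))
expired_leaf = Cert("api.example.com", "Example Intermediate CA", datetime(2024, 12, 1, tzinfo=utc))

print(chain_is_trusted([leaf, intermediate], roots, now))          # True
print(chain_is_trusted([expired_leaf, intermediate], roots, now))  # False
```

The root itself is not sent by the server. It is already in the client's trust store, which is why the check ends with a lookup rather than another link.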
**Certificate Revocation**: if a private key is compromised, the cert must be revoked before expiry. Two mechanisms: CRL (Certificate Revocation List - a CA-signed file of revoked serial numbers) and OCSP (Online Certificate Status Protocol - a real-time status query to the CA). OCSP Stapling lets the server fetch and cache the signed OCSP response and send it during the TLS handshake, so clients need no extra round-trip to the CA.
A Let's Encrypt certificate has expired. What happens when a browser makes an HTTPS request to the site?
mTLS: mutual authentication
Standard TLS authenticates only the **server** - the client verifies the server's certificate but presents none itself. **mTLS (mutual TLS)** adds client authentication: the server also verifies a certificate presented by the client. This enables service-to-service trust in microservice architectures where both sides must prove identity.
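On the server side, the difference between standard TLS and mTLS comes down to whether a client certificate is required. The following is a minimal sketch with Python's standard `ssl` module. The helper name and its file-path parameters are hypothetical, and the paths are left optional so that the snippet runs without real certificate files.

```python
import ssl


def make_mtls_server_context(cert_file=None, key_file=None, client_ca_file=None):
    """Server context that rejects any client unable to present a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.verify_mode = ssl.CERT_REQUIRED        # this line turns TLS into mutual TLS
    if cert_file:
        # The server's own identity, as in standard TLS.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if client_ca_file:
        # The CA whose signatures on client certificates this server accepts.
        ctx.load_verify_locations(cafile=client_ca_file)
    return ctx


standard = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)   # default: client certificate not requested
mutual = make_mtls_server_context()
print(standard.verify_mode, mutual.verify_mode)
```

The client side mirrors this: in addition to verifying the server, it calls `load_cert_chain` on its own context so that it has a certificate to present when the server asks for one.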
**Service meshes automate mTLS**: Istio, Linkerd, and Consul Connect implement mTLS transparently - the proxies deployed alongside each service obtain certificates from the mesh's internal CA and rotate them without any changes to application code. The internal CA (for example SPIRE, an implementation of the SPIFFE specification) issues short-lived certificates (hours, not years) to minimize the blast radius of a compromised key.
Payment-Service calls Fraud-Detection-Service over HTTPS. Is standard TLS sufficient for secure service-to-service communication?
Key ideas
- **DNS resolution**: recursive resolver walks root → TLD → authoritative; caches by TTL. Lower TTL before any planned IP change.
- **TLS 1.3 handshake**: 1-RTT thanks to key_share in ClientHello; AEAD-only cipher suites; forward secrecy is mandatory.
- **Certificate chains**: leaf signed by intermediate signed by root; OCSP stapling avoids extra round-trips for revocation checks.
- **mTLS**: both sides present a certificate - the canonical service-to-service authentication in zero-trust networks.
Questions for reflection
- DNS over HTTPS hides queries from network operators but routes everything through one resolver. What centralization risks does this introduce, and how do techniques like Oblivious DNS-over-HTTPS address them?
- TLS 1.3 0-RTT enables faster repeat connections but is vulnerable to replay attacks. Which kinds of HTTP requests are safe to send as 0-RTT data, and which must wait for the full handshake?
- Let's Encrypt issues 90-day certificates because automation forces good hygiene. What operational changes does a team need to make when moving from 1-year to 90-day certificate lifetimes?
Related lessons
- bt-01-overview — Transport stack overview is required before DNS and TLS
- bt-03-serialization — Serialization formats are used across network protocols
- web-04 — Fetch API relies on DNS resolution and TLS for every HTTPS request
- devops-04 — Container networking uses DNS for service discovery between containers
- se-04 — PKI certificate chains mirror Dependency Inversion - depend on abstractions (CAs), not concrete servers
- cloud-04 — EC2 instances register DNS records so services can discover them by name
- sec-01
- web-01
- devops-05
- ds-04-consistent-hashing
- net-23-https-tls