Computer Networks
HTTP: The Language of the Web
When you open google.com, the browser sends a text message 'GET / HTTP/1.1' and receives the page's HTML. This simple text protocol is the foundation of the entire web, from search engines to banking.
- **API development:** understanding methods and status codes is the foundation of REST API design
- **Debugging:** curl and DevTools show raw HTTP requests
- **Performance:** correct caching headers let browsers and proxies reuse responses, which speeds up page loads
Prerequisites
The HTTP Protocol
**HTTP** (HyperText Transfer Protocol) - an application-layer protocol, text-based in versions up to HTTP/1.1. It runs on top of TCP (port 80 by default) or, for HTTPS, on top of TLS over TCP (port 443 by default). The client sends a request - the server returns a response. Each request is independent - the protocol itself keeps no state between requests.
**Stateless** means: the server does not remember previous requests. Each request contains everything it needs (cookies, tokens). This simplifies scaling - any server can handle any request.
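A minimal sketch of what statelessness means in practice. The `TOKENS` store, the `handle` function, and the server names are hypothetical, introduced only for illustration: because the request carries its own token, either "server" can answer it without remembering anything about earlier requests.

```python
# Hypothetical shared token store; in a real system this might be a database or cache.
TOKENS = {"token-abc": "alice"}


def handle(server_name, headers):
    """Answer one request using only what the request itself carries."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    user = TOKENS.get(token) if scheme == "Bearer" else None
    if user is None:
        return f"{server_name}: 401 Unauthorized"
    return f"{server_name}: 200 OK, hello {user}"


# The same request can be sent to any server instance.
request_headers = {"Authorization": "Bearer token-abc"}
print(handle("server-1", request_headers))
print(handle("server-2", request_headers))
```

Neither call depends on the other, which is why a load balancer is free to route each request to a different machine.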
**HTTP versions:** HTTP/0.9 (1991) - GET only. HTTP/1.0 (1996) - headers, methods. HTTP/1.1 (1997) - keep-alive, chunked. HTTP/2 (2015) - binary, streams. HTTP/3 (2022) - QUIC instead of TCP.
Why is HTTP called a stateless protocol?
Request and Response
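An **HTTP request** consists of: a start line (method + path + version), headers, and an optional body. An **HTTP response** consists of: a status line (version + status code + reason phrase), headers, and an optional body. In both cases, the headers and the body are separated by a blank line (in HTTP/1.x each line ends with CRLF, so the separator is an empty CRLF line).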
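The structure described above can be sketched as a small parser for an HTTP/1.1 request. This is a simplified illustration, not a complete parser: the `parse_request` function, the `/orders` path, and the example host are assumptions made for the example.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP/1.1 request into start line parts, headers, and body."""
    # The first blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Start line: method, path, version.
    method, path, version = lines[0].split(" ", 2)

    # Header lines: "Name: value". Header names are case-insensitive.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()

    return method, path, version, headers, body


raw = (
    b"POST /orders HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Type: application/json\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b'{"item": "x"}'
)

method, path, version, headers, body = parse_request(raw)
print(method, path, version)
print(headers)
print(body)
```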
**Content-Length** - body size in bytes. Alternative: `Transfer-Encoding: chunked` - body arrives in parts, size is not known in advance. Used for streaming and dynamic content.
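To make the chunked format concrete, here is a minimal sketch of encoding and decoding a chunked body in HTTP/1.1. Each chunk is its size in hexadecimal, CRLF, the data, CRLF, and a zero-size chunk ends the body. The function names are hypothetical, and the sketch ignores trailers and malformed input.

```python
def encode_chunked(parts):
    """Encode an iterable of byte strings as a chunked body."""
    out = b""
    for part in parts:
        if part:  # a zero-length chunk would end the body early, so skip empties
            out += f"{len(part):x}".encode("ascii") + b"\r\n" + part + b"\r\n"
    return out + b"0\r\n\r\n"  # the terminating zero-size chunk


def decode_chunked(data):
    """Decode a chunked body back into the original bytes."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # The size line may carry extensions after ';', which are ignored here.
        size = int(data[pos:line_end].split(b";")[0].decode("ascii"), 16)
        if size == 0:
            return body
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the chunk data and its trailing CRLF


encoded = encode_chunked([b"Hello, ", b"world"])
print(encoded)
print(decode_chunked(encoded))
```

The sender never needs to know the total size up front: it only needs the size of the chunk it is about to write.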
What separates headers and body in an HTTP message?
HTTP Methods
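A **method** specifies the action on a resource. GET - retrieve, POST - create, PUT - replace, PATCH - partially modify, DELETE - remove. Two properties describe methods: **safe** methods (GET, HEAD) don't change data on the server, and **idempotent** methods (GET, HEAD, PUT, DELETE) leave the server in the same state whether the request is sent once or many times. Every safe method is also idempotent, but not the other way around.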
**Why is DELETE idempotent?** Deleting a resource 1 time or 10 times - the result is the same: the resource is gone. But POST is not idempotent: each call creates a new resource (e.g., 10 orders instead of one).
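The difference can be seen in a small in-memory simulation. The `OrderStore` class is a hypothetical stand-in for a server, not a real API: repeating `post` keeps adding orders, while repeating `put` or `delete` leaves the store in the same state.

```python
class OrderStore:
    """Toy in-memory 'server' used to compare repeated requests."""

    def __init__(self):
        self.orders = {}
        self.next_id = 1

    def post(self, item):
        # POST: every call creates a new resource with a new id.
        order_id = self.next_id
        self.orders[order_id] = item
        self.next_id += 1
        return order_id

    def put(self, order_id, item):
        # PUT: replaces the resource; repeating it changes nothing further.
        self.orders[order_id] = item

    def delete(self, order_id):
        # DELETE: after the first call the resource is gone; repeats are no-ops.
        self.orders.pop(order_id, None)


store = OrderStore()
for _ in range(3):
    store.post("book")          # a retried POST creates duplicates
print(len(store.orders))        # three orders exist now

for _ in range(3):
    store.delete(1)             # a retried DELETE has the same end state
print(sorted(store.orders))     # only ids 2 and 3 remain

for _ in range(3):
    store.put(2, "pen")         # a retried PUT has the same end state
print(store.orders[2])
```

This is why clients and proxies can retry PUT and DELETE automatically after a timeout, while retrying POST needs extra care.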
Which method is NOT idempotent?
Status Codes
A **status code** is a three-digit number indicating the result of a request. The first digit determines the class: 1xx - informational, 2xx - success, 3xx - redirection, 4xx - client error, 5xx - server error.
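As a short illustration, the class of a code can be derived from its first digit. The `status_class` helper is hypothetical; `http.HTTPStatus` is part of the Python standard library and provides the standard reason phrases.

```python
from http import HTTPStatus

CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Return the class of a status code based on its first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return CLASSES[code // 100]


for code in (200, 304, 404, 503):
    print(code, HTTPStatus(code).phrase, "-", status_class(code))
```

A client that meets an unfamiliar code can still react sensibly by looking only at its class.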
**401 vs 403:** 401 Unauthorized means 'who are you?' - you need to log in. 403 Forbidden means 'I know you, but you can't' - access is denied even after you have authenticated. The name of 401 is historically inaccurate: the code is about missing or invalid authentication, not authorization.
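The decision between the two codes can be sketched as follows. The `check_access` function and the shape of the `user` dictionary are assumptions made for this example, not part of any framework.

```python
def check_access(user, required_role):
    """Return the status code a server might send for a protected resource."""
    if user is None:
        return 401  # not authenticated: the server does not know who is asking
    if required_role not in user["roles"]:
        return 403  # authenticated, but not allowed to do this
    return 200


alice = {"name": "alice", "roles": ["admin"]}
bob = {"name": "bob", "roles": ["viewer"]}

print(check_access(None, "admin"))   # anonymous request
print(check_access(bob, "admin"))    # logged in without the required role
print(check_access(alice, "admin"))  # logged in with the required role
```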
**Misconception:** HTTP requests always require a body
**In fact:** GET, HEAD, and DELETE usually have no body; POST, PUT, and PATCH usually carry one
The method defines the semantics. GET requests data - it has nothing to send. POST creates - it needs to pass data. Technically the message syntax allows a body on any request, but a body on a GET request has no defined meaning, and some servers and proxies ignore or reject it.
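As an illustration, Python's standard `urllib.request.Request` follows the same convention: without data the request defaults to GET, and with data it defaults to POST. The URL below is a placeholder, and nothing is sent over the network because the requests are only constructed.

```python
import urllib.request

# No body supplied: the request defaults to GET.
get_req = urllib.request.Request("http://example.com/orders")

# Body supplied: the request defaults to POST.
post_req = urllib.request.Request(
    "http://example.com/orders",
    data=b'{"item": "book"}',
    headers={"Content-Type": "application/json"},
)

# The method can also be set explicitly, here a DELETE without a body.
delete_req = urllib.request.Request("http://example.com/orders/1", method="DELETE")

for req in (get_req, post_req, delete_req):
    print(req.get_method(), req.data)
```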
The client received code 304. What does this mean?
Key Ideas
- **HTTP = stateless request-response protocol**; plain text over TCP in HTTP/1.x, binary in HTTP/2, over QUIC in HTTP/3
- **Request:** method + path + headers + body; **Response:** status + headers + body
- **Methods:** GET/POST/PUT/DELETE - actions; idempotency matters for retries
- **Status codes:** 2xx success, 3xx redirect, 4xx client error, 5xx server error
Related Topics
HTTP is the foundation of web development:
- HTTP headers — More on Content-Type, Cache-Control, Cookies
- HTTPS and TLS — Encrypting HTTP traffic
- TCP — The transport that HTTP runs on
Questions for Reflection
- Why do REST APIs use different methods instead of just POST?
- How does statelessness affect the scaling of web applications?
- Why is it important to return correct status codes in an API?