DevOps
Docker Compose and multi-container applications
A typical ML stack: FastAPI + Redis + PostgreSQL + Celery + Flower. Started by hand, that means 5 terminals, 5 commands, a strict startup order, and flags that half the team cannot remember. With Compose it is one file and one command: `docker compose up` brings the entire stack up in about 30 seconds once the images are available, and `docker compose down` stops it. This is not a deployment tool - it is a tool that makes development environments reproducible.
- **ML stack**: FastAPI + Celery + Redis + PostgreSQL up in 30 seconds for any team member without the 'how do I install this' conversation
- **CI/CD**: GitHub Actions runs tests against a real PostgreSQL through Compose - not a mock, not SQLite
- **Team of 5**: identical environment for everyone without 'works on my machine'. README has one command: `docker compose up`
- **LLM inference**: vLLM + Redis (rate limiting) + Nginx with a single command - reproducible on any server with Docker
Compose file
**2014.** Docker acquired Orchard, the company behind Fig, a tool for running multi-container applications from a single YAML file, and renamed Fig to Compose. One file replaced pages of startup documentation. `docker-compose.yml` describes services, networks and volumes declaratively: not _how to run_ but _what should be running_. Compose reads the file and works out start order, networking and mounts itself.
**Secrets in .env**: a `.env` file next to `docker-compose.yml` is picked up automatically. Variables are interpolated via `${VAR_NAME}` syntax in the compose file. Add `.env` to `.gitignore` - keys and passwords stay out of the repository.
**Bind mount vs Named volume**: `./models:/models` is a bind mount - a host directory mapped into the container (good for development and hot reload). `redis_data:/data` is a named volume - Docker manages its storage location and lifecycle (suited to persistent data, including in production). Both types can coexist in the same compose file.
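The pieces described above can be sketched in one small compose file. This is an illustrative example, not a reference configuration: the `build: .` context, the `redis:7` image tag, and the `API_KEY` variable are assumptions, and `API_KEY` is expected to be defined in the `.env` file next to the compose file.

```yaml
# docker-compose.yml - illustrative sketch; names and image tags are assumptions
services:
  api:
    build: .                      # assumes a Dockerfile in the project root
    ports:
      - "8000:8000"               # host port : container port
    environment:
      API_KEY: ${API_KEY}         # interpolated from .env (hypothetical variable)
    volumes:
      - ./models:/models          # bind mount: host directory mapped into the container

  redis:
    image: redis:7
    volumes:
      - redis_data:/data          # named volume: storage managed by Docker

volumes:
  redis_data:                     # named volumes are declared at the top level
```

Running `docker compose config` prints the file with all `${...}` variables resolved, which is a quick way to check that `.env` was picked up.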
A compose file defines service `api` with `ports: ["8000:8000"]` and service `redis` without `ports`. What does this mean?
Networking in Compose
Compose automatically creates a bridge network for the project and connects all services to it. Inside this network **built-in DNS** resolves service names to container IPs: a `backend` container reaches the database simply as `postgres`, the cache as `redis`. The service name in the compose file is the hostname. No IP addresses, no `--link`, no manual `/etc/hosts`. This is why `localhost` inside a container is the wrong address for another service.
**localhost trap**: inside a container, `localhost` or `127.0.0.1` points to the loopback of that container itself - not the host machine and not another service. A common mistake when copying `.env` from local development into Docker: `DATABASE_URL=localhost:5432` stops working. Rule: inside Docker use the service name; outside Docker use localhost.
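A minimal sketch of the rule, assuming a database service named `db`, a cache service named `redis`, and hypothetical credentials (`app` user and database, `POSTGRES_PASSWORD` taken from `.env`). The host part of each URL is the service name, not `localhost`:

```yaml
services:
  api:
    build: .                      # assumes a Dockerfile in the project root
    environment:
      # "db" and "redis" are service names, resolved by the built-in DNS
      DATABASE_URL: "postgresql://app:${POSTGRES_PASSWORD}@db:5432/app"
      REDIS_URL: "redis://redis:6379/0"

  db:
    image: postgres:16
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_DB: app

  redis:
    image: redis:7
```

The same application started directly on the host would use `localhost:5432`, and only if the `db` service publishes that port.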
Service `api` in Compose tries to connect to PostgreSQL via `localhost:5432`. PostgreSQL runs as service `db` in the same Compose file. What happens?
depends_on and ordering
**depends_on does not wait for a service to be ready.** By default it waits only for the dependency's container to start - these are fundamentally different events. A PostgreSQL process may start within a fraction of a second but typically accepts connections only 2-3 seconds later, after initialization and any WAL recovery. A service with `depends_on: [db]` starts as soon as the db container reaches Running state and may not yet be able to connect to the database. Without a healthcheck, retry logic must live in the application itself.
**condition: service_completed_successfully** - a special variant for one-shot services (migrations, seed data). Waits for exit code 0. If the migration exits with an error, dependent services will not start.
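The conditions are written in the long form of `depends_on`. The sketch below assumes a one-shot `migrator` service that runs Alembic migrations; the command, the `build: .` context, and the image tag are illustrative choices, not part of the original stack description.

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    healthcheck:                  # required for condition: service_healthy below
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      retries: 5

  migrator:
    build: .
    command: ["alembic", "upgrade", "head"]   # hypothetical one-shot migration
    depends_on:
      db:
        condition: service_healthy            # wait until db accepts connections

  api:
    build: .
    depends_on:
      db:
        condition: service_healthy
      migrator:
        condition: service_completed_successfully   # wait for exit code 0
```

With the short form `depends_on: [db, migrator]`, `api` would start as soon as both containers had started, regardless of whether the migration had finished.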
Service `api` has `depends_on: [db]` with no condition. PostgreSQL takes 3 seconds to fully initialize. What happens?
Health checks
**HEALTHCHECK** is a directive that periodically probes whether a service is working or stuck. Docker runs the test command inside the container and checks the exit code: 0 means healthy, 1 means unhealthy, 2 is reserved. The status drives `depends_on` with `condition: service_healthy`, and Docker Swarm replaces tasks whose containers become unhealthy. A plain Docker restart policy reacts only to container exit, not to health status, and Kubernetes ignores Docker's HEALTHCHECK in favor of its own liveness and readiness probes. Checks are written for the specific service: PostgreSQL uses `pg_isready`, Redis uses `redis-cli ping`, HTTP APIs use `curl`.
**start_period**: time after container startup during which failed checks do not count - the container stays in starting state instead of transitioning to unhealthy. Critical for services with slow initialization: loading ML models, JVM warmup, first migration run.
**Compose vs Dockerfile HEALTHCHECK**: healthcheck can be defined in the Dockerfile (`HEALTHCHECK CMD ...`) or in the compose file. The compose version overrides the Dockerfile version. For development it is convenient to override with less strict parameters (interval: 2s instead of 30s).
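As a sketch of what service-specific checks can look like in a compose file: the intervals below are development-friendly values chosen for illustration, and the `api` check assumes that `curl` is installed in the image and that the application serves a hypothetical `GET /health` endpoint on port 8000.

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 2s
      timeout: 3s
      retries: 10

  redis:
    image: redis:7
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 2s
      timeout: 3s
      retries: 10

  api:
    build: .
    healthcheck:
      # runs inside the api container, so localhost is correct here
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 60s           # failures during model loading do not count
```

`docker compose ps` shows the current status of each service (starting, healthy, unhealthy) next to its state.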
**Misconception**: depends_on + healthcheck guarantees services start in strict order: db, then migrator, then api.
**Reality**: depends_on + healthcheck only guarantees that a dependent service does not start before its dependency reaches healthy status. Independent services start in parallel, and their relative order is undefined.
Compose builds a dependency graph and starts independent branches in parallel for speed. If worker-a and worker-b both depend on db, they both wait for db healthy, then start simultaneously. For strict sequential ordering an explicit chain is required: a depends_on b depends_on c.
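A sketch of the two-worker case, with illustrative build contexts and image tag. As written, both workers wait for `db` and then start in an undefined relative order; the commented lines show one way to add an explicit link in the chain.

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 2s
      retries: 10

  worker-a:
    build: .
    depends_on:
      db:
        condition: service_healthy

  worker-b:
    build: .
    depends_on:
      db:
        condition: service_healthy
      # Uncomment to make worker-b wait until worker-a's container has started:
      # worker-a:
      #   condition: service_started
```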
Two independent services - `worker-a` and `worker-b` - both have `depends_on: db: condition: service_healthy`. What does this configuration guarantee?
Key ideas
- **Compose file** - declarative stack description: services/networks/volumes in one YAML. `.env` for secrets, bind mounts for dev, named volumes for production data
- **Docker DNS**: services address each other by service name (`redis:6379`, not `localhost:6379`). localhost inside a container is the container's own loopback
- **depends_on** without condition - only controls container start order, not readiness. For readiness - use healthcheck + `condition: service_healthy`
- **HEALTHCHECK** - periodic probe via exit code: pg_isready, redis-cli ping, curl. start_period prevents false unhealthy during slow initialization
Related topics
Docker Compose is the bridge between local development and production infrastructure:
- Docker basics - Compose operates on Docker primitives: images, volumes, networks
- Kubernetes - production-grade orchestration that shares Compose's declarative approach to services, networking and storage
- Serverless / Lambda - alternative packaging model without long-running containers
- Database replication - classic multi-container scenario: primary + replica in one Compose file
- LLM API integration - vLLM + Redis + backend, a typical ML Compose stack with rate limiting
Questions for reflection
- A startup team uses docker compose up for local development. The tech lead proposes using the same Compose file to deploy to a production server. What specific problems does this approach create?
- Service `trainer` must start only after both `db` and `migrator` have completed successfully. How would this dependency be described in a Compose file, and why is a simple `depends_on: [db, migrator]` not sufficient?
- The team reports that `docker compose up` produces different behavior for different developers - some get an initialized database, others do not. How would the initialization ordering problem be solved reproducibly?