Terraform: Infrastructure as Code Basics

In 2021, a major provider lost its entire production cluster after a junior engineer ran an erroneous `kubectl delete`. Recovery took 8 hours because nobody knew the exact state of the infrastructure before the incident. With the same infrastructure described in Terraform, a single `terraform apply` could have recreated an identical environment in about 15 minutes.

  • **Spotify** manages thousands of microservices through Terraform - each new service gets a standard resource set via internal modules: RDS, S3, IAM roles, monitoring.
  • **GitHub** uses Terraform for AWS infrastructure management: all changes through PRs, automatic plan in CI, apply only after approval - GitOps principles applied to IaC.
  • **Terraform Cloud** lets teams run plan/apply in a managed environment with centralized state and audit log - every infrastructure change is tracked.

Providers: the Bridge Between Terraform and APIs

In 2014 HashiCorp released Terraform with one idea: infrastructure is code that can be versioned, reviewed, and reproduced. A provider is a plugin that knows the API of a specific cloud or service and translates HCL declarations into API calls. Today the Terraform registry contains 3000+ providers.

**Providers:** come in three tiers: official (AWS, Azure, GCP, Kubernetes - maintained by HashiCorp), partner (Datadog, PagerDuty, Cloudflare - maintained by the vendor), and community. Constraining provider versions is critical: `~> 5.0` means >= 5.0, < 6.0. **Terraform Registry:** registry.terraform.io is the central registry. `terraform init` downloads providers into `.terraform/providers/`. **Provider lock file:** `.terraform.lock.hcl` records the exact provider versions selected and their checksums - it must be committed to Git so that every engineer and every CI run installs the same provider builds.
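The version constraint described above can be sketched as follows. This is a minimal illustration, not a recommended baseline: the `required_version` value and the region are assumptions chosen for the example.

```hcl
terraform {
  # Hypothetical minimum CLI version, chosen only for illustration.
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      # Pessimistic constraint: accepts any 5.x release, rejects 6.0 and later.
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  # Example region (an assumption).
  region = "eu-west-1"
}
```

After `terraform init`, the provider version that was actually selected within this range is written to `.terraform.lock.hcl` together with its checksums.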

Why should `.terraform.lock.hcl` be committed to Git?

Resources and Data Sources: Building Blocks

A resource in Terraform is a managed infrastructure object: an EC2 instance, S3 bucket, or DNS record. A data source is an object not managed by Terraform but whose data is needed for configuration (an existing VPC, AMI ID, certificate). Dependencies between resources are computed automatically - Terraform builds a graph and applies in parallel where possible.

**Implicit dependencies:** `aws_instance.web` references `aws_security_group.web.id` - Terraform automatically creates the SG before the instance. **Explicit dependencies:** `depends_on = [aws_s3_bucket.data]` - when the dependency is not visible from attributes. **Data sources:** `data.aws_ami.ubuntu.id` - finds the latest Ubuntu AMI without hardcoding IDs. **Lifecycle:** `create_before_destroy = true` - create a new resource before destroying the old one (zero-downtime replacement). `prevent_destroy = true` - protect prod resources from accidental deletion.
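The four ideas above can be put together in one small configuration. This is a sketch under assumptions: the resource names match the ones used in the text, while the bucket name, the instance type, and the AMI name filter are illustrative values.

```hcl
# Data source: look up the most recent Ubuntu 22.04 AMI published by Canonical.
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical's AWS account ID

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

resource "aws_security_group" "web" {
  name        = "web-sg"
  description = "Allow inbound HTTPS"

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_s3_bucket" "data" {
  bucket = "example-data-bucket" # hypothetical name; bucket names are globally unique

  lifecycle {
    # Any plan that would destroy this bucket fails with an error.
    prevent_destroy = true
  }
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"

  # Implicit dependency: referencing the attribute orders the SG before the instance.
  vpc_security_group_ids = [aws_security_group.web.id]

  # Explicit dependency: no attribute of the bucket is referenced here,
  # so the ordering has to be stated by hand.
  depends_on = [aws_s3_bucket.data]

  lifecycle {
    # On replacement, create the new instance first, then destroy the old one.
    create_before_destroy = true
  }
}
```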

An EC2 instance needs to be created in an existing VPC that is not managed by Terraform. How do you get the VPC ID?
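One way to approach this is a data source lookup. The sketch below assumes the existing VPC carries a `Name` tag of `shared-vpc` (a hypothetical value) and that the AMI ID is passed in through a variable.

```hcl
# AMI ID supplied by the caller (hypothetical variable, used to keep the sketch self-contained).
variable "ami_id" {
  type        = string
  description = "AMI to launch"
}

# Read-only lookup of a VPC that Terraform does not manage.
data "aws_vpc" "existing" {
  tags = {
    Name = "shared-vpc" # hypothetical tag value
  }
}

# All subnets that belong to that VPC.
data "aws_subnets" "existing" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.existing.id]
  }
}

resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  # Place the instance in the first subnet found; a real setup would pick one deliberately.
  subnet_id = data.aws_subnets.existing.ids[0]
}
```

Because the VPC is read through a data source, Terraform never tries to create, change, or destroy it.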

State: How Terraform Remembers What It Created

Terraform state, by default the `terraform.tfstate` file, stores the mapping between HCL resources and real cloud objects. It is not just a cache: without state, Terraform does not know what already exists and will try to create everything again, producing duplicates or name conflicts. State storage is one of the most important architectural decisions in any Terraform project.

**Problems with local state:** no team collaboration (each person has their own state), no locking (race condition on parallel apply), no backups. **Remote state backend:** S3 + DynamoDB (AWS), Azure Blob Storage, GCS, Terraform Cloud. **State locking:** a DynamoDB table with a `LockID` attribute locks state during apply - prevents concurrent changes. **Sensitive data in state:** passwords and secrets are stored in state in plain text - bucket encryption is mandatory. **Commands:** `terraform state list`, `terraform state show aws_instance.web`, `terraform state mv` (rename), `terraform state rm` (remove from state without deleting the resource).
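A remote backend of the S3 + DynamoDB kind described above might look like the sketch below. The bucket name, state key, table name, and region are hypothetical. The lock table is shown as a resource for completeness, though it would normally live in a separate bootstrap configuration, since the backend must exist before `terraform init` can use it.

```hcl
terraform {
  backend "s3" {
    bucket         = "example-terraform-state"        # hypothetical bucket
    key            = "prod/network/terraform.tfstate" # path of this state inside the bucket
    region         = "eu-west-1"
    encrypt        = true                             # server-side encryption of the state object
    dynamodb_table = "terraform-locks"                # hypothetical lock table
  }
}

# Lock table definition (usually created once, in a separate configuration).
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID" # the S3 backend expects exactly this key name

  attribute {
    name = "LockID"
    type = "S"
  }
}
```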

Two engineers simultaneously run `terraform apply` against the same state. What happens with DynamoDB locking configured?

Modules: Reuse and Abstraction

A Terraform module is a package of HCL files with input variables, resources, and output values. Like a function in programming: it hides implementation details, accepts parameters, and returns data. The Terraform registry contains thousands of ready-made modules: `terraform-aws-modules/vpc/aws`, `terraform-aws-modules/eks/aws`.

**Module structure:** `variables.tf` (input parameters), `main.tf` (resources), `outputs.tf` (output values), `versions.tf` (provider requirements). **Module sources:** local path `./modules/vpc`, Git `github.com/org/terraform-modules//vpc`, Terraform Registry `registry.terraform.io/modules/hashicorp/...`. **Best practices:** a module should not define a backend - state is managed in the root module. Input variables should have validation blocks. Outputs should expose everything a caller might need.
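The module layout above can be illustrated with a deliberately small local module. File boundaries are marked with comments; the module name, the variable name, and the CIDR value are assumptions made for the example.

```hcl
# --- modules/vpc/variables.tf ---
variable "cidr_block" {
  type        = string
  description = "CIDR range for the VPC"

  validation {
    # cidrhost() fails on malformed input; can() turns that failure into false.
    condition     = can(cidrhost(var.cidr_block, 0))
    error_message = "cidr_block must be a valid CIDR range, for example 10.0.0.0/16."
  }
}

# --- modules/vpc/main.tf ---
resource "aws_vpc" "this" {
  cidr_block = var.cidr_block
}

# --- modules/vpc/outputs.tf ---
output "vpc_id" {
  description = "ID of the created VPC"
  value       = aws_vpc.this.id
}

# --- main.tf (root module) ---
module "vpc" {
  source     = "./modules/vpc"
  cidr_block = "10.0.0.0/16"
}

# The root module reads the module's result through its output.
output "vpc_id" {
  value = module.vpc.vpc_id
}
```

Note that the module contains no `backend` block: the root module decides where state is stored.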

Myth: if `terraform plan` looks correct, `terraform apply` is always safe

The plan is built from state and configuration, but the real cloud state may differ from state. Resources may have been deleted manually or changed outside of Terraform.

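Before a critical apply, run a fresh `terraform plan` right before applying: plan refreshes state against the real infrastructure by default, and `terraform plan -refresh-only` shows drift on its own. Saving the plan with `-out` and applying that exact file ensures that what was reviewed is what gets executed. In prod, use `prevent_destroy` and mandatory plan review in CI before any apply.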

You are using the `terraform-aws-modules/vpc/aws` module from the registry. How do you pin the module version?
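For registry modules, the `version` argument does the pinning. In the sketch below the version number is only an example (check the registry for current releases), and the `name` and `cidr` values are illustrative.

```hcl
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.8.1" # example version; an exact pin rather than an open range

  name = "example-vpc" # illustrative values
  cidr = "10.0.0.0/16"
}
```

The `version` argument applies only to registry sources. For a Git source, the revision is pinned in the URL instead, for example by appending `?ref=v1.2.0` (a hypothetical tag) to `github.com/org/terraform-modules//vpc`.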

Key Ideas

  • **Providers** translate HCL into cloud service API calls; `.terraform.lock.hcl` pins versions - must be committed to Git.
  • **State** maps HCL resources to real objects. A remote backend (S3 + DynamoDB) is required for teams: history + locking + encryption.
  • **Modules** are reusable packages with inputs/outputs; pin versions explicitly, never use latest. Plan/apply cycle always: review the plan, get approval, apply.

Related Topics

Terraform is the foundation for GitOps and cluster management:

  • GitOps: ArgoCD and Flux - Terraform creates the cluster; GitOps manages deployments into it. The two practices are complementary.
  • CI/CD Pipelines - `terraform plan` and `terraform apply` run from CI/CD pipelines with automated checks and approval

Questions for Reflection

  • How should Terraform workspaces or separate state files be organized for dev/staging/prod?
  • What should be done when state contains secrets (e.g., an RDS password) - how to handle this without hardcoding?
  • How are Terraform modules tested - what tools (terratest, terraform test) and when should they be used?

Related Lessons

  • os-12-virtualization
  • net-50-cloud-networking