PostgreSQL

Installation and Configuration

Цели урока

  • Install PostgreSQL on Ubuntu, macOS, or RHEL and understand what initdb does
  • Configure postgresql.conf using SHOW/SET/ALTER SYSTEM
  • Set up pg_hba.conf for secure access control
  • Tune five key performance parameters for specific hardware
  • Run production-ready PostgreSQL in Docker with volumes and healthchecks

Предварительные знания

  • PostgreSQL: History and Architecture

A developer installs PostgreSQL, starts the application, and everything works. A month later, traffic grows - queries slow down, connections time out, disk I/O hits 100%. The problem is not PostgreSQL itself; it is that shared_buffers at 128 MB on a server with 32 GB of RAM is a lawnmower engine in a Ferrari.

  • **A startup on Heroku** experiences a 10x traffic spike overnight - memory parameters must be reconfigured in minutes without downtime
  • **A DevOps engineer** provisions PostgreSQL in Kubernetes - docker-compose, volumes, health checks, custom configuration
  • **A developer** receives `FATAL: no pg_hba.conf entry` when connecting from a new IP - understanding pg_hba.conf is the only path forward
  • **A team** migrates to a new server - forgot to transfer pg_hba.conf, now nobody can connect
  • **A production incident** - work_mem = 512 MB with 200 concurrent complex queries consumed all RAM and the server went down

Michael Stonebraker and the POSTGRES Project

Stonebraker launched POSTGRES at UC Berkeley in 1986 as a successor to Ingres. The core idea was extensibility: user-defined types and rules that no other database offered. In 1996 the project went open-source and added SQL, renaming to PostgreSQL. Stonebraker received the Turing Award in 2014. Today PostgreSQL is the only database in the top five not owned by a corporation.

Installing PostgreSQL

Getting PostgreSQL up and running takes five minutes. A careless installation can cost hours of debugging later. This section covers installing PostgreSQL on any OS, what initdb actually does, and how to manage multiple versions side by side.

**initdb** is the command that creates a database cluster: the directory structure, system catalogs, and configuration files. On Ubuntu/Debian, initdb runs automatically during package installation. On RHEL/Fedora, it must be invoked manually.

**On Ubuntu, multiple clusters can coexist** on the same machine. The pg_lsclusters utility lists all installed clusters:

**pgenv** is a PostgreSQL version manager analogous to nvm for Node.js. Particularly convenient during development when testing against multiple versions: `pgenv install 17.2 && pgenv use 17.2`

What does the initdb command do?

postgresql.conf - the Main Configuration File

**postgresql.conf is the control center of PostgreSQL.** It hosts hundreds of parameters - from the address the server listens on, to buffer sizes and query planner behavior. The file resides in the cluster directory (`/var/lib/postgresql/17/main/postgresql.conf` on Ubuntu).

**There are three ways to change parameters** - and each has its own scope:

**ALTER SYSTEM writes to postgresql.auto.conf**, not to postgresql.conf. Parameters in auto.conf take precedence. If postgresql.conf is edited manually but the value does not apply, check whether it is being overridden in auto.conf.

**listen_addresses = 'localhost'** means PostgreSQL accepts connections only from the local machine. To allow external access, change it to `'*'` or a specific IP - but always configure pg_hba.conf accordingly to maintain security.

ALTER SYSTEM SET work_mem = '64MB' was executed. When does the change take effect?

pg_hba.conf - Who Can Connect

**pg_hba.conf (Host-Based Authentication)** is PostgreSQL's firewall. It determines who can connect, from where, and by what method. Even with a correct password, a connection is rejected if no matching rule exists in pg_hba.conf. This is why `FATAL: no pg_hba.conf entry for host` is one of the most common errors on a first deployment.

MethodDescriptionWhen to use
trustNo password - unconditional trustLocal development only. NEVER in production.
peerOS username must match PostgreSQL usernameLocal Unix sockets (sudo -u postgres psql)
md5Hashed password (legacy)Legacy compatibility only
scram-sha-256Modern authentication (since v10)Production - the de-facto standard
certClient SSL certificateMaximum security, no passwords
rejectReject the connectionExplicit denial for specific users or IP ranges

**Rule order is critical!** PostgreSQL evaluates records top to bottom and applies the FIRST matching rule. If `reject` appears after `trust` for the same address range, the reject rule will never be reached.

**A common first-deployment error:** `FATAL: no pg_hba.conf entry for host "192.168.1.5"`. The client's IP address does not match any rule in pg_hba.conf. Add a rule for that IP or subnet.

pg_hba.conf contains: `host all all 0.0.0.0/0 trust`. What does this mean?

Key Configuration Parameters

Of PostgreSQL's 300+ parameters, five account for 90% of performance. PostgreSQL ships with conservative defaults intentionally - they are sized for a machine with 512 MB of RAM. On a server with 32 GB, that means shared_buffers = 128 MB uses less than 1% of available memory. The result: the server reads from disk constantly instead of from cache.

ParameterDefaultRecommendationWhat it does
shared_buffers128 MB25% RAMSize of the main buffer cache. All data passes through it.
effective_cache_size4 GB50-75% RAMA hint to the planner about available OS page cache. Does not allocate memory!
work_mem4 MB32-256 MBMemory per sort/hash operation. Caution: multiplies across operations and sessions.
maintenance_work_mem64 MB512 MB - 2 GBMemory for VACUUM, CREATE INDEX, ALTER TABLE.
max_connections10050-200Maximum concurrent connections. Each connection uses ~5-10 MB of RAM.

**work_mem is the most deceptive parameter.** A single complex query can consume work_mem multiple times - once per sort or hash operation. With work_mem = 256 MB and a query containing four sort operations, that is 1 GB of RAM for a single query. Multiply by 50 concurrent queries and the total reaches 50 GB.

**effective_cache_size does not allocate memory!** It is a hint for the query planner: "when choosing between an Index Scan and a Sequential Scan, assume that ~12 GB of data may already be cached by the OS." Increasing this value makes the planner more willing to use indexes.

**pgtune** (https://pgtune.leopard.in.ua/) is an online PostgreSQL tuning calculator. Enter server specifications and workload type (OLTP/OLAP/Mixed) to receive optimized values for all key parameters.

A server has 32 GB RAM, SSD storage, and serves an OLTP workload. What is the optimal value for shared_buffers?

PostgreSQL in Docker

**Docker is the fastest way to run PostgreSQL for development.** One command, a working server in ten seconds. There are, however, a few traps worth knowing about before starting - the kind that silently destroy data.

**Without a volume, all data will be lost!** The container stores its data internally. Running `docker rm pg-dev` will permanently destroy every database, table, and row. Always mount a volume for the `/var/lib/postgresql/data` directory.

**shm_size** is an important Docker parameter. By default, Docker allocates only 64 MB of shared memory, but PostgreSQL may require more - especially when shared_buffers is increased. The error `could not resize shared memory segment` is the signal to increase shm_size.

**For Mac M1/M2:** when building a Docker image for deployment on an x86_64 server, always specify the target platform: `docker build --platform linux/amd64`. Without this flag, the result is an `exec format error` on the server.

PostgreSQL's default settings are fine for production - just install and run

The defaults are tuned for minimal resources: shared_buffers=128MB, work_mem=4MB, max_connections=100. Production deployments require tuning these values to match the actual server hardware.

PostgreSQL intentionally ships with conservative defaults so it can start even on a machine with 512 MB of RAM. On a server with 32 GB, those defaults use less than 1% of available capacity. Without tuning, shared_buffers is so small that PostgreSQL reads from disk constantly, and a tiny work_mem causes sort operations to spill to disk.

PostgreSQL was started with `docker run` without a `-v` volume flag. What happens when `docker rm` is run?

Key Takeaways

  • **Installation:** apt/brew + initdb creates the cluster. Use pg_lsclusters to manage multiple versions on Ubuntu
  • **postgresql.conf** is the control center. SHOW/SET to inspect values, ALTER SYSTEM for persistent changes, pg_reload_conf() to apply without a restart
  • **pg_hba.conf** is PostgreSQL's firewall. Rule order matters: the first match wins. Use scram-sha-256 in production, trust only locally
  • **Five key parameters:** shared_buffers (25% RAM), effective_cache_size (75% RAM), work_mem (handle with care), maintenance_work_mem, max_connections
  • **Docker:** always mount a volume, use healthcheck, pass parameters via command, shm_size >= 128 MB
  • shared_buffers = 128 MB on 32 GB RAM - that is 0.4% utilization of the server's capacity

Related Topics

PostgreSQL configuration is the starting point for tuning and security:

  • Data Types — Next step: what data types PostgreSQL offers and how to choose among them
  • Memory Tuning — shared_buffers, work_mem and other memory parameters in detail
  • Security — Advanced pg_hba.conf configuration, SSL, and Row-Level Security

Вопросы для размышления

  • With 100 microservices each having a connection pool of 10 - will max_connections = 100 be sufficient? What needs to change?
  • Why is the recommended shared_buffers 25% of RAM rather than 75%? Where does the rest of the memory go?
  • After migrating to a new server, pg_hba.conf was not transferred. What error appears, and how is it fixed?

Связанные уроки

  • pg-01-intro — PostgreSQL architecture - context for understanding configuration
  • pg-03-data-types — Next step: PostgreSQL data types
  • pg-43-tuning-memory — Deep dive into shared_buffers, work_mem, buffer cache
  • pg-48-security — Advanced pg_hba.conf, SSL, Row-Level Security
  • pg-45-tuning-connections — max_connections, PgBouncer - connection pooling in production
  • db-01-intro — Foundational understanding of why a DBMS is needed
  • db-03-acid
Installation and Configuration

0

1

Sign In