Robotics

Modern Control Theory

PID is effective but has no internal model of the system: it reacts to the current error and cannot predict what will happen five seconds ahead. LQR and MPC are model-based. They use a mathematical model of the plant to compute control actions that are optimal with respect to a chosen cost function. The difference is between a driver watching only the speedometer and a driver who sees the entire road ahead.

**SpaceX Falcon 9:** MPC for vertical landing with explicit thrust constraints
**Tesla Autopilot:** LQR/MPC for lane keeping and adaptive cruise control
**Boston Dynamics Spot:** MPC for balance and locomotion on uneven terrain
**Industrial process control:** MPC has managed oil refineries since the 1980s

State Space Representation

A PID controller tracks one variable using only its error. For a complex system - an inverted pendulum on a cart, for instance - multiple variables must be tracked simultaneously (position, velocity, angle, angular velocity) and their mutual influence must be captured. **State space** is the mathematical language for this.

Standard form: dx/dt = A*x + B*u, y = C*x + D*u. Here x is the state vector (all system variables), u is the control input, and y is the measured output. Matrices A, B, C, D completely describe the linear dynamics of the system.
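To make the standard form concrete, here is a minimal sketch that simulates dx/dt = A*x + B*u, y = C*x + D*u with forward Euler integration. The mass-spring-damper plant, its parameter values, and the helper name `simulate` are illustrative assumptions, not part of the lesson. Plain Python lists are used to keep the example dependency-free.

```python
def simulate(A, B, C, D, x0, u_func, dt, steps):
    """Forward-Euler simulation of dx/dt = A x + B u, y = C x + D u.

    Single-input, single-output for brevity: A is an n x n list of lists,
    B and C are length-n lists, D is a scalar.
    Returns the final state and the list of outputs y, one per step.
    """
    n = len(x0)
    x = list(x0)
    outputs = []
    for k in range(steps):
        u = u_func(k * dt)
        # Output equation: y = C x + D u
        outputs.append(sum(C[j] * x[j] for j in range(n)) + D * u)
        # State equation: dx/dt = A x + B u
        dx = [sum(A[i][j] * x[j] for j in range(n)) + B[i] * u
              for i in range(n)]
        x = [x[i] + dt * dx[i] for i in range(n)]
    return x, outputs


# Hypothetical plant: mass-spring-damper, state x = [position, velocity]
m, k_spring, c_damp = 1.0, 2.0, 3.0
A = [[0.0, 1.0],
     [-k_spring / m, -c_damp / m]]   # how the state evolves on its own
B = [0.0, 1.0 / m]                   # how the force input enters
C = [1.0, 0.0]                       # only position is measured
D = 0.0                              # no direct feedthrough

# Constant unit force for 20 simulated seconds
x_final, ys = simulate(A, B, C, D, x0=[0.0, 0.0],
                       u_func=lambda t: 1.0, dt=0.01, steps=2000)
```

Under a constant unit force the position settles at force / k_spring = 0.5 with zero velocity, which is the state where A*x + B*u = 0.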

In the state space equation dx/dt = Ax + Bu, matrix A describes:

LQR: Linear Quadratic Regulator

The control objective is to bring the system to a desired state while minimising total cost. **LQR** formalises this as an optimisation problem: find the linear feedback law u = -Kx that minimises a quadratic cost functional.

Cost functional J = integral(x^T Q x + u^T R u) dt, taken from 0 to infinity. Matrix Q (positive semidefinite) penalises state deviation (how much deviating from equilibrium matters). Matrix R (positive definite) penalises control effort (energy cost). The relative weighting of Q against R determines how aggressive the regulator is.

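**Tuning Q and R:** the central engineering choice in LQR. For the cart-pendulum state ordered as (position, velocity, angle, angular velocity) with zero-based indexing, a large Q[2,2] (the angle penalty) makes the controller stabilise the pendulum aggressively. A large R produces conservative control with less energy use. The relative weighting of Q and R matters more than absolute values: scaling both by the same factor leaves the gain K unchanged. A common starting point: Q=I, R=I, then adjust.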
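The effect of the Q/R trade-off can be seen in a scalar example, where the Riccati equation has a closed-form solution. The sketch below assumes a hypothetical one-state plant dx/dt = a*x + b*u with cost integral(q*x^2 + r*u^2) dt. The function name `scalar_lqr` and the numeric values are illustrative only.

```python
import math


def scalar_lqr(a, b, q, r):
    """LQR for the scalar plant dx/dt = a*x + b*u.

    Cost: integral(q*x^2 + r*u^2) dt with q > 0, r > 0, b != 0.
    Returns (k, p): the gain of the feedback law u = -k*x and the
    positive root p of the scalar algebraic Riccati equation
        2*a*p - (b*p)**2 / r + q = 0.
    """
    if b == 0 or q <= 0 or r <= 0:
        raise ValueError("need b != 0, q > 0 and r > 0")
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    k = b * p / r          # scalar version of K = R^-1 B^T P
    return k, p


# Unstable plant (a > 0); sweep the control penalty r with q fixed
a, b, q = 1.0, 1.0, 1.0
gains = {r: scalar_lqr(a, b, q, r)[0] for r in (0.01, 1.0, 100.0)}

for r, k in gains.items():
    closed_loop_pole = a - b * k   # equals -sqrt(a^2 + b^2 * q / r)
    print(f"r={r:>6}: k={k:.3f}, closed-loop pole={closed_loop_pole:.3f}")
```

As r grows, the gain shrinks towards 2*a/b, the smallest effort that still stabilises the plant, and the closed-loop pole moves closer to the origin. This is the slower, more conservative response the text describes.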

Increasing matrix R in the LQR cost functional results in:

MPC: Model Predictive Control

LQR is optimal for linear unconstrained systems. Real robots have actuator limits, workspace boundaries, and safety envelopes. **MPC** solves an optimisation problem at each control step, explicitly incorporating all constraints.

At each time step, MPC looks N steps ahead, optimises the control sequence u_0, u_1, ..., u_{N-1}, applies only u_0, then repeats with updated state measurements. This is the **receding horizon** (sliding window) principle.
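The receding horizon loop can be sketched without a numerical solver by brute force: enumerate every admissible control sequence, keep the cheapest, apply its first element, and repeat. The toy plant x[k+1] = x[k] + dt*u[k], the five-level input grid, the cost weights, and the function names below are all assumptions chosen for illustration. Practical MPC typically solves a QP or nonlinear program instead of enumerating.

```python
from itertools import product


def mpc_first_input(x, horizon, u_options, dt, q=1.0, r=0.01):
    """One MPC step for the toy plant x[k+1] = x[k] + dt * u[k].

    Enumerates every control sequence of length `horizon` drawn from
    `u_options` (the input constraint set), scores each with a quadratic
    cost, and returns only the first input of the cheapest sequence.
    """
    best_cost = float("inf")
    best_u0 = 0.0
    for seq in product(u_options, repeat=horizon):
        x_pred = x
        cost = 0.0
        for u in seq:
            x_pred = x_pred + dt * u              # model prediction
            cost += q * x_pred ** 2 + r * u ** 2  # stage cost
        if cost < best_cost:
            best_cost, best_u0 = cost, seq[0]
    return best_u0


def run_mpc(x0, steps, horizon=5, dt=0.2, u_max=1.0):
    """Closed-loop receding-horizon simulation; returns (states, inputs)."""
    # Constraint |u| <= u_max is enforced by construction of the grid
    u_options = (-u_max, -0.5 * u_max, 0.0, 0.5 * u_max, u_max)
    states, inputs = [x0], []
    x = x0
    for _ in range(steps):
        u = mpc_first_input(x, horizon, u_options, dt)  # re-optimise from latest state
        x = x + dt * u   # plant update (here identical to the model)
        states.append(x)
        inputs.append(u)
    return states, inputs


states, inputs = run_mpc(x0=1.0, steps=10)
```

In this run the controller saturates at u = -u_max while the state is far from zero, then holds u = 0 once it arrives. Because the optimisation is repeated from the measured state at every step, a disturbance or model mismatch would be picked up at the next re-plan rather than propagated through a fixed open-loop sequence.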

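| Property | PID | LQR | MPC |
|---|---|---|---|
| Constraints (u_min, u_max) | None (manual clipping) | None | Explicit |
| Nonlinear systems | Partial | No (needs linearisation) | Yes (NMPC) |
| Multi-axis control | Difficult | Yes | Yes |
| Computational cost per step | Low, O(1) | Low, O(n) per input (K is computed offline) | High: solves an optimisation problem every step; grows with N and n |
| Tuning parameters | 3 gains | Q, R matrices | Q, R + horizon N |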

MPC applies only u_0 from the optimised sequence u_0...u_{N-1}. The reason is:

Optimal Control and Pontryagin's Maximum Principle

LQR and MPC are specific instances of optimal control. The theoretical foundation is **Pontryagin's Maximum Principle** (PMP, 1956): necessary conditions for the optimality of any control law, analogous to Lagrange multipliers for variational problems.

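Given the problem: minimise J = integral(L(x,u) dt) + phi(x(T)) subject to the dynamics dx/dt = f(x,u), PMP introduces a **co-state** vector lambda. With this sign convention, the optimal control minimises the Hamiltonian H(x, u, lambda) = L(x,u) + lambda^T * f(x,u) over the admissible inputs at every instant. The name "maximum principle" comes from the opposite sign convention for H.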
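Written out, the necessary conditions take the following form. This is a sketch assuming dynamics dx/dt = f(x,u), a fixed final time T, a free terminal state, and an admissible input set U. The symbols x_0 and U are introduced here for illustration.

```latex
\begin{aligned}
\dot{x} &= \frac{\partial H}{\partial \lambda} = f(x,u),
  & x(0) &= x_0 \\
\dot{\lambda} &= -\frac{\partial H}{\partial x}
  = -\frac{\partial L}{\partial x}
    - \left(\frac{\partial f}{\partial x}\right)^{T}\lambda,
  & \lambda(T) &= \frac{\partial \phi}{\partial x}\bigl(x(T)\bigr) \\
u^{*}(t) &= \arg\min_{u \in U} H\bigl(x^{*}(t),\, u,\, \lambda(t)\bigr)
\end{aligned}
```

The state runs forward from x(0) and the co-state runs backward from lambda(T), so the conditions form a two-point boundary value problem. When H is linear in u and the input is bounded, for example minimum time with L = 1, f = A x + B u, scalar u and |u| <= u_max, the minimiser always sits on the boundary: u* = -u_max * sign(B^T lambda).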

**Connection between methods:** PMP provides necessary conditions for optimality. Bellman dynamic programming gives sufficient conditions via the Hamilton-Jacobi-Bellman (HJB) equation. For linear systems with quadratic cost, both approaches lead to the same Riccati equation (the algebraic Riccati equation in the infinite-horizon case). Its solution P gives the LQR gain K = R^-1 B^T P.
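The PMP route to the Riccati equation can be sketched in a few lines. The derivation assumes linear dynamics dx/dt = A x + B u, the cost integrand x^T Q x + u^T R u with symmetric Q and R, an infinite horizon, and the ansatz lambda = 2 P x with constant symmetric P. The factor 2 matches a cost written without a leading 1/2.

```latex
\begin{aligned}
H &= x^{T} Q x + u^{T} R u + \lambda^{T}(A x + B u) \\
\frac{\partial H}{\partial u} = 0
  &\;\Rightarrow\; 2 R u + B^{T}\lambda = 0
  \;\Rightarrow\; u = -\tfrac{1}{2} R^{-1} B^{T} \lambda \\
\lambda = 2 P x
  &\;\Rightarrow\; u = -R^{-1} B^{T} P x = -K x,
  \qquad K = R^{-1} B^{T} P \\
\dot{\lambda} = -\frac{\partial H}{\partial x}
  &\;\Rightarrow\; 2 P (A x + B u) = -2 Q x - 2 A^{T} P x \\
  &\;\Rightarrow\; A^{T} P + P A - P B R^{-1} B^{T} P + Q = 0
\end{aligned}
```

The last line must hold for every x, which is why the state drops out and a matrix equation in P remains. The same equation follows from HJB with the quadratic value function V(x) = x^T P x.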

Bang-bang control is optimal for minimum-time problems. It means the control signal: