Robotics
Modern Control Theory
PID is effective but has no internal model of the system: it reacts to the current error and cannot reason about what will happen five seconds ahead. LQR and MPC can, because they use a mathematical model of the plant to compute optimal control actions. The difference is between a driver watching only the speedometer and a driver who sees the road ahead.
- **SpaceX Falcon 9:** MPC for vertical landing with explicit thrust constraints
- **Tesla Autopilot:** LQR/MPC for lane keeping and adaptive cruise control
- **Boston Dynamics Spot:** MPC for balance and locomotion on uneven terrain
- **Industrial process control:** MPC has managed oil refineries since the 1980s
State Space Representation
A PID controller tracks one variable using only its error. For a complex system - an inverted pendulum on a cart, for instance - multiple variables must be tracked simultaneously (position, velocity, angle, angular velocity) and their mutual influence must be captured. **State space** is the mathematical language for this.
Standard form: dx/dt = A*x + B*u, y = C*x + D*u. Here x is the state vector (all system variables), u is the control input, and y is the measured output. Matrices A, B, C, D completely describe the linear dynamics of the system.
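As a rough illustration of the standard form, the sketch below builds A, B, C, D for the cart-pole mentioned above, linearised about the upright position. The physical values (M, m, l), the frictionless point-mass pole model, the angle sign convention, and the choice to measure only position and angle are all assumptions made for this example.

```python
import numpy as np

def cart_pole_matrices(M=1.0, m=0.1, l=0.5, g=9.81):
    """Return (A, B, C, D) for a cart-pole linearised about the upright position.

    State x = [position, velocity, angle, angular velocity]; input u = force on the cart.
    M, m, l are hypothetical values: cart mass, pole point mass, pole length.
    """
    A = np.array([
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, -m * g / M, 0.0],
        [0.0, 0.0, 0.0, 1.0],
        [0.0, 0.0, (M + m) * g / (M * l), 0.0],
    ])
    B = np.array([[0.0], [1.0 / M], [0.0], [-1.0 / (M * l)]])
    # Assumption: only position and angle are measured.
    C = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])
    D = np.zeros((2, 1))  # no direct feedthrough from u to y
    return A, B, C, D

A, B, C, D = cart_pole_matrices()

# Eigenvalues of A are the open-loop poles; a positive real part means instability.
open_loop_poles = np.linalg.eigvals(A)

# Controllability matrix [B, AB, A^2 B, A^3 B]; full rank means u can steer every state.
n = A.shape[0]
ctrb = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])
ctrb_rank = np.linalg.matrix_rank(ctrb)

print("open-loop poles:", np.round(open_loop_poles, 3))
print("controllability rank:", ctrb_rank, "of", n)
```

One pole has a positive real part, which reflects the upright pendulum falling over without control, while the full-rank controllability matrix indicates that a single force input can still influence all four states.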
In the state space equation dx/dt = Ax + Bu, what does matrix A describe?
LQR: Linear Quadratic Regulator
The control objective is to bring the system to a desired state while minimising total cost. **LQR** formalises this as an optimisation problem: find the linear feedback law u = -Kx that minimises a quadratic cost functional.
Cost functional: J = integral(x^T Q x + u^T R u) dt. Matrix Q penalises state deviation (how much straying from equilibrium matters), and matrix R penalises control effort (energy cost). The weighting of Q relative to R determines how aggressive the regulator is.
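**Tuning Q and R:** this is the central engineering choice in LQR. With the pendulum state ordered as (position, velocity, angle, angular velocity) and zero-based indexing, a large Q[2,2] (angle penalty) stabilises the pole aggressively. A large R produces conservative control that uses less energy. Only the relative weighting matters: multiplying Q and R by the same positive factor leaves the gain K unchanged. A common starting point is Q = I, R = I, adjusted from there.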
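The tuning effects described above can be explored with a small NumPy sketch. It computes the gain as K = R^-1 B^T P, where P is the stabilising solution of the algebraic Riccati equation, obtained here from the stable eigenvectors of the Hamiltonian matrix. This is one textbook method, and in practice a library routine such as `scipy.linalg.solve_continuous_are` would normally be used. The helper name `lqr` and the cart-pole numbers are assumptions for illustration.

```python
import numpy as np

def lqr(A, B, Q, R):
    """Continuous-time LQR: return (K, P) with K = R^-1 B^T P.

    P is the stabilising solution of A^T P + P A - P B R^-1 B^T P + Q = 0,
    built from the stable eigenvectors of the Hamiltonian matrix.
    """
    n = A.shape[0]
    R_inv = np.linalg.inv(R)
    H = np.block([[A, -B @ R_inv @ B.T],
                  [-Q, -A.T]])
    eigvals, eigvecs = np.linalg.eig(H)
    stable = eigvecs[:, eigvals.real < 0]      # the n eigenvectors with Re(s) < 0
    X1, X2 = stable[:n, :], stable[n:, :]
    P = np.real(X2 @ np.linalg.inv(X1))
    K = R_inv @ B.T @ P
    return K, P

# Linearised cart-pole (assumed values: M = 1 kg, m = 0.1 kg, l = 0.5 m).
A = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, -0.981, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, 21.582, 0.0]])
B = np.array([[0.0], [1.0], [0.0], [-2.0]])

Q = np.eye(4)
R = np.eye(1)
K_base, P_base = lqr(A, B, Q, R)              # starting point: Q = I, R = I

Q_angle = Q.copy()
Q_angle[2, 2] = 100.0                         # heavier penalty on the angle
K_angle, _ = lqr(A, B, Q_angle, R)

K_scaled, _ = lqr(A, B, 10.0 * Q, 10.0 * R)   # same relative weighting as the baseline

closed_loop_poles = np.linalg.eigvals(A - B @ K_base)
print("K (Q=I, R=I):     ", np.round(K_base, 3))
print("K (Q[2,2]=100):   ", np.round(K_angle, 3))
print("K (10*Q, 10*R):   ", np.round(K_scaled, 3))
print("closed-loop poles:", np.round(closed_loop_poles, 3))
```

Comparing the printed gains shows how the angle penalty reshapes K, and that scaling Q and R together reproduces the baseline gain.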
What is the result of increasing matrix R in the LQR cost functional?
MPC: Model Predictive Control
LQR is optimal for linear unconstrained systems. Real robots have actuator limits, workspace boundaries, and safety envelopes. **MPC** solves an optimisation problem at each control step, explicitly incorporating all constraints.
At each time step, MPC looks N steps ahead, optimises the control sequence u_0, u_1, ..., u_{N-1}, applies only u_0, then repeats with updated state measurements. This is the **receding horizon** (sliding window) principle.
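The receding horizon loop can be sketched in a few lines. The example below is a minimal illustration, not a production solver: it uses a discretised double integrator as the plant, a hard input limit |u| <= u_max, and plain projected gradient descent in place of a dedicated QP solver. The values of dt, N, u_max, Q, and R are hypothetical, and the simulated plant is assumed to match the model exactly.

```python
import numpy as np

dt, N, u_max = 0.1, 20, 1.0             # sample time, horizon, input limit (assumed values)
Ad = np.array([[1.0, dt], [0.0, 1.0]])  # discretised double integrator: [position, velocity]
Bd = np.array([[0.5 * dt**2], [dt]])
Q, R = np.eye(2), 0.1 * np.eye(1)
n, m = Bd.shape

# Stacked prediction over the horizon: X = Phi @ x0 + Gamma @ U, with X = [x_1, ..., x_N].
Phi = np.vstack([np.linalg.matrix_power(Ad, k + 1) for k in range(N)])
Gamma = np.zeros((N * n, N * m))
for k in range(N):
    for j in range(k + 1):
        Gamma[k*n:(k+1)*n, j*m:(j+1)*m] = np.linalg.matrix_power(Ad, k - j) @ Bd

Qbar, Rbar = np.kron(np.eye(N), Q), np.kron(np.eye(N), R)
Hess = 2.0 * (Gamma.T @ Qbar @ Gamma + Rbar)   # Hessian of the quadratic cost in U
step = 1.0 / np.linalg.eigvalsh(Hess).max()    # step size 1/L for gradient descent

def solve_horizon(x0, iters=300):
    """Minimise the N-step cost subject to |u_k| <= u_max by projected gradient descent."""
    g = 2.0 * Gamma.T @ Qbar @ Phi @ x0
    U = np.zeros(N * m)
    for _ in range(iters):
        U = np.clip(U - step * (Hess @ U + g), -u_max, u_max)
    return U

x = np.array([5.0, 0.0])   # start 5 units from the target, at rest
applied = []
for _ in range(100):
    U = solve_horizon(x)   # optimise u_0 ... u_{N-1} from the current state
    u0 = U[:m]             # apply only the first input
    applied.append(float(u0[0]))
    x = Ad @ x + Bd @ u0   # plant advances one step; the next solve starts from the new x

print("largest |u| applied:", max(abs(u) for u in applied))
print("final state:", np.round(x, 3))
```

Because the projection step clips every candidate sequence to the limits, each applied input respects the constraint by construction, which is the property that plain LQR cannot guarantee.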
| Property | PID | LQR | MPC |
|---|---|---|---|
| Constraints (u_min, u_max) | None (manual clip) | None | Explicit |
| Nonlinear systems | Partial | No | Yes (NMPC) |
| Multi-axis control | Difficult | Yes | Yes |
| Computational cost per step | Low O(1) | Low O(n) | High: an optimisation solve, roughly O(N*n^2) or more depending on the solver |
| Tuning parameters | 3 gains | Q, R matrices | Q, R + horizon N |
MPC applies only u_0 from the optimised sequence u_0...u_{N-1}. Why?
Optimal Control and Pontryagin's Maximum Principle
LQR and MPC are specific instances of optimal control. The theoretical foundation is **Pontryagin's Maximum Principle** (PMP, 1956), which gives necessary conditions that any optimal control must satisfy. Its co-state plays the role that Lagrange multipliers play in constrained optimisation.
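Given the problem: minimise J = integral(L(x,u) dt) + phi(x(T)) subject to the dynamics dx/dt = f(x,u), PMP introduces a **co-state** vector lambda. With the Hamiltonian defined as H(x, u, lambda) = L(x,u) + lambda^T * f(x,u), the optimal control minimises H at every instant. Under the opposite sign convention the same condition is stated as a maximum, hence the name.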
**Connection between methods:** PMP provides necessary conditions for optimality. Bellman dynamic programming gives sufficient conditions via the Hamilton-Jacobi-Bellman (HJB) equation. For linear systems with quadratic cost over an infinite horizon, both methods lead to the same algebraic Riccati equation. Its solution P gives the LQR gain K = R^{-1} B^T P.
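As a worked illustration of that connection, the steps below apply PMP to the LQR problem with dynamics dx/dt = Ax + Bu and running cost L = x^T Q x + u^T R u. The derivation assumes Q and R are symmetric, R is positive definite, the horizon is infinite, and the co-state has the form lambda = 2Px with constant symmetric P. The last assumption is an ansatz, not something PMP itself supplies.

```latex
\begin{aligned}
H(x,u,\lambda) &= x^\top Q x + u^\top R u + \lambda^\top (A x + B u) \\
\frac{\partial H}{\partial u} = 0
  &\;\Rightarrow\; 2 R u + B^\top \lambda = 0
  \;\Rightarrow\; u^* = -\tfrac{1}{2} R^{-1} B^\top \lambda \\
\dot{\lambda} &= -\frac{\partial H}{\partial x} = -2 Q x - A^\top \lambda \\
\lambda = 2 P x
  &\;\Rightarrow\; u^* = -R^{-1} B^\top P x = -K x, \qquad K = R^{-1} B^\top P \\
\dot{\lambda} = 2 P \dot{x}
  &\;\Rightarrow\; 2 P \left(A - B R^{-1} B^\top P\right) x = -2 Q x - 2 A^\top P x \\
  &\;\Rightarrow\; A^\top P + P A - P B R^{-1} B^\top P + Q = 0
\end{aligned}
```

The final line must hold for every state x, which is why it reduces to a matrix equation in P alone.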
Bang-bang control is optimal for minimum-time problems with bounded control inputs. What does that mean for the control signal?
Modern Control Theory
- State space dx/dt = Ax + Bu: unified language for multivariable dynamic systems
- LQR: optimal linear feedback u=-Kx, minimises J=integral(x^TQx + u^TRu)dt
- K = R^{-1}B^TP, where P solves the algebraic Riccati equation: A^TP + PA - PBR^{-1}B^TP + Q = 0
- MPC: optimise over N-step horizon with explicit constraints, apply only first control
- PMP: necessary conditions for optimality via Hamiltonian and co-state lambda
- MPC is more flexible than LQR (constraints, nonlinear models); LQR is far cheaper to compute at run time
Related topics
Optimal control bridges classical control theory with optimisation and modern machine learning.
- PID Controller — Practical predecessor that MPC extends with constraints and optimality
- Kalman Filter — State estimation required before applying LQR/MPC (LQG = LQR + Kalman)
- Reinforcement Learning for robotics — Model-free alternative to MPC for complex nonlinear systems
Questions for reflection
- Why did MPC displace LQR in industrial applications despite its higher computational cost?
- LQR assumes a linear system. How is it applied to nonlinear robots in practice?
- What do MPC in control and planning in RL (MCTS, Dreamer) have in common?