Computer Graphics

Linear Algebra for Graphics

Lesson Objectives

  • Compute dot and cross products and understand their geometric meaning
  • Build rotation, scale and translation matrices
  • Understand why TRS order is critical and read matrix chains right to left
  • Work with homogeneous coordinates and 4x4 matrices

Prerequisites

  • Rasterization and Pixels

Pixar 'Toy Story' 1995: the first feature-length CGI film. 800,000 hours of rendering on a farm of 117 Sun SparcStations. Today: Unreal Engine 5 Nanite - real-time rendering of one billion polygons at 60 FPS. Thirty years of progress, one foundation: the same linear algebra, just running on a GPU.

  • **Unity/Unreal Engine:** every object's Transform is a TRS 4x4 matrix. The GPU applies it to all mesh vertices simultaneously
  • **NVIDIA RTX 4090:** 82.6 teraflops - literally billions of 4x4 matrix multiplications every second
  • **Robotics:** kinematics of a Boston Dynamics arm is a chain of 4x4 matrices, each joint is rotation + translation
  • **AR/VR (Apple Vision Pro):** real-time head tracking drives the view matrix. Latency above 10 ms and the brain rejects the image
  • **NeRF / Gaussian Splatting:** 3D scene representations encode camera poses as transformation matrices - the same 4x4

Ivan Sutherland and the birth of computer graphics

In 1963, MIT graduate student Ivan Sutherland defended his thesis and demonstrated **Sketchpad** - the world's first interactive graphical program. First GUI, first direct manipulation, first use of transformation matrices for 2D objects. Sutherland sketched what lives in every GPU today: object hierarchy plus transformation matrices. He received the Turing Award in 1988.

Vectors: dot and cross product


A **vector** in 3D is a triple (x, y, z). Two key products: the **dot product** a·b = |a||b|cos(theta) (a scalar that reveals the angle between the vectors) and the **cross product** a x b (a vector perpendicular to both a and b, with length |a||b|sin(theta); its direction reveals orientation).

| Operation | Result | Use in graphics |
|---|---|---|
| a · b (dot) | Scalar | Phong shading (cos of angle), frustum culling, FOV |
| a x b (cross) | Vector perpendicular to a and b | Triangle normals, backface culling |
| \|a\| (length) | Scalar | Distance between points, normalization |
| normalize(a) | Unit vector | Directions: camera, light source, shaders |

**Dot product** answers "how aligned are two vectors" (> 0 - acute angle, = 0 - perpendicular, < 0 - obtuse angle). The diffuse term of the Phong model reduces to: brightness = max(0, dot(normal, light)), where both vectors are unit length. **Cross product** gives surface orientation: it is how triangle normals are computed, and without normals there is no shading.
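The two products can be sketched in a few lines of plain Python. The helper names (`dot`, `cross`, `normalize`, `triangle_normal`, `diffuse`) are illustrative choices rather than any engine's API, and the triangle is assumed to have counter-clockwise winding.

```python
import math

def dot(a, b):
    """Dot product of two 3D vectors: a scalar."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    """Cross product: a vector perpendicular to both a and b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(a):
    """Scale a vector to unit length."""
    length = math.sqrt(dot(a, a))
    return (a[0] / length, a[1] / length, a[2] / length)

def triangle_normal(p0, p1, p2):
    """Unit normal of a triangle (counter-clockwise winding assumed)."""
    e1 = tuple(p1[i] - p0[i] for i in range(3))
    e2 = tuple(p2[i] - p0[i] for i in range(3))
    return normalize(cross(e1, e2))

def diffuse(normal, to_light):
    """Diffuse brightness: max(0, dot(normal, light)) on unit vectors."""
    return max(0.0, dot(normalize(normal), normalize(to_light)))

n = triangle_normal((0, 0, 0), (1, 0, 0), (0, 1, 0))  # triangle in the XY plane
lit = diffuse(n, (0, 0, 1))      # light directly in front of the surface
unlit = diffuse(n, (0, 0, -1))   # light behind the surface, clamped to 0
```

The clamp in `diffuse` is what keeps surfaces facing away from the light black instead of negatively bright.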

The dot product of two unit vectors equals 0. What does this mean geometrically?

Matrices and Their Multiplication

A vector describes a point or direction. A **matrix** describes a **transformation** - how to change a vector: rotate it, scale it, project it. Multiplying a matrix by a vector applies the transformation. Every vertex in a 3D scene passes through several such multiplications every frame.

**Matrix** NxM - a table of numbers. In graphics: 3x3 for linear transformations, 4x4 for affine (with translation). Multiplying A(m x n) by v(n x 1) produces u(m x 1). The key insight: matrices can be **multiplied together** - a chain of transformations collapses into one.
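As a minimal sketch of the matrix-by-vector rule above, assuming matrices are stored as lists of rows (the name `mat_vec` is illustrative):

```python
def mat_vec(m, v):
    """Multiply an m x n matrix (list of rows) by an n-component vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

# A 3x3 matrix that rotates 90 degrees counter-clockwise about the z axis.
rot_z_90 = [[0, -1, 0],
            [1,  0, 0],
            [0,  0, 1]]

u = mat_vec(rot_z_90, [1, 0, 0])   # the x axis is carried onto the y axis

# A 2x3 matrix maps a 3-component vector to a 2-component one.
drop_z = [[1, 0, 0],
          [0, 1, 0]]
flat = mat_vec(drop_z, [5, 6, 7])
```

Each output component is the dot product of one matrix row with the input vector, which is why the number of columns must match the vector's length.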

**Composition of transformations** is matrix multiplication. If M1 is a rotation and M2 is a scale, then M2 @ M1 applied to a column vector rotates first, then scales. One matrix replaces the entire chain. This is not just convenience - it is what allows the GPU to apply one precomputed matrix to thousands of vertices in a single shader call.

**Matrix multiplication is NOT commutative!** In general, A x B != B x A. Rotate-then-Scale gives a different result than Scale-then-Rotate whenever the scale is non-uniform. Getting this order wrong is a common source of transform bugs in engines.
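A small numeric check makes the order dependence concrete. This is a sketch using the column-vector convention, with illustrative helpers `mat_mul` and `mat_vec`, a 90 degree rotation about z, and a non-uniform scale that stretches only x:

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

R = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]   # rotate 90 degrees about z
S = [[2, 0, 0], [0, 1, 0], [0, 0, 1]]    # stretch x by 2

SR = mat_mul(S, R)   # read right to left: rotate, then scale
RS = mat_mul(R, S)   # read right to left: scale, then rotate

v = [1, 0, 0]
rotate_then_scale = mat_vec(SR, v)   # [0, 1, 0]
scale_then_rotate = mat_vec(RS, v)   # [0, 2, 0]
```

The combined matrix `SR` gives the same result as applying `R` and then `S` one after the other, which is the property that lets a whole chain be precomputed once.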

| Property | True? | Meaning for graphics |
|---|---|---|
| A x B = B x A (commutativity) | No! | Order of transformations is critical |
| (A x B) x C = A x (B x C) (associativity) | Yes | Precompute M = A x B x C in advance |
| I x A = A (identity) | Yes | I is the identity: no transformation |
| A x A^-1 = I (inverse) | Yes (if it exists) | View matrix = inverse(camera TRS) |
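The inverse row can be illustrated with a camera. The sketch below assumes the camera transform contains only rotation and translation (no scale), in which case the inverse has a simple closed form: transpose the rotation and negate the rotated translation. A transform with scale would need a general 4x4 inverse instead. The names `rigid_inverse` and `camera_to_world` are hypothetical.

```python
def mat_mul(a, b):
    """Product of two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def rigid_inverse(m):
    """Invert a 4x4 matrix built only from rotation and translation."""
    # The inverse of a rotation is its transpose.
    rt = [[m[j][i] for j in range(3)] for i in range(3)]
    t = [m[i][3] for i in range(3)]
    # The inverse translation is -R^T * t.
    inv_t = [-sum(rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [rt[i] + [inv_t[i]] for i in range(3)] + [[0, 0, 0, 1]]

# Camera rotated 90 degrees about z and placed at (5, 0, 0): T x R.
camera_to_world = [[0, -1, 0, 5],
                   [1,  0, 0, 0],
                   [0,  0, 1, 0],
                   [0,  0, 0, 1]]

view = rigid_inverse(camera_to_world)
identity = mat_mul(view, camera_to_world)   # A^-1 x A = I
```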

Matrix M = R x S (R - rotation, S - scale). When applying M x v, what happens FIRST?

Basic Transformations: TRS

Three fundamental 3D transformations: **Translation** (moving), **Rotation** (rotating), **Scale** (scaling). In Unity, Unreal, and Blender, every Transform component is TRS. Every GameObject, every mesh, every camera is described by this triple.

**TRS** - the standard order: Translation x Rotation x Scale. First scale (S), then rotate (R), then translate (T). Read right to left. This is not merely a convention: scaling first keeps the scale aligned with the object's own axes, and translating last means neither rotation nor scale alters the translation offset.

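But **translation** cannot be expressed with a 3x3 matrix. Adding an offset (x + tx, y + ty, z + tz) is not a linear operation: a linear map always leaves the origin fixed, and translation moves it. Expressing translation as a matrix requires the switch to **homogeneous coordinates** and 4x4 matrices.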

| Transformation | 3x3 matrix? | 4x4 matrix? | Property |
|---|---|---|---|
| Scale | Yes | Yes | Linear |
| Rotation | Yes | Yes | Linear, preserves lengths |
| Translation | No! | Yes | Affine (not linear) |
| Perspective projection | No! | Yes | Non-linear (division by w) |

An object is scaled by 2 (S), rotated 90 degrees (R), translated by (3,0,0) (T). How should the matrix be written?

Homogeneous Coordinates and 4x4 Matrices

Scale and rotation are linear operations (v' = M x v). Translation is not (v' = v + t). How can everything be unified into a single matrix? **Homogeneous coordinates**: a fourth coordinate **w** is added, and translation becomes a matrix multiplication on 4x4.
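Written out with the column-vector convention, and taking (t_x, t_y, t_z) as the translation offset, the 4x4 translation matrix acts on a point as follows:

```latex
\begin{pmatrix}
1 & 0 & 0 & t_x \\
0 & 1 & 0 & t_y \\
0 & 0 & 1 & t_z \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
=
\begin{pmatrix} x + t_x \\ y + t_y \\ z + t_z \\ 1 \end{pmatrix}
```

The offset sits in the fourth column, so it is multiplied by the w component of the input. With w = 1 the offset is added in full.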

**Homogeneous coordinates** - representing a 3D point (x, y, z) as (x, y, z, 1) in 4D. w = 1 for points, w = 0 for directions (vectors). A 4x4 matrix encodes translation, rotation, scale, and projection in a single multiplication - this is exactly what a vertex shader does.

The w = 0 trick for vectors: translation does not affect directions. Physically sound: a surface normal pointing "right" stays pointing "right" regardless of where the object moves. Shaders rely on this for lighting calculations.
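The effect of w can be checked directly. In this sketch the helper names `translation` and `mat_vec` and the sample values are illustrative:

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

def translation(tx, ty, tz):
    """4x4 translation matrix in homogeneous coordinates."""
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]

T = translation(5, -2, 1)

vertex = [1, 2, 3, 1]   # w = 1: a position
normal = [0, 1, 0, 0]   # w = 0: a direction

moved_vertex = mat_vec(T, vertex)   # the offset is added
moved_normal = mat_vec(T, normal)   # the offset is multiplied by w = 0
```

Because the translation column is multiplied by w, a direction passes through `T` unchanged while a position is shifted by the full offset.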

| w coordinate | Type | Translation affects? | Example |
|---|---|---|---|
| w = 1 | Point (position) | Yes | Mesh vertex, camera position |
| w = 0 | Direction (vector) | No | Normal vector, light direction |
| w != 0, 1 | After projection | - | Perspective divide: (x/w, y/w, z/w) |
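The perspective divide in the last row can be sketched with a deliberately simplified projection matrix. The matrix `P` below is an assumption for illustration: the camera looks down the negative z axis and the last row copies -z into w. Real projection matrices also remap x, y and z using the field of view and the near and far planes.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

# Simplified pinhole projection: w becomes the distance in front of the camera.
P = [[1, 0,  0, 0],
     [0, 1,  0, 0],
     [0, 0,  1, 0],
     [0, 0, -1, 0]]

def project(point):
    """Apply P, then the perspective divide (x/w, y/w, z/w)."""
    x, y, z, w = mat_vec(P, point)
    return (x / w, y / w, z / w)

near = project([1, 1, -2, 1])   # 2 units in front of the camera, w = 2
far = project([1, 1, -4, 1])    # 4 units in front of the camera, w = 4
```

Both points have the same x and y before projection, but the farther one lands closer to the centre of the image after the divide, which is the perspective effect.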

Every frame of every 3D game involves millions of 4x4 matrix multiplications on 4D vectors. The RTX 4090 delivers 82.6 teraflops. Homogeneous coordinates unify the entire pipeline (TRS + view + projection) into a chain of matrix multiplications. Without this mathematics there is no Cyberpunk, no Avatar, no real-time 3D.

Misconception: the order of transformations (TRS vs SRT) does not matter - the result is the same.

In fact, order is critically important. TRS means: first Scale, then Rotate, then Translate (reading right to left). SRT means: first Translate, then Rotate, then Scale. The results differ, because in SRT the Scale also scales the Translation component.

Matrix multiplication is non-commutative: A x B != B x A. With SRT, scaling is applied after translation, distorting the object's position. With TRS, scaling is first, translation is last - and it is not distorted.
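The difference is easy to see numerically. The sketch below uses sample values chosen for illustration (uniform scale 3, rotation of 90 degrees about z, translation by (5, 0, 0)) and tracks where the object's own origin ends up under each order:

```python
def mat_mul(a, b):
    """Product of two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

T = [[1, 0, 0, 5], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]    # translate
R = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]   # rotate about z
S = [[3, 0, 0, 0], [0, 3, 0, 0], [0, 0, 3, 0], [0, 0, 0, 1]]    # scale

trs = mat_mul(T, mat_mul(R, S))   # scale, then rotate, then translate
srt = mat_mul(S, mat_mul(R, T))   # translate, then rotate, then scale

origin = [0, 0, 0, 1]             # the object's own origin
pos_trs = mat_vec(trs, origin)    # lands exactly at the requested offset
pos_srt = mat_vec(srt, origin)    # offset has been rotated and tripled
```

With TRS the object sits at (5, 0, 0), as requested. With SRT the same offset is rotated onto the y axis and scaled by 3, so the object ends up at (0, 15, 0).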

Direction vector (1, 0, 0) is stored as (1, 0, 0, 0). What happens when multiplied by translation matrix T?

Key Ideas

  • **Dot product** = cos(angle) times lengths. Foundation of Phong lighting, backface culling, FOV
  • **Cross product** = perpendicular vector. Foundation of normals, triangle orientation
  • **4x4 matrices** in homogeneous coordinates combine TRS + projection into a single multiplication
  • **TRS order is critical:** Scale - Rotate - Translate (read right to left). A x B != B x A
  • **w = 1** for points (translation applies), **w = 0** for directions (translation ignored)
  • **Perspective divide:** dividing by w creates the perspective effect - distant objects appear smaller

Related Topics

Linear algebra is the language of all 3D graphics:

  • Rasterization and Pixels — Transformed vertices in screen space are passed to the rasterizer
  • Coordinate Spaces — TRS, View, Projection matrices are the transitions model->world->view->clip
  • Computational Geometry — Cross product and orientation - shared foundation with comp. geometry

Questions for Reflection

  • Why are GPUs optimized for 4x4 matrix multiplication rather than arbitrary matrix sizes?
  • Quaternions are often used instead of matrices for rotation. What advantages do they offer?
  • How would rotation around an arbitrary point (not the origin) be implemented using TRS matrices?

Related Lessons

  • cg-01 — Rasterization and screen space introduced in the previous lesson
  • cg-03 — Coordinate spaces model/world/view/clip follow directly from TRS and 4x4
  • cgeom-01 — Cross product and orientation test - shared foundation with computational geometry
  • la-06-transformations — Full theory of linear maps, change-of-basis and eigendecomposition
  • arch-15-gpu-architecture — GPUs are optimized precisely for 4x4 matrix multiplication on 4D vectors
  • la-05-matrices-intro — Core matrix operations if a refresher is needed