Computer Graphics

Triangle Rasterization

1996. id Software releases Quake. For the first time in gaming history: a fully 3D world, perspective-correct texture mapping, real polygonal models instead of sprites. Michael Abrash and John Carmack wrote a software rasterizer fast enough to push 320x200 pixels at 30fps on a 486. Every triangle: barycentric coordinates, z-buffer, texture lookup - thousands of times per second. A modern GPU does this for 100 million triangles in 16 milliseconds. The algorithm is identical - just in silicon.

  • **Unreal Engine 5**: Nanite virtual geometry renders scenes with trillions of polygons through hierarchical GPU rasterization
  • **Three.js/WebGL**: every browser 3D project uses these exact algorithms through the GPU pipeline in fragment shaders
  • **NVIDIA OptiX**: hybrid rendering - rasterization for primary visibility, ray tracing for shadows and reflections

Historical context

In 1974, Ed Catmull - future president of Pixar - completed his PhD dissertation at the University of Utah describing the z-buffer algorithm. At the time it was purely academic: an SGI workstation cost $50,000. Concurrently, Henri Gouraud developed smooth shading (1971) and Bui Tuong Phong developed specular shading (1973) at the same institution. The University of Utah in the 1970s was the global center of computer graphics research - Jim Clark (Silicon Graphics founder) and Alan Kay (Smalltalk) both studied there. The SGI Reality Engine (1993) made z-buffering a hardware feature. The GeForce 256 (1999) put hardware T&L on consumer cards; the GeForce 3 (2001) added programmable vertex and pixel shaders. Catmull received the Turing Award in 2019.

Scanline: row-by-row rasterization

0

1

Sign In

The scanline algorithm rasterizes a triangle by sweeping horizontal lines from top to bottom. For each row Y, it computes the left (x_left) and right (x_right) edge intersections, then fills every pixel between them. The triangle is split into two sub-triangles at the middle vertex. This sequential row-fill gives excellent cache locality on CPU - pixels written left-to-right match memory layout.

Scanline is efficient for CPU rasterization: sequential horizontal fill matches cache line layout. GPU rasterizers use a different approach - **barycentric coordinates** tested in parallel over all pixels of the bounding box simultaneously. A SIMD shader evaluating 32 pixels at once beats scanline's sequential logic, even though it tests pixels outside the triangle and discards them.

Why do GPU rasterizers use bounding-box + barycentric testing instead of the scanline algorithm?

Barycentric Coordinates

Barycentric coordinates (λ₁, λ₂, λ₃) for point P relative to triangle (A, B, C): each coordinate is the ratio of the sub-triangle area to the total area. Key properties: λ₁ + λ₂ + λ₃ = 1. A point lies inside the triangle if and only if all three coordinates are in [0, 1]. This sum-to-one constraint is what makes barycentric coordinates a universal interpolation tool.

Barycentric coordinates are not just a point-in-triangle test - they are a **universal interpolation engine**: color, UV coordinates, normals, any vertex attribute interpolates smoothly as λ = l1*attr0 + l2*attr1 + l3*attr2. At vertex v0: λ₁=1, λ₂=λ₃=0, so the attribute equals a0 exactly. This is the mathematical foundation of Gouraud shading and texture mapping.

What property of barycentric coordinates guarantees correct attribute interpolation inside a triangle?

Z-Buffer: depth-based visibility

The z-buffer (depth buffer) solves the visibility problem: when multiple triangles project to the same pixel, render the nearest one. The algorithm keeps a per-pixel depth value initialized to infinity. For each rasterized fragment: if its interpolated Z is smaller than the stored depth, update both color and depth. Fragments further away are discarded without touching the framebuffer.

**Z-fighting**: two triangles with nearly identical Z values cause flickering due to floating-point precision limits. Solutions: polygon offset (glPolygonOffset), reversed z-buffer (store 1/z instead of z - float precision is denser near zero, improving near-object accuracy), or shadow bias for shadow mapping. **Early Z-test** on GPU: depth check runs before the fragment shader to discard invisible fragments without executing expensive shading.

Why does a reversed z-buffer (storing 1/z instead of z) provide better visibility precision?

Perspective-Correct Attribute Interpolation

Linear barycentric interpolation in screen space is incorrect under perspective projection. The problem: the perspective divide (dividing by w) introduces a nonlinear distortion. A texture applied to a floor plane using screen-space linear UV interpolation visibly slides and warps as the camera moves. The fix is **perspective-correct interpolation**: interpolate attr/w, then multiply by the fragment's 1/w.

Perspective-correct interpolation was first implemented correctly on SGI workstations in 1993. Before that, 3D games (Doom, 1993) used linear screen-space interpolation - the infamous texture 'warping' visible on large surfaces. Quake (1996) was the first mass-market product with perspective-correct texture mapping on consumer hardware. Modern GPUs hardwire this computation at no extra cost.

Rasterization is obsolete - all modern renderers use ray tracing

Rasterization dominates real-time rendering (games, VR) due to speed; ray tracing is used for offline rendering and hybrid approaches (DXR, NVIDIA RTX) in games

Rasterization scales as O(n_triangles * avg_pixels_per_triangle); ray tracing as O(n_pixels * ray_depth * BVH_cost). For 60fps at 4K, rasterization is 10-100x faster

Why is linear UV interpolation in screen space incorrect under perspective projection?

Key ideas

  • **Scanline**: row-by-row rasterization with edge interpolation - CPU-efficient, poorly parallelizable
  • **Barycentric coordinates**: λ₁+λ₂+λ₃=1, inside-test and attribute interpolation for color, UV, normals
  • **Z-buffer**: per-pixel depth, closest fragment wins; z-fighting solved by polygon offset or reversed-Z
  • **Perspective-correct interpolation**: interpolate attr/w, divide by 1/w - eliminates texture warping under perspective

Вопросы для размышления

  • How does early Z-test (depth culling before the fragment shader) reduce GPU load, and under what conditions does it fail to activate?
  • Why does Gouraud shading (color interpolation) produce poor specular highlights compared to Phong shading (normal interpolation)?
  • How does MSAA (Multisample Anti-Aliasing) run multiple Z-tests per pixel without running the fragment shader multiple times?

Связанные уроки

  • cg-03 — Ray tracing - the alternative visibility algorithm that rasterization trades quality for speed against
  • cg-05 — Texture mapping and shading models build directly on barycentric interpolation established here
  • cgeom-04 — Barycentric coordinates appear identically in Delaunay triangulation and in rasterization - same math, different domain
  • cg-02 — Perspective projection from cg-02 produces screen-space coordinates that rasterization converts to pixels
  • cg-06 — Shadow mapping and deferred shading pipelines rely on the z-buffer mechanics covered here
  • la-03-cross-product
Triangle Rasterization