Computer Graphics

Particle Systems and VFX: Simulation, Rendering, and GPU Instancing

William Reeves introduced particle systems at Lucasfilm for a 60-second sequence in Star Trek II: The Wrath of Khan (1982), the Genesis Device explosion, and described the technique in a 1983 paper. More than four decades later, a single Unreal Engine 5 Niagara emitter can simulate 1 million particles entirely on the GPU at 60 Hz. The lifecycle loop Reeves designed has not changed. The hardware running it is many orders of magnitude faster.

  • Horizon Forbidden West: 200 000 simultaneous fire particles per wildfire scene - entirely GPU-simulated via compute shaders in Guerrilla's Decima engine
  • Unity VFX Graph: node-based GPU particle system adopted by Riot Games for League of Legends next-gen VFX - zero CPU cost for particle update
  • Fortnite: GPU particles for building destruction debris - 50 000 fragments with Verlet physics and collision, spawned in one frame
  • ARKit particle effects: iPhone renders GPU-instanced confetti at 1 million particles in AR at 60 Hz on A17 Pro - mobile GPU instancing in production

Particle Lifecycle: Emitter, Update, and Render

Every fire effect in every game since William Reeves introduced particle systems at Lucasfilm for Star Trek II: The Wrath of Khan follows the same lifecycle. **Emit** - spawn particles with initial properties. **Update** - simulate physics each frame. **Render** - draw each particle as a billboard, mesh, or ribbon. **Kill** - remove particles past their lifetime. The cycle runs once per frame (60 times per second at 60 Hz) for every active effect.

In garbage-collected runtimes, the **object pool** is effectively mandatory for particle systems. Allocating and garbage-collecting 10 000 particle objects per second causes GC spikes that drop frames. Pre-allocating a fixed pool and recycling dead particles keeps memory stable. The same pattern appears in audio engines (pooled AudioBuffer objects) and physics engines (pooled RigidBody allocations).
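The emit, update, and kill steps can be sketched together with a fixed pool. This is a minimal illustration under stated assumptions, not any engine's implementation: the names `Particle`, `ParticlePool`, `emit`, `update`, and `kill` are hypothetical, the simulation is 2D, and the spawn ranges are arbitrary. Live particles are kept packed at the front of the array, and a dead particle is swapped with the last live one.

```python
import random
from dataclasses import dataclass


@dataclass
class Particle:
    x: float = 0.0
    y: float = 0.0
    vx: float = 0.0
    vy: float = 0.0
    age: float = 0.0
    lifetime: float = 1.0


class ParticlePool:
    """Fixed-capacity pool: live particles occupy slots [0, alive)."""

    def __init__(self, capacity: int, seed: int = 0) -> None:
        # Every Particle object is allocated once, up front.
        self.particles = [Particle() for _ in range(capacity)]
        self.alive = 0
        self.rng = random.Random(seed)

    def emit(self, x: float, y: float) -> bool:
        """Reuse the first dead slot; refuse to spawn when the pool is full."""
        if self.alive == len(self.particles):
            return False
        p = self.particles[self.alive]
        p.x, p.y = x, y
        p.vx = self.rng.uniform(-1.0, 1.0)   # arbitrary illustrative ranges
        p.vy = self.rng.uniform(2.0, 4.0)
        p.age = 0.0
        p.lifetime = self.rng.uniform(0.5, 1.5)
        self.alive += 1
        return True

    def update(self, dt: float, gravity: float = -9.81) -> None:
        # Iterate backward: a kill swaps in the last live particle, which sits
        # at a higher index and has therefore already been updated this frame.
        for i in range(self.alive - 1, -1, -1):
            p = self.particles[i]
            p.age += dt
            if p.age >= p.lifetime:
                self.kill(i)
                continue
            p.vy += gravity * dt
            p.x += p.vx * dt
            p.y += p.vy * dt

    def kill(self, i: int) -> None:
        """Swap the dead particle with the last live one and shrink the range."""
        last = self.alive - 1
        self.particles[i], self.particles[last] = self.particles[last], self.particles[i]
        self.alive = last
```

Because killing only swaps two references and decrements a counter, no object is created or released after construction, so the garbage collector has nothing to do during the frame.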

Why does the particle update loop iterate backward (`for i from n-1 to 0`) when removing dead particles?

Physics Simulation: Euler and Verlet Integration

Every particle is a tiny physics simulation. Each frame, the velocity is updated by forces (gravity, wind, turbulence) and the position is updated by velocity. The integration method determines accuracy, stability, and cost. Three methods are used in VFX: explicit Euler, semi-implicit Euler, and Verlet.
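The difference between the two Euler variants is easiest to see on a single spring. The sketch below is an assumption-laden toy, not production code: a unit mass on a 1D spring with stiffness `k = 100`, stepped at 60 Hz, with hypothetical function names. The only difference between the two steppers is whether the position update uses the old or the new velocity.

```python
def explicit_euler(x, v, k, dt):
    """Position uses the OLD velocity, velocity uses the OLD position."""
    a = -k * x
    return x + v * dt, v + a * dt


def semi_implicit_euler(x, v, k, dt):
    """Velocity is updated first; position then uses the NEW velocity."""
    v = v + (-k * x) * dt
    return x + v * dt, v


def energy(x, v, k):
    """Kinetic plus potential energy of a unit mass on the spring."""
    return 0.5 * v * v + 0.5 * k * x * x


def simulate(step, steps=600, k=100.0, dt=1.0 / 60.0):
    """Run 10 seconds at 60 Hz from x=1, v=0 and return the final energy."""
    x, v = 1.0, 0.0
    for _ in range(steps):
        x, v = step(x, v, k, dt)
    return energy(x, v, k)


if __name__ == "__main__":
    print("initial energy:     ", energy(1.0, 0.0, 100.0))
    print("explicit Euler:     ", simulate(explicit_euler))
    print("semi-implicit Euler:", simulate(semi_implicit_euler))
```

With these parameters the explicit variant gains energy every step and the oscillation grows without bound, while the semi-implicit variant keeps the energy close to its starting value of 50.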

For VFX turbulence, **curl noise** (Bridson 2007) generates divergence-free velocity fields that look like natural fluid swirling. Building a potential field from 3D Perlin noise and taking its curl gives a velocity field with no sources or sinks, so particles spiral without clumping at attractors. Smoke, dust clouds, and underwater caustics all use curl noise in Niagara.
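The divergence-free property can be checked numerically. The sketch below is a simplified 2D analogue, not the 3D construction: it takes the curl of a scalar potential, and it assumes a sum of sines as a smooth stand-in for Perlin noise (a real system would sample a noise function instead). All function names are hypothetical.

```python
import math


def potential(x, y):
    """Smooth scalar field standing in for noise (assumption, not Perlin)."""
    return (math.sin(1.3 * x + 0.7 * y)
            + 0.5 * math.sin(2.9 * x - 1.1 * y + 1.0)
            + 0.25 * math.cos(4.1 * y - 2.3 * x))


def curl_velocity(x, y, h=1e-3):
    """2D curl of the potential: (d(psi)/dy, -d(psi)/dx) by central differences."""
    dpsi_dx = (potential(x + h, y) - potential(x - h, y)) / (2.0 * h)
    dpsi_dy = (potential(x, y + h) - potential(x, y - h)) / (2.0 * h)
    return dpsi_dy, -dpsi_dx


def divergence(x, y, h=1e-3):
    """Numerical divergence of the curl field; expected to be close to zero."""
    du_dx = (curl_velocity(x + h, y, h)[0] - curl_velocity(x - h, y, h)[0]) / (2.0 * h)
    dv_dy = (curl_velocity(x, y + h, h)[1] - curl_velocity(x, y - h, h)[1]) / (2.0 * h)
    return du_dx + dv_dy


if __name__ == "__main__":
    for point in [(0.3, 0.7), (2.0, -1.5), (-4.2, 3.1)]:
        print(point, curl_velocity(*point), divergence(*point))
```

Each particle would add `curl_velocity(p.x, p.y)` to its velocity (or use it directly as its velocity) during the update step. The field moves particles around but never compresses or expands them, which is why the result reads as swirling rather than clumping.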

**Verlet integration** does not store velocity explicitly. Velocity is reconstructed as `(pos - prevPos) / dt` when needed. This makes constraint solving trivial: adjust `pos` to satisfy the constraint, and the velocity implicitly changes. Cloth simulation commonly pairs Verlet with constraint relaxation, the approach popularized by Jakobsen's "Advanced Character Physics" (2001), to simulate 10 000-vertex cloth at real-time rates.
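A small sketch of position Verlet with one distance constraint shows why no velocity bookkeeping is needed. The setup is an assumption chosen for illustration: a 2D particle tied to a fixed anchor by a rigid link of rest length 1 (a pendulum), with hypothetical function names.

```python
import math


def verlet_step(pos, prev, accel, dt):
    """Position Verlet: new = pos + (pos - prev) + accel * dt^2."""
    new = tuple(p + (p - q) + a * dt * dt for p, q, a in zip(pos, prev, accel))
    return new, pos  # the old position becomes the new 'prev'


def velocity(pos, prev, dt):
    """Velocity is never stored; it is reconstructed on demand."""
    return tuple((p - q) / dt for p, q in zip(pos, prev))


def satisfy_distance(a, b, rest, pinned_a=False):
    """Move the endpoints so that |b - a| equals rest."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return a, b
    corr = (dist - rest) / dist
    if pinned_a:
        # Only b moves, by the full correction.
        return a, (b[0] - dx * corr, b[1] - dy * corr)
    half = 0.5 * corr
    return (a[0] + dx * half, a[1] + dy * half), (b[0] - dx * half, b[1] - dy * half)


def simulate_pendulum(steps=600, dt=1.0 / 60.0, rest=1.0):
    anchor = (0.0, 0.0)
    pos = prev = (1.0, 0.0)          # starts horizontal and at rest
    gravity = (0.0, -9.81)
    for _ in range(steps):
        pos, prev = verlet_step(pos, prev, gravity, dt)
        # Correcting pos also changes (pos - prev), i.e. the implied velocity.
        _, pos = satisfy_distance(anchor, pos, rest, pinned_a=True)
    return pos, prev
```

Cloth extends the same idea: every edge of the mesh is a distance constraint, and the solver sweeps over all of them a few times per frame (relaxation) instead of solving them exactly.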

Why is semi-implicit Euler more stable than explicit Euler for spring-based particle constraints?

GPU Instancing and Niagara/VFX Graph Architecture

A wildfire scene in Horizon Forbidden West shows 200 000 simultaneous particle instances. Drawing each with a separate draw call - one call per billboard - would bottleneck on CPU-side command submission long before the GPU ran out of work. The solution: **GPU instancing** with a single draw call that renders all particles, reading per-instance data from a GPU buffer.
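The data flow of an instanced draw can be emulated on the CPU. The sketch below is not a graphics API call: `pack_instances`, `vertex_shader`, and `draw_instanced` are hypothetical stand-ins, and the 20-byte instance layout (position, size, RGBA8 color) is an assumption. It shows the part that matters: one shared quad, one packed buffer, and a vertex function that looks up its particle by instance index.

```python
import struct

# Assumed per-instance layout: position xyz (3 floats), size (1 float), RGBA (4 bytes).
INSTANCE = struct.Struct("<3f f 4B")

# The single quad shared by every particle, as triangle-strip corners.
QUAD_CORNERS = ((-0.5, -0.5), (0.5, -0.5), (-0.5, 0.5), (0.5, 0.5))


def pack_instances(particles):
    """Pack (position, size, color) tuples into one contiguous instance buffer."""
    buf = bytearray(INSTANCE.size * len(particles))
    for i, (pos, size, color) in enumerate(particles):
        INSTANCE.pack_into(buf, i * INSTANCE.size, *pos, size, *color)
    return buf


def vertex_shader(buf, instance_index, vertex_index, cam_right, cam_up):
    """Emulates the per-vertex work: fetch instance data, expand a billboard corner."""
    x, y, z, size, r, g, b, a = INSTANCE.unpack_from(buf, instance_index * INSTANCE.size)
    cx, cy = QUAD_CORNERS[vertex_index]
    world = tuple(c + size * (cx * rt + cy * up)
                  for c, rt, up in zip((x, y, z), cam_right, cam_up))
    return world, (r, g, b, a)


def draw_instanced(buf, instance_count, cam_right=(1.0, 0.0, 0.0), cam_up=(0.0, 1.0, 0.0)):
    """Stand-in for ONE instanced draw call: 4 vertices times instance_count."""
    return [vertex_shader(buf, i, v, cam_right, cam_up)
            for i in range(instance_count) for v in range(4)]
```

A usage example: `draw_instanced(pack_instances([((0, 0, 0), 2.0, (255, 128, 0, 255))]), 1)` returns the four camera-facing corners of one billboard. On real hardware the loop over instances runs on the GPU, so the CPU cost is one buffer upload and one call regardless of particle count.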

**Niagara** (Unreal Engine 4.20+) and Unity **VFX Graph** both move particle simulation entirely to the GPU compute pipeline. The CPU emits particles by writing initial data to a GPU buffer. GPU compute shaders then run the update loop (Euler integration, collision, curl noise) for all particles in parallel. The render pass reads the same buffer directly, so the CPU pays no per-particle cost for the update.

Both systems use a **node graph** to describe particle behavior: Emitter nodes (shape, rate), Particle Update nodes (forces, drag, collision), Render nodes (sprite, mesh, ribbon). Niagara adds **Data Interfaces** to sample external data: skeletal mesh surfaces for emission origin, depth buffer for screen-space collision, audio spectrum for music-reactive particles.

Common misconception: particle systems require complex physics simulation, and fire and smoke need accurate fluid dynamics.

In practice, real-time VFX deliberately avoids accurate fluid simulation. Curl noise, sprite sheets, and empirically tuned parameters produce convincing results at far lower cost than SPH or grid fluid solvers.

A Navier-Stokes fluid solver for a campfire on a 64x64x64 grid updated at 60 Hz has to process 64 x 64 x 64 x 60, or approximately 15.7 million, cell updates per second, and each cell update typically involves several solver passes. A curl-noise particle system produces visually comparable fire with 5000 particles at a fraction of the compute. Games choose perception over physical accuracy.

What is the main GPU performance advantage of a single instanced draw call over individual draw calls per particle?

Key ideas

  • Particle lifecycle: emit (randomized initial state), update (physics integration, color fade), render (billboard, mesh, or ribbon), kill (return to pool)
  • Object pool mandatory: pre-allocate max particles, recycle dead ones to avoid GC pressure at 60 Hz
  • Semi-implicit Euler: update velocity before position - symplectic, energy-stable, and the same cost as explicit Euler because only the order of the two updates changes
  • GPU instancing: one draw call for all particles - instance buffer holds position/color/size, vertex shader reads via instance_index
  • Niagara/VFX Graph: GPU compute runs full update loop in parallel - CPU submits emit events only

Related topics

Particle systems integrate with skeletal animation and the post-processing pipeline.

  • Skeletal Animation — Particle emitters attach to skeleton joints for physically grounded VFX placement
  • Post-Processing — Particle additive blending feeds the HDR accumulation buffer - bloom amplifies bright fire and spark pixels

Questions for reflection

  • How would a GPU particle system handle collision with the scene depth buffer without reading back to the CPU?
  • Design a ribbon particle system for a sword trail effect - what additional per-particle state is needed beyond position and color?
  • When does GPU particle simulation become slower than CPU - what is the crossover point in particle count and why?

Related lessons

  • cg-17 — Particles attach to skeletal joints - fire from a torch bone, dust from foot impacts
  • cg-16 — Particle rendering feeds into the HDR post-processing pipeline for bloom and glow
  • alg-01-big-o — GPU parallel complexity analysis applies to particle update shaders
  • la-03-cross-product