Computer Graphics
Graphics Engine Architecture
Unreal Engine 5 renders Lumen GI, Nanite geometry, and ray tracing at the same time on an RTX 4090 at 60 fps in 4K. Behind that lie years of engineering: RDG automates memory aliasing, parallel command buffers use every CPU core, and RHI abstracts D3D12, Vulkan, and Metal. Renderer architecture is what scales from indie to AAA.
- **id Software DOOM Eternal** holds 60 fps everywhere from Switch to RTX 3090 thanks to aggressive RHI abstraction and platform-specific paths for each GPU vendor
- **Filament (Google)** is an open-source physically-based renderer for Android, iOS, and Web. It powers Google Maps 3D, Google Poly, and most Google AR experiences
- **bgfx** is a cross-platform renderer by the author of Minecraft. It supports 12 APIs (from D3D9 to WebGPU) through a single API, used by dozens of indie engines
- **Apple Game Porting Toolkit** translates D3D12 calls into Metal at runtime, letting Windows games run on Mac without recompilation. A vivid example of what API abstraction enables
Renderer Architecture
**The renderer is the core of a graphics engine.** Its architecture is modular: RenderGraph (dependency graph between passes), RHI (Rendering Hardware Interface, an abstraction over GPU APIs), Material System (shaders and parameters), Lighting System (GI, shadows), and PostProcess (after rendering). Each layer is independent. The shadow algorithm can be swapped without touching the material system.
Render Dependency Graph (RDG, also called Frame Graph) is the modern approach that replaced manual render-target management. RDG: describe passes and their resource dependencies, and the system automatically places resources in memory with aliasing (different passes share memory when their lifetimes do not overlap), inserts barriers, and compiles the graph for a specific GPU. Unreal RDG, Vulkan Render Pass, and Metal Tile Shading are all built on this pattern.
Render Dependency Graph performs automatic memory aliasing between passes. What does that mean?
Multithreaded Rendering
**Rendering on a modern CPU** spans several threads: Main Thread (game logic, physics), Render Thread (preparing rendering commands, culling), RHI Thread (translating high-level commands into API calls), and GPU Driver Thread. Three frames are in flight at once: the Render Thread prepares frame N+1 while the GPU renders frame N.
Parallel Command Buffer Recording: Vulkan and Metal allow recording commands in parallel on different CPU threads. The game splits visible objects across threads, each thread records a Secondary Command Buffer, and the primary buffer executes all of them at the end. On an 8-core CPU: 8x parallel command building. id Tech 7 (DOOM Eternal) leans on parallel recording for minimal CPU frame time.
Triple Buffering (3 frames in flight) increases input latency. When is single-threaded rendering without buffering the right call?
Graphics Asset Pipeline
**The asset pipeline** transforms source formats (PBR textures in PNG or EXR, meshes in FBX, shaders in HLSL) into runtime-ready formats. Key tasks: texture compression (BC7/ASTC), mip generation, mesh optimization (vertex cache optimization, LOD generation), and shader compilation (HLSL to DXIL/SPIRV/MSL). The whole flow is automated by a content build system.
Shader permutation explosion is a real headache: a single PBR material shader carries hundreds of #ifdef flags (HAS_NORMAL_MAP, TRANSPARENT, SKINNED, INSTANCED, SHADOW_CAST, etc). Ten boolean flags equal 1024 permutations, each compiled separately. The PSO (Pipeline State Object) cache in D3D12 and Vulkan: compile once, load at startup. Unreal uses ShaderCompileWorker for parallel compilation at startup.
Texture compression formats BC7 (PC) and ASTC (mobile) are used instead of PNG or JPG. Why cannot PNG be used directly on the GPU?
GPU API Abstraction (RHI)
**Rendering Hardware Interface (RHI)** is an abstraction layer over GPU APIs. Unreal Engine supports D3D11, D3D12, Vulkan, Metal, and OpenGL through a single RHI API. The developer writes rendering once via RHI and the engine translates it to the right API. Alternatives: bgfx (cross-platform renderer) and WGPU (WebGPU, Rust, web plus native).
WebGPU is the next generation of web graphics APIs, replacing WebGL. Exposed via wgpu (Rust) and Dawn (Google C++). Design: compute shaders, explicit resource management (like Vulkan or D3D12), but memory-safe. Firefox, Chrome, and Safari are rolling out native support. Unreal Engine 5.4 added a WebGPU experimental backend. Babylon.js already supports WebGPU.
Vulkan is always faster than OpenGL, so always switch to Vulkan
Vulkan is faster when used properly: explicit resource management, parallel command recording, minimal driver overhead. Used incorrectly, it can be slower than OpenGL because of management overhead
Vulkan provides control, not automatic speedups. The OpenGL driver handles many tasks for you (resource tracking, barrier insertion, PSO creation) at the cost of CPU overhead. Vulkan requires doing these manually. Done well, that is faster; done poorly, slower
Unreal Engine RHI translates calls into D3D12 on Windows and Metal on macOS. What limitation does this abstraction carry?
Key ideas
- **RDG/Frame Graph:** automatic memory aliasing, barrier insertion, async compute scheduling; describe dependencies and the system optimizes
- **Multithreading:** triple buffering (3 frames in flight) for throughput; parallel command recording; VR needs a single-threaded path for latency
- **Asset Pipeline:** texture compression (BC7/ASTC) for hardware decode; mesh optimization; PSO cache for shader permutations
- **RHI:** unified API over D3D12, Vulkan, and Metal; lowest common denominator; platform-specific features via extensions
Related topics
Graphics Engine Architecture brings the CG stack together:
- Particle Systems and VFX — Particle rendering is a dedicated pass in the frame graph; GPU particles use async compute to overlap with the main rendering
- Animation: skeletal and morph — GPU skinning is part of the rendering pipeline before rasterization; skinned meshes are evaluated in a pre-pass compute shader
- Graphics in interviews (FAANG) — Architectural questions on renderer design are standard on senior graphics engineer interviews at NVIDIA, AMD, Apple, and Epic
Вопросы для размышления
- Render Dependency Graph automatically inserts GPU barriers (pipeline barriers in Vulkan, resource transitions in D3D12). How does RDG know which barriers are needed, and why does manual barrier management so often lead to suboptimal results (overly broad barriers) or race conditions (missed barriers)?
- Apple Metal offers Tile Shading: compute happens directly on tile memory (128 KB on-chip) between rasterization passes without a roundtrip to main VRAM. How does this change deferred shading architecture for mobile GPUs with tile-based rendering?
- Shader permutation systems: a single Material in Unreal can generate 1000+ HLSL permutations (skinned/static, shadow/no-shadow, transparent/opaque, etc). How do modern engines balance startup compilation time, runtime compilation latency, and shader cache storage size?