Compilers

Cranelift and alternative backends

Serverless edge platforms aim to start Wasm modules in about 5 ms. If they compiled with LLVM at -O2, compilation alone could take seconds, and that model would not work. Cranelift's answer: compile roughly 10x faster at about 90% of the generated-code performance. QBE, in turn, shows that a production-quality backend can fit in about 15,000 lines of C that one person can read end to end. Different tasks call for different backends.

**Wasmtime + Cranelift**: Fastly Compute (formerly Compute@Edge) and ByteDance use Cranelift to compile Wasm, with cold starts under 10 ms, compared with seconds for Docker containers.
**rustc_codegen_cranelift** (experimental): Rust debug builds can compile up to 2-3x faster than with the LLVM backend, a meaningful improvement to the dev loop for large Rust projects.
**Python 3.13 copy-and-patch JIT** (experimental, off by default): about 5% speedup on real benchmarks with minimal compile-time overhead. It is the first JIT in CPython after about 30 years of pure interpretation.

Cranelift IR and architecture

Cranelift is a compiler backend written in Rust, developed under the Bytecode Alliance primarily for Wasmtime. Its pipeline has four main parts: Cranelift IR (CLIF, an SSA form that uses block parameters instead of phi nodes, with a mid-end optimizer that borrows ideas from sea-of-nodes), ISLE (a DSL for instruction-selection rules), the regalloc2 register allocator, and machine code emitters for x86-64, ARM64, RISC-V, and s390x. The design goals are safety and fast compilation, not maximum throughput of the generated code.
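The instruction-selection step can be illustrated with a toy pattern-matching lowering. This is a sketch only: it is not ISLE syntax and not Cranelift's API. The opcode names `iadd` and `iconst` mirror CLIF opcodes, but the tuple encoding, the rule order, and the pseudo-assembly output are assumptions made for illustration.

```python
from itertools import count

# Toy IR: nested tuples such as ("iadd", ("arg", 0), ("iconst", 4)).
# The encoding and the pseudo-assembly are hypothetical, not Cranelift's.

def lower(node, out, fresh):
    """Lower an IR tree to pseudo-instructions; return the result register."""
    op = node[0]
    if op == "arg":
        return f"a{node[1]}"
    if op == "iconst":
        dst = f"v{next(fresh)}"
        out.append(f"mov {dst}, {node[1]}")
        return dst
    if op == "iadd":
        lhs, rhs = node[1], node[2]
        # More specific rule first: a constant right operand is folded
        # into the instruction as an immediate.
        if rhs[0] == "iconst":
            src = lower(lhs, out, fresh)
            dst = f"v{next(fresh)}"
            out.append(f"add {dst}, {src}, #{rhs[1]}")
            return dst
        # General rule: both operands live in registers.
        a = lower(lhs, out, fresh)
        b = lower(rhs, out, fresh)
        dst = f"v{next(fresh)}"
        out.append(f"add {dst}, {a}, {b}")
        return dst
    raise ValueError(f"no lowering rule for {op}")

def select(tree):
    """Run the lowering rules over one expression tree."""
    out = []
    lower(tree, out, count())
    return out

code = select(("iadd", ("iadd", ("arg", 0), ("arg", 1)), ("iconst", 4)))
print("\n".join(code))
```

The point of the sketch is the rule ordering: a specific pattern (add with an immediate) is tried before the general one, which is the kind of prioritized rewrite rule an instruction-selection DSL lets backend authors write declaratively.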

Cranelift is used by Wasmtime (the Bytecode Alliance's main Wasm runtime), Wasmer (through the `cranelift` feature), and rustc_codegen_cranelift (an experimental rustc backend). Because it is written in Rust and some of its ISLE lowering rules have been formally verified with an SMT solver, it is often considered a safer backend than LLVM for security-critical workloads.

What is the main goal of Cranelift compared to LLVM?

Fast compilation: trade-offs

Cranelift is tuned for JIT scenarios where compilation time matters. At `opt_level=speed_and_size` it produces code with about 90% of LLVM's performance at about 10% of LLVM's compile time. In throughput terms, that is on the order of 100 MB/s of Wasm bytecode for a Cranelift-based pipeline versus around 10 MB/s for an LLVM-based one. For a 10 MB module, that is the difference between a 100 ms and a 1 s cold start for serverless.
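The 100 ms versus 1 s figures follow directly from the two throughputs if the module is about 10 MB. A minimal sketch of that arithmetic, where the module size is an assumption chosen to match the quoted numbers:

```python
def compile_seconds(module_mb, throughput_mb_per_s):
    """Time to compile a module of the given size at a given throughput."""
    return module_mb / throughput_mb_per_s

MODULE_MB = 10  # assumed module size, not a measured value

fast = compile_seconds(MODULE_MB, 100)  # ~100 MB/s figure from the text
slow = compile_seconds(MODULE_MB, 10)   # ~10 MB/s figure from the text
print(f"fast backend: {fast * 1000:.0f} ms, slow backend: {slow * 1000:.0f} ms")
```

Smaller modules shrink both numbers proportionally, but the 10x ratio between the backends stays the same.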

Python 3.13 added an experimental copy-and-patch JIT. LLVM is used at build time to pre-compile a machine-code template for each bytecode operation; at runtime the JIT copies the templates and patches their operands. That delivers about 5% speedup at minimal compile time. A more aggressive JIT is under consideration for Python 3.14 and later. V8's Sparkplug for JavaScript follows a similar philosophy: a baseline JIT without optimizations that simply maps bytecode to native code.
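The copy-and-patch idea can be shown with a single hand-written template. This is a simplified sketch, not CPython's implementation: the real templates are generated by a build-time toolchain and contain many holes, while here the template, the `HOLE_OFFSET` constant, and the `copy_and_patch` helper are hypothetical names. The code only builds the bytes and does not execute them.

```python
import struct

# Pre-compiled x86-64 template for `mov eax, imm32; ret`.
# Byte 0 is the opcode (0xB8), bytes 1-4 are a hole for the 32-bit operand,
# byte 5 is `ret` (0xC3).
TEMPLATE = bytes([0xB8, 0x00, 0x00, 0x00, 0x00, 0xC3])
HOLE_OFFSET = 1  # position of the operand hole inside the template

def copy_and_patch(value):
    """Copy the template, then patch the operand hole with `value`."""
    code = bytearray(TEMPLATE)                          # copy
    struct.pack_into("<i", code, HOLE_OFFSET, value)    # patch (little-endian)
    return bytes(code)

print(copy_and_patch(42).hex())  # b82a000000c3
```

Runtime work is limited to a memory copy and a few stores, which is why the compile-time overhead of this style of JIT stays small.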

Why is Cranelift the choice for Wasmtime instead of LLVM?

QBE: a minimalist backend

QBE (Quick Backend) is a minimalist compiler backend in around 15,000 lines of C. It targets x86-64, ARM64, and RISC-V. That is enough to serve real language implementations: cproc (C compiler), cQube (Hare language backend), cc (simple C compiler). QBE IR is simple and readable. The project's goal is to show that a backend does not have to be a beast of 7+ million lines.
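To give a feel for how small a QBE frontend can be, here is a sketch of a Python function that emits QBE's textual IR for a function summing its arguments. The helper name `emit_sum` and the temporary naming scheme are hypothetical; the emitted text follows QBE's documented textual syntax as best understood here (`w` for 32-bit words, `$` for globals, `%` for temporaries, `@` for block labels).

```python
def emit_sum(name, params):
    """Emit QBE IR text for a function returning the sum of its word params."""
    args = ", ".join(f"w %{p}" for p in params)
    lines = [f"export function w ${name}({args}) {{", "@start"]
    acc = f"%{params[0]}"
    for i, p in enumerate(params[1:]):
        tmp = f"%t{i}"                      # fresh temporary per addition
        lines.append(f"    {tmp} =w add {acc}, %{p}")
        acc = tmp
    lines.append(f"    ret {acc}")
    lines.append("}")
    return "\n".join(lines)

il = emit_sum("add3", ["a", "b", "c"])
print(il)
```

A frontend in this style only has to print text; instruction selection, register allocation, and assembly emission are left to the backend.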

QBE was built by Quentin Carbonneaux as an alternative for language authors who do not need all of LLVM. The Hare language (a systems language by Drew DeVault, also known as sircmpwn) uses QBE as its primary backend. QBE implements basic optimizations: SSA-based dead code elimination, copy elimination, and basic register allocation. For research compilers, QBE strikes a good balance between complexity and capability.

Which kind of project is QBE the best backend choice for?

GCC backend and libgccjit

The GCC backend is built around RTL (Register Transfer Language), a low-level, Lisp-like intermediate representation close to machine instructions. It is a mature alternative to LLVM with more than 35 years of optimization work behind it. GCC generates better code for some patterns, such as Fortran HPC code and loop nests restructured by Graphite (polyhedral transformations). libgccjit is the public API for embedding GCC as a JIT backend.

Note that Python 3.13's copy-and-patch JIT uses LLVM only at build time, to generate templates, and not for JIT compilation at runtime. GCC Graphite (polyhedral loop transformations) sometimes optimizes Fortran/HPC loops better than LLVM Polly. GFortran (the Fortran compiler in GCC) still produces better code for some BLAS-like operations than Flang (LLVM Fortran).

Misconception: LLVM always generates better code than GCC.

Reality: GCC and LLVM compete. GCC leads for Fortran HPC and some auto-vectorization patterns; LLVM leads for C++/Rust and has a stronger toolchain ecosystem.

GCC Graphite (polyhedral loop transformations) reportedly gives 5-15% better throughput in several HPC benchmarks for numeric code. Different compilers use different algorithms and heuristics, so the comparison depends on the specific code.

What is GCC RTL (Register Transfer Language)?

Key ideas

Cranelift: a Rust backend for Wasmtime. 10x faster compilation than LLVM -O2 at 90% of the performance. Ideal for JIT and serverless
QBE: a 15k-line C backend supporting x86-64/ARM64/RISC-V. For research compilers where readability matters more than peak optimization
Backend choice is a trade-off: LLVM (peak optimization), Cranelift (fast JIT compilation), QBE (simplicity), GCC (maturity + Fortran)

Questions for reflection

Cranelift compiles 10x faster than LLVM at 90% of the performance. For which production workloads is that 10% performance critical enough to justify slower compilation?
QBE shows a backend can be written in 15k lines. Why then do most new languages (Rust, Zig, Swift) choose LLVM despite its complexity?
Python 3.13's copy-and-patch JIT delivers about 5% speedup. Why not 50%? What Python characteristics make aggressive JIT compilation hard compared with JavaScript?

Related lessons

arch-04-cpu