Register Allocation
The compiler uses a linear-scan register allocator to assign frequently-used SSA values to physical callee-saved registers (r9-r12), reducing memory traffic.
Overview
Every LLVM SSA value gets a dedicated 8-byte stack slot (the baseline). The register allocator improves on this by keeping hot values in registers across block boundaries and loop iterations.
Eligibility
- Only functions with loop back-edges are considered (loop-free functions skip allocation)
- Values must have ≥3 uses (
MIN_USES_FOR_ALLOCATION) - Live intervals are computed from use-def analysis with loop extension
Available Registers
Callee-saved registers r9-r12, minus those used for incoming parameters:
- A function with 2 parameters uses r9-r10 → r11-r12 are available for allocation
- In non-leaf functions, registers needed for outgoing call arguments are also reserved
Allocation Strategy
- Build candidate intervals from use-def live-interval analysis
- Filter by minimum-use threshold
- Run linear scan: assign to available callee-saved registers, evict lower-priority intervals when needed
- Naturally expired intervals remain in the mapping (earlier uses still benefit)
- Evicted intervals are removed entirely (whole-interval mapping invalid after eviction)
Runtime Integration
load_operandchecks regalloc before stack: usesMoveRegfrom allocated reg instead ofLoadIndU64store_to_slotuses write-through: copies to allocated reg AND stores to stack- Dead store elimination removes the stack store if never loaded
- After calls in non-leaf functions, allocated register mappings are invalidated and lazily reloaded
Cross-Block Propagation
- Leaf functions:
alloc_reg_slotis preserved across all block boundaries (allocated registers are never clobbered by calls) - Non-leaf functions: Predecessor exit snapshots are intersected at multi-predecessor blocks — only entries where ALL predecessors agree are kept
- Back-edges (unprocessed predecessors) are treated conservatively
Debugging
Enable allocator logs with RUST_LOG=wasm_pvm::regalloc=debug:
regalloc::run()prints candidate/assignment statslower_function()prints per-function usage counters (alloc_load_hits,alloc_store_hits, etc.)
Quick triage:
allocatable_regs=0→ no allocation will happen- Non-zero
allocated_valueswith near-zero load/store hits → move/reload overhead dominates
For the full development journey, see Regalloc Cross-Block Journey.