Translation Module

The translation module orchestrates the end-to-end WASM → LLVM IR → PVM lowering and assembles the final SPI/JAM output.

Source: crates/wasm-pvm/src/translate/

Files

| File | Role |
|------|------|
| mod.rs | Pipeline dispatch, SPI assembly, entry header + data sections |
| wasm_module.rs | WASM section parsing into WasmModule |
| memory_layout.rs | Memory address constants and helper functions |

Pipeline

  1. Parse module sections in wasm_module.rs (WasmModule::parse()).
  2. Translate WASM operators to LLVM IR in llvm_frontend/function_builder.rs.
  3. Run LLVM optimization pipeline (mem2reg, instcombine, simplifycfg, optional inlining, cleanup passes).
  4. Lower LLVM IR to PVM instructions in llvm_backend/mod.rs.
  5. Build SPI sections in mod.rs:
    • Entry header and dispatch tables
    • ro_data (jump table refs + passive data)
    • rw_data (globals + active data segments), with trailing zero trim
    • Encoded PVM blob + metadata
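
The rw_data step in item 5 can be pictured with a minimal sketch. The helper name `build_rw_image`, its signature, and the `(offset, bytes)` segment representation are assumptions made for illustration; the crate's own `build_rw_data()` is not shown in this document and may differ.

```rust
/// Hypothetical stand-in for the rw_data assembly step (not the crate's API).
/// Copies each (offset, bytes) segment into a zero-filled image of `size`
/// bytes, then drops trailing zero bytes.
/// Returns None instead of panicking when a segment does not fit.
fn build_rw_image(size: usize, segments: &[(usize, &[u8])]) -> Option<Vec<u8>> {
    let mut image = vec![0u8; size];
    for &(offset, bytes) in segments {
        let end = offset.checked_add(bytes.len())?;
        image.get_mut(offset..end)?.copy_from_slice(bytes);
    }
    // Keep everything up to and including the last non-zero byte.
    let kept = image.iter().rposition(|&b| b != 0).map_or(0, |i| i + 1);
    image.truncate(kept);
    Some(image)
}
```

With a 16-byte image holding `[1, 2, 0, 0]` at offset 0 and `[0, 7, 0]` at offset 8, the sketch keeps 10 bytes: the last non-zero byte sits at index 9, and the zero bytes after it are dropped.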

Key Behaviors

  • calculate_heap_pages() uses WASM initial_pages (not max), with a minimum of 16 WASM pages for (memory 0).
  • compute_wasm_memory_base() lays out, in order: the optional mem-size slot at GLOBAL_MEMORY_BASE, user globals, passive segment lengths, and the optional 256-byte parameter overflow area. It then places wasm_memory_base immediately after the last of these. No 4KB alignment is applied: anan-as page-aligns the rw_data tail (heapZerosStart) separately, so the base may sit at any byte offset. The mem-size slot is emitted only when the module uses memory.size/memory.grow/memory.init. The overflow area (tracked by needs_param_overflow) is emitted only when some module type signature has more than MAX_LOCAL_REGS (4) parameters, which covers both local function declarations and call_indirect target types.
  • build_rw_data() copies globals and active segments into a contiguous image, then trims trailing zero bytes before SPI encoding.
  • Call return addresses are pre-assigned as jump-table refs ((idx + 1) * 2) at emission time; fixup resolution accepts direct (LoadImmJump) and indirect (LoadImm / LoadImmJumpInd) return-address carriers.
  • Entry resolution prefers canonical export names (main, main2) over aliases (refine*, accumulate*) regardless of export order.
  • Entry exports (main/main2 and aliases) must target local (non-imported) functions; imported targets are rejected during parse with Error::Internal to avoid index-underflow panics.
  • The WASM name custom section (subsection 1, function names) is parsed into local_function_names: Vec<Option<String>>. WasmModule::local_function_display_name(local_idx) returns the name-section entry, falling back to the export name, then to wasm_func_<global_idx>. The function-body translator uses it to wrap operator-dispatch errors in Error::Located { func_idx, func_name, op_offset, source }, the diagnostic surface for unsupported features. Errors emitted later in the pipeline (LLVM-to-PVM lowering, adapter merge) do not get this wrapping because they fire after the WASM byte offset has been lost.
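
Three of the behaviors above are small enough to sketch directly. The free functions below are hypothetical stand-ins, not the crate's API. In particular, `resolve_entry` assumes that the `*` in the alias names means a prefix match, and the caller chooses which alias prefix belongs to which canonical name.

```rust
/// Jump-table reference pre-assigned to the call site with index `idx`,
/// following the (idx + 1) * 2 formula quoted above.
fn return_address_ref(idx: u32) -> u32 {
    (idx + 1) * 2
}

/// Hypothetical entry lookup over (export name, function index) pairs.
/// An exact match on the canonical name wins over any alias,
/// regardless of where each export appears in the list.
fn resolve_entry(exports: &[(&str, u32)], canonical: &str, alias_prefix: &str) -> Option<u32> {
    exports
        .iter()
        .find(|(name, _)| *name == canonical)
        .or_else(|| exports.iter().find(|(name, _)| name.starts_with(alias_prefix)))
        .map(|&(_, idx)| idx)
}

/// Hypothetical display-name fallback chain:
/// name-section entry, then export name, then wasm_func_<global_idx>.
fn display_name(name_section: Option<&str>, export_name: Option<&str>, global_idx: u32) -> String {
    name_section
        .or(export_name)
        .map(str::to_owned)
        .unwrap_or_else(|| format!("wasm_func_{}", global_idx))
}
```

For example, with exports listed as `[("refine_alias", 3), ("main", 7)]`, looking up canonical `main` with alias prefix `refine` yields function 7 even though the alias appears first.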

Current Memory Layout

| Address | Purpose |
|---------|---------|
| 0x10000 | Read-only data |
| 0x30000 | Mem-size slot (4 bytes, only when memory.size/grow/init used), then user globals (per-global width: 4 B for i32/f32, 8 B for i64/f64; see docs/src/learnings.md “Global Storage Width”; addresses precomputed at parse time as WasmModule::global_offsets), passive segment length slots (4 bytes each), and (when any type signature has >4 params) a 256-byte parameter overflow area. Total size = align_up_8(globals_region_size(...)) + 256 when overflow is reserved (the overflow base is 8-byte aligned; see compute_param_overflow_base), else just globals_region_size(...). |
| region_end | WASM linear memory, placed without 4KB alignment immediately after the last region. For a module that only declares memory and never uses memory.size/grow/init, wasm_memory_base collapses to 0x30000. A memory-op-using program with zero user globals, no passive segments, and no overflow lands at 0x30004. A program that also needs overflow (e.g. a 5+ param call_indirect target) lands at 0x30108. |
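
The three example addresses in the table follow from the size formula, as this minimal sketch shows. The function signatures here are assumptions for illustration, and the sketch packs globals back to back without padding, which the real `WasmModule::global_offsets` may not do. Only the zero-global cases are taken from the table.

```rust
const GLOBAL_MEMORY_BASE: u32 = 0x30000;
const PARAM_OVERFLOW_SIZE: u32 = 256;

/// Rounds `n` up to the next multiple of 8.
fn align_up_8(n: u32) -> u32 {
    (n + 7) & !7
}

/// Hypothetical stand-in: size of the mem-size slot, globals, and
/// passive segment length slots. Assumes globals are packed with no padding.
fn globals_region_size(uses_mem_ops: bool, global_widths: &[u32], passive_segments: u32) -> u32 {
    let mem_size_slot = if uses_mem_ops { 4 } else { 0 };
    mem_size_slot + global_widths.iter().sum::<u32>() + 4 * passive_segments
}

/// Hypothetical stand-in for the base computation: the overflow area, when
/// reserved, starts at an 8-byte-aligned offset and adds 256 bytes.
fn wasm_memory_base_sketch(
    uses_mem_ops: bool,
    global_widths: &[u32],
    passive_segments: u32,
    needs_param_overflow: bool,
) -> u32 {
    let region = globals_region_size(uses_mem_ops, global_widths, passive_segments);
    let total = if needs_param_overflow {
        align_up_8(region) + PARAM_OVERFLOW_SIZE
    } else {
        region
    };
    GLOBAL_MEMORY_BASE + total
}
```

In the overflow case with no globals, the 4-byte mem-size slot is rounded up to 8 and the 256-byte area is added, giving 8 + 256 = 264 = 0x108 bytes, hence 0x30108.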

Anti-Patterns

  1. Don’t change layout constants without validating pvm-in-pvm tests.
  2. Don’t bypass Result error handling with panics in library code.
  3. Don’t assume rw_data must include trailing zero bytes.
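
As an illustration of item 2, the sketch below converts a global function index to a local one without risking an underflow panic, in the spirit of the entry-export check described under Key Behaviors. `SketchError` and `local_index` are hypothetical names; the payload of the crate's own Error::Internal is not shown in this document.

```rust
/// Stand-in error type for this sketch only (not the crate's Error).
#[derive(Debug, PartialEq)]
enum SketchError {
    Internal(String),
}

/// Converts a global function index to a local (non-imported) index.
/// Returns an error instead of panicking when the index names an import.
fn local_index(global_idx: u32, num_imported: u32) -> Result<u32, SketchError> {
    global_idx.checked_sub(num_imported).ok_or_else(|| {
        SketchError::Internal(format!(
            "function {} is imported ({} imports), expected a local function",
            global_idx, num_imported
        ))
    })
}
```

Using `checked_sub` keeps the failure on the `Result` path, so a caller can report it rather than abort.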