Translation Module
The translation module orchestrates the end-to-end WASM → LLVM IR → PVM lowering and assembles the final SPI/JAM output.
Source: crates/wasm-pvm/src/translate/
Files
| File | Role |
|---|---|
| `mod.rs` | Pipeline dispatch, SPI assembly, entry header + data sections |
| `wasm_module.rs` | WASM section parsing into `WasmModule` |
| `memory_layout.rs` | Memory address constants and helper functions |
Pipeline
- Parse module sections in `wasm_module.rs` (`WasmModule::parse()`).
- Translate WASM operators to LLVM IR in `llvm_frontend/function_builder.rs`.
- Run the LLVM optimization pipeline (`mem2reg`, `instcombine`, `simplifycfg`, optional inlining, cleanup passes).
- Lower LLVM IR to PVM instructions in `llvm_backend/mod.rs`.
- Build SPI sections in `mod.rs`:
  - Entry header and dispatch tables
  - `ro_data` (jump table refs + passive data)
  - `rw_data` (globals + active data segments), with trailing zero trim
  - Encoded PVM blob + metadata
Key Behaviors
- `calculate_heap_pages()` uses WASM `initial_pages` (not max), with a minimum of 16 WASM pages for `(memory 0)`.
- `compute_wasm_memory_base()` lays out, in order: the optional mem-size slot at `GLOBAL_MEMORY_BASE`, user globals, passive segment lengths, and the optional 256-byte parameter overflow area. It then places `wasm_memory_base` immediately after. No 4KB alignment is applied: anan-as page-aligns the rw_data tail (`heapZerosStart`) separately, so the base may sit at any byte offset. The mem-size slot is emitted only when the module uses `memory.size`/`memory.grow`/`memory.init`. The overflow area (tracked by `needs_param_overflow`) is emitted only when any module type signature has more than `MAX_LOCAL_REGS` (4) parameters; this covers both local function declarations and `call_indirect` target types.
- `build_rw_data()` copies globals and active segments into a contiguous image, then trims trailing zero bytes before SPI encoding.
- Call return addresses are pre-assigned as jump-table refs `((idx + 1) * 2)` at emission time; fixup resolution accepts direct (`LoadImmJump`) and indirect (`LoadImm`/`LoadImmJumpInd`) return-address carriers.
- Entry resolution prefers canonical export names (`main`, `main2`) over aliases (`refine*`, `accumulate*`) regardless of export order.
- Entry exports (`main`/`main2` and aliases) must target local (non-imported) functions; imported targets are rejected during parse with `Error::Internal` to avoid index-underflow panics.
- The WASM `name` custom section (subsection 1, function names) is parsed into `local_function_names: Vec<Option<String>>`. `WasmModule::local_function_display_name(local_idx)` returns the name-section entry, falling back to the export name, then `wasm_func_<global_idx>`. The function-body translator uses it to wrap operator-dispatch errors in `Error::Located { func_idx, func_name, op_offset, source }`, the diagnostic surface for unsupported features. Errors emitted later in the pipeline (LLVM-to-PVM lowering, adapter merge) do not get this wrapping; they fire after the WASM byte offset has been lost.
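
Three of the behaviors above (trailing-zero trimming, return-address ref assignment, and canonical-over-alias entry resolution) can be illustrated with a minimal, standalone sketch. The function names `trim_trailing_zeros`, `return_address_ref`, and `resolve_entry` are hypothetical and do not come from the crate, and the pairing of a canonical name with an alias prefix is an assumption made for illustration only.

```rust
/// Hypothetical helper: drop trailing zero bytes from an rw_data image,
/// mirroring the trim described for `build_rw_data()`.
fn trim_trailing_zeros(mut image: Vec<u8>) -> Vec<u8> {
    // Index just past the last non-zero byte, or 0 if the image is all zeros.
    let keep = image.iter().rposition(|&b| b != 0).map_or(0, |i| i + 1);
    image.truncate(keep);
    image
}

/// Hypothetical helper: jump-table ref pre-assigned to the return address
/// with index `idx`, using the `(idx + 1) * 2` formula from the text.
fn return_address_ref(idx: u32) -> u32 {
    (idx + 1) * 2
}

/// Hypothetical helper: pick the function index for one entry point.
/// The canonical name wins over any alias, regardless of export order.
/// `exports` holds (export name, function index) pairs in module order.
fn resolve_entry(exports: &[(&str, u32)], canonical: &str, alias_prefix: &str) -> Option<u32> {
    exports
        .iter()
        .find(|(name, _)| *name == canonical)
        .or_else(|| exports.iter().find(|(name, _)| name.starts_with(alias_prefix)))
        .map(|&(_, idx)| idx)
}
```

With these definitions, `trim_trailing_zeros(vec![1, 0, 2, 0, 0])` keeps only `[1, 0, 2]` (interior zeros survive), and `resolve_entry(&[("refine_x", 7), ("main", 3)], "main", "refine")` selects index 3 even though the alias is exported first.
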
Current Memory Layout
| Address | Purpose |
|---|---|
| `0x10000` | Read-only data |
| `0x30000` | Mem-size slot (4 bytes, only when `memory.size`/`grow`/`init` used), then user globals (per-global width: 4 B for i32/f32, 8 B for i64/f64; see `docs/src/learnings.md` “Global Storage Width”; addresses precomputed at parse time as `WasmModule::global_offsets`), passive segment length slots (4 bytes each), and (when any type signature has >4 params) a 256-byte parameter overflow area. Total size = `align_up_8(globals_region_size(...)) + 256` when overflow is reserved (the overflow base is 8-byte aligned; see `compute_param_overflow_base`), else just `globals_region_size(...)`. |
| `region_end` | WASM linear memory, placed without 4KB alignment immediately after the last region. For a module that only declares memory and never uses `memory.size`/`grow`/`init`, `wasm_memory_base` collapses to `0x30000`. A memory-op-using program with zero user globals, no passive segments, and no overflow lands at `0x30004`. A program that also needs overflow (e.g. a 5+ param `call_indirect` target) lands at `0x30108`. |
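
The three example addresses in the table follow from the size formula. The sketch below reproduces that arithmetic under stated assumptions: `RegionInputs` is a hypothetical struct, the function signatures are invented for illustration, and globals are packed back to back with no padding (the real `global_offsets` computation may differ). Only the constants and the formula are taken from the table.

```rust
/// Start of the globals region, per the layout table.
const GLOBAL_MEMORY_BASE: u32 = 0x30000;
/// Size in bytes of the parameter overflow area.
const PARAM_OVERFLOW_SIZE: u32 = 256;

/// Hypothetical summary of what a module places in the globals region.
struct RegionInputs {
    /// Module uses memory.size / memory.grow / memory.init.
    uses_memory_ops: bool,
    /// Storage width of each user global: 4 for i32/f32, 8 for i64/f64.
    global_widths: Vec<u32>,
    /// Number of passive data segments (one 4-byte length slot each).
    passive_segments: u32,
    /// Some type signature has more than 4 parameters.
    needs_param_overflow: bool,
}

/// Round `n` up to the next multiple of 8.
fn align_up_8(n: u32) -> u32 {
    (n + 7) & !7
}

/// Bytes used by the mem-size slot, user globals, and passive length slots.
/// Assumption: globals are packed without padding.
fn globals_region_size(m: &RegionInputs) -> u32 {
    let mem_size_slot = if m.uses_memory_ops { 4 } else { 0 };
    let globals: u32 = m.global_widths.iter().sum();
    mem_size_slot + globals + 4 * m.passive_segments
}

/// Address where WASM linear memory begins: directly after the region,
/// with no 4KB alignment applied.
fn wasm_memory_base(m: &RegionInputs) -> u32 {
    let size = globals_region_size(m);
    let total = if m.needs_param_overflow {
        align_up_8(size) + PARAM_OVERFLOW_SIZE
    } else {
        size
    };
    GLOBAL_MEMORY_BASE + total
}
```

In the overflow case with only the mem-size slot present, the region size of 4 rounds up to 8, and 8 + 256 = 264 = `0x108`, which gives the `0x30108` base quoted in the table.
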
Anti-Patterns
- Don’t change layout constants without validating pvm-in-pvm tests.
- Don’t bypass `Result` error handling with panics in library code.
- Don’t assume `rw_data` must include trailing zero bytes.