# How do compiler functions work?

WebAssembly runtimes let you call functions defined in wasm. How this works in
wazero is different depending on your `RuntimeConfig`.

- `RuntimeConfigCompiler` compiles machine code from your wasm, and jumps to
  that when invoking a function.
- `RuntimeConfigInterpreter` does not generate code. It interprets wasm and
  executes Go statements that correspond to WebAssembly instructions.

How the compiler works precisely is a large topic, and discussed at length on
this page. For more general information on architecture, etc., please refer to
[Docs](..).

## Engines

Our [Docs](..) introduce the "engine" concept of wazero. More precisely, there
are three types of engines, `Engine`, `ModuleEngine` and `callEngine`. Each has
a different scope and role:

- `Engine` has the same lifetime as `Runtime`. This compiles a `CompiledModule`
  into machine code, which is both cached and memory-mapped as an executable.
- `ModuleEngine` is a virtual machine with the same lifetime as its [Module][api-module].
  Notably, this binds each [function instance][spec-function-instance] to
  corresponding machine code owned by its `Engine`.
- `callEngine` is the implementation of [api.Function][api-function] in a
  [Module][api-module]. This implements `Function.Call(...)` by invoking
  machine code corresponding to a function instance in `ModuleEngine` and
  managing the [call stack][call-stack] representing the invocation.

Here is a diagram showing the relationships of these engines:

```goat
                .-----------> Instantiated module                                Exported Function
               /1:N                    |                                                 |
              /                        |                                                 v
             |     +----------+        v       +----------------+                  +------------+
             |     |  Engine  |--------------->|  ModuleEngine  |----------------->| callEngine |
             |     +----------+                +----------------+                  +------------+
             |           |                      |                                     |
             .           |                      |             |                       |
main.wasm -->|           .--------------------->|             '-----------------+     |
             |          /                       |                               |     |
             v         .                        v                               v     v
         +--------------+     +-----------------------------------+          +----------+
         | Machine Code |     |[(func_instance, machine_code),...]|          |Call Stack|
         +--------------+     +-----------------------------------+          +----------+
                                                ^                                  ^
                                                |                                  |
                                                |                                  |
                                                +----------------------------------+
                                                                  |
                                                                  |
                                                                  |
                                                           Function.Call()
```

## Callbacks from machine code to Go

Go source can be compiled to invoke native library functions using CGO.
However, [CGO is not Go][cgo-not-go]. To call native functions in pure Go, we
need a different approach with unique constraints.

The most notable constraints are:

- machine code must not manipulate the Goroutine or system stack
- we cannot modify the signal handler of Go at runtime

### Handling the call stack

One constraint is that the generated machine code must not manipulate the
Goroutine (or system) stack. Otherwise, the Go runtime gets corrupted, which
results in fatal execution errors. This means we cannot[^1] call Go functions
(host functions) directly from machine code (compiled from wasm). This is
routinely needed in WebAssembly, as system calls such as WASI are defined in
Go, but invoked from Wasm. To handle this, we employ a "trampoline strategy".

Let's explain the "trampoline strategy" with an example. `random_get` is a host
function defined in Go, called from machine code compiled from the guest `main`
function. Let's say the wasm function corresponding to that is called `_start`.
The `_start` function is called by wazero by default on `Instantiate`.

Here is a TinyGo source file describing this.

```go
package main

import "unsafe"

// random_get is a function defined on the host, specifically, the wazero
// program written in Go.
//
//go:wasmimport wasi_snapshot_preview1 random_get
func random_get(ptr uintptr, size uint32) (errno uint32)

// main is compiled to wasm, so this is the guest. Conventionally, this ends up
// named `_start`.
func main() {
	// Define a buffer to hold random data
	size := uint32(8)
	buf := make([]byte, size)

	// Fill the buffer with random data using an imported host function.
	// The host needs to know where in guest memory to place the random data.
	// To communicate this, we have to convert buf to a uintptr.
	errno := random_get(uintptr(unsafe.Pointer(&buf[0])), size)
	if errno != 0 {
		panic(errno)
	}
}
```

When `_start` calls `random_get`, it exits execution first. wazero calls the Go
function mapped to `random_get` like a usual Go program. Finally, wazero
transfers control back to machine code again, resuming `_start` after the call
instruction to `random_get`.

Here's what the "trampoline strategy" looks like in a diagram. For simplicity,
we'll say the wasm memory offset of the `buf` is zero, but it will be different
in real execution.

```goat
|                         Go                         |        Machine Code
|                                                       (compiled from main.wasm)
|                                                    |
v
|                      `Instantiate(ctx, mainWasm)`  |
|                                   |
v                                   v                |
|                           +----------------+                  +------------+
|                           |func exec_native|-------|--------> |func _start |
v                           +----------------+                  +------------+
|                                                    |         /
|            Go func call   +----------------+                / ptr=0,size=8
v          .----------------|func exec_native|<------|-------. status=call_host_fn(name=rand_get)
|         / ptr=0,size=8    +----------------+ exit
|        v                                           |
v  +-------------+          +----------------+
|  |func rand_get|--------->|func exec_native|-------|-------.
|  +-------------+ errno=0  +----------------+ continue       \ errno=0
v                                                    |         \
|                                                    |          +------------+
|                                                    |          |func _start |
v                                                    |          +------------+
```

### Signal handling

Code compiled to wasm uses [runtime traps][spec-trap] to abort execution. For
example, a `panic` compiled with TinyGo becomes a wasm function named
`runtime._panic`, which issues an [unreachable][spec-unreachable] instruction
after printing the message to STDERR.

```go
package main

func main() {
	panic("help")
}
```

Native JIT compilers set custom signal handlers for [Wasm runtime traps][spec-trap],
such as the [unreachable][spec-unreachable] instruction. However, we cannot
safely [modify the signal handler of Go at runtime][signal-handler-discussion].
As described in the first section, wazero always exits the execution of machine
code. Machine code sets a status when it encounters an `unreachable`
instruction. This is read by wazero, which propagates it back with
`ErrRuntimeUnreachable`.

Here's a diagram showing this:

```goat
|                 Go                 |            Machine Code
|                                           (compiled from main.wasm)
|                                    |
v
|    `Instantiate(ctx, mainWasm)`    |
|                 |
v                 v                  |
|         +----------------+                                     +------------+
|         |func exec_native|---------|-------------------------> |func _start |
v         +----------------+                                     +------------+
|                                    |                                  |
|         +----------------+                  exit           +--------------------+
v         |func exec_native|<--------|---------------------- |func runtime._panic |
|         +----------------+            status=unreachable   +--------------------+
|                 |                  |
v                 |
|  panic(WasmRuntimeErrUnreachable)  |
```

One thing you will notice above is that the calls between wasm functions, such
as from `_start` to `runtime._panic`, do not use a trampoline. The trampoline
strategy is only used between wasm and the host.

## Summary

When an exported wasm function is called using a wazero API, such as
`Function.Call()`, wazero allocates a `callEngine` and starts the invocation.
This begins with jumping to machine code compiled from the Wasm binary. When
that code makes a callback to the host, it exits execution, passing control
back to `exec_native`, which then calls a Go function and resumes the machine
code afterwards. In the face of Wasm runtime errors, we exit the machine code
execution with the proper status and return control to the `exec_native`
function, just like host function calls. The only difference is that instead
of calling a Go function, we call `panic` with a corresponding error. This
jumping is why the strategy is called a trampoline, and it is only used between
the guest wasm and the host running it.

For more details, see [RATIONALE.md][compiler-rationale].

[call-stack]: https://en.wikipedia.org/wiki/Call_stack
[api-function]: https://pkg.go.dev/github.com/AR1011/wazero@v1.0.0-rc.1/api#Function
[api-module]: https://pkg.go.dev/github.com/AR1011/wazero@v1.0.0-rc.1/api#Module
[spec-function-instance]: https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#function-instances%E2%91%A0
[spec-trap]: https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#trap
[spec-unreachable]: https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#syntax-instr-control
[compiler-rationale]: https://github.com/AR1011/wazero/blob/v1.0.0-rc.1/internal/engine/compiler/RATIONALE.md
[signal-handler-discussion]: https://gophers.slack.com/archives/C1C1YSQBT/p1675992411241409
[cgo-not-go]: https://www.youtube.com/watch?v=PAAkCSZUG1c&t=757s

[^1]:
    It's technically possible to call it directly, but that would require performing "stack switching" in the native code.
    It's almost the same as what wazero does: exiting the execution of machine code, then calling the target Go function (using the caller of machine code as a "trampoline").