# How do compiler functions work?

WebAssembly runtimes let you call functions defined in wasm. How this works in
wazero depends on your `RuntimeConfig`.

- `RuntimeConfigCompiler` compiles machine code from your wasm, and jumps to
  that when invoking a function.
- `RuntimeConfigInterpreter` does not generate code. It interprets wasm and
  executes Go statements that correspond to WebAssembly instructions.

Exactly how the compiler works is a large topic, discussed at length on this
page. For more general information on architecture, etc., please refer to
[Docs](..).

## Engines

Our [Docs](..) introduce the "engine" concept of wazero. More precisely, there
are three types of engines, `Engine`, `ModuleEngine` and `callEngine`. Each has
a different scope and role:

- `Engine` has the same lifetime as `Runtime`. This compiles a `CompiledModule`
  into machine code, which is both cached and memory-mapped as an executable.
- `ModuleEngine` is a virtual machine with the same lifetime as its [Module][api-module].
  Notably, this binds each [function instance][spec-function-instance] to
  corresponding machine code owned by its `Engine`.
- `callEngine` is the implementation of [api.Function][api-function] in a
  [Module][api-module]. This implements `Function.Call(...)` by invoking
  machine code corresponding to a function instance in `ModuleEngine` and
  managing the [call stack][call-stack] representing the invocation.

Here is a diagram showing the relationships of these engines:

```goat
      .-----------> Instantiated module                                 Exported Function
     /1:N                   |                                                  |
    /                       |                                                  v
   |     +----------+       v        +----------------+                  +------------+
   |     |  Engine  |--------------->|  ModuleEngine  |----------------->| callEngine |
   |     +----------+                +----------------+                  +------------+
   |          |                               |                            |      |
   .          |                               |                            |      |
 main.wasm -->|        .--------------------->|          '-----------------+      |
              |       /                       |          |                        |
              v      .                        v          v                        v
      +--------------+      +-----------------------------------+            +----------+
      | Machine Code |      |[(func_instance, machine_code),...]|            |Call Stack|
      +--------------+      +-----------------------------------+            +----------+
                                               ^                                  ^
                                               |                                  |
                                               |                                  |
                                               +----------------------------------+
                                                               |
                                                               |
                                                               |
                                                        Function.Call()
```
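
The relationships in the diagram can be sketched as a toy model in Go. This is only an illustration under stated assumptions: every type and method below (`engine`, `moduleEngine`, `callEngine`, `machineCode`, and the module ID used as a cache key) is a hypothetical stand-in, not wazero's internal definition, and "machine code" is modeled as a plain Go closure.

```go
package main

import "fmt"

// machineCode stands in for native code. A Go closure keeps the sketch portable.
type machineCode func(stack []uint64) []uint64

// engine models Engine: it lives as long as the runtime and caches compiled
// code per module (keyed here by a hypothetical module ID).
type engine struct {
	cache    map[string][]machineCode
	compiles int // how many times "compilation" actually happened
}

// compile returns cached code when present, otherwise stores the given code.
func (e *engine) compile(moduleID string, funcs []machineCode) []machineCode {
	if code, ok := e.cache[moduleID]; ok {
		return code
	}
	e.compiles++
	e.cache[moduleID] = funcs
	return funcs
}

// moduleEngine models ModuleEngine: it binds each function instance (by
// index) to machine code owned by the engine.
type moduleEngine struct {
	functions []machineCode
}

func (e *engine) newModuleEngine(moduleID string, funcs []machineCode) *moduleEngine {
	return &moduleEngine{functions: e.compile(moduleID, funcs)}
}

// callEngine models callEngine: it invokes one function's machine code and
// owns the call stack used for that invocation.
type callEngine struct {
	fn    machineCode
	stack []uint64
}

func (m *moduleEngine) newCallEngine(index int) *callEngine {
	return &callEngine{fn: m.functions[index]}
}

// call resets the stack, pushes the parameters and runs the machine code.
func (c *callEngine) call(params ...uint64) []uint64 {
	c.stack = append(c.stack[:0], params...)
	return c.fn(c.stack)
}

func main() {
	e := &engine{cache: map[string][]machineCode{}}
	add := machineCode(func(s []uint64) []uint64 { return []uint64{s[0] + s[1]} })

	// Two instantiated modules share the machine code compiled once (1:N).
	m1 := e.newModuleEngine("main.wasm", []machineCode{add})
	m2 := e.newModuleEngine("main.wasm", []machineCode{add})

	fmt.Println(m1.newCallEngine(0).call(1, 2), m2.newCallEngine(0).call(3, 4), e.compiles)
	// prints "[3] [7] 1"
}
```

The point of the split is scope: compilation happens once per engine, binding happens once per instantiated module, and a call stack exists only for the duration of a call.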

## Callbacks from machine code to Go

Go source can be compiled to invoke native library functions using CGO.
However, [Cgo is not Go][cgo-not-go]. To call native functions in pure Go, we
need a different approach with unique constraints.

The most notable constraints are:

- machine code must not manipulate the Goroutine or system stack
- we cannot modify the signal handler of Go at runtime

### Handling the call stack

One constraint is that the generated machine code must not manipulate the
Goroutine (or system) stack. Otherwise, the Go runtime gets corrupted, which
results in fatal execution errors. This means we cannot[^1] call Go functions
(host functions) directly from machine code (compiled from wasm). Such calls are
routinely needed in WebAssembly, as system calls such as WASI are defined in Go,
but invoked from Wasm. To handle this, we employ a "trampoline strategy".

Let's explain the "trampoline strategy" with an example. `random_get` is a host
function defined in Go, called from machine code compiled from the guest's
`main` function. Let's say the wasm function corresponding to `main` is called
`_start`. By default, wazero calls the `_start` function on `Instantiate`.

Here is a TinyGo source file describing this.

```go
package main

import "unsafe"

// random_get is a function defined on the host, specifically, the wazero
// program written in Go.
//
//go:wasmimport wasi_snapshot_preview1 random_get
func random_get(ptr uintptr, size uint32) (errno uint32)

// main is compiled to wasm, so this is the guest. Conventionally, this ends up
// named `_start`.
func main() {
	// Define a buffer to hold random data.
	size := uint32(8)
	buf := make([]byte, size)

	// Fill the buffer with random data using an imported host function.
	// The host needs to know where in guest memory to place the random data.
	// To communicate this, we have to convert buf to a uintptr.
	errno := random_get(uintptr(unsafe.Pointer(&buf[0])), size)
	if errno != 0 {
		panic(errno)
	}
}
```
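
To make the host side of this exchange concrete, here is a minimal sketch of what a `random_get` host function might do with the `ptr` and `size` it receives. It is not wazero's WASI implementation: `randomGet` and its signature are hypothetical, guest linear memory is modeled as a plain byte slice, and the errno values are assumed to follow the WASI preview1 numbering.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// Errno values, assumed to follow the WASI preview1 numbering.
const (
	errnoSuccess uint32 = 0
	errnoFault   uint32 = 21 // bad address
	errnoIo      uint32 = 29 // I/O error
)

// randomGet is a hypothetical host-side implementation. memory models the
// guest's linear memory. ptr is an offset into it, not a Go pointer.
func randomGet(memory []byte, ptr, size uint32) uint32 {
	// Check bounds with 64-bit arithmetic so ptr+size cannot overflow.
	if uint64(ptr)+uint64(size) > uint64(len(memory)) {
		return errnoFault
	}
	// Write random bytes directly into the guest-visible region.
	if _, err := rand.Read(memory[ptr : ptr+size]); err != nil {
		return errnoIo
	}
	return errnoSuccess
}

func main() {
	memory := make([]byte, 16)
	fmt.Println(randomGet(memory, 0, 8))  // in bounds: prints 0
	fmt.Println(randomGet(memory, 12, 8)) // out of bounds: prints 21
}
```

The guest passes an offset rather than a pointer because the host and guest do not share an address space. The host must validate the range before writing.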

When `_start` calls `random_get`, the machine code first exits execution,
returning control to wazero. wazero then calls the Go function mapped to
`random_get` like a usual Go program. Finally, wazero transfers control back to
machine code, resuming `_start` after the call instruction to `random_get`.

Here's what the "trampoline strategy" looks like in a diagram. For simplicity,
we'll say the wasm memory offset of `buf` is zero, but it will be different
in real execution.

```goat
   |                                     Go              |           Machine Code
   |                                                           (compiled from main.wasm)
   |                                                     |
   v
   |                        `Instantiate(ctx, mainWasm)` |
   |                                     |
   v                                     v               |
   |                            +----------------+                  +------------+
   |                            |func exec_native|-------|--------> |func _start |
   v                            +----------------+                  +------------+
   |                                                     |         /
   |            Go func call    +----------------+                / ptr=0,size=8
   v           .----------------|func exec_native|<------|-------. status=call_host_fn(name=rand_get)
   |          /  ptr=0,size=8   +----------------+     exit
   |         v                                           |
   v   +-------------+          +----------------+
   |   |func rand_get|--------->|func exec_native|-------|-------.
   |   +-------------+ errno=0  +----------------+    continue    \ errno=0
   v                                                     |         \
   |                                                     |          +------------+
   |                                                     |          |func _start |
   v                                                     |          +------------+
```
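
The control flow in the diagram can be approximated in plain Go. The sketch below is a simulation, not wazero's code: `execNative`, `start`, the `status` values and the `callEngine` fields are hypothetical names, and the "machine code" is a resumable Go function that exits with a status instead of calling the host directly.

```go
package main

import "fmt"

type status int

const (
	statusReturned   status = iota // the guest function finished
	statusCallHostFn               // the guest exited to request a host call
)

// callEngine holds the state shared between Go and the "machine code".
type callEngine struct {
	status   status
	hostFn   string   // name of the requested host function
	stack    []uint64 // parameters and results cross the boundary here
	resumeAt int      // where the guest continues after a host call
}

// start stands in for machine code compiled from `_start`. It must not call
// Go directly, so it records a request on the callEngine and exits.
func start(ce *callEngine) {
	switch ce.resumeAt {
	case 0:
		ce.stack = []uint64{0, 8} // ptr=0, size=8
		ce.hostFn = "random_get"
		ce.status = statusCallHostFn
		ce.resumeAt = 1
	case 1:
		// Resumed after the call instruction: errno is now on the stack.
		ce.status = statusReturned
	}
}

// execNative is the trampoline: it enters the guest, and each time the guest
// exits with a host call request, it calls the Go function and re-enters.
func execNative(ce *callEngine, guest func(*callEngine), host map[string]func([]uint64) []uint64) error {
	for {
		guest(ce)
		switch ce.status {
		case statusReturned:
			return nil
		case statusCallHostFn:
			fn, ok := host[ce.hostFn]
			if !ok {
				return fmt.Errorf("unknown host function %q", ce.hostFn)
			}
			ce.stack = fn(ce.stack) // an ordinary Go function call
		default:
			return fmt.Errorf("unexpected status %d", ce.status)
		}
	}
}

func main() {
	host := map[string]func([]uint64) []uint64{
		"random_get": func(params []uint64) []uint64 {
			fmt.Printf("host: random_get(ptr=%d, size=%d)\n", params[0], params[1])
			return []uint64{0} // errno=0
		},
	}
	ce := &callEngine{}
	err := execNative(ce, start, host)
	fmt.Println("errno:", ce.stack[0], "err:", err)
}
```

Because the guest always returns to `execNative` before any Go function runs, the host call happens on a normal Go stack frame, which is the property the constraint above requires.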

### Signal handling

Code compiled to wasm uses [runtime traps][spec-trap] to abort execution. For
example, a `panic` compiled with TinyGo becomes a wasm function named
`runtime._panic`, which issues an [unreachable][spec-unreachable] instruction
after printing the message to STDERR.

```go
package main

func main() {
	panic("help")
}
```

Native JIT compilers set custom signal handlers for [Wasm runtime traps][spec-trap],
such as the [unreachable][spec-unreachable] instruction. However, we cannot
safely [modify the signal handler of Go at runtime][signal-handler-discussion].
Instead, as described in the previous section, wazero always exits the
execution of machine code. Machine code sets a status when it encounters an
`unreachable` instruction. wazero reads this status and propagates it back as
`ErrRuntimeUnreachable`.

Here's a diagram showing this:

```goat
   |                               Go                 |                             Machine Code
   |                                                                          (compiled from main.wasm)
   |                                                  |
   v
   |                   `Instantiate(ctx, mainWasm)`   |
   |                                |
   v                                v                 |
   |                       +----------------+                                     +------------+
   |                       |func exec_native|---------|-------------------------> |func _start |
   v                       +----------------+                                     +------------+
   |                                                  |                                 |
   |                       +----------------+                  exit           +--------------------+
   v                       |func exec_native|<--------|---------------------- |func runtime._panic |
   |                       +----------------+            status=unreachable   +--------------------+
   |                              |                   |
   v                              |
   |                panic(WasmRuntimeErrUnreachable)  |
```

One thing you will notice above is that calls between wasm functions, such
as from `_start` to `runtime._panic`, do not use a trampoline. The trampoline
strategy is only used between wasm and the host.
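
The trap path can be simulated the same way. In this sketch, `errUnreachable` is a hypothetical stand-in for wazero's `ErrRuntimeUnreachable`, and the other names are likewise illustrative. The `recover` in `call` is an assumption about how a caller might turn the panic into an error, not a description of wazero's code.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnreachable is a hypothetical stand-in for wazero's ErrRuntimeUnreachable.
var errUnreachable = errors.New("unreachable")

type status int

const (
	statusReturned    status = iota // the guest function finished normally
	statusUnreachable               // the guest hit an `unreachable` instruction
)

// runtimePanic stands in for machine code of `runtime._panic`: instead of
// raising a signal, it exits with a status.
func runtimePanic() status { return statusUnreachable }

// start stands in for `_start`. It calls runtimePanic directly, because calls
// between wasm functions do not go through the trampoline.
func start() status { return runtimePanic() }

// execNative runs the guest and converts a trap status into a Go panic.
func execNative(guest func() status) {
	if guest() == statusUnreachable {
		panic(errUnreachable)
	}
}

// call models an API entry point that recovers the panic and reports it as
// an error (an assumption of this sketch).
func call(guest func() status) (err error) {
	defer func() {
		if r := recover(); r != nil {
			e, ok := r.(error)
			if !ok {
				panic(r) // not a trap: re-panic unchanged
			}
			err = e
		}
	}()
	execNative(guest)
	return nil
}

func main() {
	err := call(start)
	fmt.Println(errors.Is(err, errUnreachable)) // prints "true"
}
```

No signal handler is involved at any point: the trap is an ordinary return from the guest followed by an ordinary Go `panic`.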

## Summary

When an exported wasm function is called using a wazero API, such as
`Function.Call()`, wazero allocates a `callEngine` and starts the invocation.
This begins with jumping to machine code compiled from the Wasm binary. When
that code makes a callback to the host, it exits execution, passing control back
to `exec_native`, which then calls a Go function and resumes the machine code
afterwards. On a Wasm runtime error, we exit the machine code execution with
the proper status and return control to the `exec_native` function, just like
for host function calls. The only difference is that instead of calling a Go
function, we call `panic` with a corresponding error. This jumping back and
forth is why the strategy is called a trampoline. It is only used between the
guest wasm and the host running it.

For more details, see [RATIONALE.md][compiler-rationale].

[call-stack]: https://en.wikipedia.org/wiki/Call_stack
[api-function]: https://pkg.go.dev/github.com/AR1011/wazero@v1.0.0-rc.1/api#Function
[api-module]: https://pkg.go.dev/github.com/AR1011/wazero@v1.0.0-rc.1/api#Module
[spec-function-instance]: https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#function-instances%E2%91%A0
[spec-trap]: https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#trap
[spec-unreachable]: https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#syntax-instr-control
[compiler-rationale]: https://github.com/AR1011/wazero/blob/v1.0.0-rc.1/internal/engine/compiler/RATIONALE.md
[signal-handler-discussion]: https://gophers.slack.com/archives/C1C1YSQBT/p1675992411241409
[cgo-not-go]: https://www.youtube.com/watch?v=PAAkCSZUG1c&t=757s

[^1]:
    It's technically possible to call it directly, but that would require performing "stack switching" in the native code.
    That is almost the same as what wazero does: exiting the execution of machine code, then calling the target Go function (using the caller of machine code as a "trampoline").