+++
title = "Appendix: Trampolines"
layout = "single"
+++

Trampolines are used to interface between the Go runtime and the generated
code, in two cases:

- when we need to **enter the generated code** from the Go runtime.
- when we need to **leave the generated code** to invoke a host function
  (written in Go).

In this section we want to complete the picture of how a Wasm function gets
translated from Wasm to executable code in the optimizing compiler, by
describing how to jump into the execution of the generated code at run-time.

## Entering the Generated Code

At run-time, user code invokes a Wasm function through the public
`api.Function` interface, using the methods `Call()` or `CallWithStack()`. The
implementation of these methods, in turn, eventually invokes an assembly
**trampoline**. The signature of this trampoline in Go code is:

```go
func entrypoint(
	preambleExecutable, functionExecutable *byte,
	executionContextPtr uintptr, moduleContextPtr *byte,
	paramResultStackPtr *uint64,
	goAllocatedStackSlicePtr uintptr)
```

- `preambleExecutable` is a pointer to the generated code for the preamble (see
  below).
- `functionExecutable` is a pointer to the generated code for the function (as
  described in the previous sections).
- `executionContextPtr` is a raw pointer to the `wazevo.executionContext`
  struct. This struct is used to save the state of the Go runtime before
  entering or leaving the generated code. It also holds shared state between
  the Go runtime and the generated code, such as the exit code that is used to
  terminate execution on failure, or suspend it to invoke host functions.
- `moduleContextPtr` is a pointer to the `wazevo.moduleContextOpaque` struct.
  Its contents are basically pointers to module-instance-specific objects as
  well as functions. This is sometimes called "VMContext" in other Wasm
  runtimes.
- `paramResultStackPtr` is a pointer to the slice where the arguments and
  results of the function are passed.
- `goAllocatedStackSlicePtr` is an aligned pointer to the Go-allocated stack
  for holding values and call frames. For further details refer to
  [Backend § Prologue and Epilogue](../backend/#prologue-and-epilogue).
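
To illustrate what "the slice where the arguments and results of the function are passed" means in practice, here is a minimal sketch of a shared parameter/result slice. It assumes that values are stored as raw `uint64` slots and that the slice is sized for the larger of the two counts. The names `newParamResultStack`, `scale` and `callScale` are hypothetical and are not part of wazero.

```go
package main

import (
	"fmt"
	"math"
)

// newParamResultStack allocates one slice large enough for either all the
// parameters or all the results, because both share the same slots.
// Hypothetical helper for illustration; it is not part of wazero.
func newParamResultStack(paramCount, resultCount int) []uint64 {
	n := paramCount
	if resultCount > n {
		n = resultCount
	}
	return make([]uint64, n)
}

// scale stands in for a compiled Wasm function of type (i32, f64) -> f64.
// It reads its arguments from the slice and overwrites slot 0 with the result.
func scale(stack []uint64) {
	n := uint32(stack[0])
	factor := math.Float64frombits(stack[1])
	stack[0] = math.Float64bits(float64(n) * factor)
}

// callScale plays the role of the Go caller: encode, call, decode.
func callScale(n uint32, factor float64) float64 {
	stack := newParamResultStack(2, 1)
	stack[0] = uint64(n)                // i32 argument, zero-extended
	stack[1] = math.Float64bits(factor) // f64 argument as raw bits
	scale(stack)
	return math.Float64frombits(stack[0])
}

func main() {
	fmt.Println(callScale(7, 1.5)) // 10.5
}
```

Reusing the same slots for inputs and outputs means the caller can allocate the slice once and pass a single pointer, which is what `paramResultStackPtr` carries.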

The trampoline can be found in `backend/isa/<arch>/abi_entry_<arch>.s`.

For each given architecture, the trampoline:

- moves the arguments to specific registers to match the behavior of the entry
  preamble or trampoline function, and
- finally, jumps into the execution of the generated code for the preamble.

The **preamble** that `entrypoint` jumps to is generated per function
signature.

This is implemented in `machine.CompileEntryPreamble(*ssa.Signature)`.

The preamble sets the fields in the `wazevo.executionContext`.

At the beginning of the preamble:

- Set a register to point to the `*wazevo.executionContext` struct.
- Save the stack pointers, frame pointers, return addresses, etc. to that
  struct.
- Update the stack pointer to point to `paramResultStackPtr`.

The generated code works in concert with the assumption that the preamble has
been entered through the aforementioned trampoline. Thus, it assumes that the
arguments can be found in some specific registers.

The preamble then assigns the arguments pointed at by `paramResultStackPtr` to
the registers and stack location that the generated code expects.

Finally, it invokes the generated code for the function.

The epilogue reverses part of the process, finally returning control to the
caller of the `entrypoint()` function, and hence to the Go runtime. The caller
of `entrypoint()` is also responsible for completing the clean-up procedure by
invoking `afterGoFunctionCallEntrypoint()` (again, implemented in
backend-specific assembly), which restores the stack pointers and returns
control to the caller of the function.

The arch-specific code can be found in
`backend/isa/<arch>/abi_entry_preamble.go`.

[wazero-engine-stack]: https://github.com/tetratelabs/wazero/blob/095b49f74a5e36ce401b899a0c16de4eeb46c054/internal/engine/compiler/engine.go#L77-L132
[abi-arm64]: https://tip.golang.org/src/cmd/compile/abi-internal#arm64-architecture
[abi-amd64]: https://tip.golang.org/src/cmd/compile/abi-internal#amd64-architecture
[abi-cc]: https://tip.golang.org/src/cmd/compile/abi-internal#function-call-argument-and-result-passing


## Leaving the Generated Code

In "[How do compiler functions work?][how-do-compiler-functions-work]", we
already outlined how _leaving_ the generated code works with the help of a
function. Here we complete the picture by briefly describing the code that
is generated.

When the generated code needs to return control to the Go runtime, it inserts a
meta-instruction that is called `exitSequence` in both the `amd64` and `arm64`
backends. This meta-instruction sets the `exitCode` in the
`wazevo.executionContext` struct, restores the stack pointers and then returns
control to the caller of the `entrypoint()` function described above.
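
Seen from the Go side, this exit mechanism can be pictured as a loop that enters the generated code, inspects the exit code, and acts on it. The following is a simplified simulation of that idea, not wazero's actual implementation: the type `executionContext`, the constants, and the functions `run` and `scripted` are all hypothetical stand-ins, and how the real engine resumes execution after serving a request is not modelled.

```go
package main

import (
	"errors"
	"fmt"
)

// exitCode mimics the value the generated code stores before returning
// control to Go. The names and values below are hypothetical.
type exitCode uint32

const (
	exitOK exitCode = iota
	exitCallHostFunction
	exitUnreachable
)

// executionContext is a simplified stand-in for wazevo.executionContext.
type executionContext struct {
	exitCode exitCode
}

// run enters the (simulated) generated code until it finishes or traps.
// enter plays the role of the trampoline; host plays a Go host function.
func run(ec *executionContext, enter func(*executionContext), host func()) error {
	for {
		enter(ec)
		switch ec.exitCode {
		case exitOK:
			return nil
		case exitCallHostFunction:
			host() // serve the request, then go back into the generated code
		case exitUnreachable:
			return errors.New("wasm trap: unreachable")
		default:
			return fmt.Errorf("unknown exit code %d", ec.exitCode)
		}
	}
}

// scripted returns a fake "generated code" that exits with the given codes,
// one per entry.
func scripted(codes ...exitCode) func(*executionContext) {
	i := 0
	return func(ec *executionContext) {
		ec.exitCode = codes[i]
		i++
	}
}

func main() {
	ec := &executionContext{}
	hostCalls := 0
	// The simulated code asks for a host call twice, then finishes.
	enter := scripted(exitCallHostFunction, exitCallHostFunction, exitOK)
	err := run(ec, enter, func() { hostCalls++ })
	fmt.Println("host calls:", hostCalls, "err:", err)
}
```

The point of the sketch is that a single exit path serves normal returns, traps, and host calls alike; only the exit code tells them apart.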

As described in "[How do compiler functions
work?][how-do-compiler-functions-work]", the mechanism is essentially the same
when invoking a host function or raising an error. However, when a host
function is invoked, the `exitCode` also carries the identifier of the host
function to be invoked.
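
One way to carry both pieces of information in a single integer is to keep the kind of exit in the low bits and the function identifier in the remaining bits. The sketch below shows that technique under an assumed layout (low 8 bits for the kind, the rest for the index). The constant values and the helper names `withGoFunctionIndex`, `kindOf` and `goFunctionIndex` are hypothetical and may not match the actual encoding used by wazero.

```go
package main

import "fmt"

type exitCode uint32

// Assumed layout for illustration only: the low 8 bits hold the kind of
// exit, the upper bits hold the index of the host function.
const (
	exitCodeMask           exitCode = 0xff
	exitCodeOK             exitCode = 0
	exitCodeCallGoFunction exitCode = 1
)

// withGoFunctionIndex packs a host function index into a "call Go
// function" exit code.
func withGoFunctionIndex(index int) exitCode {
	return exitCodeCallGoFunction | exitCode(index)<<8
}

// kindOf extracts the kind of exit, discarding the index bits.
func kindOf(c exitCode) exitCode { return c & exitCodeMask }

// goFunctionIndex extracts the host function index.
func goFunctionIndex(c exitCode) int { return int(c >> 8) }

func main() {
	c := withGoFunctionIndex(5)
	fmt.Println(kindOf(c) == exitCodeCallGoFunction, goFunctionIndex(c)) // true 5
}
```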

The core of this mechanism is the
`backend.Machine.CompileGoFunctionTrampoline()` method. This method is invoked
when host modules are being instantiated. It generates a trampoline that is
used to invoke host functions from the generated code.

This trampoline implements essentially the same prologue as the `entrypoint()`,
but it also reserves space for the arguments and results of the function to be
invoked.

A host function has the signature:

```go
func(ctx context.Context, stack []uint64)
```
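
As a concrete instance of this signature, here is a minimal sketch of a host function that adds two `i32` values. It assumes that arguments arrive in `stack` in order and that results are written back starting at index 0. The names `hostFunc`, `addI32` and `callAdd` are hypothetical and exist only for this example.

```go
package main

import (
	"context"
	"fmt"
)

// hostFunc names the host function signature for this example.
type hostFunc func(ctx context.Context, stack []uint64)

// addI32 reads two i32 arguments from the stack and writes their sum
// back into slot 0, which is where the caller expects the result.
func addI32(ctx context.Context, stack []uint64) {
	x, y := uint32(stack[0]), uint32(stack[1])
	stack[0] = uint64(x + y) // i32 addition wraps around on overflow
}

// callAdd plays the role of the trampoline: it fills the slice with the
// arguments, invokes the host function, and reads the result.
func callAdd(x, y uint32) uint32 {
	var f hostFunc = addI32
	stack := []uint64{uint64(x), uint64(y)}
	f(context.Background(), stack)
	return uint32(stack[0])
}

func main() {
	fmt.Println(callAdd(40, 2)) // 42
}
```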

The function arguments in the `stack` parameter are copied over to the reserved
slots of the real stack. For instance, on `arm64` the stack layout would look
as follows (on `amd64` it would be similar):

```goat
                  (high address)
    SP ------> +-----------------+  <----+
               |     .......     |       |
               |      ret Y      |       |
               |     .......     |       |
               |      ret 0      |       |
               |      arg X      |       |  size_of_arg_ret
               |     .......     |       |
               |      arg 1      |       |
               |      arg 0      |  <----+ <-------- originalArg0Reg
               | size_of_arg_ret |
               |  ReturnAddress  |
               +-----------------+ <----+
               |      xxxx       |      |  ;; might be padded to make it 16-byte aligned.
          +--->|  arg[N]/ret[M]  |      |
 sliceSize|    |   ............  |      | goCallStackSize
          |    |  arg[1]/ret[1]  |      |
          +--->|  arg[0]/ret[0]  | <----+ <-------- arg0ret0AddrReg
               |    sliceSize    |
               |   frame_size    |
               +-----------------+
                  (low address)
```
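
The relationship between `sliceSize`, the padding slot, and `goCallStackSize` in the diagram can be sketched as simple arithmetic. This is an illustration under stated assumptions, not the compiler's actual code: it assumes every argument and result occupies one 8-byte slot (wider values such as vectors would need more), that the arguments and results share slots, and that sizes are measured in bytes. The helpers `alignUp16` and `goCallStackSizes` are hypothetical names.

```go
package main

import "fmt"

// slotSize is the assumed size in bytes of one argument/result slot.
const slotSize = 8

// alignUp16 rounds n up to the next multiple of 16.
func alignUp16(n int) int { return (n + 15) &^ 15 }

// goCallStackSizes returns the unpadded size of the shared
// argument/result area and its size after padding to 16 bytes.
func goCallStackSizes(argSlots, retSlots int) (sliceSize, goCallStackSize int) {
	n := argSlots
	if retSlots > n {
		n = retSlots
	}
	sliceSize = n * slotSize
	return sliceSize, alignUp16(sliceSize)
}

func main() {
	// Three arguments and one result: 24 bytes of slots, padded to 32.
	sliceSize, padded := goCallStackSizes(3, 1)
	fmt.Println(sliceSize, padded) // 24 32
}
```

With an even number of slots no padding is needed, which is why the `xxxx` slot in the diagram is only sometimes present.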

Finally, the trampoline jumps into the execution of the host function using the
`exitSequence` meta-instruction.

Upon return, the process is reversed.

## Code

- The trampoline to enter the generated function is generated by the
  `backend.Machine.CompileEntryPreamble()` method.
- The trampoline to return traps and invoke host functions is generated by the
  `backend.Machine.CompileGoFunctionTrampoline()` method.

You can find arch-specific implementations in
`backend/isa/<arch>/abi_go_call.go`,
`backend/isa/<arch>/abi_entry_preamble.go`, etc. The assembly trampolines are
found under `backend/isa/<arch>/abi_entry_<arch>.s`.

## Further References

- Go's [internal ABI documentation][abi-internal] details a calling convention
  similar to the one we use in both the arm64 and amd64 backends.
- Raphael Poss's [The Go low-level calling convention on
  x86-64][go-call-conv-x86] is also an excellent reference for `amd64`.

[abi-internal]: https://tip.golang.org/src/cmd/compile/abi-internal
[go-call-conv-x86]: https://dr-knz.net/go-calling-convention-x86-64.html
[proposal-register-cc]: https://go.googlesource.com/proposal/+/master/design/40724-register-calling.md#background
[how-do-compiler-functions-work]: ../../how_do_compiler_functions_work/