+++
title = "How the Optimizing Compiler Works"
layout = "single"
+++

wazero supports two modes of execution: interpreter mode and compilation mode.
Interpreter mode is a fallback for platforms where compilation is not
supported. Compilation mode is otherwise the default: it translates Wasm
modules to native code to get the best run-time performance.

Translating Wasm bytecode into machine code can take multiple forms. wazero
1.0 performs a straightforward translation from a given instruction to a native
instruction. wazero 2.0 introduces an optimizing compiler that is able to
perform nontrivial optimizing transformations, such as constant folding or
dead-code elimination, and it makes better use of the underlying hardware, such
as CPU registers. This document digs deeper into what we mean when we say
"optimizing compiler", and explains how it is implemented in wazero.
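To make constant folding and dead-code elimination concrete, here is a minimal Go sketch on a toy three-address program. The `Instr` type and both functions are hypothetical and exist only for illustration; they are not wazero's IR or API, and the sketch assumes each name is assigned exactly once.

```go
package main

import "fmt"

// Instr is a toy three-address instruction (hypothetical, not wazero's IR).
// It is either "Dst = Const" or "Dst = A <Op> B".
type Instr struct {
	Dst   string
	Op    string // "const", "add", or "mul"
	A, B  string // operand names (unused for "const")
	Const int64  // value (only meaningful for "const")
}

// FoldConstants replaces add/mul instructions whose operands are both
// known constants with a single "const" instruction.
func FoldConstants(prog []Instr) []Instr {
	known := map[string]int64{}
	out := make([]Instr, 0, len(prog))
	for _, in := range prog {
		if in.Op != "const" {
			a, okA := known[in.A]
			b, okB := known[in.B]
			if okA && okB {
				v := a + b
				if in.Op == "mul" {
					v = a * b
				}
				in = Instr{Dst: in.Dst, Op: "const", Const: v}
			}
		}
		if in.Op == "const" {
			known[in.Dst] = in.Const
		}
		out = append(out, in)
	}
	return out
}

// EliminateDead walks the program backwards from the value named by result
// and keeps only the instructions that value depends on.
func EliminateDead(prog []Instr, result string) []Instr {
	live := map[string]bool{result: true}
	keep := make([]bool, len(prog))
	for i := len(prog) - 1; i >= 0; i-- {
		in := prog[i]
		if !live[in.Dst] {
			continue
		}
		keep[i] = true
		if in.Op != "const" {
			live[in.A], live[in.B] = true, true
		}
	}
	var out []Instr
	for i, in := range prog {
		if keep[i] {
			out = append(out, in)
		}
	}
	return out
}

func main() {
	prog := []Instr{
		{Dst: "a", Op: "const", Const: 2},
		{Dst: "b", Op: "const", Const: 3},
		{Dst: "c", Op: "add", A: "a", B: "b"},
		{Dst: "d", Op: "mul", A: "c", B: "x"}, // x is not known at compile time
		{Dst: "e", Op: "mul", A: "a", B: "b"}, // never used
	}
	for _, in := range EliminateDead(FoldConstants(prog), "d") {
		fmt.Printf("%+v\n", in)
	}
}
```

In this example, folding turns `c = a + b` into the constant 5, after which `a`, `b`, and `e` are no longer needed to compute `d` and are removed. Only two instructions remain.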

This document is intended for maintainers, researchers, developers, and in
general anyone interested in understanding the internals of wazero.

What is an Optimizing Compiler?
-------------------------------

wazero supports an _optimizing_ compiler in the style of other optimizing
compilers such as LLVM's or V8's. Traditionally, an optimizing
compiler performs compilation in a number of steps.

Compare this to the **old compiler**, where compilation happens in one step or
two, depending on how you count:

```goat
    Input         +---------------+     +---------------+
 Wasm Binary ---->| DecodeModule  |---->| CompileModule |----> wazero IR
                  +---------------+     +---------------+
```

That is, the module is (1) validated, then (2) translated to an Intermediate
Representation (IR). The wazero IR can then be executed directly (in the case
of the interpreter) or it can be further processed and translated into native
code by the compiler. This compiler performs a straightforward translation from
the IR to native code, without any further passes. The wazero IR is not intended
for further processing beyond immediate execution or straightforward
translation.

```goat
                +----   wazero IR    ----+
                |                        |
                v                        v
        +--------------+         +--------------+
        |   Compiler   |         | Interpreter  |- - -  executable
        +--------------+         +--------------+
                |
     +----------+---------+
     |                    |
     v                    v
+---------+          +---------+
|  ARM64  |          |  AMD64  |
| Backend |          | Backend |    - - - - - - - - -   executable
+---------+          +---------+
```

Validation and translation to an IR are usually called the
**front-end** of a compiler, while code generation occurs in what we call
the **back-end**. The front-end is the part of a compiler that is
closer to the input, and it generally comprises machine-independent processing,
such as parsing and static validation. The back-end is the part of a compiler
that is closer to the output, and it generally includes machine-specific
procedures, such as code generation.

In the **optimizing** compiler, we still decode and translate Wasm binaries to
an intermediate representation in the front-end, but we use a textbook
representation called **SSA**, or "Static Single-Assignment" form, which is
intended for further transformation.
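As an illustration of the single-assignment property, the following Go sketch renames variables in straight-line code so that every name is defined exactly once. The `Assign` type and the `ToSSA` function are hypothetical and not part of wazero. The sketch also ignores control flow, which a real SSA construction has to handle at points where branches merge.

```go
package main

import "fmt"

// Assign is a toy statement "Dst = A + B" over named variables
// (hypothetical, for illustration only).
type Assign struct{ Dst, A, B string }

// ToSSA renames variables in straight-line code so that each name is
// assigned exactly once. Version 0 denotes a value defined before the code.
func ToSSA(prog []Assign) []Assign {
	version := map[string]int{}
	cur := func(name string) string {
		return fmt.Sprintf("%s%d", name, version[name])
	}
	out := make([]Assign, 0, len(prog))
	for _, s := range prog {
		a, b := cur(s.A), cur(s.B) // read the current versions first
		version[s.Dst]++           // then create a fresh version for the target
		out = append(out, Assign{Dst: cur(s.Dst), A: a, B: b})
	}
	return out
}

func main() {
	prog := []Assign{
		{"x", "a", "b"}, // x = a + b
		{"x", "x", "c"}, // x = x + c
		{"y", "x", "x"}, // y = x + x
	}
	for _, s := range ToSSA(prog) {
		fmt.Printf("%s = %s + %s\n", s.Dst, s.A, s.B)
	}
}
```

The two assignments to `x` become `x1` and `x2`, and the last statement reads `x2`. Because each name now has exactly one definition, a pass can find the definition of any value without tracking which assignment was the most recent.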

The benefit of choosing an IR that is meant for transformation is that many
optimization passes can apply directly to the IR, and thus be
machine-independent. The back-end can then be relatively simple, in that it
only has to deal with machine-specific concerns.

The wazero optimizing compiler implements the following compilation passes:

* Front-End:
  - Translation to SSA
  - Optimization
  - Block Layout
  - Control Flow Analysis

* Back-End:
  - Instruction Selection
  - Register Allocation
  - Finalization and Encoding

```goat
     Input          +-------------------+      +-------------------+
  Wasm Binary   --->|   DecodeModule    |----->|   CompileModule   |--+
                    +-------------------+      +-------------------+  |
           +----------------------------------------------------------+
           |
           |  +---------------+            +---------------+
           +->|   Front-End   |----------->|   Back-End    |
              +---------------+            +---------------+
                      |                            |
                      v                            v
                     SSA                 Instruction Selection
                      |                            |
                      v                            v
                Optimization              Register Allocation
                      |                            |
                      v                            v
                Block Layout             Finalization/Encoding
```
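The overall shape of such a pipeline can be sketched in Go as an ordered list of passes, each consuming the output of the previous one. The `Pass` type, `Pipeline`, and `Compile` below are hypothetical names used only for illustration; the pass bodies are placeholders that record the pass name and do not reflect how wazero structures its code.

```go
package main

import "fmt"

// Pass is a hypothetical compilation step: it takes the output of the
// previous step and returns its own.
type Pass struct {
	Name string
	Run  func(in string) string
}

// Pipeline returns the passes in the order listed in the text above.
// Each Run function is a placeholder that only appends the pass name.
func Pipeline() []Pass {
	names := []string{
		// Front-end (machine-independent).
		"Translation to SSA", "Optimization", "Block Layout", "Control Flow Analysis",
		// Back-end (machine-specific).
		"Instruction Selection", "Register Allocation", "Finalization and Encoding",
	}
	passes := make([]Pass, 0, len(names))
	for _, name := range names {
		name := name // capture per iteration (needed before Go 1.22)
		passes = append(passes, Pass{
			Name: name,
			Run:  func(in string) string { return in + " -> " + name },
		})
	}
	return passes
}

// Compile threads the input through every pass in order.
func Compile(input string) string {
	out := input
	for _, p := range Pipeline() {
		out = p.Run(out)
	}
	return out
}

func main() {
	fmt.Println(Compile("Wasm function"))
}
```

Keeping the passes in a flat, ordered list makes the split visible: everything up to the last front-end pass operates on a machine-independent IR, and only the back-end passes need a per-architecture implementation.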

Like the other engines, the implementation can be found under `engine`, specifically
in the `wazevo` sub-package. The entry point is `internal/engine/wazevo/engine.go`,
which contains the implementation of the `wasm.Engine` interface.

All the passes can be dumped to the console for debugging by enabling the build-time
flags in `internal/engine/wazevo/wazevoapi/debug_options.go`. The flags are disabled
by default and should only be enabled during debugging. They may also change in the future.
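A common way to implement build-time flags of this kind in Go is a boolean constant that guards the dump code, so the compiler can drop the branch entirely when the flag is `false`. The sketch below assumes a hypothetical constant `printSSA` and a hypothetical function `compileFunction`; the actual flag names are defined in `debug_options.go` and are not reproduced here.

```go
package main

import "fmt"

// printSSA is a hypothetical build-time flag. Change it to true and rebuild
// to enable the dump. It is not one of wazero's actual flag names.
const printSSA = false

// compileFunction stands in for a compilation step. It returns the text it
// dumped (empty when the flag is off) so the effect is easy to observe.
func compileFunction(name string) string {
	ssa := "v2 = add v0, v1" // placeholder for the output of a pass
	if !printSSA {
		return ""
	}
	dump := fmt.Sprintf("SSA for %s:\n%s\n", name, ssa)
	fmt.Print(dump)
	return dump
}

func main() {
	compileFunction("add")
	fmt.Println("done")
}
```

Because the flag is a constant and not a runtime option, enabling it requires editing the source and rebuilding, which fits flags that are meant for debugging only.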

In the following, we assume all paths to be relative to `internal/engine/wazevo`,
so we omit the prefix.

## Index

- [Front-End](frontend/)
- [Back-End](backend/)
- [Appendix](appendix/)