github.com/tinygo-org/tinygo@v0.31.3-0.20240404173401-90b0bf646c27/interp/README.md

# Partial evaluation of initialization code in Go

For several reasons related to code size and memory consumption (see below), it
is best to evaluate as much initialization code as possible at compile time and
to run only unknown expressions (e.g. external calls) at runtime. In practice,
this package is a partial evaluator of the `runtime.initAll` function, which
calls each package initializer.

This package is a rewrite of a previous partial evaluator that worked
directly on LLVM IR and used the module and LLVM constants as intermediate
values. This newer version instead uses a mostly Go intermediate form. It
compiles functions and extracts relevant data first (compiler.go), then
executes those functions (interpreter.go) in a memory space that can be
rolled back per function (memory.go). This means that it is not necessary to
scan functions to see whether they can be run at compile time, which was very
error-prone. Instead, it simply tries to execute everything, and if it hits
something it cannot interpret (such as a store to memory-mapped I/O), it rolls
back the execution of that function and runs the function at runtime instead.
All in all, this design provides several benefits:

  * Much better error handling. By being able to revert to runtime execution
    without the need for scanning functions, this version is able to
    automatically work around many bugs in the previous implementation.
  * More correct memory model. This is not inherent to the new design, but the
    new design also made the memory model easier to reason about.
  * Faster execution of initialization code. While it is not much faster for
    normal interpretation (maybe 25% or so) due to the compilation overhead,
    it should be considerably faster for loops, as it doesn't have to call
    into LLVM (via CGo) for every operation.
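
The "try everything, roll back on failure" strategy described above can be sketched in a few lines of Go. This is only an illustration, not the package's code: `errUnsupported`, `initFunc`, and `evalInits` are hypothetical names, and a plain map stands in for the real memory space.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnsupported is a hypothetical sentinel error for operations that cannot
// be interpreted at compile time (for example a store to memory-mapped I/O).
var errUnsupported = errors.New("cannot interpret at compile time")

// initFunc is a stand-in for a package initializer.
type initFunc struct {
	name string
	run  func(state map[string]int) error
}

// evalInits runs every initializer against a scratch copy of state.
// A successful run is committed. A failed run is discarded, and the name of
// the initializer is returned so that it can be called at runtime instead.
func evalInits(state map[string]int, inits []initFunc) (deferred []string) {
	for _, f := range inits {
		scratch := make(map[string]int, len(state))
		for k, v := range state {
			scratch[k] = v
		}
		if err := f.run(scratch); err != nil {
			// Roll back: the scratch copy is simply dropped.
			deferred = append(deferred, f.name)
			continue
		}
		for k, v := range scratch {
			state[k] = v // commit
		}
	}
	return deferred
}

func main() {
	state := map[string]int{}
	inits := []initFunc{
		{"pkg/a.init", func(s map[string]int) error { s["a"] = 1; return nil }},
		{"pkg/b.init", func(s map[string]int) error { s["b"] = 2; return errUnsupported }},
	}
	deferred := evalInits(state, inits)
	fmt.Println(state, deferred)
}
```

Note that no up-front analysis of `pkg/b.init` is needed: its partial write to `b` never reaches `state`, and the function ends up in the list of initializers left for runtime.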

As mentioned, this partial evaluator comes in three parts: a compiler, an
interpreter, and a memory manager.

## Compiler

The main task of the compiler is to extract all necessary data from every
instruction in a function, so that no additional CGo calls are necessary when
the instruction is interpreted. This is not currently done for all
instructions (`runtime.alloc` is a notable exception), but it is done for the
vast majority of instructions.

## Interpreter

The interpreter runs an instruction just like it would if it were executed
'for real'. The vast majority of instructions can be executed at compile
time. As indicated above, some instructions need to be executed at runtime
instead.
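
To make the split between the compiler and the interpreter more concrete, here is a minimal sketch of a pre-decoded instruction form and a loop that interprets it. The `opcode` and `instruction` types and the `run` function are assumptions made up for this example; the real intermediate form carries far more information.

```go
package main

import "fmt"

// opcode and instruction are hypothetical, simplified stand-ins for a
// pre-decoded instruction: everything needed to execute it is stored in
// plain Go fields, so the interpreter loop never has to consult LLVM.
type opcode int

const (
	opConst        opcode = iota // dst = imm
	opAdd                        // dst = a + b
	opCallExternal               // unknown at compile time
)

type instruction struct {
	op   opcode
	dst  int   // index of the destination local
	a, b int   // indices of the operand locals
	imm  int64 // immediate operand for opConst
}

// run interprets a list of pre-decoded instructions. It reports ok == false
// as soon as it meets an instruction that has to be executed at runtime.
func run(insts []instruction, numLocals int) (locals []int64, ok bool) {
	locals = make([]int64, numLocals)
	for _, inst := range insts {
		switch inst.op {
		case opConst:
			locals[inst.dst] = inst.imm
		case opAdd:
			locals[inst.dst] = locals[inst.a] + locals[inst.b]
		default:
			return nil, false
		}
	}
	return locals, true
}

func main() {
	prog := []instruction{
		{op: opConst, dst: 0, imm: 2},
		{op: opConst, dst: 1, imm: 40},
		{op: opAdd, dst: 2, a: 0, b: 1},
	}
	locals, ok := run(prog, 3)
	fmt.Println(locals, ok)
}
```

Because the operands are decoded once up front, a loop that executes the same instructions many times only pays the decoding cost once.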

## Memory

Memory is represented as objects (the `object` type) that contain data that
will eventually be stored in a global, and values (the `value` interface) that
can be worked with while running the interpreter. Values are therefore only
used locally and are always passed by value (just like most LLVM constants),
while objects represent the backing storage (like LLVM globals). Some values
are pointer values, and point to an object.
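
As a rough illustration of this split, the sketch below models objects as byte buffers and values as small Go types. Only the names `object` and `value` come from the text above; the fields, `intValue`, `pointerValue`, and `load` are assumptions for this example and do not mirror the actual definitions in memory.go.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// object is backing storage that would eventually become a global.
// The fields are assumptions for this sketch.
type object struct {
	name string
	data []byte
}

// value is anything the interpreter can hold in a local.
type value interface {
	size() int // size in bytes
}

// intValue is a plain 32-bit integer, passed around by value.
type intValue uint32

func (intValue) size() int { return 4 }

// pointerValue refers to an object plus a byte offset into it.
type pointerValue struct {
	index  int // which object is pointed to
	offset int // byte offset inside that object
}

func (pointerValue) size() int { return 8 } // assumes 64-bit pointers

// load reads a little-endian 32-bit integer through a pointer value.
func load(objects []object, p pointerValue) intValue {
	return intValue(binary.LittleEndian.Uint32(objects[p.index].data[p.offset:]))
}

func main() {
	objects := []object{{name: "foo", data: []byte{1, 0, 0, 0, 2, 0, 0, 0}}}
	p := pointerValue{index: 0, offset: 4}
	vals := []value{p, load(objects, p)}
	fmt.Println(vals[1], vals[0].size(), vals[1].size())
}
```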

Importantly, this partial evaluator can roll back the execution of a
function. This is implemented by creating a new memory view per function
activation, which makes sure that any change to a global (such as a store
instruction) is stored in the memory view. It creates a copy of the object
and stores that in the memory view to be modified. Once the function has
executed successfully, all these modified objects are copied into the memory
view of the parent function, up to the root function invocation, which (on
successful execution) writes the values back into the LLVM module. This way,
function invocations can be rolled back without leaving a trace.

Pointer values point to memory objects, but not to a particular memory
object. Every memory object is given an index, and pointers use that index to
look up the currently active object to load from, or to copy when storing to
it.
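
The copy-on-write behaviour and the index-based lookup can be sketched as follows. The `memoryView` type here is a simplified assumption written for this example, with hypothetical `get`, `store`, and `commit` methods; it is not the implementation in memory.go.

```go
package main

import "fmt"

// memoryView is a hypothetical copy-on-write layer on top of a parent view.
// One view is created per function activation.
type memoryView struct {
	parent  *memoryView
	objects map[int][]byte // object index -> copy owned by this view
}

func newView(parent *memoryView) *memoryView {
	return &memoryView{parent: parent, objects: map[int][]byte{}}
}

// get resolves an object index to the currently active copy: the nearest
// view in the chain that holds the object wins.
func (mv *memoryView) get(index int) []byte {
	for v := mv; v != nil; v = v.parent {
		if buf, ok := v.objects[index]; ok {
			return buf
		}
	}
	return nil
}

// store copies the object into this view first (if it has not been copied
// yet) and only then modifies the copy. It assumes the object exists.
func (mv *memoryView) store(index, offset int, b byte) {
	buf, ok := mv.objects[index]
	if !ok {
		buf = append([]byte(nil), mv.get(index)...)
		mv.objects[index] = buf
	}
	buf[offset] = b
}

// commit hands all modified objects to the parent view. In this sketch it
// must not be called on the root view.
func (mv *memoryView) commit() {
	for index, buf := range mv.objects {
		mv.parent.objects[index] = buf
	}
}

func main() {
	root := newView(nil)
	root.objects[0] = []byte{0, 0}

	failed := newView(root)
	failed.store(0, 0, 7) // never committed: dropping the view rolls it back

	succeeded := newView(root)
	succeeded.store(0, 1, 9)
	succeeded.commit()

	fmt.Println(root.get(0))
}
```

Because a pointer only holds the index `0`, the same pointer resolves to a different byte slice depending on which view it is looked up in, which is what makes dropping a view a complete rollback.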

Rolling back a function should roll back everything, including the few
instructions emitted at runtime. This is done by treating instructions much
like memory objects and removing the created instructions when necessary.

## Why is this necessary?

A partial evaluator is hard to get right, so why go through all the trouble of
writing one?

The answer is that globals with initializers are much easier for LLVM to
optimize than initialization code. There are a few other benefits as well:

  * Dead globals are trivial to optimize away.
  * Constant globals are easier to detect. Remember that Go does not have global
    constants in the same sense that C has them. Constants are useful because
    they can be propagated and provide some opportunities for other
    optimizations (like dead code elimination when branching on the contents of
    a global).
  * Constants are much more efficient on microcontrollers, as they can be
    allocated in flash instead of RAM.

The Go SSA package does not create constant initializers for globals.
Instead, it emits initialization functions, so if you write the following:

```go
var foo = []byte{1, 2, 3, 4}
```

It would generate something like this:

```go
var foo []byte

func init() {
    foo = make([]byte, 4)
    foo[0] = 1
    foo[1] = 2
    foo[2] = 3
    foo[3] = 4
}
```

This is of course hugely wasteful: it is much better to create `foo` as a
global array instead of initializing it at runtime.

For more details, see [this section of the
documentation](https://tinygo.org/compiler-internals/differences-from-go/).