github.com/bir3/gocompiler@v0.9.2202/src/cmd/compile/internal/ssa/README.md (about)

     1  <!---
     2  // Copyright 2018 The Go Authors. All rights reserved.
     3  // Use of this source code is governed by a BSD-style
     4  // license that can be found in the LICENSE file.
     5  -->
     6  
     7  ## Introduction to the Go compiler's SSA backend
     8  
     9  This package contains the compiler's Static Single Assignment form component. If
    10  you're not familiar with SSA, its [Wikipedia
    11  article](https://en.wikipedia.org/wiki/Static_single_assignment_form) is a good
    12  starting point.
    13  
    14  It is recommended that you first read [cmd/compile/README.md](../../README.md)
    15  if you are not familiar with the Go compiler already. That document gives an
    16  overview of the compiler, and explains what is SSA's part and purpose in it.
    17  
    18  ### Key concepts
    19  
    20  The names described below may be loosely related to their Go counterparts, but
    21  note that they are not equivalent. For example, a Go block statement has a
    22  variable scope, yet SSA has no notion of variables nor variable scopes.
    23  
    24  It may also be surprising that values and blocks are named after their unique
    25  sequential IDs. They rarely correspond to named entities in the original code,
    26  such as variables or function parameters. The sequential IDs also allow the
    27  compiler to avoid maps, and it is always possible to track back the values to Go
    28  code using debug and position information.
    29  
    30  #### Values
    31  
    32  Values are the basic building blocks of SSA. Per SSA's very definition, a
    33  value is defined exactly once, but it may be used any number of times. A value
    34  mainly consists of a unique identifier, an operator, a type, and some arguments.
    35  
    36  An operator or `Op` describes the operation that computes the value. The
    37  semantics of each operator can be found in `_gen/*Ops.go`. For example, `OpAdd8`
    38  takes two value arguments holding 8-bit integers and results in their addition.
    39  Here is a possible SSA representation of the addition of two `uint8` values:
    40  
    41  	// var c uint8 = a + b
    42  	v4 = Add8 <uint8> v2 v3
    43  
    44  A value's type will usually be a Go type. For example, the value in the example
    45  above has a `uint8` type, and a constant boolean value will have a `bool` type.
    46  However, certain types don't come from Go and are special; below we will cover
    47  `memory`, the most common of them.
    48  
    49  See [value.go](value.go) for more information.
    50  
    51  #### Memory types
    52  
    53  `memory` represents the global memory state. An `Op` that takes a memory
    54  argument depends on that memory state, and an `Op` which has the memory type
    55  impacts the state of memory. This ensures that memory operations are kept in the
    56  right order. For example:
    57  
    58  	// *a = 3
    59  	// *b = *a
    60  	v10 = Store <mem> {int} v6 v8 v1
    61  	v14 = Store <mem> {int} v7 v8 v10
    62  
    63  Here, `Store` stores its second argument (of type `int`) into the first argument
    64  (of type `*int`). The last argument is the memory state; since the second store
    65  depends on the memory value defined by the first store, the two stores cannot be
    66  reordered.
    67  
    68  See [cmd/compile/internal/types/type.go](../types/type.go) for more information.
    69  
    70  #### Blocks
    71  
    72  A block represents a basic block in the control flow graph of a function. It is,
    73  essentially, a list of values that define the operation of this block. Besides
    74  the list of values, blocks mainly consist of a unique identifier, a kind, and a
    75  list of successor blocks.
    76  
    77  The simplest kind is a `plain` block; it simply hands the control flow to
    78  another block, thus its successors list contains one block.
    79  
    80  Another common block kind is the `exit` block. These have a final value, called
    81  control value, which must return a memory state. This is necessary for functions
    82  to return some values, for example - the caller needs some memory state to
    83  depend on, to ensure that it receives those return values correctly.
    84  
    85  The last important block kind we will mention is the `if` block. It has a single
    86  control value that must be a boolean value, and it has exactly two successor
    87  blocks. The control flow is handed to the first successor if the bool is true,
    88  and to the second otherwise.
    89  
    90  Here is a sample if-else control flow represented with basic blocks:
    91  
    92  	// func(b bool) int {
    93  	// 	if b {
    94  	// 		return 2
    95  	// 	}
    96  	// 	return 3
    97  	// }
    98  	b1:
    99  	  v1 = InitMem <mem>
   100  	  v2 = SP <uintptr>
   101  	  v5 = Addr <*int> {~r1} v2
   102  	  v6 = Arg <bool> {b}
   103  	  v8 = Const64 <int> [2]
   104  	  v12 = Const64 <int> [3]
   105  	  If v6 -> b2 b3
   106  	b2: <- b1
   107  	  v10 = VarDef <mem> {~r1} v1
   108  	  v11 = Store <mem> {int} v5 v8 v10
   109  	  Ret v11
   110  	b3: <- b1
   111  	  v14 = VarDef <mem> {~r1} v1
   112  	  v15 = Store <mem> {int} v5 v12 v14
   113  	  Ret v15
   114  
   115  <!---
   116  TODO: can we come up with a shorter example that still shows the control flow?
   117  -->
   118  
   119  See [block.go](block.go) for more information.
   120  
   121  #### Functions
   122  
   123  A function represents a function declaration along with its body. It mainly
   124  consists of a name, a type (its signature), a list of blocks that form its body,
   125  and the entry block within said list.
   126  
   127  When a function is called, the control flow is handed to its entry block. If the
   128  function terminates, the control flow will eventually reach an exit block, thus
   129  ending the function call.
   130  
   131  Note that a function may have zero or multiple exit blocks, just like a Go
   132  function can have any number of return points, but it must have exactly one
   133  entry point block.
   134  
   135  Also note that some SSA functions are autogenerated, such as the hash functions
   136  for each type used as a map key.
   137  
   138  For example, this is what an empty function can look like in SSA, with a single
   139  exit block that returns an uninteresting memory state:
   140  
   141  	foo func()
   142  	  b1:
   143  	    v1 = InitMem <mem>
   144  	    Ret v1
   145  
   146  See [func.go](func.go) for more information.
   147  
   148  ### Compiler passes
   149  
   150  Having a program in SSA form is not very useful on its own. Its advantage lies
   151  in how easy it is to write optimizations that modify the program to make it
   152  better. The way the Go compiler accomplishes this is via a list of passes.
   153  
   154  Each pass transforms a SSA function in some way. For example, a dead code
   155  elimination pass will remove blocks and values that it can prove will never be
   156  executed, and a nil check elimination pass will remove nil checks which it can
   157  prove to be redundant.
   158  
   159  Compiler passes work on one function at a time, and by default run sequentially
   160  and exactly once.
   161  
   162  The `lower` pass is special; it converts the SSA representation from being
   163  machine-independent to being machine-dependent. That is, some abstract operators
   164  are replaced with their non-generic counterparts, potentially reducing or
   165  increasing the final number of values.
   166  
   167  <!---
   168  TODO: Probably explain here why the ordering of the passes matters, and why some
   169  passes like deadstore have multiple variants at different stages.
   170  -->
   171  
   172  See the `passes` list defined in [compile.go](compile.go) for more information.
   173  
   174  ### Playing with SSA
   175  
   176  A good way to see and get used to the compiler's SSA in action is via
   177  `GOSSAFUNC`. For example, to see func `Foo`'s initial SSA form and final
   178  generated assembly, one can run:
   179  
   180  	GOSSAFUNC=Foo go build
   181  
   182  The generated `ssa.html` file will also contain the SSA func at each of the
   183  compile passes, making it easy to see what each pass does to a particular
   184  program. You can also click on values and blocks to highlight them, to help
   185  follow the control flow and values.
   186  
   187  The value specified in GOSSAFUNC can also be a package-qualified function
   188  name, e.g.
   189  
   190  	GOSSAFUNC=blah.Foo go build
   191  
   192  This will match any function named "Foo" within a package whose final
   193  suffix is "blah" (e.g. something/blah.Foo, anotherthing/extra/blah.Foo).
   194  
   195  If non-HTML dumps are needed, append a "+" to the GOSSAFUNC value
   196  and dumps will be written to stdout:
   197  
   198  	GOSSAFUNC=Bar+ go build
   199  
   200  <!---
   201  TODO: need more ideas for this section
   202  -->
   203  
   204  ### Hacking on SSA
   205  
   206  While most compiler passes are implemented directly in Go code, some others are
   207  code generated. This is currently done via rewrite rules, which have their own
   208  syntax and are maintained in `_gen/*.rules`. Simpler optimizations can be written
   209  easily and quickly this way, but rewrite rules are not suitable for more complex
   210  optimizations.
   211  
   212  To read more on rewrite rules, have a look at the top comments in
   213  [_gen/generic.rules](_gen/generic.rules) and [_gen/rulegen.go](_gen/rulegen.go).
   214  
   215  Similarly, the code to manage operators is also code generated from
   216  `_gen/*Ops.go`, as it is easier to maintain a few tables than a lot of code.
   217  After changing the rules or operators, run `go generate cmd/compile/internal/ssa`
   218  to generate the Go code again.
   219  
   220  <!---
   221  TODO: more tips and info could likely go here
   222  -->