     1  // Copyright 2016 Google Inc. All Rights Reserved.
     2  //
     3  // Licensed under the Apache License, Version 2.0 (the "License");
     4  // you may not use this file except in compliance with the License.
     5  // You may obtain a copy of the License at
     6  //
     7  //     http://www.apache.org/licenses/LICENSE-2.0
     8  //
     9  // Unless required by applicable law or agreed to in writing, software
    10  // distributed under the License is distributed on an "AS IS" BASIS,
    11  // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    12  // See the License for the specific language governing permissions and
    13  // limitations under the License.
    14  
    15  /*
    16  Package grumpy is the Grumpy runtime's Python API, analogous to CPython's C API.
    17  
    18  Data model
    19  
    20  All Python objects are represented by structs that are binary compatible with
    21  grumpy.Object, so for example the result of the Python expression "object()" is
    22  just an Object pointer. More complex primitive types like str and dict are
    23  represented by structs that augment Object by embedding it as their first field
    24  and holding other data in subsequent fields.  These augmented structs can
    25  themselves be embedded for yet more complex types.
    26  
    27  Objects contain a pointer to their Python type, represented by grumpy.Type, and
    28  a pointer to their attribute dict, represented by grumpy.Dict. This dict may be
    29  nil as in the case of str or non-nil as in the case of type objects. Note that
    30  Grumpy objects do not have a refcount since Grumpy relies on Go's garbage
    31  collection to manage object lifetimes.
    32  
    33  Every Type object holds references to all its base classes as well as every
    34  class in its MRO list.
    35  
    36  Grumpy types also hold a reflect.Type instance known as the type's "basis".  A
    37  type's basis represents the Go struct used to store instances of the type. It
    38  is an important invariant of the Grumpy runtime that an instance of a
    39  particular Python type is stored in the Go struct that is that type's basis.
    40  Violation of this invariant would mean that, for example, a str object could
    41  end up being stored in an unaugmented Object and accessing the str's value
    42  would access invalid memory. This invariant is enforced by Grumpy's API for
    43  primitive types and user defined classes.
    44  
    45  Upcasting and downcasting along the basis hierarchy is sometimes necessary, for
    46  example when passing a Str to a function accepting an Object. Upcasts are
    47  accomplished by accessing the embedded base type basis of the subclass, e.g.
    48  accessing the Object member of the Str struct. Downcasting requires
    49  unsafe.Pointer conversions. The safety of these conversions is guaranteed by
    50  the invariant discussed above. E.g. it is valid to cast an *Object with type
    51  StrType to a *Str because it was allocated with storage represented by
    52  StrType's basis, which is struct Str.
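
As a rough illustration of the layout and casting rules above, the following
self-contained sketch uses simplified stand-ins for Object, Type and Str. The
lowercase helpers (newObject, toObject, toStr) and the field names are
assumptions made for this example and are not the runtime's actual definitions.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

// Type is a simplified stand-in for grumpy.Type: a name plus its basis.
type Type struct {
	name  string
	basis reflect.Type
}

// Object is a simplified stand-in for grumpy.Object.
type Object struct {
	typ *Type
}

// Str augments Object by embedding it as its first field.
type Str struct {
	Object
	value string
}

var (
	objectType = &Type{name: "object", basis: reflect.TypeOf(Object{})}
	strType    = &Type{name: "str", basis: reflect.TypeOf(Str{})}
)

// newObject allocates storage described by t's basis, so the instance is
// always backed by the right struct, and returns it as an *Object.
func newObject(t *Type) *Object {
	o := (*Object)(unsafe.Pointer(reflect.New(t.basis).Pointer()))
	o.typ = t
	return o
}

// toObject upcasts by taking the address of the embedded Object.
func (s *Str) toObject() *Object { return &s.Object }

// toStr downcasts; ok is false when o's basis is not the Str struct.
func toStr(o *Object) (*Str, bool) {
	if o.typ.basis != strType.basis {
		return nil, false
	}
	return (*Str)(unsafe.Pointer(o)), true
}

func main() {
	o := newObject(strType)
	s, _ := toStr(o)
	s.value = "foo"
	// The upcast yields the same address because Object is the first field.
	fmt.Println(s.toObject() == o, s.value)
}
```

The basis check in toStr stands in for the invariant: the unsafe conversion is
only attempted when the object is known to have been allocated as a Str.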
    53  
    54  Execution model
    55  
    56  User defined Python code blocks (modules, classes and functions) are
    57  implemented as Go closures with a state machine that allows the body of the
    58  block to be re-entered for exception handling, yield statements, etc. The
    59  generated code for the body of a code block looks something like this:
    60  
    61  	01:	func(f *Frame) (*Object, *BaseException) {
    62  	02:		switch f.State() {
    63  	03:		case 0: goto Label0
    64  	04:		case 1: goto Label1
    65  	05:		...
    66  	06:		}
    67  	07:	Label0:
    68  	08:		...
    69  	09:	Label1:
    70  	10:		...
    71  	11:	...
    72  	12:	}
    73  
    74  Frame is the basis type for Grumpy's "frame" objects and is very similar to
    75  CPython's type of the same name. The first argument f, therefore, represents a
    76  level in the Python stack. Upon entry into the body, the frame's state variable
    77  is checked and control jumps to the appropriate label. Upon first entry, the
    78  state variable will be 0 and so execution will start at Label0. Later
    79  invocations may start at other labels. For example, an exception raised in the
    80  try block of a try/finally will cause the function above to return an exception
    81  as its second return value. The caller will then set state to the label
    82  corresponding to the finally clause and call back into the body.
    83  
    84  Python exceptions are represented by the BaseException basis struct. Grumpy API
    85  functions and generated code blocks propagate exceptions by returning
    86  *BaseException as their last return value. Exceptions are raised with the
    87  Frame.Raise*() methods which create exception objects to be propagated and set
    88  the exc info indicator for the current frame stack, similar to CPython. Python
    89  except clauses down the stack can then handle the propagated exception.
    90  
    91  Each generated body function is owned by a Block struct that is very similar to
    92  CPython's code object. Each Block has a name (e.g. the class' name) and the
    93  filename where the Python code was defined. A block is invoked via the
    94  *Block.Exec method which pushes a new frame on the call stack and then
    95  repeatedly calls the body function. This interplay is depicted below:
    96  
    97  	 *Block.Exec
    98  	 --> +-+
    99  	     | | block func
   100  	     |1| --> +-+
   101  	     | |     |2|
   102  	     | | <-- +-+
   103  	     | | --> +-+
   104  	     | |     |2|
   105  	     | | <-- +-+
   106  	 <-- +-+
   107  
   108  	1. *Block.Exec repeatedly calls block function until finished or an
   109  	   unhandled exception is encountered
   110  
   111  	2. Dispatch switch passes control to appropriate part of block function
   112  	   and executes
   113  
   114  When the body returns with a nil exception, the accompanying value is
   115  returned from the block. If an exception is returned then the "checkpoint
   116  stack" is examined. This data structure stores recovery points within the body
   117  that need to be executed when an exception occurs. Expanding on the try/finally
   118  example above, when an exception is raised in the try clause, the finally
   119  checkpoint is popped off the stack and its value is assigned to state. Body
   120  then gets called again and control is passed to the finally label.
   121  
   122  To make things concrete, here is a block of code containing a
   123  try/finally:
   124  
   125  	01:	try:
   126  	02:		print "foo"
   127  	03:	finally:
   128  	04:		print "bar"
   129  
   130  The generated code for this snippet would look something like this:
   131  
   132  	01:	func(f *Frame) (*Object, *BaseException) {
   133  	02:		switch f.State() {
   134  	03:		case 0: goto Label0
   135  	04:		case 1: goto Label1
   136  	05:		}
   137  	06:	Label0:
   138  	07:		// line 1: try:
   139  	08:		f.PushCheckpoint(1)
   140  	09:		// line 2: print "foo"
   141  	10:		raised = Print(f, []*Object{NewStr("foo").ToObject()})
   142  	11:		if raised != nil {
   143  	12:			return nil, raised
   144  	13:		}
   145  	14:		f.PopCheckpoint()
   146  	15:	Label1:
   147  	16:		exc, tb = f.RestoreExc(nil, nil)
   148  	17:		// line 4: print "bar"
   149  	18:		raised = Print(f, []*Object{NewStr("bar").ToObject()})
   150  	19:		if raised != nil {
   151  	20:			return nil, raised
   152  	21:		}
   153  	22:		if exc != nil {
   154  	23:			return nil, f.Raise(exc, nil, tb)
   155  	24:		}
   156  	25:		return None, nil
   157  	26:	}
   158  
   159  There are a few relevant things worth noting here:
   160  
   161  1. Upon entering the try clause on line 8, a checkpoint pointing to Label1 (the
   162     finally clause) is pushed onto the stack. If the try clause does not raise,
   163     the checkpoint is popped on line 14 and control falls through to Label1
   164     without having to re-enter the body function.
   165  
   166  2. Lines 10 and 18 are the two print statements. Exceptions raised during
   167     execution of these statements are returned immediately. In general,
   168     Python statements map to one or more Grumpy API function calls which may
   169     propagate exceptions.
   170  
   171  3. The finally clause begins on line 16, where the exception indicator is
   172     cleared and its original value is saved so that it can be re-raised at the
   173     end of the clause. This matches CPython's behavior where exc info is cleared
   174     during the finally block.
   175  
   176  A stack is used to store checkpoints because checkpoints can be nested.
   177  Continuing the example above, the try/finally itself could be nested inside the
   178  try clause of a try/except, e.g.:
   179  
   180  	01:	try:
   181  	02:		try:
   182  	03:			print "foo"
   183  	04:		finally:
   184  	05:			print "bar"
   185  	06:	except SomeException:
   186  	07:		print "baz"
   187  
   188  Once the finally clause completes, it re-raises the exception and control is
   189  passed to the except handler label because it's next in the checkpoint stack.
   190  If the exception is an instance of SomeException then execution continues
   191  within the except clause. If it is some other kind of exception then it will be
   192  returned and control will be passed to the caller to find another checkpoint or
   193  unwind the call stack.
   194  
   195  Call model
   196  
   197  Python callables are represented by the Function basis struct and the
   198  corresponding Python "function" type. As in CPython, class methods and global
   199  functions are instances of this type. Associated with each instance is a Go
   200  function with the signature:
   201  
   202  	func(f *Frame, args Args, kwargs KWArgs) (*Object, *BaseException)
   203  
   204  The args slice and kwargs dict contain the positional and keyword arguments
   205  provided by the caller. Both builtin functions and those in user defined Python
   206  code are called using this convention. The latter, however, are wrapped in a
   207  layer represented by the FunctionSpec struct that validates arguments and
   208  substitutes absent keyword parameters with their defaults. Once the arguments
   209  are validated, the spec passes control to the spec function:
   210  
   211  	func(f *Frame, args []*Object) (*Object, *BaseException)
   212  
   213  Here, the args slice contains an element for each argument present in the
   214  Python function's parameter list, in the same order. Every value is non-nil
   215  since default values have been substituted where necessary by the function
   216  spec. If parameters with the * or ** specifiers are present in the function
   217  signature, they are the last element(s) in args and hold any extra positional
   218  or keyword arguments provided by the caller.
   219  
   220  Generated code within the spec function consists of three main parts:
   221  
   222  	+----------------------+
   223  	| Spec func            |
   224  	| ---------            |
   225  	| Declare locals       |
   226  	| Declare temporaries  |
   227  	| +------------------+ |
   228  	| | Body func        | |
   229  	| | ----------       | |
   230  	| | Dispatch switch  | |
   231  	| | Labels           | |
   232  	| +------------------+ |
   233  	| Block.Exec(body)     |
   234  	+----------------------+
   235  
   236  Locals and temporaries are defined as local variables at the top of the spec
   237  function. Below that, the body function is defined which is stateless except
   238  for what it inherits from its enclosing scope and from the passed frame. This
   239  is important because the body function will be repeatedly re-entered, but all
   240  of the state will have a lifetime longer than any particular invocation because
   241  it belongs to the spec function's scope. Finally, *Block.Exec is called which
   242  drives the state machine, calling into the body function as appropriate.
   243  
   244  Generator functions work much the same way except that instead of calling Exec
   245  on the block directly, the block is returned and the generator's next() method
   246  calls Exec until its contents are exhausted.
   247  
   248  */
   249  package grumpy