// Copyright 2016 Google Inc. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

/*
Package grumpy is the Grumpy runtime's Python API, analogous to CPython's C API.

Data model

All Python objects are represented by structs that are binary compatible with
grumpy.Object, so for example the result of the Python expression "object()" is
just an Object pointer. More complex primitive types like str and dict are
represented by structs that augment Object by embedding it as their first field
and holding other data in subsequent fields. These augmented structs can
themselves be embedded for yet more complex types.

Objects contain a pointer to their Python type, represented by grumpy.Type, and
a pointer to their attribute dict, represented by grumpy.Dict. This dict may be
nil as in the case of str or non-nil as in the case of type objects. Note that
Grumpy objects do not have a refcount since Grumpy relies on Go's garbage
collection to manage object lifetimes.

Every Type object holds references to all its base classes as well as every
class in its MRO list.

Grumpy types also hold a reflect.Type instance known as the type's "basis". A
type's basis represents the Go struct used to store instances of the type.
It is an important invariant of the Grumpy runtime that an instance of a
particular Python type is stored in the Go struct that is that type's basis.
Violation of this invariant would mean that, for example, a str object could
end up being stored in an unaugmented Object and accessing the str's value
would access invalid memory. This invariant is enforced by Grumpy's API for
primitive types and user defined classes.

Upcasting and downcasting along the basis hierarchy is sometimes necessary, for
example when passing a Str to a function accepting an Object. Upcasts are
accomplished by accessing the embedded base type basis of the subclass, e.g.
accessing the Object member of the Str struct. Downcasting requires
unsafe.Pointer conversions. The safety of these conversions is guaranteed by
the invariant discussed above. E.g. it is valid to cast an *Object with type
StrType to a *Str because it was allocated with storage represented by
StrType's basis, which is struct Str.

Execution model

User defined Python code blocks (modules, classes and functions) are
implemented as Go closures with a state machine that allows the body of the
block to be re-entered for exception handling, yield statements, etc. The
generated code for the body of a code block looks something like this:

    func(f *Frame) (*Object, *BaseException) {
        switch f.State() {
        case 0: goto Label0
        case 1: goto Label1
        ...
        }
    Label0:
        ...
    Label1:
        ...
    ...
    }

Frame is the basis type for Grumpy's "frame" objects and is very similar to
CPython's type of the same name. The first argument f, therefore, represents a
level in the Python stack. Upon entry into the body, the frame's state variable
is checked and control jumps to the appropriate label. Upon first entry, the
state variable will be 0, so execution will start at Label0.
Later invocations may start at other labels. For example, an exception raised
in the try block of a try/finally will cause the function above to return an
exception as its second return value. The caller will then set state to the
label corresponding to the finally clause and call back into the body.

Python exceptions are represented by the BaseException basis struct. Grumpy API
functions and generated code blocks propagate exceptions by returning
*BaseException as their last return value. Exceptions are raised with the
Frame.Raise*() methods which create exception objects to be propagated and set
the exc info indicator for the current frame stack, similar to CPython. Python
except clauses down the stack can then handle the propagated exception.

Each generated body function is owned by a Block struct that is very similar to
CPython's code object. Each Block has a name (e.g. the class' name) and the
filename where the Python code was defined. A block is invoked via the
*Block.Exec method which pushes a new frame on the call stack and then
repeatedly calls the body function. This interplay is depicted below:

     *Block.Exec
    --> +-+
        | | block func
        |1| --> +-+
        | |     |2|
        | | <-- +-+
        | | --> +-+
        | |     |2|
        | | <-- +-+
    <-- +-+

    1. *Block.Exec repeatedly calls block function until finished or an
       unhandled exception is encountered

    2. Dispatch switch passes control to appropriate part of block function
       and executes

When the body returns with a nil exception, the accompanying value is the value
returned from the block. If an exception is returned then the "checkpoint
stack" is examined. This data structure stores recovery points within body
that need to be executed when an exception occurs.
Expanding on the try/finally example above, when an exception is raised in the
try clause, the finally checkpoint is popped off the stack and its value is
assigned to state. Body then gets called again and control is passed to the
finally label.

To make things concrete, here is a block of code containing a try/finally:

    01: try:
    02:     print "foo"
    03: finally:
    04:     print "bar"

The generated code for this snippet would look something like this:

    01: func(f *Frame) (*Object, *BaseException) {
    02:     switch f.State() {
    03:     case 0: goto Label0
    04:     case 1: goto Label1
    05:     }
    06: Label0:
    07:     // line 1: try:
    08:     f.PushCheckpoint(1)
    09:     // line 2: print "foo"
    10:     raised = Print(f, []*Object{NewStr("foo").ToObject()})
    11:     if raised != nil {
    12:         return nil, raised
    13:     }
    14:     f.PopCheckpoint()
    15: Label1:
    16:     exc, tb = f.RestoreExc(nil, nil)
    17:     // line 4: print "bar"
    18:     raised = Print(f, []*Object{NewStr("bar").ToObject()})
    19:     if raised != nil {
    20:         return nil, raised
    21:     }
    22:     if exc != nil {
    23:         return nil, f.Raise(exc, nil, tb)
    24:     }
    25:     return None, nil
    26: }

There are a few relevant things worth noting here:

1. Upon entering the try clause on line 8, a checkpoint pointing to Label1 (the
   finally clause) is pushed onto the stack. If the try clause does not raise,
   the checkpoint is popped on line 14 and control falls through to Label1
   without having to re-enter the body function.

2. Lines 10 and 18 are the two print statements. Exceptions raised during
   execution of these statements are returned immediately. In general,
   Python statements map to one or more Grumpy API function calls which may
   propagate exceptions.

3. The finally clause begins on line 16, where the exception indicator is
   cleared and its original value is saved so that it can be re-raised at the
   end of the clause. This matches CPython's behavior where exc info is
   cleared during the finally block.

A stack is used to store checkpoints because checkpoints can be nested.
Continuing the example above, the try/finally itself could be nested inside a
try/except, e.g.:

    try:
        try:
            print "foo"
        finally:
            print "bar"
    except SomeException:
        print "baz"

Once the finally clause completes, it re-raises the exception and control is
passed to the except handler label because it's next in the checkpoint stack.
If the exception is an instance of SomeException then execution continues
within the except clause. If it is some other kind of exception then it will be
returned and control will be passed to the caller to find another checkpoint or
unwind the call stack.

Call model

Python callables are represented by the Function basis struct and the
corresponding Python "function" type. As in CPython, class methods and global
functions are instances of this type. Associated with each instance is a Go
function with the signature:

    func(f *Frame, args Args, kwargs KWArgs) (*Object, *BaseException)

The args slice and kwargs dict contain the positional and keyword arguments
provided by the caller. Both builtin functions and those in user defined Python
code are called using this convention; however, the latter are wrapped in a
layer represented by the FunctionSpec struct that validates arguments and
substitutes absent keyword parameters with their defaults.
Once the arguments are validated, the spec passes control to the spec function:

    func(f *Frame, args []*Object) (*Object, *BaseException)

Here, the args slice contains an element for each argument present in the
Python function's parameter list, in the same order. Every value is non-nil
since default values have been substituted where necessary by the function
spec. If parameters with the * or ** specifiers are present in the function
signature, they are the last element(s) in args and hold any extra positional
or keyword arguments provided by the caller.

Generated code within the spec function consists of three main parts:

    +----------------------+
    | Spec func            |
    | ---------            |
    | Declare locals       |
    | Declare temporaries  |
    | +------------------+ |
    | | Body func        | |
    | | ----------       | |
    | | Dispatch switch  | |
    | | Labels           | |
    | +------------------+ |
    | Block.Exec(body)     |
    +----------------------+

Locals and temporaries are defined as local variables at the top of the spec
function. Below that, the body function is defined which is stateless except
for what it inherits from its enclosing scope and from the passed frame. This
is important because the body function will be repeatedly re-entered, but all
of the state will have a lifetime longer than any particular invocation because
it belongs to the spec function's scope. Finally, *Block.Exec is called which
drives the state machine, calling into the body function as appropriate.

Generator functions work much the same way except that instead of calling Exec
on the block directly, the block is returned and the generator's next() method
calls Exec until its contents are exhausted.

*/
package grumpy