---
title: Bigslice - implementation
layout: default
---

# About the Bigslice implementation
{:.no_toc}

In this document,
we'll attempt to describe some of the high-level
implementation details of Bigslice.
The goal of this is to help the user understand
the internals of Bigslice,
and to help implementors of new slice operations.

* ToC
{:toc}

# What is a `bigslice.Slice`?

A [`bigslice.Slice`](https://godoc.org/github.com/grailbio/bigslice#Slice)
represents a collection of rows of data. `bigslice.Slice` values
are typed, and contain one or more columns of data.
By convention,
we write the type schematically using a Java-style generics syntax.
For example,
the type `Slice<string, int>` describes a `bigslice.Slice`
with two columns:
the first is string-typed;
the second is integer-typed.

Bigslice slices are *sharded*:
their underlying dataset is split into a number of underlying partitions.
`bigslice.Slice` is an interface,
and the user may implement custom `bigslice.Slice`s.

```
type Slice interface {
	slicetype.Type

	// Name returns a unique (composite) name for this Slice that also has
	// useful context for diagnostic or status display.
	Name() Name

	// NumShard returns the number of shards in this Slice.
	NumShard() int
	// ShardType returns the sharding type of this Slice.
	ShardType() ShardType

	// NumDep returns the number of dependencies of this Slice.
	NumDep() int
	// Dep returns the i'th dependency for this Slice.
	Dep(i int) Dep

	// Combiner is an optional function that is used to combine multiple
	// values with the same key from the slice's output. No combination
	// is performed if nil.
	Combiner() *reflect.Value

	// Reader returns a Reader for a shard of this Slice.
	// The reader itself computes the shard's values on demand. The caller
	// must provide Readers for all of this shard's dependencies, constructed
	// according to the dependency type (see Dep).
	Reader(shard int, deps []sliceio.Reader) sliceio.Reader
}
```

A `bigslice.Slice` may declare dependencies on other slices.
At runtime,
these dependencies are materialized by the Bigslice pipeline
and provided as input to the `Reader` method.

The kernel of a slice operation is `Reader`:
it is invoked at runtime to produce the actual rows
computed by the slice operation.
The Bigslice runtime provides materialized
[readers](https://godoc.org/github.com/grailbio/bigslice/sliceio#Reader)
for each of the slice's dependencies;
the returned reader is the output of the operation.

`sliceio.Reader` is analogous to `io.Reader`,
but operates on a
[`frame.Frame`](https://godoc.org/github.com/grailbio/bigslice/frame#Frame),
which is typed according to the slice.
The `sliceio.Reader` implementation is responsible for
filling the provided frame with up to `frame.Len()` rows of output.

```
type Reader interface {
	// Read reads a vector of records from the underlying Slice. Each
	// passed-in column should be a value containing a slice of column
	// values. The number of columns should match the number of columns
	// in the slice; their types should match the corresponding column
	// types of the slice. Each column should have the same slice
	// length.
	//
	// Read returns the total number of records read, or an error. When
	// no more records are available, Read returns EOF. Read may return
	// EOF when n > 0. In this case, n records were read, but no more
	// are available.
	//
	// Read should never reuse any allocated memory in the frame;
	// its callers should not mutate the data returned.
	//
	// Read should not be called concurrently.
	Read(ctx context.Context, frame frame.Frame) (int, error)
}
```

Frames are pre-allocated and managed by the Bigslice runtime.
They are laid out in a columnar fashion,
so the underlying data layout can be exploited for locality.

# Frames

[Frames](https://godoc.org/github.com/grailbio/bigslice/frame#Frame)
are used within Bigslice to store data and operate on it.
Frames represent a rectangular data frame,
comprising one or more columns
and one or more rows.
Frames follow the semantics of Go's slices.
That is,
each frame is a descriptor of an underlying set of typed arrays.
Each frame stores a pointer to the underlying data,
together with its offset, length, and capacity.
Thus, frames may be appended, copied, and sub-sliced
in the manner of Go slices.
(However, the set of columns remains fixed once a Frame has been created.)
Frames also store the type of each column
so that operations may be type-checked at runtime.

Frames implement a columnar memory layout:
that is,
each column in a Frame is an independent, contiguous array.
As an example, consider a 3-column Frame
with types `A`, `B`, and `C`
of length 4.
This frame has the following memory layout: `AAAABBBBCCCC`.
(A row-based layout would have the layout `ABCABCABCABC`.)

In addition to managing storage,
Frames provide a set of [type-driven operations](https://godoc.org/github.com/grailbio/bigslice/frame#Ops)
that are required to implement various aspects of Bigslice.
For example,
operations are required for hashing and sorting data
(e.g., for a reduce or group-by operation);
Ops also allow users to supply custom
encoding and decoding functions
(by default, Bigslice will use [gob](https://godoc.org/encoding/gob)).
User-provided operations are registered through
[frame.RegisterOps](https://godoc.org/github.com/grailbio/bigslice/frame#RegisterOps).

Frames thus provide a mechanism to efficiently
and safely manage and operate on rectangular data.
Frames expose operations in a type-oblivious way,
so that algorithms can be generalized.
For example,
the Bigslice [combiner](https://github.com/grailbio/bigslice/blob/cafa2ff6e7ea96fa4d094a9f2149109825b3774a/exec/combiner.go#L148)
implements a hash table on top of frames,
without any knowledge of the types of the values it contains.

# What is a `bigslice.FuncValue`?

Bigslice performs computations which are defined as functions that return a
`bigslice.Slice`, e.g.:
```
func wordCount(url string) bigslice.Slice {
	...
}
```

To distribute computation, Bigslice invokes these functions on remote executors
running in different processes. However, because Go provides no convenient way
to serialize executable code for remote execution, these functions are
represented as
[`bigslice.FuncValue`](https://pkg.go.dev/github.com/grailbio/bigslice#FuncValue)s,
created by
[`bigslice.Func`](https://pkg.go.dev/github.com/grailbio/bigslice#Func).
`bigslice.Func` builds a global registry of `FuncValue`s that is identical
across processes, requiring callers to call it in deterministic order. This
registry allows Bigslice to refer to the same function across process
boundaries by index in the registry, so instead of serializing executable code,
Bigslice serializes the index. Consequently, a full invocation of a function,
i.e. the function and its arguments, is represented by a
[`bigslice.Invocation`](https://pkg.go.dev/github.com/grailbio/bigslice#Invocation),
which is also serializable.

# Tasks
`bigslice.Slice`s and their dependencies form a directed acyclic graph.
To compute a slice's contents, this graph is
[compiled](https://github.com/grailbio/bigslice/blob/79c34a735576b13527741b003c10f52150ebe081/exec/compile.go#L111)
into a corresponding directed acyclic graph of
[exec.Task](https://pkg.go.dev/github.com/grailbio/bigslice/exec#Task)s. A
`Task` is the unit of computation in Bigslice: tasks are scheduled by Bigslice
to run, possibly remotely and in parallel, to compute slice contents. Each
edge in the graph represents a dependency between tasks. For example, a single
task may perform a `Map` transformation, and it would depend on the task that
computed the shard of data to be mapped.

The [exec.Task](https://pkg.go.dev/github.com/grailbio/bigslice/exec#Task)
structure represents both the graph and the execution state of the graph, e.g.
whether the task has been successfully computed.

The
[`exec.Eval`](https://pkg.go.dev/github.com/grailbio/bigslice/exec#Eval)
function computes task graphs from the provided set of roots, dispatching to
the given
[`exec.Executor`](https://pkg.go.dev/github.com/grailbio/bigslice/exec#Executor)
to compute individual tasks when their dependencies have been satisfied.
`Eval` also reschedules tasks if their results are lost due to operational
faults, e.g. the loss of a remote machine. `(exec.Executor).Run` runs the
given task, providing the data from its dependencies.

A `Task` performs its computation in its `Do` function:
```
type Task struct {
	Do func([]sliceio.Reader) sliceio.Reader
	...
}
```

Data from its dependencies are provided by the input slice of readers. The
returned reader reads out the result of the computation (into a `Frame`).