---
title: Bigslice - implementation
layout: default
---

# About the Bigslice implementation
{:.no_toc}

In this document,
we describe some of the high-level
implementation details of Bigslice.
The goal is to help users understand
the internals of Bigslice,
and to help implementors of new slice operations.

* ToC
{:toc}

# What is a `bigslice.Slice`?

A [`bigslice.Slice`](https://godoc.org/github.com/grailbio/bigslice#Slice)
represents a collection of rows of data. `bigslice.Slice` values
are typed, and contain one or more columns of data.
By convention,
we write the type schematically using a Java-style generics syntax.
For example,
the type `Slice<string, int>` describes a `bigslice.Slice`
with two columns:
the first is string-typed;
the second is integer-typed.

Bigslice slices are *sharded*:
their underlying dataset is split into a number of partitions.
`bigslice.Slice` is an interface,
and the user may implement custom `bigslice.Slice`s.

```
type Slice interface {
	slicetype.Type

	// Name returns a unique (composite) name for this Slice that also has
	// useful context for diagnostic or status display.
	Name() Name

	// NumShard returns the number of shards in this Slice.
	NumShard() int
	// ShardType returns the sharding type of this Slice.
	ShardType() ShardType

	// NumDep returns the number of dependencies of this Slice.
	NumDep() int
	// Dep returns the i'th dependency for this Slice.
	Dep(i int) Dep

	// Combiner is an optional function that is used to combine multiple
	// values with the same key from the slice's output. No combination
	// is performed if nil.
	Combiner() *reflect.Value

	// Reader returns a Reader for a shard of this Slice. The reader
	// itself computes the shard's values on demand. The caller must
	// provide Readers for all of this shard's dependencies, constructed
	// according to the dependency type (see Dep).
	Reader(shard int, deps []sliceio.Reader) sliceio.Reader
}
```

A `bigslice.Slice` may declare dependencies on other slices.
At runtime,
these dependencies are materialized by the Bigslice pipeline
and provided as input to the `Reader` method.

The kernel of a slice operation is `Reader`:
it is invoked at runtime to produce the actual rows
computed by the slice operation.
The Bigslice runtime provides materialized
[readers](https://godoc.org/github.com/grailbio/bigslice/sliceio#Reader)
for each of the slice's dependencies;
the returned reader is the output of the operation.

`sliceio.Reader` is analogous to `io.Reader`,
but operates on a
[`frame.Frame`](https://godoc.org/github.com/grailbio/bigslice/frame#Frame),
which is typed according to the slice.
The `sliceio.Reader` implementation is responsible for
filling the provided frame with up to `frame.Len()` rows of output.

```
type Reader interface {
	// Read reads a vector of records from the underlying Slice. Each
	// passed-in column should be a value containing a slice of column
	// values. The number of columns should match the number of columns
	// in the slice; their types should match the corresponding column
	// types of the slice. Each column should have the same slice
	// length.
	//
	// Read returns the total number of records read, or an error. When
	// no more records are available, Read returns EOF. Read may return
	// EOF when n > 0. In this case, n records were read, but no more
	// are available.
	//
	// Read should never reuse any allocated memory in the frame;
	// its callers should not mutate the data returned.
	//
	// Read should not be called concurrently.
	Read(ctx context.Context, frame frame.Frame) (int, error)
}
```

Frames are pre-allocated and managed by the Bigslice runtime.
They are laid out in a columnar fashion,
so the underlying data layout can be exploited for locality.

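The reader contract above can be sketched without Bigslice itself.
The example below is only an illustration under stated assumptions:
`intFrame`, `reader`, `rangeReader`, `mapReader`, and `readAll` are hypothetical names,
a plain `[]int` stands in for `frame.Frame`,
and `io.EOF` stands in for the EOF value that `Read` returns.
It shows a source reader, a dependent reader that transforms its input,
and a caller that keeps the final `n > 0` rows delivered together with EOF.

```go
package main

import (
	"context"
	"fmt"
	"io"
)

// intFrame is a hypothetical stand-in for frame.Frame: one int column.
type intFrame []int

// reader mirrors the shape of sliceio.Reader for intFrame.
type reader interface {
	Read(ctx context.Context, f intFrame) (int, error)
}

// rangeReader produces the integers [next, end), like a source shard.
type rangeReader struct{ next, end int }

// Read fills f with up to len(f) rows and returns the number written.
func (r *rangeReader) Read(ctx context.Context, f intFrame) (int, error) {
	if err := ctx.Err(); err != nil {
		return 0, err
	}
	n := 0
	for n < len(f) && r.next < r.end {
		f[n] = r.next
		r.next++
		n++
	}
	if r.next >= r.end {
		return n, io.EOF // EOF may accompany the final n > 0 rows
	}
	return n, nil
}

// mapReader applies fn to every row read from its single dependency.
type mapReader struct {
	dep reader
	fn  func(int) int
}

// Read pulls a batch from the dependency and transforms it in place.
func (m *mapReader) Read(ctx context.Context, f intFrame) (int, error) {
	n, err := m.dep.Read(ctx, f)
	for i := 0; i < n; i++ {
		f[i] = m.fn(f[i])
	}
	return n, err
}

// readAll drains r in batches of the given size, which must be positive.
func readAll(ctx context.Context, r reader, batch int) ([]int, error) {
	var out []int
	f := make(intFrame, batch)
	for {
		n, err := r.Read(ctx, f)
		out = append(out, f[:n]...) // keep the n rows even when err is io.EOF
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
	}
}

// demo reads a 5-row source through a squaring map, two rows at a time.
func demo() ([]int, error) {
	src := &rangeReader{next: 0, end: 5}
	squares := &mapReader{dep: src, fn: func(x int) int { return x * x }}
	return readAll(context.Background(), squares, 2)
}

func main() {
	rows, err := demo()
	fmt.Println(rows, err) // [0 1 4 9 16] <nil>
}
```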
# Frames

[Frames](https://godoc.org/github.com/grailbio/bigslice/frame#Frame)
are used within Bigslice to store data and operate on it.
Frames represent a rectangular data frame,
comprising one or more columns
and one or more rows.
Frames follow the semantics of Go's slices.
That is,
each frame is a descriptor of an underlying set of typed arrays.
Each frame stores a pointer to the underlying data,
together with its offset, length, and capacity.
Thus, frames may be appended, copied, and sub-sliced
in the manner of Go slices.
(However, the set of columns remains fixed once a Frame has been created.)
Frames also store the type of each column
so that operations may be type-checked at runtime.

Frames implement a columnar memory layout:
that is,
each column in a Frame is an independent, contiguous array.
As an example, consider a 3-column Frame
with types `A`, `B`, and `C`
of length 4.
This frame has the following memory layout: `AAAABBBBCCCC`.
(A row-based layout would have the layout `ABCABCABCABC`.)

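The slice-like semantics and the columnar layout can be illustrated with ordinary Go slices.
This is a minimal sketch, not the `frame` package:
the `columns` type and its methods are hypothetical,
and it fixes two columns (`string` and `int`) where a real Frame is typed at runtime.
It shows that sub-slicing yields a new descriptor over shared storage,
and that a scan over one column never touches the other.

```go
package main

import "fmt"

// columns is a hypothetical two-column frame. Each column is its own
// contiguous Go slice, so the layout is AAAA BBBB rather than ABABABAB.
type columns struct {
	words  []string
	counts []int
}

// Len returns the number of rows in the frame.
func (c columns) Len() int { return len(c.words) }

// Slice returns rows [i, j) as a new descriptor over the same backing arrays.
func (c columns) Slice(i, j int) columns {
	return columns{words: c.words[i:j], counts: c.counts[i:j]}
}

// sumCounts scans a single column; it never reads the words column.
func sumCounts(c columns) int {
	total := 0
	for _, n := range c.counts {
		total += n
	}
	return total
}

// demo builds a 4-row frame, sub-slices it, and writes through the sub-slice.
func demo() (subTotal, wholeTotal, subLen int) {
	f := columns{
		words:  []string{"a", "b", "c", "d"},
		counts: []int{1, 2, 3, 4},
	}
	g := f.Slice(1, 3) // rows 1 and 2
	g.counts[0] = 20   // visible through f as well: storage is shared
	return sumCounts(g), sumCounts(f), g.Len()
}

func main() {
	subTotal, wholeTotal, subLen := demo()
	fmt.Println(subTotal, wholeTotal, subLen) // 23 28 2
}
```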
In addition to managing storage,
Frames provide a set of [type-driven operations](https://godoc.org/github.com/grailbio/bigslice/frame#Ops)
that are required to implement various aspects of Bigslice.
For example,
operations are required for hashing and sorting data
(e.g., for a reduce or group-by operation);
Ops also allow users to supply custom
encoding and decoding functions
(by default, Bigslice will use [gob](https://godoc.org/encoding/gob)).
User-provided operations are registered through
[frame.RegisterOps](https://godoc.org/github.com/grailbio/bigslice/frame#RegisterOps).

Frames thus provide a mechanism to efficiently
and safely manage and operate on rectangular data.
Frames expose operations in a type-oblivious way,
so that algorithms can be generalized.
For example,
the Bigslice [combiner](https://github.com/grailbio/bigslice/blob/cafa2ff6e7ea96fa4d094a9f2149109825b3774a/exec/combiner.go#L148)
implements a hash table on top of frames,
without any knowledge of the types of the values it contains.

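To make the idea of type-driven, type-oblivious operations concrete,
here is a rough sketch built on the standard `reflect` package.
It does not reproduce the shape of `frame.Ops` or `frame.RegisterOps`:
the `ops` struct, the `registry` map, and the functions `lookup`, `sortColumn`, and `shardOf`
are all hypothetical, chosen only to show how a per-type table of operations
lets generic code sort or hash a column without naming its element type.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"reflect"
	"sort"
)

// ops is a hypothetical bundle of type-driven operations for one column
// type. It only gestures at the idea behind frame.Ops.
type ops struct {
	// less reports whether element i of col orders before element j.
	less func(col reflect.Value, i, j int) bool
	// hash returns a 32-bit hash of element i of col.
	hash func(col reflect.Value, i int) uint32
}

// registry maps a column's element type to its operations.
var registry = map[reflect.Type]ops{
	reflect.TypeOf(""): {
		less: func(c reflect.Value, i, j int) bool {
			return c.Index(i).String() < c.Index(j).String()
		},
		hash: func(c reflect.Value, i int) uint32 {
			h := fnv.New32a()
			h.Write([]byte(c.Index(i).String()))
			return h.Sum32()
		},
	},
	reflect.TypeOf(0): {
		less: func(c reflect.Value, i, j int) bool {
			return c.Index(i).Int() < c.Index(j).Int()
		},
		hash: func(c reflect.Value, i int) uint32 {
			return uint32(c.Index(i).Int())
		},
	},
}

// lookup returns the column as a reflect.Value together with its ops.
func lookup(col interface{}) (reflect.Value, ops, error) {
	v := reflect.ValueOf(col)
	if v.Kind() != reflect.Slice {
		return v, ops{}, fmt.Errorf("column must be a slice, got %T", col)
	}
	o, ok := registry[v.Type().Elem()]
	if !ok {
		return v, ops{}, fmt.Errorf("no ops registered for %s", v.Type().Elem())
	}
	return v, o, nil
}

// sortColumn sorts any registered column in place without naming its type.
func sortColumn(col interface{}) error {
	v, o, err := lookup(col)
	if err != nil {
		return err
	}
	sort.Slice(col, func(i, j int) bool { return o.less(v, i, j) })
	return nil
}

// shardOf picks a shard for row i of col by hashing its value.
// nshard must be positive.
func shardOf(col interface{}, i, nshard int) (int, error) {
	v, o, err := lookup(col)
	if err != nil {
		return 0, err
	}
	return int(o.hash(v, i) % uint32(nshard)), nil
}

func main() {
	words := []string{"pear", "apple", "fig"}
	nums := []int{3, 1, 2}
	fmt.Println(sortColumn(words), words)
	fmt.Println(sortColumn(nums), nums)
	fmt.Println(sortColumn([]float64{1.5})) // no ops registered for float64
}
```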
# What is a `bigslice.FuncValue`?

Bigslice performs computations which are defined as functions that return a
`bigslice.Slice`, e.g.:
```
func wordCount(url string) bigslice.Slice {
	...
}
```

To distribute computation, Bigslice invokes these functions on remote executors
running in different processes.  However, because Go provides no convenient way
to serialize executable code for remote execution, these functions are
represented as
[`bigslice.FuncValue`](https://pkg.go.dev/github.com/grailbio/bigslice#FuncValue)s,
created by
[`bigslice.Func`](https://pkg.go.dev/github.com/grailbio/bigslice#Func).
`bigslice.Func` builds a global registry of `FuncValue`s that is identical
across processes, requiring callers to call it in deterministic order.  This
registry allows Bigslice to refer to the same function across process
boundaries by index in the registry, so instead of serializing executable code,
Bigslice serializes the index.  Consequently, a full invocation of a function,
i.e. the function and its arguments, is represented by a
[`bigslice.Invocation`](https://pkg.go.dev/github.com/grailbio/bigslice#Invocation),
which is also serializable.

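The registry-by-index idea can be sketched with the standard library alone.
What follows is an assumption-laden illustration, not the Bigslice implementation:
`funcs`, `register`, `invocation`, `encode`, and `decodeAndRun` are hypothetical,
arguments are restricted to strings for brevity,
and both "processes" are simulated within one program.
The point is that only an index and arguments are serialized (here with gob),
and the receiver resolves the index against its own identically built registry.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"strings"
)

// funcs is a hypothetical process-wide registry. Every process must
// register the same functions in the same order so that indices agree.
var funcs []func(args []string) string

// register appends fn to the registry and returns its index.
func register(fn func(args []string) string) int {
	funcs = append(funcs, fn)
	return len(funcs) - 1
}

// Package-level registrations run in declaration order in every process.
var (
	greet = register(func(args []string) string { return "hello, " + args[0] })
	shout = register(func(args []string) string { return strings.ToUpper(args[0]) })
)

// invocation names a function by registry index and carries its arguments.
// Only data crosses the process boundary, never code.
type invocation struct {
	Index int
	Args  []string
}

// encode serializes an invocation for transmission.
func encode(inv invocation) ([]byte, error) {
	var b bytes.Buffer
	if err := gob.NewEncoder(&b).Encode(inv); err != nil {
		return nil, err
	}
	return b.Bytes(), nil
}

// decodeAndRun is what a remote worker would do: decode the invocation,
// look the function up by index in its own registry, and call it.
func decodeAndRun(p []byte) (string, error) {
	var inv invocation
	if err := gob.NewDecoder(bytes.NewReader(p)).Decode(&inv); err != nil {
		return "", err
	}
	if inv.Index < 0 || inv.Index >= len(funcs) {
		return "", fmt.Errorf("no function registered at index %d", inv.Index)
	}
	return funcs[inv.Index](inv.Args), nil
}

// demo round-trips one invocation through its serialized form.
func demo() (string, error) {
	wire, err := encode(invocation{Index: shout, Args: []string{"bigslice"}})
	if err != nil {
		return "", err
	}
	return decodeAndRun(wire)
}

func main() {
	out, err := demo()
	fmt.Println(out, err) // BIGSLICE <nil>
}
```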
# Tasks

`bigslice.Slice`s and their dependencies form an acyclic directed graph.  To
compute a slice's contents, this graph is
[compiled](https://github.com/grailbio/bigslice/blob/79c34a735576b13527741b003c10f52150ebe081/exec/compile.go#L111)
into a corresponding acyclic directed graph of
[exec.Task](https://pkg.go.dev/github.com/grailbio/bigslice/exec#Task)s.  A
`Task` is the unit of computation in Bigslice: tasks are scheduled by Bigslice
to run, possibly remotely and in parallel, to compute slice contents.  Each
edge in the graph represents a dependency between tasks.  For example, a single
task may perform a `Map` transformation, and it would depend on the task that
computed the shard of data to be mapped.

The [exec.Task](https://pkg.go.dev/github.com/grailbio/bigslice/exec#Task)
structure represents both the graph and the execution state of the graph, e.g.
whether the task has been successfully computed.

The
[`exec.Eval`](https://pkg.go.dev/github.com/grailbio/bigslice/exec#Eval)
function computes task graphs from the provided set of roots, dispatching to
the given
[`exec.Executor`](https://pkg.go.dev/github.com/grailbio/bigslice/exec#Executor)
to compute individual tasks when their dependencies have been satisfied.
`Eval` also reschedules tasks if their results are lost due to operational
faults, e.g. the loss of a remote machine.  `(exec.Executor).Run` runs the
given task, providing the data from its dependencies.

A `Task` performs its computation in its `Do` function:
```
type Task struct {
	Do func([]sliceio.Reader) sliceio.Reader
	...
}
```

Data from its dependencies are provided by the input slice of readers.  The
returned reader reads out the result of the computation (into a `Frame`).
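
The relationship between tasks, their dependencies, and `Do` can be sketched as follows.
This is a deliberately simplified model and not how `exec.Eval` is written:
the `task` type and `eval` function are hypothetical,
plain `[]int` values take the place of `sliceio.Reader`,
and evaluation is sequential and in-process,
with no executor, parallelism, or rescheduling of lost results.
It only shows that each task runs after its dependencies,
receives their outputs as input,
and records its own execution state on the graph node.

```go
package main

import "fmt"

// task is a hypothetical, much-simplified stand-in for exec.Task.
type task struct {
	name string
	deps []*task
	// do computes this task's rows from the rows of its dependencies,
	// given in the same order as deps.
	do func(inputs [][]int) []int

	// Execution state lives on the graph node itself.
	done   bool
	output []int
	runs   int
}

// eval computes t after its dependencies, running each task at most once.
func eval(t *task) []int {
	if t.done {
		return t.output
	}
	inputs := make([][]int, len(t.deps))
	for i, d := range t.deps {
		inputs[i] = eval(d)
	}
	t.output = t.do(inputs)
	t.done = true
	t.runs++
	return t.output
}

// demo builds a three-task graph: a source, a map over the source, and a
// sum that depends on both. It returns the sum and how often source ran.
func demo() (total, sourceRuns int) {
	source := &task{name: "source", do: func([][]int) []int {
		return []int{1, 2, 3}
	}}
	doubled := &task{name: "map", deps: []*task{source}, do: func(in [][]int) []int {
		out := make([]int, len(in[0]))
		for i, x := range in[0] {
			out[i] = 2 * x
		}
		return out
	}}
	sum := &task{name: "sum", deps: []*task{source, doubled}, do: func(in [][]int) []int {
		s := 0
		for _, rows := range in {
			for _, x := range rows {
				s += x
			}
		}
		return []int{s}
	}}
	return eval(sum)[0], source.runs
}

func main() {
	total, sourceRuns := demo()
	fmt.Println(total, sourceRuns) // 18 1
}
```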