// Copyright 2018 GRAIL, Inc. All rights reserved.
// Use of this source code is governed by the Apache 2.0
// license that can be found in the LICENSE file.

// TODO(marius): fill this in some more, especially once tooling
// improves.

/*
Package bigslice implements a distributed data processing system.
Users compose computations by operating over large collections ("big
slices") of data, transforming them with a handful of combinators.
While users express computations using collections-style operations,
bigslice takes care of the details of parallel execution and
distribution across multiple machines.

Bigslice jobs can run locally, but use bigmachine for distribution
among a cluster of compute nodes. In either case, user code does not
change; the details of distribution are handled by the combination
of bigmachine and bigslice.

Because Go cannot easily serialize code to be sent over the wire and
executed remotely, bigslice programs have to be written with a few
constraints:

1. All slices must be constructed by bigslice funcs (bigslice.Func), and
all such functions must be instantiated before exec.Start is called. This
rule is easy to follow: if funcs are global variables, and exec.Start is
called from a program's main, then the program is compliant.

2. The driver program must be compiled for the same GOOS and GOARCH as the
target machines. When running locally, this is not a concern, but
programs that require distribution must be run from a linux/amd64 binary.
Bigslice also supports the fat binary format implemented by
github.com/grailbio/base/fatbin. The bigslice tool
(github.com/grailbio/bigslice/cmd/bigslice) uses this package to compile
portable fat binaries.

Some Bigslice operations may be annotated with runtime pragmas: directives
for the Bigslice runtime.
See Pragma for details.

User provided functions in Bigslice

Functions provided to the various bigslice combinators (e.g., bigslice.Map)
may take an additional argument of type context.Context. If specified, the
lifetime of the context is tied to that of the underlying bigslice task.
Additionally, the context carries a metrics scope
(github.com/grailbio/bigslice/metrics.Scope) which can be used to update
metric values during data processing.

*/
package bigslice