# Package `tensor` [![GoDoc](https://godoc.org/github.com/wzzhu/tensor?status.svg)](https://godoc.org/github.com/wzzhu/tensor) [![GitHub version](https://badge.fury.io/gh/gorgonia%2Ftensor.svg)](https://badge.fury.io/gh/gorgonia%2Ftensor)  [![Build Status](https://travis-ci.org/gorgonia/tensor.svg?branch=master)](https://travis-ci.org/gorgonia/tensor) [![Coverage Status](https://coveralls.io/repos/github/gorgonia/tensor/badge.svg?branch=master)](https://coveralls.io/github/gorgonia/tensor?branch=master) [![Go Report Card](https://goreportcard.com/badge/github.com/wzzhu/tensor)](https://goreportcard.com/report/github.com/wzzhu/tensor) [![unstable](http://badges.github.io/stability-badges/dist/unstable.svg)](http://github.com/badges/stability-badges) #

Package `tensor` provides efficient, generic (by some definitions of generic) n-dimensional arrays in Go. It also includes functions and methods commonly used in arithmetic, comparison and linear algebra operations.

The main purpose of this package is to support the operations required by [Gorgonia](https://gorgonia.org/gorgonia).

## Introduction ##
In the data analysis world, [Numpy](http://www.numpy.org/) and [Matlab](https://www.mathworks.com/products/matlab.html) currently reign supreme. Both tools rely heavily on having performant n-dimensional arrays, or tensors. **There is an obvious need for multidimensional arrays in Go**.

While slices are cool, a large majority of scientific and numeric computing work relies heavily on matrices (two-dimensional arrays), three-dimensional arrays and so on. In Go, the typical way of getting multidimensional arrays is to use something like `[][]T`. Applications that are more math heavy may opt to use the excellent Gonum [`matrix` package](https://github.com/gonum/matrix). What then if we want to go beyond having a `float64` matrix? What if we wanted a 3-dimensional `float32` array?

It stands to reason, then, that there should be a data structure that handles these things. The `tensor` package fills that niche.

### Basic Idea: Tensor ###
A tensor is a multidimensional array. It's like a slice, but works in multiple dimensions.

With slices, there are usage patterns that are repeated often enough to warrant abstraction - `append`, `len`, `cap`, `range` are abstractions used to manipulate and query slices. Additionally, slicing operations (`a[:1]` for example) are also abstractions provided by the language. Andrew Gerrand wrote a very good write-up on [Go's slice usage and internals](https://blog.golang.org/go-slices-usage-and-internals).

Tensors come with their own set of usage patterns and abstractions. Most of these have analogues in slices, enumerated below (do note that certain slice operations have more than one tensor analogue - this is due to the number of options available):

| Slice Operation | Tensor Operation |
|:---------------:|:----------------:|
| `len(a)`        | `T.Shape()`      |
| `cap(a)`        | `T.DataSize()`   |
| `a[:]`          | `T.Slice(...)`   |
| `a[0]`          | `T.At(x,y)`      |
| `append(a, ...)`| `T.Stack(...)`, `T.Concat(...)`   |
| `copy(dest, src)`| `T.CopyTo(dest)`, `tensor.Copy(dest, src)` |
| `for _, v := range a` | `for i, err := iterator.Next(); err == nil; i, err = iterator.Next()` |

Some tensor operations do not have direct analogues among slice operations. However, they stem from the same idea, and can be considered a superset of all operations common to slices. They're enumerated below:

| Tensor Operation | Basic idea in slices |
|:----------------:|:--------------------:|
|`T.Strides()`     | The stride of a slice will always be one element |
|`T.Dims()`        | The dimensions of a slice will always be one |
|`T.Size()`        | The size of a slice will always be its length |
|`T.Dtype()`       | The type of a slice is always known at compile time |
|`T.Reshape()`     | Given the shape of a slice is static, you can't really reshape a slice |
|`T.T(...)` / `T.Transpose()` / `T.UT()` | No equivalent with slices |
    42  
    43  ## The Types of Tensors ##
    44  
    45  As of the current revision of this package, only dense tensors are supported. Support for sparse matrix (in form of a sparse column matrix and dictionary of keys matrix) will be coming shortly.
    46  
    47  
    48  ### Dense Tensors ###
    49  
    50  The `*Dense` tensor is the primary tensor and is represented by a singular flat array, regardless of dimensions. See the [Design of `*Dense`](#design-of-dense) section for more information. It can hold any data type.
    51  
    52  ### Compressed Sparse Column Matrix ###
    53  
    54  Documentation Coming soon
    55  
    56  ### Compressed Sparse Row Matrix ###
    57  
    58  Documentation Coming soon

## Usage ##

To install: `go get -u "github.com/wzzhu/tensor"`

Creating a matrix with package `tensor` is easy:

```go
// Creating a (2,2) matrix of int:
a := New(WithShape(2, 2), WithBacking([]int{1, 2, 3, 4}))
fmt.Printf("a:\n%v\n", a)

// Output:
// a:
// ⎡1  2⎤
// ⎣3  4⎦
//
```

Creating a 3-Tensor is just as easy - just put in the correct shape and you're good to go:

```go
// Creating a (2,3,4) 3-Tensor of float32
b := New(WithBacking(Range(Float32, 0, 24)), WithShape(2, 3, 4))
fmt.Printf("b:\n%1.1f\n", b)

// Output:
// b:
// ⎡ 0.0   1.0   2.0   3.0⎤
// ⎢ 4.0   5.0   6.0   7.0⎥
// ⎣ 8.0   9.0  10.0  11.0⎦
//
// ⎡12.0  13.0  14.0  15.0⎤
// ⎢16.0  17.0  18.0  19.0⎥
// ⎣20.0  21.0  22.0  23.0⎦
```

Accessing and setting data is fairly easy. Dimensions are 0-indexed, so if you come from an R background, suck it up like I did. Be warned: this is the inefficient way to go if you want to access or set data in batches:

```go
// Accessing data:
b := New(WithBacking(Range(Float32, 0, 24)), WithShape(2, 3, 4))
x, _ := b.At(0, 1, 2)
fmt.Printf("x: %v\n", x)

// Setting data
b.SetAt(float32(1000), 0, 1, 2)
fmt.Printf("b:\n%v", b)

// Output:
// x: 6
// b:
// ⎡   0     1     2     3⎤
// ⎢   4     5  1000     7⎥
// ⎣   8     9    10    11⎦
//
// ⎡  12    13    14    15⎤
// ⎢  16    17    18    19⎥
// ⎣  20    21    22    23⎦
```

Remember to pass in data of the correct type. This example will cause a panic:

```go
// Accessing data:
b := New(WithBacking(Range(Float32, 0, 24)), WithShape(2, 3, 4))
x, _ := b.At(0, 1, 2)
fmt.Printf("x: %v\n", x)

// Setting data
b.SetAt(1000, 0, 1, 2) // panics: the untyped constant 1000 is passed as an int, not a float32
fmt.Printf("b:\n%v", b)
```
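
Why would a mismatched type panic rather than fail to compile? The following standard-library sketch shows one plausible mechanism, an unchecked type assertion on an `interface{}` value. It is an assumption for illustration only: `setFloat32` and `trySet` are hypothetical helpers, not this package's code.

```go
package main

import "fmt"

// setFloat32 is a hypothetical stand-in for a typed setter: it stores v at
// index i, asserting that v holds a float32.
func setFloat32(data []float32, i int, v interface{}) {
	data[i] = v.(float32) // panics if v holds any other type
}

// trySet calls setFloat32 and converts a panic into an error.
func trySet(data []float32, i int, v interface{}) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("set failed: %v", r)
		}
	}()
	setFloat32(data, i, v)
	return nil
}

func main() {
	data := make([]float32, 24)
	fmt.Println(trySet(data, 6, float32(1000))) // <nil>
	fmt.Println(trySet(data, 6, 1000) != nil)   // true: 1000 is stored as an int
}
```

The practical takeaway is the same as in the example above: convert the value explicitly, for example `float32(1000)`, before handing it to a setter that takes `interface{}`.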

There is a whole laundry list of methods and functions available at the [godoc](https://godoc.org/github.com/wzzhu/tensor) page.


## Design of `*Dense` ##

The design of the `*Dense` tensor is quite simple in concept. However, let's start with something more familiar. This is a visual representation of a slice in Go (taken from rsc's excellent blog post on [Go data structures](https://research.swtch.com/godata)):

![slice](https://github.com/gorgonia/tensor/blob/master/media/slice.png?raw=true)

The data structure for `*Dense` is similar, but a lot more complex. Much of the complexity comes from the need to do accounting work on the data structure as well as preserving references to memory locations. This is how the `*Dense` is defined:

```go
type Dense struct {
	*AP
	array
	e Engine

	// other fields elided for simplicity's sake
}
```

And here's a visual representation of the `*Dense`.

![dense](https://github.com/gorgonia/tensor/blob/master/media/dense.png?raw=true)

`*Dense` draws its inspiration from Go's slice. Underlying it all is a flat array, and access to elements is controlled by `*AP`. Where a Go slice is able to store its metadata in a 3-word structure (obviating the need to allocate memory), a `*Dense` unfortunately needs to allocate some memory. The majority of the data is stored in the `*AP` structure, which contains metadata such as shape, stride, and methods for accessing the array.
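
As a rough illustration of how shape and stride metadata can control access to a flat array, here is a minimal sketch. The `accessPattern` type and its `offset` method are hypothetical simplifications written for this README, not the real `*AP`, and the strides assume a row-major layout:

```go
package main

import "fmt"

// accessPattern is a hypothetical, simplified stand-in for the metadata that
// *AP is described as holding: a shape and the matching strides.
type accessPattern struct {
	shape   []int
	strides []int
}

// offset maps n-dimensional coordinates to an index into the flat array.
func (ap accessPattern) offset(coords ...int) (int, error) {
	if len(coords) != len(ap.shape) {
		return 0, fmt.Errorf("want %d coordinates, got %d", len(ap.shape), len(coords))
	}
	idx := 0
	for i, c := range coords {
		if c < 0 || c >= ap.shape[i] {
			return 0, fmt.Errorf("coordinate %d out of range for dimension %d", c, i)
		}
		idx += c * ap.strides[i]
	}
	return idx, nil
}

func main() {
	// Flat backing array holding 0..23, viewed as a (2,3,4) row-major tensor.
	data := make([]float32, 24)
	for i := range data {
		data[i] = float32(i)
	}
	ap := accessPattern{shape: []int{2, 3, 4}, strides: []int{12, 4, 1}}

	idx, err := ap.offset(0, 1, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(idx, data[idx]) // 6 6
}
```

This matches the `x: 6` result of `b.At(0, 1, 2)` in the Usage section: the coordinates resolve to `0*12 + 1*4 + 2*1 = 6` in the flat array.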

`*Dense` embeds an `array` (not to be confused with Go's array), which is an abstracted data structure that looks like this:

```go
type array struct {
	storage.Header
	t Dtype
	v interface{}
}
```

`*storage.Header` is the same structure as `reflect.SliceHeader`, except it stores an `unsafe.Pointer` instead of a `uintptr`. This is done so that eventually, when more tests are done to determine how the garbage collector marks data, the `v` field may be removed.

The `storage.Header` field of the `array` (and hence `*Dense`) is there to provide a quick and easy way to translate back into a slice for operations that use familiar slice semantics, which many of the operations depend on.

By default, `*Dense` operations try to use the language's builtin slice operations by casting the `*storage.Header` field into a slice. However, to accommodate a larger subset of types, the `*Dense` operations fall back to using pointer arithmetic to iterate through the slices for other types with non-primitive kinds (yes, you CAN do pointer arithmetic in Go. It's slow and unsafe). The result is slower operations for types with non-primitive kinds.
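
The two paths described above can be sketched with the standard library alone. Everything here is a hypothetical illustration: the `header` type only mimics the pointer-plus-length layout described in this section, and `float32` is used for both paths purely to keep the example short (in the package, the fallback is described as applying to non-primitive kinds). `unsafe.Slice` requires Go 1.17 or later.

```go
package main

import (
	"fmt"
	"unsafe"
)

// header is a hypothetical stand-in for the slice-header-like structure
// described above: a pointer to the first element plus a length and capacity.
type header struct {
	ptr      unsafe.Pointer
	length   int
	capacity int
}

// headerOf builds a header that points at the memory of an existing,
// non-empty slice.
func headerOf(s []float32) header {
	return header{ptr: unsafe.Pointer(&s[0]), length: len(s), capacity: cap(s)}
}

// asFloat32s is the fast path: reinterpret the header as a []float32 and let
// the ordinary slice machinery take over.
func asFloat32s(h header) []float32 {
	return unsafe.Slice((*float32)(h.ptr), h.length)
}

// sumByPointer is the fallback path: walk the memory one element at a time
// using pointer arithmetic, given the element size in bytes.
func sumByPointer(h header) float32 {
	var total float32
	elemSize := unsafe.Sizeof(float32(0))
	for i := 0; i < h.length; i++ {
		p := unsafe.Pointer(uintptr(h.ptr) + uintptr(i)*elemSize)
		total += *(*float32)(p)
	}
	return total
}

func main() {
	backing := []float32{1, 2, 3, 4}
	h := headerOf(backing)

	s := asFloat32s(h)
	s[0] = 10 // writes through to backing, since no copy was made
	fmt.Println(backing[0], sumByPointer(h)) // 10 19
}
```

The fast path hands the work to code the compiler can bounds-check and optimize; the fallback has to compute every address by hand, which is where the extra cost and the loss of safety come from.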

### Memory Allocation ###
`New()` functions as expected - it returns a `*Dense` backed by an array of zeroed memory. How the underlying array is allocated depends on which `ConsOpt`s are passed in. With `New()`, `ConsOpt`s are used to determine the exact nature of the `*Dense`. It's a bit icky (I'd have preferred everything to have been known statically at compile time), but it works. Let's look at some examples:

```go
x := New(Of(Float64), WithShape(2,2)) // works
y := New(WithShape(2,2)) // panics
z := New(WithBacking([]int{1,2,3,4})) // works
```

The following will happen:
* Line 1 works: This will allocate a `float64` array of size 4.
* Line 2 will cause a panic. This is because the function doesn't know what to allocate - it only knows to allocate an array of *something* for the size of 4.
* Line 3 will NOT fail, because the array has already been allocated (the `*Dense` reuses the same backing array as the slice passed in). Its shape will be set to `(4)`.
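
The difference between allocating fresh memory and reusing a caller's backing array can be sketched without the package. The `dense` type and both constructors below are hypothetical simplifications, fixed to `float64` for brevity:

```go
package main

import "fmt"

// dense is a hypothetical, heavily simplified tensor: a shape plus a flat
// backing slice of float64.
type dense struct {
	shape []int
	data  []float64
}

// newZeroed allocates fresh, zeroed memory for the given shape.
func newZeroed(shape ...int) *dense {
	n := 1
	for _, d := range shape {
		n *= d
	}
	return &dense{shape: shape, data: make([]float64, n)}
}

// newWithBacking wraps an existing slice without copying it. With no shape
// given, the result is a vector whose length is that of the slice.
func newWithBacking(backing []float64) *dense {
	return &dense{shape: []int{len(backing)}, data: backing}
}

func main() {
	x := newZeroed(2, 2)
	fmt.Println(x.shape, x.data) // [2 2] [0 0 0 0]

	backing := []float64{1, 2, 3, 4}
	z := newWithBacking(backing)
	backing[0] = 100             // visible through z, because the memory is shared
	fmt.Println(z.shape, z.data) // [4] [100 2 3 4]
}
```

Sharing the backing array avoids a copy, but it also means that later writes to the original slice show up in the tensor, so copy the slice first if that is not what you want.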

Alternatively, you may also pass in an `Engine`. If that's the case then the allocation will use the `Alloc` method of the `Engine` instead:

```go
x := New(Of(Float64), WithEngine(myEngine), WithShape(2,2))
```

The above call will use `myEngine` to allocate memory instead. This is useful in cases where you may want to manually manage your memory.


### Other failed designs ###

The alternative designs can be seen in the [ALTERNATIVE DESIGNS document](https://github.com/gorgonia/tensor/blob/master/ALTERNATIVEDESIGNS.md).

## Generic Features ##

Example:

```go
x := New(WithBacking([]string{"hello", "world", "hello", "world"}), WithShape(2,2))
x = New(WithBacking([]int{1,2,3,4}), WithShape(2,2))
```

The above code will not cause a compile error, because the structure holding the underlying array (of `string`s and then of `int`s) is a `*Dense`.

One could argue that this sidesteps the compiler's type checking system, deferring it to runtime (which a number of people consider dangerous). However, tools are being developed to type check these things, and until Go does support typechecked generics, unfortunately this will be the way it has to be.

Currently, the tensor package supports a limited form of genericity - limited to a tensor of any primitive type.
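
To show what deferring the type check to runtime looks like in plain Go, here is a small sketch using `reflect`. The `anyTensor` type and `newAnyTensor` constructor are hypothetical and only illustrate the general technique; they are not how this package records its `Dtype`:

```go
package main

import (
	"fmt"
	"reflect"
)

// anyTensor is a hypothetical container that holds a backing slice of any
// element type, recording that type at runtime.
type anyTensor struct {
	dtype reflect.Type
	data  interface{}
}

// newAnyTensor accepts any slice and rejects everything else at runtime.
func newAnyTensor(backing interface{}) (*anyTensor, error) {
	t := reflect.TypeOf(backing)
	if t == nil || t.Kind() != reflect.Slice {
		return nil, fmt.Errorf("backing must be a slice, got %v", t)
	}
	return &anyTensor{dtype: t.Elem(), data: backing}, nil
}

func main() {
	x, _ := newAnyTensor([]string{"hello", "world", "hello", "world"})
	fmt.Println(x.dtype) // string

	// Reassigning x to a tensor of ints compiles: both values are *anyTensor.
	x, _ = newAnyTensor([]int{1, 2, 3, 4})
	fmt.Println(x.dtype) // int

	_, err := newAnyTensor(42)
	fmt.Println(err) // backing must be a slice, got int
}
```

The compiler sees only one type, so mistakes such as passing a non-slice surface as runtime errors rather than compile errors, which is the trade-off discussed above.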

# How This Package is Developed #
Much of the code in this package is generated. The code to generate it is in the directory `genlib2`. `genlib2` requires the [`goimports`](https://godoc.org/golang.org/x/tools/cmd/goimports) binary to be available in the `$PATH`.

## Tests ##
Tests require Python with numpy installed. You can select which Python interpreter is used by setting the environment variable `PYTHON_COMMAND` accordingly. The default value is `python`.

## Things Knowingly Untested For ##
- `complex64` and `complex128` are excluded from the quick check generation process ([Issue #11](https://github.com/gorgonia/tensor/issues/11))


### TODO ###

* [ ] Identity optimizations for op
* [ ] Zero value optimizations
* [ ] fix Random() - super dodgy

# How To Get Support #

The best way to get support right now is to open a ticket on GitHub.

# Contributing #

Obviously, since you are most probably reading this on GitHub, GitHub will form the major part of the workflow for contributing to this package.

See also: CONTRIBUTING.md


## Contributors and Significant Contributors ##

All contributions are welcome. However, there is a new class of contributor, called Significant Contributors.

A Significant Contributor is one who has shown *deep understanding* of how the library works and/or its environs.  Here are examples of what constitutes a Significant Contribution:

* Wrote significant amounts of documentation pertaining to **why**/the mechanics of particular functions/methods and how the different parts affect one another
* Wrote code and tests around the more intricately connected parts of Gorgonia
* Wrote code and tests, and have at least 5 pull requests accepted
* Provided expert analysis on parts of the package (for example, you may be a floating point operations expert who optimized one function)
* Answered at least 10 support questions.

The Significant Contributors list will be updated once a month (if anyone even uses Gorgonia that is).


# Licence #

Gorgonia and the `tensor` package are licenced under a variant of Apache 2.0. It's for all intents and purposes the same as the Apache 2.0 Licence, with the exception of not being able to commercially profit directly from the package unless you're a Significant Contributor (for example, providing commercial support for the package). It's perfectly fine to profit directly from a derivative of Gorgonia (for example, if you use Gorgonia as a library in your product).


Everyone is still allowed to use Gorgonia for commercial purposes (example: using it in software for your business).

## Various Other Copyright Notices ##

These are the packages and libraries that inspired, or were adapted in, the process of writing Gorgonia (the Go packages that were used were already declared above):

| Source | How it's Used | Licence |
|------|---|-------|
| Numpy  | Inspired large portions. Directly adapted algorithms for a few methods (explicitly labelled in the docs) | MIT/BSD-like. [Numpy Licence](https://github.com/numpy/numpy/blob/master/LICENSE.txt) |