
# Alternative Designs #

This document records the alternative designs for the various tensor data structures that were tried in the past, and why they didn't make it into the final design. That doesn't mean that the current design is the best; it just means that the authors may not have gone far enough with these other designs.


## Single interface, multiple packages ##

In this design, there is a single interface for dense tensors, rather similar to the current one:

```go
type Tensor interface {
	Shape() Shape
	Strides() []int
	Dtype() Dtype
	Dims() int
	Size() int
	DataSize() int

	// Basic operations all tensors must support
	Slice(...Slice) (Tensor, error)
	At(...int) (interface{}, error)
	SetAt(v interface{}, coord ...int) error
	Reshape(...int) error
	T(axes ...int) error
	UT()
	Transpose() error // Transpose actually moves the data
	Apply(fn interface{}, opts ...FuncOpt) (Tensor, error)
}
```

The idea is then to have subpackages for each type, each of which would implement `Tensor`, like so:

```go
// in tensor/f32
type Tensor struct {
}
// implements tensor.Tensor

// in tensor/f64
type Tensor struct {
}
// implements tensor.Tensor
```

Additionally, there are interfaces which define operational types:

```go
type Adder interface {
	Add(other Tensor) (Tensor, error)
}

type Number interface {
	Adder
	Suber
	Muler
	Diver
}

type Real interface {
	Number
	Tanher
	Exper
}

type Complex interface {
	Real
}
```

And there are functions which operate on `Tensor`s:

```go
func Add(a, b Tensor) (Tensor, error) {
	if adder, ok := a.(Adder); ok {
		return adder.Add(b)
	}
	return nil, errors.New("Cannot Add: not an Adder")
}
```


### Pros ###

It is very idiomatic Go, and no reflection is used. It is an ideal model of an abstract data type.

### Cons ###

1. All packages have to import a common `tensor/types` package (which holds the `*AP`, `Shape` and `Slice` definitions).
2. It'd be ideal to keep all the packages in sync in terms of the methods and functions that the subpackages export. In reality, that turned out to be more difficult than expected.
3. Performance issues in hot loops: in a number of hot loops, calls to `runtime.assertI2I2` ended up taking a large portion of the cycles.
4. Performance issues with allocation of objects: instead of a single pool, every subpackage would have to implement and manage its own object pool.
5. There was a central registry of `Dtype`s, and a variant of the SQL driver pattern was used (you had to `import _ "github.com/chewxy/gorgonia/tensor/f32"` to register the `Float32` Dtype). This is ugly.
6. Cross-package requirements: for `Argmax` and `Argmin` related functions, it'd be nice to be able to return a `Tensor` of `int`. That meant having `tensor/i` as a core dependency of the rest of the packages.
#### Workarounds ####

* `Slice` is an interface. All packages that implement `tensor.Tensor` *could* implement their own `Slice`. But that'd be a lot of repeated work.
* `AP` and `Shape` could be made interfaces, but for the latter it means dropping the ability to loop through the shape dimensions.
* Keeping the packages in sync could be solved with code generation programs, but if we were to do that, we might as well merge everything into one package.

### Notes for revisits ###

This idea is nice. I'd personally love to revisit it (and do from time to time). If we were to revisit this idea, there would have to be some changes, which I will suggest here:

1. Make `Transpose` and `T` functions that work on a `Tensor` instead of making them `Tensor`-defining methods. This would be done the same way as `Stack`, `RollAxis` and `Concat`.
2. Perhaps re-weigh the importance of having an in-place transpose. The in-place transpose was the result of dealing with a very large matrix when my machine didn't have enough memory. It's generally slower than reallocating a new backing array anyway.


## One struct, multiple backing interfaces ##

In this design, we abstract away the backing array into an interface. So we'd have this:

```go
type Tensor struct {
	*AP

	t    Dtype
	data Array
}

type Array interface {
	Len() int
	Cap() int
	Get(int) interface{}
	Set(int, interface{}) error
	Map(fn interface{}) error
}
```

And we'd have these types, which implement the `Array` interface:

```go
type Ints []int
type F32s []float32
type F64s []float64

// and so on and so forth; each would implement Array
```

### Pros ###

* Multiple subpackages only when necessary (external, "unhandled" dtypes)
* Shared definition of `*AP`, `Shape`, `Dtype` (no more need for a common package)
* Clean package structure - easier to generate code for

### Cons ###

* Difficult to implement other tensor types (sparse, for example)
* VERY VERY slow

The slowness was caused by excessive calls to `runtime.convT2E` when using the `Get` and `Set` methods, which for primitive types cause plenty of allocations on the heap. It was unacceptably slow for any deep learning work.

#### Workarounds ####

Type switch on known data types, and use the slower methods for data types that do not have specializations. This led to ugly, unwieldy code, and it also shifted the pressure from `runtime.convT2E` to `runtime.assertI2I2`, which, while it performs better than having to allocate primitive values on the heap, still led to a lot of unnecessary cycles being spent on it.

## Reflection + Pointers + Interfaces ##

This was the design that reigned before the refactor at #127.

The idea is to combine parts of the first and second attempts, and to fill in the remaining missing bits with the use of reflection.