This package is a prototype implementation of split (or "sharded")
values for Go. This is a possible solution to
https://github.com/golang/go/issues/18802.

[![](https://godoc.org/github.com/aclements/go-misc/split?status.svg)](https://godoc.org/github.com/aclements/go-misc/split)

This prototype is very dependent on Go runtime internals. As is, it
does not depend on any *modifications* to the Go runtime; however,
there is an optional runtime modification that shaves about 4 ns off
the cost of `Value.Get`. See that method for details.

Benchmarks
----------

With the runtime modification, the single-core overhead of the split
value compared to a single atomic counter is about 2 ns, and compared
to a non-atomic counter is about 6 ns:

```
BenchmarkCounterSplit            200000000     8.15 ns/op
BenchmarkCounterShared           300000000     5.96 ns/op
BenchmarkCounterSequential      1000000000     2.14 ns/op
BenchmarkLazyAggregationSplit    100000000    23.9 ns/op
BenchmarkLazyAggregationShared   100000000    23.1 ns/op
```

The scaling of the split values to 24 cores is nearly perfect (real
cores, no hyperthreads), while the shared values collapse as you'd
expect:

```
BenchmarkCounterSplit-24            2000000000     0.35 ns/op     8.40 cpu-ns/op
BenchmarkCounterShared-24             50000000    24.7 ns/op    593 cpu-ns/op
BenchmarkLazyAggregationSplit-24    2000000000     1.03 ns/op    24.7 cpu-ns/op
BenchmarkLazyAggregationShared-24     10000000   174 ns/op     4176 cpu-ns/op
```

Without the runtime modification, there's a little more overhead in
the sequential case, but the scaling isn't affected:

```
BenchmarkCounterSplit            100000000    12.3 ns/op
BenchmarkCounterShared           300000000     5.97 ns/op
BenchmarkCounterSequential      1000000000     2.28 ns/op
BenchmarkLazyAggregationSplit     50000000    25.2 ns/op
BenchmarkLazyAggregationShared   100000000    23.5 ns/op
```
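The split-versus-shared contrast in the benchmarks above can be
illustrated with a minimal sketch of a sharded counter. This is not
this package's implementation or API: the prototype relies on Go
runtime internals, while the sketch avoids them by having the caller
pass a worker id to select a shard. The names `ShardedCounter`,
`NewShardedCounter`, and `hammer` are hypothetical, and the 64-byte
cache-line size is an assumption about the target CPU.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// shard holds one counter, padded to 64 bytes so that neighbouring
// shards do not share a cache line (64 is an assumed line size).
type shard struct {
	n atomic.Int64 // 8 bytes; requires Go 1.19 or later
	_ [56]byte
}

// ShardedCounter is a hypothetical illustration of a split counter.
// It is not the API of the split package.
type ShardedCounter struct {
	shards []shard
}

// NewShardedCounter returns a counter with n shards; n must be positive.
func NewShardedCounter(n int) *ShardedCounter {
	return &ShardedCounter{shards: make([]shard, n)}
}

// Add adds delta to the shard selected by id, which must be
// non-negative. Writers with different ids (modulo the shard count)
// touch different cache lines.
func (c *ShardedCounter) Add(id int, delta int64) {
	c.shards[id%len(c.shards)].n.Add(delta)
}

// Sum aggregates all shards. Each shard is read atomically, but the
// total is not a single atomic snapshot across shards.
func (c *ShardedCounter) Sum() int64 {
	var total int64
	for i := range c.shards {
		total += c.shards[i].n.Load()
	}
	return total
}

// hammer starts the given number of worker goroutines, each of which
// increments the counter perWorker times using its own index as the
// shard id, and waits for all of them to finish.
func hammer(c *ShardedCounter, workers, perWorker int) {
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				c.Add(id, 1)
			}
		}(w)
	}
	wg.Wait()
}
```

Calling `hammer(c, 8, 1000)` on a counter from `NewShardedCounter(8)`
leaves `c.Sum()` at 8000. The sketch trades a cheap, contention-free
write path for a read that has to visit every shard, which is the
trade-off a split value makes.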