# ZenQ

> A low-latency thread-safe queue in golang implemented using a lock-free ringbuffer and runtime internals

Based on the [LMAX Disruptor Pattern](https://lmax-exchange.github.io/disruptor/disruptor.html)
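
To make the ring-buffer idea concrete, here is a minimal sketch of a single-producer single-consumer lock-free ring buffer built on `sync/atomic`. It is an illustration only and not ZenQ's implementation: the names `ring`, `newRing`, `write`, `read` and `sumThrough` are hypothetical, the buffer size is assumed to be a power of two, and waiting is done by simple spinning with `runtime.Gosched()` rather than by the runtime internals mentioned above.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// ring is a hypothetical single-producer single-consumer ring buffer.
// It is NOT ZenQ's implementation, only an illustration of the idea.
type ring[T any] struct {
	buf  []T
	mask uint64        // len(buf)-1; len(buf) must be a power of two
	head atomic.Uint64 // next slot to read, advanced only by the consumer
	tail atomic.Uint64 // next slot to write, advanced only by the producer
}

func newRing[T any](sizePow2 uint64) *ring[T] {
	return &ring[T]{buf: make([]T, sizePow2), mask: sizePow2 - 1}
}

// write spins (yielding the processor) while the buffer is full.
func (r *ring[T]) write(v T) {
	t := r.tail.Load()
	for t-r.head.Load() == uint64(len(r.buf)) {
		runtime.Gosched()
	}
	r.buf[t&r.mask] = v
	r.tail.Store(t + 1) // publish the slot to the consumer
}

// read spins (yielding the processor) while the buffer is empty.
func (r *ring[T]) read() T {
	h := r.head.Load()
	for h == r.tail.Load() {
		runtime.Gosched()
	}
	v := r.buf[h&r.mask]
	r.head.Store(h + 1) // release the slot back to the producer
	return v
}

// sumThrough sends 1..n through a ring of 8 slots and returns the sum received.
func sumThrough(n int) int {
	r := newRing[int](8)
	go func() {
		for i := 1; i <= n; i++ {
			r.write(i)
		}
	}()
	sum := 0
	for i := 0; i < n; i++ {
		sum += r.read()
	}
	return sum
}

func main() {
	fmt.Println(sumThrough(1000)) // 1+2+...+1000 = 500500
}
```

No mutex is taken on either path: the producer only ever stores `tail` and the consumer only ever stores `head`, so each side coordinates with the other purely through atomic loads and stores.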

## Features

* Much faster than native channels in both SPSC (single-producer-single-consumer) and MPSC (multi-producer-single-consumer) modes in terms of `time/op`
* Lower `memory_allocation/op` and `num_allocations/op` than native channels, most evident when benchmarking large input batches
* Handles the case where NUM_WRITER_GOROUTINES > NUM_CPU_CORES much better than native channels
* Selection from multiple ZenQs, just like Go's `select{}`, with fair selection and no starvation
* Supports closing a ZenQ

Benchmarks supporting the above claims are [here](#benchmarks)

## Installation

You need Go [1.19.x](https://go.dev/dl/) or above

```bash
$ go get github.com/alphadose/zenq/v2
```

## Usage

1. Simple Read/Write
```go
package main

import (
	"fmt"

	"github.com/alphadose/zenq/v2"
)

type payload struct {
	alpha int
	beta  string
}

func main() {
	zq := zenq.New[payload](10)

	// 5 producer goroutines, each writing 20 payloads (100 in total)
	for j := 0; j < 5; j++ {
		go func() {
			for i := 0; i < 20; i++ {
				zq.Write(payload{
					alpha: i,
					beta:  fmt.Sprint(i),
				})
			}
		}()
	}

	// A single consumer reads all 100 payloads
	for i := 0; i < 100; i++ {
		if data, queueOpen := zq.Read(); queueOpen {
			fmt.Printf("%+v\n", data)
		}
	}
}
```

2. **Selection** from multiple ZenQs, just like Go's native `select{}`. The selection process is fair, i.e. no single ZenQ gets starved
```go
package main

import (
	"fmt"

	"github.com/alphadose/zenq/v2"
)

type custom1 struct {
	alpha int
	beta  string
}

type custom2 struct {
	gamma int
}

const size = 100

// One ZenQ per payload type
var (
	zq1 = zenq.New[int](size)
	zq2 = zenq.New[string](size)
	zq3 = zenq.New[custom1](size)
	zq4 = zenq.New[*custom2](size)
)

func main() {
	// 4 producers, each writing 10 items (40 in total)
	go looper(intProducer)
	go looper(stringProducer)
	go looper(custom1Producer)
	go looper(custom2Producer)

	for i := 0; i < 40; i++ {

		// Selection occurs here
		if data := zenq.Select(zq1, zq2, zq3, zq4); data != nil {
			// Dispatch on the dynamic type of the selected item
			switch data.(type) {
			case int:
				fmt.Printf("Received int %d\n", data)
			case string:
				fmt.Printf("Received string %s\n", data)
			case custom1:
				fmt.Printf("Received custom data type number 1 %#v\n", data)
			case *custom2:
				fmt.Printf("Received pointer %#v\n", data)
			}
		}
	}
}

func intProducer(ctr int) { zq1.Write(ctr) }

func stringProducer(ctr int) { zq2.Write(fmt.Sprint(ctr * 10)) }

func custom1Producer(ctr int) { zq3.Write(custom1{alpha: ctr, beta: fmt.Sprint(ctr)}) }

func custom2Producer(ctr int) { zq4.Write(&custom2{gamma: 1 << ctr}) }

// looper calls the given producer 10 times with the counter 0..9
func looper(producer func(ctr int)) {
	for i := 0; i < 10; i++ {
		producer(i)
	}
}
```
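
For comparison, here is a minimal sketch of the same fan-in idea using native channels and `select`. It is not part of ZenQ; the helpers `fanIn` and `run` are hypothetical names chosen for this illustration. With native `select`, each channel needs its own statically typed `case`, whereas the example above passes the queues to `zenq.Select` and type-switches on the returned value.

```go
package main

import "fmt"

// fanIn drains total values from two native channels with select and
// reports how many arrived on each. Hypothetical helper, for comparison only.
func fanIn(ints <-chan int, strs <-chan string, total int) (nInts, nStrs int) {
	for i := 0; i < total; i++ {
		// When both channels are ready, select picks one case pseudo-randomly.
		select {
		case <-ints:
			nInts++
		case <-strs:
			nStrs++
		}
	}
	return nInts, nStrs
}

// run starts one producer per channel, each sending perProducer values.
func run(perProducer int) (int, int) {
	ints := make(chan int, perProducer)
	strs := make(chan string, perProducer)
	go func() {
		for i := 0; i < perProducer; i++ {
			ints <- i
		}
	}()
	go func() {
		for i := 0; i < perProducer; i++ {
			strs <- fmt.Sprint(i * 10)
		}
	}()
	return fanIn(ints, strs, 2*perProducer)
}

func main() {
	a, b := run(10)
	fmt.Printf("ints: %d, strings: %d\n", a, b)
}
```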

## Benchmarks

Benchmarking code available [here](./benchmarks)

Note that if you run the benchmarks with the `-race` flag, ZenQ will perform slower because the `-race` flag slows
down the atomic operations in Go. Under normal circumstances, ZenQ will outperform native Go channels.

### Hardware Specs

```
❯ neofetch
                    'c.          alphadose@ReiEki.local
                 ,xNMM.          ----------------------
               .OMMMMo           OS: macOS 12.3 21E230 arm64
               OMMM0,            Host: MacBookAir10,1
     .;loddo:' loolloddol;.      Kernel: 21.4.0
   cKMMMMMMMMMMNWMMMMMMMMMM0:    Uptime: 6 hours, 41 mins
 .KMMMMMMMMMMMMMMMMMMMMMMMWd.    Packages: 86 (brew)
 XMMMMMMMMMMMMMMMMMMMMMMMX.      Shell: zsh 5.8
;MMMMMMMMMMMMMMMMMMMMMMMM:       Resolution: 1440x900
:MMMMMMMMMMMMMMMMMMMMMMMM:       DE: Aqua
.MMMMMMMMMMMMMMMMMMMMMMMMX.      WM: Rectangle
 kMMMMMMMMMMMMMMMMMMMMMMMMWd.    Terminal: iTerm2
 .XMMMMMMMMMMMMMMMMMMMMMMMMMMk   Terminal Font: FiraCodeNerdFontComplete-Medium 16 (normal)
  .XMMMMMMMMMMMMMMMMMMMMMMMMK.   CPU: Apple M1
    kMMMMMMMMMMMMMMMMMMMMMMd     GPU: Apple M1
     ;KMMMMMMMWXXWMMMMMMMk.      Memory: 1370MiB / 8192MiB
       .cooc,.    .,coo:.
```

### Terminology

* NUM_WRITERS -> The number of goroutines concurrently writing to ZenQ/Channel
* INPUT_SIZE -> The number of input payloads to be passed through ZenQ/Channel from producers to consumer
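
To illustrate how these two parameters shape a run, the following is a minimal sketch of a native-channel runner with NUM_WRITERS producer goroutines and one consumer. It is not the repository's benchmarking code: `chanRunner` is a hypothetical helper, the channel buffer size of 1024 is an arbitrary choice, and INPUT_SIZE is assumed to divide evenly among the writers.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// chanRunner pushes inputSize payloads through a native channel using
// numWriters producer goroutines and a single consumer, and returns how many
// payloads the consumer received. Hypothetical helper, not the repo's code.
func chanRunner(numWriters, inputSize int) int {
	ch := make(chan int, 1024)    // buffer size is an arbitrary choice
	per := inputSize / numWriters // assumes inputSize divides evenly

	var wg sync.WaitGroup
	for w := 0; w < numWriters; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < per; i++ {
				ch <- i
			}
		}()
	}

	// Close the channel once every writer has finished.
	go func() {
		wg.Wait()
		close(ch)
	}()

	received := 0
	for range ch {
		received++
	}
	return received
}

func main() {
	start := time.Now()
	n := chanRunner(3, 60000)
	fmt.Printf("received %d payloads in %v\n", n, time.Since(start))
}
```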

```bash
Computed from benchstat of 30 benchmarks each via go test -benchmem -bench=. benchmarks/simple/*.go

name                                     time/op
_Chan_NumWriters1_InputSize600-8          23.2µs ± 1%
_ZenQ_NumWriters1_InputSize600-8          17.9µs ± 1%
_Chan_NumWriters3_InputSize60000-8        5.27ms ± 3%
_ZenQ_NumWriters3_InputSize60000-8        2.36ms ± 2%
_Chan_NumWriters8_InputSize6000000-8       671ms ± 2%
_ZenQ_NumWriters8_InputSize6000000-8       234ms ± 6%
_Chan_NumWriters100_InputSize6000000-8     1.59s ± 4%
_ZenQ_NumWriters100_InputSize6000000-8     309ms ± 2%
_Chan_NumWriters1000_InputSize7000000-8    1.97s ± 0%
_ZenQ_NumWriters1000_InputSize7000000-8    389ms ± 4%
_Chan_Million_Blocking_Writers-8           10.4s ± 2%
_ZenQ_Million_Blocking_Writers-8           2.32s ±21%

name                                     alloc/op
_Chan_NumWriters1_InputSize600-8           0.00B
_ZenQ_NumWriters1_InputSize600-8           0.00B
_Chan_NumWriters3_InputSize60000-8          109B ±68%
_ZenQ_NumWriters3_InputSize60000-8        24.6B ±107%
_Chan_NumWriters8_InputSize6000000-8       802B ±241%
_ZenQ_NumWriters8_InputSize6000000-8     1.18kB ±100%
_Chan_NumWriters100_InputSize6000000-8    44.2kB ±41%
_ZenQ_NumWriters100_InputSize6000000-8    10.7kB ±38%
_Chan_NumWriters1000_InputSize7000000-8    476kB ± 8%
_ZenQ_NumWriters1000_InputSize7000000-8   90.6kB ±10%
_Chan_Million_Blocking_Writers-8           553MB ± 0%
_ZenQ_Million_Blocking_Writers-8           122MB ± 3%

name                                     allocs/op
_Chan_NumWriters1_InputSize600-8            0.00
_ZenQ_NumWriters1_InputSize600-8            0.00
_Chan_NumWriters3_InputSize60000-8          0.00
_ZenQ_NumWriters3_InputSize60000-8          0.00
_Chan_NumWriters8_InputSize6000000-8       2.76 ±190%
_ZenQ_NumWriters8_InputSize6000000-8        5.47 ±83%
_Chan_NumWriters100_InputSize6000000-8       159 ±26%
_ZenQ_NumWriters100_InputSize6000000-8      25.1 ±39%
_Chan_NumWriters1000_InputSize7000000-8    1.76k ± 6%
_ZenQ_NumWriters1000_InputSize7000000-8     47.3 ±31%
_Chan_Million_Blocking_Writers-8           2.00M ± 0%
_ZenQ_Million_Blocking_Writers-8           1.00M ± 0%
```

The above results show that ZenQ is faster than channels in `time/op` in every tested case. In `mem_alloc/op` and `num_allocs/op` it matches or beats channels in every case except `NumWriters8_InputSize6000000`, where ZenQ's figures are higher but the variance on both sides is large (±83% to ±241%). The tested cases are:

1. SPSC
2. MPSC with NUM_WRITER_GOROUTINES < NUM_CPU_CORES
3. MPSC with NUM_WRITER_GOROUTINES > NUM_CPU_CORES
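
The `time/op` speedups implied by the table can be worked out with a few divisions. The sketch below does this with figures copied from the benchstat output above and converted to microseconds; the `result` type and `speedup` function are hypothetical names used only for this illustration.

```go
package main

import "fmt"

// result holds one pair of time/op figures copied from the benchstat table,
// both converted to microseconds.
type result struct {
	name       string
	chanMicros float64
	zenqMicros float64
}

// speedup returns how many times faster ZenQ was than the native channel.
func speedup(r result) float64 { return r.chanMicros / r.zenqMicros }

var results = []result{
	{"NumWriters1_InputSize600", 23.2, 17.9},
	{"NumWriters3_InputSize60000", 5270, 2360},
	{"NumWriters8_InputSize6000000", 671000, 234000},
	{"NumWriters100_InputSize6000000", 1590000, 309000},
	{"NumWriters1000_InputSize7000000", 1970000, 389000},
	{"Million_Blocking_Writers", 10400000, 2320000},
}

func main() {
	for _, r := range results {
		fmt.Printf("%-32s %.1fx\n", r.name, speedup(r))
	}
}
```

On these figures the ratio ranges from roughly 1.3x in the SPSC case to roughly 5.1x with 100 writers.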

## Cherry on the Cake

In SPSC mode, ZenQ is faster than channels by **92 seconds** (roughly 5x) for an input size of 6 * 10<sup>8</sup> elements

```bash
❯ go run benchmarks/simple/main.go

With Input Batch Size: 60 and Num Concurrent Writers: 1

Native Channel Runner completed transfer in: 26.916µs
ZenQ Runner completed transfer in: 20.292µs
====================================================================

With Input Batch Size: 600 and Num Concurrent Writers: 1

Native Channel Runner completed transfer in: 135.75µs
ZenQ Runner completed transfer in: 105.792µs
====================================================================

With Input Batch Size: 6000 and Num Concurrent Writers: 1

Native Channel Runner completed transfer in: 2.100209ms
ZenQ Runner completed transfer in: 510.792µs
====================================================================

With Input Batch Size: 6000000 and Num Concurrent Writers: 1

Native Channel Runner completed transfer in: 1.241481917s
ZenQ Runner completed transfer in: 226.068209ms
====================================================================

With Input Batch Size: 600000000 and Num Concurrent Writers: 1

Native Channel Runner completed transfer in: 1m55.074638875s
ZenQ Runner completed transfer in: 22.582667917s
====================================================================
```
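
The 92-second figure can be checked from the last pair of timings in the output above. The sketch below parses the two durations exactly as printed and subtracts them; `gap` is a hypothetical helper written for this check.

```go
package main

import (
	"fmt"
	"time"
)

// gap parses two durations as printed by Go and returns slow - fast.
func gap(slow, fast string) (time.Duration, error) {
	s, err := time.ParseDuration(slow)
	if err != nil {
		return 0, err
	}
	f, err := time.ParseDuration(fast)
	if err != nil {
		return 0, err
	}
	return s - f, nil
}

func main() {
	// Timings for input batch size 600000000 with 1 writer.
	d, err := gap("1m55.074638875s", "22.582667917s")
	if err != nil {
		panic(err)
	}
	fmt.Println(d.Round(time.Second)) // prints 1m32s, i.e. 92 seconds
}
```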