# ZenQ

> A low-latency thread-safe queue in golang implemented using a lock-free ringbuffer and runtime internals

Based on the [LMAX Disruptor Pattern](https://lmax-exchange.github.io/disruptor/disruptor.html)

## Features

* Much faster than native channels in both SPSC (single-producer-single-consumer) and MPSC (multi-producer-single-consumer) modes in terms of `time/op`
* More resource efficient in terms of `memory_allocation/op` and `num_allocations/op`, which becomes evident when benchmarking large input batch sizes
* Handles the case where NUM_WRITER_GOROUTINES > NUM_CPU_CORES much better than native channels
* Selection from multiple ZenQs just like golang's `select{}`, ensuring fair selection and no starvation
* Closing a ZenQ

Benchmarks to support the above claims [here](#benchmarks)

## Installation

You need Golang [1.19.x](https://go.dev/dl/) or above

```bash
$ go get github.com/alphadose/zenq/v2
```

## Usage

1. Simple Read/Write
```go
package main

import (
	"fmt"

	"github.com/alphadose/zenq/v2"
)

type payload struct {
	alpha int
	beta  string
}

func main() {
	// Queue of payloads with a capacity of 10
	zq := zenq.New[payload](10)

	// 5 writer goroutines, each writing 20 payloads (100 in total)
	for j := 0; j < 5; j++ {
		go func() {
			for i := 0; i < 20; i++ {
				zq.Write(payload{
					alpha: i,
					beta:  fmt.Sprint(i),
				})
			}
		}()
	}

	// Single reader consuming all 100 payloads
	for i := 0; i < 100; i++ {
		if data, queueOpen := zq.Read(); queueOpen {
			fmt.Printf("%+v\n", data)
		}
	}
}
```

2. **Selection** from multiple ZenQs just like golang's native `select{}`. The selection process is fair, i.e. no single ZenQ gets starved
```go
package main

import (
	"fmt"

	"github.com/alphadose/zenq/v2"
)

type custom1 struct {
	alpha int
	beta  string
}

type custom2 struct {
	gamma int
}

const size = 100

var (
	zq1 = zenq.New[int](size)
	zq2 = zenq.New[string](size)
	zq3 = zenq.New[custom1](size)
	zq4 = zenq.New[*custom2](size)
)

func main() {
	go looper(intProducer)
	go looper(stringProducer)
	go looper(custom1Producer)
	go looper(custom2Producer)

	// 4 producers x 10 writes each = 40 items to select
	for i := 0; i < 40; i++ {

		// Selection occurs here
		if data := zenq.Select(zq1, zq2, zq3, zq4); data != nil {
			switch data.(type) {
			case int:
				fmt.Printf("Received int %d\n", data)
			case string:
				fmt.Printf("Received string %s\n", data)
			case custom1:
				fmt.Printf("Received custom data type number 1 %#v\n", data)
			case *custom2:
				fmt.Printf("Received pointer %#v\n", data)
			}
		}
	}
}

func intProducer(ctr int) { zq1.Write(ctr) }

func stringProducer(ctr int) { zq2.Write(fmt.Sprint(ctr * 10)) }

func custom1Producer(ctr int) { zq3.Write(custom1{alpha: ctr, beta: fmt.Sprint(ctr)}) }

func custom2Producer(ctr int) { zq4.Write(&custom2{gamma: 1 << ctr}) }

func looper(producer func(ctr int)) {
	for i := 0; i < 10; i++ {
		producer(i)
	}
}
```

## Benchmarks

Benchmarking code available [here](./benchmarks)

Note that if you run the benchmarks with the `--race` flag then ZenQ will perform slower because the `--race` flag slows
down the atomic operations in golang. Under normal circumstances, ZenQ will outperform golang native channels.

### Hardware Specs

```
❯ neofetch
alphadose@ReiEki.local
----------------------
OS: macOS 12.3 21E230 arm64
Host: MacBookAir10,1
Kernel: 21.4.0
Uptime: 6 hours, 41 mins
Packages: 86 (brew)
Shell: zsh 5.8
Resolution: 1440x900
DE: Aqua
WM: Rectangle
Terminal: iTerm2
Terminal Font: FiraCodeNerdFontComplete-Medium 16 (normal)
CPU: Apple M1
GPU: Apple M1
Memory: 1370MiB / 8192MiB
```

### Terminology

* NUM_WRITERS -> The number of goroutines concurrently writing to ZenQ/Channel
* INPUT_SIZE -> The number of input payloads to be passed through ZenQ/Channel from producers to consumer

```bash
Computed from benchstat of 30 benchmarks each via go test -benchmem -bench=. benchmarks/simple/*.go

name                                     time/op
_Chan_NumWriters1_InputSize600-8         23.2µs ± 1%
_ZenQ_NumWriters1_InputSize600-8         17.9µs ± 1%
_Chan_NumWriters3_InputSize60000-8       5.27ms ± 3%
_ZenQ_NumWriters3_InputSize60000-8       2.36ms ± 2%
_Chan_NumWriters8_InputSize6000000-8      671ms ± 2%
_ZenQ_NumWriters8_InputSize6000000-8      234ms ± 6%
_Chan_NumWriters100_InputSize6000000-8    1.59s ± 4%
_ZenQ_NumWriters100_InputSize6000000-8    309ms ± 2%
_Chan_NumWriters1000_InputSize7000000-8   1.97s ± 0%
_ZenQ_NumWriters1000_InputSize7000000-8   389ms ± 4%
_Chan_Million_Blocking_Writers-8          10.4s ± 2%
_ZenQ_Million_Blocking_Writers-8          2.32s ±21%

name                                     alloc/op
_Chan_NumWriters1_InputSize600-8          0.00B
_ZenQ_NumWriters1_InputSize600-8          0.00B
_Chan_NumWriters3_InputSize60000-8         109B ±68%
_ZenQ_NumWriters3_InputSize60000-8        24.6B ±107%
_Chan_NumWriters8_InputSize6000000-8       802B ±241%
_ZenQ_NumWriters8_InputSize6000000-8     1.18kB ±100%
_Chan_NumWriters100_InputSize6000000-8   44.2kB ±41%
_ZenQ_NumWriters100_InputSize6000000-8   10.7kB ±38%
_Chan_NumWriters1000_InputSize7000000-8   476kB ± 8%
_ZenQ_NumWriters1000_InputSize7000000-8  90.6kB ±10%
_Chan_Million_Blocking_Writers-8          553MB ± 0%
_ZenQ_Million_Blocking_Writers-8          122MB ± 3%

name                                     allocs/op
_Chan_NumWriters1_InputSize600-8           0.00
_ZenQ_NumWriters1_InputSize600-8           0.00
_Chan_NumWriters3_InputSize60000-8         0.00
_ZenQ_NumWriters3_InputSize60000-8         0.00
_Chan_NumWriters8_InputSize6000000-8       2.76 ±190%
_ZenQ_NumWriters8_InputSize6000000-8       5.47 ±83%
_Chan_NumWriters100_InputSize6000000-8      159 ±26%
_ZenQ_NumWriters100_InputSize6000000-8     25.1 ±39%
_Chan_NumWriters1000_InputSize7000000-8   1.76k ± 6%
_ZenQ_NumWriters1000_InputSize7000000-8    47.3 ±31%
_Chan_Million_Blocking_Writers-8          2.00M ± 0%
_ZenQ_Million_Blocking_Writers-8          1.00M ± 0%
```

The above results show that ZenQ is faster than channels in `time/op` for every tested case, and more efficient in `mem_alloc/op` and `num_allocs/op` for most of them (the exception is `NumWriters8_InputSize6000000`, where the allocation figures carry error margins of 83% or more). The tested cases are:

1. SPSC
2. MPSC with NUM_WRITER_GOROUTINES < NUM_CPU_CORES
3. MPSC with NUM_WRITER_GOROUTINES > NUM_CPU_CORES


## Cherry on the Cake

In SPSC mode ZenQ is faster than channels by **92 seconds** for an input size of 6 * 10<sup>8</sup> elements

```bash
❯ go run benchmarks/simple/main.go

With Input Batch Size:  60  and Num Concurrent Writers:  1

Native Channel Runner completed transfer in: 26.916µs
ZenQ Runner completed transfer in: 20.292µs
====================================================================

With Input Batch Size:  600  and Num Concurrent Writers:  1

Native Channel Runner completed transfer in: 135.75µs
ZenQ Runner completed transfer in: 105.792µs
====================================================================

With Input Batch Size:  6000  and Num Concurrent Writers:  1

Native Channel Runner completed transfer in: 2.100209ms
ZenQ Runner completed transfer in: 510.792µs
====================================================================

With Input Batch Size:  6000000  and Num Concurrent Writers:  1

Native Channel Runner completed transfer in: 1.241481917s
ZenQ Runner completed transfer in: 226.068209ms
====================================================================

With Input Batch Size:  600000000  and Num Concurrent Writers:  1

Native Channel Runner completed transfer in: 1m55.074638875s
ZenQ Runner completed transfer in: 22.582667917s
====================================================================
```