# ADR 043: Blockchain Reactor Riri-Org

## Changelog

- 18-06-2019: Initial draft
- 08-07-2019: Reviewed
- 29-11-2019: Implemented
- 14-02-2020: Updated with the implementation details

## Context

The blockchain reactor is responsible for two high-level processes: sending/receiving blocks from peers and FastSync-ing blocks to catch up a node that is far behind. The goal of [ADR-40](https://github.com/tendermint/tendermint/blob/v0.37.x/docs/architecture/adr-040-blockchain-reactor-refactor.md) was to refactor these two processes by separating the business logic currently wrapped up in go-channels into pure `handle*` functions. While the ADR specified what the final form of the reactor might look like, it lacked guidance on the intermediary steps to get there.
The following diagram illustrates the state of the [blockchain-reorg](https://github.com/tendermint/tendermint/pull/3561) reactor, which will be referred to as `v1`.

![v1 Blockchain Reactor Architecture Diagram](https://github.com/tendermint/tendermint/blob/f9e556481654a24aeb689bdadaf5eab3ccd66829/docs/architecture/img/blockchain-reactor-v1.png)

While `v1` of the blockchain reactor has shown significant improvements in terms of simplifying the concurrency model, the current PR has run into a few roadblocks.

- The current PR is large and difficult to review.
- Block gossiping and fast sync processes are highly coupled to the shared `Pool` data structure.
- Peer communication is spread over multiple components, creating a complex dependency graph which must be mocked out during testing.
- Timeouts modeled as stateful tickers introduce non-determinism in tests.

This ADR is meant to specify the missing components and controls necessary to achieve [ADR-40](https://github.com/tendermint/tendermint/blob/v0.37.x/docs/architecture/adr-040-blockchain-reactor-refactor.md).

## Decision

Partition the responsibilities of the blockchain reactor into a set of components which communicate exclusively through events. Events will contain timestamps, allowing each component to track time as internal state. The internal state will be mutated by a set of `handle*` functions which will produce event(s). The integration between components will happen in the reactor, and reactor tests will then become integration tests between components. This design will be known as `v2`.

![v2 Blockchain Reactor Architecture Diagram](https://github.com/tendermint/tendermint/blob/584e67ac3fac220c5c3e0652e3582eca8231e814/docs/architecture/img/blockchain-reactor-v2.png)
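To make the shape of this decision concrete before walking through the individual routines, the sketch below shows one way a timestamped event and a pure `handle*` function could look. It is illustrative only: the `Event` interface and the `scheduler` fields here are assumptions borrowed from the pseudocode names used later in this document, not the implementation's actual definitions.

```go
package v2

import "time"

// Event is anything routed between components. Every event carries the time
// at which the demuxer observed it, so components never read the wall clock.
type Event interface {
	Time() time.Time
}

// evTimeCheck is the periodic time-check event injected by the demuxer.
type evTimeCheck struct {
	time time.Time
}

func (e evTimeCheck) Time() time.Time { return e.time }

// scPeerTimeout is emitted when a peer has been silent for too long.
type scPeerTimeout struct {
	time   time.Time
	peerID string
}

func (e scPeerTimeout) Time() time.Time { return e.time }

// A handle* function mutates only its component's internal state and returns
// follow-up events; it performs no IO and never starts its own timers.
type scheduler struct {
	lastTouched map[string]time.Time
	peerTimeout time.Duration
}

func (sc *scheduler) handleTimeCheck(ev evTimeCheck) []Event {
	var events []Event
	for peerID, touched := range sc.lastTouched {
		if ev.Time().Sub(touched) > sc.peerTimeout {
			events = append(events, scPeerTimeout{time: ev.Time(), peerID: peerID})
		}
	}
	return events
}
```

Because time only enters a component through the timestamps carried on events, a test can drive `handleTimeCheck` with fabricated times and assert on the returned events without ever waiting on a real timer.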
### Fast Sync Related Communication Channels

The diagram below shows the fast sync routines and the types of channels and queues used to communicate with each other.
In addition, the per-reactor channels used by the sendRoutine to send messages over the Peer MConnection are shown.

![v2 Blockchain Channels and Queues Diagram](https://github.com/tendermint/tendermint/blob/5cf570690f989646fb3b615b734da503f038891f/docs/architecture/img/blockchain-v2-channels.png)

### Reactor changes in detail

The reactor will include a demultiplexing routine which will send each message to each sub-routine for independent processing. Each sub-routine will then select the messages it is interested in and call the specific `handle*` function specified in [ADR-40](https://github.com/tendermint/tendermint/blob/v0.37.x/docs/architecture/adr-040-blockchain-reactor-refactor.md). The demuxRoutine acts as a "pacemaker", setting the time at which events are expected to be handled.

```go
func demuxRoutine(msgs, schedulerMsgs, processorMsgs, ioMsgs chan Message) {
	timer := time.NewTicker(interval)
	for {
		select {
		case <-timer.C:
			now := evTimeCheck{time.Now()}
			schedulerMsgs <- now
			processorMsgs <- now
			ioMsgs <- now
		case msg := <-msgs:
			msg.time = time.Now()
			// These channels should produce backpressure before
			// being full to avoid starving each other
			schedulerMsgs <- msg
			processorMsgs <- msg
			ioMsgs <- msg
			if msg == stop {
				return
			}
		}
	}
}

func processRoutine(input chan Message, output chan Message) {
	processor := NewProcessor(...)
	for {
		msg := <-input
		switch msg := msg.(type) {
		case bcBlockRequestMessage:
			output <- processor.handleBlockRequest(msg)
		...
		case stop:
			processor.stop()
			return
		}
	}
}

func scheduleRoutine(input chan Message, output chan Message) {
	scheduler := NewScheduler(...)
	for {
		msg := <-input
		switch msg := msg.(type) {
		case bcBlockResponseMessage:
			output <- scheduler.handleBlockResponse(msg)
		...
		case stop:
			scheduler.stop()
			return
		}
	}
}
```

## Lifecycle management

A set of routines for individual processes allows processes to run in parallel with clear lifecycle management. The `Start`, `Stop`, and `AddPeer` hooks currently present in the reactor will delegate to the sub-routines, allowing them to manage internal state independently without further coupling to the reactor.

```go
func (r *BlockchainReactor) Start() {
	r.msgs = make(chan Message, maxInFlight)
	schedulerMsgs := make(chan Message)
	processorMsgs := make(chan Message)
	ioMsgs := make(chan Message)

	go processorRoutine(processorMsgs, r.msgs)
	go scheduleRoutine(schedulerMsgs, r.msgs)
	go ioRoutine(ioMsgs, r.msgs)
	...
}

func (r *BlockchainReactor) ReceiveEnvelope(...) {
	...
	r.msgs <- msg
	...
}

func (r *BlockchainReactor) Stop() {
	...
	r.msgs <- stop
	...
}

func (r *BlockchainReactor) AddPeer(peer p2p.Peer) {
	...
	r.msgs <- bcAddPeerEv{peer.ID}
	...
}
```
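The pseudocode above only shows `Stop` enqueueing the `stop` message. A natural refinement, sketched below purely as an illustration, is for the reactor to also wait until every sub-routine has exited before returning; the `wg` field, `startRoutine` helper and `stopEv` type are assumptions and not part of the design above.

```go
package v2

import "sync"

type Message interface{}

// stopEv stands in for the pseudocode's `stop` sentinel above.
type stopEv struct{}

type BlockchainReactor struct {
	msgs chan Message
	wg   sync.WaitGroup
}

// startRoutine launches one sub-routine (scheduler, processor or io) and
// registers it so Stop can wait for it to exit.
func (r *BlockchainReactor) startRoutine(run func()) {
	r.wg.Add(1)
	go func() {
		defer r.wg.Done()
		run()
	}()
}

func (r *BlockchainReactor) Stop() {
	r.msgs <- stopEv{} // the demux routine forwards this to every sub-routine
	r.wg.Wait()        // block until all sub-routines have returned
}
```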
## IO handling

An io handling routine within the reactor will isolate peer communication. Messages going through the ioRoutine will usually be one-way, using `p2p` APIs. In the case in which a `p2p` API such as `trySend` returns an error, the ioRoutine can funnel those messages back to the demuxRoutine for distribution to the other routines. For instance, errors from the ioRoutine can be consumed by the scheduler to inform better peer selection implementations.

```go
func (r *BlockchainReactor) ioRoutine(ioMsgs chan Message, outMsgs chan Message) {
	...
	for {
		msg := <-ioMsgs
		switch msg := msg.(type) {
		case scBlockRequestMessage:
			queued := r.sendBlockRequestToPeer(...)
			if queued {
				outMsgs <- ioSendQueued{...}
			}
		case scStatusRequestMessage:
			r.sendStatusRequestToPeer(...)
		case bcPeerError:
			r.Switch.StopPeerForError(msg.src)
			...
		...
		case bcFinished:
			return
		}
	}
}
```

### Processor Internals

The processor is responsible for ordering, verifying and executing blocks. The Processor will maintain an internal cursor `height` referring to the last processed block. As a set of blocks arrives unordered, the Processor will check if it has the block at `height+1` necessary to process the next block (the commit for a block is carried in the block that follows it, so both must be present before verification). The processor also maintains the map `blockPeers` from heights to peers, to keep track of which peer provided the block at `height`. `blockPeers` can be used in `handleRemovePeer(...)` to reschedule all unprocessed blocks provided by a peer who has errored.

```go
type Processor struct {
	height int64 // the height cursor
	state ...
	blocks [height]*Block // keep a set of blocks in memory until they are processed
	blockPeers [height]PeerID // keep track of which heights came from which peerID
	lastTouch timestamp
}

func (proc *Processor) handleBlockResponse(peerID, block) {
	if block.height <= height {
		// the block has already been processed; ignore it
	} else if blocks[block.height] {
		return errDuplicateBlock{}
	} else {
		blocks[block.height] = block
	}

	if blocks[height] && blocks[height+1] {
		... = state.Validators.VerifyCommit(...)
		... = store.SaveBlock(...)
		state, err = blockExec.ApplyBlock(...)
		...
		if err == nil {
			delete(blocks, height)
			height++
			lastTouch = msg.time
			return pcBlockProcessed{height-1}
		} else {
			... // Delete all unprocessed blocks from the peer
			return pcBlockProcessError{peerID, height}
		}
	}
}

func (proc *Processor) handleRemovePeer(peerID) {
	events = []
	// Delete all unprocessed blocks from peerID
	for i = height; i < len(blocks); i++ {
		if blockPeers[i] == peerID {
			events = append(events, pcBlockReschedule{i})

			delete(blocks, i)
		}
	}
	return events
}

func handleTimeCheckEv(time) {
	if time - lastTouch > timeout {
		// Timeout the processor
		...
	}
}
```
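Since the processor tracks time exclusively through `lastTouch` and the timestamps on incoming events, its timeout behaviour can be exercised without any ticker. The test below is a hypothetical sketch: `newProcessor`, `procTimeout` and an event-returning `handleTimeCheckEv` are invented helpers layered on the pseudocode above, not real APIs.

```go
func TestProcessorTimesOutDeterministically(t *testing.T) {
	start := time.Unix(0, 0)    // any fixed instant; no wall clock is read
	proc := newProcessor(start) // hypothetical constructor seeding lastTouch

	// "Waiting" is simply handing the handler a later timestamp.
	events := proc.handleTimeCheckEv(start.Add(procTimeout + time.Second))
	if len(events) == 0 {
		t.Fatal("expected the processor to emit a timeout event")
	}
}
```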
## Schedule

The Schedule maintains the internal state used for scheduling blockRequestMessages based on some scheduling algorithm. The schedule needs to maintain state on:

- The state `blockState` of every block seen up to the height `maxHeight`
- The set of peers and their peer state `peerState`
- Which peers have which blocks
- Which blocks have been requested from which peers

```go
type blockState int

const (
	blockStateNew blockState = iota
	blockStatePending
	blockStateReceived
	blockStateProcessed
)

type schedule struct {
	// a map of heights to the current blockState of the block at that height
	blockStates map[height]blockState

	// a map of which blocks are available from which peers
	blockPeers map[height]map[p2p.ID]scPeer

	// a map of peerID to schedule specific peer struct `scPeer`
	peers map[p2p.ID]scPeer

	// a map of heights to the peer we are waiting for a response from
	pending map[height]scPeer

	targetPending  int // the number of blocks we want in blockStatePending
	targetReceived int // the number of blocks we want in blockStateReceived

	peerTimeout  int
	peerMinSpeed int
}

func (sc *schedule) numBlockInState(state blockState) uint32 {
	var num uint32
	for i := sc.minHeight(); i <= sc.maxHeight(); i++ {
		if sc.blockStates[i] == state {
			num++
		}
	}
	return num
}


func (sc *schedule) popSchedule(maxRequest int) []scBlockRequestMessage {
	// We only want to schedule requests such that we have less than sc.targetPending and sc.targetReceived
	// This ensures we don't saturate the network or flood the processor with unprocessed blocks
	todo := min(sc.targetPending-sc.numBlockInState(blockStatePending), sc.targetReceived-sc.numBlockInState(blockStateReceived))
	events := []scBlockRequestMessage{}
	for i := sc.minHeight(); i < sc.maxHeight(); i++ {
		if todo == 0 {
			break
		}
		if sc.blockStates[i] == blockStateNew {
			peer := sc.selectPeer(sc.blockPeers[i])
			sc.blockStates[i] = blockStatePending
			sc.pending[i] = peer
			events = append(events, scBlockRequestMessage{peerID: peer.peerID, height: i})
			todo--
		}
	}
	return events
}
...

type scPeer struct {
	peerID                 p2p.ID
	numOutstandingRequests int
	lastTouched            time.Time
	monitor                flow.Monitor
}
```
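`selectPeer` is referenced above but left unspecified. The sketch below shows one possible policy over the schedule state defined above, picking the least-loaded peer known to have the block; it is an illustration only, and a real implementation could also weigh each peer's `monitor` rate against `peerMinSpeed` before considering it eligible.

```go
// Illustrative sketch only: pick, among the peers known to have the block,
// the one with the fewest requests currently in flight.
func (sc *schedule) selectPeer(peers map[p2p.ID]scPeer) scPeer {
	var (
		best  scPeer
		found bool
	)
	for _, peer := range peers {
		if !found || peer.numOutstandingRequests < best.numOutstandingRequests {
			best = peer
			found = true
		}
	}
	// If no peer is known to have the block, the zero scPeer is returned and
	// the caller can leave the block in blockStateNew.
	return best
}
```

Because all of the inputs to this decision live in the schedule, the selection policy can be swapped out or unit-tested without touching peer communication.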
## Scheduler

The scheduler is configured to maintain a target `n` of in-flight messages and will use feedback from `_blockResponseMessage`, `_statusResponseMessage` and `_peerError` to produce an optimal assignment of `scBlockRequestMessage`s at each `timeCheckEv`.

```go
func handleStatusResponse(peerID, height, time) {
	schedule.touchPeer(peerID, time)
	schedule.setPeerHeight(peerID, height)
}

func handleBlockResponseMessage(peerID, height, block, time) {
	schedule.touchPeer(peerID, time)
	schedule.markReceived(peerID, height, size(block))
}

func handleNoBlockResponseMessage(peerID, height, time) {
	schedule.touchPeer(peerID, time)
	// reschedule that block, punish peer...
	...
}

func handlePeerError(peerID) {
	// Remove the peer, reschedule the requests
	...
}

func handleTimeCheckEv(time) {
	// clean peer list

	events = []
	for peerID := range schedule.peersNotTouchedSince(time) {
		pending = schedule.pendingFrom(peerID)
		schedule.setPeerState(peerID, timedout)
		schedule.resetBlocks(pending)
		events = append(events, peerTimeout{peerID})
	}

	events = append(events, schedule.popSchedule()...)

	return events
}
```

## Peer

The Peer stores per-peer state based on messages received by the scheduler.

```go
type Peer struct {
	lastTouched    timestamp
	lastDownloaded timestamp
	pending        map[height]struct{}
	height         height // max height for the peer
	state {
		pending, // we know the peer but not the height
		active,  // we know the height
		timeout  // the peer has timed out
	}
}
```

## Status

Implemented

## Consequences

### Positive

- Tests become deterministic
- Simulation becomes a-temporal: no need to wait for a wall-time timeout
- Peer Selection can be independently tested/simulated
- Develop a general approach to refactoring reactors

### Negative

### Neutral

### Implementation Path

- Implement the scheduler, test the scheduler, review the scheduler
- Implement the processor, test the processor, review the processor
- Implement the demuxer, write integration tests, review integration tests

## References

- [ADR-40](https://github.com/tendermint/tendermint/blob/v0.37.x/docs/architecture/adr-040-blockchain-reactor-refactor.md): The original blockchain reactor re-org proposal
- [Blockchain re-org](https://github.com/tendermint/tendermint/pull/3561): The current blockchain reactor re-org implementation (v1)