# ADR 043: Blockchain Reactor Riri-Org

## Changelog

- 18-06-2019: Initial draft
- 08-07-2019: Reviewed
- 29-11-2019: Implemented
- 14-02-2020: Updated with the implementation details

## Context

The blockchain reactor is responsible for two high-level processes: sending/receiving blocks from peers, and fast-syncing blocks to catch up a node that is far behind. The goal of [ADR-40](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-040-blockchain-reactor-refactor.md) was to refactor these two processes by separating the business logic currently wrapped up in go-channels into pure `handle*` functions. While the ADR specified what the final form of the reactor might look like, it lacked guidance on intermediary steps to get there.
The following diagram illustrates the state of the [blockchain-reorg](https://github.com/tendermint/tendermint/pull/3561) reactor, which will be referred to as `v1`.

![v1 Blockchain Reactor Architecture
Diagram](https://github.com/tendermint/tendermint/blob/f9e556481654a24aeb689bdadaf5eab3ccd66829/docs/architecture/img/blockchain-reactor-v1.png)

While `v1` of the blockchain reactor has shown significant improvements in terms of simplifying the concurrency model, the current PR has run into a few roadblocks.

- The current PR is large and difficult to review.
- Block gossiping and fast sync processes are highly coupled to the shared `Pool` data structure.
- Peer communication is spread over multiple components, creating a complex dependency graph which must be mocked out during testing.
- Timeouts modeled as stateful tickers introduce non-determinism in tests.

This ADR is meant to specify the missing components and control flow necessary to achieve [ADR-40](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-040-blockchain-reactor-refactor.md).

## Decision

Partition the responsibilities of the blockchain reactor into a set of components which communicate exclusively via events. Events will contain timestamps, allowing each component to track time as internal state. The internal state will be mutated by a set of `handle*` functions which will produce event(s). The integration between components will happen in the reactor, and reactor tests will then become integration tests between components. This design will be known as `v2`.

![v2 Blockchain Reactor Architecture
Diagram](https://github.com/tendermint/tendermint/blob/584e67ac3fac220c5c3e0652e3582eca8231e814/docs/architecture/img/blockchain-reactor-v2.png)

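As a minimal sketch of what this event plumbing could look like (the interface and type names here are illustrative, not the actual implementation), every event carries the timestamp assigned by the demultiplexer, so components read time from events rather than from the wall clock:

```go
// Sketch only: hypothetical event types for the v2 design.
type Event interface {
	// Time returns the timestamp stamped on the event by the demuxer.
	Time() time.Time
}

// evTimeCheck is emitted on every demuxer tick and is the only source
// of "now" that the scheduler, processor, and io routines observe.
type evTimeCheck struct {
	time time.Time
}

func (ev evTimeCheck) Time() time.Time { return ev.time }
```
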
### Fast Sync Related Communication Channels

The diagram below shows the fast sync routines and the types of channels and queues used to communicate with each other.
In addition, the per-reactor channels used by the sendRoutine to send messages over the peer MConnection are shown.

![v2 Blockchain Channels and Queues
Diagram](https://github.com/tendermint/tendermint/blob/5cf570690f989646fb3b615b734da503f038891f/docs/architecture/img/blockchain-v2-channels.png)

### Reactor changes in detail

The reactor will include a demultiplexing routine which will send each message to each subroutine for independent processing. Each subroutine will then select the messages it's interested in and call the specific handle function specified in [ADR-40](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-040-blockchain-reactor-refactor.md). The demuxRoutine acts as a "pacemaker", setting the time at which events are expected to be handled.

```go
func demuxRoutine(msgs, schedulerMsgs, processorMsgs, ioMsgs chan Message) {
	timer := time.NewTicker(interval)
	for {
		select {
		case <-timer.C:
			now := evTimeCheck{time.Now()}
			schedulerMsgs <- now
			processorMsgs <- now
			ioMsgs <- now
		case msg := <-msgs:
			msg.time = time.Now()
			// These channels should produce backpressure before
			// being full to avoid starving each other
			schedulerMsgs <- msg
			processorMsgs <- msg
			ioMsgs <- msg
			if msg == stop {
				return
			}
		}
	}
}

func processRoutine(input chan Message, output chan Message) {
	processor := NewProcessor(...)
	for {
		msg := <-input
		switch msg := msg.(type) {
		case bcBlockRequestMessage:
			output <- processor.handleBlockRequest(msg)
		...
		case stop:
			processor.stop()
			return
		}
	}
}

func scheduleRoutine(input chan Message, output chan Message) {
	scheduler := NewScheduler(...)
	for {
		msg := <-input
		switch msg := msg.(type) {
		case bcBlockResponseMessage:
			output <- scheduler.handleBlockResponse(msg)
		...
		case stop:
			scheduler.stop()
			return
		}
	}
}
```

## Lifecycle management

A set of routines for the individual processes allows them to run in parallel with clear lifecycle management. The `Start`, `Stop`, and `AddPeer` hooks currently present in the reactor will delegate to the subroutines, allowing them to manage internal state independently, without further coupling to the reactor.

```go
func (r *BlockchainReactor) Start() {
	r.msgs = make(chan Message, maxInFlight)
	schedulerMsgs := make(chan Message)
	processorMsgs := make(chan Message)
	ioMsgs := make(chan Message)

	go processRoutine(processorMsgs, r.msgs)
	go scheduleRoutine(schedulerMsgs, r.msgs)
	go ioRoutine(ioMsgs, r.msgs)
	...
}

func (r *BlockchainReactor) Receive(...) {
	...
	r.msgs <- msg
	...
}

func (r *BlockchainReactor) Stop() {
	...
	r.msgs <- stop
	...
}

func (r *BlockchainReactor) AddPeer(peer p2p.Peer) {
	...
	r.msgs <- bcAddPeerEv{peer.ID}
	...
}
```

## IO handling

An io handling routine within the reactor will isolate peer communication. Messages going through the ioRoutine will usually be one-way, using the `p2p` APIs. In cases where a `p2p` API such as `trySend` returns an error, the ioRoutine can funnel those messages back to the demuxRoutine for distribution to the other routines. For instance, errors from the ioRoutine can be consumed by the scheduler to inform better peer selection implementations.

```go
func (r *BlockchainReactor) ioRoutine(ioMsgs chan Message, outMsgs chan Message) {
	...
	for {
		msg := <-ioMsgs
		switch msg := msg.(type) {
		case scBlockRequestMessage:
			queued := r.sendBlockRequestToPeer(...)
			if queued {
				outMsgs <- ioSendQueued{...}
			}
		case scStatusRequestMessage:
			r.sendStatusRequestToPeer(...)
		case bcPeerError:
			r.Switch.StopPeerForError(msg.src)
			...
		...
		case bcFinished:
			return
		}
	}
}
```

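For concreteness, here is a hedged sketch of what a helper like `sendBlockRequestToPeer` could look like. It assumes the `p2p.Peer` interface's `TrySend` method and the reactor's `BlockchainChannel` ID; the message encoding shown is illustrative, not the actual wire format:

```go
// Sketch only: returns false when the peer is gone or its send queue is
// full, which the ioRoutine can surface to the scheduler as feedback.
func (r *BlockchainReactor) sendBlockRequestToPeer(peerID p2p.ID, height int64) bool {
	peer := r.Switch.Peers().Get(peerID)
	if peer == nil {
		return false // peer disconnected since the request was scheduled
	}
	msgBytes := cdc.MustMarshalBinaryBare(&bcBlockRequestMessage{Height: height})
	return peer.TrySend(BlockchainChannel, msgBytes)
}
```
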
### Processor Internals

The processor is responsible for ordering, verifying, and executing blocks. The Processor will maintain an internal cursor `height` referring to the last processed block. As blocks arrive unordered, the Processor will check if it has the block at `height+1` necessary to process the next block. The processor also maintains a map `blockPeers` from heights to peers, to keep track of which peer provided the block at each `height`. `blockPeers` can be used in `handleRemovePeer(...)` to reschedule all unprocessed blocks provided by a peer that has errored.

```go
type Processor struct {
	height int64 // the height cursor
	state ...
	blocks map[height]*Block // keep a set of blocks in memory until they are processed
	blockPeers map[height]PeerID // keep track of which heights came from which peerID
	lastTouch timestamp
}

func (proc *Processor) handleBlockResponse(peerID, block) {
	if block.height <= height {
		// the block has already been processed; ignore it
	} else if blocks[block.height] {
		return errDuplicateBlock{}
	} else {
		blocks[block.height] = block
	}

	if blocks[height] && blocks[height+1] {
		... = state.Validators.VerifyCommit(...)
		... = store.SaveBlock(...)
		state, err = blockExec.ApplyBlock(...)
		...
		if err == nil {
			delete(blocks, height)
			height++
			lastTouch = msg.time
			return pcBlockProcessed{height-1}
		} else {
			... // Delete all unprocessed blocks from the peer
			return pcBlockProcessError{peerID, height}
		}
	}
}

func (proc *Processor) handleRemovePeer(peerID) {
	events = []
	// Delete all unprocessed blocks from peerID
	for i = height; i < len(blocks); i++ {
		if blockPeers[i] == peerID {
			events = append(events, pcBlockReschedule{i})
			delete(blocks, i)
		}
	}
	return events
}

func handleTimeCheckEv(time) {
	if time - lastTouch > timeout {
		// Timeout the processor
		...
	}
}
```

## Schedule

The Schedule maintains the internal state used for scheduling blockRequestMessages based on some scheduling algorithm. The schedule needs to maintain state on:

- The state `blockState` of every block seen up to the height `maxHeight`
- The set of peers and their peer state `peerState`
- Which peers have which blocks
- Which blocks have been requested from which peers

```go
type blockState int

const (
	blockStateNew = iota
	blockStatePending
	blockStateReceived
	blockStateProcessed
)

type schedule struct {
	// the current blockState of each block
	blockStates map[height]blockState

	// a map of which blocks are available from which peers
	blockPeers map[height]map[p2p.ID]scPeer

	// a map of peerID to schedule-specific peer struct `scPeer`
	peers map[p2p.ID]scPeer

	// a map of heights to the peer we are waiting on for a response
	pending map[height]scPeer

	targetPending  int // the number of blocks we want in blockStatePending
	targetReceived int // the number of blocks we want in blockStateReceived

	peerTimeout  int
	peerMinSpeed int
}

func (sc *schedule) numBlockInState(state blockState) int {
	num := 0
	for i := sc.minHeight(); i <= sc.maxHeight(); i++ {
		if sc.blockStates[i] == state {
			num++
		}
	}
	return num
}

func (sc *schedule) popSchedule(maxRequest int) []scBlockRequestMessage {
	// We only want to schedule requests such that we stay below sc.targetPending and sc.targetReceived.
	// This ensures we don't saturate the network or flood the processor with unprocessed blocks.
	todo := min(sc.targetPending-sc.numBlockInState(blockStatePending), sc.targetReceived-sc.numBlockInState(blockStateReceived))
	events := []scBlockRequestMessage{}
	for i := sc.minHeight(); i < sc.maxHeight(); i++ {
		if todo == 0 {
			break
		}
		if sc.blockStates[i] == blockStateNew {
			peer := sc.selectPeer(sc.blockPeers[i])
			sc.blockStates[i] = blockStatePending
			sc.pending[i] = peer
			events = append(events, scBlockRequestMessage{peerID: peer.peerID, height: i})
			todo--
		}
	}
	return events
}
...

type scPeer struct {
	peerID                 p2p.ID
	numOutstandingRequests int
	lastTouched            time.Time
	monitor                flow.Monitor
}
```

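The `selectPeer` call above is left unspecified. As a purely illustrative sketch (not the implemented policy), one simple strategy is to pick, among the peers advertising the block, the one with the fewest outstanding requests:

```go
// Sketch only: a hypothetical peer selection policy favoring the least
// loaded candidate; returns the zero-value scPeer if there are no candidates.
func (sc *schedule) selectPeer(candidates map[p2p.ID]scPeer) scPeer {
	var best scPeer
	found := false
	for _, peer := range candidates {
		if !found || peer.numOutstandingRequests < best.numOutstandingRequests {
			best = peer
			found = true
		}
	}
	return best
}
```
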
## Scheduler

The scheduler is configured to maintain a target `n` of in-flight
messages and will use feedback from `_blockResponseMessage`,
`_statusResponseMessage` and `_peerError` to produce an optimal assignment
of scBlockRequestMessages at each `timeCheckEv`.

```go
func handleStatusResponse(peerID, height, time) {
	schedule.touchPeer(peerID, time)
	schedule.setPeerHeight(peerID, height)
}

func handleBlockResponseMessage(peerID, height, block, time) {
	schedule.touchPeer(peerID, time)
	schedule.markReceived(peerID, height, size(block))
}

func handleNoBlockResponseMessage(peerID, height, time) {
	schedule.touchPeer(peerID, time)
	// reschedule that block, punish peer...
	...
}

func handlePeerError(peerID) {
	// Remove the peer, reschedule the requests
	...
}

func handleTimeCheckEv(time) {
	// clean peer list
	events = []
	for peerID := range schedule.peersNotTouchedSince(time) {
		pending = schedule.pendingFrom(peerID)
		schedule.setPeerState(peerID, timedout)
		schedule.resetBlocks(pending)
		events = append(events, peerTimeout{peerID})
	}

	events = append(events, schedule.popSchedule()...)

	return events
}
```

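`peersNotTouchedSince` is one of several small helpers the schedule would need. A hedged sketch, assuming `peerTimeout` is interpreted in seconds and using the `scPeer` fields sketched above:

```go
// Sketch only: returns the IDs of peers whose last activity is older
// than peerTimeout (treated here as seconds) relative to `now`.
func (sc *schedule) peersNotTouchedSince(now time.Time) []p2p.ID {
	var stale []p2p.ID
	for id, peer := range sc.peers {
		if now.Sub(peer.lastTouched) > time.Duration(sc.peerTimeout)*time.Second {
			stale = append(stale, id)
		}
	}
	return stale
}
```
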
## Peer

The Peer stores per-peer state based on messages received by the scheduler.

```go
type peerState int

const (
	peerStatePending peerState = iota // we know the peer but not its height
	peerStateActive                   // we know the peer's height
	peerStateTimeout                  // the peer has timed out
)

type Peer struct {
	lastTouched    timestamp
	lastDownloaded timestamp
	pending        map[height]struct{}
	height         height // max height for the peer
	state          peerState
}
```

## Status

Implemented

## Consequences

### Positive

- Tests become deterministic
- Simulation becomes atemporal: no need to wait for wall-time timeouts
- Peer selection can be independently tested/simulated
- Develops a general approach to refactoring reactors

### Negative

### Neutral

### Implementation Path

- Implement the scheduler, test the scheduler, review the scheduler
- Implement the processor, test the processor, review the processor
- Implement the demuxer, write integration tests, review integration tests

## References

- [ADR-40](https://github.com/tendermint/tendermint/blob/master/docs/architecture/adr-040-blockchain-reactor-refactor.md): The original blockchain reactor re-org proposal
- [Blockchain re-org](https://github.com/tendermint/tendermint/pull/3561): The current blockchain reactor re-org implementation (v1)