# API for Reactors

This document describes the API provided by the p2p layer to the protocol
layer, namely to the registered reactors.

This API consists of two interfaces: the one provided by the `Switch` instance,
and the ones provided by multiple `Peer` instances, one per connected peer.
The `Switch` instance is provided to every reactor as part of the reactor's
[registration procedure][reactor-registration].
The multiple `Peer` instances are provided to every registered reactor whenever
a [new connection with a peer][reactor-addpeer] is established.

> **Note**
>
> The practical reasons that led to the interface being provided in two parts,
> `Switch` and `Peer` instances, are discussed in more detail in the
> [knowledge-base repository](https://github.com/cometbft/knowledge-base/blob/main/p2p/reactors/switch-peer.md).

## `Switch` API

The [`Switch`][switch-type] is the central component of the p2p layer
implementation. It manages all the reactors running in a node and keeps track
of the connections with peers.
The table below summarizes the interaction of the standard reactors with the `Switch`:

| `Switch` API method                        | consensus | block sync | state sync | mempool | evidence  | PEX   |
|--------------------------------------------|-----------|------------|------------|---------|-----------|-------|
| `Peers() IPeerSet`                         | x         | x          |            |         |           | x     |
| `NumPeers() (int, int, int)`               |           | x          |            |         |           | x     |
| `Broadcast(Envelope) chan bool`            | x         | x          | x          |         |           |       |
| `MarkPeerAsGood(Peer)`                     | x         |            |            |         |           |       |
| `StopPeerForError(Peer, interface{})`      | x         | x          | x          | x       | x         | x     |
| `StopPeerGracefully(Peer)`                 |           |            |            |         |           | x     |
| `Reactor(string) Reactor`                  |           | x          |            |         |           |       |

The above list is not exhaustive as it does not include all the `Switch` methods
invoked by the PEX reactor, a special component that should be considered part
of the p2p layer. This document does not cover the operation of the PEX reactor
as a connection manager.

### Peers State

The first two methods in the switch API allow reactors to query the state of
the p2p layer: the set of connected peers.

    func (sw *Switch) Peers() IPeerSet

The `Peers()` method returns the current set of connected peers.
The returned `IPeerSet` is an immutable concurrency-safe copy of this set.
Observe that the `Peer` handlers returned by this method were previously
[added to the reactor][reactor-addpeer] via the `InitPeer(Peer)` method,
but not yet removed via the `RemovePeer(Peer)` method.
Thus, a priori, reactors should already have this information.

    func (sw *Switch) NumPeers() (outbound, inbound, dialing int)

The `NumPeers()` method returns the current number of connected peers,
distinguished between `outbound` and `inbound` peers.
An `outbound` peer is a peer the node has dialed to, while an `inbound` peer is
a peer the node has accepted a connection from.
The third field `dialing` reports the number of peers to which the node is
currently attempting to connect, that is, peers that are not (yet) connected.

> **Note**
>
> The third field returned by `NumPeers()`, the number of peers in `dialing`
> state, is not information that should concern the protocol layer.
> In fact, with the exception of the PEX reactor, which can be considered part
> of the p2p layer implementation, no standard reactor actually uses this
> information, which could be removed when this interface is refactored.

### Broadcast

The switch provides, mostly for historical or backward-compatibility reasons,
a method for sending a message to all connected peers:

    func (sw *Switch) Broadcast(e Envelope) chan bool

The `Broadcast()` method is not blocking and returns a channel of booleans.
For every connected `Peer`, it starts a background thread for sending the
message to that peer, using the `Peer.Send()` method
(which is blocking, as detailed in [Send Methods](#send-methods)).
The result of each unicast send operation (success or failure) is added to the
returned channel, which is closed when all operations are completed.

> **Note**
>
> - The current _implementation_ of the `Switch.Broadcast(Envelope)` method is
>   not efficient, as the marshalling of the provided message is performed as
>   part of the `Peer.Send(Envelope)` helper method, that is, once per
>   connected peer.
> - The return value of the broadcast method is not considered by any of the
>   standard reactors that employ the method. One of the reasons is that it is
>   not possible to associate each of the boolean outputs added to the
>   returned channel to a peer.

### Vetting Peers

The p2p layer relies on the registered reactors to gauge the _quality_ of peers.
The following method can be invoked by a reactor to inform the p2p layer that a
peer has presented a "good" behaviour.
This information is registered in the node's address book and influences the
operation of the Peer Exchange (PEX) protocol, as node discovery adopts a bias
towards "good" peers:

    func (sw *Switch) MarkPeerAsGood(peer Peer)

At the moment, it is up to the consensus reactor to vet a peer.
In the current logic, a peer is marked as good whenever the consensus protocol
collects a multiple of `votesToContributeToBecomeGoodPeer = 10000` useful votes
or `blocksToContributeToBecomeGoodPeer = 10000` useful block parts from that peer.
By "useful", the consensus implementation considers messages that are valid and
that are received by the node when the node is expecting such information,
which excludes duplicated or late received messages.

> **Note**
>
> The switch doesn't currently provide a method to mark a peer as a bad peer.
> In fact, the peer quality management is not really implemented in the current
> version of the p2p layer.
> This topic is being discussed in the [knowledge-base repository](https://github.com/cometbft/knowledge-base/blob/main/p2p/reactors/peer-quality.md).

### Stopping Peers

Reactors can instruct the p2p layer to disconnect from a peer.
Using the p2p layer's nomenclature, the reactor requests a peer to be stopped.
The peer's send and receive routines are in fact stopped, interrupting the
communication with the peer.
The `Peer` is then [removed from every registered reactor][reactor-removepeer],
using the `RemovePeer(Peer)` method, and from the set of connected peers.

    func (sw *Switch) StopPeerForError(peer Peer, reason interface{})

All the standard reactors employ the above method for disconnecting from a peer
in case of errors.
These are errors that occur when processing a message received from a `Peer`.
The produced `error` is provided to the method as the `reason`.
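
The error-handling pattern described above for `StopPeerForError()` can be sketched as follows. This is a minimal, self-contained illustration and not the actual reactor code: the `Peer` and `Switch` interfaces are reduced stand-ins for the real types, and `fakePeer`, `recordingSwitch`, `validate`, `handleMessage`, and `errEmptyMessage` are hypothetical names introduced only for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// Peer is a reduced, hypothetical stand-in for the p2p Peer interface.
type Peer interface {
	ID() string
}

// Switch is a reduced, hypothetical stand-in exposing only the method
// discussed in this section.
type Switch interface {
	StopPeerForError(peer Peer, reason interface{})
}

// fakePeer is a test double identified by a plain string.
type fakePeer struct{ id string }

func (p fakePeer) ID() string { return p.id }

// recordingSwitch records stop requests instead of closing connections.
type recordingSwitch struct {
	stopped map[string]interface{}
}

func (sw *recordingSwitch) StopPeerForError(peer Peer, reason interface{}) {
	sw.stopped[peer.ID()] = reason
}

var errEmptyMessage = errors.New("empty message")

// validate is a hypothetical message check; real reactors run
// protocol-specific validation here.
func validate(msg []byte) error {
	if len(msg) == 0 {
		return errEmptyMessage
	}
	return nil
}

// handleMessage mimics a reactor processing a received message: when
// processing fails, the produced error is passed to the switch as the reason.
func handleMessage(sw Switch, src Peer, msg []byte) bool {
	if err := validate(msg); err != nil {
		sw.StopPeerForError(src, err)
		return false
	}
	return true
}

func main() {
	sw := &recordingSwitch{stopped: make(map[string]interface{})}
	fmt.Println(handleMessage(sw, fakePeer{id: "peer-a"}, []byte("vote")))
	fmt.Println(handleMessage(sw, fakePeer{id: "peer-b"}, nil))
	fmt.Println(sw.stopped["peer-b"])
}
```

In this sketch only the peer that sent the invalid message is reported to the switch; the handler itself does not touch the connection, which matches the division of labour between reactors and the p2p layer described in this section.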

The `StopPeerForError()` method has an important *caveat*: if the peer to be
stopped is configured as a _persistent peer_, the switch will attempt
reconnecting to that same peer.
While this behaviour makes sense when the method is invoked by other components
of the p2p layer (e.g., in the case of communication errors), it does not make
sense when it is invoked by a reactor.

> **Note**
>
> A more comprehensive discussion regarding this topic can be found in the
> [knowledge-base repository](https://github.com/cometbft/knowledge-base/blob/main/p2p/reactors/stop-peer.md).

    func (sw *Switch) StopPeerGracefully(peer Peer)

The second method instructs the switch to disconnect from a peer for no
particular reason.
This method is only adopted by the PEX reactor of a node operating in _seed mode_,
as seed nodes disconnect from a peer after exchanging peer addresses with it.

### Reactors Table

The switch keeps track of all registered reactors, indexed by unique reactor names.
A reactor can therefore use the switch to access another `Reactor` from its `name`:

    func (sw *Switch) Reactor(name string) Reactor

This method is currently only used by the Block Sync reactor to access the
Consensus reactor implementation, from which it uses the exported
`SwitchToConsensus()` method.
While available, this inter-reactor interaction approach is discouraged and
should be avoided, as it violates the assumption that reactors are independent.

## `Peer` API

The [`Peer`][peer-interface] interface represents a connected peer.
A `Peer` instance encapsulates a multiplex connection that implements the
actual communication (sending and receiving messages) with a peer.
When a connection is established with a peer, the `Switch` provides the
corresponding `Peer` instance to all registered reactors.
From this point, reactors can use the methods of the new `Peer` instance.

The table below summarizes the interaction of the standard reactors with
connected peers, with the `Peer` methods used by them:

| `Peer` API method                          | consensus | block sync | state sync | mempool | evidence  | PEX   |
|--------------------------------------------|-----------|------------|------------|---------|-----------|-------|
| `ID() ID`                                  | x         | x          | x          | x       | x         | x     |
| `IsRunning() bool`                         | x         |            |            | x       | x         |       |
| `Quit() <-chan struct{}`                   |           |            |            | x       | x         |       |
| `Get(string) interface{}`                  | x         |            |            | x       | x         |       |
| `Set(string, interface{})`                 | x         |            |            |         |           |       |
| `Send(Envelope) bool`                      | x         | x          | x          | x       | x         | x     |
| `TrySend(Envelope) bool`                   | x         | x          |            |         |           |       |

The above list is not exhaustive as it does not include all the `Peer` methods
invoked by the PEX reactor, a special component that should be considered part
of the p2p layer. This document does not cover the operation of the PEX reactor
as a connection manager.

### Identification

Nodes in the p2p network are configured with a unique cryptographic key pair.
The public part of this key pair is verified when establishing a connection
with the peer, as part of the authentication handshake, and constitutes the
peer's `ID`:

    func (p Peer) ID() p2p.ID

Observe that each time the node connects to a peer (e.g., after disconnecting
from it), a new (distinct) `Peer` handler is provided to the reactors via the
`InitPeer(Peer)` method.
In fact, the `Peer` handler is associated with a _connection_ with a peer, not
with the actual _node_ in the network.
To keep track of actual peers, the unique peer `p2p.ID` provided by the above
method should be employed.

### Peer state

The switch starts the peer's send and receive routines before adding the peer
to every registered reactor using the `AddPeer(Peer)` method.
The reactors then usually start routines to interact with the newly connected
peer using the received `Peer` handler.
For these routines it is useful to check whether the peer is still connected
and its send and receive routines are still running:

    func (p Peer) IsRunning() bool
    func (p Peer) Quit() <-chan struct{}

The above two methods provide the same information about the state of a `Peer`
instance in two different ways.
Both of them are defined in the [`Service`][service-interface] interface.
The `IsRunning()` method is synchronous and returns whether the peer has been
started and has not been stopped.
The `Quit()` method returns a channel that is closed when the peer is stopped;
it is an asynchronous state query.

### Key-value store

Each `Peer` instance provides a synchronized key-value store that allows
sharing peer-specific state between reactors:

    func (p Peer) Get(key string) interface{}
    func (p Peer) Set(key string, data interface{})

This key-value store can be seen as an asynchronous mechanism to exchange the
state of a peer between reactors.
In the current use-case of this mechanism, the Consensus reactor populates the
key-value store with a `PeerState` instance for each connected peer.
The Consensus reactor routines interacting with a peer read and update the
shared peer state.
The Evidence and Mempool reactors, in their turn, periodically query the
key-value store of each peer for retrieving, in particular, the last height
reported by the peer.
This information, produced by the Consensus reactor, influences the interaction
of these two reactors with their peers.

> **Note**
>
> More details of how this key-value store is used to share state between reactors can be found in the
> [knowledge-base repository](https://github.com/cometbft/knowledge-base/blob/main/p2p/reactors/peer-kvstore.md).
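
The producer/consumer use of the key-value store described in this section can be sketched as below. This is a simplified illustration under stated assumptions, not the real implementation: `kvPeer` stands in for the `Get`/`Set` part of a `Peer`, `peerState` is a reduced stand-in for the Consensus reactor's `PeerState`, and the key string `peerStateKey` and helper `lastHeight` are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// kvPeer is a hypothetical stand-in for the key-value part of a Peer.
type kvPeer struct {
	mtx  sync.RWMutex
	data map[string]interface{}
}

func newKVPeer() *kvPeer {
	return &kvPeer{data: make(map[string]interface{})}
}

// Get returns the value stored under key, or nil if there is none.
func (p *kvPeer) Get(key string) interface{} {
	p.mtx.RLock()
	defer p.mtx.RUnlock()
	return p.data[key]
}

// Set stores value under key, replacing any previous value.
func (p *kvPeer) Set(key string, value interface{}) {
	p.mtx.Lock()
	defer p.mtx.Unlock()
	p.data[key] = value
}

// peerState is a reduced, hypothetical version of the consensus peer state.
type peerState struct {
	mtx    sync.Mutex
	height int64
}

func (ps *peerState) SetHeight(h int64) {
	ps.mtx.Lock()
	defer ps.mtx.Unlock()
	ps.height = h
}

func (ps *peerState) GetHeight() int64 {
	ps.mtx.Lock()
	defer ps.mtx.Unlock()
	return ps.height
}

// peerStateKey is an illustrative key name chosen for this sketch.
const peerStateKey = "example.peerState"

// lastHeight shows how a reading reactor could query the store. The second
// result is false when no peer state has been stored yet.
func lastHeight(p *kvPeer) (int64, bool) {
	ps, ok := p.Get(peerStateKey).(*peerState)
	if !ok {
		return 0, false
	}
	return ps.GetHeight(), true
}

func main() {
	peer := newKVPeer()
	_, ok := lastHeight(peer)
	fmt.Println(ok) // nothing stored yet

	// Producer side: store the shared state once, then keep updating it.
	ps := &peerState{}
	peer.Set(peerStateKey, ps)
	ps.SetHeight(42)

	h, ok := lastHeight(peer)
	fmt.Println(h, ok)
}
```

Because `Get()` returns an `interface{}`, the reader has to type-assert the stored value, and it should tolerate the key being absent: the reading reactor may query a peer before the producing reactor has populated the store.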

### Send methods

Finally, a `Peer` instance allows a reactor to send messages to companion
reactors running at that peer.
This is ultimately the goal of the switch when it provides `Peer` instances to
the registered reactors.
There are two methods for sending messages:

    func (p Peer) Send(e Envelope) bool
    func (p Peer) TrySend(e Envelope) bool

The two message-sending methods receive an `Envelope`, whose content should be
set as follows:

- `ChannelID`: the channel the message should be sent through, which defines
  the reactor that will process the message;
- `Src`: this field represents the source of an incoming message, which is
  irrelevant for outgoing messages;
- `Message`: the actual message's payload, which is marshalled using protocol buffers.

The two message-sending methods attempt to add the message (`e.Message`) to the
send queue of the peer's destination channel (`e.ChannelID`).
There is a send queue for each registered channel supported by the peer, and
each send queue has a capacity.
The capacity of the send queue of each channel is [configured][reactor-channels]
by reactors via the corresponding `ChannelDescriptor`.

The two message-sending methods return whether it was possible to enqueue
the marshalled message to the channel's send queue.
The most common reason for these methods to return `false` is the channel's
send queue being full.
Further reasons for returning `false` are: the peer being stopped, providing a
non-registered channel ID, or errors when marshalling the message's payload.

The difference between the two message-sending methods is _when_ they return `false`.
The `Send()` method is a _blocking_ method: it returns `false` if the message
could not be enqueued, because the channel's send queue is still full, after a
10-second _timeout_.
The `TrySend()` method is a _non-blocking_ method: it _immediately_ returns
`false` when the channel's send queue is full.

[peer-interface]: ../../../p2p/peer.go
[service-interface]: ../../../libs/service/service.go
[switch-type]: ../../../p2p/switch.go

[reactor-interface]: ../../../p2p/base_reactor.go
[reactor-registration]: ./reactor.md#registration
[reactor-channels]: ./reactor.md#registration
[reactor-addpeer]: ./reactor.md#peer-management
[reactor-removepeer]: ./reactor.md#stop-peer
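
The difference between the blocking and non-blocking send methods described in the Send methods section can be illustrated with a bounded queue. The sketch below is an assumption-laden model and not the actual connection code: `sendQueue`, `newSendQueue`, `send`, `trySend`, and `demoTimeout` are hypothetical names, and a short timeout replaces the 10-second one so the example finishes quickly.

```go
package main

import (
	"fmt"
	"time"
)

// demoTimeout is a shortened, illustrative timeout for this sketch.
const demoTimeout = 100 * time.Millisecond

// sendQueue is a hypothetical model of one channel's bounded send queue.
type sendQueue struct {
	ch      chan []byte
	timeout time.Duration
}

func newSendQueue(capacity int, timeout time.Duration) *sendQueue {
	return &sendQueue{ch: make(chan []byte, capacity), timeout: timeout}
}

// trySend is non-blocking: it fails immediately when the queue is full.
func (q *sendQueue) trySend(msg []byte) bool {
	select {
	case q.ch <- msg:
		return true
	default:
		return false
	}
}

// send blocks until the message is enqueued or the timeout expires.
func (q *sendQueue) send(msg []byte) bool {
	timer := time.NewTimer(q.timeout)
	defer timer.Stop()
	select {
	case q.ch <- msg:
		return true
	case <-timer.C:
		return false
	}
}

func main() {
	q := newSendQueue(1, demoTimeout)

	fmt.Println(q.trySend([]byte("a"))) // queue has room
	fmt.Println(q.trySend([]byte("b"))) // queue is full, fails at once

	// Free one slot shortly after send starts waiting.
	go func() {
		time.Sleep(10 * time.Millisecond)
		<-q.ch
	}()
	fmt.Println(q.send([]byte("c"))) // waits until the slot is freed

	fmt.Println(q.send([]byte("d"))) // queue stays full until the timeout
}
```

The trade-off shown here is the one a reactor faces when choosing between the two methods: the blocking variant tolerates a temporarily full queue at the cost of stalling the calling routine, while the non-blocking variant never stalls but leaves it to the caller to decide whether to retry or drop the message.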