github.com/aakash4dev/cometbft@v0.38.2/spec/p2p/reactor-api/p2p-api.md

     1  # API for Reactors
     2  
     3  This document describes the API provided by the p2p layer to the protocol
     4  layer, namely to the registered reactors.
     5  
     6  This API consists of two interfaces: the one provided by the `Switch` instance,
     7  and the ones provided by multiple `Peer` instances, one per connected peer.
     8  The `Switch` instance is provided to every reactor as part of the reactor's
     9  [registration procedure][reactor-registration].
    10  The multiple `Peer` instances are provided to every registered reactor whenever
    11  a [new connection with a peer][reactor-addpeer] is established.
    12  
    13  > **Note**
    14  >
    15  > The practical reasons that led to the interface being provided in two parts,
    16  > the `Switch` and `Peer` instances, are discussed in more detail in the
    17  > [knowledge-base repository](https://github.com/cometbft/knowledge-base/blob/main/p2p/reactors/switch-peer.md).
    18  
    19  ## `Switch` API
    20  
    21  The [`Switch`][switch-type] is the central component of the p2p layer
    22  implementation.  It manages all the reactors running in a node and keeps track
    23  of the connections with peers.
    24  The table below summarizes the interaction of the standard reactors with the `Switch`:
    25  
    26  | `Switch` API method                        | consensus | block sync | state sync | mempool | evidence  | PEX   |
    27  |--------------------------------------------|-----------|------------|------------|---------|-----------|-------|
    28  | `Peers() IPeerSet`                         | x         | x          |            |         |           | x     |
    29  | `NumPeers() (int, int, int)`               |           | x          |            |         |           | x     |
    30  | `Broadcast(Envelope) chan bool`            | x         | x          | x          |         |           |       |
    31  | `MarkPeerAsGood(Peer)`                     | x         |            |            |         |           |       |
    32  | `StopPeerForError(Peer, interface{})`      | x         | x          | x          | x       | x         | x     |
    33  | `StopPeerGracefully(Peer)`                 |           |            |            |         |           | x     |
    34  | `Reactor(string) Reactor`                  |           | x          |            |         |           |       |
    35  
    36  The above list is not exhaustive as it does not include all the `Switch` methods
    37  invoked by the PEX reactor, a special component that should be considered part
    38  of the p2p layer. This document does not cover the operation of the PEX reactor
    39  as a connection manager.
    40  
    41  ### Peers State
    42  
    43  The first two methods in the switch API allow reactors to query the state of
    44  the p2p layer: the set of connected peers.
    45  
    46      func (sw *Switch) Peers() IPeerSet
    47  
    48  The `Peers()` method returns the current set of connected peers.
    49  The returned `IPeerSet` is an immutable concurrency-safe copy of this set.
    50  Observe that the `Peer` handlers returned by this method were previously
    51  [added to the reactor][reactor-addpeer] via the `InitPeer(Peer)` method,
    52  but not yet removed via the `RemovePeer(Peer)` method.
    53  Thus, a priori, reactors should already have this information.
    54  
    55      func (sw *Switch) NumPeers() (outbound, inbound, dialing int)
    56  
    57  The `NumPeers()` method returns the current number of connected peers,
    58  distinguished between `outbound` and `inbound` peers.
    59  An `outbound` peer is a peer the node has dialed to, while an `inbound` peer is
    60  a peer the node has accepted a connection from.
    61  The third field, `dialing`, reports the number of peers to which the node is
    62  currently attempting to connect, that is, peers that are not (yet) connected.
    63  
    64  > **Note**
    65  >
    66  > The third field returned by `NumPeers()`, the number of peers in `dialing`
    67  > state, is not information that should concern the protocol layer.
    68  > In fact, with the exception of the PEX reactor, which can be considered part
    69  > of the p2p layer implementation, no standard reactor actually uses this
    70  > information, which could be removed when this interface is refactored.
    71  
    72  ### Broadcast
    73  
    74  The switch provides, mostly for historical or retro-compatibility reasons,
    75  a method for sending a message to all connected peers:
    76  
    77      func (sw *Switch) Broadcast(e Envelope) chan bool
    78  
    79  The `Broadcast()` method is not blocking and returns a channel of booleans.
    80  For every connected `Peer`, it starts a background thread for sending the
    81  message to that peer, using the `Peer.Send()` method
    82  (which is blocking, as detailed in [Send Methods](#send-methods)).
    83  The result of each unicast send operation (success or failure) is added to the
    84  returned channel, which is closed when all operations are completed.
    85  
    86  > **Note**
    87  >
    88  > - The current _implementation_ of the `Switch.Broadcast(Envelope)` method is
    89  >   not efficient, as the marshalling of the provided message is performed as
    90  >   part of the `Peer.Send(Envelope)` helper method, that is, once per
    91  >   connected peer.
    92  > - The return value of the broadcast method is not considered by any of the
    93  >   standard reactors that employ the method. One of the reasons is that it is
    94  >   not possible to associate each of the boolean outputs added to the
    95  >   returned channel to a peer.
    96  
    97  ### Vetting Peers
    98  
    99  The p2p layer relies on the registered reactors to gauge the _quality_ of peers.
   100  The following method can be invoked by a reactor to inform the p2p layer that a
   101  peer has presented a "good" behaviour.
   102  This information is registered in the node's address book and influences the
   103  operation of the Peer Exchange (PEX) protocol, as node discovery adopts a bias
   104  towards "good" peers:
   105  
   106      func (sw *Switch) MarkPeerAsGood(peer Peer)
   107  
   108  At the moment, it is up to the consensus reactor to vet a peer.
   109  In the current logic, a peer is marked as good whenever the consensus protocol
   110  collects a multiple of `votesToContributeToBecomeGoodPeer = 10000` useful votes
   111  or `blocksToContributeToBecomeGoodPeer = 10000` useful block parts from that peer.
   112  By "useful", the consensus implementation considers messages that are valid and
   113  that are received by the node when the node is expecting such information,
   114  which excludes duplicated or late received messages.
   115  
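The "multiple of the threshold" rule can be illustrated with a small counter. This is a sketch under assumptions: the `goodPeerMarker` interface, the `recorder` and `voteTracker` types, and the use of a string peer identifier are hypothetical; only the constant name and value come from the text above.

```go
package main

import "fmt"

// Threshold taken from the text above.
const votesToContributeToBecomeGoodPeer = 10000

// goodPeerMarker is a hypothetical stand-in for the Switch method used here;
// the real method receives a Peer rather than a string identifier.
type goodPeerMarker interface {
	MarkPeerAsGood(peerID string)
}

// recorder counts how many times each peer was marked as good.
type recorder struct{ marks map[string]int }

func (r *recorder) MarkPeerAsGood(peerID string) { r.marks[peerID]++ }

// voteTracker counts useful votes per peer.
type voteTracker struct {
	sw    goodPeerMarker
	votes map[string]int
}

// recordUsefulVote registers one useful vote from the peer and marks the peer
// as good each time its count reaches a multiple of the threshold.
func (v *voteTracker) recordUsefulVote(peerID string) {
	v.votes[peerID]++
	if v.votes[peerID]%votesToContributeToBecomeGoodPeer == 0 {
		v.sw.MarkPeerAsGood(peerID)
	}
}

func main() {
	rec := &recorder{marks: map[string]int{}}
	tracker := &voteTracker{sw: rec, votes: map[string]int{}}
	for i := 0; i < 25000; i++ {
		tracker.recordUsefulVote("peer-a")
	}
	fmt.Println(rec.marks["peer-a"])
}
```

With this rule a peer is marked as good repeatedly, once per full batch of useful votes, rather than only the first time it crosses the threshold.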
   116  > **Note**
   117  >
   118  > The switch doesn't currently provide a method to mark a peer as a bad peer.
   119  > In fact, peer quality management is not really implemented in the current
   120  > version of the p2p layer.
   121  > This topic is being discussed in the [knowledge-base repository](https://github.com/cometbft/knowledge-base/blob/main/p2p/reactors/peer-quality.md).
   122  
   123  ### Stopping Peers
   124  
   125  Reactors can instruct the p2p layer to disconnect from a peer.
   126  Using the p2p layer's nomenclature, the reactor requests a peer to be stopped.
   127  The peer's send and receive routines are in fact stopped, interrupting the
   128  communication with the peer.
   129  The `Peer` is then [removed from every registered reactor][reactor-removepeer],
   130  using the `RemovePeer(Peer)` method, and from the set of connected peers.
   131  
   132      func (sw *Switch) StopPeerForError(peer Peer, reason interface{})
   133  
   134  All the standard reactors employ the above method for disconnecting from a peer
   135  in case of errors.
   136  These are errors that occur when processing a message received from a `Peer`.
   137  The produced `error` is provided to the method as the `reason`.
   138  
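The typical usage pattern, stopping a peer when one of its messages fails validation, might look like the sketch below. The `peerStopper` interface, the `stopLog` type, and the `validate` and `receive` functions are hypothetical, and a string identifier stands in for the `Peer`.

```go
package main

import (
	"errors"
	"fmt"
)

// peerStopper is a hypothetical stand-in for the Switch method used here; the
// real method receives a Peer rather than a string identifier.
type peerStopper interface {
	StopPeerForError(peerID string, reason interface{})
}

// stopLog records the stop requests it receives.
type stopLog struct{ reasons map[string]interface{} }

func (l *stopLog) StopPeerForError(peerID string, reason interface{}) {
	l.reasons[peerID] = reason
}

var errEmptyMessage = errors.New("empty message")

// validate is a placeholder for the reactor's message validation.
func validate(msg []byte) error {
	if len(msg) == 0 {
		return errEmptyMessage
	}
	return nil
}

// receive processes a message from a peer and requests the peer to be stopped
// when the message is invalid, passing the error as the reason.
func receive(sw peerStopper, peerID string, msg []byte) {
	if err := validate(msg); err != nil {
		sw.StopPeerForError(peerID, err)
		return
	}
	// ... handle the valid message ...
}

func main() {
	sw := &stopLog{reasons: map[string]interface{}{}}
	receive(sw, "peer-a", []byte("ok"))
	receive(sw, "peer-b", nil)
	fmt.Println(len(sw.reasons), sw.reasons["peer-b"])
}
```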
   139  The `StopPeerForError()` method has an important *caveat*: if the peer to be
   140  stopped is configured as a _persistent peer_, the switch will attempt
   141  reconnecting to that same peer.
   142  While this behaviour makes sense when the method is invoked by other components
   143  of the p2p layer (e.g., in the case of communication errors), it does not make
   144  sense when it is invoked by a reactor.
   145  
   146  > **Note**
   147  >
   148  > A more comprehensive discussion regarding this topic can be found in the
   149  > [knowledge-base repository](https://github.com/cometbft/knowledge-base/blob/main/p2p/reactors/stop-peer.md).
   150  
   151      func (sw *Switch) StopPeerGracefully(peer Peer)
   152  
   153  The second method instructs the switch to disconnect from a peer for no
   154  particular reason.
   155  This method is only adopted by the PEX reactor of a node operating in _seed mode_,
   156  as seed nodes disconnect from a peer after exchanging peer addresses with it.
   157  
   158  ### Reactors Table
   159  
   160  The switch keeps track of all registered reactors, indexed by unique reactor names.
   161  A reactor can therefore use the switch to access another `Reactor` from its `name`:
   162  
   163      func (sw *Switch) Reactor(name string) Reactor
   164  
   165  This method is currently only used by the Block Sync reactor to access the
   166  Consensus reactor implementation, from which it uses the exported
   167  `SwitchToConsensus()` method.
   168  While available, this inter-reactor interaction approach is discouraged and
   169  should be avoided, as it violates the assumption that reactors are independent.
   170  
   171  
   172  ## `Peer` API
   173  
   174  The [`Peer`][peer-interface] interface represents a connected peer.
   175  A `Peer` instance encapsulates a multiplex connection that implements the
   176  actual communication (sending and receiving messages) with a peer.
   177  When a connection is established with a peer, the `Switch` provides the
   178  corresponding `Peer` instance to all registered reactors.
   179  From this point, reactors can use the methods of the new `Peer` instance.
   180  
   181  The table below summarizes the interaction of the standard reactors with
   182  connected peers, with the `Peer` methods used by them:
   183  
   184  | `Peer` API method                          | consensus | block sync | state sync | mempool | evidence  | PEX   |
   185  |--------------------------------------------|-----------|------------|------------|---------|-----------|-------|
   186  | `ID() ID`                                  | x         | x          | x          | x       | x         | x     |
   187  | `IsRunning() bool`                         | x         |            |            | x       | x         |       |
   188  | `Quit() <-chan struct{}`                   |           |            |            | x       | x         |       |
   189  | `Get(string) interface{}`                  | x         |            |            | x       | x         |       |
   190  | `Set(string, interface{})`                 | x         |            |            |         |           |       |
   191  | `Send(Envelope) bool`                      | x         | x          | x          | x       | x         | x     |
   192  | `TrySend(Envelope) bool`                   | x         | x          |            |         |           |       |
   193  
   194  The above list is not exhaustive as it does not include all the `Peer` methods
   195  invoked by the PEX reactor, a special component that should be considered part
   196  of the p2p layer. This document does not cover the operation of the PEX reactor
   197  as a connection manager.
   198  
   199  ### Identification
   200  
   201  Nodes in the p2p network are configured with a unique cryptographic key pair.
   202  The public part of this key pair is verified when establishing a connection
   203  with the peer, as part of the authentication handshake, and constitutes the
   204  peer's `ID`:
   205  
   206      func (p Peer) ID() p2p.ID
   207  
   208  Observe that each time the node connects to a peer (e.g., after disconnecting
   209  from it), a new (distinct) `Peer` handler is provided to the reactors via the
   210  `InitPeer(Peer)` method.
   211  In fact, the `Peer` handler is associated to a _connection_ with a peer, not to
   212  the actual _node_ in the network.
   213  To keep track of actual peers, the unique peer `p2p.ID` provided by the above
   214  method should be employed.
   215  
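The distinction between a handler and a node can be made concrete with a small sketch. The `ID` and `peer` types and the `connTracker` helper below are hypothetical stand-ins, not the p2p package's types.

```go
package main

import "fmt"

// ID and peer are hypothetical stand-ins for p2p.ID and the Peer interface.
type ID string

type peer struct{ id ID }

func (p *peer) ID() ID { return p.id }

// connTracker counts connections per node, keyed by peer ID rather than by
// Peer handle, since every connection yields a distinct handle.
type connTracker struct{ connections map[ID]int }

// initPeer mimics the reactor's InitPeer callback.
func (c *connTracker) initPeer(p *peer) { c.connections[p.ID()]++ }

func main() {
	tracker := &connTracker{connections: map[ID]int{}}
	first := &peer{id: "node-1"}
	second := &peer{id: "node-1"} // same node, new connection
	tracker.initPeer(first)
	tracker.initPeer(second)
	fmt.Println(first == second, tracker.connections["node-1"])
}
```

State keyed by the handle itself would be lost on reconnection, while state keyed by the ID survives it.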
   216  ### Peer state
   217  
   218  The switch starts the peer's send and receive routines before adding the peer
   219  to every registered reactor using the `AddPeer(Peer)` method.
   220  The reactors then usually start routines to interact with the new connected
   221  peer using the received `Peer` handler.
   222  For these routines it is useful to check whether the peer is still connected
   223  and its send and receive routines are still running:
   224  
   225      func (p Peer) IsRunning() bool
   226      func (p Peer) Quit() <-chan struct{}
   227  
   228  The above two methods provide the same information about the state of a `Peer`
   229  instance in two different ways.
   230  Both of them are defined in the [`Service`][service-interface] interface.
   231  The `IsRunning()` method is synchronous and returns whether the peer has been
   232  started and has not been stopped.
   233  The `Quit()` method returns a channel that is closed when the peer is stopped;
   234  it is an asynchronous state query.
   235  
   236  ### Key-value store
   237  
   238  Each `Peer` instance provides a synchronized key-value store that allows
   239  sharing peer-specific state between reactors:
   240  
   241  
   242      func (p Peer) Get(key string) interface{}
   243      func (p Peer) Set(key string, data interface{})
   244  
   245  This key-value store can be seen as an asynchronous mechanism to exchange the
   246  state of a peer between reactors.
   247  In the current use-case of this mechanism, the Consensus reactor populates the
   248  key-value store with a `PeerState` instance for each connected peer.
   249  The Consensus reactor routines interacting with a peer read and update the
   250  shared peer state.
   251  The Evidence and Mempool reactors, in their turn, periodically query the
   252  key-value store of each peer for retrieving, in particular, the last height
   253  reported by the peer.
   254  This information, produced by the Consensus reactor, influences the interaction
   255  of these two reactors with their peers.
   256  
   257  > **Note**
   258  >
   259  > More details on how this key-value store is used to share state between reactors can be found in the
   260  > [knowledge-base repository](https://github.com/cometbft/knowledge-base/blob/main/p2p/reactors/peer-kvstore.md).
   261  
   262  ### Send methods
   263  
   264  Finally, a `Peer` instance allows a reactor to send messages to companion
   265  reactors running at that peer.
   266  This is ultimately the goal of the switch when it provides `Peer` instances to
   267  the registered reactors.
   268  There are two methods for sending messages:
   269  
   270      func (p Peer) Send(e Envelope) bool
   271      func (p Peer) TrySend(e Envelope) bool
   272  
   273  The two message-sending methods receive an `Envelope`, whose content should be
   274  set as follows:
   275  
   276  - `ChannelID`: the channel the message should be sent through, which defines
   277    the reactor that will process the message;
   278  - `Src`: this field represents the source of an incoming message, which is
   279    irrelevant for outgoing messages;
   280  - `Message`: the actual message's payload, which is marshalled using protocol buffers.
   281  
   282  The two message-sending methods attempt to add the message (`e.Message`) to the
   283  send queue of the peer's destination channel (`e.ChannelID`).
   284  There is a send queue for each registered channel supported by the peer, and
   285  each send queue has a capacity.
   286  The capacity of the send queue of each channel is [configured][reactor-channels]
   287  by reactors via the corresponding `ChannelDescriptor`.
   288  
   289  The two message-sending methods return whether it was possible to enqueue
   290  the marshalled message to the channel's send queue.
   291  The most common reason for these methods to return `false` is the channel's
   292  send queue being full.
   293  Further reasons for returning `false` are: the peer being stopped, providing a
   294  non-registered channel ID, or errors when marshalling the message's payload.
   295  
   296  The difference between the two message-sending methods is _when_ they return `false`.
   297  The `Send()` method is a _blocking_ method: it returns `false` if the message
   298  could not be enqueued, because the channel's send queue is still full, after a
   299  10-second _timeout_.
   300  The `TrySend()` method is a _non-blocking_ method: it _immediately_ returns
   301  `false` when the channel's send queue is full.
   302  
   303  [peer-interface]: ../../../p2p/peer.go
   304  [service-interface]: ../../../libs/service/service.go
   305  [switch-type]: ../../../p2p/switch.go
   306  
   307  [reactor-interface]: ../../../p2p/base_reactor.go
   308  [reactor-registration]: ./reactor.md#registration
   309  [reactor-channels]: ./reactor.md#registration
   310  [reactor-addpeer]: ./reactor.md#peer-management
   311  [reactor-removepeer]: ./reactor.md#stop-peer