
# ADR 062: P2P Architecture and Abstractions

## Changelog

- 2020-11-09: Initial version (@erikgrinaker)
- 2020-11-13: Remove stream IDs, move peer errors onto channel, note on moving PEX into core (@erikgrinaker)
- 2020-11-16: Notes on recommended reactor implementation patterns, approve ADR (@erikgrinaker)
- 2021-02-04: Update with new P2P core and Transport API changes (@erikgrinaker)

## Context

In [ADR 061](adr-061-p2p-refactor-scope.md) we decided to refactor the peer-to-peer (P2P) networking stack. The first phase is to redesign and refactor the internal P2P architecture, while retaining protocol compatibility as far as possible.

## Alternative Approaches

Several variations of the proposed design were considered, including e.g. calling interface methods instead of passing messages (like the current architecture), merging channels with streams, exposing the internal peer data structure to reactors, being message format-agnostic via arbitrary codecs, and so on. This design was chosen because it has very loose coupling, is simpler to reason about and more convenient to use, avoids race conditions and lock contention for internal data structures, gives reactors better control of message ordering and processing semantics, and allows for QoS scheduling and backpressure in a very natural way.

[multiaddr](https://github.com/multiformats/multiaddr) was considered as a transport-agnostic peer address format over regular URLs, but it does not appear to have very widespread adoption, and advanced features like protocol encapsulation and tunneling do not appear to be immediately useful to us.

There were also proposals to use LibP2P instead of maintaining our own P2P stack, which were rejected (for now) in [ADR 061](adr-061-p2p-refactor-scope.md).

The initial version of this ADR had a byte-oriented multi-stream transport API, but this had to be abandoned/postponed to maintain backwards-compatibility with the existing MConnection protocol, which is message-oriented. See the rejected RFC in [tendermint/spec#227](https://github.com/tendermint/spec/pull/227) for details.

## Decision

The P2P stack will be redesigned as a message-oriented architecture, primarily relying on Go channels for communication and scheduling. It will use a message-oriented transport to exchange binary messages with individual peers, bidirectional peer-addressable channels to send and receive Protobuf messages, a router to route messages between reactors and peers, and a peer manager to manage peer lifecycle information. Message passing is asynchronous with at-most-once delivery.

## Detailed Design

This ADR is primarily concerned with the architecture and interfaces of the P2P stack, not implementation details. The interfaces described here should therefore be considered a rough architecture outline, not a complete and final design.

Primary design objectives have been:

* Loose coupling between components, for a simpler, more robust, and test-friendly architecture.
* Pluggable transports (not necessarily networked).
* Better scheduling of messages, with improved prioritization, backpressure, and performance.
* Centralized peer lifecycle and connection management.
* Better peer address detection, advertisement, and exchange.
* Wire-level backwards compatibility with current P2P network protocols, except where it proves too obstructive.

The main abstractions in the new stack are:

* `Transport`: An arbitrary mechanism to exchange binary messages with a peer across a `Connection`.
* `Channel`: A bidirectional channel to asynchronously exchange Protobuf messages with peers using node ID addressing.
* `Router`: Maintains transport connections to relevant peers and routes channel messages.
* `PeerManager`: Manages peer lifecycle information, e.g. deciding which peers to dial and when, using a `peerStore` for storage.
* Reactor: A design pattern loosely defined as "something which listens on a channel and reacts to messages".

These abstractions are illustrated in the following diagram (representing the internals of node A) and described in detail below.

![P2P Architecture Diagram](img/adr-062-architecture.svg)

### Transports

Transports are arbitrary mechanisms for exchanging binary messages with a peer. For example, a gRPC transport would connect to a peer over TCP/IP and send data using the gRPC protocol, while an in-memory transport might communicate with a peer running in another goroutine using internal Go channels. Note that transports don't have a notion of a "peer" or "node" as such - instead, they establish connections between arbitrary endpoint addresses (e.g. IP address and port number), to decouple them from the rest of the P2P stack.

Transports must satisfy the following requirements:

* Be connection-oriented, and support both listening for inbound connections and making outbound connections using endpoint addresses.

* Support sending binary messages with distinct channel IDs (although channels and channel IDs are a higher-level application protocol concept explained in the Router section, they are threaded through the transport layer as well for backwards compatibility with the existing MConnection protocol).

* Exchange the MConnection `NodeInfo` and public key via a node handshake, and possibly encrypt or sign the traffic as appropriate.

The initial transport is a port of the MConnection protocol currently used by Tendermint, and should be backwards-compatible with it at the wire level. An in-memory transport for testing has also been implemented. There are plans to explore a QUIC transport that may replace the MConnection protocol.

The `Transport` interface is as follows:

```go
// Transport is a connection-oriented mechanism for exchanging data with a peer.
type Transport interface {
    // Protocols returns the protocols supported by the transport. The Router
    // uses this to pick a transport for an Endpoint.
    Protocols() []Protocol

    // Endpoints returns the local endpoints the transport is listening on, if any.
    // How to listen is transport-dependent, e.g. MConnTransport uses Listen() while
    // MemoryTransport starts listening via MemoryNetwork.CreateTransport().
    Endpoints() []Endpoint

    // Accept waits for the next inbound connection on a listening endpoint, blocking
    // until either a connection is available or the transport is closed. On closure,
    // io.EOF is returned and further Accept calls are futile.
    Accept() (Connection, error)

    // Dial creates an outbound connection to an endpoint.
    Dial(context.Context, Endpoint) (Connection, error)

    // Close stops accepting new connections, but does not close active connections.
    Close() error
}
```

How the transport configures listening is transport-dependent, and not covered by the interface. This typically happens during transport construction, where a single instance of the transport is created and set to listen on an appropriate network interface before being passed to the router.
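
For illustration, transport setup might look like the following sketch. `NewMConnTransport` and `NewRouter` are illustrative names (this ADR only notes that `MConnTransport` listens via `Listen()`), so the signatures here are assumptions rather than the finalized API:

```go
// A hypothetical setup sketch: NewMConnTransport, Listen's signature, and
// NewRouter are illustrative assumptions, not the finalized API.
transport := NewMConnTransport(logger)
err := transport.Listen(Endpoint{
    Protocol: "mconn",
    IP:       net.IPv4zero, // listen on all network interfaces
    Port:     26656,
})
if err != nil {
    return err
}

// The listening transport is passed to the router, which accepts inbound
// connections from it via Accept() and makes outbound connections via Dial().
router, err := NewRouter(logger, nodeInfo, privKey, peerManager, []Transport{transport}, RouterOptions{})
```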

#### Endpoints

`Endpoint` represents a transport endpoint (e.g. an IP address and port). A connection always has two endpoints: one at the local node and one at the remote peer. Outbound connections to remote endpoints are made via `Dial()`, and inbound connections to listening endpoints are returned via `Accept()`.

The `Endpoint` struct is:

```go
// Endpoint represents a transport connection endpoint, either local or remote.
//
// Endpoints are not necessarily networked (see e.g. MemoryTransport) but all
// networked endpoints must use IP as the underlying transport protocol to allow
// e.g. IP address filtering. Either IP or Path (or both) must be set.
type Endpoint struct {
    // Protocol specifies the transport protocol.
    Protocol Protocol

    // IP is an IP address (v4 or v6) to connect to. If set, this defines the
    // endpoint as a networked endpoint.
    IP net.IP

    // Port is a network port (either TCP or UDP). If 0, a default port may be
    // used depending on the protocol.
    Port uint16

    // Path is an optional transport-specific path or identifier.
    Path string
}

// Protocol identifies a transport protocol.
type Protocol string
```

Endpoints are arbitrary transport-specific addresses, but if they are networked they must use IP addresses, and thus rely on IP as a fundamental packet routing protocol. This enables policies for address discovery, advertisement, and exchange - for example, a private `192.168.0.0/24` IP address should only be advertised to peers on that IP network, while the public address `8.8.8.8` may be advertised to all peers. Similarly, any port numbers, if given, must represent TCP and/or UDP port numbers, in order to use [UPnP](https://en.wikipedia.org/wiki/Universal_Plug_and_Play) to autoconfigure e.g. NAT gateways.

Non-networked endpoints (without an IP address) are considered local, and will only be advertised to other peers connecting via the same protocol. For example, the in-memory transport used for testing uses `Endpoint{Protocol: "memory", Path: "foo"}` as an address for the node "foo", and this should only be advertised to other nodes using `Protocol: "memory"`.
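
To make the distinction concrete, here is a small sketch of a networked and a non-networked endpoint (the `mconn` protocol name matches the address scheme used later in this ADR; the concrete IP and port are illustrative):

```go
// A networked endpoint: the IP address makes it subject to IP-based address
// advertisement policies (the concrete address and port are illustrative).
mconnEndpoint := Endpoint{Protocol: "mconn", IP: net.IPv4(192, 168, 0, 1), Port: 26656}

// A non-networked endpoint for the in-memory test transport's node "foo":
// considered local, and only advertised to peers using the same protocol.
memoryEndpoint := Endpoint{Protocol: "memory", Path: "foo"}
```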

#### Connections

A connection represents an established transport connection between two endpoints (i.e. two nodes), which can be used to exchange binary messages with logical channel IDs (corresponding to the higher-level channel IDs used in the router). Connections are set up either via `Transport.Dial()` (outbound) or `Transport.Accept()` (inbound).

Once a connection is established, `Connection.Handshake()` must be called to perform a node handshake, exchanging node info and public keys to verify node identities. Node handshakes should not really be part of the transport layer (they are an application protocol concern), but this exists for backwards-compatibility with the existing MConnection protocol, which conflates the two. `NodeInfo` is part of the existing MConnection protocol, but does not appear to be documented in the specification -- refer to the Go codebase for details.

The `Connection` interface is shown below. It omits certain additions that are currently implemented for backwards compatibility with the legacy P2P stack and are planned to be removed before the final release.

```go
// Connection represents an established connection between two endpoints.
type Connection interface {
    // Handshake executes a node handshake with the remote peer. It must be
    // called once the connection is established, and returns the remote peer's
    // node info and public key. The caller is responsible for validation.
    Handshake(context.Context, NodeInfo, crypto.PrivKey) (NodeInfo, crypto.PubKey, error)

    // ReceiveMessage returns the next message received on the connection,
    // blocking until one is available. Returns io.EOF if closed.
    ReceiveMessage() (ChannelID, []byte, error)

    // SendMessage sends a message on the connection. Returns io.EOF if closed.
    SendMessage(ChannelID, []byte) error

    // LocalEndpoint returns the local endpoint for the connection.
    LocalEndpoint() Endpoint

    // RemoteEndpoint returns the remote endpoint for the connection.
    RemoteEndpoint() Endpoint

    // Close closes the connection.
    Close() error
}
```
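
For example, an inbound connection might be handled roughly as follows (a sketch only; `ctx`, `transport`, `selfInfo`, and `selfKey` are assumed to be in scope, and validation details are elided):

```go
// Accept the next inbound connection, blocking until one is available.
conn, err := transport.Accept()
if err != nil {
    return err // io.EOF once the transport is closed
}
defer conn.Close()

// Exchange node info and public keys with the peer. The caller must validate
// the results, e.g. check that the node ID matches the peer's public key.
peerInfo, peerKey, err := conn.Handshake(ctx, selfInfo, selfKey)
if err != nil {
    return err
}
```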

This ADR initially proposed a byte-oriented multi-stream connection API that follows more typical networking API conventions (using e.g. `io.Reader` and `io.Writer` interfaces which easily compose with other libraries). This would also allow moving the responsibility for message framing, node handshakes, and traffic scheduling to the common router instead of reimplementing this across transports, and would allow making better use of multi-stream protocols such as QUIC. However, this would require minor breaking changes to the MConnection protocol which were rejected; see [tendermint/spec#227](https://github.com/tendermint/spec/pull/227) for details. This should be revisited when starting work on a QUIC transport.

### Peer Management

Peers are other Tendermint nodes. Each peer is identified by a unique `NodeID` (tied to the node's private key).

#### Peer Addresses

Nodes have one or more `NodeAddress` addresses, expressed as URLs, at which they can be reached. For example:

* `mconn://nodeid@host.domain.com:25567/path`
* `memory:nodeid`

Addresses are resolved into one or more transport endpoints, e.g. by resolving DNS hostnames into IP addresses. Peers should always be expressed as address URLs rather than endpoints (which are a lower-level transport construct).

```go
// NodeID is a hex-encoded crypto.Address. It must be lowercased
// (for uniqueness) and of length 40.
type NodeID string

// NodeAddress is a node address URL. It differs from a transport Endpoint in
// that it contains the node's ID, and that the address hostname may be resolved
// into multiple IP addresses (and thus multiple endpoints).
//
// If the URL is opaque, i.e. of the form "scheme:opaque", then the opaque part
// is expected to contain a node ID.
type NodeAddress struct {
    NodeID   NodeID
    Protocol Protocol
    Hostname string
    Port     uint16
    Path     string
}

// ParseNodeAddress parses a node address URL into a NodeAddress, normalizing
// and validating it.
func ParseNodeAddress(urlString string) (NodeAddress, error)

// Resolve resolves a NodeAddress into a set of Endpoints, e.g. by expanding
// out a DNS hostname to IP addresses.
func (a NodeAddress) Resolve(ctx context.Context) ([]Endpoint, error)
```
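
As a usage sketch, parsing and resolving the example address from above (`ctx` is assumed to be in scope, and `nodeid` stands in for an actual 40-character hex node ID, which `ParseNodeAddress` would validate):

```go
// Parse a node address URL into a NodeAddress.
address, err := ParseNodeAddress("mconn://nodeid@host.domain.com:25567/path")
if err != nil {
    return err
}

// Resolve the hostname, yielding one Endpoint per resolved IP address.
endpoints, err := address.Resolve(ctx)
if err != nil {
    return err
}
```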

#### Peer Manager

The P2P stack needs to track a lot of internal state about peers, such as their addresses, connection state, priorities, availability, failures, retries, and so on. This responsibility has been separated out into a `PeerManager`, which tracks this state on behalf of the `Router` (but does not maintain the actual transport connections, which remain the router's responsibility).

The `PeerManager` is a synchronous state machine, where all state transitions are serialized (implemented as synchronous method calls holding an exclusive mutex lock). Most peer state is intentionally kept internal, stored in a `peerStore` database that persists it as appropriate, and the external interfaces pass the minimum amount of information necessary in order to avoid shared state between router goroutines. This design significantly simplifies the model, making it much easier to reason about and test than if it was baked into the asynchronous ball of concurrency that the P2P networking core must necessarily be. As peer lifecycle events are expected to be relatively infrequent, this should not significantly impact performance either.

The `Router` asks the `PeerManager` which peers to dial and evict, and reports peer lifecycle events such as connections, disconnections, and failures as they occur. The manager can reject these events (e.g. reject an inbound connection) by returning errors. This happens as follows (a sketch of the dial flow is shown after the interface below):

* Outbound connections, via `Transport.Dial`:
    * `DialNext()`: returns a peer address to dial, or blocks until one is available.
    * `DialFailed()`: reports a peer dial failure.
    * `Dialed()`: reports a peer dial success.
    * `Ready()`: reports the peer as routed and ready.
    * `Disconnected()`: reports a peer disconnection.

* Inbound connections, via `Transport.Accept`:
    * `Accepted()`: reports an inbound peer connection.
    * `Ready()`: reports the peer as routed and ready.
    * `Disconnected()`: reports a peer disconnection.

* Evictions, via `Connection.Close`:
    * `EvictNext()`: returns a peer to disconnect, or blocks until one is available.
    * `Disconnected()`: reports a peer disconnection.

These calls have the following interface:

```go
// DialNext returns a peer address to dial, blocking until one is available.
func (m *PeerManager) DialNext(ctx context.Context) (NodeAddress, error)

// DialFailed reports a dial failure for the given address.
func (m *PeerManager) DialFailed(address NodeAddress) error

// Dialed reports a successful outbound connection to the given address.
func (m *PeerManager) Dialed(address NodeAddress) error

// Accepted reports a successful inbound connection from the given node.
func (m *PeerManager) Accepted(peerID NodeID) error

// Ready reports the peer as fully routed and ready for use.
func (m *PeerManager) Ready(peerID NodeID) error

// EvictNext returns a peer ID to disconnect, blocking until one is available.
func (m *PeerManager) EvictNext(ctx context.Context) (NodeID, error)

// Disconnected reports a peer disconnection.
func (m *PeerManager) Disconnected(peerID NodeID) error
```
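
The sketch below illustrates how the router's dial flow might drive these calls; `dialPeer` and `routePeer` are hypothetical helpers that establish the connection (resolve the address, `Transport.Dial`, `Handshake`) and route messages until disconnect, and the actual goroutine structure is described in the Routers section:

```go
// A rough sketch of the router's dial loop, driving the PeerManager state
// machine; dialPeer and routePeer are hypothetical helpers.
func (r *Router) dialPeers(ctx context.Context) {
    for {
        address, err := r.peerManager.DialNext(ctx)
        if err != nil {
            return // e.g. context canceled on shutdown
        }
        go func() {
            conn, err := r.dialPeer(ctx, address)
            if err != nil {
                _ = r.peerManager.DialFailed(address) // may schedule a retry
                return
            }
            defer conn.Close()
            if err := r.peerManager.Dialed(address); err != nil {
                return // rejected, e.g. already connected to this peer
            }
            defer func() { _ = r.peerManager.Disconnected(address.NodeID) }()
            if err := r.peerManager.Ready(address.NodeID); err != nil {
                return
            }
            r.routePeer(address.NodeID, conn) // blocks until the connection closes
        }()
    }
}
```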

Internally, the `PeerManager` uses a numeric peer score to prioritize peers, e.g. when deciding which peers to dial next. The scoring policy has not yet been implemented, but should take into account e.g. node configuration such as `persistent_peers`, uptime and connection failures, performance, and so on. The manager will also attempt to automatically upgrade to better-scored peers by evicting lower-scored peers when a better one becomes available (e.g. when a persistent peer comes back online after an outage).

The `PeerManager` should also have an API for reporting peer behavior from reactors that affects its score (e.g. signing a block increases the score, double-voting decreases it or even bans the peer), but this has not yet been designed and implemented.

Additionally, the `PeerManager` provides `PeerUpdates` subscriptions that will receive `PeerUpdate` events whenever significant peer state changes happen. Reactors can use these e.g. to know when peers are connected or disconnected, and take appropriate action. This is currently fairly minimal:

```go
// Subscribe subscribes to peer updates. The caller must consume the peer updates
// in a timely fashion and close the subscription when done, to avoid stalling the
// PeerManager as delivery is semi-synchronous, guaranteed, and ordered.
func (m *PeerManager) Subscribe() *PeerUpdates

// PeerUpdate is a peer update event sent via PeerUpdates.
type PeerUpdate struct {
    NodeID NodeID
    Status PeerStatus
}

// PeerStatus is a peer status.
type PeerStatus string

const (
    PeerStatusUp   PeerStatus = "up"   // Connected and ready.
    PeerStatusDown PeerStatus = "down" // Disconnected.
)

// PeerUpdates is a real-time peer update subscription.
type PeerUpdates struct { ... }

// Updates returns a channel for consuming peer updates.
func (pu *PeerUpdates) Updates() <-chan PeerUpdate

// Close closes the peer updates subscription.
func (pu *PeerUpdates) Close()
```

The `PeerManager` will also be responsible for providing peer information to the PEX reactor that can be gossiped to other nodes. This requires an improved system for peer address detection and advertisement that e.g. reliably detects peer and self addresses and only gossips private network addresses to other peers on the same network, but this system has not yet been fully designed and implemented.

### Channels

While low-level data exchange happens via the `Transport`, the high-level API is based on a bidirectional `Channel` that can send and receive Protobuf messages addressed by `NodeID`. A channel is identified by an arbitrary `ChannelID` identifier, and can exchange Protobuf messages of one specific type (since the type to unmarshal into must be predefined). Message delivery is asynchronous and at-most-once.

The channel can also be used to report peer errors, e.g. when receiving an invalid or malicious message. This may cause the peer to be disconnected or banned, depending on `PeerManager` policy, but it should probably be replaced by a broader peer behavior API that can also report good behavior.

A `Channel` has this interface:

```go
// ChannelID is an arbitrary channel ID.
type ChannelID uint16

// Channel is a bidirectional channel to exchange Protobuf messages with peers.
type Channel struct {
    ID          ChannelID        // Channel ID.
    In          <-chan Envelope  // Inbound messages (peers to reactors).
    Out         chan<- Envelope  // Outbound messages (reactors to peers).
    Error       chan<- PeerError // Peer error reporting.
    messageType proto.Message    // Channel's message type, for e.g. unmarshaling.
}

// Close closes the channel, also closing Out and Error.
func (c *Channel) Close() error

// Envelope specifies the message receiver and sender.
type Envelope struct {
    From      NodeID        // Sender (empty if outbound).
    To        NodeID        // Receiver (empty if inbound).
    Broadcast bool          // Send to all connected peers, ignoring To.
    Message   proto.Message // Message payload.
}

// PeerError is a peer error reported via the Error channel.
type PeerError struct {
    NodeID NodeID
    Err    error
}
```

A channel can reach any connected peer, and will automatically (un)marshal the Protobuf messages. Message scheduling and queueing is a `Router` implementation concern, and can use any number of algorithms such as FIFO, round-robin, priority queues, etc. Since message delivery is not guaranteed, both inbound and outbound messages may be dropped, buffered, reordered, or blocked as appropriate.
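
For example, a reactor might use a channel as follows (a sketch; `peerID` is some connected peer, and `PingMessage` is the example message type defined in the [Reactor Example](#reactor-example) below):

```go
// Send a message to a single peer, addressed by node ID.
channel.Out <- Envelope{To: peerID, Message: &PingMessage{Content: "ping"}}

// Broadcast a message to all connected peers, ignoring To. Delivery is
// at-most-once, so messages may be dropped, e.g. under contention.
channel.Out <- Envelope{Broadcast: true, Message: &PingMessage{Content: "ping"}}

// Report a misbehaving peer, which the PeerManager may disconnect or ban.
channel.Error <- PeerError{NodeID: peerID, Err: errors.New("invalid message")}

// Receive the next inbound message; From identifies the sending peer.
envelope := <-channel.In
fmt.Printf("received %T from %q\n", envelope.Message, envelope.From)
```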

Since a channel can only exchange messages of a single type, it is often useful to use a wrapper message type with e.g. a Protobuf `oneof` field that specifies a set of inner message types that it can contain. The channel can automatically perform this (un)wrapping if the outer message type implements the `Wrapper` interface (see [Reactor Example](#reactor-example) for an example):

```go
// Wrapper is a Protobuf message that can contain a variety of inner messages.
// If a Channel's message type implements Wrapper, the channel will
// automatically (un)wrap passed messages using the container type, such that
// the channel can transparently support multiple message types.
type Wrapper interface {
    proto.Message

    // Wrap will take a message and wrap it in this one.
    Wrap(proto.Message) error

    // Unwrap will unwrap the inner message contained in this message.
    Unwrap() (proto.Message, error)
}
```

### Routers

The router executes the P2P networking stack for a node, taking instructions from and reporting events to the `PeerManager`, maintaining transport connections to peers, and routing messages between channels and peers.

Practically all concurrency in the P2P stack has been moved into the router and reactors, while as many other responsibilities as possible have been moved into separate components such as the `Transport` and `PeerManager` that can remain largely synchronous. Limiting concurrency to a single core component makes it much easier to reason about since there is only a single concurrency structure, while the remaining components can be serial, simple, and easily testable.

The `Router` has a very minimal API, since it is mostly driven by `PeerManager` and `Transport` events:

```go
// Router maintains peer transport connections and routes messages between
// peers and channels.
type Router struct {
    // Some details have been omitted below.

    logger          log.Logger
    options         RouterOptions
    nodeInfo        NodeInfo
    privKey         crypto.PrivKey
    peerManager     *PeerManager
    transports      []Transport

    peerMtx         sync.RWMutex
    peerQueues      map[NodeID]queue

    channelMtx      sync.RWMutex
    channelQueues   map[ChannelID]queue
}

// OpenChannel opens a new channel for the given message type. The caller must
// close the channel when done, before stopping the Router. messageType is the
// type of message passed through the channel.
func (r *Router) OpenChannel(id ChannelID, messageType proto.Message) (*Channel, error)

// Start starts the router, connecting to peers and routing messages.
func (r *Router) Start() error

// Stop stops the router, disconnecting from all peers and stopping message routing.
func (r *Router) Stop() error
```

All Go channel sends in the `Router` and reactors are blocking (the router also selects on signal channels for closure and shutdown). The responsibility for message scheduling, prioritization, backpressure, and load shedding is centralized in a core `queue` interface that is used at contention points (i.e. from all peers to a single channel, and from all channels to a single peer):

```go
// queue does QoS scheduling for Envelopes, enqueueing and dequeueing according
// to some policy. Queues are used at contention points, i.e.:
// - Receiving inbound messages to a single channel from all peers.
// - Sending outbound messages to a single peer from all channels.
type queue interface {
    // enqueue returns a channel for submitting envelopes.
    enqueue() chan<- Envelope

    // dequeue returns a channel ordered according to some queueing policy.
    dequeue() <-chan Envelope

    // close closes the queue. After this call enqueue() will block, so the
    // caller must select on closed() as well to avoid blocking forever. The
    // enqueue() and dequeue() channels will not be closed.
    close()

    // closed returns a channel that's closed when the scheduler is closed.
    closed() <-chan struct{}
}
```

The current implementation is `fifoQueue`, which is a simple unbuffered lossless queue that passes messages in the order they were received and blocks until the message is delivered (i.e. it is a Go channel). The router will need a more sophisticated queueing policy, but this has not yet been implemented.
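
A minimal sketch of such a FIFO queue, implementing the `queue` interface with an unbuffered Go channel (the actual implementation may differ in detail):

```go
// fifoQueue is an unbuffered, lossless queue: senders block until the message
// is delivered. Per the queue contract, close() does not close the Go channels,
// so enqueue() blocks after closure and callers must select on closed() too.
type fifoQueue struct {
    queueCh   chan Envelope
    closeCh   chan struct{}
    closeOnce sync.Once
}

func newFIFOQueue() *fifoQueue {
    return &fifoQueue{
        queueCh: make(chan Envelope), // unbuffered
        closeCh: make(chan struct{}),
    }
}

func (q *fifoQueue) enqueue() chan<- Envelope { return q.queueCh }
func (q *fifoQueue) dequeue() <-chan Envelope { return q.queueCh }

func (q *fifoQueue) close() {
    q.closeOnce.Do(func() { close(q.closeCh) })
}

func (q *fifoQueue) closed() <-chan struct{} {
    return q.closeCh
}
```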

The internal `Router` goroutine structure and design is described in the `Router` GoDoc, which is included below for reference:

```go
// On startup, three main goroutines are spawned to maintain peer connections:
//
//   dialPeers(): in a loop, calls PeerManager.DialNext() to get the next peer
//   address to dial and spawns a goroutine that dials the peer, handshakes
//   with it, and begins to route messages if successful.
//
//   acceptPeers(): in a loop, waits for an inbound connection via
//   Transport.Accept() and spawns a goroutine that handshakes with it and
//   begins to route messages if successful.
//
//   evictPeers(): in a loop, calls PeerManager.EvictNext() to get the next
//   peer to evict, and disconnects it by closing its message queue.
//
// When a peer is connected, an outbound peer message queue is registered in
// peerQueues, and routePeer() is called to spawn off two additional goroutines:
//
//   sendPeer(): waits for an outbound message from the peerQueues queue,
//   marshals it, and passes it to the peer transport which delivers it.
//
//   receivePeer(): waits for an inbound message from the peer transport,
//   unmarshals it, and passes it to the appropriate inbound channel queue
//   in channelQueues.
//
// When a reactor opens a channel via OpenChannel, an inbound channel message
// queue is registered in channelQueues, and a channel goroutine is spawned:
//
//   routeChannel(): waits for an outbound message from the channel, looks
//   up the recipient peer's outbound message queue in peerQueues, and submits
//   the message to it.
//
// All channel sends in the router are blocking. It is the responsibility of the
// queue interface in peerQueues and channelQueues to prioritize and drop
// messages as appropriate during contention to prevent stalls and ensure good
// quality of service.
```

### Reactor Example

While reactors are a first-class concept in the current P2P stack (i.e. there is an explicit `p2p.Reactor` interface), they will simply be a design pattern in the new stack, loosely defined as "something which listens on a channel and reacts to messages".

Since reactors have very few formal constraints, they can be implemented in a variety of ways. There is currently no recommended pattern for implementing reactors, to avoid overspecification and scope creep in this ADR. However, prototyping and developing a reactor pattern should be done early during implementation, to make sure reactors built using the `Channel` interface can satisfy the needs for convenience, deterministic tests, and reliability.

Below is a trivial example of a simple echo reactor implemented as a function. The reactor will exchange the following Protobuf messages:

```protobuf
message EchoMessage {
    oneof inner {
        PingMessage ping = 1;
        PongMessage pong = 2;
    }
}

message PingMessage {
    string content = 1;
}

message PongMessage {
    string content = 1;
}
```

Implementing the `Wrapper` interface for `EchoMessage` allows transparently passing `PingMessage` and `PongMessage` through the channel, where it will automatically be (un)wrapped in an `EchoMessage`:

```go
func (m *EchoMessage) Wrap(inner proto.Message) error {
    switch inner := inner.(type) {
    case *PingMessage:
        m.Inner = &EchoMessage_PingMessage{Ping: inner}
    case *PongMessage:
        m.Inner = &EchoMessage_PongMessage{Pong: inner}
    default:
        return fmt.Errorf("unknown message %T", inner)
    }
    return nil
}

func (m *EchoMessage) Unwrap() (proto.Message, error) {
    switch inner := m.Inner.(type) {
    case *EchoMessage_PingMessage:
        return inner.Ping, nil
    case *EchoMessage_PongMessage:
        return inner.Pong, nil
    default:
        return nil, fmt.Errorf("unknown message %T", inner)
    }
}
```

The reactor itself would be implemented e.g. like this:

```go
// RunEchoReactor wires up an echo reactor to a router and runs it.
func RunEchoReactor(router *p2p.Router, peerManager *p2p.PeerManager) error {
    channel, err := router.OpenChannel(1, &EchoMessage{})
    if err != nil {
        return err
    }
    defer channel.Close()
    peerUpdates := peerManager.Subscribe()
    defer peerUpdates.Close()

    return EchoReactor(context.Background(), channel, peerUpdates)
}

// EchoReactor provides an echo service, pinging all known peers until the given
// context is canceled.
func EchoReactor(ctx context.Context, channel *p2p.Channel, peerUpdates *p2p.PeerUpdates) error {
    ticker := time.NewTicker(5 * time.Second)
    defer ticker.Stop()

    for {
        select {
        // Send ping message to all known peers every 5 seconds.
        case <-ticker.C:
            channel.Out <- p2p.Envelope{
                Broadcast: true,
                Message:   &PingMessage{Content: "👋"},
            }

        // When we receive a message from a peer, either respond to ping, output
        // pong, or report peer error on unknown message type.
        case envelope := <-channel.In:
            switch msg := envelope.Message.(type) {
            case *PingMessage:
                channel.Out <- p2p.Envelope{
                    To:      envelope.From,
                    Message: &PongMessage{Content: msg.Content},
                }

            case *PongMessage:
                fmt.Printf("%q replied with %q\n", envelope.From, msg.Content)

            default:
                channel.Error <- p2p.PeerError{
                    NodeID: envelope.From,
                    Err:    fmt.Errorf("unexpected message %T", msg),
                }
            }

        // Output info about any peer status changes.
        case peerUpdate := <-peerUpdates.Updates():
            fmt.Printf("Peer %q changed status to %q\n", peerUpdate.NodeID, peerUpdate.Status)

        // Exit when context is canceled.
        case <-ctx.Done():
            return nil
        }
    }
}
```

## Status

Partially implemented ([#5670](https://github.com/number571/tendermint/issues/5670))

## Consequences

### Positive

* Reduced coupling and simplified interfaces should lead to better understandability, increased reliability, and more testing.

* Using message passing via Go channels gives better control of backpressure and quality-of-service scheduling.

* Peer lifecycle and connection management is centralized in a single entity, making it easier to reason about.

* Detection, advertisement, and exchange of node addresses will be improved.

* Additional transports (e.g. QUIC) can be implemented and used in parallel with the existing MConn protocol.

* The P2P protocol will not be broken in the initial version, if possible.

### Negative

* Fully implementing the new design as intended is likely to require breaking changes to the P2P protocol at some point, although the initial implementation shouldn't.

* Gradually migrating the existing stack and maintaining backwards-compatibility will be more labor-intensive than simply replacing the entire stack.

* A complete overhaul of P2P internals is likely to cause temporary performance regressions and bugs as the implementation matures.

* Hiding peer management information inside the `PeerManager` may prevent certain functionality or require additional deliberate interfaces for information exchange, as a tradeoff to simplify the design, reduce coupling, and avoid race conditions and lock contention.

### Neutral

* Implementation details around e.g. peer management, message scheduling, and peer and endpoint advertisement are not yet determined.

## References

* [ADR 061: P2P Refactor Scope](adr-061-p2p-refactor-scope.md)
* [#5670 p2p: internal refactor and architecture redesign](https://github.com/number571/tendermint/issues/5670)