# ADR 073: Adopt LibP2P

## Changelog

- 2021-11-02: Initial Draft (@tychoish)

## Status

Proposed.

## Context

As part of the 0.35 development cycle, the Tendermint team completed
the first phase of the work described in ADRs 61 and 62, which included a
large-scale refactoring of the reactors and the p2p message
routing. This replaced the switch and many of the other legacy
components without breaking protocol or network-level
interoperability, and left the legacy connection/socket handling code
in place.

Following the release, the team has reexamined the state of the code
and the design, as well as Tendermint's requirements. The notes
from that process are available in the [P2P Roadmap
RFC][rfc].

This ADR supersedes the decisions made in ADRs 61 and 62, but
builds on the completed portions of this work. Previously, the
boundaries of peer management, message handling, and the higher-level
business logic (e.g., "the reactors") were intermingled, and core
elements of the p2p system were responsible for the orchestration of
higher-level business logic. Refactoring the legacy components
made it more obvious that this entanglement of responsibilities
had outsized influence on the entire implementation, making
it difficult to iterate within the current abstractions.
It would not be viable to maintain interoperability with legacy
systems while also achieving many of our broader objectives.

LibP2P is a thoroughly specified implementation of a peer-to-peer
networking stack, designed specifically for systems such as
ours. Adopting LibP2P as the basis of Tendermint will allow the
Tendermint team to focus more of their time on other differentiating
aspects of the system, and make it possible for the ecosystem as a
whole to take advantage of the tooling and efforts of the LibP2P
platform.

## Alternative Approaches

As discussed in the [P2P Roadmap RFC][rfc], the primary alternative would be to
continue development of Tendermint's home-grown peer-to-peer
layer. While that would give the Tendermint team maximal control
over the peer system, the current design is unexceptional on its
own merits, and the prospective maintenance burden for this system
exceeds our tolerances for the medium term.

Tendermint can and should differentiate itself not on the basis of
its networking implementation or peer management tools, but by providing
a consistent operator experience, a battle-tested consensus algorithm,
and an ergonomic user experience.

## Decision

Tendermint will adopt libp2p during the 0.37 development cycle,
replacing the bespoke Tendermint P2P stack. This will remove the
`Endpoint`, `Transport`, `Connection`, and `PeerManager` abstractions
and leave the reactors, `p2p.Router` and `p2p.Channel`
abstractions.

LibP2P may obviate the need for a dedicated peer exchange (PEX)
reactor, which would also in turn obviate the need for a dedicated
seed mode. If this is the case, then all of this functionality would
be removed.

If it turns out (based on the advice of Protocol Labs) that it makes
sense to maintain separate pubsub or gossipsub topics
per-message-type, then the `Router` abstraction could also
be entirely subsumed.
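
To make the two topic layouts under consideration concrete, the following is
a minimal sketch of how topic names might be derived. The `/tendermint/...`
prefix, the function names, and the local `ChannelID` type are all
hypothetical and chosen only for illustration; nothing here is a decided
naming scheme or a libp2p API.

```go
package main

import "fmt"

// ChannelID is a local stand-in for a reactor channel identifier.
// It is declared here only so the sketch is self-contained.
type ChannelID uint16

// chainTopic names a single topic that carries all traffic for a chain.
// The "/tendermint/" prefix is an assumption made for illustration.
func chainTopic(chainID string) string {
	return fmt.Sprintf("/tendermint/%s", chainID)
}

// channelTopic names one topic per reactor channel (message type) on a chain.
// With this layout, topic membership could take over the routing that the
// Router performs today.
func channelTopic(chainID string, ch ChannelID) string {
	return fmt.Sprintf("/tendermint/%s/channel/%d", chainID, ch)
}

func main() {
	fmt.Println(chainTopic("test-chain"))
	fmt.Println(channelTopic("test-chain", 0x20))
}
```

With per-channel topics, a reactor would only ever see messages for the
topics it joined, which is the sense in which the `Router` could be subsumed.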

## Detailed Design

### Implementation Changes

The seams in the P2P implementation between the higher-level
constructs (reactors), the routing layer (`Router`), and the lower-level
connection and peer management code make this operation
relatively straightforward to implement. A key
goal in this design is to minimize the impact on the reactors
(potentially avoiding changes to them entirely) and to completely remove
the lower-level components (e.g., `Transport`, `Connection`, and
`PeerManager`) using the
separation afforded by the `Router` layer. The current state of the
code makes these changes relatively surgical, and limited to a small
number of methods:

- `p2p.Router.OpenChannel` will still return a `Channel` structure
  which will continue to serve as a pipe between the reactors and the
  `Router`. The implementation will no longer need the queue
  implementation, and will instead start goroutines that
  are responsible for routing the messages from the channel to libp2p
  primitives, replacing the current `p2p.Router.routeChannel`.

- The current `p2p.Router.dialPeers` and `p2p.Router.acceptPeers`
  are responsible for establishing outbound and inbound connections,
  respectively. These methods will be removed, along with
  `p2p.Router.openConnection`, and the libp2p connection manager will
  be responsible for maintaining network connectivity.

- The `p2p.Channel` interface will change to replace Go
  channels with a more functional interface for sending messages.
  New methods on this object will take contexts to support safe
  cancellation, return errors, and block rather than
  run asynchronously. The `Out` channel, through which
  reactors send messages to peers, will be replaced by a `Send`
  method, and the `Error` channel will be replaced by an `Error`
  method.

- Reactors will be passed an interface that will allow them to
  access peer information from libp2p. This will supplant the
  `p2p.PeerUpdates` subscription.

- Add some kind of heartbeat message at the application level
  (e.g., with a reactor), potentially connected to libp2p's DHT, to be
  used by reactors for service discovery, message targeting, or other
  features.

- Replace the existing/legacy handshake protocol with [Noise](http://www.noiseprotocol.org/noise.html).

This project will initially use the TCP-based transport protocols within
libp2p. QUIC is also available as an option that we may implement later.
We will not support mixed networks in the initial release, but will
revisit that possibility later if there is a demonstrated need.

### Upgrade and Compatibility

Because the routers and all current P2P libraries are `internal`
packages and not part of the public API, the only change to Tendermint's
public API surface will be to the configuration
file options: the current P2P options will be replaced with options
relevant to libp2p.

However, it will not be possible to run a network with both networking
stacks active at once, so the upgrade to the version of Tendermint
that adopts libp2p
will need to be coordinated between all nodes of the network. This is
consistent with the expectations around upgrades for Tendermint moving
forward, and will help manage both the complexity of the project and
the implementation timeline.

## Open Questions

- What is the role of Protocol Labs in the implementation of libp2p in
  Tendermint, both during the initial implementation and on an ongoing
  basis thereafter?

- Should all P2P traffic for a given node be pushed to a single topic,
  so that a topic maps to a specific ChainID, or should
  each reactor (or type of message) have its own topic? How many
  topics can a libp2p network support? Is there testing that validates
  the capabilities?

- Tendermint presently provides very coarse QoS-like functionality
  using priorities based on message type.
  This intuitively/theoretically ensures that evidence and consensus
  messages don't get starved by blocksync/statesync messages. It's
  unclear if we can or should attempt to replicate this with libp2p.

- What kind of QoS functionality does libp2p provide, and what kind of
  metrics does libp2p provide about its QoS functionality?

- Is it possible to store additional (and potentially arbitrary)
  information in the DHT as part of the heartbeats between nodes,
  such as the latest height, and then access that in the
  reactors? How frequently can the DHT be updated?

- Does it make sense to have reactors continue to consume inbound
  messages from a Channel (`In`) or is there another interface or
  pattern that we should consider?

  - We should avoid exposing Go channels when possible, and
    some kind of alternate iterator likely makes sense for processing
    messages within the reactors.

- What are the security and protocol implications of tracking
  information from peer heartbeats and exposing that to reactors?

- How much (or how little) configuration can Tendermint provide for
  libp2p, particularly on the first release?

  - In general, we should not support bring-your-own (BYO) functionality
    for libp2p components within Tendermint, and should reduce the
    configuration surface area as much as possible.

- What are the best ways to provide request/response semantics for
  reactors on top of libp2p? Will it be possible to add
  request/response semantics in a future release or is there
  anticipatory work that needs to be done as part of the initial
  release?

## Consequences

### Positive

- Reduce the maintenance burden for the Tendermint Core team by
  removing a large swath of legacy code that has proven to be
  difficult to modify safely.

- Remove the responsibility for maintaining and developing the entire
  peer management system and p2p stack.

- By providing users with a more stable peer and networking system,
  Tendermint can improve operator experience and network stability.

### Negative

- By deferring to library implementations for peer management and
  networking, Tendermint loses some flexibility for innovating at the
  peer and networking level. However, Tendermint should be innovating
  primarily at the consensus layer, and libp2p does not preclude
  optimization or development in the peer layer.

- Libp2p is a large dependency, and Tendermint would become dependent
  upon Protocol Labs' release cycle and prioritization for bug
  fixes. If this proves onerous, it's possible to maintain a vendor
  fork of relevant components as needed.

### Neutral

- N/A

## References

- [ADR 61: P2P Refactor Scope][adr61]
- [ADR 62: P2P Architecture][adr62]
- [P2P Roadmap RFC][rfc]

[adr61]: ./adr-061-p2p-refactor-scope.md
[adr62]: ./adr-062-p2p-architecture.md
[rfc]: ../rfc/rfc-000-p2p-roadmap.rst