# ADR 016: Protocol Versions

## TODO

- How to / should we version the authenticated encryption handshake itself (i.e.
  upfront protocol negotiation for the P2PVersion)?
- How to / should we version ABCI itself? Should it just be absorbed by the
  BlockVersion?

## Changelog

- 18-09-2018: Updates after working a bit on implementation
  - ABCI Handshake needs to happen independently of starting the app
    conns so we can see the result
  - Add question about ABCI protocol version
- 16-08-2018: Updates after discussion with SDK team
  - Remove signalling for next version from Header/ABCI
- 03-08-2018: Updates from discussion with Jae:
  - ProtocolVersion contains Block/AppVersion, not Current/Next
  - signal upgrades to Tendermint using EndBlock fields
  - don't restrict peer compatibility by version to simplify syncing old nodes
- 28-07-2018: Updates from review
  - split into two ADRs - one for protocol, one for chains
  - include signalling for upgrades in header
- 16-07-2018: Initial draft - was originally joint ADR for protocol and chain
  versions

## Context

Here we focus on software-agnostic protocol versions.

The Software Version is covered by SemVer and described elsewhere.
It is not relevant to the protocol description; suffice it to say that if any protocol version
changes, the software version changes, but not necessarily vice versa.

Software version should be included in NodeInfo for convenience/diagnostics.

We are also interested in versioning across different blockchains in a
meaningful way, for instance to differentiate branches of a contentious
hard-fork. We leave that for a later ADR.

## Requirements

We need to version components of the blockchain that may be independently upgraded.
We need to do it in a way that is scalable and maintainable - we can't just litter
the code with conditionals.

We can consider the complete version of the protocol to contain the following sub-versions:
BlockVersion, P2PVersion, AppVersion. Each sub-version groups components of the software
that are likely to evolve together; the groups themselves evolve at different rates and
in different ways, as described below.

The BlockVersion defines the core of the blockchain data structures and
should change infrequently.

The P2PVersion defines how peers connect and communicate with each other - it's
not part of the blockchain data structures, but defines the protocols used to build the
blockchain. It may change gradually.

The AppVersion determines how we compute app specific information, like the
AppHash and the Results.

All of these versions may change over the life of a blockchain, and we need to
be able to help new nodes sync up across version changes. This means we must be willing
to connect to peers running older versions.

### BlockVersion

- All tendermint hashed data-structures (headers, votes, txs, responses, etc.).
  - Note the semantic meaning of a transaction may change according to the AppVersion, but the way txs are merklized into the header is part of the BlockVersion
- It should be the least likely to change, and should change the least frequently.
  - Tendermint should be stabilizing - it's just Atomic Broadcast.
  - We can start considering changes for Tendermint v2.0 in a year
- It's easy to determine the version of a block from its serialized form

### P2PVersion

- All p2p and reactor messaging (messages, detectable behavior)
- Will change gradually as reactors evolve to improve performance and support new features - e.g. proposed new message types BatchTx in the mempool and HasBlockPart in the consensus
- It's easy to determine the version of a peer from its first serialized message(s)
- New versions must be compatible with at least one old version to allow gradual upgrades

### AppVersion

- The ABCI state machine (txs, begin/endblock behavior, commit hashing)
- Behaviour and message types will change abruptly in the course of the life of a chain
- Need to minimize complexity of the code for supporting different AppVersions at different heights
- Ideally, each version of the software supports only a _single_ AppVersion at one time
  - this means we check out different versions of the software at different heights instead of littering the code
    with conditionals
  - minimize the number of data migrations required across AppVersions (i.e. most AppVersions should be able to read the same state from disk as the previous AppVersion).
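
One way to picture "a single AppVersion per software version" is a schedule that maps heights to the AppVersion in force, which an operator or external tool could consult to decide which release to run. The sketch below is only an illustration: the `Upgrade` type, the `appVersionAt` helper, and the example heights are hypothetical and not part of Tendermint.

```go
package main

import (
	"fmt"
	"sort"
)

// Upgrade records the height at which a new AppVersion takes effect.
// Hypothetical type, used only for this illustration.
type Upgrade struct {
	Height     int64
	AppVersion uint64
}

// appVersionAt returns the AppVersion in force at the given height.
// The schedule must be sorted by ascending Height. The second return
// value is false if the height precedes the first entry.
func appVersionAt(schedule []Upgrade, height int64) (uint64, bool) {
	// Find the first upgrade strictly above height; the one before it applies.
	i := sort.Search(len(schedule), func(i int) bool {
		return schedule[i].Height > height
	})
	if i == 0 {
		return 0, false
	}
	return schedule[i-1].AppVersion, true
}

func main() {
	// Example heights are made up.
	schedule := []Upgrade{
		{Height: 1, AppVersion: 1},
		{Height: 5000, AppVersion: 2},
	}
	for _, h := range []int64{1, 4999, 5000} {
		v, ok := appVersionAt(schedule, h)
		fmt.Println(h, v, ok)
	}
}
```

Each release then only needs to know its own AppVersion; the knowledge of which version applies at which height lives outside the state machine code.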

## Ideal

Each component of the software is independently versioned in a modular way and it's easy to mix and match and upgrade.

## Proposal

Each of BlockVersion, AppVersion, and P2PVersion is a monotonically increasing uint64.

To use these versions, we need to update the block Header, the p2p NodeInfo, and the ABCI.

### Header

Block Header should include a `Version` struct as its first field like:

```go
type Version struct {
    Block uint64 // rules for the current block
    App   uint64 // app version that processed the last block
}
```

Here, `Version.Block` defines the rules for the current block, while
`Version.App` defines the app version that processed the last block and computed
the `AppHash` in the current block. Together they provide a complete description
of the consensus-critical protocol.

Since we have settled on a proto3 header, reading the BlockVersion out of the serialized header is unambiguous.

Using a Version struct gives us more flexibility to add fields without breaking
the header.

The ProtocolVersion struct includes both the Block and App versions - it should
serve as a complete description of the consensus-critical protocol.
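
To make the point about reading the BlockVersion out of a serialized proto3 header concrete, here is a minimal sketch that decodes it by hand, without a generated protobuf type. It assumes that `Version` is field 1 of the header and that `Block` and `App` are fields 1 and 2 of `Version`; these field numbers, and the helper names `encodeHeaderPrefix` and `peekBlockVersion`, are assumptions made for illustration. It also assumes the encoder writes fields in field-number order, which common implementations do but the wire format does not strictly require.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Version mirrors the struct proposed for the block Header.
type Version struct {
	Block uint64
	App   uint64
}

// appendUvarint appends x as a base-128 varint, the encoding proto3 uses.
func appendUvarint(b []byte, x uint64) []byte {
	var buf [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(buf[:], x)
	return append(b, buf[:n]...)
}

// encodeHeaderPrefix hand-encodes only the leading Version field of a header.
// Assumed field numbers: Header.Version = 1, Version.Block = 1, Version.App = 2.
func encodeHeaderPrefix(v Version) []byte {
	var ver []byte
	if v.Block != 0 { // proto3 omits zero values
		ver = appendUvarint(append(ver, 0x08), v.Block) // field 1, varint
	}
	if v.App != 0 {
		ver = appendUvarint(append(ver, 0x10), v.App) // field 2, varint
	}
	out := []byte{0x0a} // field 1, length-delimited
	out = appendUvarint(out, uint64(len(ver)))
	return append(out, ver...)
}

// peekBlockVersion reads Version.Block from the front of a serialized header
// without decoding anything that follows it.
func peekBlockVersion(header []byte) (uint64, error) {
	if len(header) == 0 || header[0] != 0x0a {
		return 0, errors.New("header does not start with a Version field")
	}
	size, n := binary.Uvarint(header[1:])
	if n <= 0 || uint64(len(header)-1-n) < size {
		return 0, errors.New("truncated Version field")
	}
	ver := header[1+n : 1+n+int(size)]
	if len(ver) == 0 || ver[0] != 0x08 {
		return 0, nil // Block omitted, which means zero in proto3
	}
	block, m := binary.Uvarint(ver[1:])
	if m <= 0 {
		return 0, errors.New("malformed Block varint")
	}
	return block, nil
}

func main() {
	header := encodeHeaderPrefix(Version{Block: 7, App: 3})
	header = append(header, 0x12, 0x00) // pretend more header fields follow
	block, err := peekBlockVersion(header)
	fmt.Println(block, err)
}
```

Because the version sits at the very front, a node can decide how to interpret the rest of the header before attempting to decode it.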

### NodeInfo

NodeInfo should include a Version struct as its first field like:

```go
type Version struct {
    P2P   uint64
    Block uint64
    App   uint64

    Other []string // free-form software versions, for convenience only
}
```

Note this effectively makes `Version.P2P` the first field in the NodeInfo, so it
should be easy to read this out of the serialized NodeInfo if need be to facilitate an upgrade.

The `Version.Other` here should include additional information like the name of the software client and
its SemVer version - this is for convenience only, e.g.
`tendermint-core/v0.22.8`. It's a `[]string` so it can include information about
the version of Tendermint, of the app, of Tendermint libraries, etc.

### ABCI

Since the ABCI is responsible for keeping Tendermint and the App in sync, we
need to communicate version information through it.

On startup, we use Info to perform a basic handshake. It should include all the
version information.

We also need to be able to update versions in the life of a blockchain. The
natural place to do this is EndBlock.

Note that currently the result of the Handshake isn't exposed anywhere, as the
handshaking happens inside the `proxy.AppConns` abstraction. We will need to
remove the handshaking from the `proxy` package so we can call it independently
and get the result, which should contain the application version.

#### Info

RequestInfo should add support for protocol versions like:

```
message RequestInfo {
  string version
  uint64 block_version
  uint64 p2p_version
}
```

Similarly, ResponseInfo should return the versions:

```
message ResponseInfo {
  string data

  string version
  uint64 app_version

  int64 last_block_height
  bytes last_block_app_hash
}
```

The existing `version` fields should be called `software_version` but we leave
them for now to reduce the number of breaking changes.
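
A handshake that can be called independently of the app connections and that returns its result might look roughly like the sketch below. The Go structs simply mirror the two messages above; the `InfoClient` interface, the `handshake` function, and `fakeApp` are hypothetical names used for illustration and are not the actual `proxy` package API.

```go
package main

import "fmt"

// RequestInfo and ResponseInfo mirror the messages proposed above.
type RequestInfo struct {
	Version      string
	BlockVersion uint64
	P2PVersion   uint64
}

type ResponseInfo struct {
	Data             string
	Version          string
	AppVersion       uint64
	LastBlockHeight  int64
	LastBlockAppHash []byte
}

// InfoClient is a hypothetical, minimal view of the ABCI query connection.
type InfoClient interface {
	Info(RequestInfo) (ResponseInfo, error)
}

// handshake sends our versions to the app and hands the full response back
// to the caller, so the app version is visible outside the connection setup.
func handshake(c InfoClient, req RequestInfo) (ResponseInfo, error) {
	res, err := c.Info(req)
	if err != nil {
		return ResponseInfo{}, fmt.Errorf("abci info: %w", err)
	}
	return res, nil
}

// fakeApp is a stand-in application used only for this example.
type fakeApp struct{ appVersion uint64 }

func (a fakeApp) Info(req RequestInfo) (ResponseInfo, error) {
	return ResponseInfo{Version: "fake-app/v0.1.0", AppVersion: a.appVersion}, nil
}

func main() {
	req := RequestInfo{Version: "tendermint-core/v0.22.8", BlockVersion: 1, P2PVersion: 1}
	res, err := handshake(fakeApp{appVersion: 4}, req)
	if err != nil {
		fmt.Println("handshake failed:", err)
		return
	}
	fmt.Println("app version:", res.AppVersion)
}
```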

#### EndBlock

Updating the version could be done either with new fields or by using the
existing `tags`. Since we're trying to communicate information that will be
included in Tendermint block Headers, it should be native to the ABCI, and not
something embedded through some scheme in the tags. Thus, version updates should
be communicated through native EndBlock fields.

EndBlock already contains `ConsensusParams`. We can add version information to
the ConsensusParams as well:

```
message ConsensusParams {
  BlockSize block_size
  EvidenceParams evidence_params
  VersionParams version
}

message VersionParams {
  uint64 block_version
  uint64 app_version
}
```

For now, the `block_version` will be ignored, as we do not allow block version
to be updated live. If the `app_version` is set, it signals that the app's
protocol version has changed, and the new `app_version` will be included in the
`Block.Header.Version.App` for the next block.
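
The update rule just described can be sketched as a small pure function. This is an illustration only: `nextHeaderVersion` is a hypothetical helper, and treating a zero `app_version` as "not set" is an assumption (proto3 cannot distinguish an unset scalar from zero).

```go
package main

import "fmt"

// VersionParams mirrors the message proposed above.
type VersionParams struct {
	BlockVersion uint64
	AppVersion   uint64
}

// Version is the header Version struct.
type Version struct {
	Block uint64
	App   uint64
}

// nextHeaderVersion computes the Version for the next block's header from the
// current one and the optional update returned by EndBlock.
func nextHeaderVersion(current Version, update *VersionParams) Version {
	next := current
	if update == nil {
		return next
	}
	// update.BlockVersion is deliberately ignored: the block version
	// cannot be changed live for now.
	if update.AppVersion != 0 { // assumption: zero means "not set"
		next.App = update.AppVersion
	}
	return next
}

func main() {
	current := Version{Block: 1, App: 1}
	next := nextHeaderVersion(current, &VersionParams{BlockVersion: 9, AppVersion: 2})
	fmt.Println(next.Block, next.App)
}
```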

### BlockVersion

BlockVersion is included in both the Header and the NodeInfo.

Changing BlockVersion should happen quite infrequently and ideally only for
critical upgrades. For now, it is not encoded in ABCI, though it's always
possible to use tags to signal an external process to co-ordinate an upgrade.

Note Ethereum has not had to make an upgrade like this (everything has been at state machine level, AFAIK).

### P2PVersion

P2PVersion is not included in the block Header, just the NodeInfo.

P2PVersion is the first field in the NodeInfo. NodeInfo is also proto3 so this is easy to read out.

Note we need the peer/reactor protocols to take the versions of peers into account when sending messages:

- don't send messages they don't understand
- don't send messages they don't expect

Doing this will be specific to the upgrades being made.

Note we also include the list of reactor channels in the NodeInfo and already don't send messages for channels the peer doesn't understand.
If upgrades always use new channels, this reduces the development cost of backwards compatibility.
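
As a rough illustration of taking a peer's version and channels into account before sending, a reactor could apply a check like the one below. Everything here is hypothetical: the `Peer` and `Message` types, the `MinP2PVersion` field, and the channel ID and version numbers are made up for the example, and the real rules would depend on the specific upgrade.

```go
package main

import "fmt"

// Peer holds the parts of a peer's NodeInfo that matter for this check.
type Peer struct {
	P2PVersion uint64
	Channels   []byte // reactor channels the peer advertises
}

// Message describes an outbound message. MinP2PVersion is a hypothetical
// field: the first P2PVersion whose peers understand this message type.
type Message struct {
	Name          string
	Channel       byte
	MinP2PVersion uint64
}

// shouldSend reports whether the peer both understands the message type
// (by version) and listens on the channel it would be sent on.
func shouldSend(p Peer, m Message) bool {
	if p.P2PVersion < m.MinP2PVersion {
		return false
	}
	for _, ch := range p.Channels {
		if ch == m.Channel {
			return true
		}
	}
	return false
}

func main() {
	// Channel ID and version numbers are illustrative.
	batchTx := Message{Name: "BatchTx", Channel: 0x30, MinP2PVersion: 2}
	oldPeer := Peer{P2PVersion: 1, Channels: []byte{0x30}}
	newPeer := Peer{P2PVersion: 2, Channels: []byte{0x30}}
	fmt.Println(shouldSend(oldPeer, batchTx), shouldSend(newPeer, batchTx))
}
```

If an upgrade only introduces messages on a new channel, the version comparison becomes unnecessary and the existing channel check is enough.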

Note NodeInfo is only exchanged after the authenticated encryption handshake to ensure that it's private.
Doing any version exchange before encrypting could be considered information leakage, though I'm not sure
how much that matters compared to being able to upgrade the protocol.

XXX: if needed, can we change the meaning of the first byte of the first message to encode a handshake version?
This is the first byte of a 32-byte ed25519 pubkey.

### AppVersion

AppVersion is also included in the block Header and the NodeInfo.

AppVersion essentially defines how the AppHash and LastResults are computed.

### Peer Compatibility

Restricting peer compatibility based on version is complicated by the need to
help old peers, possibly on older versions, sync the blockchain.

We might be tempted to say that we only connect to peers with the same
AppVersion and BlockVersion (since these define the consensus critical
computations), and a select list of P2PVersions (i.e. those compatible with
ours), but then we'd need to make accommodations for connecting to peers with the
right Block/AppVersion for the height they're on.

For now, we will connect to peers with any version and restrict compatibility
solely based on the ChainID. We leave more restrictive rules on peer
compatibility to a future proposal.
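
Under this rule the compatibility check ignores the versions entirely. A minimal sketch, with a simplified `NodeInfo` that carries only the fields discussed in this ADR (the `compatible` helper and the exact field layout are assumptions):

```go
package main

import "fmt"

// Version is the NodeInfo Version struct proposed above.
type Version struct {
	P2P   uint64
	Block uint64
	App   uint64
	Other []string
}

// NodeInfo is a simplified stand-in with only the fields relevant here.
type NodeInfo struct {
	Version Version
	ChainID string
}

// compatible applies the current rule: any version is accepted and only the
// ChainID has to match.
func compatible(ours, theirs NodeInfo) bool {
	return ours.ChainID == theirs.ChainID
}

func main() {
	ours := NodeInfo{Version: Version{P2P: 2, Block: 1, App: 3}, ChainID: "test-chain"}
	oldPeer := NodeInfo{Version: Version{P2P: 1, Block: 1, App: 1}, ChainID: "test-chain"}
	fmt.Println(compatible(ours, oldPeer))
}
```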

### Future Changes

It may be valuable to support an `/unsafe_stop?height=_` endpoint to tell Tendermint to shut down at a given height.
This could be used by an external manager process that oversees upgrades by
checking out and installing new software versions and restarting the process. It
would subscribe to the relevant upgrade event (needs to be implemented) and call `/unsafe_stop` at
the correct height (of course only after getting approval from its user!).
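
Since the endpoint is only a suggestion, the following is purely a sketch of how a manager process might build the request once the user has approved an upgrade. The `stopURL` helper, the base address, and the approval flag are all hypothetical, and no request is actually sent.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strconv"
)

// stopURL builds the URL for the proposed /unsafe_stop endpoint.
// It refuses to do so unless the user has approved the upgrade.
func stopURL(base string, height int64, approved bool) (string, error) {
	if !approved {
		return "", errors.New("upgrade not approved by user")
	}
	if height <= 0 {
		return "", errors.New("height must be positive")
	}
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/unsafe_stop"
	q := url.Values{}
	q.Set("height", strconv.FormatInt(height, 10))
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// The RPC address is an example value.
	s, err := stopURL("http://localhost:26657", 5000, true)
	fmt.Println(s, err)
}
```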

## Consequences

### Positive

- Make tendermint and application versions native to the ABCI to more clearly
  communicate about them
- Distinguish clearly between protocol versions and software version to
  facilitate implementations in other languages
- Versions are included in key data structures in an easy-to-discern way
- Allows proposers to signal for upgrades and apps to decide when to actually change the
  version (and start signalling for a new version)

### Neutral

- Unclear how to version the initial P2P handshake itself
- Versions aren't being used (yet) to restrict peer compatibility
- Signalling for a new version happens through the proposer and must be
  tallied/tracked in the app.

### Negative

- Adds more fields to the ABCI
- Implies that a single codebase must be able to handle multiple versions