     1  # Draft of Light Client Supervisor for discussion
     2  
     3  ## TODOs
     4  
     5  This specification is being written in parallel with updates to the
     6  verification specification, so some hyperlinks still have to be
     7  pointed at the correct files.
     8  
     9  # Light Client Sequential Supervisor
    10  <!-- markdown-link-check-disable -->
    11  The light client implements a read operation of a
    12  [header](CMBC-HEADER-link) from the [blockchain](CMBC-SEQ-link), by
    13  communicating with full nodes, a so-called primary and several
    14  so-called witnesses. As some full nodes may be faulty, this
    15  functionality must be implemented in a fault-tolerant way.
    16  
    17  In a Cosmos blockchain, the validator set may change with every
    18  new block.  The staking and unbonding mechanism induces a [security
    19  model](CMBC-FM-2THIRDS-link): starting at time *Time* of the
    20  [header](CMBC-HEADER-link),
    21  more than two-thirds of the next validators of a new block are correct
    22  for the duration of *TrustedPeriod*.
    23  
    24  [Light Client Verification](https://informal.systems) implements the fault-tolerant read
    25  operation designed for this security model. That is, it is safe if the
    26  model assumptions are satisfied and makes progress if it communicates
    27  with a correct primary.
    28  
    29  However, if the [security model](CMBC-FM-2THIRDS-link) is violated,
    30  faulty peers (that have been validators at some point in the past) may
    31  launch attacks on the Cosmos network, and on the light
    32  client. These attacks as well as an axiomatization of blocks in
    33  general are defined in [a document that contains the definitions that
    34  are currently in detection.md](https://informal.systems).
    35  
    36  If there is a light client attack (but no
    37  successful attack on the network), the safety of the verification step
    38  may be violated (as we operate outside its basic assumption).
    39  The light client also
    40  contains a defense mechanism against light client attacks, called detection.
    41  
    42  [Light Client Detection](https://informal.systems) implements a cross check of the result
    43  of the verification step. If there is a light client attack, and the
    44  light client is connected to a correct peer, the light client as a
    45  whole is safe, that is, it will not operate on invalid
    46  blocks. In this case, however, it cannot successfully read, as
    47  inconsistent blocks are in the system. Instead, the
    48  detection performs a distributed computation that results in so-called
    49  evidence. Evidence can be used to prove
    50  to a correct full node that there has been a
    51  light client attack.
    52  
    53  [Light Client Evidence Accountability](https://informal.systems) is a protocol run on a
    54  full node to check whether submitted evidence indeed proves the
    55  existence of a light client attack. Further, from the evidence and its
    56  own knowledge about the blockchain, the full node computes the set of
    57  bonded full nodes that participated in the attack (nodes that at some
    58  point held more than one third of the voting power). This set is
    59  reported via ABCI to the application.
    60  
    61  In this document we specify
    62  
    63  - Initialization of the Light Client
    64  - The interaction of [verification](https://informal.systems) and [detection](https://informal.systems)
    65  
    66  The details of these two protocols are captured in their own
    67  documents, as is the [accountability](https://informal.systems) protocol.
    68  
    69  > Another related line is IBC attack detection and submission at the
    70  > relayer, as well as attack verification at the IBC handler. This
    71  > will call for yet another spec.
    72  
    73  # Status
    74  
    75  This document is work in progress. In order to develop the
    76  specification step-by-step,
    77  it assumes certain details of [verification](https://informal.systems) and
    78  [detection](https://informal.systems) that are not specified in the respective current
    79  versions yet. These inconsistencies will be addressed over several
    80  upcoming PRs.
    81  
    82  # Part I - Cosmos Blockchain
    83  
    84  See [verification spec](addLinksWhenDone)
    85  
    86  # Part II - Sequential Problem Definition
    87  
    88  #### **[LC-SEQ-INIT-LIVE.1]**
    89  
    90  Upon initialization, the light client gets as input a header of the
    91  blockchain, or the genesis file of the blockchain, and eventually
    92  stores a header of the blockchain.
    93  
    94  #### **[LC-SEQ-LIVE.1]**
    95  
    96  The light client gets a sequence of heights as inputs. For each input
    97  height *targetHeight*, it eventually stores the header of height
    98  *targetHeight*.
    99  
   100  #### **[LC-SEQ-SAFE.1]**
   101  
   102  The light client never stores a header which is not in the blockchain.
   103  
   104  # Part III - Light Client as Distributed System
   105  
   106  ## Computational Model
   107  
   108  The light client communicates with remote processes only via the
   109  [verification](TODO) and the [detection](TODO) protocols. The
   110  respective assumptions are given there.
   111  
   112  ## Distributed Problem Statement
   113  
   114  ### Two Kinds of Liveness
   115  
   116  In case of light client attacks, the sequential problem statement
   117  cannot always be satisfied. The lightclient cannot decide which block
   118  is from the chain and which is not. As a result, the light client just
   119  creates evidence, submits it, and terminates.
   120  For the liveness property, we thus add the
   121  possibility that instead of adding a lightblock, we also might terminate
   122  in case there is an attack.
   123  
   124  #### **[LC-DIST-TERM.1]**
   125  
   126  The light client either runs forever or it *terminates on attack*.
   127  
   128  ### Design choices
   129  
   130  #### [LC-DIST-STORE.1]
   131  
   132  The light client has a local data structure called LightStore
   133  that contains light blocks (that contain a header).
   134  
   135  > The light store exposes functions to query and update it. They are
   136  > specified [here](TODO:onceVerificationIsMerged).
   137  
   138  **TODO:** reference light store invariant [LCV-INV-LS-ROOT.2] once
   139  verification is merged
   140  
   141  #### **[LC-DIST-SAFE.1]**
   142  
   143  It is always the case that every header in *LightStore* was
   144  generated by an instance of Tendermint consensus.
   145  
   146  #### **[LC-DIST-LIVE.1]**
   147  
   148  Whenever the light client gets a new height *h* as input,
   149  
   150  - and there is
   151  no light client attack up to height *h*, then the lightclient
   152  eventually puts the lightblock of height *h* in the lightstore and
   153  waits for another input.
   154  - otherwise, that is, if there
   155  is a light client attack on height *h*, then the light client
   156  must perform one of the following:
   157      - it terminates on attack.
   158      - it eventually puts the lightblock of height *h* in the lightstore and
   159  waits for another input.
   160  
   161  > Observe that the "existence of a lightclient attack" just means that some node has generated a conflicting block. It does not necessarily mean that a (faulty) peer sends such a block to "our" lightclient. Thus, even if there is an attack somewhere in the system, our lightclient might still continue to operate normally.
   162  
   163  ### Solving the sequential specification
   164  
   165  [LC-DIST-SAFE.1] is guaranteed by the detector; in particular, it
   166  follows from
   167  [[LCD-DIST-INV-STORE.1]](TODO) and
   168  [[LCD-DIST-LIVE.1]](TODO).
   169  
   170  # Part IV - Light Client Supervisor Protocol
   171  
   172  We provide a specification for a sequential Light Client Supervisor.
   173  The local code for verification is presented by a sequential function
   174  `Sequential-Supervisor` to highlight the control flow of this
   175  functionality. Each lightblock is first verified with a primary, and then
   176  cross-checked with secondaries, and if all goes well, the lightblock
   177  is
   178  added (with the attribute "trusted") to the
   179  lightstore. Intermediate lightblocks that were used to verify the target
   180  block but were not cross-checked are stored as "verified".
   181  
   182  > We note that if a different concurrency model is considered
   183  > for an implementation, the semantics of the lightstore might change:
   184  > In a concurrent implementation, we might do verification for some
   185  > height *h*, add the
   186  > lightblock to the lightstore, and start concurrent threads that
   187  >
   188  > - do verification for the next height *h' != h*
   189  > - do cross-checking for height *h*. If we find an attack, we remove
   190  >   *h* from the lightstore.
   191  > - the user might already start to use *h*
   192  >
   193  > Thus, this concurrency model changes the semantics of the
   194  > lightstore (not all lightblocks that are read by the user are
   195  > trusted; they may be removed if
   196  > we find a problem). Whether this is desirable, and whether the gain in
   197  > performance is worth it, we keep for future versions/discussion of
   198  > lightclient protocols.
   199  
   200  ## Definitions
   201  
   202  ### Peers
   203  
   204  #### **[LC-DATA-PEERS.1]:**
   205  
   206  A fixed set of full nodes is provided in the configuration upon
   207  initialization. Initially this set is partitioned into
   208  
   209  - one full node that is the *primary* (singleton set),
   210  - a set *Secondaries* (of fixed size, e.g., 3),
   211  - a set *FullNodes*; it excludes the *primary* and the *Secondaries* nodes,
   212  - a set *FaultyNodes* of nodes that the light client suspects of
   213      being faulty; it is initially empty.
   214  
   215  #### **[LC-INV-NODES.1]:**
   216  
   217  The detector shall maintain the following invariants:
   218  
   219  - *FullNodes \intersect Secondaries = {}*
   220  - *FullNodes \intersect FaultyNodes = {}*
   221  - *Secondaries \intersect FaultyNodes = {}*
   222  
   223  and the following transition invariant
   224  
   225  - *FullNodes' \union Secondaries' \union FaultyNodes' = FullNodes
   226     \union Secondaries \union FaultyNodes*
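
The invariants above can be checked mechanically. The following is a minimal sketch, assuming peers are identified by plain string addresses; the `Set` and `PeerSets` types and the functions `CheckInvariants` and `SameUnion` are hypothetical names introduced for illustration and are not part of this specification.

```go
package main

import (
	"errors"
	"fmt"
)

// Set is a set of peer addresses (hypothetical representation).
type Set map[string]struct{}

// NewSet builds a Set from the given addresses.
func NewSet(addrs ...string) Set {
	s := Set{}
	for _, a := range addrs {
		s[a] = struct{}{}
	}
	return s
}

// PeerSets is a hypothetical container for the partition in [LC-DATA-PEERS.1].
type PeerSets struct {
	Primary     string
	Secondaries Set
	FullNodes   Set
	FaultyNodes Set
}

// disjoint reports whether a and b have no element in common.
func disjoint(a, b Set) bool {
	for k := range a {
		if _, ok := b[k]; ok {
			return false
		}
	}
	return true
}

// CheckInvariants returns an error naming the first violated
// disjointness invariant of [LC-INV-NODES.1], or nil if all three hold.
func (p PeerSets) CheckInvariants() error {
	switch {
	case !disjoint(p.FullNodes, p.Secondaries):
		return errors.New("FullNodes and Secondaries overlap")
	case !disjoint(p.FullNodes, p.FaultyNodes):
		return errors.New("FullNodes and FaultyNodes overlap")
	case !disjoint(p.Secondaries, p.FaultyNodes):
		return errors.New("Secondaries and FaultyNodes overlap")
	}
	return nil
}

// union returns the union of FullNodes, Secondaries, and FaultyNodes.
func (p PeerSets) union() Set {
	u := Set{}
	for _, s := range []Set{p.FullNodes, p.Secondaries, p.FaultyNodes} {
		for k := range s {
			u[k] = struct{}{}
		}
	}
	return u
}

// SameUnion checks the transition invariant between two states.
func SameUnion(before, after PeerSets) bool {
	b, a := before.union(), after.union()
	if len(a) != len(b) {
		return false
	}
	for k := range b {
		if _, ok := a[k]; !ok {
			return false
		}
	}
	return true
}

func main() {
	before := PeerSets{Primary: "p", Secondaries: NewSet("s1", "s2"), FullNodes: NewSet("f1"), FaultyNodes: NewSet()}
	// Replacing secondary s1 by full node f1 keeps the union unchanged.
	after := PeerSets{Primary: "p", Secondaries: NewSet("f1", "s2"), FullNodes: NewSet(), FaultyNodes: NewSet("s1")}
	fmt.Println(before.CheckInvariants(), after.CheckInvariants(), SameUnion(before, after))
}
```

The transition check is implemented literally as stated, that is, over the three sets only and without the primary.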
   227  
   228  #### **[LC-FUNC-REPLACE-PRIMARY.1]:**
   229  
   230  ```go
   231  Replace_Primary(root-of-trust LightBlock)
   232  ```
   233  
   234  - Implementation remark
   235      - the primary is replaced by a secondary
   236      - to maintain a constant size of secondaries, need to
   237          - pick a new secondary *nsec* while ensuring [LC-INV-ROOT-AGREED.1]
   238          - that is, we need to ensure that root-of-trust = FetchLightBlock(nsec, root-of-trust.Header.Height)
   239  - Expected precondition
   240      - *FullNodes* is nonempty
   241  - Expected postcondition
   242      - *primary* is moved to *FaultyNodes*
   243      - a secondary *s* is moved from *Secondaries* to primary
   244  - Error condition
   245      - if precondition is violated
   246  
   247  #### **[LC-FUNC-REPLACE-SECONDARY.1]:**
   248  
   249  ```go
   250  Replace_Secondary(addr Address, root-of-trust LightBlock)
   251  ```
   252  
   253  - Implementation remark
   254      - maintain [LC-INV-ROOT-AGREED.1], that is,
   255      ensure root-of-trust = FetchLightBlock(nsec, root-of-trust.Header.Height)
   256  - Expected precondition
   257      - *FullNodes* is nonempty
   258  - Expected postcondition
   259      - addr is moved from *Secondaries* to *FaultyNodes*
   260      - an address *nsec* is moved from *FullNodes* to *Secondaries*
   261  - Error condition
   262      - if precondition is violated
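
The two replacement functions can be sketched as follows. This is an illustration under stated assumptions, not the specified implementation: peers are string addresses, the `Peers` type and the error values are hypothetical, and the callback `agrees` stands in for the check `root-of-trust = FetchLightBlock(nsec, root-of-trust.Header.Height)`. The sketch additionally assumes that a full node failing this check simply stays in *FullNodes*, and that `ReplacePrimary` fails if there is no secondary to promote; the text above does not fix either point.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrNoFullNodes      = errors.New("FullNodes is empty")
	ErrNoAgreeingNode   = errors.New("no full node agrees on the root of trust")
	ErrUnknownSecondary = errors.New("address is not a secondary")
	ErrNoSecondaries    = errors.New("no secondary to promote")
)

// Peers is a hypothetical, slice-based view of the peer partition.
type Peers struct {
	Primary     string
	Secondaries []string
	FullNodes   []string
	FaultyNodes []string
}

// pickSecondary removes and returns the first full node accepted by agrees.
func (p *Peers) pickSecondary(agrees func(addr string) bool) (string, error) {
	if len(p.FullNodes) == 0 {
		return "", ErrNoFullNodes // precondition violated
	}
	for i, addr := range p.FullNodes {
		if agrees(addr) {
			p.FullNodes = append(p.FullNodes[:i], p.FullNodes[i+1:]...)
			return addr, nil
		}
	}
	return "", ErrNoAgreeingNode
}

// ReplaceSecondary moves addr to FaultyNodes and fills its slot from FullNodes.
func (p *Peers) ReplaceSecondary(addr string, agrees func(string) bool) error {
	idx := -1
	for i, s := range p.Secondaries {
		if s == addr {
			idx = i
			break
		}
	}
	if idx < 0 {
		return ErrUnknownSecondary
	}
	nsec, err := p.pickSecondary(agrees)
	if err != nil {
		return err
	}
	p.Secondaries[idx] = nsec
	p.FaultyNodes = append(p.FaultyNodes, addr)
	return nil
}

// ReplacePrimary moves the primary to FaultyNodes, promotes the first
// secondary, and refills Secondaries so that its size stays constant.
func (p *Peers) ReplacePrimary(agrees func(string) bool) error {
	if len(p.Secondaries) == 0 {
		return ErrNoSecondaries
	}
	nsec, err := p.pickSecondary(agrees)
	if err != nil {
		return err
	}
	p.FaultyNodes = append(p.FaultyNodes, p.Primary)
	p.Primary = p.Secondaries[0]
	p.Secondaries = append(p.Secondaries[1:], nsec)
	return nil
}

func main() {
	p := &Peers{Primary: "p0", Secondaries: []string{"s1", "s2"}, FullNodes: []string{"f1", "f2"}}
	always := func(string) bool { return true }
	fmt.Println(p.ReplacePrimary(always), p.Primary, p.Secondaries, p.FullNodes, p.FaultyNodes)
}
```

Both functions leave the peer sets untouched when they return an error, so a caller can terminate cleanly if no replacement is possible.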
   263  
   264  ### Data Types
   265  
   266  The core data structure of the protocol is the LightBlock.
   267  
   268  #### **[LC-DATA-LIGHTBLOCK.1]**
   269  
   270  ```go
   271  type LightBlock struct {
   272                  Header          Header
   273                  Commit          Commit
   274                  Validators      ValidatorSet
   275                  NextValidators  ValidatorSet
   276                  Provider        PeerID
   277  }
   278  ```
   279  
   280  #### **[LC-DATA-LIGHTSTORE.1]**
   281  
   282  LightBlocks are kept in a structure that stores all LightBlocks obtained
   283  at initialization or received from peers.
   284  
   285  ```go
   286  type LightStore struct {
   287          ...
   288  }
   289  
   290  ```
   291  
   292  We use the functions that the LightStore exposes, which
   293  are defined in the [verification specification](TODO).
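
To make the later pseudocode easier to follow, here is a minimal in-memory sketch of such a store. It is an assumption-laden illustration: the `LightBlock` is reduced to a height and a state, and the method set (`Add`, `Get`, `LatestPrevious`, `Lowest`) only mirrors the names used in this document; the authoritative definitions are those of the verification specification.

```go
package main

import "fmt"

// State is a simplified verification state of a stored lightblock.
type State int

const (
	StateVerified State = iota
	StateTrusted
)

// LightBlock is reduced to the fields this sketch needs.
type LightBlock struct {
	Height int64
	State  State
}

// LightStore is a hypothetical map-backed store keyed by height.
type LightStore struct {
	blocks map[int64]LightBlock
}

// NewLightStore returns an empty store.
func NewLightStore() *LightStore {
	return &LightStore{blocks: make(map[int64]LightBlock)}
}

// Add inserts or overwrites the lightblock at its height.
func (s *LightStore) Add(b LightBlock) { s.blocks[b.Height] = b }

// Get returns the lightblock of height h, if present.
func (s *LightStore) Get(h int64) (LightBlock, bool) {
	b, ok := s.blocks[h]
	return b, ok
}

// LatestPrevious returns the stored lightblock with the maximum
// height strictly smaller than h, if any.
func (s *LightStore) LatestPrevious(h int64) (LightBlock, bool) {
	var best LightBlock
	found := false
	for height, b := range s.blocks {
		if height < h && (!found || height > best.Height) {
			best, found = b, true
		}
	}
	return best, found
}

// Lowest returns the stored lightblock with the minimum height, if any.
func (s *LightStore) Lowest() (LightBlock, bool) {
	var best LightBlock
	found := false
	for height, b := range s.blocks {
		if !found || height < best.Height {
			best, found = b, true
		}
	}
	return best, found
}

func main() {
	ls := NewLightStore()
	for _, h := range []int64{5, 10, 20} {
		ls.Add(LightBlock{Height: h, State: StateTrusted})
	}
	prev, ok := ls.LatestPrevious(15)
	fmt.Println(prev.Height, ok)
}
```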
   294  
   295  ### Inputs
   296  
   297  The lightclient is initialized with LCInitData.
   298  
   299  #### **[LC-DATA-INIT.1]**
   300  
   301  ```go
   302  type LCInitData struct {
   303      lightBlock     LightBlock
   304      genesisDoc     GenesisDoc
   305  }
   306  ```
   307  
   308  where only one of the two components needs to be provided. `GenesisDoc` is
   309  defined in the [CometBFT
   310  Types](https://github.com/aakash4dev/cometbft/blob/main/types/genesis.go).
   311  
   312  #### **[LC-DATA-GENESIS.1]**
   313  
   314  ```go
   315  type GenesisDoc struct {
   316      GenesisTime     time.Time                `json:"genesis_time"`
   317      ChainID         string                   `json:"chain_id"`
   318      InitialHeight   int64                    `json:"initial_height"`
   319      ConsensusParams *tmproto.ConsensusParams `json:"consensus_params,omitempty"`
   320      Validators      []GenesisValidator       `json:"validators,omitempty"`
   321      AppHash         tmbytes.HexBytes         `json:"app_hash"`
   322      AppState        json.RawMessage          `json:"app_state,omitempty"`
   323  }
   324  ```
   325  
   326  We use the following function
   327  `makeblock` to create a lightblock from the genesis
   328  file, so that verification based on the genesis data can use
   329  the same verification function we use in normal operation.
   330  
   331  #### **[LC-FUNC-MAKEBLOCK.1]**
   332  
   333  ```go
   334  func makeblock (genesisDoc GenesisDoc) (lightBlock LightBlock)
   335  ```
   336  
   337  - Implementation remark
   338      - none
   339  - Expected precondition
   340      - none
   341  - Expected postcondition
   342      - lightBlock.Header.Height =  genesisDoc.InitialHeight
   343      - lightBlock.Header.Time = genesisDoc.GenesisTime
   344      - lightBlock.Header.LastBlockID = nil
   345      - lightBlock.Header.LastCommit = nil
   346      - lightBlock.Header.Validators = genesisDoc.Validators
   347      - lightBlock.Header.NextValidators = genesisDoc.Validators
   348      - lightBlock.Header.Data = nil
   349      - lightBlock.Header.AppState =  genesisDoc.AppState
   350      - lightBlock.Header.LastResult = nil
   351      - lightBlock.Commit = nil
   352      - lightBlock.Validators = genesisDoc.Validators
   353      - lightBlock.NextValidators = genesisDoc.Validators
   354      - lightBlock.Provider = nil
   355  - Error condition
   356      - none
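
A minimal sketch of how these postconditions might be realized is shown below. The types are deliberately simplified stand-ins (a `Validator` with an address and a power, a `Header` with only height and time), so this is not the CometBFT data model; fields that the postcondition sets to nil are represented by Go zero values.

```go
package main

import (
	"fmt"
	"time"
)

// Validator is a simplified stand-in for a genesis validator.
type Validator struct {
	Address string
	Power   int64
}

// GenesisDoc carries only the fields this sketch reads.
type GenesisDoc struct {
	GenesisTime   time.Time
	InitialHeight int64
	Validators    []Validator
}

// Header is reduced to the fields with a non-nil postcondition.
type Header struct {
	Height int64
	Time   time.Time
}

// Commit is an empty placeholder; the genesis lightblock has none.
type Commit struct{}

// LightBlock follows the shape of [LC-DATA-LIGHTBLOCK.1].
type LightBlock struct {
	Header         Header
	Commit         *Commit
	Validators     []Validator
	NextValidators []Validator
	Provider       string
}

// makeBlock builds the lightblock that corresponds to the genesis file.
// The validator slices are copied so that later changes to the genesis
// document do not alter the lightblock.
func makeBlock(g GenesisDoc) LightBlock {
	return LightBlock{
		Header:         Header{Height: g.InitialHeight, Time: g.GenesisTime},
		Commit:         nil, // no commit exists for the genesis block
		Validators:     append([]Validator(nil), g.Validators...),
		NextValidators: append([]Validator(nil), g.Validators...),
		Provider:       "", // the block does not come from a peer
	}
}

func main() {
	g := GenesisDoc{
		GenesisTime:   time.Date(2020, 1, 1, 0, 0, 0, 0, time.UTC),
		InitialHeight: 1,
		Validators:    []Validator{{Address: "val1", Power: 10}},
	}
	lb := makeBlock(g)
	fmt.Println(lb.Header.Height, lb.Header.Time, len(lb.Validators))
}
```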
   357  
   358  ----
   359  
   360  ### Configuration Parameters
   361  
   362  #### **[LC-INV-ROOT-AGREED.1]**
   363  
   364  In the Sequential-Supervisor, it is always the case that the primary
   365  and all secondaries agree on lightStore.Latest().
   366  
   367  ### Assumptions
   368  
   369  We have to assume that the initialization data (the lightblock or the
   370  genesis file) are consistent with the blockchain. This is subjective
   371  initialization and it cannot be checked locally.
   372  
   373  ### Invariants
   374  
   375  #### **[LC-INV-PEERLIST.1]:**
   376  
   377  The peer list contains a primary and a secondary.
   378  
   379  > If the invariant is violated, the light client does not have enough
   380  > peers to download headers from. As a result, the light client
   381  > needs to terminate in case this invariant is violated.
   382  
   383  ## Supervisor
   384  
   385  ### Outline
   386  
   387  The supervisor implements the functionality of the lightclient. It is
   388  initialized with a genesis file or with a lightblock the user
   389  trusts. This initialization is subjective, that is, the security of
   390  the lightclient is based on the validity of the input. If the genesis
   391  file or the lightblock deviate from the actual ones on the blockchain,
   392  the lightclient provides no guarantees.
   393  
   394  After initialization, the supervisor awaits an input, that is, the
   395  height of the next lightblock that should be obtained. Then it
   396  downloads, verifies, and cross-checks a lightblock, and if all tests
   397  go through, the light block (and possibly other lightblocks) are added
   398  to the lightstore, which is returned in an output event to the user.
   399  
   400  The following main loop does the interaction with the user (input,
   401  output) and calls the following two functions:
   402  
   403  - `InitLightClient`: it initializes the lightstore either with the
   404    provided lightblock or with the lightblock that corresponds to the
   405    first block generated by the blockchain (by the validators defined
   406    by the genesis file)
   407  - `VerifyAndDetect`: takes as input a lightstore and a height and
   408    returns the updated lightstore.
   409  
   410  #### **[LC-FUNC-SUPERVISOR.1]:**
   411  
   412  ```go
   413  func Sequential-Supervisor (initData LCInitData) (Error) {
   414  
   415      lightStore,result := InitLightClient(initData);
   416      if result != OK {
   417          return result;
   418      }
   419  
   420      loop {
   421          // get the next height
   422          nextHeight := input();
   423    
   424          lightStore,result = VerifyAndDetect(lightStore, nextHeight);
   425    
   426          if result == OK {
   427              output(lightStore.Get(nextHeight));
   428              // we only output a trusted lightblock
   429          }
   430          else {
   431              return result
   432          }
   433          // QUESTION: is it OK to generate output event in normal case,
   434          // and terminate with failure in the (light client) attack case?
   435      }
   436  }
   437  ```
   438  
   439  - Implementation remark
   440      - infinite loop unless a light client attack is detected
   441      - In typical implementations (e.g., the one in Rust),
   442        there are multiple input actions:
   443        `VerifytoLatest`, `LatestTrusted`, and `GetStatus`. The
   444        information can be easily obtained from the lightstore, so that
   445        we do not treat these requests explicitly here but just consider
   446        the request for a block of a given height which requires more
   447        involved computation and communication.
   448  - Expected precondition
   449      - *LCInitData* contains a genesis file or a lightblock.
   450  - Expected postcondition
   451      - if a light client attack is detected: it stops and submits
   452        evidence (in `InitLightClient` or `VerifyAndDetect`)
   453      - otherwise: none. It runs forever.
   454  - Invariant: *lightStore* contains trusted lightblocks only.
   455  - Error condition
   456      - if `InitLightClient` or `VerifyAndDetect` fails (if an attack is
   457        detected, or if [LCV-INV-TP.1] is violated)
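
The control flow of the supervisor loop can be sketched in Go as follows. This is only an illustration under stated assumptions: inputs and outputs are modeled as channels, the store is a plain map, and `VerifyFunc` is a hypothetical stand-in for `VerifyAndDetect`. Unlike the specification, the sketch also returns when the input channel is closed, which makes it easy to run and test.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrAttack is a hypothetical error value for a detected attack.
var ErrAttack = errors.New("light client attack detected")

// LightBlock is reduced to its height for this sketch.
type LightBlock struct{ Height int64 }

// LightStore is a simplified store keyed by height.
type LightStore map[int64]LightBlock

// VerifyFunc stands in for VerifyAndDetect: on success it must have
// added the lightblock of the target height to the store.
type VerifyFunc func(store LightStore, target int64) error

// Supervise reads heights, verifies each one, and outputs the stored
// lightblock. It stops on the first error (for example an attack) or
// when the input channel is closed.
func Supervise(store LightStore, heights <-chan int64, out chan<- LightBlock, verify VerifyFunc) error {
	for h := range heights {
		if err := verify(store, h); err != nil {
			return err // terminate on attack or other failure
		}
		out <- store[h] // only reached for a verified, cross-checked block
	}
	return nil
}

func main() {
	store := LightStore{1: {Height: 1}}
	heights := make(chan int64, 3)
	out := make(chan LightBlock, 3)
	heights <- 2
	heights <- 3
	heights <- 4
	close(heights)

	// Toy verifier: pretends that an attack is detected at height 4.
	verify := func(s LightStore, h int64) error {
		if h == 4 {
			return ErrAttack
		}
		s[h] = LightBlock{Height: h}
		return nil
	}

	err := Supervise(store, heights, out, verify)
	fmt.Println(len(out), err)
}
```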
   458  
   459  ----
   460  
   461  ### Details of the Functions
   462  
   463  #### Initialization
   464  
   465  The light client is based on subjective initialization. It has to
   466  trust the initial data given to it by the user, as it cannot detect
   467  an attack on this data. If we obtain a lightblock upon initialization,
   468  we just initialize the lightstore with it. If we obtain a
   469  genesis file, we download, verify, and cross-check the first block, and
   470  initialize the lightstore with this first block. The reason is that
   471  we want to maintain [LCV-INV-TP.1] from the beginning.
   472  
   473  > If the lightclient is initialized with a lightblock, one might think
   474  > it may increase trust, when one cross-checks the initial light
   475  > block. However, if a peer provides a conflicting
   476  > lightblock, the question is to distinguish the case of a
   477  > [bogus](https://informal.systems) block (upon which operation should proceed) from a
   478  > [light client attack](https://informal.systems) (upon which operation should stop). In
   479  > case of a bogus block, the lightclient might be forced to do
   480  > backwards verification until the blocks are out of the trusting
   481  > period, to make sure no previous validator set could have generated
   482  > the bogus block, which effectively opens up a DoS attack on the lightclient
   483  > without adding effective robustness.
   484  
   485  #### **[LC-FUNC-INIT.1]:**
   486  
   487  ```go
   488  func InitLightClient (initData LCInitData) (LightStore, Error) {
   489  
   490      if initData.lightBlock != nil {
   491          // we trust the provided initial block.
   492          newBlock := initData.lightBlock
   493      }
   494      else {
   495          genesisBlock := makeblock(initData.genesisDoc);
   496  
   497          result := NoResult;
   498          while result != ResultSuccess {
   499              current = FetchLightBlock(PeerList.primary(), genesisBlock.Header.Height + 1)
   500              // QUESTION: is the height with "+1" OK?
   501  
   502              if CANNOT_VERIFY == ValidAndVerify(genesisBlock, current) {
   503                  Replace_Primary(genesisBlock);
   504              }
   505              else {
   506                  result = ResultSuccess
   507              }
   508          }
   509    
   510          // cross-check
   511          auxLS := new LightStore
   512          auxLS.Add(current)
   513          Evidences := AttackDetector(genesisBlock, auxLS)
   514          if Evidences.Empty {
   515              newBlock := current
   516          }
   517          else {
   518              // [LC-SUMBIT-EVIDENCE.1]
   519              submitEvidence(Evidences);
   520              return(nil, ErrorAttack);
   521          }
   522      }
   523  
   524      lightStore := new LightStore;
   525      lightStore.Add(newBlock);
   526      return (lightStore, OK);
   527  }
   528  
   529  ```
   530  
   531  - Implementation remark
   532      - none
   533  - Expected precondition
   534      - *LCInitData* contains either a genesis file or a lightblock
   535      - if a genesis file is provided, it passes `ValidateAndComplete()`; see [CometBFT](https://informal.systems)
   536  - Expected postcondition
   537      - *lightStore* initialized with trusted lightblock. It has either been
   538        cross-checked (from genesis) or it has initial trust from the
   539        user.
   540  - Error condition
   541      - if precondition is violated
   542      - empty peerList
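
The retry loop in the genesis branch (fetch from the primary, and replace the primary if the block cannot be verified) can be sketched as below. The sketch makes simplifying assumptions: peers are tried in a fixed order from a list, the hypothetical callback `FetchAndVerify` combines `FetchLightBlock` and `ValidAndVerify`, and the error `ErrNoPeers` models the "empty peerList" error condition.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoPeers models the "empty peerList" error condition.
var ErrNoPeers = errors.New("no peers left")

// LightBlock is reduced to the fields this sketch needs.
type LightBlock struct {
	Height   int64
	Provider string
}

// FetchAndVerify is a hypothetical stand-in for FetchLightBlock followed
// by ValidAndVerify; it reports whether the fetched block was verified.
type FetchAndVerify func(peer string, height int64) (LightBlock, bool)

// FirstVerifiable tries the peers in order, treating each one as the
// primary in turn. It returns the first verified lightblock together
// with the peers that failed and would be moved to FaultyNodes.
func FirstVerifiable(peers []string, height int64, try FetchAndVerify) (LightBlock, []string, error) {
	var faulty []string
	for _, peer := range peers {
		if lb, ok := try(peer, height); ok {
			return lb, faulty, nil
		}
		faulty = append(faulty, peer) // corresponds to replacing the primary
	}
	return LightBlock{}, faulty, ErrNoPeers
}

func main() {
	// Toy network: only peer "b" serves a block that verifies.
	try := func(peer string, h int64) (LightBlock, bool) {
		if peer == "b" {
			return LightBlock{Height: h, Provider: peer}, true
		}
		return LightBlock{}, false
	}
	lb, faulty, err := FirstVerifiable([]string{"a", "b", "c"}, 2, try)
	fmt.Println(lb, faulty, err)
}
```

The loop terminates because every failed attempt consumes one peer, which is the property the pseudocode relies on when `Replace_Primary` eventually reports an error.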
   543  
   544  ----
   545  
   546  #### Main verification and detection logic
   547  
   548  #### **[LC-FUNC-MAIN-VERIF-DETECT.1]:**
   549  
   550  ```go
   551  func VerifyAndDetect (lightStore LightStore, targetHeight Height)
   552                       (LightStore, Result) {
   553  
   554      b1, r1 = lightStore.Get(targetHeight)
   555      if r1 == true {
   556          if b1.State == StateTrusted {
   557              // block already there and trusted
   558              return (lightStore, OK)
   559          }
   560          else {
   561              // We have a lightblock in the store, but it has not been 
   562              // cross-checked by now. We do that now.
   563              root_of_trust, auxLS := lightStore.TraceTo(b1);
   564     
   565              // Cross-check
   566              Evidences := AttackDetector(root_of_trust, auxLS);
   567              if Evidences.Empty {
   568                  // no attack detected, we trust the new lightblock
   569                  lightStore.Update(auxLS.Latest(), 
   570                                    StateTrusted, 
   571                                    auxLS.Latest().verification-root);
   572                  return (lightStore, OK);
   573              }
   574              else {
   575                  // there is an attack, we exit
   576                  submitEvidence(Evidences);
   577                  return(lightStore, ErrorAttack);
   578              }
   579          }
   580      }
   581  
   582      // get the lightblock with maximum height smaller than targetHeight
   583      // would typically be the highest, if we always move forward
   584      root_of_trust, r2 = lightStore.LatestPrevious(targetHeight);
   585  
   586      if r2 == false {
   587          // there is no lightblock from which we can do forward
   588          // (skipping) verification. Thus we have to go backwards.
   589          // No cross-check needed. We trust hashes. Therefore, we
   590          // directly return the result
   591          return Backwards(primary, lightStore.Lowest(), targetHeight)
   592      }
   593      else {
   594          // Forward verification + detection
   595          result := NoResult;
   596          while result != ResultSuccess {
   597              verifiedLS,result := VerifyToTarget(primary,
   598                                                  root_of_trust,
   599                                                  targetHeight);
   600              if result == ResultFailure {
   601                  // pick new primary (promote a secondary to primary)
   602                  Replace_Primary(root_of_trust);
   603              }
   604              else if result == ResultExpired {
   605                  return (lightStore, result)
   606              }
   607          }
   608  
   609          // Cross-check
   610          Evidences := AttackDetector(root_of_trust, verifiedLS);
   611          if Evidences.Empty {
   612              // no attack detected, we trust the new lightblock
   613              verifiedLS.Update(verifiedLS.Latest(),
   614                                StateTrusted,
   615                                verifiedLS.Latest().verification-root);
   616              lightStore.store_chain(verifiedLS);
   617              return (lightStore, OK);
   618          }
   619          else {
   620              // there is an attack, we submit the evidence and exit
   621              submitEvidence(Evidences); return(lightStore, ErrorAttack);
   622          }
   623      }
   624  }
   625  ```
   626  
   627  - Implementation remark
   628      - none
   629  - Expected precondition
   630      - none
   631  - Expected postcondition
   632      - lightblock of height *targetHeight* (and possibly additional blocks) added to *lightStore*
   633  - Error condition
   634      - an attack is detected
   635      - [LC-INV-PEERLIST.1] is violated
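
The case distinction at the top of `VerifyAndDetect` can be isolated as a small decision function. The sketch below is illustrative only: the `Action` values, the `Block` type, and the function `Plan` are hypothetical names, and the store is reduced to a map from height to a trusted flag.

```go
package main

import "fmt"

// Action names the branch that VerifyAndDetect would take.
type Action int

const (
	ReturnTrusted    Action = iota // block is stored and trusted
	CrossCheckStored               // block is stored but not yet cross-checked
	VerifyBackwards                // no stored block below the target height
	VerifyForward                  // forward (skipping) verification plus detection
)

// Block records whether a stored lightblock is already trusted.
type Block struct {
	Height  int64
	Trusted bool
}

// Plan selects the branch for the given target height.
func Plan(store map[int64]Block, target int64) Action {
	if b, ok := store[target]; ok {
		if b.Trusted {
			return ReturnTrusted
		}
		return CrossCheckStored
	}
	// Any stored block below the target can serve as root of trust
	// for forward verification.
	for h := range store {
		if h < target {
			return VerifyForward
		}
	}
	return VerifyBackwards
}

func main() {
	store := map[int64]Block{
		5: {Height: 5, Trusted: true},
		8: {Height: 8, Trusted: false},
	}
	for _, target := range []int64{5, 8, 10, 3} {
		fmt.Println(target, Plan(store, target))
	}
}
```

Separating the decision from the verification and detection calls keeps the four branches easy to test without any network interaction.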
   636  
   637  ----