github.com/lazyledger/lazyledger-core@v0.35.0-dev.0.20210613111200-4c651f053571/docs/lazy-adr/adr-004-mvp-light-client.md (about)

     1  # ADR 004: Data Availability Sampling Light Client
     2  
     3  ## Changelog
     4  
     5  - 2021-05-03: Initial Draft
     6  
     7  ## Context
     8  
     9  We decided to augment the existing [RPC-based Tendermint light client](https://github.com/tendermint/tendermint/blob/bc643b19c48495077e0394d3e21e1d2a52c99548/light/doc.go#L2-L126) by adding the possibility to additionally validate blocks by doing Data Availability Sampling (DAS).
    10  In general, DAS gives light clients assurance that the data behind the block header they validated is actually available in the network and hence, that state fraud proofs could be generated.
    11  See [ADR 002](adr-002-ipld-da-sampling.md) for more context on DAS.
    12  
    13  A great introduction on the Tendermint light client (and light clients in general) can be found in this series of [blog posts](https://medium.com/tendermint/everything-you-need-to-know-about-the-tendermint-light-client-f80d03856f98) as well as this [paper](https://arxiv.org/abs/2010.07031).
    14  
    15  This ADR describes the changes necessary to augment the existing Tendermint light client implementation with DAS from a UX as well as from a protocol perspective.
    16  
    17  ## Alternative Approaches
    18  
    19  Ideally, the light client should not just request [signed headers](https://github.com/tendermint/tendermint/blob/bc643b19c48495077e0394d3e21e1d2a52c99548/light/doc.go#L35-L52) from [a few pre-configured peers](https://github.com/tendermint/tendermint/blob/bc643b19c48495077e0394d3e21e1d2a52c99548/light/setup.go#L51-L52) but instead also discover peers from a p2p network.
    20  We will eventually implement this. For more context, we refer to this [issue](https://github.com/lazyledger/lazyledger-core/issues/86).
    21  This would require that the (signed) headers are provided via other means than the RPC.
    22  See this [abandoned pull request](https://github.com/tendermint/tendermint/pull/4508) and [issue](https://github.com/tendermint/tendermint/issues/4456) in the Tendermint repository and also this [suggestion](https://github.com/lazyledger/lazyledger-core/issues/86#issuecomment-831182564) by [@Wondertan](https://github.com/Wondertan) in this repository.
    23  
    24  For some use-cases (like DAS light validator nodes, or the light clients of a Data Availability Layer that are run by full nodes of an Optimistic Rollup) it would even make sense for the light client to (passively) participate in the consensus protocol to some extent, i.e. to run a subset of the consensus reactor so that consensus messages ([Votes](https://github.com/tendermint/tendermint/blob/bc643b19c48495077e0394d3e21e1d2a52c99548/types/vote.go#L48-L59) etc.) come in as early as possible.
    25  Then light clients would not need to wait for the canonical commit to be included in the next [block](https://github.com/tendermint/tendermint/blob/bc643b19c48495077e0394d3e21e1d2a52c99548/types/block.go#L48).
    26  
    27  For the RPC-based light client it could also make sense to add a new RPC endpoint to tendermint for clients to retrieve the [`DataAvailabilityHeader`](https://github.com/lazyledger/lazyledger-core/blob/50f722a510dd2ba8e3d31931c9d83132d6318d4b/types/block.go#L52-L69) (DAHeader), or to embed the DAHeader in an existing response.
    28  The [Commit](https://github.com/lazyledger/lazyledger-core/blob/cbf1f1a4a0472373289a9834b0d33e0918237b7f/rpc/core/routes.go#L25) only contains the [SignedHeader](https://github.com/lazyledger/lazyledger-core/blob/cbf1f1a4a0472373289a9834b0d33e0918237b7f/rpc/core/types/responses.go#L32-L36) (Header and Commit signatures).
    29  Not all light clients will need the full DAHeader though (e.g. super-light-clients do not).
    30  
    31  
    32  ## Decision
    33  
    34  For our MVP, we [decide](https://github.com/lazyledger/lazyledger-core/issues/307) to only modify the existing RPC-endpoint based light client.
    35  This is mostly because we want to ship our MVP as quickly as possible. Independently of this, it makes sense to provide a familiar experience for engineers coming from the Cosmos ecosystem.
    36  
    37  We will implement the above-mentioned variants later.
    38  How exactly will be described in separate ADRs.
    39  
    40  ## Detailed Design
    41  
    42  From a user perspective very little changes:
    43  the existing light client command gets an additional flag that indicates whether to run DAS or not.
    44  Additionally, the light client operator can choose how many successful samples are required to deem the block available (and hence valid).
    45  
    46  In case DAS is enabled, the light client will need to:
    47  1. retrieve the DAHeader corresponding to the data root in the Header
    48  2. request a parameterizable number of random samples.
    49  
    50  If all sampling requests succeed, the whole block is available ([with some high enough probability](https://arxiv.org/abs/1809.09044)).
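
To make "high enough probability" more concrete, the following sketch computes an upper bound on the chance that a light client wrongly accepts an unrecoverable block. It assumes the 2D Reed-Solomon construction from the linked paper, where a k x k square is extended to 2k x 2k shares and at least (k+1)^2 shares must be withheld to prevent reconstruction. It also assumes samples are drawn uniformly with replacement, which is slightly pessimistic. The function name and parameters are illustrative and not part of the code base.

```go
package main

import "math"

// missProbability returns an upper bound on the probability that numSamples
// uniformly random samples (drawn with replacement) all hit available shares
// although the block cannot be reconstructed.
//
// Assumption (2D Reed-Solomon scheme of the linked paper): a k x k square is
// extended to 2k x 2k shares and at least (k+1)^2 of them must be withheld.
func missProbability(k, numSamples int) float64 {
	total := float64(2*k) * float64(2*k)
	withheld := float64(k+1) * float64(k+1)
	return math.Pow(1-withheld/total, float64(numSamples))
}
```

Since (k+1)^2 / (2k)^2 is always larger than 1/4, the bound stays below (3/4)^s for s samples, i.e. roughly 1.3% for 15 samples under these assumptions.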
    51  
    52  ### UX
    53  
    54  The main change to the light client [command](https://github.com/lazyledger/lazyledger-core/blob/master/cmd/tendermint/commands/light.go#L32-L104) is to add in a new flag to indicate if it should run DAS or not.
    55  Additionally, the user can choose the number of succeeding samples required for a block to be considered available.
    56  
    57  ```diff
    58  ===================================================================
    59  diff --git a/cmd/tendermint/commands/light.go b/cmd/tendermint/commands/light.go
    60  --- a/cmd/tendermint/commands/light.go	(revision 48b043014f0243edd1e8ebad8cd0564ab9100407)
    61  +++ b/cmd/tendermint/commands/light.go	(date 1620546761822)
    62  @@ -64,6 +64,8 @@
    63   	dir                string
    64   	maxOpenConnections int
    65  
    66  +	daSampling     bool
    67  +	numSamples     uint32
    68   	sequential     bool
    69   	trustingPeriod time.Duration
    70   	trustedHeight  int64
    71  @@ -101,6 +103,10 @@
    72   	LightCmd.Flags().BoolVar(&sequential, "sequential", false,
    73   		"sequential verification. Verify all headers sequentially as opposed to using skipping verification",
    74   	)
    75  +	LightCmd.Flags().BoolVar(&daSampling, "da-sampling", false,
    76  +		"data availability sampling. Verify each header (sequential verification), additionally verify data availability via data availability sampling",
    77  +	)
    78  +	LightCmd.Flags().Uint32Var(&numSamples, "num-samples", 15, "Number of data availability samples until block data is deemed available.")
    79   }
    80  ```
    81  
    82  For the Data Availability sampling, the light client will have to run an IPFS node.
    83  It makes sense to make this mostly opaque to the user as everything around IPFS can be [configured](https://github.com/ipfs/go-ipfs/blob/d6322f485af222e319c893eeac51c44a9859e901/docs/config.md) in the `$IPFS_PATH`.
    84  This IPFS path should simply be a sub-directory inside the light client's [directory](https://github.com/lazyledger/lazyledger-core/blob/cbf1f1a4a0472373289a9834b0d33e0918237b7f/cmd/tendermint/commands/light.go#L86-L87).
    85  We can later add the ability to let users configure the IPFS setup at a more granular level.
    86  
    87  **Note:** DAS should only be compatible with sequential verification.
    88  In case a light client is configured to run DAS together with skipping verification, the CLI should return an easy-to-understand warning or even an error explaining why this does not make sense.
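
The check described in the note could look roughly like the sketch below. It is an assumption about how the CLI might resolve the two flags, not existing code: `verificationMode` and `errDASNeedsSequential` are hypothetical names, and `sequentialSet` stands for "the user passed `--sequential` explicitly" (with pflag this could be read via `Flags().Changed("sequential")`).

```go
package main

import "errors"

// errDASNeedsSequential explains why the flag combination is rejected.
var errDASNeedsSequential = errors.New(
	"da-sampling requires sequential verification: skipping verification " +
		"relies on an honest majority of voting power, which DAS is meant to remove")

// verificationMode reports whether sequential verification should be used.
// sequentialSet says whether the user set the sequential flag explicitly.
func verificationMode(daSampling, sequential, sequentialSet bool) (bool, error) {
	if !daSampling {
		// Without DAS the existing behavior is unchanged.
		return sequential, nil
	}
	if sequentialSet && !sequential {
		// The user explicitly asked for skipping verification together with DAS.
		return false, errDASNeedsSequential
	}
	// DAS implies sequential verification.
	return true, nil
}
```

Returning an error instead of a warning is the stricter of the two options mentioned above; a warning that silently falls back to sequential verification would be equally easy to implement.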
    89  
    90  ### Light Client Protocol with DAS
    91  
    92  #### Light Store
    93  
    94  The light client stores data in its own [badgerdb instance](https://github.com/lazyledger/lazyledger-core/blob/50f722a510dd2ba8e3d31931c9d83132d6318d4b/cmd/tendermint/commands/light.go#L125) in the given directory:
    95  
    96  ```go
    97  db, err := badgerdb.NewDB("light-client-db", dir)
    98  ```
    99  
   100  While it is not critical for this feature, we should at least try to re-use that same DB instance for the local ipld store.
   101  Otherwise, we introduce yet another DB instance; something we want to avoid, especially in the long run (see [#283](https://github.com/lazyledger/lazyledger-core/issues/283)).
   102  For the first implementation, it might still be simpler to create a separate DB instance and tackle cleaning this up in a separate pull request, e.g. together with other [instances](https://github.com/lazyledger/lazyledger-core/issues/283).
   103  
   104  #### RPC
   105  
   106  No changes to the RPC endpoints are absolutely required.
   107  However, for convenience and ease of use, we should either add the `DAHeader` to the existing [Commit](https://github.com/lazyledger/lazyledger-core/blob/cbf1f1a4a0472373289a9834b0d33e0918237b7f/rpc/core/routes.go#L25) endpoint or introduce a new endpoint to retrieve the `DAHeader` on demand for a certain height or block hash.
   108  
   109  The first option has the downside that not every light client needs the DAHeader.
   110  The second explicitly reveals to full nodes which clients are doing DAS and which are not.
   111  
   112  **Implementation Note:** The additional (or modified) RPC endpoint could work as a simple first step until we implement downloading the DAHeader from a given data root in the header.
   113  Also, the light client uses a so-called [`Provider`](https://github.com/tendermint/tendermint/blob/7f30bc96f014b27fbe74a546ea912740eabdda74/light/provider/provider.go#L9-L26) to retrieve [LightBlocks](https://github.com/tendermint/tendermint/blob/7f30bc96f014b27fbe74a546ea912740eabdda74/types/light.go#L11-L16), i.e. signed headers and validator sets.
   114  Currently, only the [`http` provider](https://github.com/tendermint/tendermint/blob/7f30bc96f014b27fbe74a546ea912740eabdda74/light/provider/http/http.go#L1) is implemented.
   115  Hence, as _a first implementation step_, we should augment the `Provider` and the `LightBlock` to optionally include the DAHeader (details below).
   116  In parallel but in a separate pull request, we add a separate RPC endpoint to download the DAHeader for a certain height.
   117  
   118  #### Store DataAvailabilityHeader
   119  
   120  For full nodes to be able to serve the `DataAvailabilityHeader` without having to recompute it each time, it needs to be stored somewhere.
   121  While this is independent of the concrete serving mechanism, it is most relevant for the RPC endpoint.
   122  There is ongoing work to make the Tendermint Store only store Headers and the DataAvailabilityHeader in [#218](https://github.com/lazyledger/lazyledger-core/pull/218/) / [#182](https://github.com/lazyledger/lazyledger-core/issues/182).
   123  
   124  At the time of writing this ADR, another pull request ([#312](https://github.com/lazyledger/lazyledger-core/pull/312)) is in the works with a more isolated change that adds the `DataAvailabilityHeader` to the `BlockID`.
   125  Hence, the DAHeader is [stored](https://github.com/lazyledger/lazyledger-core/blob/50f722a510dd2ba8e3d31931c9d83132d6318d4b/store/store.go#L355-L367) alongside the [`BlockMeta`](https://github.com/lazyledger/lazyledger-core/blob/50f722a510dd2ba8e3d31931c9d83132d6318d4b/types/block_meta.go#L11-L17) there.
   126  
   127  For a first implementation, we could build on top of #312 and then adapt to the changed storage API where only headers and the DAHeader are stored inside tendermint's store (as drafted in #218).
   128  A major downside of storing block data inside tendermint's store as well as in the IPFS block store is that the data is not only stored redundantly but the additional I/O-intensive work will also slow down full nodes.
   129  
   130  
   131  #### DAS
   132  
   133  The changes for DAS are very simple from a high-level perspective assuming that the light client has the ability to download the DAHeader along with the required data (signed header + validator set) of a given height:
   134  
   135  Every time the light client validates a retrieved light-block, it additionally starts DAS in the background (once).
   136  For a DAS light client it is important to use [sequential](https://github.com/tendermint/tendermint/blob/f366ae3c875a4f4f61f37f4b39383558ac5a58cc/light/client.go#L46-L53) verification and not [skipping](https://github.com/tendermint/tendermint/blob/f366ae3c875a4f4f61f37f4b39383558ac5a58cc/light/client.go#L55-L69) verification.
   137  Skipping verification only works under the assumption that 2/3+1 of voting power is honest.
   138  The whole point of doing DAS (and state fraud proofs) is to remove that assumption.
   139  See also this related issue in the LL specification: [#159](https://github.com/lazyledger/lazyledger-specs/issues/159).
   140  
   141  Independent of the existing implementation, there are three ways this could be implemented:
   142  1. the DAS light client only accepts a header as valid and trusts it after DAS succeeds (in addition to the tendermint verification), and it waits until DAS succeeds (or there was an error or timeout on the way)
   143  2. (aka 1.5) the DAS light client stages headers where the tendermint verification passes as valid and spins up DAS sampling routines in the background; the staged headers are committed as valid iff all routines successfully return in time
   144  3. the DAS light client optimistically accepts a header as valid and trusts it if the regular tendermint verification succeeds; the DAS is run in the background (with potentially much longer timeouts than in 1.), and if the background routine errs or times out, the already trusted headers are marked as unavailable; this might require rolling back the already trusted headers
   145  
   146  We note that from an implementation point of view approach 1 is not only the simplest, but it would also work best with the currently implemented light client design.
   147  It is the approach that should be implemented first.
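
A minimal sketch of approach 1 is shown below, assuming the regular verification and the sampling are available as two functions. `Header`, `VerifyFunc`, `SampleFunc`, and `verifyWithDAS` are hypothetical stand-ins for illustration; the real light client works on `LightBlock`s and has a richer verification API.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Header is a stand-in for a signed header together with its DAHeader.
type Header struct{ Height int64 }

// VerifyFunc and SampleFunc are hypothetical hooks for the regular
// Tendermint verification and for data availability sampling.
type (
	VerifyFunc func(ctx context.Context, h Header) error
	SampleFunc func(ctx context.Context, h Header, numSamples int) error
)

// verifyWithDAS follows approach 1: the header is reported as valid only
// after Tendermint verification and DAS have both succeeded in time.
func verifyWithDAS(ctx context.Context, h Header, verify VerifyFunc,
	sample SampleFunc, numSamples int, timeout time.Duration) error {
	if err := verify(ctx, h); err != nil {
		return fmt.Errorf("height %d: verification failed: %w", h.Height, err)
	}
	// Bound the time spent sampling so the caller is never blocked forever.
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()
	if err := sample(ctx, h, numSamples); err != nil {
		return fmt.Errorf("height %d: DAS failed: %w", h.Height, err)
	}
	return nil // only now should the caller store h as trusted
}

// exampleApproach1 runs the flow once with a sampler that succeeds and once
// with a sampler that never answers, so the second call hits the timeout.
func exampleApproach1() (okErr, timeoutErr error) {
	verify := func(context.Context, Header) error { return nil }
	available := func(context.Context, Header, int) error { return nil }
	stalled := func(ctx context.Context, _ Header, _ int) error {
		<-ctx.Done() // simulates peers that withhold the data
		return ctx.Err()
	}
	ctx := context.Background()
	okErr = verifyWithDAS(ctx, Header{Height: 1}, verify, available, 15, time.Second)
	timeoutErr = verifyWithDAS(ctx, Header{Height: 2}, verify, stalled, 15, 10*time.Millisecond)
	return okErr, timeoutErr
}
```

Because the function blocks until sampling finishes or times out, it can be dropped into a sequential verification loop without changing how callers treat trusted headers.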
   148  
   149  Approach 2 can be seen as an optimization where the higher latency DAS can be conducted in parallel for various heights.
   150  This could speed up catching up (sequentially) if the light client went offline (for less than the weak subjectivity time window).
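
The staging idea of approach 2 could be sketched as follows. The code assumes that the staged headers already passed Tendermint verification and that DAS for one height is available as a function; `SampleFunc` and `sampleStaged` are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// SampleFunc is a hypothetical hook that runs DAS for one height.
type SampleFunc func(ctx context.Context, height int64) error

// sampleStaged runs DAS for all staged heights in parallel. It returns nil
// only if every routine succeeded before the timeout; in that case the
// caller may commit all staged headers as trusted.
func sampleStaged(ctx context.Context, staged []int64, sample SampleFunc, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(ctx, timeout)
	defer cancel()

	errs := make([]error, len(staged)) // one slot per goroutine, no locking needed
	var wg sync.WaitGroup
	for i, height := range staged {
		wg.Add(1)
		go func(i int, height int64) {
			defer wg.Done()
			errs[i] = sample(ctx, height)
		}(i, height)
	}
	wg.Wait()

	for i, err := range errs {
		if err != nil {
			return fmt.Errorf("DAS failed for staged height %d: %w", staged[i], err)
		}
	}
	return nil
}

// exampleStaged stages heights 10..12 and samples them twice: once with all
// data available and once with the data of height 11 withheld.
func exampleStaged() (allAvailable, oneWithheld error) {
	staged := []int64{10, 11, 12}
	ok := func(context.Context, int64) error { return nil }
	withhold11 := func(_ context.Context, height int64) error {
		if height == 11 {
			return errors.New("samples unavailable")
		}
		return nil
	}
	ctx := context.Background()
	return sampleStaged(ctx, staged, ok, time.Second),
		sampleStaged(ctx, staged, withhold11, time.Second)
}
```

A single failing height rejects the whole batch here, which matches the "iff all routines successfully return in time" wording above.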
   151  
   152  Approach 3 is the most general of all, but it moves the responsibility to wait or to roll back headers to the caller and hence is undesirable as it offers too much flexibility.
   153  
   154  
   155  #### Data Structures
   156  
   157  ##### LightBlock
   158  
   159  As mentioned above, the LightBlock should optionally contain the DataAvailabilityHeader.
   160  ```diff
   161  Index: types/light.go
   162  ===================================================================
   163  diff --git a/types/light.go b/types/light.go
   164  --- a/types/light.go	(revision 64044aa2f2f2266d1476013595aa33bb274ba161)
   165  +++ b/types/light.go	(date 1620481205049)
   166  @@ -13,6 +13,9 @@
   167   type LightBlock struct {
   168   	*SignedHeader `json:"signed_header"`
   169   	ValidatorSet  *ValidatorSet `json:"validator_set"`
   170  +
   171  +	// DataAvailabilityHeader is only populated for DAS light clients; for others it can be nil.
   172  +	DataAvailabilityHeader *DataAvailabilityHeader `json:"data_availability_header"`
   173   }
   174  ```
   175  
   176  Alternatively, we could introduce a `DASLightBlock` that embeds a `LightBlock` and has the `DataAvailabilityHeader` as the only (non-optional) field.
   177  This would be more explicit as it is a new type.
   178  In contrast, adding a field to the existing `LightBlock` is backwards compatible and does not require any further code changes; the new type requires `To`- and `FromProto` functions at least.
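
For comparison, the alternative with a dedicated type could look like the sketch below. All types are trimmed stand-ins for the ones in the `types` package, and `ValidateBasic` on `DASLightBlock` is a hypothetical helper that only illustrates the non-optional field.

```go
package main

import "errors"

// Trimmed stand-ins for the real types in the types package.
type (
	SignedHeader           struct{ Height int64 }
	ValidatorSet           struct{}
	DataAvailabilityHeader struct{}
)

// LightBlock mirrors the shape of the existing, unmodified type.
type LightBlock struct {
	*SignedHeader
	ValidatorSet *ValidatorSet
}

// DASLightBlock embeds a LightBlock and always carries a DAHeader.
type DASLightBlock struct {
	*LightBlock
	DataAvailabilityHeader *DataAvailabilityHeader
}

// ValidateBasic rejects blocks that lack the (non-optional) DAHeader.
func (b *DASLightBlock) ValidateBasic() error {
	if b.LightBlock == nil {
		return errors.New("missing light block")
	}
	if b.DataAvailabilityHeader == nil {
		return errors.New("missing data availability header")
	}
	return nil
}
```

Thanks to embedding, fields such as `Height` remain reachable directly on a `DASLightBlock`, so most read-only code would not need to change.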
   179  
   180  ##### Provider
   181  
   182  The [`Provider`](https://github.com/tendermint/tendermint/blob/7f30bc96f014b27fbe74a546ea912740eabdda74/light/provider/provider.go#L9-L26) should be changed to additionally provide the `DataAvailabilityHeader` to enable DAS light clients.
   183  Implementations of the interface need to additionally retrieve the `DataAvailabilityHeader` for the [modified LightBlock](#lightblock).
   184  Users of the provider need to indicate this to the provider.
   185  
   186  We could either augment the `LightBlock` method with a flag, add a new method solely for providing the `DataAvailabilityHeader`, or, we could introduce a new method for DAS light clients.
   187  
   188  The last option is preferable because it is the most explicit, and code that does not use DAS needs no changes.
   189  
   190  Hence:
   191  
   192  ```diff
   193  Index: light/provider/provider.go
   194  ===================================================================
   195  diff --git a/light/provider/provider.go b/light/provider/provider.go
   196  --- a/light/provider/provider.go	(revision 7d06ae28196e8765c9747aca9db7d2732f56cfc3)
   197  +++ b/light/provider/provider.go	(date 1620298115962)
   198  @@ -21,6 +21,14 @@
   199   	// error is returned.
   200   	LightBlock(ctx context.Context, height int64) (*types.LightBlock, error)
   201  
   202  +	// DASLightBlock returns the LightBlock containing the DataAvailabilityHeader.
   203  +	// Other than including the DataAvailabilityHeader it behaves exactly the same
   204  +	// as LightBlock.
   205  +	//
   206  +	// It can be used by DAS light clients.
   207  +	DASLightBlock(ctx context.Context, height int64) (*types.LightBlock, error)
   208  +
   209  +
   210   	// ReportEvidence reports an evidence of misbehavior.
   211   	ReportEvidence(context.Context, types.Evidence) error
   212   }
   213  ```
   214  Alternatively, with the exact same result, we could embed the existing `Provider` into a new interface, e.g. `DASProvider`, that adds this method.
   215  This is equivalent to the above; which approach is better will become clearer once we have spent more time on the implementation.
   216  
   217  Regular light clients will call `LightBlock` and DAS light clients will call `DASLightBlock`.
   218  In the first case the result will be the same as for vanilla Tendermint and in the second case the returned `LightBlock` will additionally contain the `DataAvailabilityHeader` of the requested height.
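
The embedding variant and the resulting call pattern can be sketched as follows. The `Provider` here is trimmed to the one relevant method, `LightBlock` is a simplified stand-in for `types.LightBlock`, and `fetchLightBlock` and `mockProvider` are hypothetical helpers for illustration only.

```go
package main

import (
	"context"
	"errors"
)

// LightBlock is a stand-in for types.LightBlock; DAHeader is nil unless the
// block was requested for DAS.
type LightBlock struct {
	Height   int64
	DAHeader *struct{}
}

// Provider is trimmed to the one method relevant here.
type Provider interface {
	LightBlock(ctx context.Context, height int64) (*LightBlock, error)
}

// DASProvider embeds Provider and adds the DAS-specific method.
type DASProvider interface {
	Provider
	DASLightBlock(ctx context.Context, height int64) (*LightBlock, error)
}

// fetchLightBlock picks the method depending on whether the client does DAS.
func fetchLightBlock(ctx context.Context, p Provider, height int64, das bool) (*LightBlock, error) {
	if !das {
		return p.LightBlock(ctx, height)
	}
	dp, ok := p.(DASProvider)
	if !ok {
		return nil, errors.New("provider cannot serve the DataAvailabilityHeader")
	}
	return dp.DASLightBlock(ctx, height)
}

// mockProvider is an in-memory DASProvider used only for illustration.
type mockProvider struct{}

func (mockProvider) LightBlock(_ context.Context, h int64) (*LightBlock, error) {
	return &LightBlock{Height: h}, nil
}

func (mockProvider) DASLightBlock(_ context.Context, h int64) (*LightBlock, error) {
	return &LightBlock{Height: h, DAHeader: &struct{}{}}, nil
}

// exampleFetch requests height 7 as a regular and as a DAS light client.
func exampleFetch() (regular, das *LightBlock) {
	ctx := context.Background()
	regular, _ = fetchLightBlock(ctx, mockProvider{}, 7, false)
	das, _ = fetchLightBlock(ctx, mockProvider{}, 7, true)
	return regular, das
}
```

With a separate `DASProvider` interface, a provider that cannot serve the DAHeader is detected through a failed type assertion at runtime, whereas adding the method to `Provider` itself would force every implementation to support it at compile time.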
   219  
   220  #### Running an IPFS node
   221  
   222  We already have methods to [initialize](https://github.com/lazyledger/lazyledger-core/blob/cbf1f1a4a0472373289a9834b0d33e0918237b7f/cmd/tendermint/commands/init.go#L116-L157) and [run](https://github.com/lazyledger/lazyledger-core/blob/cbf1f1a4a0472373289a9834b0d33e0918237b7f/node/node.go#L1449-L1488) an IPFS node in place.
   223  These need to be refactored such that they can effectively be used for the light client as well.
   224  This means:
   225  1. these methods need to be exported and available in a place that does not introduce interdependence of go packages
   226  2. users should be able to run a light client with a single command and hence most of the initialization logic should be coupled with creating the actual IPFS node and [made independent](https://github.com/lazyledger/lazyledger-core/blob/cbf1f1a4a0472373289a9834b0d33e0918237b7f/cmd/tendermint/commands/init.go#L119-L120) of the `tendermint init` command
   227  
   228  An example for 2. can be found in the IPFS [code](https://github.com/ipfs/go-ipfs/blob/cd72589cfd41a5397bb8fc9765392bca904b596a/cmd/ipfs/daemon.go#L239) itself.
   229  We might want to provide a slightly different default initialization though (see how this is [overridable](https://github.com/ipfs/go-ipfs/blob/cd72589cfd41a5397bb8fc9765392bca904b596a/cmd/ipfs/daemon.go#L164-L165) in the ipfs daemon cmd).
   230  
   231  We note that for operating a fully functional light client the IPFS node could be running in client mode [`dht.ModeClient`](https://github.com/libp2p/go-libp2p-kad-dht/blob/09d923fcf68218181b5cd329bf5199e767bd33c3/dht_options.go#L29-L30), but we actually want light clients to also respond to incoming queries, e.g. from other light clients.
   232  Hence, they should by default run in [`dht.ModeServer`](https://github.com/libp2p/go-libp2p-kad-dht/blob/09d923fcf68218181b5cd329bf5199e767bd33c3/dht_options.go#L31-L32).
   233  In an environment where bandwidth must be saved, or where the network conditions do not allow the server mode, we make it easy to change the default behavior.
   234  
   235  ##### Client
   236  
   237  We add another [`Option`](https://github.com/tendermint/tendermint/blob/a91680efee3653e3de620f24eb8ddca1c95ce8f9/light/client.go#L43-L117) to the [`Client`](https://github.com/tendermint/tendermint/blob/a91680efee3653e3de620f24eb8ddca1c95ce8f9/light/client.go#L173) that indicates that this client does DAS.
   238  
   239  This option indicates:
   240  1. to do sequential verification and
   241  2. to request [`DASLightBlocks`](#lightblock) from the [provider](#provider).
   242  
   243  All other changes should affect unexported methods only.
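
As a rough illustration, such an option could follow the functional-options pattern the light client already uses. The sketch below uses a heavily trimmed stand-in for `light.Client`; the option name `DataAvailabilitySampling` and the fields are assumptions, not existing API.

```go
package main

// mode mirrors the two verification modes of the light client.
type mode int

const (
	skipping mode = iota
	sequential
)

// Client is a heavily trimmed stand-in for light.Client.
type Client struct {
	verificationMode mode
	dasEnabled       bool
	numSamples       uint32
}

// Option configures a Client, as in the existing light client package.
type Option func(*Client)

// DataAvailabilitySampling is a hypothetical option that turns on DAS.
// It forces sequential verification and makes the client request
// DASLightBlocks from its provider.
func DataAvailabilitySampling(numSamples uint32) Option {
	return func(c *Client) {
		c.verificationMode = sequential
		c.dasEnabled = true
		c.numSamples = numSamples
	}
}

// NewClient applies the options on top of the defaults (skipping
// verification, no DAS).
func NewClient(opts ...Option) *Client {
	c := &Client{verificationMode: skipping}
	for _, opt := range opts {
		opt(c)
	}
	return c
}
```

A caller would then write something like `NewClient(DataAvailabilitySampling(15))`, while existing callers that pass no such option keep the current behavior.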
   244  
   245  ##### ValidateAvailability
   246  
   247  In order for the light clients to perform DAS to validate availability, they do not need to be aware of the fact that an IPFS node is run.
   248  Instead, we can use the existing [`ValidateAvailability`](https://github.com/lazyledger/lazyledger-core/blame/master/p2p/ipld/validate.go#L23-L28) function (as defined in [ADR 002](adr-002-ipld-da-sampling.md) and implemented in [#270](https://github.com/lazyledger/lazyledger-core/pull/270)).
   249  Note that this expects an ipfs core API object `CoreAPI` to be passed in.
   250  Using that interface has the major benefit that we could even change the requirement that the light client itself runs the IPFS node without changing most of the validation logic.
   251  E.g., the IPFS node (with our custom IPLD plugin) could run in a different process (or machine), and we could still just pass in that same `CoreAPI` interface.
   252  
   253  Orthogonal to this ADR, we also note that we could change all IPFS readonly methods to accept the minimal interface they actually use, namely something that implements `ResolveNode` (and maybe additionally a `NodeGetter`).
   254  
   255  `ValidateAvailability` needs to be called each time a header is validated.
   256  A DAS light client will have to request the `DASLightBlock` for this as per above to be able to pass in a `DataAvailabilityHeader`.
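
To illustrate why a narrow read-only interface is sufficient for sampling, here is a simplified sketch. It is not the real `ValidateAvailability`: `ShareGetter`, `DAHeader`, and `sampleAvailability` are hypothetical stand-ins, and unlike the real implementation the sketch does not check retrieved data against the roots.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"math/rand"
)

// ShareGetter is a hypothetical minimal read-only interface; the real code
// passes an IPFS CoreAPI instead.
type ShareGetter interface {
	GetShare(ctx context.Context, rowRoot []byte, col int) ([]byte, error)
}

// DAHeader is a stand-in holding one root per row of the extended square.
type DAHeader struct{ RowRoots [][]byte }

// sampleAvailability requests numSamples random shares and fails on the
// first one that cannot be retrieved.
func sampleAvailability(ctx context.Context, g ShareGetter, dah *DAHeader, numSamples int, rng *rand.Rand) error {
	width := len(dah.RowRoots)
	if width == 0 {
		return errors.New("empty DAHeader")
	}
	for i := 0; i < numSamples; i++ {
		row, col := rng.Intn(width), rng.Intn(width)
		if _, err := g.GetShare(ctx, dah.RowRoots[row], col); err != nil {
			return fmt.Errorf("share (%d,%d) unavailable: %w", row, col, err)
		}
	}
	return nil
}

// fakeGetter serves every share unless withhold is set.
type fakeGetter struct{ withhold bool }

func (f fakeGetter) GetShare(context.Context, []byte, int) ([]byte, error) {
	if f.withhold {
		return nil, errors.New("not found")
	}
	return []byte{0x1}, nil
}

// exampleSampling samples a 4x4 extended square with and without withholding.
func exampleSampling() (available, withheld error) {
	dah := &DAHeader{RowRoots: make([][]byte, 4)}
	rng := rand.New(rand.NewSource(1))
	ctx := context.Background()
	return sampleAvailability(ctx, fakeGetter{}, dah, 15, rng),
		sampleAvailability(ctx, fakeGetter{withhold: true}, dah, 15, rng)
}
```

Because the sampler only depends on the small interface, the same logic works whether the shares come from an embedded IPFS node, a remote one, or a test double like `fakeGetter`.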
   257  
   258  #### Testing
   259  
   260  Ideally, we add the DAS light client to the existing e2e tests.
   261  It might be worth catching up with some relevant changes from tendermint upstream,
   262  in particular [tendermint/tendermint#6196](https://github.com/tendermint/tendermint/pull/6196) and the previous changes that it depends on.
   263  
   264  Additionally, we should provide a simple example in the documentation that walks through the DAS light client.
   265  It would be good if the light client logs some (info) output related to DAS to provide feedback to the user.
   266  
   267  ## Status
   268  
   269  Proposed
   270  
   271  ## Consequences
   272  
   273  ### Positive
   274  
   275  - simple to implement and understand
   276  - familiar to tendermint / Cosmos devs
   277  - allows trying out the MVP without relying on the [lazyledger-app](https://github.com/lazyledger/lazyledger-app) (instead a simple abci app like a modified [KVStore](https://github.com/lazyledger/lazyledger-core/blob/42e4e8b58ebc58ebd663c114d2bcd7ab045b1c55/abci/example/kvstore/README.md) app could be used to demo the DAS light client)
   278  
   279  ### Negative
   280  
   281  - light client does not discover peers
   282  - requires the light client that currently runs simple RPC requests only to run an IPFS node
   283  - rpc makes it extremely easy to infer which light clients are doing DAS and which are not
   284  - the initial light client implementation might still be confusing to devs familiar with tendermint/Cosmos for the reason that it does DAS (and state fraud proofs) to get rid of the underlying honest majority assumption, but it will still do all checks related to that same honest majority assumption (e.g. download validator sets, Commits and validate that > 2/3 of them signed the header)
   285  
   286  ### Neutral
   287  
   288  DAS light clients need to additionally obtain the DAHeader from the data root in the header to be able to actually do DAS.
   289  
   290  ## References
   291  
   292  We have linked all references above inside the text already.