# ADR 004: Data Availability Sampling Light Client

## Changelog

- 2021-05-03: Initial Draft

## Context

We decided to augment the existing [RPC-based Tendermint light client](https://github.com/tendermint/tendermint/blob/bc643b19c48495077e0394d3e21e1d2a52c99548/light/doc.go#L2-L126) by adding the option to additionally validate blocks via Data Availability Sampling (DAS).
In general, DAS gives light clients assurance that the data behind the block header they validated is actually available in the network and hence, that state fraud proofs could be generated.
See [ADR 002](adr-002-ipld-da-sampling.md) for more context on DAS.

A great introduction to the Tendermint light client (and light clients in general) can be found in this series of [blog posts](https://medium.com/tendermint/everything-you-need-to-know-about-the-tendermint-light-client-f80d03856f98) as well as this [paper](https://arxiv.org/abs/2010.07031).

This ADR describes the changes necessary to augment the existing Tendermint light client implementation with DAS from a UX as well as from a protocol perspective.

## Alternative Approaches

Ideally, the light client should not just request [signed headers](https://github.com/tendermint/tendermint/blob/bc643b19c48495077e0394d3e21e1d2a52c99548/light/doc.go#L35-L52) from [a few pre-configured peers](https://github.com/tendermint/tendermint/blob/bc643b19c48495077e0394d3e21e1d2a52c99548/light/setup.go#L51-L52) but instead also discover peers from a p2p network.
We will eventually implement this. For more context, we refer to this [issue](https://github.com/lazyledger/lazyledger-core/issues/86).
This would require that the (signed) headers are provided via other means than the RPC.
See this [abandoned pull request](https://github.com/tendermint/tendermint/pull/4508) and [issue](https://github.com/tendermint/tendermint/issues/4456) in the Tendermint repository and also this [suggestion](https://github.com/lazyledger/lazyledger-core/issues/86#issuecomment-831182564) by [@Wondertan](https://github.com/Wondertan) in this repository.

For some use-cases, like DAS light validator nodes, or the light clients of a Data Availability Layer that are run by full nodes of an Optimistic Rollup, it would even make sense that the light client (passively) participates in the consensus protocol to some extent; i.e. runs a subset of the consensus reactor so that consensus messages ([Votes](https://github.com/tendermint/tendermint/blob/bc643b19c48495077e0394d3e21e1d2a52c99548/types/vote.go#L48-L59) etc.) come in as early as possible.
Then light clients would not need to wait for the canonical commit to be included in the next [block](https://github.com/tendermint/tendermint/blob/bc643b19c48495077e0394d3e21e1d2a52c99548/types/block.go#L48).

For the RPC-based light client it could also make sense to add a new RPC endpoint to tendermint for clients to retrieve the [`DataAvailabilityHeader`](https://github.com/lazyledger/lazyledger-core/blob/50f722a510dd2ba8e3d31931c9d83132d6318d4b/types/block.go#L52-L69) (DAHeader), or to embed the DAHeader into the response of an existing endpoint.
The [Commit](https://github.com/lazyledger/lazyledger-core/blob/cbf1f1a4a0472373289a9834b0d33e0918237b7f/rpc/core/routes.go#L25) only contains the [SignedHeader](https://github.com/lazyledger/lazyledger-core/blob/cbf1f1a4a0472373289a9834b0d33e0918237b7f/rpc/core/types/responses.go#L32-L36) (Header and Commit signatures).
Not all light clients will need the full DAHeader though (e.g. super-light-clients do not).


## Decision

For our MVP, we [decide](https://github.com/lazyledger/lazyledger-core/issues/307) to only modify the existing RPC-endpoint based light client.
This is mostly because we want to ship our MVP as quickly as possible, but independently of this it makes sense to provide a familiar experience for engineers coming from the Cosmos ecosystem.

We will later implement the above-mentioned variants.
How exactly will be described in separate ADRs though.

## Detailed Design

From a user perspective very little changes:
the existing light client command gets an additional flag that indicates whether to run DAS or not.
Additionally, the light client operator can decide how many successful samples are required to deem the block available (and hence valid).

In case DAS is enabled, the light client will need to:
1. retrieve the DAHeader corresponding to the data root in the Header
2. request a parameterizable number of random samples.

If all sampling requests succeed, the whole block is available ([with some high enough probability](https://arxiv.org/abs/1809.09044)).

### UX

The main change to the light client [command](https://github.com/lazyledger/lazyledger-core/blob/master/cmd/tendermint/commands/light.go#L32-L104) is to add a new flag to indicate if it should run DAS or not.
Additionally, the user can choose the number of succeeding samples required for a block to be considered available.

```diff
===================================================================
diff --git a/cmd/tendermint/commands/light.go b/cmd/tendermint/commands/light.go
--- a/cmd/tendermint/commands/light.go	(revision 48b043014f0243edd1e8ebad8cd0564ab9100407)
+++ b/cmd/tendermint/commands/light.go	(date 1620546761822)
@@ -64,6 +64,8 @@
 	dir                string
 	maxOpenConnections int
 
+	daSampling     bool
+	numSamples     uint32
 	sequential     bool
 	trustingPeriod time.Duration
 	trustedHeight  int64
@@ -101,6 +103,10 @@
 	LightCmd.Flags().BoolVar(&sequential, "sequential", false,
 		"sequential verification. Verify all headers sequentially as opposed to using skipping verification",
 	)
+	LightCmd.Flags().BoolVar(&daSampling, "da-sampling", false,
+		"data availability sampling. Verify each header (sequential verification), additionally verify data availability via data availability sampling",
+	)
+	LightCmd.Flags().Uint32Var(&numSamples, "num-samples", 15, "Number of data availability samples until block data deemed available.")
 }
```

For the Data Availability sampling, the light client will have to run an IPFS node.
It makes sense to make this mostly opaque to the user as everything around IPFS can be [configured](https://github.com/ipfs/go-ipfs/blob/d6322f485af222e319c893eeac51c44a9859e901/docs/config.md) in the `$IPFS_PATH`.
This IPFS path should simply be a sub-directory inside the light client's [directory](https://github.com/lazyledger/lazyledger-core/blob/cbf1f1a4a0472373289a9834b0d33e0918237b7f/cmd/tendermint/commands/light.go#L86-L87).
We can later add the ability to let users configure the IPFS setup at a finer granularity.

**Note:** DAS should only be compatible with sequential verification.
In case a light client is parametrized to run DAS and skipping verification, the CLI should return an easy-to-understand warning or even an error explaining why this does not make sense.

### Light Client Protocol with DAS

#### Light Store

The light client stores data in its own [badgerdb instance](https://github.com/lazyledger/lazyledger-core/blob/50f722a510dd2ba8e3d31931c9d83132d6318d4b/cmd/tendermint/commands/light.go#L125) in the given directory:

```go
db, err := badgerdb.NewDB("light-client-db", dir)
```

While it is not critical for this feature, we should at least try to re-use that same DB instance for the local ipld store.
Otherwise, we introduce yet another DB instance; something we want to avoid, especially in the long run (see [#283](https://github.com/lazyledger/lazyledger-core/issues/283)).
For the first implementation, it might still be simpler to create a separate DB instance and tackle cleaning this up in a separate pull request, e.g. together with other [instances](https://github.com/lazyledger/lazyledger-core/issues/283).

#### RPC

No changes to the RPC endpoints are strictly required.
For convenience and ease of use, though, we should either add the `DAHeader` to the existing [Commit](https://github.com/lazyledger/lazyledger-core/blob/cbf1f1a4a0472373289a9834b0d33e0918237b7f/rpc/core/routes.go#L25) endpoint, or introduce a new endpoint to retrieve the `DAHeader` on demand for a certain height or block hash.

The first has the downside that not every light client needs the DAHeader.
The second explicitly reveals to full nodes which clients are doing DAS and which are not.

**Implementation Note:** The additional (or modified) RPC endpoint could work as a simple first step until we implement downloading the DAHeader from a given data root in the header.
Also, the light client uses a so-called [`Provider`](https://github.com/tendermint/tendermint/blob/7f30bc96f014b27fbe74a546ea912740eabdda74/light/provider/provider.go#L9-L26) to retrieve [LightBlocks](https://github.com/tendermint/tendermint/blob/7f30bc96f014b27fbe74a546ea912740eabdda74/types/light.go#L11-L16), i.e. signed headers and validator sets.
Currently, only the [`http` provider](https://github.com/tendermint/tendermint/blob/7f30bc96f014b27fbe74a546ea912740eabdda74/light/provider/http/http.go#L1) is implemented.
Hence, as _a first implementation step_, we should augment the `Provider` and the `LightBlock` to optionally include the DAHeader (details below).
In parallel but in a separate pull request, we add a separate RPC endpoint to download the DAHeader for a certain height.

#### Store DataAvailabilityHeader

For full nodes to be able to serve the `DataAvailabilityHeader` without having to recompute it each time, it needs to be stored somewhere.
While this is independent of the concrete serving mechanism, it is particularly relevant for the RPC endpoint.
There is ongoing work to make the Tendermint Store only store Headers and the DataAvailabilityHeader in [#218](https://github.com/lazyledger/lazyledger-core/pull/218/) / [#182](https://github.com/lazyledger/lazyledger-core/issues/182).

At the time of writing this ADR, another pull request ([#312](https://github.com/lazyledger/lazyledger-core/pull/312)) is in the works with a more isolated change that adds the `DataAvailabilityHeader` to the `BlockID`.
Hence, the DAHeader is [stored](https://github.com/lazyledger/lazyledger-core/blob/50f722a510dd2ba8e3d31931c9d83132d6318d4b/store/store.go#L355-L367) along the [`BlockMeta`](https://github.com/lazyledger/lazyledger-core/blob/50f722a510dd2ba8e3d31931c9d83132d6318d4b/types/block_meta.go#L11-L17) there.

For a first implementation, we could first build on top of #312 and adapt to the changed storage API where only headers and the DAHeader are stored inside tendermint's store (as drafted in #218).
A major downside of storing block data inside of tendermint's store as well as in the IPFS block store is that the data is not only stored redundantly, but the extra writes are also IO-intensive work that will slow down full nodes.


#### DAS

The changes for DAS are very simple from a high-level perspective, assuming that the light client has the ability to download the DAHeader along with the required data (signed header + validator set) of a given height:

Every time the light client validates a retrieved light-block, it additionally starts DAS in the background (once).
For a DAS light client it is important to use [sequential](https://github.com/tendermint/tendermint/blob/f366ae3c875a4f4f61f37f4b39383558ac5a58cc/light/client.go#L46-L53) verification and not [skipping](https://github.com/tendermint/tendermint/blob/f366ae3c875a4f4f61f37f4b39383558ac5a58cc/light/client.go#L55-L69) verification.
Skipping verification only works under the assumption that 2/3+1 of voting power is honest.
The whole point of doing DAS (and state fraud proofs) is to remove that assumption.
See also this related issue in the LL specification: [#159](https://github.com/lazyledger/lazyledger-specs/issues/159).

Independent of the existing implementation, there are three ways this could be implemented:
1. the DAS light client only accepts a header as valid and trusts it after DAS succeeds (additionally to the tendermint verification), and it waits until DAS succeeds (or there was an error or timeout on the way)
2. (aka 1.5) the DAS light client stages headers where the tendermint verification passes as valid and spins up DAS sampling routines in the background; the staged headers are committed as valid iff all routines successfully return in time
3. the DAS light client optimistically accepts a header as valid and trusts it if the regular tendermint verification succeeds; the DAS is run in the background (with potentially much longer timeouts than in 1.) and if the background routine fails (errs or times out), the already trusted headers are marked as unavailable; this might require rolling back the already trusted headers

We note that from an implementation point of view 1. is not only the simplest approach, but it would also work best with the currently implemented light client design.
It is the approach that should be implemented first.

The 2. approach can be seen as an optimization where the higher latency DAS can be conducted in parallel for various heights.
This could speed up catching up (sequentially) if the light client went offline (for shorter than the weak subjectivity time window).

The 3. approach is the most general of all, but it moves the responsibility to wait or to roll back headers to the caller and hence is undesirable as it offers too much flexibility.


#### Data Structures

##### LightBlock

As mentioned above, the LightBlock should optionally contain the DataAvailabilityHeader.
```diff
Index: types/light.go
===================================================================
diff --git a/types/light.go b/types/light.go
--- a/types/light.go	(revision 64044aa2f2f2266d1476013595aa33bb274ba161)
+++ b/types/light.go	(date 1620481205049)
@@ -13,6 +13,9 @@
 type LightBlock struct {
 	*SignedHeader `json:"signed_header"`
 	ValidatorSet  *ValidatorSet `json:"validator_set"`
+
+	// DataAvailabilityHeader is only populated for DAS light clients; for others it can be nil.
+	DataAvailabilityHeader *DataAvailabilityHeader `json:"data_availability_header"`
 }
```

Alternatively, we could introduce a `DASLightBlock` that embeds a `LightBlock` and has the `DataAvailabilityHeader` as the only (non-optional) field.
This would be more explicit as it is a new type.
In contrast, adding a field to the existing `LightBlock` is backwards compatible and does not require any further code changes; the new type requires `To`- and `FromProto` functions at least.

##### Provider

The [`Provider`](https://github.com/tendermint/tendermint/blob/7f30bc96f014b27fbe74a546ea912740eabdda74/light/provider/provider.go#L9-L26) should be changed to additionally provide the `DataAvailabilityHeader` to enable DAS light clients.
Implementations of the interface need to additionally retrieve the `DataAvailabilityHeader` for the [modified LightBlock](#lightblock).
Users of the provider need to indicate this to the provider.

We could either augment the `LightBlock` method with a flag, add a new method solely for providing the `DataAvailabilityHeader`, or introduce a new method for DAS light clients.

The latter is preferable because it is the most explicit and clear, and it leaves places where DAS is not used without any code changes.

Hence:

```diff
Index: light/provider/provider.go
===================================================================
diff --git a/light/provider/provider.go b/light/provider/provider.go
--- a/light/provider/provider.go	(revision 7d06ae28196e8765c9747aca9db7d2732f56cfc3)
+++ b/light/provider/provider.go	(date 1620298115962)
@@ -21,6 +21,14 @@
 	// error is returned.
 	LightBlock(ctx context.Context, height int64) (*types.LightBlock, error)
 
+	// DASLightBlock returns the LightBlock containing the DataAvailabilityHeader.
+	// Other than including the DataAvailabilityHeader it behaves exactly the same
+	// as LightBlock.
+	//
+	// It can be used by DAS light clients.
+	DASLightBlock(ctx context.Context, height int64) (*types.LightBlock, error)
+
+
 	// ReportEvidence reports an evidence of misbehavior.
 	ReportEvidence(context.Context, types.Evidence) error
 }
```
Alternatively, with the exact same result, we could embed the existing `Provider` into a new interface: e.g. `DASProvider` that adds this method.
This is completely equivalent to the above, and which approach is better will become clearer once we have spent more time on the implementation.

Regular light clients will call `LightBlock` and DAS light clients will call `DASLightBlock`.
In the first case the result will be the same as for vanilla Tendermint and in the second case the returned `LightBlock` will additionally contain the `DataAvailabilityHeader` of the requested height.

#### Running an IPFS node

We already have methods to [initialize](https://github.com/lazyledger/lazyledger-core/blob/cbf1f1a4a0472373289a9834b0d33e0918237b7f/cmd/tendermint/commands/init.go#L116-L157) and [run](https://github.com/lazyledger/lazyledger-core/blob/cbf1f1a4a0472373289a9834b0d33e0918237b7f/node/node.go#L1449-L1488) an IPFS node in place.
These need to be refactored such that they can effectively be used for the light client as well.
This means:
1. these methods need to be exported and available in a place that does not introduce interdependence of go packages
2. users should be able to run a light client with a single command and hence most of the initialization logic should be coupled with creating the actual IPFS node and [made independent](https://github.com/lazyledger/lazyledger-core/blob/cbf1f1a4a0472373289a9834b0d33e0918237b7f/cmd/tendermint/commands/init.go#L119-L120) of the `tendermint init` command

An example for 2. can be found in the IPFS [code](https://github.com/ipfs/go-ipfs/blob/cd72589cfd41a5397bb8fc9765392bca904b596a/cmd/ipfs/daemon.go#L239) itself.
We might want to provide a slightly different default initialization though (see how this is [overridable](https://github.com/ipfs/go-ipfs/blob/cd72589cfd41a5397bb8fc9765392bca904b596a/cmd/ipfs/daemon.go#L164-L165) in the ipfs daemon cmd).

We note that for operating a fully functional light client the IPFS node could be running in client mode [`dht.ModeClient`](https://github.com/libp2p/go-libp2p-kad-dht/blob/09d923fcf68218181b5cd329bf5199e767bd33c3/dht_options.go#L29-L30), but we actually want light clients to also respond to incoming queries, e.g. from other light clients.
Hence, they should by default run in [`dht.ModeServer`](https://github.com/libp2p/go-libp2p-kad-dht/blob/09d923fcf68218181b5cd329bf5199e767bd33c3/dht_options.go#L31-L32).
In an environment where bandwidth must be saved, or where the network conditions do not allow the server mode, we make it easy to change the default behavior.

##### Client

We add another [`Option`](https://github.com/tendermint/tendermint/blob/a91680efee3653e3de620f24eb8ddca1c95ce8f9/light/client.go#L43-L117) to the [`Client`](https://github.com/tendermint/tendermint/blob/a91680efee3653e3de620f24eb8ddca1c95ce8f9/light/client.go#L173) that indicates that this client does DAS.

This option indicates:
1. to do sequential verification and
2. to request [`DASLightBlocks`](#lightblock) from the [provider](#provider).

All other changes should affect unexported methods only.

##### ValidateAvailability

In order for the light clients to perform DAS to validate availability, they do not need to be aware of the fact that an IPFS node is run.
Instead, we can use the existing [`ValidateAvailability`](https://github.com/lazyledger/lazyledger-core/blame/master/p2p/ipld/validate.go#L23-L28) function (as defined in [ADR 002](adr-002-ipld-da-sampling.md) and implemented in [#270](https://github.com/lazyledger/lazyledger-core/pull/270)).
Note that this expects an ipfs core API object `CoreAPI` to be passed in.
Using that interface has the major benefit that we could even change the requirement that the light client itself runs the IPFS node without changing most of the validation logic.
E.g., the IPFS node (with our custom IPLD plugin) could run in a different process (or machine), and we could still just pass in that same `CoreAPI` interface.

Orthogonal to this ADR, we also note that we could change all IPFS readonly methods to accept the minimal interface they actually use, namely something that implements `ResolveNode` (and maybe additionally a `NodeGetter`).

`ValidateAvailability` needs to be called each time a header is validated.
A DAS light client will have to request the `DASLightBlock` for this as per above to be able to pass in a `DataAvailabilityHeader`.

#### Testing

Ideally, we add the DAS light client to the existing e2e tests.
It might be worthwhile to catch up with some relevant changes from tendermint upstream,
in particular [tendermint/tendermint#6196](https://github.com/tendermint/tendermint/pull/6196) and previous changes that it depends on.

Additionally, we should provide a simple example in the documentation that walks through the DAS light client.
It would be good if the light client logs some (info) output related to DAS to provide feedback to the user.

## Status

Proposed

## Consequences

### Positive

- simple to implement and understand
- familiar to tendermint / Cosmos devs
- allows trying out the MVP without relying on the [lazyledger-app](https://github.com/lazyledger/lazyledger-app) (instead a simple abci app like a modified [KVStore](https://github.com/lazyledger/lazyledger-core/blob/42e4e8b58ebc58ebd663c114d2bcd7ab045b1c55/abci/example/kvstore/README.md) app could be used to demo the DAS light client)

### Negative

- light client does not discover peers
- requires the light client that currently runs simple RPC requests only to run an IPFS node
- rpc makes it extremely easy to infer which light clients are doing DAS and which are not
- the initial light client implementation might still be confusing to devs familiar with tendermint/Cosmos for the reason that it does DAS (and state fraud proofs) to get rid of the underlying honest majority assumption, but it will still do all checks related to that same honest majority assumption (e.g. download validator sets, Commits and validate that > 2/3 of them signed the header)

### Neutral

DAS light clients need to additionally obtain the DAHeader from the data root in the header to be able to actually do DAS.

## References

We have linked all references above inside the text already.