# Proofs

What sets IAVL apart from most other key/value stores is the ability to return
[Merkle proofs](https://en.wikipedia.org/wiki/Merkle_tree) along with values. These proofs can
be used to verify that a returned value is, in fact, the value contained within a given IAVL tree.
This verification is done by comparing the proof's root hash with the tree's root hash.

Somewhat simplified, an IAVL tree is a variant of a
[binary search tree](https://en.wikipedia.org/wiki/Binary_search_tree) where inner nodes contain
keys used for binary search, and leaf nodes contain the actual key/value pairs ordered by key.
Consider the following example, containing five key/value pairs (such as key `a` with value `1`):

```
            d
          /   \
        c       e
      /   \    /  \
    b     c=3 d=4 e=5
  /   \
a=1   b=2
```

In reality, IAVL nodes contain more data than shown here - for details please refer to the
[node documentation](../node/node.md). However, this simplified version is sufficient for an
overview.

A cryptographically secure hash is generated for each node in the tree by hashing the node's key
and value (if leaf node), version, and height, as well as the hashes of each direct child (if
any). This implies that the hash of any given node also depends on the hashes of all descendants
of the node. In turn, this implies that the hash of the root node depends on the hashes of all
nodes (and therefore all data) in the tree.

If we fetch the value `a=1` from the tree and want to verify that this is the correct value, we
need the following information:

```
                 d
               /   \
             c      hash=d6f56d
           /   \
         b     hash=ec6088
       /   \
a,hash(1)  hash=92fd030
```

Note that we take the hash of the value of `a=1` instead of simply using the value `1` itself;
both would work, but the value can be arbitrarily large while the hash has a constant size.

With this data, we are able to compute the hashes for all nodes up to and including the root,
and can compare this root hash with the root hash of the IAVL tree - if they match, we can be
reasonably certain that the provided value is the same as the value in the tree. This data is
therefore considered a _proof_ for the value. Notice how we don't need to include any data from
e.g. the `e`-branch of the tree at all, only the hash - as the tree grows in size, these savings
become very significant, requiring only `log₂(n)` hashes for a tree of `n` keys.

However, this still introduces quite a bit of overhead. Since we usually want to fetch several
values from the tree and verify them, it is often useful to generate a _range proof_, which can
prove any and all key/value pairs within a contiguous, ordered key range. For example, the
following proof can verify all of `a=1`, `b=2`, and `c=3`:

```
                 d
               /   \
             c      hash=d6f56d
           /   \
         b     c,hash(3)
       /   \
a,hash(1)  b,hash(2)
```

Range proofs can also prove the _absence_ of any keys within the range. For example, the above
proof can prove that the key `ab` is not in the tree, because if it were, it would have to be
ordered between `a` and `b` - it is clear from the proof that there is no such node, and if
there were, it would cause the parent hashes to be different from what we see.

Range proofs can be generated for non-existent endpoints by including the nearest neighboring
keys, which allows them to cover any arbitrary key range.
This can also be used to generate an
absence proof for a _single_ non-existent key, by returning a range proof between the two nearest
neighbors. The range proof is therefore a complete proof for all existing and all absent key/value
pairs ordered between two arbitrary endpoints.

Note that the IAVL terminology for range proofs may differ from that used in other systems, where
it refers to proofs that a value lies within some interval without revealing the exact value. IAVL
range proofs are used to prove which key/value pairs exist (or not) in some key range, and may be
known as range queries elsewhere.

## API Overview

The following is a general overview of the API - for details, see the
[API reference](https://pkg.go.dev/github.com/tendermint/iavl).

As an example, we will be using the same IAVL tree as described in the introduction:

```
            d
          /   \
        c       e
      /   \    /  \
    b     c=3 d=4 e=5
  /   \
a=1   b=2
```

This tree can be generated as follows:

```go
package main

import (
	"fmt"
	"log"

	"github.com/tendermint/iavl"
	db "github.com/tendermint/tm-db"
)

func main() {
	tree, err := iavl.NewMutableTree(db.NewMemDB(), 0)
	if err != nil {
		log.Fatal(err)
	}

	tree.Set([]byte("e"), []byte{5})
	tree.Set([]byte("d"), []byte{4})
	tree.Set([]byte("c"), []byte{3})
	tree.Set([]byte("b"), []byte{2})
	tree.Set([]byte("a"), []byte{1})

	rootHash, version, err := tree.SaveVersion()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Saved version %v with root hash %x\n", version, rootHash)

	// Output tree structure, including all node hashes (prefixed with 'n')
	fmt.Println(tree.String())
}
```

### Tree Root Hash

Proofs are verified against the root hash of an IAVL tree. This root hash is retrieved via
`MutableTree.Hash()` or `ImmutableTree.Hash()`, returning a `[]byte` hash.
It is also returned by
`MutableTree.SaveVersion()`, as shown above.

```go
fmt.Printf("%x\n", tree.Hash())
// Outputs: dd21329c026b0141e76096b5df395395ae3fc3293bd46706b97c034218fe2468
```

### Generating Proofs

The following methods are used to generate proofs, all of which are of type `RangeProof`:

* `ImmutableTree.GetWithProof(key []byte)`: fetches the key's value (if it exists) along with a
  proof of existence or proof of absence.

* `ImmutableTree.GetRangeWithProof(start, end []byte, limit int)`: fetches the keys, values, and
  proofs for the given key range, optionally with a limit (end key is excluded).

* `MutableTree.GetVersionedWithProof(key []byte, version int64)`: like `GetWithProof()`, but for a
  specific version of the tree.

* `MutableTree.GetVersionedRangeWithProof(start, end []byte, limit int, version int64)`: like
  `GetRangeWithProof()`, but for a specific version of the tree.

### Verifying Proofs

The following `RangeProof` methods are used to verify proofs:

* `Verify(rootHash []byte)`: verify that the proof root hash matches the given tree root hash.

* `VerifyItem(key, value []byte)`: verify that the given key exists with the given value, according
  to the proof.

* `VerifyAbsence(key []byte)`: verify that the given key is absent, according to the proof.

To verify that a `RangeProof` is valid for a given IAVL tree (i.e.
that the proof root hash matches
the tree root hash), run `RangeProof.Verify()` with the tree's root hash:

```go
// Generate a proof for a=1
value, proof, err := tree.GetWithProof([]byte("a"))
if err != nil {
	log.Fatal(err)
}

// Verify that the proof's root hash matches the tree's
err = proof.Verify(tree.Hash())
if err != nil {
	log.Fatalf("Invalid proof: %v", err)
}
```

The proof must always be verified against the root hash with `Verify()` before attempting other
operations. The proof can also be verified manually with `RangeProof.ComputeRootHash()`:

```go
// Requires the "bytes" package to be imported
if !bytes.Equal(proof.ComputeRootHash(), tree.Hash()) {
	log.Fatal("Proof hash mismatch")
}
```

To verify that a key has a given value according to the proof, use `VerifyItem()` on a proof
generated for this key (or key range):

```go
// The proof was generated for the item a=1, so this is successful
err = proof.VerifyItem([]byte("a"), []byte{1})
fmt.Printf("prove a=1: %v\n", err)
// err is nil

// If we instead claim that a=2, the proof will error
err = proof.VerifyItem([]byte("a"), []byte{2})
fmt.Printf("prove a=2: %v\n", err)
// outputs "leaf value hash not same: invalid proof"

// Also, verifying b=2 errors even though it is correct, since the proof is for a=1
err = proof.VerifyItem([]byte("b"), []byte{2})
fmt.Printf("prove b=2: %v\n", err)
// outputs "leaf key not found in proof: invalid proof"
```

If we generate a proof for a range of keys, we can use this both to prove the value of any of the
keys in the range as well as the absence of any keys that would have been within it:

```go
// Note that the end key is not inclusive, so c is not in the proof. 0 means
// no key limit (all keys).
keys, values, proof, err := tree.GetRangeWithProof([]byte("a"), []byte("c"), 0)
if err != nil {
	log.Fatal(err)
}

err = proof.Verify(tree.Hash())
if err != nil {
	log.Fatal(err)
}

// Prove that a=1 is in the range
err = proof.VerifyItem([]byte("a"), []byte{1})
fmt.Printf("prove a=1: %v\n", err)
// err is nil

// Prove that b=2 is also in the range
err = proof.VerifyItem([]byte("b"), []byte{2})
fmt.Printf("prove b=2: %v\n", err)
// err is nil

// Since "ab" is ordered after "a" but before "b", we can prove that it
// is not in the range and therefore not in the tree at all
err = proof.VerifyAbsence([]byte("ab"))
fmt.Printf("prove no ab: %v\n", err)
// err is nil

// If we try to prove ab, we get an error:
err = proof.VerifyItem([]byte("ab"), []byte{0})
fmt.Printf("prove ab=0: %v\n", err)
// outputs "leaf key not found in proof: invalid proof"
```

### Proof Structure

The overall proof structure was described in the introduction. Here, we will have a look at the
actual data structure. Knowledge of this is not necessary to use proofs. It may also be useful
to have a look at the [`Node` data structure](../node/node.md).

Recall our example tree:

```
            d
          /   \
        c       e
      /   \    /  \
    b     c=3 d=4 e=5
  /   \
a=1   b=2
```

A `RangeProof` contains the following data, as well as JSON tags for serialization:

```go
type RangeProof struct {
	LeftPath   PathToLeaf      `json:"left_path"`
	InnerNodes []PathToLeaf    `json:"inner_nodes"`
	Leaves     []ProofLeafNode `json:"leaves"`
}
```

* `LeftPath` contains the path to the leftmost node in the proof. For a proof of the range `a` to
  `e` (excluding `e=5`), it contains information about the inner nodes `d`, `c`, and `b` in that
  order.

* `InnerNodes` contains paths with any additional inner nodes not already in `LeftPath`, with `nil`
  paths for nodes already traversed. For a proof of the range `a` to `e` (excluding `e=5`), this
  contains the paths `nil`, `nil`, `[e]` where the `nil` paths refer to the paths to `b=2` and
  `c=3` already traversed in `LeftPath`, and `[e]` contains data about the `e` inner node needed
  to prove `d=4`.

* `Leaves` contains data about the leaf nodes in the range. For the range `a` to `e` (excluding
  `e=5`) this contains info about `a=1`, `b=2`, `c=3`, and `d=4` in left-to-right order.

Note that `Leaves` may contain additional leaf nodes outside the requested range, for example to
satisfy absence proofs if a given key does not exist. This may require additional inner nodes
to be included as well.

`PathToLeaf` is simply a slice of `ProofInnerNode`:

```go
type PathToLeaf []ProofInnerNode
```

Where `ProofInnerNode` contains the following data (a subset of the [node data](../node/node.md)):

```go
type ProofInnerNode struct {
	Height  int8   `json:"height"`
	Size    int64  `json:"size"`
	Version int64  `json:"version"`
	Left    []byte `json:"left"`
	Right   []byte `json:"right"`
}
```

Unlike in our diagrams, the keys of the inner nodes are not actually part of the proof. This is
because they are only used to guide binary searches and do not necessarily correspond to actual keys
in the data set, and are thus not included in any hashes.

Similarly, `ProofLeafNode` contains a subset of leaf node data:

```go
type ProofLeafNode struct {
	Key       cmn.HexBytes `json:"key"`
	ValueHash cmn.HexBytes `json:"value"`
	Version   int64        `json:"version"`
}
```

Notice how the proof contains a hash of the node's value rather than the value itself. This is
because values can be arbitrarily large while the hash has a constant size.
The Merkle hashes of
the tree are computed in the same way, by hashing the value before including it in the node
hash.

The information in these proofs is sufficient to reasonably prove that a given value exists (or
does not exist) in a given version of an IAVL dataset without fetching the entire dataset, requiring
only `log₂(n)` hashes for a dataset of `n` items. For more information, please see the
[API reference](https://pkg.go.dev/github.com/tendermint/iavl).