github.com/Team-Kujira/tendermint@v0.34.24-indexer/spec/abci/apps.md

1 ---
2 order: 2
3 title: Applications
4 ---
5
6 # Applications
7
Please ensure you've first read the spec for [ABCI Methods and Types](abci.md).
9
10 Here we cover the following components of ABCI applications:
11
- [Connection State](#connection-state) - the interplay between ABCI connections and application state
  and the differences between `CheckTx` and `DeliverTx`.
- [Transaction Results](#transaction-results) - rules around transaction
  results and validity
- [Validator Set Updates](#updating-the-validator-set) - how validator sets are
  changed during `InitChain` and `EndBlock`
- [Query](#query) - standards for using the `Query` method and proofs about the
  application state
- [Crash Recovery](#crash-recovery) - handshake protocol to synchronize
  Tendermint and the application on startup.
- [State Sync](#state-sync) - rapid bootstrapping of new nodes by restoring state machine snapshots
23
24 ## Connection State
25
26 Since Tendermint maintains four concurrent ABCI connections, it is typical
27 for an application to maintain a distinct state for each, and for the states to
28 be synchronized during `Commit`.
29
30 ### Concurrency
31
In principle, each of the four ABCI connections operates concurrently with the
others. This means applications need to ensure access to state is
34 thread safe. In practice, both the
35 [default in-process ABCI client](https://github.com/tendermint/tendermint/blob/v0.34.4/abci/client/local_client.go#L18)
36 and the
37 [default Go ABCI
38 server](https://github.com/tendermint/tendermint/blob/v0.34.4/abci/server/socket_server.go#L32)
39 use global locks across all connections, so they are not
40 concurrent at all. This means if your app is written in Go, and compiled in-process with Tendermint
41 using the default `NewLocalClient`, or run out-of-process using the default `SocketServer`,
42 ABCI messages from all connections will be linearizable (received one at a
43 time).
44
45 The existence of this global mutex means Go application developers can get
46 thread safety for application state by routing *all* reads and writes through the ABCI
47 system. Thus it may be *unsafe* to expose application state directly to an RPC
48 interface, and unless explicit measures are taken, all queries should be routed through the ABCI Query method.
49
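The effect of such a global lock can be sketched in a few lines. This is a minimal illustration in Python, not the Go client or server code linked above; `CounterApp`, `SerializedApp`, and the `call` helper are hypothetical names introduced only for this example.

```python
import threading


class CounterApp:
    """Toy application whose state update is not thread safe on its own."""

    def __init__(self):
        self.count = 0

    def deliver_tx(self, tx):
        current = self.count
        self.count = current + 1  # read-modify-write on shared state
        return 0  # code 0 = OK

    def query(self, path):
        return self.count


class SerializedApp:
    """Routes every call through one mutex shared by all connections."""

    def __init__(self, app):
        self._app = app
        self._mtx = threading.Lock()

    def call(self, method, *args):
        # Only one ABCI message is processed at a time, whichever
        # connection it arrived on.
        with self._mtx:
            return getattr(self._app, method)(*args)
```

Reads that bypass `call` (for example an RPC handler touching `CounterApp.count` directly) are not covered by the lock, which is why such direct access may be unsafe.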
50 ### BeginBlock
51
52 The BeginBlock request can be used to run some code at the beginning of
53 every block. It also allows Tendermint to send the current block hash
54 and header to the application, before it sends any of the transactions.
55
56 The app should remember the latest height and header (ie. from which it
57 has run a successful Commit) so that it can tell Tendermint where to
58 pick up from when it restarts. See information on the Handshake, below.
59
60 ### Commit
61
62 Application state should only be persisted to disk during `Commit`.
63
64 Before `Commit` is called, Tendermint locks and flushes the mempool so that no new messages will
65 be received on the mempool connection. This provides an opportunity to safely update all four connection
66 states to the latest committed state at once.
67
68 When `Commit` completes, it unlocks the mempool.
69
WARNING: if the ABCI app logic processing the `Commit` message sends a
`/broadcast_tx_sync` or `/broadcast_tx_commit` and waits for the response
before proceeding, it will deadlock. Executing those `broadcast_tx` calls
involves acquiring a lock that is held during the `Commit` call, so they
cannot complete while `Commit` is waiting on them. Calling the `broadcast_tx`
endpoints concurrently is fine; the call just can't be part of the sequential
logic of the `Commit` function.
77
78 ### Consensus Connection
79
80 The Consensus Connection should maintain a `DeliverTxState` - the working state
81 for block execution. It should be updated by the calls to `BeginBlock`, `DeliverTx`,
82 and `EndBlock` during block execution and committed to disk as the "latest
83 committed state" during `Commit`.
84
85 Updates made to the `DeliverTxState` by each method call must be readable by each subsequent method -
86 ie. the updates are linearizable.
87
88 ### Mempool Connection
89
The Mempool Connection should maintain a `CheckTxState`
91 to sequentially process pending transactions in the mempool that have
92 not yet been committed. It should be initialized to the latest committed state
93 at the end of every `Commit`.
94
95 Before calling `Commit`, Tendermint will lock and flush the mempool connection,
96 ensuring that all existing CheckTx are responded to and no new ones can begin.
97 The `CheckTxState` may be updated concurrently with the `DeliverTxState`, as
98 messages may be sent concurrently on the Consensus and Mempool connections.
99
100 After `Commit`, while still holding the mempool lock, CheckTx is run again on all transactions that remain in the
101 node's local mempool after filtering those included in the block.
102 An additional `Type` parameter is made available to the CheckTx function that
103 indicates whether an incoming transaction is new (`CheckTxType_New`), or a
104 recheck (`CheckTxType_Recheck`).
105
106 Finally, after re-checking transactions in the mempool, Tendermint will unlock
107 the mempool connection. New transactions are once again able to be processed through CheckTx.
108
109 Note that CheckTx is just a weak filter to keep invalid transactions out of the block chain.
110 CheckTx doesn't have to check everything that affects transaction validity; the
111 expensive things can be skipped. It's weak because a Byzantine node doesn't
112 care about CheckTx; it can propose a block full of invalid transactions if it wants.
113
114 #### Replay Protection
115
To prevent old transactions from being replayed, CheckTx must implement
replay protection. Transactions that were already processed can be sent to the
application again, so CheckTx needs logic to recognize and reject them.
121
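One common way to get replay protection is a per-sender sequence number (nonce). The sketch below assumes a hypothetical transaction layout of `(sender, nonce)` and an arbitrary non-zero reject code; the spec does not prescribe either, and other schemes would work as well.

```python
def check_tx(expected_nonce, sender, nonce):
    """Return an ABCI-style code: 0 accepts the tx, non-zero rejects it.

    `expected_nonce` maps each sender to the next nonce it may use. It
    stands in for the CheckTx working state (an assumption of this sketch).
    """
    if nonce != expected_nonce.get(sender, 0):
        return 1  # replayed (nonce too old) or out of order (nonce too new)
    # Advance the counter so the same tx cannot pass CheckTx twice.
    expected_nonce[sender] = nonce + 1
    return 0
```

Because the counter lives in the CheckTx working state, it is reset to the committed value at the end of every `Commit`, and DeliverTx has to apply the same rule to the state it commits.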
122 ### Query Connection
123
The Query Connection (called the Info Connection elsewhere in this document) should maintain a `QueryState` for answering queries from the user,
125 and for initialization when Tendermint first starts up (both described further
126 below).
127 It should always contain the latest committed state associated with the
128 latest committed block.
129
130 `QueryState` should be set to the latest `DeliverTxState` at the end of every `Commit`,
131 after the full block has been processed and the state committed to disk.
132 Otherwise it should never be modified.
133
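Putting the connection states together, the synchronization at `Commit` might look like the following sketch. It uses plain dictionaries in place of a real store, and the class and attribute names are illustrative only.

```python
import copy


class KVApp:
    """Toy key-value app keeping one state per ABCI connection."""

    def __init__(self):
        self.committed = {}         # stand-in for state persisted to disk
        self.deliver_tx_state = {}  # Consensus Connection working state
        self.check_tx_state = {}    # Mempool Connection working state
        self.query_state = {}       # read-only state for queries

    def deliver_tx(self, key, value):
        self.deliver_tx_state[key] = value
        return 0

    def check_tx(self, key, value):
        self.check_tx_state[key] = value
        return 0

    def query(self, key):
        return self.query_state.get(key)

    def commit(self):
        # Persist the block's working state, then reset the other two
        # states to the newly committed state.
        self.committed = copy.deepcopy(self.deliver_tx_state)
        self.check_tx_state = copy.deepcopy(self.committed)
        self.query_state = copy.deepcopy(self.committed)
```

Until `commit` runs, queries keep seeing the previous block's state, and anything written only to `check_tx_state` is discarded at the next `commit`.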
Tendermint Core currently uses the Query connection to filter peers upon
connecting, according to IP address or node ID. For instance,
returning a non-OK ABCI response to either of the following queries will
cause Tendermint to not connect to the corresponding peer:

- `p2p/filter/addr/<ip addr>`, where `<ip addr>` is an IP address.
- `p2p/filter/id/<id>`, where `<id>` is the hex-encoded node ID (the hash of
  the node's p2p pubkey).
142
143 Note: these query formats are subject to change!
144
145 ### Snapshot Connection
146
147 The Snapshot Connection is optional, and is only used to serve state sync snapshots for other nodes
148 and/or restore state sync snapshots to a local node being bootstrapped.
149
150 For more information, see [the state sync section of this document](#state-sync).
151
152 ## Transaction Results
153
154 The `Info` and `Log` fields are non-deterministic values for debugging/convenience purposes
155 that are otherwise ignored.
156
157 The `Data` field must be strictly deterministic, but can be arbitrary data.
158
159 ### Gas
160
161 Ethereum introduced the notion of `gas` as an abstract representation of the
162 cost of resources used by nodes when processing transactions. Every operation in the
163 Ethereum Virtual Machine uses some amount of gas, and gas can be accepted at a market-variable price.
164 Users propose a maximum amount of gas for their transaction; if the tx uses less, they get
165 the difference credited back. Tendermint adopts a similar abstraction,
166 though uses it only optionally and weakly, allowing applications to define
167 their own sense of the cost of execution.
168
169 In Tendermint, the [ConsensusParams.Block.MaxGas](../proto/types/params.proto) limits the amount of `gas` that can be used in a block.
170 The default value is `-1`, meaning no limit, or that the concept of gas is
171 meaningless.
172
173 Responses contain a `GasWanted` and `GasUsed` field. The former is the maximum
174 amount of gas the sender of a tx is willing to use, and the latter is how much it actually
175 used. Applications should enforce that `GasUsed <= GasWanted` - ie. tx execution
176 should halt before it can use more resources than it requested.
177
178 When `MaxGas > -1`, Tendermint enforces the following rules:
179
180 - `GasWanted <= MaxGas` for all txs in the mempool
181 - `(sum of GasWanted in a block) <= MaxGas` when proposing a block
182
183 If `MaxGas == -1`, no rules about gas are enforced.
184
185 Note that Tendermint does not currently enforce anything about Gas in the consensus, only the mempool.
186 This means it does not guarantee that committed blocks satisfy these rules!
187 It is the application's responsibility to return non-zero response codes when gas limits are exceeded.
188
189 The `GasUsed` field is ignored completely by Tendermint. That said, applications should enforce:
190
191 - `GasUsed <= GasWanted` for any given transaction
192 - `(sum of GasUsed in a block) <= MaxGas` for every block
193
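The application-side rules can be sketched as follows. The input layout `(gas_wanted, gas_needed)` and the codes `1` and `2` are assumptions made for this example; the spec only requires a non-zero code when a limit is exceeded.

```python
def execute_block(txs, max_gas):
    """Return one (code, gas_used) pair per tx.

    `txs` is a list of (gas_wanted, gas_needed) pairs, where gas_needed is
    what execution would consume if it were never halted.
    """
    results = []
    block_gas = 0
    for gas_wanted, gas_needed in txs:
        # Halt execution at GasWanted, so GasUsed <= GasWanted always holds.
        gas_used = min(gas_needed, gas_wanted)
        code = 0 if gas_needed <= gas_wanted else 1  # 1 = out of gas
        # With a block limit set, refuse txs that would push the sum of
        # GasUsed past MaxGas.
        if max_gas > -1 and block_gas + gas_used > max_gas:
            results.append((2, 0))  # 2 = block gas limit exceeded
            continue
        block_gas += gas_used
        results.append((code, gas_used))
    return results
```

For example, `execute_block([(10, 5), (10, 20), (10, 10)], 20)` returns `[(0, 5), (1, 10), (2, 0)]`: the second tx is halted at its own limit, and the third would exceed the block limit.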
194 In the future, we intend to add a `Priority` field to the responses that can be
195 used to explicitly prioritize txs in the mempool for inclusion in a block
196 proposal. See [#1861](https://github.com/tendermint/tendermint/issues/1861).
197
198 ### CheckTx
199
If `Code != 0`, the transaction will be rejected from the mempool and hence
not broadcast to other peers and not included in a proposal block.
202
203 `Data` contains the result of the CheckTx transaction execution, if any. It is
204 semantically meaningless to Tendermint.
205
206 `Events` include any events for the execution, though since the transaction has not
207 been committed yet, they are effectively ignored by Tendermint.
208
209 ### DeliverTx
210
211 DeliverTx is the workhorse of the blockchain. Tendermint sends the
212 DeliverTx requests asynchronously but in order, and relies on the
213 underlying socket protocol (ie. TCP) to ensure they are received by the
214 app in order. They have already been ordered in the global consensus by
215 the Tendermint protocol.
216
217 If DeliverTx returns `Code != 0`, the transaction will be considered invalid,
218 though it is still included in the block.
219
220 DeliverTx also returns a [Code, Data, and Log](../../proto/abci/types.proto#L189-L191).
221
`Data` contains the result of the DeliverTx transaction execution, if any. It is
semantically meaningless to Tendermint.
224
225 Both the `Code` and `Data` are included in a structure that is hashed into the
226 `LastResultsHash` of the next block header.
227
`Events` include any events for the execution, which Tendermint uses to index
the transaction. This allows transactions to be queried according to what
events took place during their execution.
231
232 ## Updating the Validator Set
233
234 The application may set the validator set during InitChain, and may update it during
235 EndBlock.
236
237 Note that the maximum total power of the validator set is bounded by
238 `MaxTotalVotingPower = MaxInt64 / 8`. Applications are responsible for ensuring
239 they do not make changes to the validator set that cause it to exceed this
240 limit.
241
242 Additionally, applications must ensure that a single set of updates does not contain any duplicates -
243 a given public key can only appear once within a given update. If an update includes
244 duplicates, the block execution will fail irrecoverably.
245
246 ### InitChain
247
248 The `InitChain` method can return a list of validators.
249 If the list is empty, Tendermint will use the validators loaded in the genesis
250 file.
251 If the list returned by `InitChain` is not empty, Tendermint will use its contents as the validator set.
252 This way the application can set the initial validator set for the
253 blockchain.
254
255 ### EndBlock
256
257 Updates to the Tendermint validator set can be made by returning
258 `ValidatorUpdate` objects in the `ResponseEndBlock`:
259
```protobuf
message ValidatorUpdate {
  tendermint.crypto.keys.PublicKey pub_key = 1;
  int64 power = 2;
}

message PublicKey {
  oneof sum {
    bytes ed25519 = 1;
  }
}
```
271
272 The `pub_key` currently supports only one type:
273
274 - `type = "ed25519"`
275
276 The `power` is the new voting power for the validator, with the
277 following rules:
278
- power must be non-negative
- if power is 0, the validator must already exist, and will be removed from the
  validator set
- if power is non-0:
  - if the validator does not already exist, it will be added to the validator
    set with the given power
  - if the validator does already exist, its power will be adjusted to the given power
- the total power of the new validator set must not exceed MaxTotalVotingPower
287
288 Note the updates returned in block `H` will only take effect at block `H+2`.
289
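The update rules above, together with the no-duplicates rule, can be sketched as a pure function. The `{pub_key: power}` dictionary and the use of `ValueError` are assumptions of this example, not Tendermint's internal representation.

```python
MAX_TOTAL_VOTING_POWER = (2**63 - 1) // 8  # MaxInt64 / 8


def apply_validator_updates(validators, updates):
    """Return a new {pub_key: power} map after applying (pub_key, power) updates.

    Raises ValueError if any of the rules is violated.
    """
    seen = set()
    new_set = dict(validators)
    for pub_key, power in updates:
        if pub_key in seen:
            raise ValueError("duplicate public key in update")
        seen.add(pub_key)
        if power < 0:
            raise ValueError("power must be non-negative")
        if power == 0:
            if pub_key not in new_set:
                raise ValueError("cannot remove a validator that does not exist")
            del new_set[pub_key]
        else:
            new_set[pub_key] = power  # add a new validator or adjust its power
    if sum(new_set.values()) > MAX_TOTAL_VOTING_POWER:
        raise ValueError("total voting power exceeds MaxTotalVotingPower")
    return new_set
```

Starting from `{"A": 10, "B": 5}`, the updates `[("B", 0), ("C", 7), ("A", 12)]` produce `{"A": 12, "C": 7}`.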
290 ## Consensus Parameters
291
292 ConsensusParams enforce certain limits in the blockchain, like the maximum size
293 of blocks, amount of gas used in a block, and the maximum acceptable age of
294 evidence. They can be set in InitChain and updated in EndBlock.
295
296 ### BlockParams.MaxBytes
297
298 The maximum size of a complete Protobuf encoded block.
299 This is enforced by Tendermint consensus.
300
301 This implies a maximum transaction size that is this MaxBytes, less the expected size of
302 the header, the validator set, and any included evidence in the block.
303
304 Must have `0 < MaxBytes < 100 MB`.
305
306 ### BlockParams.MaxGas
307
308 The maximum of the sum of `GasWanted` that will be allowed in a proposed block.
309 This is *not* enforced by Tendermint consensus.
310 It is left to the app to enforce (ie. if txs are included past the
311 limit, they should return non-zero codes). It is used by Tendermint to limit the
312 txs included in a proposed block.
313
314 Must have `MaxGas >= -1`.
315 If `MaxGas == -1`, no limit is enforced.
316
317 ### EvidenceParams.MaxAgeDuration
318
319 This is the maximum age of evidence in time units.
320 This is enforced by Tendermint consensus.
321
322 If a block includes evidence older than this (AND the evidence was created more
323 than `MaxAgeNumBlocks` ago), the block will be rejected (validators won't vote
324 for it).
325
326 Must have `MaxAgeDuration > 0`.
327
328 ### EvidenceParams.MaxAgeNumBlocks
329
330 This is the maximum age of evidence in blocks.
331 This is enforced by Tendermint consensus.
332
333 If a block includes evidence older than this (AND the evidence was created more
334 than `MaxAgeDuration` ago), the block will be rejected (validators won't vote
335 for it).
336
337 Must have `MaxAgeNumBlocks > 0`.
338
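The two evidence age limits combine with a logical AND: evidence is treated as too old only when it exceeds both. A minimal sketch, assuming ages and `MaxAgeDuration` are given in seconds (the function and parameter names are illustrative):

```python
def evidence_expired(age_seconds, age_blocks, max_age_duration, max_age_num_blocks):
    """Return True only when the evidence exceeds BOTH age limits."""
    too_old_in_time = age_seconds > max_age_duration
    too_old_in_blocks = age_blocks > max_age_num_blocks
    return too_old_in_time and too_old_in_blocks
```

Evidence that is old in time but recent in blocks (or the reverse) is therefore still acceptable.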
339 ### EvidenceParams.MaxNum
340
341 This is the maximum number of evidence that can be committed to a single block.
342
The product of this and the `MaxEvidenceBytes` must not exceed the size of
a block minus its overhead ( ~ `MaxBytes`).
345
346 Must have `MaxNum > 0`.
347
348 ### Updates
349
350 The application may set the ConsensusParams during InitChain, and update them during
351 EndBlock. If the ConsensusParams is empty, it will be ignored. Each field
352 that is not empty will be applied in full. For instance, if updating the
353 Block.MaxBytes, applications must also set the other Block fields (like
354 Block.MaxGas), even if they are unchanged, as they will otherwise cause the
355 value to be updated to 0.
356
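The "applied in full" behaviour can be illustrated with nested dictionaries. The field lists below are simplified and the function name is hypothetical; the point is only that a non-empty section replaces the old one wholesale, so fields left unset fall back to the zero value.

```python
# Simplified field lists; real ConsensusParams sections carry more detail.
SECTION_FIELDS = {
    "block": ("max_bytes", "max_gas"),
    "evidence": ("max_age_num_blocks", "max_age_duration", "max_num"),
}


def update_consensus_params(current, update):
    """Return new params after applying `update` section by section."""
    new = {section: dict(values) for section, values in current.items()}
    for section, values in (update or {}).items():
        if not values:
            continue  # an empty section is ignored
        # A non-empty section is applied in full: unset fields become 0.
        new[section] = {f: values.get(f, 0) for f in SECTION_FIELDS[section]}
    return new
```

Updating only `max_bytes` in the `block` section of `{"block": {"max_bytes": 1000, "max_gas": -1}}` yields `max_gas == 0`, which is why the unchanged fields must be resent.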
357 #### InitChain
358
359 ResponseInitChain includes a ConsensusParams.
360 If ConsensusParams is nil, Tendermint will use the params loaded in the genesis
361 file. If ConsensusParams is not nil, Tendermint will use it.
362 This way the application can determine the initial consensus params for the
363 blockchain.
364
365 #### EndBlock
366
367 ResponseEndBlock includes a ConsensusParams.
If ConsensusParams is nil, Tendermint will do nothing.
If ConsensusParams is not nil, Tendermint will use it.
370 This way the application can update the consensus params over time.
371
372 Note the updates returned in block `H` will take effect right away for block
373 `H+1`.
374
375 ## Query
376
377 Query is a generic method with lots of flexibility to enable diverse sets
378 of queries on application state. Tendermint makes use of Query to filter new peers
379 based on ID and IP, and exposes Query to the user over RPC.
380
381 Note that calls to Query are not replicated across nodes, but rather query the
382 local node's state - hence they may return stale reads. For reads that require
383 consensus, use a transaction.
384
385 The most important use of Query is to return Merkle proofs of the application state at some height
386 that can be used for efficient application-specific light-clients.
387
Note that Tendermint technically has no requirements from the Query
message for normal operation - that is, the ABCI app developer need not implement
Query functionality if they do not wish to.
391
392 ### Query Proofs
393
394 The Tendermint block header includes a number of hashes, each providing an
395 anchor for some type of proof about the blockchain. The `ValidatorsHash` enables
396 quick verification of the validator set, the `DataHash` gives quick
397 verification of the transactions included in the block, etc.
398
399 The `AppHash` is unique in that it is application specific, and allows for
400 application-specific Merkle proofs about the state of the application.
401 While some applications keep all relevant state in the transactions themselves
402 (like Bitcoin and its UTXOs), others maintain a separated state that is
403 computed deterministically *from* transactions, but is not contained directly in
404 the transactions themselves (like Ethereum contracts and accounts).
405 For such applications, the `AppHash` provides a much more efficient way to verify light-client proofs.
406
407 ABCI applications can take advantage of more efficient light-client proofs for
408 their state as follows:
409
410 - return the Merkle root of the deterministic application state in
411 `ResponseCommit.Data`. This Merkle root will be included as the `AppHash` in the next block.
412 - return efficient Merkle proofs about that application state in `ResponseQuery.Proof`
413 that can be verified using the `AppHash` of the corresponding block.
414
415 For instance, this allows an application's light-client to verify proofs of
416 absence in the application state, something which is much less efficient to do using the block hash.
417
418 Some applications (eg. Ethereum, Cosmos-SDK) have multiple "levels" of Merkle trees,
419 where the leaves of one tree are the root hashes of others. To support this, and
420 the general variability in Merkle proofs, the `ResponseQuery.Proof` has some minimal structure:
421
```protobuf
message ProofOps {
  repeated ProofOp ops = 1;
}

message ProofOp {
  string type = 1;
  bytes key = 2;
  bytes data = 3;
}
```
433
434 Each `ProofOp` contains a proof for a single key in a single Merkle tree, of the specified `type`.
435 This allows ABCI to support many different kinds of Merkle trees, encoding
436 formats, and proofs (eg. of presence and absence) just by varying the `type`.
437 The `data` contains the actual encoded proof, encoded according to the `type`.
438 When verifying the full proof, the root hash for one ProofOp is the value being
439 verified for the next ProofOp in the list. The root hash of the final ProofOp in
440 the list should match the `AppHash` being verified against.
441
442 ### Peer Filtering
443
444 When Tendermint connects to a peer, it sends two queries to the ABCI application
445 using the following paths, with no additional data:
446
- `/p2p/filter/addr/<IP:PORT>`, where `<IP:PORT>` denotes the IP address and
  the port of the connection
- `/p2p/filter/id/<ID>`, where `<ID>` is the peer node ID (ie. the
  pubkey.Address() for the peer's PubKey)

If either of these queries returns a non-zero ABCI code, Tendermint will refuse
to connect to the peer.
454
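An application-side handler for these two paths might look like the sketch below. The ban lists, the function name, and the reject code `1` are assumptions; the address parsing only handles `IP:PORT` with an IPv4-style address.

```python
def peer_filter_code(path, banned_ips, banned_ids):
    """Return an ABCI-style code for a Query path: 0 accepts, 1 refuses the peer."""
    parts = path.strip("/").split("/")
    if len(parts) != 4 or parts[:2] != ["p2p", "filter"]:
        return 0  # not a peer filter query; a real app would handle it elsewhere
    kind, value = parts[2], parts[3]
    if kind == "addr":
        ip = value.rsplit(":", 1)[0]  # drop the port if present
        return 1 if ip in banned_ips else 0
    if kind == "id":
        return 1 if value in banned_ids else 0
    return 0
```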
455 ### Paths
456
457 Queries are directed at paths, and may optionally include additional data.
458
459 The expectation is for there to be some number of high level paths
460 differentiating concerns, like `/p2p`, `/store`, and `/app`. Currently,
461 Tendermint only uses `/p2p`, for filtering peers. For more advanced use, see the
462 implementation of
463 [Query in the Cosmos-SDK](https://github.com/cosmos/cosmos-sdk/blob/v0.23.1/baseapp/baseapp.go#L333).
464
465 ## Crash Recovery
466
On startup, Tendermint calls the `Info` method on the Info Connection to get the latest
committed state of the app. The app MUST return information consistent with the
last block for which it successfully completed Commit.

If the app successfully committed block H, then `last_block_height = H` and
`last_block_app_hash = <hash returned by Commit for block H>`. If the app
failed during the Commit of block H, then `last_block_height = H-1` and
`last_block_app_hash = <hash returned by Commit for block H-1, which is the hash in the header of block H>`.
474
475 We now distinguish three heights, and describe how Tendermint syncs itself with
476 the app.
477
```md
storeBlockHeight = height of the last block Tendermint saw a commit for
stateBlockHeight = height of the last block for which Tendermint completed all
    block processing and saved all ABCI results to disk
appBlockHeight = height of the last block for which ABCI app successfully
    completed Commit
```

Note we always have `storeBlockHeight >= stateBlockHeight` and `storeBlockHeight >= appBlockHeight`.

Note also that Tendermint never calls Commit on an ABCI app twice for the same height.
489
490 The procedure is as follows.
491
492 First, some simple start conditions:
493
494 If `appBlockHeight == 0`, then call InitChain.
495
496 If `storeBlockHeight == 0`, we're done.
497
498 Now, some sanity checks:
499
- If `storeBlockHeight < appBlockHeight`, error
- If `storeBlockHeight < stateBlockHeight`, panic
- If `storeBlockHeight > stateBlockHeight+1`, panic

Now, the meat:

If `storeBlockHeight == stateBlockHeight && appBlockHeight < storeBlockHeight`,
replay all blocks in full from `appBlockHeight` to `storeBlockHeight`.
This happens if we completed processing the block, but the app forgot its height.

If `storeBlockHeight == stateBlockHeight && appBlockHeight == storeBlockHeight`, we're done.
This happens if we crashed at an opportune spot.

If `storeBlockHeight == stateBlockHeight+1`, we started processing the block but didn't finish.
There are three sub-cases:

- If `appBlockHeight < stateBlockHeight`,
  replay all blocks in full from `appBlockHeight` to `storeBlockHeight-1`,
  and replay the block at `storeBlockHeight` using the WAL.
  This happens if the app forgot the last block it committed.
- If `appBlockHeight == stateBlockHeight`,
  replay the last block (`storeBlockHeight`) in full.
  This happens if we crashed before the app finished Commit.
- If `appBlockHeight == storeBlockHeight`,
  update the state using the saved ABCI responses but don't run the block against the real app.
  This happens if we crashed after the app finished Commit but before Tendermint saved the state.
528
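The whole procedure can be summarized as a decision function over the three heights. This is a sketch of the logic as described in this section, not Tendermint's handshake code; the action names are invented, and "replay from `appBlockHeight`" is interpreted here as starting with the first block after the app's last committed height.

```python
def handshake_plan(store_h, state_h, app_h):
    """Return the list of actions needed to sync Tendermint with the app."""
    actions = []
    if app_h == 0:
        actions.append(("init_chain",))
    if store_h == 0:
        return actions

    # Sanity checks.
    if store_h < app_h:
        raise ValueError("app is ahead of the block store")
    if store_h < state_h or store_h > state_h + 1:
        raise RuntimeError("block store and state heights are inconsistent")

    if store_h == state_h:
        if app_h < store_h:
            # Block processing completed, but the app is behind.
            actions.append(("replay_full", app_h + 1, store_h))
        return actions  # otherwise nothing to do

    # store_h == state_h + 1: the last block was only partly processed.
    if app_h < state_h:
        actions.append(("replay_full", app_h + 1, store_h - 1))
        actions.append(("replay_with_wal", store_h))
    elif app_h == state_h:
        actions.append(("replay_full", store_h, store_h))
    else:  # app_h == store_h
        actions.append(("apply_saved_abci_responses", store_h))
    return actions
```

For example, `handshake_plan(5, 4, 4)` returns `[("replay_full", 5, 5)]`, the case where the crash happened before the app finished Commit.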
529 ## State Sync
530
531 A new node joining the network can simply join consensus at the genesis height and replay all
532 historical blocks until it is caught up. However, for large chains this can take a significant
533 amount of time, often on the order of days or weeks.
534
535 State sync is an alternative mechanism for bootstrapping a new node, where it fetches a snapshot
536 of the state machine at a given height and restores it. Depending on the application, this can
537 be several orders of magnitude faster than replaying blocks.
538
539 Note that state sync does not currently backfill historical blocks, so the node will have a
540 truncated block history - users are advised to consider the broader network implications of this in
541 terms of block availability and auditability. This functionality may be added in the future.
542
543 For details on the specific ABCI calls and types, see the [methods and types section](abci.md).
544
545 ### Taking Snapshots
546
547 Applications that want to support state syncing must take state snapshots at regular intervals. How
548 this is accomplished is entirely up to the application. A snapshot consists of some metadata and
549 a set of binary chunks in an arbitrary format:
550
551 - `Height (uint64)`: The height at which the snapshot is taken. It must be taken after the given
552 height has been committed, and must not contain data from any later heights.
553
554 - `Format (uint32)`: An arbitrary snapshot format identifier. This can be used to version snapshot
555 formats, e.g. to switch from Protobuf to MessagePack for serialization. The application can use
556 this when restoring to choose whether to accept or reject a snapshot.
557
558 - `Chunks (uint32)`: The number of chunks in the snapshot. Each chunk contains arbitrary binary
559 data, and should be less than 16 MB; 10 MB is a good starting point.
560
561 - `Hash ([]byte)`: An arbitrary hash of the snapshot. This is used to check whether a snapshot is
562 the same across nodes when downloading chunks.
563
564 - `Metadata ([]byte)`: Arbitrary snapshot metadata, e.g. chunk hashes for verification or any other
565 necessary info.
566
567 For a snapshot to be considered the same across nodes, all of these fields must be identical. When
568 sent across the network, snapshot metadata messages are limited to 4 MB.
569
570 When a new node is running state sync and discovering snapshots, Tendermint will query an existing
571 application via the ABCI `ListSnapshots` method to discover available snapshots, and load binary
572 snapshot chunks via `LoadSnapshotChunk`. The application is free to choose how to implement this
573 and which formats to use, but must provide the following guarantees:
574
575 - **Consistent:** A snapshot must be taken at a single isolated height, unaffected by
576 concurrent writes. This can be accomplished by using a data store that supports ACID
577 transactions with snapshot isolation.
578
579 - **Asynchronous:** Taking a snapshot can be time-consuming, so it must not halt chain progress,
580 for example by running in a separate thread.
581
582 - **Deterministic:** A snapshot taken at the same height in the same format must be identical
583 (at the byte level) across nodes, including all metadata. This ensures good availability of
584 chunks, and that they fit together across nodes.
585
586 A very basic approach might be to use a datastore with MVCC transactions (such as RocksDB),
587 start a transaction immediately after block commit, and spawn a new thread which is passed the
588 transaction handle. This thread can then export all data items, serialize them using e.g.
589 Protobuf, hash the byte stream, split it into chunks, and store the chunks in the file system
590 along with some metadata - all while the blockchain is applying new blocks in parallel.
591
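The export, hash, and chunk steps of the basic approach can be sketched as below. Several choices here are assumptions made to keep the example self-contained: an in-memory dictionary replaces the MVCC datastore, canonical JSON replaces Protobuf, SHA-256 is used for the hashes, and the background thread is omitted.

```python
import hashlib
import json


def take_snapshot(state, height, chunk_size, fmt=1):
    """Build snapshot metadata and chunks from an in-memory key-value state."""
    # Sort keys so every node serializes the same state to the same bytes.
    blob = json.dumps(state, sort_keys=True, separators=(",", ":")).encode()
    chunks = [blob[i:i + chunk_size] for i in range(0, len(blob), chunk_size)]
    snapshot = {
        "height": height,
        "format": fmt,
        "chunks": len(chunks),
        "hash": hashlib.sha256(blob).digest(),
        # Per-chunk hashes, usable for verification while restoring.
        "metadata": b"".join(hashlib.sha256(c).digest() for c in chunks),
    }
    return snapshot, chunks
```

Because serialization is canonical, two nodes holding the same state at the same height produce byte-identical metadata and chunks, which is the determinism guarantee listed above.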
592 A more advanced approach might include incremental verification of individual chunks against the
593 chain app hash, parallel or batched exports, compression, and so on.
594
595 Old snapshots should be removed after some time - generally only the last two snapshots are needed
596 (to prevent the last one from being removed while a node is restoring it).
597
598 ### Bootstrapping a Node
599
An empty node can be state synced by setting the configuration option
`statesync.enabled = true`. The node also needs the chain genesis file for basic chain info, and
configuration for light client verification of the restored snapshot: a set of Tendermint RPC
servers, and a trusted header hash and corresponding height from a trusted source, via the
`statesync` configuration section.
605
606 Once started, the node will connect to the P2P network and begin discovering snapshots. These
607 will be offered to the local application via the `OfferSnapshot` ABCI method. Once a snapshot
608 is accepted Tendermint will fetch and apply the snapshot chunks. After all chunks have been
609 successfully applied, Tendermint verifies the app's `AppHash` against the chain using the light
610 client, then switches the node to normal consensus operation.
611
612 #### Snapshot Discovery
613
When the empty node joins the P2P network, it asks all peers to report snapshots via the
615 `ListSnapshots` ABCI call (limited to 10 per node). After some time, the node picks the most
616 suitable snapshot (generally prioritized by height, format, and number of peers), and offers it
617 to the application via `OfferSnapshot`. The application can choose a number of responses,
618 including accepting or rejecting it, rejecting the offered format, rejecting the peer who sent
619 it, and so on. Tendermint will keep discovering and offering snapshots until one is accepted or
620 the application aborts.
621
622 #### Snapshot Restoration
623
624 Once a snapshot has been accepted via `OfferSnapshot`, Tendermint begins downloading chunks from
625 any peers that have the same snapshot (i.e. that have identical metadata fields). Chunks are
626 spooled in a temporary directory, and then given to the application in sequential order via
627 `ApplySnapshotChunk` until all chunks have been accepted.
628
629 The method for restoring snapshot chunks is entirely up to the application.
630
631 During restoration, the application can respond to `ApplySnapshotChunk` with instructions for how
632 to continue. This will typically be to accept the chunk and await the next one, but it can also
633 ask for chunks to be refetched (either the current one or any number of previous ones), P2P peers
634 to be banned, snapshots to be rejected or retried, and a number of other responses - see the ABCI
635 reference for details.
636
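A restoring application might handle chunks along these lines. The sketch assumes the snapshot metadata carried one SHA-256 hash per chunk, and it uses the strings `"ACCEPT"` and `"RETRY"` as stand-ins for the corresponding ABCI result values; the `Restorer` class is hypothetical.

```python
import hashlib


class Restorer:
    """Collects snapshot chunks in order, checking each against its expected hash."""

    def __init__(self, chunk_hashes):
        self.chunk_hashes = chunk_hashes  # expected SHA-256 digest per chunk
        self.received = []

    def apply_snapshot_chunk(self, index, chunk):
        if index != len(self.received):
            return "RETRY"  # out of order: ask for the chunk again later
        if hashlib.sha256(chunk).digest() != self.chunk_hashes[index]:
            return "RETRY"  # corrupted or wrong chunk: ask for a refetch
        self.received.append(chunk)
        return "ACCEPT"

    def done(self):
        return len(self.received) == len(self.chunk_hashes)
```

Chunk hashes taken from snapshot metadata are not trusted information by themselves, so a check like this only catches failures early and does not replace verification against the app hash.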
637 If Tendermint fails to fetch a chunk after some time, it will reject the snapshot and try a
638 different one via `OfferSnapshot` - the application can choose whether it wants to support
639 restarting restoration, or simply abort with an error.
640
641 #### Snapshot Verification
642
643 Once all chunks have been accepted, Tendermint issues an `Info` ABCI call to retrieve the
644 `LastBlockAppHash`. This is compared with the trusted app hash from the chain, retrieved and
645 verified using the light client. Tendermint also checks that `LastBlockHeight` corresponds to the
646 height of the snapshot.
647
648 This verification ensures that an application is valid before joining the network. However, the
649 snapshot restoration may take a long time to complete, so applications may want to employ additional
650 verification during the restore to detect failures early. This might e.g. include incremental
651 verification of each chunk against the app hash (using bundled Merkle proofs), checksums to
652 protect against data corruption by the disk or network, and so on. However, it is important to
653 note that the only trusted information available is the app hash, and all other snapshot metadata
654 can be spoofed by adversaries.
655
656 Apps may also want to consider state sync denial-of-service vectors, where adversaries provide
657 invalid or harmful snapshots to prevent nodes from joining the network. The application can
658 counteract this by asking Tendermint to ban peers. As a last resort, node operators can use
659 P2P configuration options to whitelist a set of trusted peers that can provide valid snapshots.
660
661 #### Transition to Consensus
662
Once all snapshot chunks have been restored, Tendermint gathers additional information necessary for
664 bootstrapping the node (e.g. chain ID, consensus parameters, validator sets, and block headers)
665 from the genesis file and light client RPC servers. It also fetches and records the `AppVersion`
666 from the ABCI application.
667
668 Once the state machine has been restored and Tendermint has gathered this additional
information, it transitions to block sync (if enabled) to fetch any remaining blocks up to the chain
670 head, and then transitions to regular consensus operation. At this point the node operates like
671 any other node, apart from having a truncated block history at the height of the restored snapshot.