github.com/muhammedhassanm/blockchain@v0.0.0-20200120143007-697261defd4d/sawtooth-core-master/docs/source/architecture/journal.rst

1 *******
2 Journal
3 *******
4
5 The Journal is responsible for maintaining and extending the blockchain for the
6 validator. This responsibility involves validating candidate blocks, evaluating
7 valid blocks to determine if they are the correct chain head, and generating
8 new blocks to extend the chain.
9
10 The Journal is the consumer of Blocks and Batches that arrive at the validator.
11 These Blocks and Batches arrive via interconnect, either through the gossip
12 protocol or the REST API. The newly-arrived Blocks and Batches are sent to the
13 Journal, which routes them internally.
14
15 .. image:: ../images/journal_organization.*
16 :width: 80%
17 :align: center
18 :alt: Journal Organization Diagram
19
20 The Journal divides the processing of Blocks and Batches among different
21 pipelines. Both objects are delivered initially to the Completer, which
22 guarantees that all dependencies for the Blocks and Batches have been satisfied
23 and delivered downstream. Completed Blocks are delivered to the
24 ChainController for validation and fork resolution. Completed Batches are
25 delivered to the BlockPublisher for validation and inclusion in a Block.
26
27 The Journal is designed to be asynchronous, allowing incoming blocks to be
28 processed in parallel by the ChainController, as well as allowing the
29 BlockPublisher to proceed with claiming blocks even when the incoming block
30 rate is high.
31
32 It is also flexible enough to accept different consensus algorithms. The
33 Journal implements a consensus interface that defines the entry points and
34 responsibilities of a consensus algorithm.
35
36 The BlockStore
37 ==============
38
39 The BlockStore contains all the blocks in the current blockchain - that is, the
40 list of blocks from the current chain head back to the Genesis block. Blocks
41 from forks are not included in the BlockStore. The BlockStore also includes a
42 reference to the head of the current chain. It is expected to be coherent at
43 all times, and an error in the BlockStore is considered a non-recoverable error
44 for the validator. Such critical errors would include missing blocks, bad
45 indexes, missing chain reference, incomplete blocks or invalid blocks in the
46 store. The BlockStore provides an atomic means to update the store when the
47 current fork is changed (the chain head is updated).
48
49 The BlockStore is a persistent on-disk store of all Blocks in the current
50 chain. When the validator is started, the contents of the BlockStore are trusted
51 to be the current state of the world.
52
53 All blocks stored here are formally complete. The BlockStore allows blocks to
54 be accessed via Block ID. Blocks can also be accessed via Batch ID,
55 Transaction ID, or block number; for example, ``get_block_by_batch_id``,
56 ``get_block_by_transaction_id``, ``get_batch_by_transaction``, or
57 ``get_block_by_number``.
58
59 The BlockStore maintains internal mappings of Transaction-to-Block and
60 Batch-to-Block. These may be rebuilt if missing or corrupt. This rebuild should
61 be done during startup, and not during the course of normal operation. These
62 mappings should be stored in a format that is cached to disk, so they are not
63 required to be held in memory at all times. As the blockchain grows, these will
64 become quite large.
65
66 The BlockStore provides an atomic method for updating the current head of the
67 chain. In order for the BlockStore to switch forks, it is provided with a list
68 of blocks in the new chain to commit, and a list of blocks in the old chain to
69 decommit. These lists are the blocks in each fork back to the common root.
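
The fork switch described above can be sketched as follows. This is a minimal
in-memory illustration under stated assumptions, not the validator's persistent
implementation: the ``SketchBlockStore`` class, the ``Block`` tuple, and the
``update_chain`` method are hypothetical names chosen for this example.

.. code-block:: python

   from collections import namedtuple

   # Hypothetical, simplified block; real blocks carry a header and batches.
   Block = namedtuple("Block", ["block_id", "previous_id", "batch_ids"])


   class SketchBlockStore:
       """In-memory stand-in for the on-disk BlockStore (illustration only)."""

       def __init__(self):
           self._blocks = {}          # block_id -> Block
           self._batch_to_block = {}  # batch_id -> block_id
           self.chain_head = None

       def update_chain(self, commit, decommit=()):
           """Remove the decommit blocks, add the commit blocks, move the head.

           `commit` is ordered from the common root forward, so its last
           entry becomes the new chain head. All work happens on copies that
           are swapped in at the end, so an error leaves the store unchanged.
           """
           new_head = commit[-1]
           blocks = dict(self._blocks)
           index = dict(self._batch_to_block)
           for block in decommit:
               del blocks[block.block_id]
               for batch_id in block.batch_ids:
                   del index[batch_id]
           for block in commit:
               blocks[block.block_id] = block
               for batch_id in block.batch_ids:
                   index[batch_id] = block.block_id
           self._blocks, self._batch_to_block = blocks, index
           self.chain_head = new_head

       def get_block_by_batch_id(self, batch_id):
           return self._blocks[self._batch_to_block[batch_id]]

Building the new mappings on copies and swapping them in last is one simple way
to get the all-or-nothing behavior; an on-disk store would more likely rely on
a database transaction.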
70
71 The BlockCache
72 ==============
73
74 The BlockCache holds the working set of blocks for the validator and tracks the
75 processing state. This processing state is tracked as valid, invalid, or
76 unknown. Valid blocks are blocks that have been proven to be valid by the
77 ChainController. Invalid blocks are blocks that failed validation or have an
78 invalid block as a predecessor. Unknown blocks are blocks that have not yet
79 completed validation, usually having just arrived from the Completer.
80
81 The BlockCache is an in-memory construct. It is rebuilt on demand when the
82 system is started.
83
84 If a block is not present in the BlockCache, the BlockCache looks in the
85 BlockStore for the block. If it is not found or the lookup fails, the block is
86 unknown to the system. If the block is found in the BlockStore, it is loaded
87 into the BlockCache and marked as valid. All blocks in the BlockStore are
88 considered valid.
89
90 The BlockCache keeps blocks that are currently relevant, tracked by the last
91 time the block was accessed. Periodically, the blocks that have not been
92 accessed recently are purged from the BlockCache, but only if none of the
93 other blocks in the BlockCache reference those blocks as predecessors.
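
The lookup and purge behavior described in this section might look like the
sketch below. It is illustrative only: ``SketchBlockCache``, its method names,
the ``keep_seconds`` parameter, and the dict-shaped blocks (each with a
``previous_id`` key) are assumptions made for this example.

.. code-block:: python

   import time

   VALID, INVALID, UNKNOWN = "valid", "invalid", "unknown"


   class SketchBlockCache:
       """Illustrative cache; `block_store` is any dict of id -> block."""

       def __init__(self, block_store, keep_seconds=30.0, clock=time.monotonic):
           self._store = block_store
           self._keep = keep_seconds
           self._clock = clock
           self._entries = {}  # block_id -> [block, status, last_access]

       def __contains__(self, block_id):
           return block_id in self._entries

       def add(self, block_id, block, status=UNKNOWN):
           self._entries[block_id] = [block, status, self._clock()]

       def get(self, block_id):
           """Return (block, status), or None if the block is unknown."""
           entry = self._entries.get(block_id)
           if entry is None:
               block = self._store.get(block_id)
               if block is None:
                   return None
               # Everything in the BlockStore is considered valid.
               self.add(block_id, block, VALID)
               entry = self._entries[block_id]
           entry[2] = self._clock()  # record the access time
           return entry[0], entry[1]

       def purge(self):
           """Drop stale blocks that no cached block names as predecessor."""
           now = self._clock()
           referenced = {e[0]["previous_id"] for e in self._entries.values()}
           for block_id, entry in list(self._entries.items()):
               if now - entry[2] > self._keep and block_id not in referenced:
                   del self._entries[block_id]

Because the set of referenced predecessors is computed once per pass, a stale
chain is trimmed from its tip backward over successive purges rather than all
at once.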
94
95 The Completer
96 =============
97
98 The Completer is responsible for making sure Blocks and Batches are complete
99 before they are delivered. Blocks are considered formally complete once all of
100 their predecessors have been delivered to the ChainController and their batches
101 field contains all the Batches specified in the BlockHeader’s batch_ids list.
102 The batches field is also expected to be in the same order as the batch_ids.
103 Once Blocks are formally complete they are delivered to the ChainController for
104 validation.
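
As a rough sketch of the completeness rule for Blocks, the check below compares
the delivered batches against the header's ``batch_ids`` list. The function
name and the dict layout (``previous_id``, ``batch_ids``, ``batches``,
``batch_id`` keys) are assumptions made for illustration, not the validator's
actual data structures.

.. code-block:: python

   def is_formally_complete(block, known_block_ids):
       """Return True if `block` can be handed to the ChainController.

       `block` is a hypothetical dict with 'previous_id', 'batch_ids' (from
       the header), and 'batches' (each a dict with a 'batch_id').
       """
       if block["previous_id"] not in known_block_ids:
           return False  # the predecessor must be requested first
       delivered = [batch["batch_id"] for batch in block["batches"]]
       # List equality checks both presence and order.
       return delivered == list(block["batch_ids"])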
105
106 A Batch is considered complete once all of its dependent transactions exist in
107 the current chain or have been delivered to the BlockPublisher.
108
109 All Blocks and Batches will have a timeout for being completed. After the
110 initial request for the missing dependencies is sent, if the response is not
111 received within the specified time window, they are dropped.
112
113 If you have a new block of unknown validity, you must ensure that its
114 predecessors have been delivered to the journal. If a predecessor is not
115 delivered on request to the journal in a reasonable amount of time, the new
116 block cannot be validated.
117
118 Consider the case where you have the chain A->B->C:
119
120 If C arrives and B is not in the BlockCache, the validator will request B. If
121 the request for B times out, the C block is dropped.
122
123 If D later arrives with predecessor C, forming the chain A->B->C->D, the
124 Completer will request C from the network and, once C arrives, will request B
125 again. If B arrives this time, the new chain will be delivered to the
126 ChainController, where the blocks will be checked for validity and considered
127 for becoming the chain head.
128
129 The Consensus Interface
130 =======================
131
132 In the spirit of configurability, the Journal supports
133 :term:`dynamic consensus algorithms<Dynamic consensus>`
134 that can be changed via the Settings transaction family. The
135 initial selection of a consensus algorithm is set for the chain in the genesis
136 block during genesis (described below). This may be changed during the course of
137 a chain's lifetime. The Journal and its consensus interface support dynamic
138 consensus for probabilistic finality algorithms like Proof of Work, as well as
139 algorithms with absolute finality like PBFT.
140
141 The services that a consensus algorithm provides to the Journal are divided into
142 three distinct interfaces that have specific lifetimes and access to information.
143
144 1. Consensus.BlockPublisher
145 2. Consensus.BlockVerifier
146 3. Consensus.ForkResolver
147
148 Consensus algorithm implementations in Sawtooth must implement all of the
149 consensus interfaces. Each of these objects is provided read-only access to
150 the BlockCache and GlobalState.
151
152 Consensus.BlockPublisher
153 ------------------------
154
155 An implementation of the interface Consensus.BlockPublisher is used by the
156 BlockPublisher to create new candidate blocks to extend the chain. The
157 Consensus.BlockPublisher is provided access to a read-only view of global
158 state, a read-only view of the BlockStore, and an interface to publish batches.
159
160 Three events are called on the Consensus.BlockPublisher:
161
162 1. initialize_block - The BlockHeader is provided for the candidate block. This
163 is called immediately after the block_header is initialized and allows for
164 validation of the consensus algorithm's internal state, checks if the header
165 is correct, checks if according to the consensus rules a block could be
166 published, and checks if any initialization of the block_header is required.
167 If this function fails no candidate block is created and the BlockPublisher
168 will periodically attempt to create new blocks.
169 2. check_publish_block - Periodically, polling is done to check if the block can
170 be published. In the case of PoET, this is a check to see if the wait time
171 has expired, but could be on any other criteria the consensus algorithm has
172 for determining if it is time to publish a block. When this returns true the
173 BlockPublisher will proceed in creating the block.
174 3. finalize_block - Once check_publish_block has confirmed it is time to
175 publish a block, the block header is considered complete, except for the
176 consensus information. The BlockPublisher calls finalize_block with the
177 completed block_header allowing the consensus field to be filled out.
178 Afterwards, the BlockPublisher signs the block and broadcasts it to the
179 network.
180
181 This implementation needs to take special care to handle the genesis block
182 correctly. During genesis operation, the Consensus.BlockPublisher will be called
183 to initialize and finalize a block so that it can be published on the chain
184 (see below).
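
To make the three events concrete, here is a hedged sketch of what such an
interface could look like. Only the event names come from the description
above; the class names, the method signatures, the dict-shaped header, and the
toy ``FixedWaitPublisher`` algorithm are assumptions for illustration and are
not a real Sawtooth consensus implementation.

.. code-block:: python

   import time
   from abc import ABC, abstractmethod


   class ConsensusBlockPublisher(ABC):
       """Hypothetical shape of the Consensus.BlockPublisher interface."""

       @abstractmethod
       def initialize_block(self, block_header):
           """Return True if a candidate block may be started."""

       @abstractmethod
       def check_publish_block(self, block_header):
           """Return True once the candidate block may be published."""

       @abstractmethod
       def finalize_block(self, block_header):
           """Fill in the consensus field; return True on success."""


   class FixedWaitPublisher(ConsensusBlockPublisher):
       """Toy algorithm: allow publishing after a fixed wait has elapsed."""

       def __init__(self, wait_seconds, clock=time.monotonic):
           self._wait = wait_seconds
           self._clock = clock
           self._start = None

       def initialize_block(self, block_header):
           self._start = self._clock()  # start the wait for this candidate
           return True

       def check_publish_block(self, block_header):
           return self._clock() - self._start >= self._wait

       def finalize_block(self, block_header):
           block_header["consensus"] = b"fixed-wait"
           return True

In this sketch the BlockPublisher would call ``initialize_block`` once, poll
``check_publish_block`` until it returns ``True``, and then call
``finalize_block`` before signing and broadcasting the block.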
185
186 Consensus.BlockVerifier
187 -----------------------
188
189 The Consensus.BlockVerifier implementation provides Block verification services
190 to the BlockValidator. This gives the consensus algorithm an opportunity to
191 check whether the candidate block was published following the consensus rules.
192
193 Consensus.ForkResolver
194 ----------------------
195
196 The consensus algorithm is responsible for fork resolution on the system.
197 Depending on the consensus algorithm, the determination of the valid block to
198 become the chain head will differ. In a Bitcoin Proof of Work consensus, this
199 will be the longest chain, whereas PoET uses the measure of aggregate local
200 mean (a measure of the total amount of time spent waiting) to determine the
201 valid fork. Consensus algorithms with finality, such as PBFT, will only ever
202 produce blocks that extend the current head. These algorithms will never have
203 forks to resolve. The ForkResolver for these algorithms with finality will
204 always select the new block that extends the current head.
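
The two styles of fork resolution mentioned above can be contrasted with a
small sketch. The function names and the dict-shaped block heads (with
``block_id``, ``previous_id``, and ``block_num`` keys) are hypothetical, and
the longest-chain rule is the simplified one stated in this section.

.. code-block:: python

   def longest_chain_wins(current_head, new_head):
       """Return True if the fork ending at `new_head` should be adopted.

       Uses block height as the measure, as in the simplified
       longest-chain rule described above.
       """
       return new_head["block_num"] > current_head["block_num"]


   def extends_head(current_head, new_head):
       """Resolver for algorithms with finality.

       Accepts only a block that directly extends the current head.
       """
       return new_head["previous_id"] == current_head["block_id"]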
205
206 The ChainController
207 ===================
208
209 The ChainController is responsible for determining which chain the validator is
210 currently on and coordinating any change-of-chain activities that need to
211 happen.
212
213 The ChainController is designed to be able to handle multiple block validation
214 activities simultaneously. For instance, if multiple forks form on the network,
215 the ChainController can process blocks from all of the competing forks
216 simultaneously. This is advantageous as it allows progress to be made even when
217 there are several deep forks competing. The current chain can also be advanced
218 while a deep fork is being evaluated. This was implemented for cases that could
219 happen if a group of validators lost connectivity with the network and later
220 rejoined.
221
222 .. note::
223
224 Currently, the thread pool is set to 1, so only one Block is validated
225 at a time.
226
227 Here is the basic flow of the ChainController as a single block is processed.
228
229 .. image:: ../images/journal_chain_controller.*
230 :width: 80%
231 :align: center
232 :alt: Journal Chain Controller Diagram
233
234 When a block arrives, the ChainController creates a BlockValidator and
235 dispatches it to a thread pool for execution. Once the BlockValidator has
236 completed, it calls back to the ChainController, indicating whether the new
237 block should be the chain head. This indication falls into three cases:
238
239 1. The chain head has been updated since the BlockValidator was created. In
240 this case a new BlockValidator is created and dispatched to redo the fork
241 resolution.
242 2. The new Block should become the chain head. In this case the chain head is
243 updated to be the new block.
244 3. The new Block should not become the chain head. This could be because the
245 new Block is part of a chain that has an invalid block in it, or it is a
246 member of a shorter or less desirable fork as determined by consensus.
247
248 The ChainController synchronizes chain head updates such that only one
249 BlockValidator result can be processed at a time. This is to prevent the race
250 condition of multiple fork resolution processes attempting to update the chain
251 head at the same time.
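
The three cases and the synchronization around them can be sketched as below.
This is an assumption-laden illustration: ``SketchChainController``, the
``on_block_validated`` callback, and its string return values are invented for
this example, and blocks are represented by plain identifiers.

.. code-block:: python

   import threading


   class SketchChainController:
       """Illustrative handling of BlockValidator results."""

       def __init__(self, chain_head):
           self._lock = threading.Lock()
           self.chain_head = chain_head

       def on_block_validated(self, head_at_start, new_block, commit_new_block):
           """Process one BlockValidator result at a time.

           `head_at_start` is the chain head the validator compared against.
           """
           with self._lock:
               if head_at_start != self.chain_head:
                   # Case 1: the head moved; fork resolution must be redone.
                   return "revalidate"
               if commit_new_block:
                   # Case 2: the new block becomes the chain head.
                   self.chain_head = new_block
                   return "committed"
               # Case 3: invalid chain or less desirable fork.
               return "rejected"

Holding a single lock for the whole callback is what prevents two fork
resolution results from updating the chain head at the same time.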
252
253 Chain Head Update
254 -----------------
255
256 When the chain needs to be updated, the ChainController updates the
257 ChainHead using the BlockStore, providing it with the list of commit blocks
258 that are in the new fork and a list of decommit blocks that are in the
259 BlockStore, which must be removed. After the BlockStore is updated, the
260 BlockPublisher is notified that there is a new ChainHead.
261
262 Delayed Block Processing
263 ------------------------
264
265 While the ChainController does Block validation in parallel, there are cases
266 where the ChainController will serialize Block validation. This happens when
267 a Block is received and any of its predecessors are still being validated. In
268 this case the validation of the predecessor is completed before the new Block is
269 scheduled. This avoids redoing the validation work of the predecessor Block.
270 Because the predecessor must be validated prior to the new Block anyway, the
271 delay is inconsequential to the outcome.
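
One way to picture this serialization is a scheduler that holds a block back
while its predecessor is in progress or itself waiting. The sketch below is
hypothetical: the class, its method names, and the use of plain block
identifiers are assumptions made for illustration.

.. code-block:: python

   class SketchValidationScheduler:
       """Delays a block while any predecessor is still being validated."""

       def __init__(self):
           self.in_progress = set()  # block ids currently being validated
           self.waiting = {}         # block id -> predecessor id it waits on

       def submit(self, block_id, previous_id):
           """Return True if validation may start now, False if delayed."""
           if previous_id in self.in_progress or previous_id in self.waiting:
               self.waiting[block_id] = previous_id
               return False
           self.in_progress.add(block_id)
           return True

       def finished(self, block_id):
           """Mark a validation done; return the blocks that may now start."""
           self.in_progress.discard(block_id)
           ready = [b for b, prev in self.waiting.items() if prev == block_id]
           for ready_id in ready:
               del self.waiting[ready_id]
               self.in_progress.add(ready_id)
           return ready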
272
273 The BlockValidator
274 ------------------
275
276 The BlockValidator is a subcomponent of the ChainController that is responsible
277 for Block validation and fork resolution. When the BlockValidator is
278 instantiated, it is given the candidate Block to validate and the current chain
279 head.
280
281 During processing, if a Block is marked as invalid it is discarded, never to be
282 considered again. The only way to have the Block reconsidered is by flushing the
283 BlockCache, which can be done by restarting the validator.
284
285 The BlockValidator has three stages of evaluation.
286
287 1. Determine the common root of the fork (ForkRoot). This is done by walking the
288 chain back from the candidate and the chain head until a common block is
289 found. The Root can be the ChainHead in the case that the Candidate is
290 advancing the existing chain. The only case that the ForkRoot will not be
291 found is if the Candidate is from another Genesis. If this is the case, the
292 Candidate and all of its predecessors are marked as Invalid and discarded.
293 During this step, an ordered list of both chains is built back to the
294 ForkRoot.
295 2. The Candidate chain is validated. This process walks forward from the
296 ForkRoot and applies block validation rules (described below) to each Block
297 successively. If any block fails validation, it and all of its successors
298 are marked as Invalid (Valid Blocks are defined as having Valid
299 predecessor(s)). Once the Candidate is successfully Validated and marked as
300 Valid, the Candidate is ready for Fork Resolution.
301 3. Fork resolution requires a determination to be made if the Candidate should
302 replace the ChainHead and is deferred entirely to the consensus
303 implementation. Once the Consensus determines if the block is the new
304 ChainHead, the answer is returned to the ChainController, which updates the
305 BlockStore. If it is not the new ChainHead, the Candidate is dropped.
306 Additionally, if the Candidate is to become the ChainHead, the list of
307 transactions committed in the new chain back to the common root is computed
308 and the same list is computed on the current chain. This information helps
309 the BlockPublisher update its pending batch list when the chain is updated.
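
The first stage, walking both chains back to the ForkRoot, can be sketched as
follows. The function name and the ``blocks`` mapping of block ID to
``(block_num, previous_id)`` are assumptions for this example; a genesis block
is assumed to have ``None`` as its predecessor.

.. code-block:: python

   def find_fork_root(candidate_id, head_id, blocks):
       """Walk both chains back until they meet.

       Returns (fork_root_id, new_chain, current_chain), with each chain
       ordered from the block after the root forward, or None if the two
       chains descend from different genesis blocks.
       """
       new_chain, cur_chain = [], []
       new, cur = candidate_id, head_id
       while new != cur:
           if new is None or cur is None:
               return None  # ran past a genesis block without meeting
           # Step back on whichever side is at least as high.
           if blocks[new][0] >= blocks[cur][0]:
               new_chain.append(new)
               new = blocks[new][1]
           else:
               cur_chain.append(cur)
               cur = blocks[cur][1]
       return new, new_chain[::-1], cur_chain[::-1]

When the Candidate simply advances the existing chain, the returned root is the
current chain head and the current-chain list is empty.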
310
311 Block Validation
312 ----------------
313
314 Block validation has the following steps, which are always run in order. If
315 any validation step fails, processing stops and the Block is marked as
316 Invalid.
317
318 1. **Transaction Permissioning** - On-chain transaction permissions are
319 checked to see who is allowed to submit transactions and batches.
320
321 #. **On-chain Block Validation Rules** - The on-chain block validation rules
322 are checked to ensure that the Block doesn't invalidate any of the
323 rules stored at ``sawtooth.validator.block_validation_rules``.
324
325 #. **Batches Validation** - All of the Batches in the block are sent in order
326 to a Transaction Scheduler for validation. If any Batches fail validation,
327 this block is marked as invalid. Note: Batch and Signature verification is
328 done on receipt of the Batch prior to it being routed to the Journal. The
329 batches are checked for the following:
330
331 * No duplicate Batches
332 * No duplicate Transactions
333 * Valid Transaction dependencies
334 * Successful Batch Execution
335
336 #. **Consensus Verification** - The Block is given to the Consensus instance for
337 verification. Consensus block verification is done by the consensus algorithm
338 using its own rules.
339
340 #. **State Hash Check** - The StateRootHash generated by validating the block is
341 checked against the StateRootHash (state_root_hash field in the BlockHeader)
342 on the block. They must match for the block to be valid.
343
344 If the block is determined to be valid, the StateRootHash is committed to the
345 store.
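
The ordered, stop-at-first-failure structure of these steps can be sketched as
a small pipeline. The helper names, the ``(name, check)`` step format, and the
dict-shaped block are hypothetical; only the duplicate checks are written out,
and the other steps would be supplied as further check functions.

.. code-block:: python

   def no_duplicates(block):
       """True if no batch ID and no transaction ID appears twice."""
       batch_ids = [batch["batch_id"] for batch in block["batches"]]
       txn_ids = [
           txn_id
           for batch in block["batches"]
           for txn_id in batch["transaction_ids"]
       ]
       return (len(batch_ids) == len(set(batch_ids))
               and len(txn_ids) == len(set(txn_ids)))


   def validate_block(block, steps):
       """Run (name, check) steps in order and stop at the first failure.

       Returns (True, None) if every check passes, else (False, name) where
       `name` identifies the step that failed.
       """
       for name, check in steps:
           if not check(block):
               return False, name
       return True, None

Later steps never run once an earlier one fails, which mirrors the rule that
processing stops as soon as the Block is found to be invalid.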
346
347 The BlockPublisher
348 ==================
349
350 The BlockPublisher is responsible for creating candidate blocks to extend the
351 current chain. The BlockPublisher does all of the housekeeping work around
352 creating a block but takes direction from the consensus algorithm for when to
353 create a block and when to publish a block.
354
355 The BlockPublisher follows this logic flow:
356
357 .. image:: ../images/journal_block_publisher_flow.*
358 :width: 80%
359 :align: center
360 :alt: Journal Block Publisher Diagram
361
362 At each processing stage, the consensus algorithm has a chance to inspect and
363 confirm the validity of the block.
364
365 During CreateBlock, an instance of Consensus.BlockPublisher is created that is
366 responsible for guiding the creation of this candidate block. Also, a
367 TransactionScheduler is created and all of the pending Batches are submitted to
368 it.
369
370 A delay is employed in the checking loop to ensure that there is time for the
371 batch processing to occur.
372
373 Genesis Operation
374 =================
375
376 The Journal supports Genesis operation. This is the action of creating a root of
377 the chain (the Genesis block) when the block store is empty. This operation is
378 necessary for bootstrapping a validator network with the desired consensus
379 model, any deployment-specific configuration settings, as well as any
380 genesis-time transactions for an application's Transaction Family.
381
382 Genesis Batch Creation
383 ----------------------
384
385 The CLI tool produces batches in a file, which will be consumed by the
386 validator on startup (when starting with an empty chain).
387
388 The file contains a protobuf-encoded list of batches:
389
390 .. code-block:: protobuf
391 :caption: File: sawtooth-core/protos/genesis.proto
392
393 message GenesisData {
394 repeated Batch batches = 1;
395 }
396
397 The tool should take multiple input batch collections, and combine them
398 together into the single list of batches contained in GenesisData. This allows
399 independent tools or transaction families to include their own batches, without
400 needing to know anything about the genesis process.
401
402 The first implementation assumes that the order of the input batches implies
403 dependencies, with each batch being implicitly dependent on the previous
404 one. Any dependencies should be verified when the final set of batches is
405 produced. This would be enforced by the use of strict ordering of the batches
406 during execution time. Future implementations may provide a way to verify
407 dependencies across input batches.
408
409 Transaction family authors who need to include their own batches must
410 provide their own tool to produce GenesisData with the batches they
411 require for the process. Each individual tool may manage its batch and
412 transaction dependencies explicitly within the context of its specific
413 genesis batches.
414
415 Example
416 ~~~~~~~
417
418 The following example configures the validator to use PoET consensus
419 and specifies the appropriate settings:
420
421 .. code-block:: bash
422
423 sawset proposal create \
424 -k <signing-key-file> \
425 -o sawset.batch \
426 sawtooth.consensus.algorithm=poet \
427 sawtooth.poet.initial_wait_timer=x \
428 sawtooth.poet.target_wait_time=x \
429 sawtooth.poet.population_estimate_sample_size=x
430 sawadm genesis \
431 sawset.batch
432
433 A genesis.batch file will be written to the validator's data directory.
434
435 Block Creation
436 --------------
437
438 On startup, the validator uses the resulting genesis.batch file to produce
439 a genesis block under the following conditions:
440
441 * The genesis.batch file exists
442 * There is no block specified as the chain head
443
444 If either of these conditions is not met, the validator halts operation.
445
446 The validator will load the batches from the file into the pending queue. It
447 will then produce the genesis block through the standard process with the
448 following modifications.
449
450 First, the execution of the batches will be strictly in the order they have
451 been provided. The Executor will not attempt to reorder them, or drop failed
452 transactions. Any failure of a transaction in genesis.batch will fail to
453 produce the genesis block, and the validator will treat this as a fatal error.
454
455 Second, it will use a genesis consensus to determine block validity. At the
456 start of the genesis block creation process, state (the Merkle-Radix tree) will be empty.
457 Given that the consensus mechanism is specified by a configuration setting in
458 the state, this will return None. As a result, the genesis consensus mechanism
459 will be used. This will produce a block with an empty consensus field.
460
461 In addition to the genesis block, the blockchain ID (that is, the signature of
462 the genesis block) is written to the file ``block-chain-id`` in the validator’s
463 data directory.
464
465 Part of the production of the genesis block will require the configuration of
466 the consensus mechanism. The second block will then use the configured
467 consensus model, which will need to know how to initialize the consensus field
468 from an empty one. In future cases, transitions between consensus models may be
469 possible, as long as they know how to read the consensus field of the previous
470 block.
471
472 To complete the process, all necessary transaction processors must be running.
473 A minimum requirement is the Sawtooth Settings transaction processor,
474 ``settings-tp``.
475
476 .. Licensed under Creative Commons Attribution 4.0 International License
477 .. https://creativecommons.org/licenses/by/4.0/