*******
Journal
*******

The Journal is responsible for maintaining and extending the blockchain for the
validator. This responsibility involves validating candidate blocks, evaluating
valid blocks to determine if they are the correct chain head, and generating
new blocks to extend the chain.

The Journal is the consumer of Blocks and Batches that arrive at the validator.
These Blocks and Batches arrive via interconnect, either through the gossip
protocol or the REST API. The newly-arrived Blocks and Batches are sent to the
Journal, which routes them internally.

.. image:: ../images/journal_organization.*
   :width: 80%
   :align: center
   :alt: Journal Organization Diagram

The Journal divides the processing of Blocks and Batches into different
pipelines. Both objects are delivered initially to the Completer, which
guarantees that all dependencies for the Blocks and Batches have been satisfied
and delivered downstream. Completed Blocks are delivered to the
ChainController for validation and fork resolution. Completed Batches are
delivered to the BlockPublisher for validation and inclusion in a Block.

The Journal is designed to be asynchronous, allowing incoming blocks to be
processed in parallel by the ChainController, as well as allowing the
BlockPublisher to proceed with claiming blocks even when the incoming block
rate is high.

It is also flexible enough to accept different consensus algorithms. The
Journal implements a consensus interface that defines the entry points and
responsibilities of a consensus algorithm.

The BlockStore
==============

The BlockStore contains all the blocks in the current blockchain - that is, the
list of blocks from the current chain head back to the Genesis block.
Blocks
from forks are not included in the BlockStore. The BlockStore also includes a
reference to the head of the current chain. It is expected to be coherent at
all times, and an error in the BlockStore is considered a non-recoverable error
for the validator. Such critical errors would include missing blocks, bad
indexes, a missing chain reference, incomplete blocks, or invalid blocks in the
store. The BlockStore provides an atomic means to update the store when the
current fork is changed (the chain head is updated).

The BlockStore is a persistent on-disk store of all Blocks in the current
chain. When the validator is started, the contents of the BlockStore are
trusted to be the current state of the world.

All blocks stored here are formally complete. The BlockStore allows blocks to
be accessed via Block ID. Blocks can also be accessed via Batch ID,
Transaction ID, or block number; for example, ``get_block_by_batch_id``,
``get_block_by_transaction_id``, ``get_batch_by_transaction``, or
``get_block_by_number``.

The BlockStore maintains internal mappings of Transaction-to-Block and
Batch-to-Block. These may be rebuilt if missing or corrupt. This rebuild should
be done during startup, and not during the course of normal operation. These
mappings should be stored in a format that is cached to disk, so they are not
required to be held in memory at all times. As the blockchain grows, these will
become quite large.

The BlockStore provides an atomic method for updating the current head of the
chain. In order for the BlockStore to switch forks, it is provided with a list
of blocks in the new chain to commit, and a list of blocks in the old chain to
decommit. These lists are the blocks in each fork back to the common root.

The BlockCache
==============

The BlockCache holds the working set of blocks for the validator and tracks the
processing state.
This processing state is tracked as valid, invalid, or
unknown. Valid blocks are blocks that have been proven to be valid by the
ChainController. Invalid blocks are blocks that failed validation or have an
invalid block as a predecessor. Unknown blocks are blocks that have not yet
completed validation, usually having just arrived from the Completer.

The BlockCache is an in-memory construct. It is rebuilt on demand when the
system is started.

If a block is not present in the BlockCache, it will look in the BlockStore for
the block. If it is not found or the lookup fails, the block is unknown to the
system. If the block is found in the BlockStore it is loaded into the
BlockCache and marked as valid. All blocks in the BlockStore are considered
valid.

The BlockCache keeps blocks that are currently relevant, tracked by the last
time the block was accessed. Periodically, the blocks that have not been
accessed recently are purged from the block cache, but only if none of the
other blocks in the BlockCache reference those blocks as predecessors.

The Completer
=============

The Completer is responsible for making sure Blocks and Batches are complete
before they are delivered. Blocks are considered formally complete once all of
their predecessors have been delivered to the ChainController and their batches
field contains all the Batches specified in the BlockHeader’s batch_ids list.
The batches field is also expected to be in the same order as the batch_ids.
Once Blocks are formally complete they are delivered to the ChainController for
validation.

A Batch is considered complete once all of its dependent transactions exist in
the current chain or have been delivered to the BlockPublisher.

All Blocks and Batches will have a timeout for being completed.
After the
initial request for the missing dependencies is sent, if the response is not
received within the specified time window, they are dropped.

If you have a new block of unknown validity, you must ensure that its
predecessors have been delivered to the journal. If a predecessor is not
delivered on request to the journal in a reasonable amount of time, the new
block cannot be validated.

Consider the case where you have the chain A->B->C:

If C arrives and B is not in the BlockCache, the validator will request B. If
the request for B times out, the C block is dropped.

If later on D arrives with predecessor C, of chain A->B->C->D, the Completer
will request C from the network and, once C arrives, will request B again.
If B arrives this time, the new chain will be delivered to the
ChainController, where its blocks will be checked for validity and considered
for becoming the chain head.

The Consensus Interface
=======================

In the spirit of configurability, the Journal supports
:term:`dynamic consensus algorithms<Dynamic consensus>`
that can be changed via the Settings transaction family. The
initial selection of a consensus algorithm is set for the chain in the genesis
block during genesis (described below). This may be changed during the course of
a chain's lifetime. The Journal and its consensus interface support dynamic
consensus for probabilistic finality algorithms like Proof of Work, as well as
algorithms with absolute finality like PBFT.

The Consensus algorithm services to the journal are divided into three distinct
interfaces that have specific lifetimes and access to information.

1. Consensus.BlockPublisher
2. Consensus.BlockVerifier
3. Consensus.ForkResolver

Consensus algorithm implementations in Sawtooth must implement all of the
consensus interfaces.
Each of these objects is provided read-only access to
the BlockCache and GlobalState.

Consensus.BlockPublisher
------------------------

An implementation of the interface Consensus.BlockPublisher is used by the
BlockPublisher to create new candidate blocks to extend the chain. The
Consensus.BlockPublisher is provided access to a read-only view of global
state, a read-only view of the BlockStore, and an interface to publish batches.

Three events are called on the Consensus.BlockPublisher:

1. initialize_block - The BlockHeader is provided for the candidate block. This
   is called immediately after the block_header is initialized and allows for
   validation of the consensus algorithm's internal state, checks if the header
   is correct, checks if according to the consensus rules a block could be
   published, and checks if any initialization of the block_header is required.
   If this function fails no candidate block is created and the BlockPublisher
   will periodically attempt to create new blocks.
2. check_publish_block - Periodically, polling is done to check if the block can
   be published. In the case of PoET, this is a check to see if the wait time
   has expired, but it could be based on any other criteria the consensus
   algorithm has for determining if it is time to publish a block. When this
   returns true the BlockPublisher will proceed in creating the block.
3. finalize_block - Once check_publish_block has confirmed it is time to
   publish a block, the block header is considered complete, except for the
   consensus information. The BlockPublisher calls finalize_block with the
   completed block_header allowing the consensus field to be filled out.
   Afterwards, the BlockPublisher signs the block and broadcasts it to the
   network.

This implementation needs to take special care to handle the genesis block
correctly.
During genesis operation, the Consensus.BlockPublisher will be called
to initialize and finalize a block so that it can be published on the chain
(see below).

Consensus.BlockVerifier
-----------------------

The Consensus.BlockVerifier implementation provides Block verification services
to the BlockValidator. This gives the consensus algorithm an opportunity to
check whether the candidate block was published following the consensus rules.

Consensus.ForkResolver
----------------------

The consensus algorithm is responsible for fork resolution on the system.
Depending on the consensus algorithm, the determination of the valid block to
become the chain head will differ. In a Bitcoin Proof of Work consensus, this
will be the longest chain, whereas PoET uses the measure of aggregate local
mean (a measure of the total amount of time spent waiting) to determine the
valid fork. Consensus algorithms with finality, such as PBFT, will only ever
produce blocks that extend the current head. These algorithms will never have
forks to resolve. The ForkResolver for these algorithms with finality will
always select the new block that extends the current head.

The ChainController
===================

The ChainController is responsible for determining which chain the validator is
currently on and coordinating any change-of-chain activities that need to
happen.

The ChainController is designed to be able to handle multiple block validation
activities simultaneously. For instance, if multiple forks form on the network,
the ChainController can process blocks from all of the competing forks
simultaneously. This is advantageous as it allows progress to be made even when
there are several deep forks competing. The current chain can also be advanced
while a deep fork is being evaluated.
This was implemented for cases that could
happen if a group of validators lost connectivity with the network and later
rejoined.

.. note::

   Currently, the thread pool is set to 1, so only one Block is validated
   at a time.

Here is the basic flow of the ChainController as a single block is processed.

.. image:: ../images/journal_chain_controller.*
   :width: 80%
   :align: center
   :alt: Journal Chain Controller Diagram

When a block arrives, the ChainController creates a BlockValidator and
dispatches it to a thread pool for execution. Once the BlockValidator has
completed, it will call back to the ChainController indicating whether the new
block should be the chain head. This indication falls into 3 cases:

1. The chain head has been updated since the BlockValidator was created. In
   this case a new BlockValidator is created and dispatched to redo the fork
   resolution.
2. The new Block should become the chain head. In this case the chain head is
   updated to be the new block.
3. The new Block should not become the chain head. This could be because the
   new Block is part of a chain that has an invalid block in it, or it is a
   member of a shorter or less desirable fork as determined by consensus.

The ChainController synchronizes chain head updates such that only one
BlockValidator result can be processed at a time. This is to prevent the race
condition of multiple fork resolution processes attempting to update the chain
head at the same time.

Chain Head Update
-----------------

When the chain needs to be updated, the ChainController does an update of the
ChainHead using the BlockStore, providing it with the list of commit blocks
that are in the new fork and a list of decommit blocks that are in the
BlockStore, which must be removed.
After the BlockStore is updated, the
BlockPublisher is notified that there is a new ChainHead.

Delayed Block Processing
------------------------

While the ChainController does Block validation in parallel, there are cases
where the ChainController will serialize Block validation. These cases are when
a Block is received and any of its predecessors are still being validated. In
this case the validation of the predecessor is completed before the new block is
scheduled. This is done to avoid redoing the validation work of the predecessor
Block. Since the predecessor must be validated prior to the new Block, the
delay is inconsequential to the outcome.

The BlockValidator
------------------

The BlockValidator is a subcomponent of the ChainController that is responsible
for Block validation and fork resolution. When the BlockValidator is
instantiated, it is given the candidate Block to validate and the current chain
head.

During processing, if a Block is marked as invalid it is discarded, never to be
considered again. The only way to have the Block reconsidered is by flushing the
BlockCache, which can be done by restarting the validator.

The BlockValidator has three stages of evaluation.

1. Determine the common root of the fork (ForkRoot). This is done by walking the
   chain back from the candidate and the chain head until a common block is
   found. The Root can be the ChainHead in the case that the Candidate is
   advancing the existing chain. The only case that the ForkRoot will not be
   found is if the Candidate is from another Genesis. If this is the case, the
   Candidate and all of its predecessors are marked as Invalid and discarded.
   During this step, an ordered list of both chains is built back to the
   ForkRoot.
2. The Candidate chain is validated.
   This process walks forward from the
   ForkRoot and applies block validation rules (described below) to each Block
   successively. If any block fails validation, it and all of its successors
   are marked as Invalid (Valid Blocks are defined as having Valid
   predecessor(s)). Once the Candidate is successfully Validated and marked as
   Valid, the Candidate is ready for Fork Resolution.
3. Fork resolution requires a determination to be made if the Candidate should
   replace the ChainHead and is deferred entirely to the consensus
   implementation. Once the Consensus determines if the block is the new
   ChainHead, the answer is returned to the ChainController, which updates the
   BlockStore. If it is not the new ChainHead, the Candidate is dropped.
   Additionally, if the Candidate is to become the ChainHead, the list of
   transactions committed in the new chain back to the common root is computed
   and the same list is computed on the current chain. This information helps
   the BlockPublisher update its pending batch list when the chain is updated.

Block Validation
----------------

Block validation has the following steps, which are always run in order. If any
validation step fails, processing stops and the Block is marked as Invalid.

1. **Transaction Permissioning** - On-chain transaction permissions are
   checked to see who is allowed to submit transactions and batches.

#. **On-chain Block Validation Rules** - The on-chain block validation rules
   are checked to ensure that the Block doesn't invalidate any of the
   rules stored at ``sawtooth.validator.block_validation_rules``.

#. **Batches Validation** - All of the Batches in the block are sent in order
   to a Transaction Scheduler for validation. If any Batches fail validation,
   this block is marked as invalid.
   Note: Batch and Signature verification is
   done on receipt of the Batch prior to it being routed to the Journal. The
   batches are checked for the following:

   * No duplicate Batches
   * No duplicate Transactions
   * Valid Transaction dependencies
   * Successful Batch Execution

#. **Consensus Verification** - The Block is given to the Consensus instance
   for verification. Consensus block verification is done by the consensus
   algorithm using its own rules.

#. **State Hash Check** - The StateRootHash generated by validating the block is
   checked against the StateRootHash (state_root_hash field in the BlockHeader)
   on the block. They must match for the block to be valid.

If the block is computed to be valid, then StateRootHash is committed to the
store.

The BlockPublisher
==================

The BlockPublisher is responsible for creating candidate blocks to extend the
current chain. The BlockPublisher does all of the housekeeping work around
creating a block but takes direction from the consensus algorithm for when to
create a block and when to publish a block.

The BlockPublisher follows this logic flow:

.. image:: ../images/journal_block_publisher_flow.*
   :width: 80%
   :align: center
   :alt: Journal Block Publisher Diagram

At each processing stage, the consensus algorithm has a chance to inspect and
confirm the validity of the block.

During CreateBlock, an instance of Consensus.BlockPublisher is created that is
responsible for guiding the creation of this candidate block. Also, a
TransactionScheduler is created and all of the pending Batches are submitted to
it.

A delay is employed in the checking loop to ensure that there is time for the
batch processing to occur.

Genesis Operation
=================

The Journal supports Genesis operation.
This is the action of creating a root of
the chain (the Genesis block) when the block store is empty. This operation is
necessary for bootstrapping a validator network with the desired consensus
model, any deployment-specific configuration settings, as well as any
genesis-time transactions for an application's Transaction Family.

Genesis Batch Creation
----------------------

The CLI tool produces batches in a file, which will be consumed by the
validator on startup (when starting with an empty chain).

The file contains a protobuf-encoded list of batches:

.. code-block:: protobuf
   :caption: File: sawtooth-core/protos/genesis.proto

   message GenesisData {
       repeated Batch batches = 1;
   }

The tool should take multiple input batch collections, and combine them
together into the single list of batches contained in GenesisData. This allows
independent tools or transaction families to include their own batches, without
needing to know anything about the genesis process.

The first implementation assumes that the order of the input batches implies
their dependencies, with each batch being implicitly dependent on the
previous one. Any dependencies should be verified when the final set of batches
is produced. This would be enforced by the use of strict ordering of the batches
during execution time. Future implementations may provide a way to verify
dependencies across input batches.

Transaction family authors who need batches included in the genesis block must
provide their own tool to produce GenesisData, with the batches they
require for the process. Each individual tool may manage its batch and
transaction dependencies explicitly within the context of its specific
genesis batches.
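
The combining step described above can be sketched in Python as follows. This
is a minimal illustration, not the implementation used by the CLI tool. The
``Transaction``, ``Batch``, and ``GenesisData`` classes are plain dataclasses
standing in for the protobuf messages, and ``combine_batches`` and
``verify_dependencies`` are hypothetical helper names. The sketch assumes the
implied-ordering rule, so a dependency is satisfied only by a transaction that
appears earlier in the combined list.

.. code-block:: python

   from dataclasses import dataclass, field
   from typing import Iterable, List


   @dataclass
   class Transaction:
       """Stand-in for a transaction: an ID plus the IDs it depends on."""
       txn_id: str
       dependencies: List[str] = field(default_factory=list)


   @dataclass
   class Batch:
       """Stand-in for the Batch protobuf message."""
       batch_id: str
       transactions: List[Transaction] = field(default_factory=list)


   @dataclass
   class GenesisData:
       """Stand-in for the GenesisData message: an ordered list of batches."""
       batches: List[Batch] = field(default_factory=list)


   def combine_batches(collections: Iterable[GenesisData]) -> GenesisData:
       """Concatenate the input collections, preserving their order."""
       combined = GenesisData()
       for collection in collections:
           combined.batches.extend(collection.batches)
       return combined


   def verify_dependencies(genesis: GenesisData) -> None:
       """Raise ValueError if a dependency is not met by an earlier txn."""
       seen = set()
       for batch in genesis.batches:
           for txn in batch.transactions:
               missing = [d for d in txn.dependencies if d not in seen]
               if missing:
                   raise ValueError(
                       "transaction {} has unsatisfied dependencies: {}"
                       .format(txn.txn_id, missing))
               seen.add(txn.txn_id)


   # Hypothetical inputs: a settings collection and an application collection.
   settings = GenesisData([Batch("b1", [Transaction("t1")])])
   app = GenesisData([Batch("b2", [Transaction("t2", ["t1"])])])

   genesis = combine_batches([settings, app])
   verify_dependencies(genesis)
   print([batch.batch_id for batch in genesis.batches])

Reversing the input order in this sketch makes ``verify_dependencies`` raise,
because ``t2`` would then run before the transaction it depends on.
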

Example
~~~~~~~

The following example configures the validator to use PoET consensus
and specifies the appropriate settings:

.. code-block:: bash

   sawset proposal create \
     -k <signing-key-file> \
     -o sawset.batch \
     sawtooth.consensus.algorithm=poet \
     sawtooth.poet.initial_wait_timer=x \
     sawtooth.poet.target_wait_time=x \
     sawtooth.poet.population_estimate_sample_size=x
   sawadm genesis \
     sawset.batch

A genesis.batch file will be written to the validator's data directory.

Block Creation
--------------

On startup, the validator would use the resulting genesis.batch file to produce
a genesis block under the following conditions:

* The genesis.batch file exists
* There is no block specified as the chain head

If either of these conditions is not met, the validator halts operation.

The validator will load the batches from the file into the pending queue. It
will then produce the genesis block through the standard process with the
following modifications.

First, the execution of the batches will be strictly in the order they have
been provided. The Executor will not attempt to reorder them, or drop failed
transactions. Any failure of a transaction in genesis.batch will fail to
produce the genesis block, and the validator will treat this as a fatal error.

Second, it will use a genesis consensus to determine block validity. At the
start of the genesis block creation process, state (the Merkle-Radix tree) will
be empty. Because the consensus mechanism is specified by a configuration
setting in state, looking up that setting will return None. As a result, the
genesis consensus mechanism will be used. This will produce a block with an
empty consensus field.
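
The startup check and the two modifications above can be sketched as follows.
This is an illustration under stated assumptions rather than the validator's
code: ``should_run_genesis``, ``execute_genesis_batches``,
``select_consensus``, and ``GenesisError`` are hypothetical names, batches are
represented by plain string IDs, and state is modeled as a dictionary.

.. code-block:: python

   from typing import Callable, Dict, List, Optional


   class GenesisError(Exception):
       """Fatal error: the genesis block could not be produced."""


   def should_run_genesis(genesis_file_exists: bool,
                          chain_head: Optional[str]) -> bool:
       """Genesis runs only if genesis.batch exists and no head is set."""
       return genesis_file_exists and chain_head is None


   def execute_genesis_batches(
           batches: List[str],
           apply_batch: Callable[[str], bool]) -> List[str]:
       """Apply batches strictly in the given order; any failure is fatal."""
       applied = []
       for batch_id in batches:
           if not apply_batch(batch_id):
               raise GenesisError("batch {} failed".format(batch_id))
           applied.append(batch_id)
       return applied


   def select_consensus(state: Dict[str, str]) -> str:
       """Fall back to the genesis consensus when state has no setting."""
       configured = state.get("sawtooth.consensus.algorithm")
       return configured if configured is not None else "genesis"


   # Empty state: the setting lookup returns None, so genesis consensus
   # is selected. After the setting exists, the configured value is used.
   state = {}
   first = select_consensus(state)
   state["sawtooth.consensus.algorithm"] = "poet"
   second = select_consensus(state)
   print(first, second)

Unlike normal block publishing, the loop in this sketch never skips or
reorders a failed batch. It raises on the first failure, which mirrors the
rule that a failed genesis transaction is a fatal error.
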

In addition to the genesis block, the blockchain ID (that is, the signature of
the genesis block) is written to the file ``block-chain-id`` in the validator’s
data directory.

Part of the production of the genesis block will require the configuration of
the consensus mechanism. The second block will then use the configured
consensus model, which will need to know how to initialize the consensus field
from an empty one. In future cases, transitions between consensus models may be
possible, as long as they know how to read the consensus field of the previous
block.

To complete the process, all necessary transaction processors must be running.
A minimum requirement is the Sawtooth Settings transaction processor,
``settings-tp``.

.. Licensed under Creative Commons Attribution 4.0 International License
.. https://creativecommons.org/licenses/by/4.0/