*******
Journal
*******

The Journal is responsible for maintaining and extending the blockchain for the
validator. This responsibility involves validating candidate blocks, evaluating
valid blocks to determine if they are the correct chain head, and generating
new blocks to extend the chain.

The Journal is the consumer of Blocks and Batches that arrive at the validator.
These Blocks and Batches arrive via interconnect, either through the gossip
protocol or the REST API. The newly-arrived Blocks and Batches are sent to the
Journal, which routes them internally.

.. image:: ../images/journal_organization.*
   :width: 80%
   :align: center
   :alt: Journal Organization Diagram

The Journal divides the processing of Blocks and Batches into different
pipelines. Both objects are delivered initially to the Completer, which
guarantees that all dependencies for the Blocks and Batches have been satisfied
and delivered downstream. Completed Blocks are delivered to the
ChainController for validation and fork resolution. Completed Batches are
delivered to the BlockPublisher for validation and inclusion in a Block.

The Journal is designed to be asynchronous, allowing incoming blocks to be
processed in parallel by the ChainController, as well as allowing the
BlockPublisher to proceed with claiming blocks even when the incoming block
rate is high.

It is also flexible enough to accept different consensus algorithms.  The
Journal implements a consensus interface that defines the entry points and
responsibilities of a consensus algorithm.

The BlockStore
==============

The BlockStore contains all the blocks in the current blockchain - that is, the
list of blocks from the current chain head back to the Genesis block. Blocks
from forks are not included in the BlockStore. The BlockStore also includes a
reference to the head of the current chain. It is expected to be coherent at
all times, and an error in the BlockStore is considered a non-recoverable error
for the validator. Such critical errors would include missing blocks, bad
indexes, a missing chain reference, incomplete blocks, or invalid blocks in the
store. The BlockStore provides an atomic means to update the store when the
current fork is changed (the chain head is updated).

The BlockStore is a persistent on-disk store of all Blocks in the current
chain. When the validator is started, the contents of the BlockStore are
trusted to be the current state of the world.

All blocks stored here are formally complete. The BlockStore allows blocks to
be accessed via Block ID. Blocks can also be accessed via Batch ID,
Transaction ID, or block number; for example, ``get_block_by_batch_id``,
``get_block_by_transaction_id``, ``get_batch_by_transaction``, or
``get_block_by_number``.

The BlockStore maintains internal mappings of Transaction-to-Block and
Batch-to-Block. These may be rebuilt if missing or corrupt. This rebuild should
be done during startup, and not during the course of normal operation. These
mappings should be stored in a format that is cached to disk, so they are not
required to be held in memory at all times. As the blockchain grows, these will
become quite large.

The BlockStore provides an atomic method for updating the current head of the
chain. In order for the BlockStore to switch forks, it is provided with a list
of blocks in the new chain to commit, and a list of blocks in the old chain to
decommit. These lists are the blocks in each fork back to the common root.
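
The fork switch described above can be sketched as follows. This is a minimal
in-memory illustration, not the validator's implementation: the
``SketchBlockStore`` class, its dictionary-based index, the block layout
(plain dictionaries), and the ``update_chain`` signature are all assumptions
made for this example.

.. code-block:: python

    class SketchBlockStore:
        """Hypothetical in-memory stand-in for the on-disk BlockStore."""

        def __init__(self):
            self.blocks = {}        # block_id -> block (a plain dict here)
            self.batch_index = {}   # batch_id -> block_id
            self.chain_head = None  # block_id of the current chain head

        def update_chain(self, new_chain, old_chain=()):
            """Commit new_chain and decommit old_chain as one step.

            Both lists are ordered head-first and stop just before the
            common root of the two forks.
            """
            # Work on copies so a failure leaves the store unchanged.
            blocks = dict(self.blocks)
            batch_index = dict(self.batch_index)
            for block in old_chain:
                del blocks[block["id"]]
                for batch_id in block["batch_ids"]:
                    del batch_index[batch_id]
            for block in new_chain:
                blocks[block["id"]] = block
                for batch_id in block["batch_ids"]:
                    batch_index[batch_id] = block["id"]
            # Swap in the new state only after every change succeeded.
            self.blocks, self.batch_index = blocks, batch_index
            self.chain_head = new_chain[0]["id"]

        def get_block_by_batch_id(self, batch_id):
            return self.blocks[self.batch_index[batch_id]]

Calling ``update_chain([c], [b])`` on a store whose head is ``b`` replaces
``b`` with ``c`` and moves the head in a single step, so a reader never
observes a half-switched fork.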

The BlockCache
==============

The BlockCache holds the working set of blocks for the validator and tracks the
processing state. This processing state is tracked as valid, invalid, or
unknown. Valid blocks are blocks that have been proven to be valid by the
ChainController. Invalid blocks are blocks that failed validation or have an
invalid block as a predecessor. Unknown blocks are blocks that have not yet
completed validation, usually having just arrived from the Completer.

The BlockCache is an in-memory construct. It is rebuilt on demand when the
system is started.

If a block is not present in the BlockCache, the BlockCache looks in the
BlockStore for the block. If it is not found or the lookup fails, the block is
unknown to the system. If the block is found in the BlockStore, it is loaded
into the BlockCache and marked as valid. All blocks in the BlockStore are
considered valid.

The BlockCache keeps blocks that are currently relevant, tracked by the last
time the block was accessed. Periodically, the blocks that have not been
accessed recently are purged from the BlockCache, but only if none of the
other blocks in the BlockCache reference those blocks as predecessors.
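
The lookup and purge rules above can be illustrated with a small sketch. The
class name, the ``keep_time`` parameter, the entry layout, and the use of a
plain mapping as the block store are assumptions for this example only.

.. code-block:: python

    import time

    VALID, INVALID, UNKNOWN = "valid", "invalid", "unknown"


    class SketchBlockCache:
        """Hypothetical in-memory cache that falls back to a block store."""

        def __init__(self, block_store, keep_time=300.0):
            self.block_store = block_store  # mapping: block_id -> block
            self.keep_time = keep_time      # seconds before a block may be purged
            self.entries = {}               # block_id -> [block, status, last_access]

        def put(self, block, status=UNKNOWN, now=None):
            now = time.time() if now is None else now
            self.entries[block["id"]] = [block, status, now]

        def get(self, block_id, now=None):
            now = time.time() if now is None else now
            if block_id not in self.entries:
                block = self.block_store.get(block_id)
                if block is None:
                    return None  # the block is unknown to the system
                # Everything in the BlockStore is considered valid.
                self.entries[block_id] = [block, VALID, now]
            entry = self.entries[block_id]
            entry[2] = now  # record the access time
            return entry[0]

        def purge_expired(self, now=None):
            now = time.time() if now is None else now
            # Blocks named as a predecessor by any cached block are kept.
            referenced = {e[0]["previous_id"] for e in self.entries.values()}
            for block_id, (_, _, last_access) in list(self.entries.items()):
                expired = now - last_access > self.keep_time
                if expired and block_id not in referenced:
                    del self.entries[block_id]

Because the set of referenced predecessors is computed once per purge, a
predecessor is removed no earlier than the purge after its last successor was
removed. This is a conservative simplification chosen for the sketch.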

The Completer
=============

The Completer is responsible for making sure Blocks and Batches are complete
before they are delivered. Blocks are considered formally complete once all of
their predecessors have been delivered to the ChainController and their batches
field contains all the Batches specified in the BlockHeader’s batch_ids list.
The batches field is also expected to be in the same order as the batch_ids.
Once Blocks are formally complete they are delivered to the ChainController for
validation.

A Batch is considered complete once all of the transactions it depends on exist
in the current chain or have been delivered to the BlockPublisher.

All Blocks and Batches will have a timeout for being completed. After the
initial request for the missing dependencies is sent, if the response is not
received within the specified time window, they are dropped.

If you have a new block of unknown validity, you must ensure that its
predecessors have been delivered to the journal. If a predecessor is not
delivered on request to the journal in a reasonable amount of time, the new
block cannot be validated.

Consider the case where you have the chain A->B->C:

If C arrives and B is not in the BlockCache, the validator will request B. If
the request for B times out, the C block is dropped.

If D later arrives with predecessor C, of chain A->B->C->D, the Completer
will request C from the network and, once C arrives, will request B again.
If B arrives this time, then the new chain will be delivered to the
ChainController, where the blocks will be checked for validity and considered
for becoming the chain head.
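
The A->B->C->D scenario above can be traced with a small sketch. It covers
only predecessor tracking for blocks (batch completion is left out), models a
block as a ``(block_id, previous_id)`` tuple, and uses hypothetical callback
names; none of this mirrors the real Completer API.

.. code-block:: python

    class SketchCompleter:
        """Hypothetical block completer that tracks missing predecessors."""

        def __init__(self, known_ids, request_block, deliver):
            self.known = set(known_ids)  # blocks already delivered downstream
            self.waiting = {}            # missing predecessor id -> [blocks]
            self.request_block = request_block
            self.deliver = deliver

        def add_block(self, block):
            _, previous_id = block
            if previous_id not in self.known:
                # Hold the block and ask the network for its predecessor.
                self.waiting.setdefault(previous_id, []).append(block)
                self.request_block(previous_id)
                return
            ready = [block]
            while ready:
                current = ready.pop()
                self.known.add(current[0])
                self.deliver(current)
                # Release any blocks that were waiting on this one.
                ready.extend(self.waiting.pop(current[0], []))

        def timeout(self, missing_id):
            """Drop every block that was waiting on missing_id."""
            self.waiting.pop(missing_id, None)


    requested, delivered = [], []
    completer = SketchCompleter({"A"}, requested.append, delivered.append)
    completer.add_block(("C", "B"))  # B is missing, so B is requested
    completer.timeout("B")           # the request times out and C is dropped
    completer.add_block(("D", "C"))  # C is requested
    completer.add_block(("C", "B"))  # B is requested again
    completer.add_block(("B", "A"))  # B, C, and D are delivered in order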

The Consensus Interface
=======================

In the spirit of configurability, the Journal supports
:term:`dynamic consensus algorithms<Dynamic consensus>`
that can be changed via the Settings transaction family. The
initial selection of a consensus algorithm is set for the chain in the genesis
block during genesis (described below). This may be changed during the course of
a chain's lifetime. The Journal and its consensus interface support dynamic
consensus for probabilistic finality algorithms like Proof of Work, as well as
algorithms with absolute finality like PBFT.

The services that a consensus algorithm provides to the Journal are divided
into three distinct interfaces that have specific lifetimes and access to
information.

1. Consensus.BlockPublisher
2. Consensus.BlockVerifier
3. Consensus.ForkResolver

Consensus algorithm implementations in Sawtooth must implement all of the
consensus interfaces. Each of these objects is provided read-only access to
the BlockCache and GlobalState.

Consensus.BlockPublisher
------------------------

An implementation of the interface Consensus.BlockPublisher is used by the
BlockPublisher to create new candidate blocks to extend the chain. The
Consensus.BlockPublisher is provided access to a read-only view of global
state, a read-only view of the BlockStore, and an interface to publish batches.

Three events are called on the Consensus.BlockPublisher:

1. initialize_block - The BlockHeader is provided for the candidate block. This
   is called immediately after the block_header is initialized and allows for
   validation of the consensus algorithm's internal state, checks if the header
   is correct, checks if according to the consensus rules a block could be
   published, and checks if any initialization of the block_header is required.
   If this function fails no candidate block is created and the BlockPublisher
   will periodically attempt to create new blocks.
2. check_publish_block - Periodically, polling is done to check if the block can
   be published. In the case of PoET, this is a check to see if the wait time
   has expired, but it could be based on any other criteria the consensus
   algorithm has for determining if it is time to publish a block. When this
   returns true the BlockPublisher will proceed in creating the block.
3. finalize_block - Once check_publish_block has confirmed it is time to
   publish a block, the block header is considered complete, except for the
   consensus information. The BlockPublisher calls finalize_block with the
   completed block_header allowing the consensus field to be filled out.
   Afterwards, the BlockPublisher signs the block and broadcasts it to the
   network.

This implementation needs to take special care to handle the genesis block
correctly. During genesis operation, the Consensus.BlockPublisher will be called
to initialize and finalize a block so that it can be published on the chain
(see below).
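
The three events can be pictured as an abstract interface plus a toy
implementation. Only the method names come from the description above. The
signatures, the dictionary used as a block header, and the fixed-delay rule
are assumptions for illustration, and the toy rule is not PoET.

.. code-block:: python

    import time
    from abc import ABC, abstractmethod


    class SketchBlockPublisherInterface(ABC):
        """Hypothetical shape of Consensus.BlockPublisher."""

        @abstractmethod
        def initialize_block(self, block_header):
            """Return True if a candidate block may be built on this header."""

        @abstractmethod
        def check_publish_block(self, block_header):
            """Return True once the candidate block may be published."""

        @abstractmethod
        def finalize_block(self, block_header):
            """Fill in the consensus field and return True on success."""


    class FixedDelayPublisher(SketchBlockPublisherInterface):
        """Toy consensus rule: publish once a fixed delay has elapsed."""

        def __init__(self, delay, clock=time.monotonic):
            self.delay = delay
            self.clock = clock
            self.started = None

        def initialize_block(self, block_header):
            self.started = self.clock()
            return True

        def check_publish_block(self, block_header):
            return self.clock() - self.started >= self.delay

        def finalize_block(self, block_header):
            block_header["consensus"] = b"fixed-delay"
            return True

A publisher loop would call ``initialize_block`` once, poll
``check_publish_block`` until it returns true, and then call
``finalize_block`` before signing and broadcasting the block.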

Consensus.BlockVerifier
-----------------------

The Consensus.BlockVerifier implementation provides Block verification services
to the BlockValidator. This gives the consensus algorithm an opportunity to
check whether the candidate block was published following the consensus rules.

Consensus.ForkResolver
----------------------

The consensus algorithm is responsible for fork resolution on the system.
Depending on the consensus algorithm, the determination of the valid block to
become the chain head will differ. In a Bitcoin Proof of Work consensus, this
will be the longest chain, whereas PoET uses the measure of aggregate local
mean (a measure of the total amount of time spent waiting) to determine the
valid fork. Consensus algorithms with finality, such as PBFT, will only ever
produce blocks that extend the current head. These algorithms will never have
forks to resolve. The ForkResolver for these algorithms with finality will
always select the new block that extends the current head.
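
As a rough illustration of how fork resolution rules can differ, the two
functions below compare a candidate head against the current head. The
function names and the dictionary fields are hypothetical, and the
"longest chain" rule is reduced to a comparison of block numbers.

.. code-block:: python

    def longest_chain_resolver(current_head, new_head):
        """Toy 'longest chain' rule: prefer the higher block number."""
        return new_head["block_num"] > current_head["block_num"]


    def finality_resolver(current_head, new_head):
        """With absolute finality, accept only a block that extends the head."""
        return new_head["previous_id"] == current_head["id"]


    head = {"id": "C", "previous_id": "B", "block_num": 3}
    fork_tip = {"id": "D2", "previous_id": "C2", "block_num": 4}
    extension = {"id": "D", "previous_id": "C", "block_num": 4}

    # A longer competing fork wins under the first rule only.
    switch_to_fork = longest_chain_resolver(head, fork_tip)
    accept_fork = finality_resolver(head, fork_tip)
    accept_extension = finality_resolver(head, extension)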

The ChainController
===================

The ChainController is responsible for determining which chain the validator is
currently on and coordinating any change-of-chain activities that need to
happen.

The ChainController is designed to be able to handle multiple block validation
activities simultaneously. For instance, if multiple forks form on the network,
the ChainController can process blocks from all of the competing forks
simultaneously. This is advantageous as it allows progress to be made even when
there are several deep forks competing. The current chain can also be advanced
while a deep fork is being evaluated. This was implemented for cases that could
happen if a group of validators lost connectivity with the network and later
rejoined.

.. note::

  Currently, the thread pool size is set to 1, so only one Block is validated
  at a time.

Here is the basic flow of the ChainController as a single block is processed.

.. image:: ../images/journal_chain_controller.*
   :width: 80%
   :align: center
   :alt: Journal Chain Controller Diagram

When a block arrives, the ChainController creates a BlockValidator and
dispatches it to a thread pool for execution.  Once the BlockValidator has
completed, it calls back to the ChainController indicating whether the new
block should be the chain head. This indication falls into three cases:

1. The chain head has been updated since the BlockValidator was created. In
   this case a new BlockValidator is created and dispatched to redo the fork
   resolution.
2. The new Block should become the chain head. In this case the chain head is
   updated to be the new block.
3. The new Block should not become the chain head. This could be because the
   new Block is part of a chain that has an invalid block in it, or it is a
   member of a shorter or less desirable fork as determined by consensus.

The ChainController synchronizes chain head updates such that only one
BlockValidator result can be processed at a time. This is to prevent the race
condition of multiple fork resolution processes attempting to update the chain
head at the same time.
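
The three cases and the synchronization rule can be sketched as a callback
guarded by a lock. The class, method, and parameter names are hypothetical,
blocks are represented by their IDs, and ``submit_validation`` is assumed to
hand work to a thread pool and return immediately.

.. code-block:: python

    import threading


    class SketchChainController:
        """Hypothetical controller that serializes chain head updates."""

        def __init__(self, chain_head, submit_validation):
            self.chain_head = chain_head
            self._lock = threading.Lock()
            self._submit = submit_validation  # callable(block, chain_head)

        def on_block_received(self, block):
            self._submit(block, self.chain_head)

        def on_block_validated(self, block, head_seen, should_commit):
            # Only one validation result is processed at a time.
            with self._lock:
                if head_seen != self.chain_head:
                    # Case 1: the head moved, so redo the fork resolution.
                    self._submit(block, self.chain_head)
                elif should_commit:
                    # Case 2: the new block becomes the chain head.
                    self.chain_head = block
                # Case 3: otherwise the head is left unchanged.


    submitted = []
    controller = SketchChainController(
        "A", lambda block, head: submitted.append((block, head)))
    controller.on_block_received("B")
    controller.on_block_received("B2")
    controller.on_block_validated("B", "A", True)    # case 2
    controller.on_block_validated("B2", "A", True)   # case 1: head is now B
    controller.on_block_validated("B2", "B", False)  # case 3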

Chain Head Update
-----------------

When the chain needs to be updated, the ChainController does an update of the
ChainHead using the BlockStore, providing it with the list of commit blocks
that are in the new fork and a list of decommit blocks that are in the
BlockStore, which must be removed. After the BlockStore is updated, the
BlockPublisher is notified that there is a new ChainHead.

Delayed Block Processing
------------------------

While the ChainController does Block validation in parallel, there are cases
where the ChainController will serialize Block validation. These cases are when
a Block is received and any of its predecessors are still being validated. In
this case the validation of the predecessor is completed before the new block is
scheduled. This is done to avoid redoing the validation work of the predecessor
Block. Since the predecessor must be validated prior to the new Block, the delay
is inconsequential to the outcome.
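
One way to picture this serialization is a scheduler that parks a block while
its predecessor is still busy. The class and method names below are
hypothetical, and blocks are represented by their IDs.

.. code-block:: python

    class SketchValidationScheduler:
        """Hypothetical scheduler that delays blocks with busy predecessors."""

        def __init__(self, submit):
            self.submit = submit  # callable(block_id) that starts validation
            self.busy = set()     # ids being validated or waiting to be
            self.waiting = {}     # predecessor id -> [waiting block ids]

        def on_block_received(self, block_id, previous_id):
            self.busy.add(block_id)
            if previous_id in self.busy:
                # The predecessor is not finished yet, so hold this block.
                self.waiting.setdefault(previous_id, []).append(block_id)
            else:
                self.submit(block_id)

        def on_validation_done(self, block_id):
            self.busy.discard(block_id)
            for successor in self.waiting.pop(block_id, []):
                self.submit(successor)


    started = []
    scheduler = SketchValidationScheduler(started.append)
    scheduler.on_block_received("B", "A")  # starts immediately
    scheduler.on_block_received("C", "B")  # waits for B
    scheduler.on_block_received("D", "C")  # waits for C
    scheduler.on_block_received("X", "A")  # unrelated fork, starts immediately
    scheduler.on_validation_done("B")      # releases C
    scheduler.on_validation_done("C")      # releases D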

The BlockValidator
------------------

The BlockValidator is a subcomponent of the ChainController that is responsible
for Block validation and fork resolution. When the BlockValidator is
instantiated, it is given the candidate Block to validate and the current chain
head.

During processing, if a Block is marked as invalid it is discarded, never to be
considered again. The only way to have the Block reconsidered is by flushing the
BlockCache, which can be done by restarting the validator.

The BlockValidator has three stages of evaluation.

1. Determine the common root of the fork (ForkRoot). This is done by walking the
   chain back from the candidate and the chain head until a common block is
   found. The ForkRoot can be the ChainHead in the case that the Candidate is
   advancing the existing chain. The only case that the ForkRoot will not be
   found is if the Candidate is from another Genesis. If this is the case, the
   Candidate and all of its predecessors are marked as Invalid and discarded.
   During this step, an ordered list of both chains is built back to the
   ForkRoot.
2. The Candidate chain is validated. This process walks forward from the
   ForkRoot and applies block validation rules (described below) to each Block
   successively. If any block fails validation, it and all of its successors
   are marked as Invalid (Valid Blocks are defined as having Valid
   predecessor(s)). Once the Candidate is successfully Validated and marked as
   Valid, the Candidate is ready for Fork Resolution.
3. Fork resolution requires a determination to be made if the Candidate should
   replace the ChainHead and is deferred entirely to the consensus
   implementation. Once the Consensus determines whether the block is the new
   ChainHead, the answer is returned to the ChainController, which updates the
   BlockStore.  If it is not the new ChainHead, the Candidate is dropped.
   Additionally, if the Candidate is to become the ChainHead, the list of
   transactions committed in the new chain back to the common root is computed
   and the same list is computed on the current chain. This information helps
   the BlockPublisher update its pending batch list when the chain is updated.
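
The first stage, walking both chains back to the ForkRoot, can be sketched as
below. The function name and the ``blocks`` mapping of
``block_id -> (block_num, previous_id)`` are assumptions for this example.

.. code-block:: python

    def find_fork_root(blocks, candidate_id, head_id):
        """Walk both chains back until they meet.

        Returns (fork_root_id, new_chain, current_chain). Both lists are
        ordered head-first and exclude the root. fork_root_id is None when
        the chains never meet, for example with a different genesis block.
        """
        new_chain, current_chain = [], []
        new_id, cur_id = candidate_id, head_id
        while new_id != cur_id:
            if new_id is None or cur_id is None:
                return None, new_chain, current_chain
            # Step back on whichever chain is currently at the higher block.
            if blocks[new_id][0] >= blocks[cur_id][0]:
                new_chain.append(new_id)
                new_id = blocks[new_id][1]
            else:
                current_chain.append(cur_id)
                cur_id = blocks[cur_id][1]
        return new_id, new_chain, current_chain


    blocks = {
        "G": (0, None), "A": (1, "G"), "B": (2, "A"), "C": (3, "B"),
        "B2": (2, "A"), "C2": (3, "B2"), "D2": (4, "C2"),
        "G9": (0, None), "X": (1, "G9"),
    }
    fork = find_fork_root(blocks, "D2", "C")
    foreign = find_fork_root(blocks, "X", "A")

Here ``fork`` identifies ``A`` as the ForkRoot, with ``D2``, ``C2``, ``B2`` to
commit and ``C``, ``B`` to decommit, while ``foreign`` has no ForkRoot because
``X`` descends from a different genesis block.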

Block Validation
----------------

Block validation has the following steps, which are always run in order. If any
validation step fails, processing stops and the Block is marked as Invalid.

1. **Transaction Permissioning** - On-chain transaction permissions are
   checked to see who is allowed to submit transactions and batches.

#. **On-chain Block Validation Rules** - The on-chain block validation rules
   are checked to ensure that the Block doesn't invalidate any of the
   rules stored at ``sawtooth.validator.block_validation_rules``.

#. **Batches Validation** - All of the Batches in the block are sent in order
   to a Transaction Scheduler for validation. If any Batches fail validation,
   this block is marked as invalid. Note: Batch and Signature verification is
   done on receipt of the Batch prior to it being routed to the Journal. The
   batches are checked for the following:

    * No duplicate Batches
    * No duplicate Transactions
    * Valid Transaction dependencies
    * Successful Batch Execution

#. **Consensus Verification** - The Block is given to the Consensus instance
   for verification. Consensus block verification is done by the consensus
   algorithm using its own rules.

#. **State Hash Check** - The StateRootHash generated by validating the block is
   checked against the StateRootHash (state_root_hash field in the BlockHeader)
   on the block. They must match for the block to be valid.

If the block is computed to be valid, then the StateRootHash is committed to
the store.
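
The "run in order, stop at the first failure" rule can be sketched with an
ordered list of checks. Only two simplified checks are shown, and every name
here, including the ``computed_state_root`` field, is a placeholder chosen for
the example rather than part of the validator.

.. code-block:: python

    def no_duplicate_batches(block):
        batch_ids = block["batch_ids"]
        return len(batch_ids) == len(set(batch_ids))


    def state_hash_matches(block):
        # "computed_state_root" stands in for the hash produced by execution.
        return block["computed_state_root"] == block["state_root_hash"]


    def validate_block(block, checks):
        """Run the checks in order and stop at the first failure.

        Returns the name of the failing check, or None if the block is valid.
        """
        for name, check in checks:
            if not check(block):
                block["status"] = "invalid"
                return name
        block["status"] = "valid"
        return None


    CHECKS = [
        ("batches", no_duplicate_batches),
        ("state_hash", state_hash_matches),
    ]

    duplicated = {
        "batch_ids": ["b1", "b1"],
        "computed_state_root": "abc",
        "state_root_hash": "xyz",
    }
    failed_step = validate_block(duplicated, CHECKS)

The duplicated batch is reported first, and the state hash check is never
reached for that block.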

The BlockPublisher
==================

The BlockPublisher is responsible for creating candidate blocks to extend the
current chain. The BlockPublisher does all of the housekeeping work around
creating a block but takes direction from the consensus algorithm for when to
create a block and when to publish a block.

The BlockPublisher follows this logic flow:

.. image:: ../images/journal_block_publisher_flow.*
   :width: 80%
   :align: center
   :alt: Journal Block Publisher Diagram

At each processing stage, the consensus algorithm has a chance to inspect and
confirm the validity of the block.

During CreateBlock, an instance of Consensus.BlockPublisher is created that is
responsible for guiding the creation of this candidate block.  Also, a
TransactionScheduler is created and all of the pending Batches are submitted to
it.

A delay is employed in the checking loop to ensure that there is time for the
batch processing to occur.

Genesis Operation
=================

The Journal supports Genesis operation. This is the action of creating a root of
the chain (the Genesis block) when the block store is empty. This operation is
necessary for bootstrapping a validator network with the desired consensus
model, any deployment-specific configuration settings, as well as any
genesis-time transactions for an application's Transaction Family.

Genesis Batch Creation
----------------------

The CLI tool produces batches in a file, which will be consumed by the
validator on startup (when starting with an empty chain).

The file contains a protobuf-encoded list of batches:

.. code-block:: protobuf
        :caption: File: sawtooth-core/protos/genesis.proto

        message GenesisData {
            repeated Batch batches = 1;
        }

The tool should take multiple input batch collections and combine them
into the single list of batches contained in GenesisData. This allows
independent tools or transaction families to include their own batches, without
needing to know anything about the genesis process.

The first implementation assumes that the order of the input batches implies
dependencies, with each batch being implicitly dependent on the
previous.  Any dependencies should be verified when the final set of batches is
produced.  This would be enforced by the use of strict ordering of the batches
during execution time.  Future implementations may provide a way to verify
dependencies across input batches.

Transaction family authors who need their own batches included in the genesis
block must provide their own tool to produce GenesisData containing the batches
they require. Each individual tool may manage its batch and transaction
dependencies explicitly within the context of its specific genesis batches.

Example
~~~~~~~

The following example configures the validator to use PoET consensus
and specifies the appropriate settings:

.. code-block:: bash

        # Create a batch of settings proposals (replace each x with a value)
        sawset proposal create \
          -k <signing-key-file> \
          -o sawset.batch \
          sawtooth.consensus.algorithm=poet \
          sawtooth.poet.initial_wait_timer=x \
          sawtooth.poet.target_wait_time=x \
          sawtooth.poet.population_estimate_sample_size=x

        # Combine the input batches into the genesis batch file
        sawadm genesis \
          sawset.batch

A genesis.batch file will be written to the validator's data directory.

Block Creation
--------------

On startup, the validator uses the resulting genesis.batch file to produce
a genesis block under the following conditions:

* The genesis.batch file exists
* There is no block specified as the chain head

If either of these conditions is not met, the validator halts operation.

The validator will load the batches from the file into the pending queue.  It
will then produce the genesis block through the standard process with the
following modifications.

First, the execution of the batches will be strictly in the order they have
been provided.  The Executor will not attempt to reorder them, or drop failed
transactions.  Any failure of a transaction in genesis.batch will prevent the
genesis block from being produced, and the validator will treat this as a
fatal error.

Second, it will use a genesis consensus to determine block validity. At the
start of the genesis block creation process, state (the Merkle-Radix tree) will
be empty. Given that the consensus mechanism is specified by a configuration
setting in the state, looking up that setting will return None.  As a result,
the genesis consensus mechanism will be used. This will produce a block with an
empty consensus field.

In addition to the genesis block, the blockchain ID (that is, the signature of
the genesis block) is written to the file ``block-chain-id`` in the validator’s
data directory.

Part of the production of the genesis block will require the configuration of
the consensus mechanism. The second block will then use the configured
consensus model, which will need to know how to initialize the consensus field
from an empty one.  In future cases, transitions between consensus models may be
possible, as long as they know how to read the consensus field of the previous
block.

To complete the process, all necessary transaction processors must be running.
A minimum requirement is the Sawtooth Settings transaction processor,
``settings-tp``.

.. Licensed under Creative Commons Attribution 4.0 International License
.. https://creativecommons.org/licenses/by/4.0/