**********************
Transaction Scheduling
**********************

Sawtooth supports both serial and parallel scheduling of transactions. The
scheduler type is specified via a command line argument or as an option in the
validator's configuration file when the validator process is started. Both
schedulers produce the same deterministic results and are completely
interchangeable.

The parallel processing of transactions provides a performance improvement
even for fast transaction workloads by reducing the overall latency effects
which occur when transaction execution is performed serially. When
transactions are of non-uniform duration (such as may happen with more
realistic, complex workloads), the performance benefit is especially magnified
when faster transactions outnumber slower transactions.

Transaction scheduling and execution in Sawtooth correctly and efficiently
handles transactions which modify the same state addresses, including
transactions within the same block. Even at the batch level, transactions can
modify the same state addresses. Naive distributed ledger implementations may
not allow overlapping state modifications within a block, severely limiting
the performance of such transactions to one-transaction-per-block, but
Sawtooth has no such block-level restriction. Instead, state is computed
incrementally per transaction execution. This prevents double spends while
allowing multiple transactions which alter the same state values to appear in
a single block. Of course, in cases where these types of block-level
restrictions are desired, transaction families may implement the appropriate
business logic.

Scheduling within the Validator
===============================

The sawtooth-validator process has two major components which use schedulers
to calculate state changes and the resulting Merkle hashes based on
transaction processing: the Chain Controller and the Block Publisher. The
Chain Controller and the Block Publisher pass a scheduler to the Executor.
While the validator contains only a single Chain Controller, a single Block
Publisher, and a single Executor, there are numerous instances of schedulers,
which are dynamically created as needed.

Chain Controller
----------------

The Chain Controller is responsible for maintaining the current chain head (a
pointer to the last block in the current chain). Processing is block-based;
when the Chain Controller receives a candidate block (over the network or from
the Block Publisher), it determines whether the chain head should be updated
to point to that candidate block. The Chain Controller creates a scheduler to
calculate the new state and the related Merkle hash for the block being
validated. The Merkle hash is compared to the state root contained in the
block header. If they match, the block is valid from a transaction execution
and state standpoint. The Chain Controller uses this information in
combination with consensus information to determine whether to update the
current chain head.

Block Publisher
---------------

The Block Publisher is responsible for creating new candidate blocks. As
batches are received by the validator (from clients or from other validator
nodes), they are added to the Block Publisher's pending queue. Only valid
transactions will be added to the next candidate block. For timeliness,
batches are added to a scheduler as they are added to the pending queue; thus,
transactions are processed incrementally as they are received.

When the pending queue changes significantly, such as when the chain head has
been updated by the Chain Controller, the Block Publisher cancels the current
scheduler and creates a new one.

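For illustration, a minimal sketch of this scheduler lifecycle follows. The
class and method names (``SketchScheduler``, ``BlockPublisherSketch``,
``on_batch_received``, ``on_chain_head_updated``) are invented for this
example and are not the validator's actual interfaces; batches are treated as
opaque identifiers.

.. code-block:: python

    # Illustrative sketch only; these names are hypothetical and do not
    # correspond to sawtooth-validator's internal interfaces.


    class SketchScheduler:
        """Stand-in scheduler that simply collects batches until cancelled."""

        def __init__(self):
            self.batches = []
            self.cancelled = False

        def add_batch(self, batch_id):
            self.batches.append(batch_id)

        def cancel(self):
            self.cancelled = True


    class BlockPublisherSketch:
        """Schedules batches as they arrive and rebuilds the schedule when
        the pending queue changes significantly."""

        def __init__(self):
            self._pending = []
            self._scheduler = SketchScheduler()

        def on_batch_received(self, batch_id):
            # Batches are handed to the scheduler immediately, so transaction
            # execution can begin before the candidate block is assembled.
            self._pending.append(batch_id)
            self._scheduler.add_batch(batch_id)

        def on_chain_head_updated(self):
            # A significant change, such as a new chain head, causes the
            # current schedule to be cancelled and a new one to be built from
            # the still-pending batches.
            self._scheduler.cancel()
            self._scheduler = SketchScheduler()
            for batch_id in self._pending:
                self._scheduler.add_batch(batch_id)
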
Executor
--------

The Executor is responsible for the execution of transactions by sending them
to transaction processors. The overall flow for each transaction is:

1. The Executor obtains the next transaction and its initial context from the
   scheduler.
2. The Executor obtains a new context for the transaction from the Context
   Manager by providing the initial context (contexts are chained together).
3. The Executor sends the transaction and a context reference to the
   transaction processor.
4. The transaction processor updates the context's state via Context Manager
   calls.
5. The transaction processor notifies the Executor that the transaction is
   complete.
6. The Executor updates the scheduler with the transaction's result and the
   updated context.

In the case of serial scheduling, step (1) simply blocks until step (6) from
the previous transaction has completed. For the parallel scheduler, step (1)
blocks until there is a transaction which can be executed because its
dependencies have been satisfied, with steps (2) through (6) happening in
parallel for each transaction being executed.

Iterative Scheduling
====================

Each time the Executor requests the next transaction, the scheduler calculates
the next transaction dynamically based on knowledge of the transaction
dependency graph and of the transactions previously executed within this
schedule.

Serial Scheduler
----------------

For the serial scheduler, the dependency graph is straightforward; each
transaction is dependent on the one before it. The next transaction is
released only when the scheduler has received the execution results of the
transaction before it.

Parallel Scheduler
------------------

As batches are added to the parallel scheduler, predecessor transactions are
calculated for each transaction in the batch. A predecessor transaction is a
transaction which must be fully executed prior to executing the transaction
for which it is a predecessor.

Each transaction has a list of inputs and outputs; these are address
declaration fields in the transaction's header and are filled in by the client
when the transaction is created. Inputs and outputs specify which locations in
state are accessed or modified by the transaction. Predecessor transactions
are determined using these inputs/outputs declarations.

.. note::

   It is possible for poorly written clients to impact parallelism by
   providing overly broad inputs/outputs declarations. Transaction processor
   implementations can enforce specific inputs/outputs requirements to
   provide an incentive for correct client behavior.

The parallel scheduler calculates predecessors using a Merkle-Radix tree whose
nodes are addressable by state addresses or namespaces. This tree is called
the predecessor tree. Input declarations are treated as reads and output
declarations as writes. By keeping track of the readers and writers within the
nodes of the tree, the predecessors of a transaction can be determined
quickly.

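As a rough illustration of how these inputs/outputs declarations translate
into predecessor relationships, the following sketch tracks prior readers and
writers per exact address in plain dictionaries; the actual scheduler uses the
Merkle-Radix predecessor tree described above so that namespace prefixes of
differing lengths can also be matched. The function name and data layout are
assumptions made for this example.

.. code-block:: python

    # Hypothetical sketch: exact-address matching only, with no radix tree.
    from collections import defaultdict


    def compute_predecessors(transactions):
        """Given (txn_id, inputs, outputs) triples in order, return a dict
        mapping each txn_id to the earlier transactions it must wait for
        (read-after-write, write-after-read, write-after-write)."""
        readers = defaultdict(set)   # address -> readers since last write
        last_writer = {}             # address -> most recent writer
        predecessors = {}

        for txn_id, inputs, outputs in transactions:
            preds = set()
            for address in inputs:        # a read waits for the last writer
                if address in last_writer:
                    preds.add(last_writer[address])
            for address in outputs:       # a write waits for prior readers
                preds |= readers[address]   # ... and for the last writer
                if address in last_writer:
                    preds.add(last_writer[address])
            predecessors[txn_id] = preds

            for address in inputs:
                readers[address].add(txn_id)
            for address in outputs:
                last_writer[address] = txn_id
                # Earlier readers are now ordered transitively through this
                # write, so they no longer need to be tracked directly.
                readers[address] = set()

        return predecessors


    txns = [
        ('pay_1', {'acct/alice'}, {'acct/alice', 'acct/bob'}),
        ('pay_2', {'acct/carol'}, {'acct/carol', 'acct/dave'}),
        ('pay_3', {'acct/bob'},   {'acct/bob', 'acct/carol'}),
    ]
    # pay_1 and pay_2 touch disjoint addresses, so neither has predecessors
    # and they may execute in parallel; pay_3 reads/writes addresses written
    # by both, so it must wait for pay_1 and pay_2.
    print(compute_predecessors(txns))
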
Unlike the serial scheduler, the order in which transactions will be returned
to the Executor is not predetermined. The parallel scheduler is careful about
which transactions are returned; only transactions which do not have state
conflicts will be executed in parallel. When the Executor asks for the next
transaction, the scheduler inspects the list of unscheduled transactions; the
first one in the list for which all predecessors have finished executing will
be returned. If none are found, the scheduler blocks and checks again after a
transaction has finished executing.

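This selection behavior can be modeled roughly as follows. It is a simplified
sketch rather than the validator's implementation; the class, its method
names, and the use of a condition variable for blocking are assumptions made
for illustration.

.. code-block:: python

    # Simplified model of "return the first unscheduled transaction whose
    # predecessors have all finished"; not sawtooth-validator code.
    import threading


    class ParallelSchedulerSketch:
        def __init__(self, predecessors):
            # predecessors: dict mapping txn_id -> set of txn_ids that must
            # finish executing before txn_id may be returned.
            self._predecessors = predecessors
            self._unscheduled = list(predecessors)  # insertion order
            self._completed = set()
            self._condition = threading.Condition()

        def next_transaction(self):
            """Block until an unscheduled transaction has all of its
            predecessors completed, then return it (None when exhausted)."""
            with self._condition:
                while self._unscheduled:
                    for txn_id in self._unscheduled:
                        if self._predecessors[txn_id] <= self._completed:
                            self._unscheduled.remove(txn_id)
                            return txn_id
                    # Nothing is runnable yet; wait until a transaction
                    # finishes executing, then re-check the list.
                    self._condition.wait()
                return None

        def on_transaction_completed(self, txn_id):
            # Called with each execution result; wakes any thread blocked
            # in next_transaction() so it can re-check predecessors.
            with self._condition:
                self._completed.add(txn_id)
                self._condition.notify_all()

In the validator, the Executor plays the role of the caller here: it
repeatedly requests the next transaction and reports results back to the
scheduler, as in steps (1) and (6) of the flow above.

.. Licensed under Creative Commons Attribution 4.0 International License
.. https://creativecommons.org/licenses/by/4.0/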