github.com/onflow/flow-go@v0.35.7-crescendo-preview.23-atree-inlining/engine/common/follower/compliance.go

package follower

import (
	"github.com/onflow/flow-go/model/flow"
	"github.com/onflow/flow-go/module"
)

// complianceCore interface describes the follower's compliance core logic. Slightly simplified, the
// compliance layer ingests incoming untrusted blocks from the network, filters out all invalid blocks,
// extends the protocol state with the valid blocks, and lastly pipes the valid blocks to the HotStuff
// follower. Conceptually, the algorithm proceeds as follows:
//
//  1. _light_ validation of the block header:
//     - check that the block's proposer is the legitimate leader for the respective view
//     - verify the leader's signature
//     - verify the QC within the block
//     - verify whether a TC should be included and check the TC
//
//     Optimization for fast catchup:
//     Honest nodes that we synchronize blocks from supply those blocks in sequentially connected order.
//     This allows us to only validate the highest QC of such a sequence. A QC proves validity of the
//     referenced block as well as all its ancestors. The only other detail we have to verify is that the
//     block hashes match the ParentID in their respective child.
//     To utilize this optimization, we require that the input `connectedRange` is a continuous sequence
//     of blocks, i.e. connectedRange[i] is the parent of connectedRange[i+1].
//
//  2. All blocks that pass the light validation go into a size-limited cache with random ejection policy.
//     Under happy-path operation this cache should not run full, as we prune it by finalized view.
//
//  3. Only certified blocks pass the cache [Note: this is the reason why we need to validate the QC].
//     This caching strategy provides the first line of defence:
//     - Broken blocks from malicious leaders do not pass this cache, as they will never get certified.
//     - Hardening [heuristic] against spam via block synchronization:
//     TODO: implement
//     We differentiate between two scenarios: (i) the blocks are _all_ already known, i.e. a no-op from
//     the cache's perspective vs (ii) there were some previously unknown blocks in the batch. If and only
//     if there is new information (case ii), we pass the certified blocks to step 4. In case of (i),
//     this is completely redundant information (idempotent), and hence we just exit early.
//     Thereby, the only way for a spamming node to load our higher-level logic is to include
//     valid pending yet previously unknown blocks (very few generally exist in the system).
//
//  4. All certified blocks are passed to the PendingTree, which constructs a graph of all blocks
//     with view greater than the latest finalized block [Note: graph-theoretically this is a forest].
//
//  5. In a nutshell, the PendingTree tracks which blocks have already been connected to the latest finalized
//     block. When adding certified blocks to the PendingTree, it detects additional blocks now connecting
//     to the latest finalized block. More formally, the PendingTree locally tracks the tree of blocks rooted
//     on the latest finalized block. When new vertices (i.e. certified blocks) are added to the tree,
//     they move on to step 6. Blocks entering step 6 are guaranteed to be in 'parent-first order', i.e.
//     they connect to already known blocks. Disconnected blocks remain in the PendingTree until they are
//     pruned by the latest finalized view.
//
//  6. All blocks entering this step are guaranteed to be valid (as they are confirmed to be certified in
//     step 3). Furthermore, we know they connect to previously processed blocks.
//
// On the one hand, step 1 includes CPU-intensive cryptographic checks. On the other hand, it is very well
// parallelizable. In comparison, the cost of steps 2 and 3 is negligible. Therefore, we can have multiple
// worker routines: a worker takes a batch of blocks and runs it through steps 1, 2, and 3. The blocks that
// come out of step 3 are queued in a channel for further processing.
//
// The PendingTree (step 4) requires very little CPU. Step 5 is a database write populating many indices,
// to extend the protocol state. Step 6 is only a queuing operation, with vanishing cost. There is little
// benefit to parallelizing state extension, because under normal operations forks are rare and knowing
// the full ancestry is required for the protocol state. Therefore, we have a single thread to extend
// the protocol state with new certified blocks.
//
// Notes:
//   - At the moment, this interface exists to facilitate testing. Specifically, it allows testing
//     the ComplianceEngine with a mock of complianceCore. Higher-level business logic does not
//     interact with complianceCore, because complianceCore is wrapped inside the ComplianceEngine.
//   - At the moment, we utilize this interface to also document the algorithmic design.
type complianceCore interface {
	module.Startable
	module.ReadyDoneAware

	// OnBlockRange consumes an *untrusted* range of connected blocks (part of a fork). The originID parameter
	// identifies the node that sent the batch of blocks. The input `connectedRange` must be sequentially ordered
	// blocks that form a chain, i.e. connectedRange[i] is the parent of connectedRange[i+1]. Submitting a
	// disconnected batch results in an `ErrDisconnectedBatch` error and the batch is dropped (no-op).
	// Implementors need to ensure that this function is safe for concurrent use.
	// Caution: this method is allowed to block.
	// Expected errors during normal operations:
	//   - cache.ErrDisconnectedBatch
	OnBlockRange(originID flow.Identifier, connectedRange []*flow.Block) error

	// OnFinalizedBlock prunes all blocks below the finalized view from the compliance layer's Cache
	// and PendingTree.
	// Caution: this method is allowed to block.
	// Implementors need to ensure that this function is safe for concurrent use.
	OnFinalizedBlock(finalized *flow.Header)
}
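
The connectivity requirement on `connectedRange` (connectedRange[i] is the parent of connectedRange[i+1]) can be illustrated with a minimal, standalone sketch. It is not the flow-go implementation: `Identifier`, `Block`, and `checkConnected` are hypothetical stand-ins, and the local `ErrDisconnectedBatch` only mirrors the role of `cache.ErrDisconnectedBatch`, whose actual definition is not shown in this file.

```go
package main

import (
	"errors"
	"fmt"
)

// Identifier is a hypothetical stand-in for flow.Identifier.
type Identifier string

// Block is a hypothetical, heavily reduced stand-in for flow.Block.
type Block struct {
	ID       Identifier
	ParentID Identifier
	View     uint64
}

// ErrDisconnectedBatch is a local sentinel for this sketch only (assumption);
// it plays the role that cache.ErrDisconnectedBatch plays in the comment above.
var ErrDisconnectedBatch = errors.New("batch must be a sequence of connected blocks")

// checkConnected verifies that connectedRange[i] is the parent of
// connectedRange[i+1] for every adjacent pair. Empty and single-block
// batches are trivially connected.
func checkConnected(connectedRange []*Block) error {
	for i := 1; i < len(connectedRange); i++ {
		parent, child := connectedRange[i-1], connectedRange[i]
		if child.ParentID != parent.ID {
			return fmt.Errorf("block %q at index %d does not extend block %q: %w",
				child.ID, i, parent.ID, ErrDisconnectedBatch)
		}
	}
	return nil
}

func main() {
	connected := []*Block{
		{ID: "A", ParentID: "root", View: 1},
		{ID: "B", ParentID: "A", View: 2},
		{ID: "C", ParentID: "B", View: 4},
	}
	fmt.Println(checkConnected(connected))

	// Block "C" names "B" as its parent, but "B" is missing from the batch.
	disconnected := []*Block{
		{ID: "A", ParentID: "root", View: 1},
		{ID: "C", ParentID: "B", View: 4},
	}
	err := checkConnected(disconnected)
	fmt.Println(errors.Is(err, ErrDisconnectedBatch))
}
```

Wrapping the sentinel with `%w` lets a caller separate this expected error from unexpected failures via `errors.Is`, which matches how the interface comment lists it under "Expected errors during normal operations". The sketch only checks ParentID links; it performs none of the signature, QC, or TC validation from step 1.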