# Collection

The collection node is responsible for accepting transactions from users, packaging
them into collections to ease the load on consensus nodes, and storing transaction
texts for the duration they are needed by the network.

This document provides a high-level overview of the collection node. Each section
includes links to the appropriate package, which may contain more detailed documentation.

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
## Table of Contents

- [Terminology](#terminology)
- [Processes](#processes)
  - [Transaction Lifecycle](#transaction-lifecycle)
  - [Collection Lifecycle](#collection-lifecycle)
- [Engines](#engines)
  - [Ingest](#ingest)
  - [Proposal](#proposal)
  - [Synchronization](#synchronization)
  - [Provider](#provider)
- [Storage](#storage)
  - [Cluster State](#cluster-state)
    - [Collection Builder](#collection-builder)
    - [Collection Finalizer](#collection-finalizer)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

## Terminology

* **Collection** - a set of transactions proposed by a cluster of collection nodes.
* **Guaranteed Collection** - a collection that a quorum of nodes in the cluster has
  committed to storing.
* **Collection Guarantee** - an attestation to a collection that has been guaranteed.
  Concretely, this is a hash over the collection and signatures from a qualified
  majority of cluster members. (Sometimes simply referred to as `guarantee`.)
* **Cluster** - a group of collection nodes that work together to create collections.
  Each cluster is responsible for a different subset of transactions.

## Processes

### Transaction Lifecycle

1. Transactions are received by a collection node (typically via an [Access Node](../access)).
2. Transactions are propagated to collection nodes in the responsible cluster.
3. Transactions are introduced into the memory pool.
4. Transactions are included in a collection proposal.
5. Transactions are removed from the memory pool when the collection is guaranteed.

### Collection Lifecycle

1. A collection is proposed by a cluster member.
2. The collection is finalized by the core consensus algorithm.
3. A guarantee for the finalized collection is submitted to consensus nodes.
4. The guarantee is included in a block.
5. The block is propagated to execution nodes.
6. Execution nodes request the full collection from collection nodes.

## Engines

### [Ingest](../../engine/collection/ingest)

The `ingest` engine is responsible for accepting, validating, and storing new transactions.
Once a transaction has been ingested, it can be included in a new collection via the `proposal` engine.

In general, collection nodes _cannot_ fully ensure that a transaction is valid.
Consequently, the validation performed at this stage is best-effort.

### [Proposal](../../engine/collection/proposal)

The `proposal` engine is responsible for handling the consensus process for the cluster.
It runs an instance of [HotStuff](../../consensus/hotstuff) within the cluster.
This results in each cluster building a secondary blockchain, where each block
represents a proposed collection.

### [Synchronization](../../engine/collection/synchronization)

The `synchronization` engine is responsible for keeping the node in sync with its cluster.
It periodically polls cluster members for their latest finalized collection, and handles
requesting ranges of finalized collections when the node is behind. It also handles
requesting specific collections when the `proposal` engine receives a new proposal for
which the parent is not known.

### [Provider](../../engine/collection/provider)

The `provider` engine is responsible for responding to requests for resources stored
by the collection node, typically full collections or individual transactions. In
particular, execution nodes and verification nodes request collections in order to
execute the constituent transactions.

## Storage

### [Cluster State](../../state/cluster)

The cluster state implements a storage layer for the blockchain produced by the
`proposal` engine. The design mirrors the [Protocol State](../../state/protocol),
which is the storage layer for the main Flow blockchain produced by consensus nodes.

Since the core HotStuff logic is only aware of blocks, we sometimes use the term
"cluster block" to refer to a block within a cluster's auxiliary blockchain. Each
cluster block contains a single proposed collection.

#### [Collection Builder](../../module/builder/collection)

The collection builder implements the logic for building new, valid collections
to propose to the cluster.

Valid collections contain transactions that have passed basic validation, are not
expired, and do not contain any transactions that already exist in a parent collection.

#### [Collection Finalizer](../../module/finalizer/collection)

The collection finalizer marks a collection as finalized (aka guaranteed). This is
invoked by the core [HotStuff](../../consensus/hotstuff) logic when it determines
that a collection has been finalized by the cluster.

When a collection is finalized, the corresponding collection guarantee is sent to
consensus nodes for inclusion in a block.
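
The builder's validity rules and the removal of transactions from the memory pool on finalization, both described above, can be illustrated with a minimal Go sketch. It assumes a simplified in-memory model: the `Tx` type, its `ExpiryHeight` field, the height-based expiry check, and the `BuildCollection` and `Finalize` functions are hypothetical names chosen for illustration and are not part of the flow-go API.

```go
package main

import "fmt"

// Tx is a hypothetical, simplified transaction: an ID plus the height at
// which it expires. Real transactions carry far more data.
type Tx struct {
	ID           string
	ExpiryHeight uint64
}

// BuildCollection sketches the builder rule: select transactions from the
// memory pool, skipping any that are expired at the given height or that
// already exist in a parent (ancestor) collection.
func BuildCollection(pool []Tx, inAncestors map[string]bool, height uint64, maxSize int) []Tx {
	var collection []Tx
	seen := make(map[string]bool)
	for _, tx := range pool {
		if len(collection) >= maxSize {
			break
		}
		if tx.ExpiryHeight <= height {
			continue // expired
		}
		if inAncestors[tx.ID] || seen[tx.ID] {
			continue // duplicate of an ancestor or of this collection
		}
		seen[tx.ID] = true
		collection = append(collection, tx)
	}
	return collection
}

// Finalize sketches the memory-pool cleanup on finalization: it returns the
// pool without the transactions contained in the finalized collection.
func Finalize(pool []Tx, collection []Tx) []Tx {
	included := make(map[string]bool, len(collection))
	for _, tx := range collection {
		included[tx.ID] = true
	}
	remaining := make([]Tx, 0, len(pool))
	for _, tx := range pool {
		if !included[tx.ID] {
			remaining = append(remaining, tx)
		}
	}
	return remaining
}

// IDs extracts the transaction IDs, preserving order.
func IDs(txs []Tx) []string {
	ids := make([]string, 0, len(txs))
	for _, tx := range txs {
		ids = append(ids, tx.ID)
	}
	return ids
}

func main() {
	pool := []Tx{
		{ID: "a", ExpiryHeight: 100},
		{ID: "b", ExpiryHeight: 5}, // expired at height 10
		{ID: "c", ExpiryHeight: 100}, // already in a parent collection
		{ID: "d", ExpiryHeight: 100},
	}
	ancestors := map[string]bool{"c": true}

	collection := BuildCollection(pool, ancestors, 10, 10)
	fmt.Println(IDs(collection)) // prints [a d]

	pool = Finalize(pool, collection)
	fmt.Println(IDs(pool)) // prints [b c]
}
```

In this sketch, sending the collection guarantee to consensus nodes is left out; only the selection and cleanup steps are modeled.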