## Determining includability of Execution Receipt:

### Problem Description
A consensus primary knows a set `S` of Execution Receipts, some of which might not be known to other consensus replicas.
The primary's Fork-Choice rule decides to build on top of block `F`. When constructing the payload, the primary must
decide which Execution Receipts to incorporate in the payload.
* Consider the fork `<- A <- B <- ... <- E <- F` leading up to block `F`. Here, `A` denotes the latest sealed block in that fork.
* For incorporating Execution Receipts, we can restrict our consideration to the section `B <- ... <- E <- F` of the fork,
  as all earlier blocks have already been sealed as of `F`.

_Notation_

We use the following notation:
* `r[B]` is an execution result for block `B`. If there are multiple different results for block `B`, we add an index, e.g. `r[B]_1`, `r[B]_2`, ...
* `Er[r]` is an execution receipt vouching for result `r`. For example, `Er[r[C]_2]` is the receipt for result `r[C]_2`.
* an Execution Receipt `r` has the following fields:
  * `PreviousResultID` denotes the result `ID` for the parent block that has been used as starting state for computing the current block

![Execution Tree](/docs/ExecutionResultTrees.png)

### Criteria for Incorporating Execution Receipts

Let Er<sup>(1)</sup>, Er<sup>(2)</sup>, ..., Er<sup>(K)</sup> be the receipts included in the _child_ of block `F`.

There are multiple criteria that have to be satisfied for a receipt to be incorporated in the payload:
1. Receipts must be for unsealed blocks on the fork that is being extended. Formally:
   * Er<sup>(i)</sup> must be for one of the blocks `B, ..., F`
2. No duplication of incorporated receipts.
Formally:
   * There are no duplicates in Er<sup>(1)</sup>, Er<sup>(2)</sup>, ..., Er<sup>(K)</sup>
   * _And_ for each `Er` ∈ {Er<sup>(1)</sup>, Er<sup>(2)</sup>, ..., Er<sup>(K)</sup>}:

     `Er` was _not_ incorporated in any of the blocks `B, ..., F`
3. The parent result (`PreviousResultID`) must have been previously incorporated (either in ancestor blocks or earlier in the new block itself). Formally:
   * For each `Er` ∈ {Er<sup>(1)</sup>, Er<sup>(2)</sup>, ..., Er<sup>(K)</sup>}:
     * `Er.PreviousResultID` is the sealed result
     * _or_ there exists an Execution Receipt `Er'` that was incorporated in the blocks `B, ..., F`
       with `Er'.ExecutionResult.ID() == Er.PreviousResultID`
     * _or_ there exists an Execution Receipt `Er'` in the list Er<sup>(1)</sup>, Er<sup>(2)</sup>, ..., Er<sup>(K)</sup> _prior_ to `Er`
       with `Er'.ExecutionResult.ID() == Er.PreviousResultID`


Note that the condition cannot be relaxed to: "some ExecutionResult for the parent block must be included in the fork". It must be specifically the parent result referenced by `PreviousResultID`.

### Problem formalization

As illustrated by the figure above, the ExecutionResults form a tree, with the last sealed result as the root.
* All Execution Receipts committing to the same result form an [equivalence class](https://en.wikipedia.org/wiki/Equivalence_class) and can be
  represented as one vertex in the [Execution Tree](/docs/ExecutionResultTrees.png).
* Consider the results `r[A]` and `r[B]`. As `r[A]`'s output state is used as the starting state to compute block `B`,
  we can say: "from result `r[A]` `computation` (denoted by symbol `Σ`) leads to `r[B]`". Formally:
  ```
  r[A] Σ r[B]
  ```
  Here, `Σ` is a [binary relation](https://en.wikipedia.org/wiki/Binary_relation) (more specifically a homogeneous binary relation).
  Furthermore, consider the case:
    * `r[A] Σ r[B]` (i.e.
from result `r[A]` `computation` leads to `r[B]`)
    * `r[B] Σ r[C]_1` (i.e. from result `r[B]` `computation` leads to `r[C]_1`)

  then we can summarize that from result `r[A]` `computation` leads to `r[C]_1`. Formally:
  ```
  from r[A] Σ r[B] and r[B] Σ r[C]_1 it follows that r[A] Σ r[C]_1
  ```
  Hence, `Σ` is a [transitive relation](https://en.wikipedia.org/wiki/Binary_relation).

  Note:
  * `computation` (`Σ`) is _not_ restricted to honest computation. Rather, it means computation proclaimed by an execution node (and backed by its stake).

### Algorithmic solution

Let's break up the problem into 3 steps:
1. For the first step, let's ignore the receipts already included in the fork. Let's start with only the `sealed_state` and ask:
   ```
   What is the largest set of Execution Receipts that are potential candidates for inclusion in the block I am building?
   ```
   This is necessarily a super-set of the receipts already included in the fork, as a correct solution should at least reproduce those Receipts
   and potentially others, which haven't been included.
2. We need a suitable ordering so that any receipt's parent result is listed before the receipt itself.
3. From the result of Step 2, we can then remove the Receipts already included in the fork.

By construction, this generates a _correct and complete_ solution satisfying the above-listed Criteria for Incorporating Execution Receipts.

#### Step 1: largest set of Execution Receipts that are potential candidates for inclusion

From the perspective of the primary, all Execution Receipts whose results `r` satisfy `sealed_state Σ r` are candidates for inclusion in the block.
Formally, **the transitive closure of the binary relation `Σ` on the `sealed_state` yields the candidates for inclusion in the block.**

[Wikipedia](https://en.wikipedia.org/wiki/Reachability): For a directed graph `G = (V, E)`, with vertex set `V` and edge set `E`,
the [reachability relation](https://en.wikipedia.org/wiki/Reachability) of `G` is the transitive closure of `E`.
[Reachability](https://en.wikipedia.org/wiki/Reachability) refers to the ability to get from one vertex to another within a graph.
A vertex `s` can reach a vertex `t` if there exists a sequence of adjacent vertices (i.e. a path) which starts with `s`
and ends with `t`.

_Available algorithms:_
There are a variety of algorithms for computing the transitive closure / reachability in a directed graph with different runtime complexities (e.g.
[Floyd–Warshall algorithm](https://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm), DFS etc).
A good overview is given in [Panagiotis Parchas's lecture notes](http://www.cse.ust.hk/~dimitris/6311/L24-RI-Parchas.pdf)
and [Transitive Closure of a Graph](https://www.techiedelight.com/transitive-closure-graph/).
The trade-offs are mainly between upfront construction time and query time.

For our specific problem, we assume that the graph frequently changes due to new results being published.
Furthermore, we know that our graph is a tree and hence sparse. Therefore, **running depth-first search (DFS) from the `sealed_state`
(or any other tree search algorithm) has optimal runtime complexity** of `O(|V|+|E|)`.

#### Step 2: suitable ordering

DFS already lists Execution Results in the desired order: a pre-order traversal visits every result only after its parent result.

#### Step 3: remove the Receipts already included in ancestors

We can simply store the Receipts that are already included in the fork in a lookup table `M`.
When searching the tree in Step 1, we skip all receipts that are in `M` on the fly.
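
The three steps can be sketched in Go roughly as follows. This is a minimal illustration under stated assumptions, not the mempool implementation in this repository: the types `ID`, `Receipt` and `ExecutionTree` and the functions `Add`, `Candidates` and `demoCandidates` are hypothetical names introduced here. The sketch further assumes that the tree is built from all receipts known to the primary (including those already incorporated in the fork), and it omits the check that a result's block lies on the fork being extended (criterion 1).

```go
package main

// ID is a hypothetical stand-in for an identifier type.
type ID string

// Receipt is a simplified Execution Receipt (illustrative fields only).
type Receipt struct {
	ID               ID // identifier of the receipt itself
	ResultID         ID // result the receipt vouches for
	PreviousResultID ID // parent result used as the starting state
}

// ExecutionTree indexes receipts by result and results by parent result.
// All receipts for the same result share one vertex (equivalence class).
type ExecutionTree struct {
	children map[ID][]ID      // parent result ID -> child result IDs
	receipts map[ID][]Receipt // result ID -> receipts committing to it
}

func NewExecutionTree() *ExecutionTree {
	return &ExecutionTree{children: map[ID][]ID{}, receipts: map[ID][]Receipt{}}
}

// Add stores a receipt; the first receipt for a result creates its vertex.
func (t *ExecutionTree) Add(rc Receipt) {
	if _, known := t.receipts[rc.ResultID]; !known {
		t.children[rc.PreviousResultID] = append(t.children[rc.PreviousResultID], rc.ResultID)
	}
	t.receipts[rc.ResultID] = append(t.receipts[rc.ResultID], rc)
}

// Candidates runs an iterative pre-order DFS from the sealed result.
// Step 1: only results reachable from the sealed result are visited.
// Step 2: pre-order guarantees a parent result is listed before its children.
// Step 3: receipts found in the lookup table `included` (M) are skipped,
// but the search still descends into the children of their result.
func (t *ExecutionTree) Candidates(sealed ID, included map[ID]bool) []Receipt {
	var out []Receipt
	stack := []ID{sealed}
	for len(stack) > 0 {
		cur := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if cur != sealed { // receipts for the sealed result are never candidates
			for _, rc := range t.receipts[cur] {
				if !included[rc.ID] {
					out = append(out, rc)
				}
			}
		}
		kids := t.children[cur]
		for i := len(kids) - 1; i >= 0; i-- { // reversed so the first child is popped first
			stack = append(stack, kids[i])
		}
	}
	return out
}

// demoCandidates builds a small tree rooted at sealed result rA and returns
// the IDs of the candidate receipts in the order produced by Candidates.
func demoCandidates() []ID {
	t := NewExecutionTree()
	t.Add(Receipt{"e1", "rB", "rA"})    // already incorporated in the fork
	t.Add(Receipt{"e2", "rC1", "rB"})   // first result for block C
	t.Add(Receipt{"e3", "rC2", "rB"})   // conflicting result for block C
	t.Add(Receipt{"e4", "rB", "rA"})    // second receipt for result rB
	t.Add(Receipt{"e5", "rD", "rC1"})   // builds on rC1
	t.Add(Receipt{"e6", "rX", "rGone"}) // parent result unknown: unreachable
	var ids []ID
	for _, rc := range t.Candidates("rA", map[ID]bool{"e1": true}) {
		ids = append(ids, rc.ID)
	}
	return ids
}
```

In this example, `demoCandidates()` yields `e4, e2, e5, e3`: receipt `e1` is skipped because it is in `M`, `e6` is never reached because its parent result is not connected to the sealed result, and every listed receipt appears after a receipt for its parent result (or directly extends the sealed result).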


## Further reading
* [Lecture notes on directed Graphs](http://www.orcca.on.ca/~yxie/courses/cs2210b-2011/htmls/notes/16-directedgraph.pdf)
* [Graph Algorithms and Network Flows](https://hochbaum.ieor.berkeley.edu/files/ieor266-2014.pdf)
* Paper: [The Serial Transitive Closure Problem for Trees](https://www.math.ucsd.edu/~sbuss/ResearchWeb/transclosure/paper.pdf)