---
order: 1
---

# Byzantine Consensus Algorithm

## Terms

- The network is composed of optionally connected _nodes_. Nodes
  directly connected to a particular node are called _peers_.
- The consensus process in deciding the next block (at some _height_
  `H`) is composed of one or many _rounds_.
- `NewHeight`, `Propose`, `Prevote`, `Precommit`, and `Commit`
  represent state machine states of a round (also known as `RoundStep`,
  or just "step").
- A node is said to be _at_ a given height, round, and step, or at
  `(H,R,S)`, or at `(H,R)` in short to omit the step.
- To _prevote_ or _precommit_ something means to broadcast a [prevote
  vote](https://godoc.org/github.com/tendermint/tendermint/types#Vote)
  or [first precommit
  vote](https://godoc.org/github.com/tendermint/tendermint/types#FirstPrecommit)
  for something.
- A vote _at_ `(H,R)` is a vote signed with the bytes for `H` and `R`
  included in its [sign-bytes](../core/data_structures.md#vote).
- _+2/3_ is short for "more than 2/3"
- _1/3+_ is short for "1/3 or more"
- A set of +2/3 of prevotes for a particular block or `<nil>` at
  `(H,R)` is called a _proof-of-lock-change_ or _PoLC_ for short.

## State Machine Overview

At each height of the blockchain a round-based protocol is run to
determine the next block. Each round is composed of three _steps_
(`Propose`, `Prevote`, and `Precommit`), along with two special steps
`Commit` and `NewHeight`.

In the optimal scenario, the order of steps is:

```md
NewHeight -> (Propose -> Prevote -> Precommit)+ -> Commit -> NewHeight ->...
```

The sequence `(Propose -> Prevote -> Precommit)` is called a _round_.
There may be more than one round required to commit a block at a given
height.
Examples for why more rounds may be required include:

- The designated proposer was not online.
- The block proposed by the designated proposer was not valid.
- The block proposed by the designated proposer did not propagate
  in time.
- The block proposed was valid, but +2/3 of prevotes for the proposed
  block were not received by enough validator nodes by the time they
  reached the `Precommit` step. Even though +2/3 of prevotes
  are necessary to progress to the next step, at least one validator
  may have voted `<nil>` or maliciously voted for something else.
- The block proposed was valid, and +2/3 of prevotes were received by
  enough nodes, but +2/3 of precommits for the proposed block were not
  received by enough validator nodes.

Some of these problems are resolved by moving onto the next round and
proposer. Others are resolved by increasing certain round timeout
parameters over each successive round.

## State Machine Diagram

```md
                         +-------------------------------------+
                         v                                     |(Wait til `CommitTime+timeoutCommit`)
                   +-----------+                         +-----+-----+
      +----------> |  Propose  +--------------+          | NewHeight |
      |            +-----------+              |          +-----------+
      |                                       |                ^
      |(Else, after timeoutPrecommit)         v                |
+-----+-----+                           +-----------+          |
| Precommit |  <------------------------+  Prevote  |          |
+-----+-----+                           +-----------+          |
      |(When +2/3 Precommits for block found)                  |
      v                                                        |
+--------------------------------------------------------------------+
|  Commit                                                            |
|                                                                    |
|  * Set CommitTime = now;                                           |
|  * Wait for block, then stage/save/commit block;                   |
+--------------------------------------------------------------------+
```

## Background Gossip

A node may not have a corresponding validator private key, but it
nevertheless plays an active role in the consensus process by relaying
relevant meta-data, proposals, blocks, and votes to its peers.
A node
that has the private keys of an active validator and is engaged in
signing votes is called a _validator-node_. All nodes (not just
validator-nodes) have an associated state (the current height, round,
and step) and work to make progress.

Between two nodes there exists a `Connection`, and multiplexed on top of
this connection are fairly throttled `Channel`s of information. An
epidemic gossip protocol is implemented among some of these channels to
bring peers up to speed on the most recent state of consensus. For
example,

- Nodes gossip `PartSet` parts of the current round's proposer's
  proposed block. A LibSwift inspired algorithm is used to quickly
  broadcast blocks across the gossip network.
- Nodes gossip prevote/precommit votes. A node `NODE_A` that is ahead
  of `NODE_B` can send `NODE_B` prevotes or precommits for `NODE_B`'s
  current (or future) round to enable it to progress forward.
- Nodes gossip prevotes for the proposed PoLC (proof-of-lock-change)
  round if one is proposed.
- Nodes gossip block
  [commits](https://godoc.org/github.com/tendermint/tendermint/types#Commit)
  for older blocks to nodes lagging in blockchain height.
- Each node opportunistically gossips `ReceivedVote` messages to hint
  to its peers which votes it already has.
- Nodes broadcast their current state to all neighboring peers. (This
  state is not gossiped further.)

There's more, but let's not get ahead of ourselves here.

## Proposals

A proposal is signed and published by the designated proposer at each
round. The proposer is chosen by a deterministic and non-choking round
robin selection algorithm that selects proposers in proportion to their
voting power (see
[implementation](https://github.com/tendermint/tendermint/blob/master/types/validator_set.go)).
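The proportional round robin described above can be illustrated with a minimal sketch. It is not the logic in `validator_set.go`: the `Validator` class, its `priority` field, the tie-breaking rule, and the function names are assumptions made for illustration. Each round, every validator's priority grows by its voting power, the validator with the highest priority proposes, and the proposer's priority is then reduced by the total voting power.

```python
from dataclasses import dataclass


@dataclass
class Validator:
    name: str          # hypothetical identifier used only in this sketch
    power: int         # voting power
    priority: int = 0  # running priority (an assumption of this sketch)


def next_proposer(validators):
    """Advance one round and return the validator selected to propose."""
    total = sum(v.power for v in validators)
    # Every validator gains priority in proportion to its voting power.
    for v in validators:
        v.priority += v.power
    # Highest priority wins; ties go to the lexicographically larger name,
    # which keeps the choice deterministic across nodes.
    proposer = max(validators, key=lambda v: (v.priority, v.name))
    # The proposer pays back the total power, so the priorities sum to zero.
    proposer.priority -= total
    return proposer


def proposer_schedule(powers, rounds):
    """Return the proposer names for the given number of consecutive rounds."""
    validators = [Validator(name, power) for name, power in sorted(powers.items())]
    return [next_proposer(validators).name for _ in range(rounds)]
```

With voting powers `a=3`, `b=1`, `c=1`, five rounds of this sketch select `a` three times and `b` and `c` once each, so every validator proposes in proportion to its power and none is starved.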

A proposal at `(H,R)` is composed of a block and an optional latest
`PoLC-Round < R` which is included iff the proposer knows of one. This
hints to the network that nodes may unlock (when safe), which ensures
the liveness property.

## State Machine Spec

### Propose Step (height:H,round:R)

Upon entering `Propose`:

- The designated proposer proposes a block at `(H,R)`.

The `Propose` step ends:

- After `timeoutProposeR` after entering `Propose`. --> goto
  `Prevote(H,R)`
- After receiving proposal block and all prevotes at `PoLC-Round`. -->
  goto `Prevote(H,R)`
- After [common exit conditions](#common-exit-conditions)

### Prevote Step (height:H,round:R)

Upon entering `Prevote`, each validator broadcasts its prevote vote.

- First, if the validator is locked on a block since `LastLockRound`
  but now has a PoLC for something else at round `PoLC-Round` where
  `LastLockRound < PoLC-Round < R`, then it unlocks.
- If the validator is still locked on a block, it prevotes that.
- Else, if the proposed block from `Propose(H,R)` is good, it
  prevotes that.
- Else, if the proposal is invalid or wasn't received on time, it
  prevotes `<nil>`.

The `Prevote` step ends:

- After +2/3 prevotes for a particular block or `<nil>`. --> goto
  `Precommit(H,R)`
- After `timeoutPrevote` after receiving any +2/3 prevotes. --> goto
  `Precommit(H,R)`
- After [common exit conditions](#common-exit-conditions)

### Precommit Step (height:H,round:R)

Upon entering `Precommit`, each validator broadcasts its precommit vote.

- If the validator has a PoLC at `(H,R)` for a particular block `B`, it
  (re)locks (or changes lock to) and precommits `B` and sets
  `LastLockRound = R`.
- Else, if the validator has a PoLC at `(H,R)` for `<nil>`, it unlocks
  and precommits `<nil>`.
- Else, it keeps the lock unchanged and precommits `<nil>`.

A precommit for `<nil>` means "I didn’t see a PoLC for this round, but I
did get +2/3 prevotes and waited a bit".

The `Precommit` step ends:

- After +2/3 precommits for `<nil>`. --> goto `Propose(H,R+1)`
- After `timeoutPrecommit` after receiving any +2/3 precommits. --> goto
  `Propose(H,R+1)`
- After [common exit conditions](#common-exit-conditions)

### Common exit conditions

- After +2/3 precommits for a particular block. --> goto
  `Commit(H)`
- After any +2/3 prevotes received at `(H,R+x)`. --> goto
  `Prevote(H,R+x)`
- After any +2/3 precommits received at `(H,R+x)`. --> goto
  `Precommit(H,R+x)`

### Commit Step (height:H)

- Set `CommitTime = now()`
- Wait until block is received. --> goto `NewHeight(H+1)`

### NewHeight Step (height:H)

- Move `Precommits` to `LastCommit` and increment height.
- Set `StartTime = CommitTime+timeoutCommit`
- Wait until `StartTime` to receive straggler commits. --> goto
  `Propose(H,0)`

## Proofs

### Proof of Safety

Assume that less than 1/3 (written -1/3) of the voting power of
validators is byzantine. If a validator commits block `B` at round `R`,
it's because it saw +2/3 of precommits for `B` at round `R`. This
implies that 1/3+ of honest nodes are still locked at round `R' > R`.
These locked validators will remain locked until they see a PoLC at
`R' > R`, but this won't happen because 1/3+ are locked and honest, so
less than 2/3 (-2/3) are available to vote for anything other than `B`.

### Proof of Liveness

If 1/3+ honest validators are locked on two different blocks from
different rounds, a proposer's `PoLC-Round` will eventually cause nodes
locked from the earlier round to unlock. Eventually, the designated
proposer will be one that is aware of a PoLC at the later round.
Also,
`timeoutProposeR` increments with round `R`, while the size of a
proposal is capped, so eventually the network is able to "fully gossip"
the whole proposal (e.g. the block & PoLC).

### Proof of Fork Accountability

Define the JSet (justification-vote-set) at height `H` of a validator
`V1` to be all the votes signed by the validator at `H` along with
justification PoLC prevotes for each lock change. For example, suppose
`V1` signed the following precommits: `Precommit(B1 @ round 0)`,
`Precommit(<nil> @ round 1)`, `Precommit(B2 @ round 4)` (note that no
precommits were signed for rounds 2 and 3, and that's ok). Then
`Precommit(B1 @ round 0)` must be justified by a PoLC at round 0, and
`Precommit(B2 @ round 4)` must be justified by a PoLC at round 4; but
the precommit for `<nil>` at round 1 is not a lock-change by definition
so the JSet for `V1` need not include any prevotes at round 1, 2, or 3
(unless `V1` happened to have prevoted for those rounds).

Further, define the JSet at height `H` of a set of validators `VSet` to
be the union of the JSets for each validator in `VSet`. For a given
commit by honest validators at round `R` for block `B` we can construct
a JSet to justify the commit for `B` at `R`. We say that a JSet
_justifies_ a commit at `(H,R)` if all the committers (validators in the
commit-set) are each justified in the JSet with no duplicitous vote
signatures (by the committers).

- **Lemma**: When a fork is detected by the existence of two
  conflicting [commits](../core/data_structures.md#commit), the
  union of the JSets for both commits (if they can be compiled) must
  include double-signing by at least 1/3+ of the validator set.
  **Proof**: The commits cannot be at the same round, because that
  would immediately imply double-signing by 1/3+. Take the union of
  the JSets of both commits.
  If there is no double-signing by at least
  1/3+ of the validator set in the union, then no honest validator
  could have precommitted any different block after the first commit.
  Yet, +2/3 did. Reductio ad absurdum.

As a corollary, when there is a fork, an external process can determine
the blame by requiring each validator to justify all of its round votes.
Either we will find 1/3+ who cannot justify at least one of their votes,
and/or, we will find 1/3+ who had double-signed.

### Alternative algorithm

Alternatively, we can take the JSet of a commit to be the "full commit".
That is, if light clients and validators do not consider a block to be
committed unless the JSet of the commit is also known, then we get the
desirable property that if there ever is a fork (e.g. there are two
conflicting "full commits"), then 1/3+ of the validators are immediately
punishable for double-signing.

There are many ways to ensure that the gossip network efficiently shares
the JSet of a commit. One solution is to add a new message type that
tells peers that this node has (or does not have) a +2/3 majority for `B`
(or `<nil>`) at `(H,R)`, and a bitarray of which votes contributed towards
that majority. Peers can react by responding with appropriate votes.

We will implement such an algorithm for the next iteration of the
Tendermint consensus protocol.

Other potential improvements include adding more data in votes such as
the last known PoLC round that caused a lock change, and the last voted
round/step (or, we may require that validators not skip any votes). This
may make JSet verification/gossip logic easier to implement.
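To make the "punishable for double-signing" property concrete, here is a minimal sketch of how an external process might scan the union of two JSets for double-signers. The `Vote` tuple, its field names, and both functions are hypothetical and are not Tendermint types. The sketch only detects conflicting signatures at the same height, round, and vote type; it does not check for lock changes made without a justifying PoLC.

```python
from collections import defaultdict
from fractions import Fraction
from typing import NamedTuple, Optional


class Vote(NamedTuple):
    validator: str        # hypothetical validator identifier
    height: int
    round: int
    kind: str             # "prevote" or "precommit"
    block: Optional[str]  # None stands for <nil>


def double_signers(votes):
    """Return validators that signed different values for the same (H, R, kind)."""
    seen = defaultdict(set)
    for v in votes:
        seen[(v.validator, v.height, v.round, v.kind)].add(v.block)
    return {key[0] for key, blocks in seen.items() if len(blocks) > 1}


def punishable_fraction(jset_a, jset_b, powers):
    """Fraction of total voting power that double-signed in the union of two JSets."""
    guilty = double_signers(set(jset_a) | set(jset_b))
    return Fraction(sum(powers[g] for g in guilty), sum(powers.values()))
```

For example, with four validators of equal power, two conflicting commits at the same round that are signed by `v1, v2, v3` and by `v2, v3, v4` expose `v2` and `v3`, which is half of the voting power and therefore at least 1/3.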

### Censorship Attacks

Due to the definition of a block
[commit](https://github.com/tendermint/tendermint/blob/master/docs/tendermint-core/validators.md), any 1/3+ coalition of
validators can halt the blockchain by not broadcasting their votes. Such
a coalition can also censor particular transactions by rejecting blocks
that include these transactions, though this would result in a
significant proportion of block proposals being rejected, which would
slow down the rate of block commits of the blockchain, reducing its
utility and value. The malicious coalition might also broadcast votes in
a trickle so as to grind blockchain block commits to a near halt, or
engage in any combination of these attacks.

If a global active adversary were also involved, it could partition the
network in such a way that it may appear that the wrong subset of
validators were responsible for the slowdown. This is not just a
limitation of Tendermint, but rather a limitation of all consensus
protocols whose network is potentially controlled by an active
adversary.

### Overcoming Forks and Censorship Attacks

For these types of attacks, a subset of the validators should coordinate
through external means to sign a reorg-proposal that chooses a fork
(and any evidence thereof) and the initial subset of validators with
their signatures. Validators who sign such a reorg-proposal forgo their
collateral on all other forks. Clients should verify the signatures on
the reorg-proposal, verify any evidence, and make a judgement or prompt
the end-user for a decision. For example, a phone wallet app may prompt
the user with a security warning, while a refrigerator may accept any
reorg-proposal signed by +1/2 of the original validators.

No non-synchronous Byzantine fault-tolerant algorithm can come to
consensus when 1/3+ of validators are dishonest, yet a fork assumes that
1/3+ of validators have already been dishonest by double-signing or
lock-changing without justification. So, signing the reorg-proposal is a
coordination problem that cannot be solved by any non-synchronous
protocol (i.e. automatically, and without making assumptions about the
reliability of the underlying network). It must be provided by means
external to the weakly-synchronous Tendermint consensus algorithm. For
now, we leave the problem of reorg-proposal coordination to human
coordination via internet media. Validators must take care to ensure
that there are no significant network partitions, to avoid situations
where two conflicting reorg-proposals are signed.

Assuming that the external coordination medium and protocol is robust,
it follows that forks are less of a concern than [censorship
attacks](#censorship-attacks).

### Canonical vs subjective commit

We distinguish between "canonical" and "subjective" commits. A subjective commit is what
each validator sees locally when they decide to commit a block. The canonical commit is
what is included by the proposer of the next block in the `LastCommit` field of
the block. This is what makes it canonical and ensures every validator agrees on the canonical commit,
even if it is different from the +2/3 votes a validator has seen, which caused the validator to
commit the respective block. Each block contains a canonical +2/3 commit for the previous
block.
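As an illustration of how a subjective and a canonical commit can differ while both being valid, the sketch below checks the +2/3 condition from the Terms section using integer arithmetic. The function name and the `powers` mapping are assumptions made for illustration, not part of the Tendermint code base.

```python
def is_two_thirds_majority(signers, powers):
    """Return True if the signers hold strictly more than 2/3 of the voting power."""
    total = sum(powers.values())
    # Count each signer once, even if it appears several times in the input.
    signed = sum(powers[s] for s in set(signers))
    # Integer comparison avoids floating point rounding at exactly 2/3.
    return 3 * signed > 2 * total


# Four validators with equal voting power (hypothetical names).
powers = {"v1": 1, "v2": 1, "v3": 1, "v4": 1}
subjective = ["v1", "v2", "v3"]  # precommits one validator happened to see
canonical = ["v2", "v3", "v4"]   # precommits carried in the next block's LastCommit
```

Both `subjective` and `canonical` pass the check here, so a validator may commit on the first set while the next block's `LastCommit` carries the second. A set holding exactly 2/3 of the power, such as two of three equal validators, does not pass.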