# Cruise Control: Automated Block Time Adjustment for Precise Epoch Switchover Timing

# Overview

## Context

Epochs have a fixed length, measured in views.
The actual view rate of the network varies depending on network conditions, e.g. load, number of offline replicas, etc.
We would like for consensus nodes to observe the actual view rate of the committee, and adjust how quickly they proceed
through views accordingly, to target a desired weekly epoch switchover time.

## High-Level Design

The `BlockTimeController` observes the current view rate and adjusts the timing when the proposal should be released.
It is a [PID controller](https://en.wikipedia.org/wiki/PID_controller). The essential idea is to take into account the
current error, the rate of change of the error, and the cumulative error, when determining how much compensation to apply.
The compensation function $u[v]$ has three terms:

- $P[v]$ compensates proportionally to the magnitude of the instantaneous error
- $I[v]$ compensates proportionally to the magnitude of the error and how long it has persisted
- $D[v]$ compensates proportionally to the rate of change of the error


📚 This document uses ideas from:

- the paper [Fast self-tuning PID controller specially suited for mini robots](https://www.frba.utn.edu.ar/wp-content/uploads/2021/02/EWMA_PID_7-1.pdf)
- the ‘Leaky Integrator’ [[forum discussion](https://engineering.stackexchange.com/questions/29833/limiting-the-integral-to-a-time-window-in-pid-controller), [technical background](https://www.music.mcgill.ca/~gary/307/week2/node4.html)]


### Choice of Process Variable: Targeted Epoch Switchover Time

The process variable is the variable which:

- has a target desired value, or setpoint ($SP$)
- is successively measured by the controller to compute the error $e$

---
👉 The
`BlockTimeController` controls the progression through views, such that the epoch switchover happens at the intended point in time. We define:

- $\gamma = k\cdot \tau_0$ is the remaining epoch duration of a hypothetical ideal system, where *all* remaining $k$ views of the epoch progress with the ideal view time $\tau_0$.
- The parameter $\tau_0$ is computed solely based on the Epoch configuration as
  $\tau_0 := \frac{<{\rm total\ epoch\ time}>}{<{\rm total\ views\ in\ epoch}>}$ (for mainnet 22, Epoch 75, we have $\tau_0 \simeq$ 1250ms).
- $\Gamma$ is the *actual* time remaining until the desired epoch switchover.

The error, which the controller should drive towards zero, is defined as:

```math
e := \gamma - \Gamma
```
---


From our definition it follows that:

- $e > 0$ implies that the estimated epoch switchover (assuming ideal system behaviour) happens too late. Therefore, to hit the desired epoch switchover time, the time we spend in views has to be *smaller* than $\tau_0$.
- $e < 0$ means that we estimate the epoch switchover to be too early. Therefore, we should slow down and spend more than $\tau_0$ in the following views.

**Reasoning:**

The desired idealized system behaviour would be a constant view duration $\tau_0$ throughout the entire epoch.

However, in the real-world system we have disturbances (varying message relay times, slow or offline nodes, etc.) and measurement uncertainty (a node can only observe its local view times, but not the committee’s collective swarm behaviour).

![](/github.com/onflow/flow-go@v0.33.17/docs/CruiseControl_BlockTimeController/PID_controller_for_block-rate-delay.png)

After a disturbance, we want the controller to drive the system back to a state where it can closely follow the ideal behaviour from there on.

- Simulations have shown that this approach produces a *very* stable controller with the intended behaviour.

**Controller driving $e := \gamma - \Gamma \rightarrow 0$**
- With the differential term set to $K_d=0$, the controller responds as expected with damped oscillatory behaviour
  to a singular strong disturbance. Setting $K_d=3$ suppresses the oscillations, and the controller responds more effectively.

![](/github.com/onflow/flow-go@v0.33.17/docs/CruiseControl_BlockTimeController/EpochSimulation_029.png)
![](/github.com/onflow/flow-go@v0.33.17/docs/CruiseControl_BlockTimeController/EpochSimulation_030.png)

- The controller very quickly compensates for moderate disturbances and observational noise in a well-behaved system:

![](/github.com/onflow/flow-go@v0.33.17/docs/CruiseControl_BlockTimeController/EpochSimulation_028.png)

- The controller effectively compensates for a massive anomaly (100s network partition):

![](/github.com/onflow/flow-go@v0.33.17/docs/CruiseControl_BlockTimeController/EpochSimulation_000.png)

- The controller effectively stabilizes the system under continued larger disturbances (20% of consensus participants offline) and notable observational noise:

![](/github.com/onflow/flow-go@v0.33.17/docs/CruiseControl_BlockTimeController/EpochSimulation_005-0.png)

**References:**

- statistical model for happy-path view durations: [ID controller for ``block-rate-delay``](https://www.notion.so/ID-controller-for-block-rate-delay-cc9c2d9785ac4708a37bb952557b5ef4?pvs=21)
- For a Python implementation with additional disturbances (offline nodes) and observational noise, see GitHub repo: [flow-internal/analyses/pacemaker_timing/2023-05_Blocktime_PID-controller](https://github.com/dapperlabs/flow-internal/tree/master/analyses/pacemaker_timing/2023-05_Blocktime_PID-controller) → [controller_tuning_v01.py](https://github.com/dapperlabs/flow-internal/blob/master/analyses/pacemaker_timing/2023-05_Blocktime_PID-controller/controller_tuning_v01.py)

# Detailed PID controller specification

Each consensus participant runs a local instance of the controller described below. Hence, all the quantities are based on the node’s local observation.

## Definitions

**Observables** (quantities provided to the node or directly measurable by the node):

- $v$ is the node’s current view
- ideal view time $\tau_0$ is computed solely based on the Epoch configuration:
  $\tau_0 := \frac{<{\rm total\ epoch\ time}>}{<{\rm total\ views\ in\ epoch}>}$ (for mainnet 22, Epoch 75, we have $\tau_0 \simeq$ 1250ms).
- $t[v]$ is the time the node entered view $v$
- $F[v]$ is the final view of the current epoch
- $T[v]$ is the target end time of the current epoch

**Derived quantities**

- remaining views of the epoch $k[v] := F[v] +1 - v$
- time remaining until the desired epoch switchover $\Gamma[v] := T[v]-t[v]$
- error $e[v] := \underbrace{k[v]\cdot\tau_0}_{\gamma[v]} - \Gamma[v] = t[v] + k[v]\cdot\tau_0 - T[v]$

### Precise convention of View Timing

Upon observing block `B` with view $v$, the controller updates its internal state.

Note the '+1' term in the computation of the remaining views $k[v] := F[v] +1 - v$. This is related to our convention that the epoch begins (happy path) when observing the first block of the epoch. Only upon observing this block do the nodes transition to the first view of the epoch. Up to that point, the consensus replicas remain in the last view of the previous epoch, in the state of `having processed the last block of the old epoch and voted for it` (happy path). Replicas remain in this state until they see a confirmation of the view (either QC or TC for the last view of the previous epoch).

![](/github.com/onflow/flow-go@v0.33.17/docs/CruiseControl_BlockTimeController/ViewDurationConvention.png)

In accordance with this convention, observing the proposal for the last view of an epoch marks the start of the last view. By observing the proposal, nodes enter the last view, verify the block, and vote for it; the primary aggregates the votes and constructs the child (for the first view of the new epoch). The last view of the epoch ends when the child proposal is published.
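
The derived quantities above can be sketched in a few lines of Python. This is only an illustration of the definitions, not the flow-go implementation: the function and parameter names are hypothetical, times are plain floats in seconds, and the numbers in the usage note are made up.

```python
def remaining_views(final_view: int, view: int) -> int:
    """k[v] := F[v] + 1 - v (the current view counts as remaining)."""
    return final_view + 1 - view


def epoch_error(view: int, final_view: int, t_enter: float,
                target_end: float, tau0: float) -> float:
    """e[v] := k[v] * tau0 - (T[v] - t[v]), all times in seconds.

    Positive: the ideal system would finish the epoch too late (speed up).
    Negative: the ideal system would finish too early (slow down).
    """
    gamma = remaining_views(final_view, view) * tau0  # ideal remaining duration
    time_left = target_end - t_enter                  # Gamma[v]
    return gamma - time_left
```

For example, with hypothetical values `final_view=1000`, `view=901`, `tau0=1.25`, `t_enter=0.0` and `target_end=120.0`, there are 100 views remaining, so $\gamma = 125$ s while $\Gamma = 120$ s. The error is $+5$ s, meaning the following views have to be shorter than $\tau_0$.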

### Controller

The goal of the controller is to drive the system towards an error of zero, i.e. $e[v] \rightarrow 0$. For a [PID controller](https://en.wikipedia.org/wiki/PID_controller), the output $u$ for view $v$ has the form:

```math
u[v] = K_p \cdot e[v]+K_i \cdot \mathcal{I}[v] + K_d \cdot \Delta[v]
```

With error terms (computed from observations)

- $e[v]$ representing the *instantaneous* error as of view $v$
  (commonly referred to as ‘proportional term’)
- $\mathcal{I}[v] = \sum_{i \le v} e[i]$ the sum of the errors observed up to and including view $v$
  (commonly referred to as ‘integral term’)
- $\Delta[v]=e[v]-e[v-1]$ the rate of change of the error
  (commonly referred to as ‘derivative term’)

and controller parameters (values derived from controller tuning):

- $K_p$ is the proportional coefficient
- $K_i$ is the integral coefficient
- $K_d$ is the derivative coefficient

## Measuring view duration

Each consensus participant observes the error $e[v]$ based on its local view evolution. As the following figure illustrates, the view duration is highly variable on small time scales.

![](/github.com/onflow/flow-go@v0.33.17/docs/CruiseControl_BlockTimeController/ViewRate.png)

Therefore, we expect $e[v]$ to be very variable. Furthermore, note that a node uses its local view transition times as an estimator for the collective behaviour of the entire committee. Therefore, there is also observational noise obfuscating the underlying collective behaviour. Hence, we expect notable noise.

## Managing noise

Noisy values for $e[v]$ also impact the derivative term $\Delta[v]$ and integral term $\mathcal{I}[v]$. This can impact the controller’s performance.

### **Managing noise in the proportional term**

An established approach for managing noise in observables is to use an [exponentially weighted moving average [EWMA]](https://en.wikipedia.org/wiki/Moving_average) instead of the instantaneous values.
Specifically, let $\bar{e}[v]$ denote the EWMA of the instantaneous error, which is computed as follows:

```math
\eqalign{
\textnormal{initialization: }\quad \bar{e} :&= 0 \\
\textnormal{update with instantaneous error\ } e[v]:\quad \bar{e}[v] &= \alpha \cdot e[v] + (1-\alpha)\cdot \bar{e}[v-1]
}
```

The parameter $\alpha$ relates to the averaging time window. Let $\alpha \equiv \frac{1}{N_\textnormal{ewma}}$ and consider that the input changes from $x_\textnormal{old}$ to $x_\textnormal{new}$ as a step function. Then $N_\textnormal{ewma}$ is the number of samples required to move the output average about 2/3 of the way from $x_\textnormal{old}$ to $x_\textnormal{new}$.

See also the [Python `Ewma` implementation](https://github.com/dapperlabs/flow-internal/blob/423d927421c073e4c3f66165d8f51b829925278f/analyses/pacemaker_timing/2023-05_Blocktime_PID-controller/controller_tuning_v01.py#L405-L431).

### **Managing noise in the integral term**

In particular, systematic observation bias is a problem, as it leads to a diverging integral term. The commonly adopted approach is to use a ‘leaky integrator’ [[1](https://www.music.mcgill.ca/~gary/307/week2/node4.html), [2](https://engineering.stackexchange.com/questions/29833/limiting-the-integral-to-a-time-window-in-pid-controller)], which we denote as $\bar{\mathcal{I}}[v]$.

```math
\eqalign{
\textnormal{initialization: }\quad \bar{\mathcal{I}} :&= 0 \\
\textnormal{update with instantaneous error\ } e[v]:\quad \bar{\mathcal{I}}[v] &= e[v] + (1-\beta)\cdot\bar{\mathcal{I}}[v-1]
}
```

Intuitively, the loss factor $\beta$ relates to the time window of the integrator. A factor of 0 means an infinite time horizon, while $\beta =1$ makes the integrator only memorize the last input. Let $\beta \equiv \frac{1}{N_\textnormal{itg}}$ and consider a constant input value $x$.
Then $N_\textnormal{itg}$ relates to the number of past samples that the integrator remembers:

- the integrator's output will saturate at $x\cdot N_\textnormal{itg}$
- an integrator initialized with 0 reaches about 2/3 of the saturation value $x\cdot N_\textnormal{itg}$ after consuming $N_\textnormal{itg}$ inputs

See also the [Python `LeakyIntegrator` implementation](https://github.com/dapperlabs/flow-internal/blob/423d927421c073e4c3f66165d8f51b829925278f/analyses/pacemaker_timing/2023-05_Blocktime_PID-controller/controller_tuning_v01.py#L444-L468).

### **Managing noise in the derivative term**

Similarly to the proportional term, we apply an EWMA to the differential term and denote the averaged value as $\bar{\Delta}[v]$:

```math
\eqalign{
\textnormal{initialization: }\quad \bar{\Delta} :&= 0 \\
\textnormal{update with instantaneous error\ } e[v]:\quad \bar{\Delta}[v] &= \bar{e}[v] - \bar{e}[v-1]
}
```

## Final formula for PID controller

We have used a statistical model of the view duration extracted from mainnet 22 (Epoch 75) and manually added disturbances, observational noise, and systemic observational bias.

The following parameters have proven to generate stable controller behaviour over a large variety of network conditions:

---
👉 The controller is given by

```math
u[v] = K_p \cdot \bar{e}[v]+K_i \cdot \bar{\mathcal{I}}[v] + K_d \cdot \bar{\Delta}[v]
```

with parameters:

- $K_p = 2.0$
- $K_i = 0.6$
- $K_d = 3.0$
- $N_\textnormal{ewma} = 5$, i.e. $\alpha = \frac{1}{N_\textnormal{ewma}} = 0.2$
- $N_\textnormal{itg} = 50$, i.e. $\beta = \frac{1}{N_\textnormal{itg}} = 0.02$

The controller output $u[v]$ represents the amount of time by which the controller wishes to deviate from the ideal view duration $\tau_0$.
In other words, the duration of view $v$ that the controller wants to set is
```math
\widehat{\tau}[v] = \tau_0 - u[v]
```
---


For further details about

- the statistical model of the view duration, see [ID controller for ``block-rate-delay``](https://www.notion.so/ID-controller-for-block-rate-delay-cc9c2d9785ac4708a37bb952557b5ef4?pvs=21)
- the simulation and controller tuning, see [flow-internal/analyses/pacemaker_timing/2023-05_Blocktime_PID-controller](https://github.com/dapperlabs/flow-internal/tree/master/analyses/pacemaker_timing/2023-05_Blocktime_PID-controller) → [controller_tuning_v01.py](https://github.com/dapperlabs/flow-internal/blob/master/analyses/pacemaker_timing/2023-05_Blocktime_PID-controller/controller_tuning_v01.py)

### Limits of authority

In general, there is no bound on the controller output $u$. However, it is important to limit the controller’s influence, so that the resulting view duration stays within a sensible range.

- upper bound on view duration $\widehat{\tau}[v]$ that we allow the controller to set:

  The current timeout threshold is set to 2.5s. Therefore, the largest view duration we want to allow the controller to set is 1.6s.
  Thereby, approx. 900ms remain for message propagation, voting and constructing the child block, which with high probability prevents the controller from driving the node into a timeout.

- lower bound on the view duration:

  Let $t_\textnormal{p}[v]$ denote the time when the primary for view $v$ has constructed its block proposal.
  The time difference $t_\textnormal{p}[v] - t[v]$ between the primary entering the view and having its proposal
  ready is the minimally required time to execute the protocol. The controller can only *delay* broadcasting the block,
  but it cannot release the block before $t_\textnormal{p}[v]$ simply because the proposal isn’t ready any earlier.



👉 Let $\hat{t}[v]$ denote the time when the primary for view $v$ *broadcasts* its proposal. We assign:

```math
\hat{t}[v] := \max\big(t[v] +\min(\widehat{\tau}[v],\ 1.6\textnormal{s}),\ t_\textnormal{p}[v]\big)
```



## Edge Cases

### A node is catching up

When a node is catching up, it processes blocks more quickly than when it is up-to-date, and therefore observes a faster view rate. This would cause the node’s `BlockRateManager` to compensate by increasing the block rate delay.

As long as the delay function is responsive, this doesn’t have a practical impact, because nodes that are catching up don’t propose anyway.

To the extent that the delay function is not responsive, this would cause the block rate to slow down slightly once the node has caught up.

**Assumption:** as we assume that only a small fraction of nodes go offline, the effect is expected to be small and easily compensated for by the supermajority of online nodes.

### A node has a misconfigured clock

We cap the maximum deviation from the default delay, which limits the general impact of any error introduced by the `BlockTimeController`. A node with a misconfigured clock will contribute to the error in a limited way, but as long as the majority of nodes have an accurate clock, they will offset this error.

**Assumption:** few enough nodes will have a misconfigured clock that the effect will be small enough to be easily compensated for by the supermajority of correct nodes.

### Near epoch boundaries

We might incorrectly compute a high error in the target view rate if the local current view and the current epoch are not exactly synchronized. By default, they would not be, because `EpochTransition` events occur upon finalization, whereas the current view is updated as soon as a QC/TC is available.

**Solution:** determine the epoch locally based on the view only; do not use the `EpochTransition` event.
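
The view-based epoch lookup described in the solution above could look roughly like the following Python sketch. It is an illustration under assumptions rather than the flow-go implementation: the `EpochViews` type, its fields, and `epoch_for_view` are hypothetical names, and the sketch assumes the node already knows the inclusive view range of the current epoch and, once it is set up, of the next epoch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EpochViews:
    """Hypothetical record of an epoch's inclusive view range."""
    counter: int
    first_view: int
    final_view: int

    def contains(self, view: int) -> bool:
        return self.first_view <= view <= self.final_view


def epoch_for_view(view: int, current: EpochViews,
                   upcoming: Optional[EpochViews]) -> EpochViews:
    """Pick the epoch purely from the view number, without waiting for
    finalization-driven events."""
    if current.contains(view):
        return current
    if upcoming is not None and upcoming.contains(view):
        return upcoming
    raise ValueError(f"view {view} is outside the known epochs")
```

With this approach, the controller switches to the next epoch's final view and target end time as soon as the local view crosses the boundary, independently of when the corresponding block is finalized.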

### EECC

We need to detect EECC and revert to a default block-rate-delay (stop adjusting).

## Testing

[Cruise Control: Benchnet Testing Notes](https://www.notion.so/Cruise-Control-Benchnet-Testing-Notes-ea08f49ba9d24ce2a158fca9358966df?pvs=21)
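
For a quick local sanity check of the formulas in this document, the controller can be sketched in Python as below. This is a minimal sketch and neither the flow-go implementation nor the linked simulation: the class and function names are hypothetical, all times are floats in seconds, and the default `max_view_duration=1.6` is taken from the upper bound stated under "Limits of authority".

```python
class Ewma:
    """EWMA: value = alpha * x + (1 - alpha) * value, initialized with 0."""

    def __init__(self, alpha: float):
        self.alpha = alpha
        self.value = 0.0

    def add(self, x: float) -> float:
        self.value = self.alpha * x + (1.0 - self.alpha) * self.value
        return self.value


class LeakyIntegrator:
    """Leaky integrator: value = x + (1 - beta) * value, initialized with 0."""

    def __init__(self, beta: float):
        self.beta = beta
        self.value = 0.0

    def add(self, x: float) -> float:
        self.value = x + (1.0 - self.beta) * self.value
        return self.value


class ControllerSketch:
    """u[v] = Kp * e_bar[v] + Ki * I_bar[v] + Kd * Delta_bar[v]."""

    def __init__(self, tau0: float, kp: float = 2.0, ki: float = 0.6,
                 kd: float = 3.0, n_ewma: int = 5, n_itg: int = 50):
        self.tau0 = tau0
        self.kp, self.ki, self.kd = kp, ki, kd
        self.e_bar = Ewma(1.0 / n_ewma)
        self.i_bar = LeakyIntegrator(1.0 / n_itg)

    def target_view_duration(self, error: float) -> float:
        """Consume e[v] and return tau_hat[v] = tau0 - u[v]."""
        previous = self.e_bar.value            # e_bar[v-1]
        e_bar = self.e_bar.add(error)          # e_bar[v]
        i_bar = self.i_bar.add(error)          # I_bar[v]
        delta_bar = e_bar - previous           # Delta_bar[v]
        u = self.kp * e_bar + self.ki * i_bar + self.kd * delta_bar
        return self.tau0 - u


def broadcast_time(t_enter: float, tau_hat: float, t_ready: float,
                   max_view_duration: float = 1.6) -> float:
    """Cap the view duration from above and never broadcast the proposal
    before it has been constructed at t_ready."""
    return max(t_enter + min(tau_hat, max_view_duration), t_ready)
```

Feeding a single error of 1 s into a fresh `ControllerSketch(tau0=1.25)` yields $\bar{e} = 0.2$, $\bar{\mathcal{I}} = 1.0$ and $\bar{\Delta} = 0.2$, hence $u = 1.6$ s and a negative target duration. In that case `broadcast_time` falls back to the time at which the proposal is ready, which illustrates why the lower bound is needed.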