# OAM Rollout Controller Design

- Owner: Ryan Zhang (@ryangzhang-oss)
- Date: 01/14/2021
- Status: Draft

## Table of Contents

- [Introduction](#introduction)
  - [Two flavors of Rollout](#two-flavors-of-rollout)
  - [Design Principles and Goals](#design-principles-and-goals)
  - [Proposal Layout](#proposal-layout)
- [Rollout API Design](#rollout-api-design)
- [User Experience Workflow](#user-experience-workflow)
- [Notable implementation level design decisions](#notable-implementation-level-design-decisions)
- [State Transition](#state-transition)
- [Future work](#future-work)

## Introduction

`Rollout` or `Upgrade` is one of the most essential "day 2" operations on any application.
KubeVela, as an application centric platform, definitely needs to provide a customized solution
to alleviate the burden on the application operators. There are several popular rollout solutions,
e.g. [flagger](https://flagger.app/), in the open source community. However, none of them
work directly with our OAM framework. Therefore, we propose to create an OAM native rollout
framework that can address all the application rollout/upgrade needs in KubeVela.

### Two flavors of Rollout
After hearing from the OAM community, it is clear to us that there are two flavors of rollout
that we need to support.
- One way is through an OAM trait. This flavor works for existing OAM applications that don't
  have many specialized requirements such as interaction with other traits. For
  example, most rollout operations don't work well with scaling operations. Thus, the
  application operator needs to remove any scaler traits from the component before applying the
  rollout trait. Another example is that rollout operations usually also involve traffic splitting
  of sorts. Thus, the application operators might also need to manually adjust the related
  traffic trait before and after applying the rollout trait.
- The other way is through a new applicationDeployment CR which directly references different
  versions of applications instead of workloads. This opens up the possibility for the controller
  to solve conflicts between traits automatically. This resource, however, is not a trait and
  requires the application to be immutable.

### Design Principles and Goals
We design our controllers with the following principles in mind:
- First, we want all flavors of rollout controllers to share the same core rollout
  related logic. The trait and application related logic can be easily encapsulated into its own
  package.
- Second, the core rollout related logic is easily extensible to support different types of
  workloads, e.g. Deployment, Cloneset, Statefulset, Daemonset or even customized workloads.
- Third, the core rollout related logic has a well documented state machine that
  does state transitions explicitly.
- Finally, the controllers can support all the rollout/upgrade needs of an application running
  in a production environment.

### Proposal Layout
Here is the rest of the proposal:
- First, we will present the exact rollout CRD spec.
- Second, we will give a high level design on how we plan to implement the controller.
- Third, we will present the state machine and its transition events.
- Finally, we will list common rollout scenarios and their corresponding user experience and
  implementation details.

## Rollout API Design
Let's start with the rollout trait spec definition. The applicationDeployment spec is very similar.
```go
// RolloutTraitSpec defines the desired state of RolloutTrait
type RolloutTraitSpec struct {
	// TargetRef references a target resource that contains the newer version
	// of the software. We assume that the new resource already exists.
	// This is the only resource we work on if the resource is a stateful resource (cloneset/statefulset)
	TargetRef corev1.ObjectReference `json:"targetRef"`

	// SourceRef references the list of resources that contains the older version
	// of the software. We assume that it's the first time to deploy when we cannot find any source.
	// +optional
	SourceRef []corev1.ObjectReference `json:"sourceRef,omitempty"`

	// RolloutPlan is the details on how to rollout the resources
	RolloutPlan RolloutPlan `json:"rolloutPlan"`
}
```
The target and source here are the same as in other OAM traits: each refers to a workload instance
by its GVK and name.

We can see that the core part of the rollout logic is encapsulated by the `RolloutPlan` object. This
allows us to write multiple controllers that share the same logic without duplicating the code.
Here is the definition of the `RolloutPlan` structure.

```go
// RolloutPlan defines the details of the rollout plan
type RolloutPlan struct {

	// RolloutStrategy defines strategies for the rollout plan
	// +optional
	RolloutStrategy RolloutStrategyType `json:"rolloutStrategy,omitempty"`

	// The size of the target resource. The default is the same
	// as the size of the source resource.
	// +optional
	TargetSize *int32 `json:"targetSize,omitempty"`

	// The number of batches, default = 1
	// mutually exclusive to RolloutBatches
	// +optional
	NumBatches *int32 `json:"numBatches,omitempty"`

	// The exact distribution among batches.
	// mutually exclusive to NumBatches
	// +optional
	RolloutBatches []RolloutBatch `json:"rolloutBatches,omitempty"`

	// All pods in the batches up to the batchPartition (included) will have
	// the target resource specification while the rest still have the source resource
	// This is designed for the operators to manually rollout
	// Default is the number of batches which will rollout all the batches
	// +optional
	BatchPartition *int32 `json:"lastBatchToRollout,omitempty"`

	// RevertOnDelete reverts the rollout when the rollout CR is deleted, default is false
	// +optional
	RevertOnDelete bool `json:"revertOnDelete,omitempty"`

	// Paused pauses the rollout, default is false
	// +optional
	Paused bool `json:"paused,omitempty"`

	// RolloutWebhooks provides a way for the rollout to interact with an external process
	// +optional
	RolloutWebhooks []RolloutWebhook `json:"rolloutWebhooks,omitempty"`

	// CanaryMetric provides a way for the rollout process to automatically check certain metrics
	// before completing the process
	// +optional
	CanaryMetric []CanaryMetric `json:"canaryMetric,omitempty"`
}
```

## User Experience Workflow
The OAM rollout experience differs from flagger in some key areas. Here is how those differences
affect the user experience.
- We assume that the resources it refers to are **immutable**. In contrast, flagger watches over a
  target resource and reacts whenever the target's specification changes.
  - The trait version of the controller refers to componentRevision.
  - The application version of the controller refers to an immutable application.
- The rollout logic **works only once** and stops after it reaches a terminal state. One can
  still change the rollout plan in the middle of the rollout as long as it does not change the
  pods that are already updated.
- The applicationDeployment controller only rolls out one component in the application for now.
- Users in general should rely on the rollout CR to do the actual rollout, which means they
  shall set the `replicas` or `partition` field of the new resources to the starting value
  indicated in the [detailed rollout plan design](#rollout-plan-work-with-different-type-of-workloads).


## Notable implementation level design decisions
Here are some high level implementation design decisions that will impact the user experience of
rolling out.

### Rollout workflows
As we mentioned in the introduction section, we will implement two rollout controllers that work
on different levels. In the end, they both emit an in-memory rollout plan object which includes
references to the target and source Kubernetes resources that the rollout planner will execute
upon. For example, the applicationDeployment controller will get the component from the
application and extract the real workload from it before passing it to the rollout plan object.

With that said, the two controllers operate differently to extract the real workload. Here are the
high level descriptions of how each works.

#### Application inplace upgrade workflow
The most natural way to upgrade an application is to upgrade it in place, which means the users
just change the application, and the system picks up the change and applies it to the runtime.
The implementation of this type of upgrade looks like this:
- The application controller computes a hash value of the applicationConfiguration. The
  application controller **always** uses the component revision name in the AC it generates. This
  guarantees that the AC also changes when the component changes.
- The application controller creates the applicationConfiguration with a new name (with a suffix)
  whenever its hash value changes, and with a pre-determined annotation
  "app.oam.dev/appconfig-rollout" set to true.
- The AC controller has special handling logic in its apply step. The exact logic
  depends on the workload type and we will list each in the
  [rollout with different workload](#rollout-plan-work-with-different-type-of-workloads) section.
  This special AC logic is also what makes the other rollout scenario work, as the AC
  controller is the only entity that is directly responsible for emitting the workload to the k8s.


#### ApplicationDeployment workflow
When an appDeployment is used to do application level rollout, **the target application
is not reconciled by the application controller yet**. This is to make sure the
appDeployment controller has full control of the new application from the beginning.
We will use a pre-defined annotation "app.oam.dev/rollout-template" set to "true" to facilitate
that. We expect any system, such as the [kubevela apiserver](APIServer-Catalog.md), that
utilizes an appDeployment object to follow this rule.
- Upon creation, the appDeployment controller marks itself as the owner of the application. The
  application controller will have built-in logic to ignore any application that has the
  "app.oam.dev/rollout-template" annotation set to true.
- The appDeployment controller will also add another annotation "app.oam.dev/creating" to the
  application to be passed down to the ApplicationConfiguration CR it generates to mark
  that the AC is reconciled for the first time.
- The ApplicationConfiguration controller recognizes this annotation, and it will see if there is
  anything it needs to do before emitting the workload to the k8s.
  The AC controller removes this annotation at the end of a successful reconcile.
- The appDeployment controller can change the target application fields. For example,
  - It might remove all the conflicting traits, such as HPA, during the upgrade.
  - It might modify the label selector fields in the services to make sure there are ways to
    differentiate traffic routing to the old and new application resources.
- The appDeployment controller will return control of the new application back to the
  application controller after it makes the initial adjustment of the application by removing the
  annotation.
  - We will use a webhook to ensure that the "rollout" annotation cannot be added back once removed.
- Upon a successful rollout, the appDeployment controller leaves no pods running for the old
  application.
- Upon a failed rollout, the outcome is undetermined. It could result in an unstable state
  since both the old and new applications have been modified.
  - Thus, we introduced a `revertOnDelete` field so that a user can delete the appDeployment and
    expect the old application to be intact, and the new application to take no effect.

#### Rollout trait workflow
The rollout trait controller only works with componentRevision.
- The component controller emits the new component revision when a new component is created or
  updated.
- The application configuration controller emits the new component and assigns the
  componentRevision as the source and target of the rollout trait.
- We assume that there are no other scaler related traits deployed at the same time. We will use
  `conflict-with` fields in the traitDefinition and webhooks to enforce that.
- Upon a successful rollout, the rollout trait will keep the old component revision with no
  pod left.
- Upon a failed rollout, the rollout trait will just stop and leave the resources mixed.
  This state should mostly still work since the other traits are not touched.

### Rollout plan work with different type of workloads
The rollout plan part of the rollout logic is shared between all rollout controllers. It comes
with references to the target and source workload. The controller is responsible for fetching
the different revisions of the resources. Deployment and Cloneset represent the two major types
of resources in that a Cloneset can contain both the new and old revisions at a stable state while
a deployment only contains one version when it's stable.

#### Rollout plan works with deployment
It's pretty straightforward for the outer controller to create the in-memory target and source
deployment objects since they are just copies of the Kubernetes resources.
- The user should set the deployment workload's **`Paused` field to true** in the
  appDeployment case.
  - Another option is for the user to leave the `replicas` field as 0 if the rollout does not have
    access to that field.
- If the rollout is successful, the source deployment's `replicas` field will be zero and the
  target deployment will be the same size as the original source.
- If the rollout failed, we will leave the state as it is.
- If the rollout failed and `revertOnDelete` is `true` and the rollout CR is deleted, then the
  source deployment's `replicas` field will be restored to its value before the rollout and the
  target deployment's `replicas` field will be zero.

#### Rollout plan works with cloneset
The outer controller creates the in-memory target and source cloneset objects with different image
ids. The source is only used when we need to do a rollback.
- The user should set the Cloneset workload's **`Paused` field to true** in the
  appDeployment case.
  - Another option is for the user to leave the `partition` field at a value that effectively stops
    the upgrade if the rollout does not have access to that field.
- The rollout plan mostly just adjusts the `partition` field in the Cloneset and leaves the rest
  of the rollout logic to the Cloneset controller.
- If the rollout is successful, the `partition` field will be zero.
- If the rollout failed, we will leave the `partition` field as it was the last time we touched it.
- If the rollout failed and `revertOnDelete` is `true` and the rollout CR is deleted, we will
  perform a revert on the Cloneset. Note that only the latest Cloneset controller allows rollback
  when one increases the `partition` field.

### Operational features
- We will use the same service mesh model as flagger in the sense that the user needs to provide the
  service mesh provider type and give us the reference to an ingress object.
  - We plan to directly import the various flagger mesh implementations.
- We plan to import the implementation of notifiers from flagger too.
- We can consider adding an alert rule in the rollout plan API in the future.

### We use a webhook to validate whether the change to the rollout CRD is valid.
We will go with strict restrictions on the CRD update in that nothing can be updated other than
the following fields:
- The BatchPartition field can only increase unless the target ref has changed.
- The RolloutBatches field can only change in the part after the BatchPartition field.
- The CanaryMetric/Paused/RevertOnDelete fields can be modified freely.
- On an update, the rollout controller will simply replace the existing rollout CR value in the
  in-memory map, which causes its in-memory execution to stop. The new CR will kick off a new
  execution loop which resumes the rollout operation based on the status of the rollout and its
  resources, following the pre-determined state machine transitions.
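
The update rules above can be sketched as a small validation function. This is only an illustration under stated assumptions, not the webhook's actual implementation: the `validateUpdate` and `effectivePartition` helpers, the trimmed-down `RolloutPlan` and `RolloutBatch` types, and the `targetChanged` flag are hypothetical. We also assume that `BatchPartition` is a zero-based batch index that defaults to the last batch when unset, and that the frozen part of `RolloutBatches` is measured against the old partition.

```go
package main

import "fmt"

// RolloutBatch is a hypothetical, trimmed-down batch description.
type RolloutBatch struct {
	Replicas int32
}

// RolloutPlan keeps only the fields that the update rules mention.
type RolloutPlan struct {
	RolloutBatches []RolloutBatch
	BatchPartition *int32
}

// effectivePartition returns the index of the last batch allowed to roll out.
// Assumption: zero-based index, defaulting to the last batch when unset.
func effectivePartition(p RolloutPlan) int32 {
	if p.BatchPartition != nil {
		return *p.BatchPartition
	}
	return int32(len(p.RolloutBatches)) - 1
}

// validateUpdate (hypothetical) rejects updates that break the rules above.
// CanaryMetric, Paused and RevertOnDelete are not checked since they may change freely.
func validateUpdate(oldPlan, newPlan RolloutPlan, targetChanged bool) error {
	oldPart, newPart := effectivePartition(oldPlan), effectivePartition(newPlan)
	// Rule 1: the partition can only move forward unless the target ref changed.
	if !targetChanged && newPart < oldPart {
		return fmt.Errorf("batchPartition cannot decrease from %d to %d", oldPart, newPart)
	}
	// Rule 2: batches up to and including the old partition are frozen.
	for i := int32(0); i <= oldPart && int(i) < len(oldPlan.RolloutBatches); i++ {
		if int(i) >= len(newPlan.RolloutBatches) ||
			newPlan.RolloutBatches[i] != oldPlan.RolloutBatches[i] {
			return fmt.Errorf("rolloutBatches[%d] cannot change", i)
		}
	}
	return nil
}

func main() {
	one, two := int32(1), int32(2)
	batches := []RolloutBatch{{Replicas: 1}, {Replicas: 2}, {Replicas: 3}}
	oldPlan := RolloutPlan{RolloutBatches: batches, BatchPartition: &one}

	// Advancing the partition with untouched batches passes validation.
	advanced := RolloutPlan{RolloutBatches: batches, BatchPartition: &two}
	fmt.Println("advance:", validateUpdate(oldPlan, advanced, false))

	// Editing a batch at or before the old partition is rejected.
	edited := RolloutPlan{
		RolloutBatches: []RolloutBatch{{Replicas: 5}, {Replicas: 2}, {Replicas: 3}},
		BatchPartition: &one,
	}
	fmt.Println("edit released batch:", validateUpdate(oldPlan, edited, false))
}
```

Keeping the check a pure function of the old plan, the new plan and one flag would make it easy to unit test apart from the admission plumbing.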

### The controller has extension points set up for the following plug-ins:
- workloads. Each workload handler needs to implement the following operations:
  - scale the resources
  - determine the health of the workload
  - report how many replicas are upgraded/ready/available
- (future) metrics provider.
- (future) service mesh provider. Each mesh provider needs to implement the following operations:
  - direct a certain percentage of the traffic to the source/target workload
  - fetch the current traffic split

## State Transition
Here is the state transition graph:



Here are the various top-level states of the rollout:
```go
const (
	// VerifyingSpecState indicates that the rollout is in the stage of verifying the rollout settings
	// and the controller can locate both the target and the source
	VerifyingSpecState RollingState = "verifyingSpec"
	// InitializingState indicates that the rollout is initializing all the new resources
	InitializingState RollingState = "initializing"
	// RollingInBatchesState indicates that the rollout starts rolling
	RollingInBatchesState RollingState = "rollingInBatches"
	// FinalisingState indicates that the rollout is finalizing, possibly cleaning up the old resources and adjusting traffic
	FinalisingState RollingState = "finalising"
	// RolloutFailingState indicates that the rollout is failing
	// one needs to finalize it before marking it as failed by cleaning up the old resources and adjusting traffic
	RolloutFailingState RollingState = "rolloutFailing"
	// RolloutSucceedState indicates that rollout successfully completed to match the desired target state
	RolloutSucceedState RollingState = "rolloutSucceed"
	// RolloutAbandoningState indicates that the rollout is abandoned, can be restarted.
	// This is a terminal state
	RolloutAbandoningState RollingState = "rolloutAbandoned"
	// RolloutFailedState indicates that rollout is failed, the target replica is not reached
	// we can not move forward anymore, we will let the client decide when or whether to revert.
	RolloutFailedState RollingState = "rolloutFailed"
)
```

These are the sub-states of the rollout when it's in the rolling state.
```go
const (
	// BatchInitializingState indicates that the batch is still initializing, rolling has not started yet
	BatchInitializingState BatchRollingState = "batchInitializing"
	// BatchInRollingState still rolling the batch, the batch rolling is not completed yet
	BatchInRollingState BatchRollingState = "batchInRolling"
	// BatchVerifyingState verifying if the application is ready to roll.
	BatchVerifyingState BatchRollingState = "batchVerifying"
	// BatchRolloutFailedState indicates that the batch didn't get the manual or automatic approval
	BatchRolloutFailedState BatchRollingState = "batchVerifyFailed"
	// BatchFinalizingState indicates that all the pods in the batch are available, we can move on to the next batch
	BatchFinalizingState BatchRollingState = "batchFinalizing"
	// BatchReadyState indicates that all the pods in the batch are upgraded and its state is ready
	BatchReadyState BatchRollingState = "batchReady"
)
```

## Future work
The applicationDeployment should also work on traits. For example, if someone plans to update the
HPA trait's formula, there should be a way for them to roll out the HPA change step by step too.