github.com/oam-dev/kubevela@v1.9.11/design/vela-core/rollout-design.md

# OAM Rollout Controller Design

- Owner: Ryan Zhang (@ryangzhang-oss)
- Date: 01/14/2021
- Status: Draft

## Table of Contents

- [Introduction](#introduction)
  - [Two flavors of Rollout](#two-flavors-of-rollout)
  - [Design Principles and Goals](#design-principles-and-goals)
  - [Proposal Layout](#proposal-layout)
- [Rollout API Design](#rollout-api-design)
- [User Experience Workflow](#user-experience-workflow)
- [Notable implementation level design decisions](#notable-implementation-level-design-decisions)
- [State Transition](#state-transition)
- [Future work](#future-work)

## Introduction

`Rollout` or `Upgrade` is one of the most essential "day 2" operations for any application.
KubeVela, as an application-centric platform, needs to provide a customized solution
to alleviate the burden on application operators. There are several popular rollout solutions
in the open source community, e.g. [flagger](https://flagger.app/). However, none of them
works directly with our OAM framework. Therefore, we propose to create an OAM-native rollout
framework that can address all the application rollout/upgrade needs in KubeVela.

### Two flavors of Rollout
After hearing from the OAM community, it is clear to us that there are two flavors of rollout
that we need to support.
- One way is through an OAM trait. This flavor works for existing OAM applications that don't
  have many specialized requirements such as interaction with other traits. For
  example, most rollout operations don't work well with scaling operations. Thus, the
  application operator needs to remove any scaler traits from the component before applying the
  rollout trait. Another example is that rollout operations usually also involve some form of
  traffic splitting. Thus, the application operators might also need to manually adjust the related
  traffic trait before and after applying the rollout trait.
- The other way is through a new applicationDeployment CR which directly references different
  versions of applications instead of workloads. This opens up the possibility for the controller
  to resolve conflicts between traits automatically. This resource, however, is not a trait and
  requires the application to be immutable.

### Design Principles and Goals
We design our controllers with the following principles in mind:
- First, we want all flavors of rollout controllers to share the same core rollout
  related logic. The trait and application related logic can be easily encapsulated into its own
  package.
- Second, the core rollout related logic is easily extensible to support different types of
  workloads, e.g. Deployment, Cloneset, Statefulset, Daemonset or even customized workloads.
- Third, the core rollout related logic has a well documented state machine that
  does state transitions explicitly.
- Finally, the controllers can support all the rollout/upgrade needs of an application running
  in a production environment.

### Proposal Layout
The rest of the proposal is organized as follows:
- First, we will present the exact rollout CRD spec.
- Second, we will give a high level design of how we plan to implement the controller.
- Third, we will present the state machine and its transition events.
- Finally, we will list common rollout scenarios and their corresponding user experience and
  implementation details.

## Rollout API Design
Let's start with the rollout trait spec definition. The applicationDeployment spec is very similar.
```go
// RolloutTraitSpec defines the desired state of RolloutTrait.
// corev1 refers to the "k8s.io/api/core/v1" package.
type RolloutTraitSpec struct {
	// TargetRef references a target resource that contains the newer version
	// of the software. We assume that the new resource already exists.
	// This is the only resource we work on if the resource is a stateful resource (cloneset/statefulset)
	TargetRef corev1.ObjectReference `json:"targetRef"`

	// SourceRef references the list of resources that contain the older version
	// of the software. We assume that it's the first time to deploy when we cannot find any source.
	// +optional
	SourceRef []corev1.ObjectReference `json:"sourceRef,omitempty"`

	// RolloutPlan is the details on how to rollout the resources
	RolloutPlan RolloutPlan `json:"rolloutPlan"`
}
```
As in other OAM traits, the target and source here each refer to a workload instance
by its GVK and name.

We can see that the core part of the rollout logic is encapsulated by the `RolloutPlan` object. This
allows us to write multiple controllers that share the same logic without duplicating the code.
Here is the definition of the `RolloutPlan` structure.

```go
// RolloutPlan defines the details of the rollout plan
type RolloutPlan struct {

	// RolloutStrategy defines strategies for the rollout plan
	// +optional
	RolloutStrategy RolloutStrategyType `json:"rolloutStrategy,omitempty"`

	// The size of the target resource. The default is the same
	// as the size of the source resource.
	// +optional
	TargetSize *int32 `json:"targetSize,omitempty"`

	// The number of batches, default = 1
	// mutually exclusive to RolloutBatches
	// +optional
	NumBatches *int32 `json:"numBatches,omitempty"`

	// The exact distribution among batches.
	// mutually exclusive to NumBatches
	// +optional
	RolloutBatches []RolloutBatch `json:"rolloutBatches,omitempty"`

	// All pods in the batches up to the batchPartition (included) will have
	// the target resource specification while the rest still have the source resource
	// This is designed for the operators to manually rollout
	// Default is the number of batches which will rollout all the batches
	// +optional
	BatchPartition *int32 `json:"lastBatchToRollout,omitempty"`

	// RevertOnDelete reverts the rollout when the rollout CR is deleted, default is false
	// +optional
	RevertOnDelete bool `json:"revertOnDelete,omitempty"`

	// Paused pauses the rollout, default is false
	// +optional
	Paused bool `json:"paused,omitempty"`

	// RolloutWebhooks provides a way for the rollout to interact with an external process
	// +optional
	RolloutWebhooks []RolloutWebhook `json:"rolloutWebhooks,omitempty"`

	// CanaryMetric provides a way for the rollout process to automatically check certain metrics
	// before completing the process
	// +optional
	CanaryMetric []CanaryMetric `json:"canaryMetric,omitempty"`
}
```
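
To make the relationship between `TargetSize` and `NumBatches` more concrete, here is a minimal
sketch of how replicas might be spread across batches when only `NumBatches` is given. The
`splitIntoBatches` helper and its remainder policy (extra replicas go to the trailing batches) are
assumptions made for illustration; this proposal does not specify how the split is computed.

```go
package main

import "fmt"

// splitIntoBatches is a hypothetical helper: it spreads targetSize replicas
// over numBatches batches as evenly as possible. The remainder is assigned to
// the trailing batches (an assumed policy that keeps the early batches small).
func splitIntoBatches(targetSize, numBatches int32) ([]int32, error) {
	if numBatches <= 0 {
		return nil, fmt.Errorf("numBatches must be positive, got %d", numBatches)
	}
	if targetSize < 0 {
		return nil, fmt.Errorf("targetSize must be non-negative, got %d", targetSize)
	}
	base := targetSize / numBatches
	rem := targetSize % numBatches
	batches := make([]int32, numBatches)
	for i := range batches {
		batches[i] = base
		// The last rem batches each take one extra replica.
		if int32(i) >= numBatches-rem {
			batches[i]++
		}
	}
	return batches, nil
}

func main() {
	batches, err := splitIntoBatches(10, 3)
	if err != nil {
		panic(err)
	}
	fmt.Println(batches) // prints [3 3 4]
}
```

An explicit `RolloutBatches` list would replace this computation entirely, which is why the two
fields are mutually exclusive.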

## User Experience Workflow
The OAM rollout experience differs from flagger in some key areas. Here is how those differences
affect the user experience.
- We assume that the resources it refers to are **immutable**. In contrast, flagger watches over a
  target resource and reacts whenever the target's specification changes.
  - The trait version of the controller refers to a componentRevision.
  - The application version of the controller refers to an immutable application.
- The rollout logic **works only once** and stops after it reaches a terminal state. One can
  still change the rollout plan in the middle of the rollout as long as it does not change the
  pods that are already updated.
- The applicationDeployment controller only rolls out one component in the application for now.
- Users in general should rely on the rollout CR to do the actual rollout, which means they
  should set the `replicas` or `partition` field of the new resources to the starting value
  indicated in the [detailed rollout plan design](#rollout-plan-work-with-different-type-of-workloads).


## Notable implementation level design decisions
Here are some high level implementation design decisions that will impact the user experience of
rolling out.

### Rollout workflows
As we mentioned in the introduction section, we will implement two rollout controllers that work
on different levels. In the end, they both emit an in-memory rollout plan object which includes
references to the target and source Kubernetes resources that the rollout planner will operate
on. For example, the applicationDeployment controller will get the component from the
application and extract the real workload from it before passing it to the rollout plan object.

That said, the two controllers extract the real workload differently. Here are the
high level descriptions of how each works.

#### Application inplace upgrade workflow
The most natural way to upgrade an application is to upgrade it in place, which means the users
just change the application, and the system picks up the change and applies it to the runtime.
The implementation of this type of upgrade looks like this:
- The application controller computes a hash value of the applicationConfiguration. The
  application controller **always** uses the component revision name in the AC it generates. This
  guarantees that the AC also changes when the component changes.
- The application controller creates the applicationConfiguration with a new name (with a
  suffix) whenever its hash value changes, and with a pre-determined annotation
  "app.oam.dev/appconfig-rollout" set to true.
- The AC controller has special handling logic in the apply part of its logic. The exact logic
  depends on the workload type and we will list each in the
  [rollout with different workload](#rollout-plan-work-with-different-type-of-workloads) section.
  This special AC logic is also the real magic that makes the other rollout scenario work, as the
  AC controller is the only entity that is directly responsible for emitting the workload to k8s.
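
The hash-then-rename step above can be sketched as follows. This is only an illustration under
stated assumptions: `appConfigSpec`, `specHash`, and `nextAppConfig` are hypothetical names, the
spec is reduced to a list of component revision names, and SHA-256 over the JSON encoding is an
assumed hashing scheme. Only the annotation key comes from this proposal.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// rolloutAnnotation is the annotation key named in the workflow.
const rolloutAnnotation = "app.oam.dev/appconfig-rollout"

// appConfigSpec is a simplified, hypothetical stand-in for the real
// applicationConfiguration spec: only the component revision names it uses.
type appConfigSpec struct {
	ComponentRevisions []string `json:"componentRevisions"`
}

// appConfigMeta holds the name and annotations of the AC to create.
type appConfigMeta struct {
	Name        string
	Annotations map[string]string
}

// specHash returns a short, deterministic hash of the spec (assumed scheme).
func specHash(spec appConfigSpec) (string, error) {
	raw, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:])[:8], nil
}

// nextAppConfig reports whether the spec changed since lastHash and, if so,
// returns metadata for a new AC whose name carries the hash as a suffix.
func nextAppConfig(appName string, spec appConfigSpec, lastHash string) (appConfigMeta, bool, error) {
	h, err := specHash(spec)
	if err != nil {
		return appConfigMeta{}, false, err
	}
	if h == lastHash {
		return appConfigMeta{}, false, nil
	}
	return appConfigMeta{
		Name:        fmt.Sprintf("%s-%s", appName, h),
		Annotations: map[string]string{rolloutAnnotation: "true"},
	}, true, nil
}

func main() {
	v1 := appConfigSpec{ComponentRevisions: []string{"web-v1"}}
	v2 := appConfigSpec{ComponentRevisions: []string{"web-v2"}}
	h1, err := specHash(v1)
	if err != nil {
		panic(err)
	}
	meta, changed, err := nextAppConfig("shop", v2, h1)
	if err != nil {
		panic(err)
	}
	fmt.Println(changed, meta.Name, meta.Annotations[rolloutAnnotation])
}
```

Because the spec embeds component revision names, any component change alters the hash and
therefore the AC name, which matches the guarantee stated in the first bullet.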


#### ApplicationDeployment workflow
When an appDeployment is used to do an application level rollout, **the target application
is not reconciled by the application controller yet**. This is to make sure the
appDeployment controller has full control of the new application from the beginning.
We will use a pre-defined annotation "app.oam.dev/rollout-template" set to "true" to facilitate
that. We expect any system that utilizes an appDeployment object, such as the
[kubevela apiserver](APIServer-Catalog.md), to follow this rule.
- Upon creation, the appDeployment controller marks itself as the owner of the application. The
  application controller will have built-in logic to ignore any application that has the
  "app.oam.dev/rollout-template" annotation set to true.
- The appDeployment controller will also add another annotation "app.oam.dev/creating" to the
  application, to be passed down to the ApplicationConfiguration CR it generates, to mark
  that the AC is reconciled for the first time.
- The ApplicationConfiguration controller recognizes this annotation, and it will see if there is
  anything it needs to do before emitting the workload to k8s. The AC controller removes this
  annotation at the end of a successful reconcile.
- The appDeployment controller can change the target application fields. For example,
  - It might remove all the conflicting traits, such as HPA, during the upgrade.
  - It might modify the label selector fields in the services to make sure there are ways to
    differentiate traffic routing to the old and new application resources.
- The appDeployment controller will return control of the new application to the
  application controller after it makes the initial adjustment of the application, by removing the
  annotation.
  - We will use a webhook to ensure that the "rollout" annotation cannot be added back once removed.
- Upon a successful rollout, the appDeployment controller leaves no pods running for the old
  application.
- Upon a failed rollout, the outcome is not determined. It could result in an unstable state
  since both the old and new applications have been modified.
  - Thus, we introduced a `revertOnDelete` field so that a user can delete the appDeployment and
    expect the old application to be intact, and the new application to take no effect.
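
The annotation hand-off described above might look roughly like the sketch below. The two
annotation keys come from this proposal; the helper names `shouldSkipReconcile` and `handBack`,
and the use of `"true"` as the value of "app.oam.dev/creating", are assumptions for illustration.

```go
package main

import "fmt"

const (
	rolloutTemplateAnnotation = "app.oam.dev/rollout-template"
	creatingAnnotation        = "app.oam.dev/creating"
)

// shouldSkipReconcile is a hypothetical check for the application controller:
// it leaves the application alone while the appDeployment controller owns it.
func shouldSkipReconcile(annotations map[string]string) bool {
	return annotations[rolloutTemplateAnnotation] == "true"
}

// handBack is a hypothetical helper for the appDeployment controller: it
// returns a copy of the annotations without the rollout-template marker,
// which hands the application back to the application controller.
func handBack(annotations map[string]string) map[string]string {
	out := make(map[string]string, len(annotations))
	for k, v := range annotations {
		if k != rolloutTemplateAnnotation {
			out[k] = v
		}
	}
	return out
}

func main() {
	app := map[string]string{
		rolloutTemplateAnnotation: "true",
		creatingAnnotation:        "true", // assumed value; the proposal only names the key
	}
	fmt.Println(shouldSkipReconcile(app))           // prints true
	fmt.Println(shouldSkipReconcile(handBack(app))) // prints false
}
```

Copying the map instead of mutating it keeps the example free of side effects; a real controller
would patch the object through the API server instead.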

#### Rollout trait workflow
The rollout trait controller only works with componentRevision.
- The component controller emits the new component revision when a new component is created or
  updated.
- The application configuration controller emits the new component and assigns the
  componentRevisions as the source and target of the rollout trait.
- We assume that there are no other scaler related traits deployed at the same time. We will use
  `conflict-with` fields in the traitDefinition and webhooks to enforce that.
- Upon a successful rollout, the rollout trait will keep the old component revision with no
  pods left.
- Upon a failed rollout, the rollout trait will just stop and leave the resources mixed. This
  state should mostly still work since the other traits are not touched.

### Rollout plan work with different type of workloads
The rollout plan part of the rollout logic is shared between all rollout controllers. It comes
with references to the target and source workloads. The controller is responsible for fetching
the different revisions of the resources. Deployment and Cloneset represent the two major types
of resources: a Cloneset can contain both the new and old revisions in a stable state, while a
Deployment only contains one version when it is stable.

#### Rollout plan works with deployment
It's pretty straightforward for the outer controller to create the in-memory target and source
deployment objects since they are just copies of the Kubernetes resources.
- In the appDeployment case, the user should set the deployment workload's
  **`Paused` field to true**.
  - Another option is for the user to leave the `replicas` field as 0 if the rollout does not have
    access to that field.
- If the rollout is successful, the source deployment's `replicas` field will be zero and the
  target deployment's `replicas` field will match that of the original source.
- If the rollout fails, we will leave the state as it is.
- If the rollout fails, `revertOnDelete` is `true`, and the rollout CR is deleted, then the
  source deployment's `replicas` field will be restored to its value before the rollout and the
  target deployment's `replicas` field will be zero.
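
As a rough illustration of the end states listed above, the sketch below computes the source and
target `replicas` after a given number of batches. The `replicasAfterBatches` helper is
hypothetical, and it assumes the source shrinks by exactly as much as the target grows, which
this proposal does not mandate.

```go
package main

import "fmt"

// replicasAfterBatches is a hypothetical helper. Given the total size, the
// per-batch sizes, and how many batches are done, it returns the replicas of
// the source and target deployments. It assumes a one-for-one hand-over.
func replicasAfterBatches(total int32, batches []int32, done int) (source, target int32) {
	if done < 0 {
		done = 0
	}
	if done > len(batches) {
		done = len(batches)
	}
	for _, b := range batches[:done] {
		target += b
	}
	if target > total {
		target = total
	}
	return total - target, target
}

func main() {
	batches := []int32{3, 3, 4}
	for done := 0; done <= len(batches); done++ {
		src, tgt := replicasAfterBatches(10, batches, done)
		fmt.Printf("batches done=%d source=%d target=%d\n", done, src, tgt)
	}
}
```

With `done` equal to the number of batches the source reaches zero, which is the successful end
state; `done` equal to zero corresponds to the state a revert would restore.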

#### Rollout plan works with cloneset
The outer controller creates the in-memory target and source cloneset objects with different image
ids. The source is only used when we need to roll back.
- In the appDeployment case, the user should set the Cloneset workload's
  **`Paused` field to true**.
  - Another option is for the user to leave the `partition` field at a value that effectively
    stops the upgrade if the rollout does not have access to that field.
- The rollout plan mostly just adjusts the `partition` field in the Cloneset and leaves the rest
  of the rollout logic to the Cloneset controller.
- If the rollout is successful, the `partition` field will be zero.
- If the rollout fails, we will leave the `partition` field as it was the last time we touched it.
- If the rollout fails, `revertOnDelete` is `true`, and the rollout CR is deleted, we will
  perform a revert on the Cloneset. Note that only the latest Cloneset controller allows rollback
  when one increases the `partition` field.
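
Unlike the deployment case, the cloneset rollout drives a single field on a single object. A
minimal sketch, assuming `partition` counts the pods that stay on the old revision and that the
hypothetical `partitionAfterBatches` helper is called once per completed batch:

```go
package main

import "fmt"

// partitionAfterBatches is a hypothetical helper. It returns the value the
// rollout plan would write into `partition` once the first `done` batches
// are upgraded, assuming partition is the number of pods kept on the old
// revision.
func partitionAfterBatches(total int32, batches []int32, done int) int32 {
	if done < 0 {
		done = 0
	}
	if done > len(batches) {
		done = len(batches)
	}
	var upgraded int32
	for _, b := range batches[:done] {
		upgraded += b
	}
	if upgraded > total {
		upgraded = total
	}
	return total - upgraded
}

func main() {
	batches := []int32{1, 4, 5}
	for done := 0; done <= len(batches); done++ {
		fmt.Printf("after %d batches: partition=%d\n",
			done, partitionAfterBatches(10, batches, done))
	}
}
```

The value only ever decreases during a normal rollout and ends at zero; a revert has to raise it
again, which is the operation the note above says needs a recent Cloneset controller.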

### Operational features
- We will use the same service mesh model as flagger in the sense that the user needs to provide
  the service mesh provider type and give us the reference to an ingress object.
  - We plan to directly import the various flagger mesh implementations.
- We plan to import the implementation of notifiers from flagger too.
- We can consider adding an alert rule in the rollout plan API in the future.

### We use webhook to validate whether the change to the rollout CRD is valid.
We will go with strict restrictions on the CRD update in that nothing can be updated other than
the following fields:
- the BatchPartition field can only increase unless the target ref has changed
- the RolloutBatches field can only change the part after the BatchPartition field
- the CanaryMetric/Paused/RevertOnDelete fields can be modified freely.

On an update, the rollout controller will simply replace the existing rollout CR value in the
in-memory map, which will cause its in-memory execution to stop. The new CR will kick off a new
execution loop which will resume the rollout operation based on the status of the rollout and its
resources, following the pre-determined state machine transitions.
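
The first two update rules can be sketched as a pure validation function. This is a hedged
illustration, not the webhook itself: the `rolloutPlan` type is a simplified stand-in (batches
reduced to replica counts, target reduced to a name), `validateUpdate` is a hypothetical name, and
treating `BatchPartition` as a zero-based index of the last batch to roll out is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// rolloutPlan is a simplified, hypothetical view of the fields being validated.
type rolloutPlan struct {
	TargetName     string
	BatchPartition int   // assumed: zero-based index of the last batch to roll out
	RolloutBatches []int // assumed: replicas per batch
}

// validateUpdate applies the two structural rules from the list above.
func validateUpdate(old, updated rolloutPlan) error {
	targetChanged := old.TargetName != updated.TargetName
	if !targetChanged && updated.BatchPartition < old.BatchPartition {
		return errors.New("batchPartition can only increase while the target is unchanged")
	}
	// Batches up to and including the old BatchPartition are frozen.
	frozen := old.BatchPartition + 1
	if frozen > len(old.RolloutBatches) {
		frozen = len(old.RolloutBatches)
	}
	if frozen < 0 {
		frozen = 0
	}
	if len(updated.RolloutBatches) < frozen {
		return errors.New("batches at or before batchPartition cannot be removed")
	}
	for i := 0; i < frozen; i++ {
		if old.RolloutBatches[i] != updated.RolloutBatches[i] {
			return fmt.Errorf("batch %d is at or before batchPartition and cannot change", i)
		}
	}
	return nil
}

func main() {
	old := rolloutPlan{TargetName: "web-v2", BatchPartition: 1, RolloutBatches: []int{2, 3, 5}}
	ok := rolloutPlan{TargetName: "web-v2", BatchPartition: 2, RolloutBatches: []int{2, 3, 4, 1}}
	bad := rolloutPlan{TargetName: "web-v2", BatchPartition: 0, RolloutBatches: []int{2, 3, 5}}
	fmt.Println(validateUpdate(old, ok))  // prints <nil>
	fmt.Println(validateUpdate(old, bad)) // prints the batchPartition error
}
```

Keeping the check free of API-server calls makes it easy to unit test and to reuse from both the
webhook and the controller.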

### The controller has extension points set up for the following plug-ins:
- workloads. Each workload handler needs to implement the following operations:
  - scale the resources
  - determine the health of the workload
  - report how many replicas are upgraded/ready/available
- (future) metrics provider.
- (future) service mesh provider. Each mesh provider needs to implement the following operations:
  - direct a certain percentage of the traffic to the source/target workload
  - fetch the current traffic split

## State Transition
Here is the state transition graph:

![Rollout state transition diagram](https://raw.githubusercontent.com/oam-dev/kubevela.io/main/docs/resources/approllout-status-transition.jpg)

Here are the various top-level states of the rollout:
```go
const (
	// VerifyingSpecState indicates that the rollout is in the stage of verifying the rollout settings
	// and the controller can locate both the target and the source
	VerifyingSpecState RollingState = "verifyingSpec"
	// InitializingState indicates that the rollout is initializing all the new resources
	InitializingState RollingState = "initializing"
	// RollingInBatchesState indicates that the rollout starts rolling
	RollingInBatchesState RollingState = "rollingInBatches"
	// FinalisingState indicates that the rollout is finalizing, possibly cleaning up the old resources and adjusting traffic
	FinalisingState RollingState = "finalising"
	// RolloutFailingState indicates that the rollout is failing
	// one needs to finalize it before marking it as failed by cleaning up the old resources and adjusting traffic
	RolloutFailingState RollingState = "rolloutFailing"
	// RolloutSucceedState indicates that the rollout successfully completed to match the desired target state
	RolloutSucceedState RollingState = "rolloutSucceed"
	// RolloutAbandoningState indicates that the rollout is abandoned, can be restarted. This is a terminal state
	RolloutAbandoningState RollingState = "rolloutAbandoned"
	// RolloutFailedState indicates that the rollout has failed, the target replica count is not reached
	// we cannot move forward anymore, we will let the client decide when or whether to revert.
	RolloutFailedState RollingState = "rolloutFailed"
)
```

These are the sub-states of the rollout when it's in the rolling state.
```go
const (
	// BatchInitializingState indicates that the batch is still being initialized
	BatchInitializingState BatchRollingState = "batchInitializing"
	// BatchInRollingState still rolling the batch, the batch rolling is not completed yet
	BatchInRollingState BatchRollingState = "batchInRolling"
	// BatchVerifyingState verifying if the application is ready to roll.
	BatchVerifyingState BatchRollingState = "batchVerifying"
	// BatchRolloutFailedState indicates that the batch didn't get the manual or automatic approval
	BatchRolloutFailedState BatchRollingState = "batchVerifyFailed"
	// BatchFinalizingState indicates that all the pods in the batch are available, we can move on to the next batch
	BatchFinalizingState BatchRollingState = "batchFinalizing"
	// BatchReadyState indicates that all the pods in the batch are upgraded and its state is ready
	BatchReadyState BatchRollingState = "batchReady"
)
```
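
An explicit state machine can be encoded as a transition table. The sketch below is illustrative
only: it reuses the top-level state names and values from this section, but the transitions in
`nextOnSuccess` are an assumed, simplified path inferred from the state descriptions. The
authoritative transitions are the ones in the diagram, and `advance` and `isTerminal` are
hypothetical helpers.

```go
package main

import "fmt"

// RollingState mirrors the top-level state type used in this section.
type RollingState string

const (
	VerifyingSpecState     RollingState = "verifyingSpec"
	InitializingState      RollingState = "initializing"
	RollingInBatchesState  RollingState = "rollingInBatches"
	FinalisingState        RollingState = "finalising"
	RolloutFailingState    RollingState = "rolloutFailing"
	RolloutSucceedState    RollingState = "rolloutSucceed"
	RolloutAbandoningState RollingState = "rolloutAbandoned"
	RolloutFailedState     RollingState = "rolloutFailed"
)

// nextOnSuccess is an ASSUMED table: the state each non-terminal state moves
// to when its own step completes. It is not taken from the diagram.
var nextOnSuccess = map[RollingState]RollingState{
	VerifyingSpecState:    InitializingState,
	InitializingState:     RollingInBatchesState,
	RollingInBatchesState: FinalisingState,
	FinalisingState:       RolloutSucceedState,
	RolloutFailingState:   RolloutFailedState,
}

// isTerminal reports whether the rollout stops in this state.
func isTerminal(s RollingState) bool {
	switch s {
	case RolloutSucceedState, RolloutFailedState, RolloutAbandoningState:
		return true
	}
	return false
}

// advance returns the next state, or an error if the table has no entry.
func advance(s RollingState) (RollingState, error) {
	next, ok := nextOnSuccess[s]
	if !ok {
		return s, fmt.Errorf("no transition defined out of %q", s)
	}
	return next, nil
}

func main() {
	s := VerifyingSpecState
	for !isTerminal(s) {
		next, err := advance(s)
		if err != nil {
			panic(err)
		}
		fmt.Println(s, "->", next)
		s = next
	}
}
```

Keeping the transitions in one table means every state change goes through `advance`, which is
the property the design principles ask for when they call for explicit state transitions.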

## Future work
The applicationDeployment should also work on traits. For example, if someone plans to update the
formula of an HPA trait, there should be a way for them to roll out the HPA change step by step too.
