# OAM Rollout Controller Design

- Owner: Ryan Zhang (@ryangzhang-oss)
- Date: 01/14/2021
- Status: Draft

## Table of Contents

- [Introduction](#introduction)
  - [Two flavors of Rollout](#two-flavors-of-rollout)
  - [Design Principles and Goals](#design-principles-and-goals)
  - [Proposal Layout](#proposal-layout)
- [Rollout API Design](#rollout-api-design)
- [User Experience Workflow](#user-experience-workflow)
- [Notable implementation level design decisions](#notable-implementation-level-design-decisions)
- [State Transition](#state-transition)
- [Future work](#future-work)

## Introduction

`Rollout` or `Upgrade` is one of the most essential "day 2" operations on any application.
KubeVela, as an application-centric platform, needs to provide a customized solution
to alleviate the burden on application operators. There are several popular rollout solutions
in the open source community, e.g. [flagger](https://flagger.app/). However, none of them
works directly with our OAM framework. Therefore, we propose to create an OAM-native rollout
framework that can address all the application rollout/upgrade needs in KubeVela.

### Two flavors of Rollout
After hearing from the OAM community, it is clear to us that there are two flavors of rollout
that we need to support.
- One way is through an OAM trait. This flavor works for existing OAM applications that don't
  have many specialized requirements, such as interaction with other traits. For
  example, most rollout operations don't work well with scaling operations. Thus, the
  application operator needs to remove any scaler traits from the component before applying the
  rollout trait. Another example is that rollout operations usually also involve traffic splitting
  of some sort. Thus, the application operators might also need to manually adjust the related
  traffic trait before and after applying the rollout trait.
- The other way is through a new applicationDeployment CR, which directly references different
  versions of applications instead of workloads. This opens up the possibility for the controller
  to resolve conflicts between traits automatically. This resource, however, is not a trait and requires
  the application to be immutable.

### Design Principles and Goals
We design our controllers with the following principles in mind:
- First, we want all flavors of rollout controllers to share the same core rollout
  logic. The trait and application related logic can be easily encapsulated into its own
  package.
- Second, the core rollout logic is easily extensible to support different types of
  workloads, e.g. Deployment, CloneSet, StatefulSet, DaemonSet, or even customized workloads.
- Third, the core rollout logic has a well-documented state machine that
  makes state transitions explicit.
- Finally, the controllers can support all the rollout/upgrade needs of an application running
  in a production environment.

### Proposal Layout
The rest of the proposal is organized as follows:
- First, we present the exact rollout CRD spec.
- Second, we give a high-level design of how we plan to implement the controller.
- Third, we present the state machine and its transition events.
- Finally, we list common rollout scenarios with their corresponding user experience and
  implementation details.

## Rollout API Design
Let's start with the rollout trait spec definition. The applicationDeployment spec is very similar.
```go
// corev1 refers to the package k8s.io/api/core/v1.

// RolloutTraitSpec defines the desired state of RolloutTrait
type RolloutTraitSpec struct {
	// TargetRef references a target resource that contains the newer version
	// of the software. We assume that the new resource already exists.
	// This is the only resource we work on if the resource is a stateful resource (cloneset/statefulset)
	TargetRef corev1.ObjectReference `json:"targetRef"`

	// SourceRef references the list of resources that contain the older version
	// of the software. We assume that this is the first deployment when we cannot find any source.
	// +optional
	SourceRef []corev1.ObjectReference `json:"sourceRef,omitempty"`

	// RolloutPlan is the details on how to rollout the resources
	RolloutPlan RolloutPlan `json:"rolloutPlan"`
}
```
The target and source here are the same as in other OAM traits: each refers to a workload instance
by its GVK and name.

We can see that the core part of the rollout logic is encapsulated by the `RolloutPlan` object. This
allows us to write multiple controllers that share the same logic without duplicating the code.
Here is the definition of the `RolloutPlan` structure.

```go
// RolloutPlan defines the details of the rollout plan
type RolloutPlan struct {
	// RolloutStrategy defines strategies for the rollout plan
	// +optional
	RolloutStrategy RolloutStrategyType `json:"rolloutStrategy,omitempty"`

	// The size of the target resource. The default is the same
	// as the size of the source resource.
	// +optional
	TargetSize *int32 `json:"targetSize,omitempty"`

	// The number of batches, default = 1
	// mutually exclusive to RolloutBatches
	// +optional
	NumBatches *int32 `json:"numBatches,omitempty"`

	// The exact distribution among batches.
	// mutually exclusive to NumBatches
	// +optional
	RolloutBatches []RolloutBatch `json:"rolloutBatches,omitempty"`

	// All pods in the batches up to the batchPartition (included) will have
	// the target resource specification while the rest still have the source resource.
	// This is designed for the operators to manually rollout.
	// Default is the number of batches, which will rollout all the batches
	// +optional
	BatchPartition *int32 `json:"lastBatchToRollout,omitempty"`

	// RevertOnDelete reverts the rollout when the rollout CR is deleted, default is false
	// +optional
	RevertOnDelete bool `json:"revertOnDelete,omitempty"`

	// Paused pauses the rollout, default is false
	// +optional
	Paused bool `json:"paused,omitempty"`

	// RolloutWebhooks provides a way for the rollout to interact with an external process
	// +optional
	RolloutWebhooks []RolloutWebhook `json:"rolloutWebhooks,omitempty"`

	// CanaryMetric provides a way for the rollout process to automatically check certain metrics
	// before completing the process
	// +optional
	CanaryMetric []CanaryMetric `json:"canaryMetric,omitempty"`
}
```

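
To make the interplay of `TargetSize`, `NumBatches` and `RolloutBatches` concrete, here is a minimal sketch of how batch sizes could be derived. The `plan` struct and `batchSizes` function are hypothetical simplifications: `TargetSize` is a plain value and `RolloutBatches` is reduced to per-batch replica counts. The even split with the remainder given to the earliest batches is an assumption, since the proposal does not specify how `NumBatches` distributes replicas.

```go
package main

import (
	"errors"
	"fmt"
)

// plan is a hypothetical, trimmed-down stand-in for RolloutPlan.
// RolloutBatches is simplified to a list of per-batch replica counts.
type plan struct {
	TargetSize     int32
	NumBatches     *int32
	RolloutBatches []int32
}

// batchSizes returns how many replicas each batch upgrades.
func batchSizes(p plan) ([]int32, error) {
	if p.NumBatches != nil && len(p.RolloutBatches) > 0 {
		return nil, errors.New("numBatches and rolloutBatches are mutually exclusive")
	}
	if len(p.RolloutBatches) > 0 {
		return p.RolloutBatches, nil
	}
	n := int32(1) // documented default: a single batch
	if p.NumBatches != nil {
		n = *p.NumBatches
	}
	if n < 1 {
		return nil, errors.New("numBatches must be at least 1")
	}
	// Assumption: split evenly and give the remainder to the earliest batches.
	sizes := make([]int32, n)
	for i := range sizes {
		sizes[i] = p.TargetSize / n
		if int32(i) < p.TargetSize%n {
			sizes[i]++
		}
	}
	return sizes, nil
}

func main() {
	three := int32(3)
	sizes, err := batchSizes(plan{TargetSize: 10, NumBatches: &three})
	fmt.Println(sizes, err) // [4 3 3] <nil>
}
```
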
## User Experience Workflow
The OAM rollout experience differs from flagger in some key areas. Here are the differences and
their impact on the user experience.
- We assume that the resources the rollout refers to are **immutable**. In contrast, flagger watches over a
  target resource and reacts whenever the target's specification changes.
    - The trait version of the controller refers to a componentRevision.
    - The application version of the controller refers to an immutable application.
- The rollout logic **works only once** and stops after it reaches a terminal state. One can
  still change the rollout plan in the middle of the rollout as long as the change does not affect the
  pods that are already updated.
- The applicationDeployment controller only rolls out one component in the application for now.
- Users in general should rely on the rollout CR to do the actual rollout, which means they
  should set the `replicas` or `partition` field of the new resources to the starting value
  indicated in the [detailed rollout plan design](#rollout-plan-work-with-different-type-of-workloads).

## Notable implementation level design decisions
Here are some high-level implementation design decisions that will impact the user experience of
rolling out.

### Rollout workflows
As we mentioned in the introduction section, we will implement two rollout controllers that work
on different levels. In the end, they both emit an in-memory rollout plan object which includes
references to the target and source Kubernetes resources that the rollout planner will act
upon. For example, the applicationDeployment controller will get the component from the
application and extract the real workload from it before passing it to the rollout plan object.

That said, the two controllers extract the real workload differently. Here are
high-level descriptions of how each works.

#### Application inplace upgrade workflow
The most natural way to upgrade an application is to upgrade it in place, which means the users
just change the application, and the system picks up the change and applies it to the runtime.
The implementation of this type of upgrade looks like this:
- The application controller computes a hash value of the applicationConfiguration (AC). The
  application controller **always** uses the component revision name in the AC it generates. This
  guarantees that the AC also changes when the component changes.
- The application controller creates the applicationConfiguration with a new name (with a suffix)
  when its hash value changes, and with a pre-determined annotation
  "app.oam.dev/appconfig-rollout" set to true.
- The AC controller has special handling logic in the apply part of its reconcile. The exact logic
  depends on the workload type, and we list each in the
  [rollout with different workload](#rollout-plan-work-with-different-type-of-workloads) section.
  This special AC logic is also the real magic that makes the other rollout scenario work, as the AC
  controller is the only entity that is directly responsible for emitting the workload to k8s.

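
The hash-and-rename step can be sketched as follows. This is only an illustration under stated assumptions: `appConfigSpec`, `specHash` and `nextAppConfig` are hypothetical names, the hash is a truncated SHA-256 of the JSON-encoded spec, and the name suffix is taken to be that hash. The proposal itself only says that the name gets a suffix and that the annotation is set.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Annotation named in the workflow above.
const rolloutAnnotation = "app.oam.dev/appconfig-rollout"

// appConfigSpec is a hypothetical stand-in for the generated AC spec.
// It lists component revision names, so a new revision changes the hash.
type appConfigSpec struct {
	ComponentRevisions []string `json:"componentRevisions"`
}

// specHash returns a short, deterministic hash of the spec.
func specHash(spec appConfigSpec) (string, error) {
	raw, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:])[:8], nil
}

// nextAppConfig returns the name and annotations of the AC to create.
// The boolean is false when the hash matches the previous one, meaning
// no new AC is needed.
func nextAppConfig(app, lastHash string, spec appConfigSpec) (string, map[string]string, bool, error) {
	hash, err := specHash(spec)
	if err != nil || hash == lastHash {
		return "", nil, false, err
	}
	// Assumption: the name suffix is the spec hash itself.
	return app + "-" + hash, map[string]string{rolloutAnnotation: "true"}, true, nil
}

func main() {
	spec := appConfigSpec{ComponentRevisions: []string{"web-v2"}}
	name, annotations, changed, err := nextAppConfig("myapp", "", spec)
	fmt.Println(name, annotations, changed, err)
}
```

Because the spec embeds the component revision name, any component change produces a new hash and therefore a new AC name.
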
#### ApplicationDeployment workflow
When an appDeployment is used to do an application-level rollout, **the target application
is not reconciled by the application controller yet**. This is to make sure the
appDeployment controller has full control of the new application from the beginning.
We will use a pre-defined annotation "app.oam.dev/rollout-template" set to "true" to facilitate
that. We expect any system, such as the [kubevela apiserver](APIServer-Catalog.md), that
utilizes an appDeployment object to follow this rule.
- Upon creation, the appDeployment controller marks itself as the owner of the application. The
  application controller will have built-in logic to ignore any application that has the
  "app.oam.dev/rollout-template" annotation set to true.
- The appDeployment controller will also add another annotation, "app.oam.dev/creating", to the
  application. This annotation is passed down to the ApplicationConfiguration CR generated from it to mark
  that the AC is being reconciled for the first time.
- The ApplicationConfiguration controller recognizes this annotation and checks whether there is
  anything it needs to do before emitting the workload to k8s. The AC controller removes this
  annotation at the end of a successful reconcile.
- The appDeployment controller can change the target application fields. For example,
   - It might remove all the conflicting traits, such as HPA, during the upgrade.
   - It might modify the label selector fields in the services to make sure there are ways to
     differentiate traffic routing to the old and new application resources.
- The appDeployment controller will return control of the new application to the
  application controller after it makes the initial adjustment of the application, by removing the
  annotation.
  - We will use a webhook to ensure that the "rollout" annotation cannot be added back once removed.
- Upon a successful rollout, the appDeployment controller leaves no pods running for the old
  application.
- Upon a failed rollout, the outcome is undetermined; it could result in an unstable state
  since both the old and new applications have been modified.
- Thus, we introduce a `revertOnDelete` field so that a user can delete the appDeployment and
  expect the old application to be intact and the new application to take no effect.

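
The annotation handshake above can be sketched in a few lines. The function names are hypothetical, and the sketch assumes that the annotation removed to hand control back is "app.oam.dev/rollout-template" and that it may only be set at creation time, which is one reading of the webhook rule.

```go
package main

import (
	"errors"
	"fmt"
)

const rolloutTemplateAnnotation = "app.oam.dev/rollout-template"

// ignoredByAppController reports whether the application controller should
// skip an application because the appDeployment controller still owns it.
func ignoredByAppController(annotations map[string]string) bool {
	return annotations[rolloutTemplateAnnotation] == "true"
}

// handBack returns control to the application controller by removing
// the annotation.
func handBack(annotations map[string]string) {
	delete(annotations, rolloutTemplateAnnotation)
}

// validateAnnotationUpdate mimics the webhook rule on updates:
// once the annotation is gone, it cannot be added back.
func validateAnnotationUpdate(oldAnn, newAnn map[string]string) error {
	_, had := oldAnn[rolloutTemplateAnnotation]
	_, has := newAnn[rolloutTemplateAnnotation]
	if !had && has {
		return errors.New(rolloutTemplateAnnotation + " cannot be added back once removed")
	}
	return nil
}

func main() {
	app := map[string]string{rolloutTemplateAnnotation: "true"}
	fmt.Println(ignoredByAppController(app)) // true
	handBack(app)
	fmt.Println(ignoredByAppController(app)) // false
	readded := map[string]string{rolloutTemplateAnnotation: "true"}
	fmt.Println(validateAnnotationUpdate(app, readded) != nil) // true
}
```
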
#### Rollout trait workflow
The rollout trait controller only works with componentRevision.
- The component controller emits a new component revision when a component is created or
  updated.
- The application configuration controller emits the new component and assigns the
  componentRevisions as the source and target of the rollout trait.
- We assume that no other scaler-related traits are deployed at the same time. We will use
  `conflict-with` fields in the traitDefinition and webhooks to enforce that.
- Upon a successful rollout, the rollout trait keeps the old component revision with no
  pods left.
- Upon a failed rollout, the rollout trait just stops and leaves the resources in a mixed state. This
  state should mostly still work since the other traits are not touched.

### Rollout plan work with different type of workloads
The rollout plan part of the rollout logic is shared between all rollout controllers. It comes
with references to the target and source workloads. The controller is responsible for fetching
the different revisions of the resources. Deployment and CloneSet represent the two major types
of resources: a CloneSet can contain both the new and old revisions in a stable state, while
a Deployment only contains one version when it is stable.

#### Rollout plan works with deployment
It's pretty straightforward for the outer controller to create the in-memory target and source
deployment objects since they are just copies of the Kubernetes resources.
- The user should set the deployment workload's **`Paused` field to true** in the
  appDeployment case.
- Another option is for the user to leave the `replicas` field as 0 if the rollout does not have
  access to that field.
- If the rollout is successful, the source deployment's `replicas` field will be zero and the
  target deployment's size will be the same as the original source's.
- If the rollout fails, we leave the state as it is.
- If the rollout fails, `revertOnDelete` is `true`, and the rollout CR is deleted, then the
  source deployment's `replicas` field is restored to its value before the rollout and the
  target deployment's `replicas` field is set to zero.

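
The replica bookkeeping for the two deployments can be sketched as below. The helper names are hypothetical, and the sketch assumes that the batch sizes add up to the original source size and that the source shrinks by exactly what the target gains after each batch. The proposal only fixes the end states (success and revert).

```go
package main

import "fmt"

// replicasAfter returns the source and target `replicas` values once the
// first `done` batches have finished. The sum of all batches is taken to be
// the original size of the source deployment.
func replicasAfter(batches []int32, done int) (source, target int32) {
	var total int32
	for i, size := range batches {
		total += size
		if i < done {
			target += size
		}
	}
	return total - target, target
}

// replicasAfterRevert models a failed rollout with revertOnDelete set:
// the source returns to its original size and the target goes to zero.
func replicasAfterRevert(batches []int32) (source, target int32) {
	return replicasAfter(batches, 0)
}

func main() {
	batches := []int32{4, 3, 3}
	for done := 0; done <= len(batches); done++ {
		source, target := replicasAfter(batches, done)
		fmt.Printf("after %d batches: source=%d target=%d\n", done, source, target)
	}
}
```
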
#### Rollout plan works with cloneset
The outer controller creates the in-memory target and source CloneSet objects with different image
ids. The source is only used when we need to roll back.
- The user should set the CloneSet workload's **`Paused` field to true** in the
  appDeployment case.
- Another option is for the user to leave the `partition` field at a value that effectively stops the
  upgrade if the rollout does not have access to that field.
- The rollout plan mostly just adjusts the `partition` field in the CloneSet and leaves the rest
  of the rollout logic to the CloneSet controller.
- If the rollout is successful, the `partition` field will be zero.
- If the rollout fails, we leave the `partition` field as it was the last time we touched it.
- If the rollout fails, `revertOnDelete` is `true`, and the rollout CR is deleted, we
  perform a revert on the CloneSet. Note that only the latest CloneSet controller allows rollback
  when one increases the `partition` field.

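
Since a CloneSet's `partition` counts the pods kept on the old revision, the value the rollout plan writes after each batch can be sketched as the number of replicas not yet upgraded. `partitionAfter` is a hypothetical helper, and the batch sizes are assumed to add up to the CloneSet's total replicas.

```go
package main

import "fmt"

// partitionAfter returns the CloneSet `partition` value once the first
// `done` batches are upgraded: the replicas that still run the old revision.
func partitionAfter(batches []int32, done int) int32 {
	var remaining int32
	for i, size := range batches {
		if i >= done {
			remaining += size
		}
	}
	return remaining
}

func main() {
	batches := []int32{4, 3, 3}
	// done=0 gives the starting value that effectively stops the upgrade.
	for done := 0; done <= len(batches); done++ {
		fmt.Printf("after %d batches: partition=%d\n", done, partitionAfter(batches, done))
	}
}
```

Under this model a revert simply raises `partition` back to the total, which matches the note that rollback requires a CloneSet controller that accepts an increasing `partition`.
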
### Operational features
- We will use the same service mesh model as flagger, in the sense that the user needs to provide the
  service mesh provider type and give us a reference to an ingress object.
    - We plan to directly import the various flagger mesh implementations.
- We plan to import the implementation of notifiers from flagger too.
- We can consider adding an alert rule to the rollout plan API in the future.

### We use a webhook to validate whether a change to the rollout CRD is valid
We will place strict restrictions on updates to the rollout CR: nothing can be updated other than
the following fields:
- The BatchPartition field can only increase unless the target ref has changed.
- The RolloutBatches field can only change in the part after the BatchPartition field.
- The CanaryMetric/Paused/RevertOnDelete fields can be modified freely.

When the CR is updated, the rollout controller simply replaces the existing rollout CR value in the in-memory map,
which causes the in-memory execution to stop. The new CR kicks off a new execution
loop that resumes the rollout operation based on the status of the rollout and its resources,
following the pre-determined state machine transitions.

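
The update rules can be sketched as a pure validation function. Everything here is illustrative: `rolloutSpec` is a hypothetical, simplified view of the CR, `BatchPartition` is assumed to be a 0-based index of the last batch to roll out, an unchanged `BatchPartition` is accepted, and the frozen-batch check is applied even when the target ref changes, which the proposal leaves open.

```go
package main

import (
	"errors"
	"fmt"
)

// rolloutSpec is a hypothetical, trimmed-down view of the fields the
// webhook compares.
type rolloutSpec struct {
	TargetRef      string  // simplified to a name
	BatchPartition int32   // assumed 0-based index of the last batch to roll out
	RolloutBatches []int32 // simplified to per-batch replica counts
}

// validateUpdate rejects updates that would alter batches that may
// already have been rolled out.
func validateUpdate(oldSpec, newSpec rolloutSpec) error {
	targetChanged := oldSpec.TargetRef != newSpec.TargetRef
	if !targetChanged && newSpec.BatchPartition < oldSpec.BatchPartition {
		return fmt.Errorf("batchPartition cannot decrease from %d to %d",
			oldSpec.BatchPartition, newSpec.BatchPartition)
	}
	// Batches up to and including the old partition are treated as frozen.
	frozen := int(oldSpec.BatchPartition) + 1
	if frozen > len(oldSpec.RolloutBatches) {
		frozen = len(oldSpec.RolloutBatches)
	}
	if len(newSpec.RolloutBatches) < frozen {
		return errors.New("rolloutBatches cannot drop batches at or before batchPartition")
	}
	for i := 0; i < frozen; i++ {
		if oldSpec.RolloutBatches[i] != newSpec.RolloutBatches[i] {
			return fmt.Errorf("rolloutBatches[%d] is at or before batchPartition and cannot change", i)
		}
	}
	return nil
}

func main() {
	oldSpec := rolloutSpec{TargetRef: "web-v2", BatchPartition: 0, RolloutBatches: []int32{2, 4, 4}}
	ok := rolloutSpec{TargetRef: "web-v2", BatchPartition: 1, RolloutBatches: []int32{2, 3, 5}}
	bad := rolloutSpec{TargetRef: "web-v2", BatchPartition: 1, RolloutBatches: []int32{3, 3, 4}}
	fmt.Println(validateUpdate(oldSpec, ok))  // <nil>
	fmt.Println(validateUpdate(oldSpec, bad)) // error about rolloutBatches[0]
}
```
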
### The controller has extension points set up for the following plug-ins:
- Workloads. Each workload handler needs to implement the following operations:
    - scale the resources
    - determine the health of the workload
    - report how many replicas are upgraded/ready/available
- (future) Metrics provider.
- (future) Service mesh provider. Each mesh provider needs to implement the following operations:
    - direct a certain percentage of the traffic to the source/target workload
    - fetch the current traffic split

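
One possible shape for the workload extension point is a small Go interface with one method per operation listed above. The interface, method names and the in-memory `fakeWorkload` are all hypothetical; a real handler would talk to the Kubernetes API and would likely take a context.

```go
package main

import (
	"errors"
	"fmt"
)

// replicaStatus reports rollout progress for one workload.
type replicaStatus struct {
	Upgraded, Ready, Available int32
}

// workloadHandler is a hypothetical extension point covering the three
// operations every workload plug-in needs to provide.
type workloadHandler interface {
	Scale(replicas int32) error
	Healthy() (bool, error)
	Status() (replicaStatus, error)
}

// fakeWorkload is an in-memory handler in which every replica becomes
// ready immediately.
type fakeWorkload struct {
	replicas int32
}

func (f *fakeWorkload) Scale(replicas int32) error {
	if replicas < 0 {
		return fmt.Errorf("invalid replica count %d", replicas)
	}
	f.replicas = replicas
	return nil
}

func (f *fakeWorkload) Healthy() (bool, error) { return true, nil }

func (f *fakeWorkload) Status() (replicaStatus, error) {
	return replicaStatus{Upgraded: f.replicas, Ready: f.replicas, Available: f.replicas}, nil
}

// rollBatch grows the workload by one batch and checks health afterwards.
func rollBatch(h workloadHandler, batch int32) (replicaStatus, error) {
	st, err := h.Status()
	if err != nil {
		return replicaStatus{}, err
	}
	if err := h.Scale(st.Upgraded + batch); err != nil {
		return replicaStatus{}, err
	}
	healthy, err := h.Healthy()
	if err != nil {
		return replicaStatus{}, err
	}
	if !healthy {
		return replicaStatus{}, errors.New("workload unhealthy after scaling")
	}
	return h.Status()
}

func main() {
	h := &fakeWorkload{}
	st, err := rollBatch(h, 4)
	fmt.Println(st, err) // {4 4 4} <nil>
}
```

Keeping the shared rollout logic behind such an interface is what lets one rollout plan drive Deployment, CloneSet or custom workloads.
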
## State Transition
Here is the state transition graph

![](https://raw.githubusercontent.com/oam-dev/kubevela.io/main/docs/resources/approllout-status-transition.jpg)

Here are the various top-level states of the rollout:
```go
// RollingState is the top-level state of the rollout
type RollingState string

const (
	// VerifyingSpecState indicates that the rollout is in the stage of verifying the rollout settings
	// and the controller can locate both the target and the source
	VerifyingSpecState RollingState = "verifyingSpec"
	// InitializingState indicates that the rollout is initializing all the new resources
	InitializingState RollingState = "initializing"
	// RollingInBatchesState indicates that the rollout starts rolling
	RollingInBatchesState RollingState = "rollingInBatches"
	// FinalisingState indicates that the rollout is finalizing, possibly cleaning up the old resources and adjusting traffic
	FinalisingState RollingState = "finalising"
	// RolloutFailingState indicates that the rollout is failing;
	// one needs to finalize it before marking it as failed by cleaning up the old resources and adjusting traffic
	RolloutFailingState RollingState = "rolloutFailing"
	// RolloutSucceedState indicates that the rollout successfully completed to match the desired target state
	RolloutSucceedState RollingState = "rolloutSucceed"
	// RolloutAbandoningState indicates that the rollout is abandoned and can be restarted. This is a terminal state
	RolloutAbandoningState RollingState = "rolloutAbandoned"
	// RolloutFailedState indicates that the rollout has failed and the target replica count is not reached;
	// we cannot move forward anymore, so we let the client decide when or whether to revert.
	RolloutFailedState RollingState = "rolloutFailed"
)
```

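
Because the rollout logic stops once it reaches a terminal state, a reconcile loop needs a check along the following lines. This is a sketch under an assumption: only `rolloutAbandoned` is explicitly called terminal above, and treating `rolloutSucceed` and `rolloutFailed` as terminal too is inferred from their descriptions. `isTerminal` is a hypothetical helper, and only the states needed for the example are redeclared.

```go
package main

import "fmt"

// RollingState mirrors the top-level state type of the rollout.
type RollingState string

const (
	RollingInBatchesState  RollingState = "rollingInBatches"
	RolloutSucceedState    RollingState = "rolloutSucceed"
	RolloutAbandoningState RollingState = "rolloutAbandoned"
	RolloutFailedState     RollingState = "rolloutFailed"
)

// isTerminal reports whether the rollout loop should stop reconciling.
// Assumption: succeed, failed and abandoned are the terminal states.
func isTerminal(s RollingState) bool {
	switch s {
	case RolloutSucceedState, RolloutFailedState, RolloutAbandoningState:
		return true
	default:
		return false
	}
}

func main() {
	fmt.Println(isTerminal(RollingInBatchesState), isTerminal(RolloutSucceedState)) // false true
}
```
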
These are the sub-states of the rollout when it is in the rolling state.
```go
// BatchRollingState is the sub-state of the rollout while it is rolling in batches
type BatchRollingState string

const (
	// BatchInitializingState indicates that the batch is still being initialized
	BatchInitializingState BatchRollingState = "batchInitializing"
	// BatchInRollingState still rolling the batch, the batch rolling is not completed yet
	BatchInRollingState BatchRollingState = "batchInRolling"
	// BatchVerifyingState verifying if the application is ready to roll.
	BatchVerifyingState BatchRollingState = "batchVerifying"
	// BatchRolloutFailedState indicates that the batch didn't get the manual or automatic approval
	BatchRolloutFailedState BatchRollingState = "batchVerifyFailed"
	// BatchFinalizingState indicates that all the pods in the batch are available, we can move on to the next batch
	BatchFinalizingState BatchRollingState = "batchFinalizing"
	// BatchReadyState indicates that all the pods in the batch are upgraded and its state is ready
	BatchReadyState BatchRollingState = "batchReady"
)
```

## Future work
The applicationDeployment should also work on traits. For example, if someone plans to update the
HPA trait's formula, there should be a way for them to roll out the HPA change step by step too.