# Application-Level Policies and Customized Control-Logic Workflow Design

## Background

The current model consists mainly of Components and Traits.
While this enables the Application object to plug in operational capabilities, it is still not flexible enough.
Specifically, it has the following limitations:

- The current control logic cannot be customized. Once the Vela controller renders final k8s resources, it simply applies them without any extension points. In some scenarios, users want to do more complex operations like:
  - Blue-green style upgrade of the app.
  - User interaction like manual approval/rollback.
  - Distributing workloads across multiple clusters.
  - Actions to enforce policies and audit.
  - Pushing final k8s resources to other config stores (e.g. Git repos).
- There is only per-component config, but no application-level policies. In some scenarios, users want to define policies like:
  - Security: RBAC rules, audit settings, secret backend types.
  - Insights: app delivery lead time, frequency, MTTR.

Here is an overview of the features we want to expose and the capabilities we want to plug in:

## Proposal

To resolve the aforementioned problems, we propose to add app-level policies and a customizable workflow to the Application CRD:

```yaml
kind: Application
spec:
  components: ...

  # Policies are rendered after components are rendered but before the workflow is started
  policies:
    - type: security
      name: my-rule
      properties:
        rbac: enabled
        audit: enabled
        secretBackend: vault

    - type: deployment-insights
      name: my-deploy-insight
      properties:
        leadTime: enabled
        frequency: enabled
        mttr: enabled

  # workflow is used to customize the control logic.
  # If workflow is specified, Vela won't apply any resource, but provide rendered resources in a ConfigMap, referenced via AppRevision.
  # workflow steps are executed in array order, and each step:
  # - will have a context in annotation.
  # - should mark "finish" phase in status.conditions.
  workflow:

    steps:

    # blue-green rollout
    - type: blue-green-rollout
      stage: post-render # stage could be pre/post-render. Default is post-render.
      properties:
        partition: "50%"

    # suspend manually stops the workflow until it is resumed. It also allows a suspend policy for the workflow.
    - type: suspend

    # traffic shift
    - type: traffic-shift
      properties:
        partition: "50%"

    # promote/rollback
    - type: rollout-promotion
      properties:
        manualApproval: true
        rollbackIfNotApproved: true
```

This also implies that we will add two Definition CRDs -- `PolicyDefinition` and `WorkflowStepDefinition`.

A PolicyDefinition looks like this:

```yaml
apiVersion: core.oam.dev/v1beta1
kind: PolicyDefinition
spec:
  schematic:
    cue:
      template: |
        parameters: {
          frequency: *"enabled" | "disabled"
        }
        output: {
          apiVersion: "app.oam.dev/v1"
          kind:       "Insight"
          spec: {
            frequency: parameters.frequency
          }
        }
```

### CUE-Based Workflow Task

Outputting a CR object to complete a task in a workflow requires users to implement an Operator, which incurs heavy overhead.
To simplify this, especially for users with simple use cases, we decided to provide a lightweight CUE-based workflow task.

```yaml
apiVersion: core.oam.dev/v1beta1
kind: WorkflowStepDefinition
metadata:
  name: apply
spec:
  schematic:
    cue:
      template: |
        import "vela/op"

        parameters: {
          image: string
        }

        apply: op.#Apply & {
          resource: context.workload
        }

        wait: op.#ConditionalWait & {
          continue: apply.status.ready == true
        }

        export: op.#Export & {
          secret: apply.status.secret
        }
```

### Stability mechanism

#### Backoff Time

Sometimes a workflow step can take a long time, so we need a backoff time for workflow reconciliation.

If the status of a workflow step is `waiting` or `failed`, the workflow will be reconciled after a backoff time (in seconds) computed as follows, where `n` is the number of reconciliations so far:

```
int(0.05 * 2^(n-1))
```

The result of the formula is clamped to a minimum of `1s` and a maximum of `60s`. You can change the maximum by setting `MaxWorkflowWaitBackoffTime`.

For example, if the workflow is `waiting`, the first ten reconciliations will look like this:

| Times | 2^(n-1) | 0.05*2^(n-1) | Requeue After (s) |
| ----- | ------- | ------------ | ----------------- |
| 1     | 1       | 0.05         | 1                 |
| 2     | 2       | 0.1          | 1                 |
| 3     | 4       | 0.2          | 1                 |
| 4     | 8       | 0.4          | 1                 |
| 5     | 16      | 0.8          | 1                 |
| 6     | 32      | 1.6          | 1                 |
| 7     | 64      | 3.2          | 3                 |
| 8     | 128     | 6.4          | 6                 |
| 9     | 256     | 12.8         | 12                |
| 10    | 512     | 25.6         | 25                |
| ...   | ...     | ...          | ...               |

#### Failed Workflow Steps

If a workflow step is `failed`, it means that there may be an error in the workflow step itself, such as a CUE error.

> Note that if the workflow step is unhealthy, it will be marked as `wait` rather than `failed`, and it will wait until it becomes healthy.


For a `failed` step, we will retry the workflow step 10 times by default. If the workflow step is still `failed` after that, we will terminate the workflow, and its message will be `The workflow terminates automatically because the failed times of steps have reached the limit`. You can change the number of retries by setting `MaxWorkflowStepErrorRetryTimes`.

## Implementation

In this section we will discuss the implementation details for supporting policies and workflow tasks.

Here's a diagram of how workflow internals work:

### 1. Application Controller

Here are the steps in Application Controller:

- On reconciling an Application event, Application Controller will render out all resources from components, traits, and policies.
  It will also put the rendered resources into a ConfigMap, and reference the ConfigMap name in AppRevision as below:

  ```yaml
  kind: ApplicationRevision
  spec:
    ...
    resourcesConfigMap:
      name: my-app-v1
  ---

  kind: ConfigMap
  metadata:
    name: my-app-v1
  data:
    mysvc: |
      {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {
          "name": "mysvc"
        },
        "spec": {
          "replicas": 1
        }
      }
    ...more name:data pairs...
  ```

- After rendering, Application Controller will execute `spec.workflow`.
  This basically calls Workflow Manager to execute workflow tasks, starting from scratch or, on retry, from the last-run step.


### 2. Workflow Manager

Here are the steps in Workflow Manager:

- The Workflow Manager will get the current workflow step via `status.workflow.stepIndex`.
- If `stepIndex` is equal to the total number of steps, the workflow is done and Workflow Manager returns immediately.
- If there are workflow tasks left, they will be run step by step. For each step, Workflow Manager will call Task Manager to handle it.
- On return from calling Task Manager, Workflow Manager checks the result:
  - If `status = completed`, Workflow Manager will increment `status.workflow.stepIndex`, and continue to run the next step if any.
  - Otherwise, it will retry later.

### 3. Task Manager

Here are the steps in Task Manager:

- A workflow task is executed synchronously, which requires that the steps of a task be non-blocking.
- A workflow task will be parsed with its properties first to retrieve the full CUE data.
- Task Manager will get all do-able steps from the CUE data. This is done by checking whether the step has a `#do` field.
  Here is an example:

  ```
  apply: op.#Apply & {
    resource: ...
  }
  ```

  The `op.#Apply` contains a [hidden field][2] `#do`:

  ```yaml
  #Apply: {
    #do: "apply"
    ...
  }
  ```

  This will inject the `#do` field into the `apply` step.

- All do-able steps will be executed one by one by Task Manager.


### 4. CUE Step Execution

- Task Manager will keep a map of actions.
  An action follows this interface:

  ```go
  type TaskAction interface {
    // cueValue is the parsed CUE value for this action
    Run(cueValue interface{}) (TaskStatus, error)
  }
  ```

- Task Manager will use the `#do` field of the CUE step as the key to find an action to run.

- An action returns a status indicating what to do next:
  - continue: continue to run the next action.
  - wait: makes the workflow manager retry later.
  - break: makes the workflow manager stop the entire workflow.
  - failedAfterRetries: if there are no other running steps, makes the workflow manager suspend the workflow.

- Task Manager will change status as needed based on the returned TaskStatus, e.g. change to wait.
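
The lookup-and-dispatch flow above can be sketched in Go as follows. This is an illustrative sketch, not the actual Task Manager: only the `TaskAction` interface and the four status names come from this document, while `actionFunc`, `doableStep`, `runTask`, `demoActions`, and the choice to return an error for an unknown `#do` key are assumptions made for the example.

```go
package main

import "fmt"

// TaskStatus tells the Workflow Manager what to do next.
type TaskStatus string

const (
	StatusContinue           TaskStatus = "continue"
	StatusWait               TaskStatus = "wait"
	StatusBreak              TaskStatus = "break"
	StatusFailedAfterRetries TaskStatus = "failedAfterRetries"
)

// TaskAction is the interface described in this section.
type TaskAction interface {
	Run(cueValue interface{}) (TaskStatus, error)
}

// actionFunc adapts a plain function to TaskAction (hypothetical helper).
type actionFunc func(cueValue interface{}) (TaskStatus, error)

func (f actionFunc) Run(v interface{}) (TaskStatus, error) { return f(v) }

// doableStep is a hypothetical stand-in for a parsed CUE step:
// the value of its #do field plus the step's CUE value.
type doableStep struct {
	Do    string
	Value interface{}
}

// runTask runs do-able steps in order, using #do as the key into the
// action map. It stops at the first status other than continue.
func runTask(actions map[string]TaskAction, steps []doableStep) (TaskStatus, error) {
	for _, s := range steps {
		action, ok := actions[s.Do]
		if !ok {
			// Assumption: an unregistered #do key is reported as an error.
			return "", fmt.Errorf("no action registered for #do %q", s.Do)
		}
		status, err := action.Run(s.Value)
		if err != nil {
			return status, err
		}
		if status != StatusContinue {
			return status, nil
		}
	}
	return StatusContinue, nil
}

// demoActions registers two toy actions for the example.
func demoActions() map[string]TaskAction {
	return map[string]TaskAction{
		"apply": actionFunc(func(interface{}) (TaskStatus, error) {
			return StatusContinue, nil
		}),
		"wait": actionFunc(func(v interface{}) (TaskStatus, error) {
			// The toy wait action treats its value as the `continue` condition.
			if ready, _ := v.(bool); ready {
				return StatusContinue, nil
			}
			return StatusWait, nil
		}),
	}
}

func main() {
	steps := []doableStep{{Do: "apply"}, {Do: "wait", Value: false}}
	status, err := runTask(demoActions(), steps)
	fmt.Println(status, err)
}
```

Keeping actions behind a map keyed by `#do` means a new operation in the `vela/op` library only needs a new map entry, without changes to the dispatch loop.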


## Task Action

These are the task actions to be supported in the `vela/op` CUE lib:


- Load: loads the rendered component resources

  ```
  #Load: {
    #do: "load"
    component?: string
  }
  ```

- Read: reads a k8s resource object

  ```
  #Read: {
    #do: "read"
    apiVersion: string
    kind: string
    namespace: string
    name: string
  }
  ```

- Apply: applies a k8s resource object

  ```
  #Apply: {
    #do: "apply"
    resource: string
  }
  ```

- Wait: waits until the `continue` condition is ready; otherwise makes the controller reconcile later.

  ```
  #Wait: {
    #do: "wait"
    continue: bool
  }
  ```

- Break: breaks from the workflow and reports a reason message.

  ```
  #Break: {
    #do: "break"
    message: string
  }
  ```


- Export: exports data into the context for other workflow tasks to reuse

  ```
  #Export: {
    #do: "export"
    type: "patch" | *"var"
    if type == "patch" {
      component: string
    }
    value: _
  }
  ```

## Workflow Operation

These are the operations that users can use to control the workflow at a global level.

### 1. Terminate Workflow

If the execution of the workflow does not meet expectations, it may be necessary to terminate the workflow.

There are two ways to achieve that:

1. Modify the `workflow.terminated` field in status

   ```yaml
   kind: Application
   metadata:
     name: foo
   status:
     phase: runningWorkflow
     workflow:
       stepIndex: 1
       terminated: true
       steps:
       - name: ...
   ```


2. Use `op.#Break` in the WorkflowStepDefinition. When the task is executed, the `op.#Break` is captured and the terminated status is reported

   ```yaml
   if job.status == "failed" {
     break: op.#Break & {
       message: "job failed: " + job.status.message
     }
   }
   ```

### 2. Pause Workflow

1. Set the `workflow.suspend` field to true to pause the workflow

   ```yaml
   kind: Application
   metadata:
     name: foo
   status:
     phase: runningWorkflow
     workflow:
       stepIndex: 1
       suspend: true
       steps:
       - name: ...
   ```

2. Use the built-in suspend task, which pauses the workflow, as in the following example

   ```yaml
   kind: Application
   spec:
     components: ...
     workflow:
       steps:
       - name: manual-approve
         type: suspend
   ```

   The `workflow.suspend` field will be set to true after the suspend-type task is started.

### 3. Resume Workflow

Set the `workflow.suspend` field to false to resume the workflow

```yaml
kind: Application
metadata:
  name: foo
status:
  phase: runningWorkflow
  workflow:
    stepIndex: 1
    suspend: false
    steps:
    - name: ...
```

### 4. Restart Workflow

The workflow will be restarted in the following two cases:

1. The `status.phase` field is set to "runningWorkflow" and the status of the workflow is cleared

   ```yaml
   kind: Application
   metadata:
     name: foo
   status:
     phase: runningWorkflow
     workflow: {}
   ```


2. The application spec changes

   A spec change also means that the application needs to be re-executed, so the application controller will clear the status of the application, including the workflow status.


## Operator Best Practice

Each workflow task has similar interactions with Task Manager as follows:

- The Task Manager will apply the workflow object with annotation `app.oam.dev/workflow-context`.
  This annotation passes in the context, marshalled as JSON, which is defined as follows:
  ```go
  type WorkflowContext struct {
    cli        client.Client
    store      *corev1.ConfigMap
    components map[string]*ComponentManifest
    vars       *value.Value
    modified   bool
  }
  ```

- The workflow object's status condition should turn to `True` status with `Succeeded` reason, and `observedGeneration` should match the resource's own generation.
  This is to solve the [issue of passing data from the old generation][1].
  We will provide a CUE op library to check this condition to decide whether to wait.

  ```yaml
  kind: SomeTask
  metadata:
    generation: 2
  status:
    observedGeneration: 2
    conditions:
      - type: workflow-progress
        status: 'True'
        reason: 'Succeeded'
  ```

## Use Cases

In this section we will walk through how we implement workflow solutions for the following use cases.

### Case 1: Multi-cluster

In this case, users want to distribute workloads to multiple clusters. The dispatcher implementation is flexible and could be based on [open-cluster-management](https://open-cluster-management.io/) or other methods.

```yaml
workflow:
  steps:
    - type: open-cluster-management
      properties:
        placement:
          - clusterSelector:
              region: east
            replicas: "70%"
          - clusterSelector:
              region: west
            replicas: "20%"
```

The process goes as follows:

- During infra setup, the Cluster objects are applied and agents are set up in each cluster to manage the lifecycle of k8s clusters.
- Once the Application is applied, the OCM controller can retrieve all rendered resources from AppRevision. It will apply a ManifestWork object including all resources. Then the OCM agent will execute the workload creation in each cluster.

### Case 2: Blue-green rollout

In this case, users want to roll out a new version of the application components in a blue-green rolling upgrade style.

```yaml
workflow:
  steps:
    # blue-green rollout
    - type: blue-green-rollout
      properties:
        partition: "50%"

    # traffic shift
    - type: traffic-shift
      properties:
        partition: "50%"

    # promote/rollback
    - type: rollout-promotion
      properties:
        manualApproval: true
        rollbackIfNotApproved: true
```

The process goes as follows:

- By default, each modification of the Application object will generate an AppRevision object. The rollout controller will get the current revision from the context and retrieve the previous revision via the kube API.
- Then the rollout controller will roll replicas between the two revisions (the actual behavior depends on the workload type, e.g. Deployment or CloneSet).
- Once the rollover is done, the rollout controller can shift partial traffic to the new revision too.
- The rollout controller will wait for the manual approval. In this case, it is in the status of the Rollout object:
  ```yaml
  kind: Rollout
  status:
    pause: true # change this to false
  ```

  The reference to the Rollout object will be in the Application object:
  ```yaml
  apiVersion: core.oam.dev/v1beta1
  kind: Application
  status:
    workflow:
      steps:
      - type: rollout-promotion
        resourceRef:
          kind: Rollout
          name: ...
  ```

### Case 3: Data Passing

In this case, users want to deploy a database component first, wait for the database to be up and ready, and then deploy the application with the database connection secret.

```yaml
components:
  - name: my-db
    type: mysql
    properties:

  - name: my-app
    type: webservice


workflow:
  steps:
    # Wait for the MySQL object's status.connSecret to have value.
    - type: apply-component
      outputs:
        - name: connSecret
          valueFrom: output.status.connSecret
      properties:
        name: my-db

    # Patch my-app Deployment object's field with the secret name
    # emitted from MySQL object. And then apply my-app component.
    - type: apply-component
      inputs:
        - from: connSecret
          parameterKey: patch.valueFrom.field
      properties:
        name: my-app
        patch:
          to:
            field: spec.containers[0].envFrom[0].secretRef.name
          valueFrom:
            apiVersion: database.example.org/v1alpha1
            kind: MySQLInstance
            name: my-db
```

### Case 4: GitOps rollout

In this case, users just want Vela to provide final k8s resources and push them to Git, and then integrate with ArgoCD/Flux to do the final rollout. Users will set up a GitOps workflow like below:

```yaml
workflow:
  steps:
    - type: gitops # This part configures how to push resources to Git repo
      properties:
        gitRepo: git-repo-url
        branch: branch
        credentials: ...
```

The process goes as follows:

- Every time an Application event is triggered, the GitOps workflow controller will push the rendered resources to a Git repo. This will trigger ArgoCD/Flux to do continuous deployment.

### Case 5: Template-based rollout

In this case, a template for the Application object has already been defined. Instead of writing the `spec.components`, users will reference the template and provide parameters/patches to it.

```yaml
workflow:
  steps:
    - type: helm-template
      stage: pre-render
      properties:
        source: git-repo-url
        path: chart/folder/path
        parameters:
          image: my-image
          replicas: 3
---
workflow:
  steps:
    - type: kustomize-patch
      stage: pre-render
      properties:
        source: git-repo-url
        path: base/folder/path
        patch:
          spec:
            components:
              - name: instance
                properties:
                  image: prod-image
```

The process goes as follows:

- On creating the application, the app controller will apply the HelmTemplate/KustomizePatch objects and wait for their status.
- The HelmTemplate/KustomizePatch controller reads the template from the specified source and renders the final config. It then compares the config with the Application object -- if there is a difference, it writes the config back to the Application object itself.
- The update of the Application triggers another event, and the app controller applies the HelmTemplate/KustomizePatch objects with the new context. This time, the HelmTemplate/KustomizePatch controller finds no diff after rendering, so it skips the write-back.

### Case 6: Conditional Check

In this case, users want to execute different steps based on the responseCode. When the `if` condition is not met, the step will be skipped.

```yaml
workflow:
  steps:
    - name: request
      type: webhook
    - name: handle-200
      type: deploy
      if: request.output.responseCode == 200
    - name: handle-400
      type: notification
      if: request.output.responseCode == 400
    - name: handle-500
      type: rollback
      if: request.output.responseCode == 500
```

If users want to execute a step no matter what, they can use `if: always` in the step. This way, the step will be executed whether the workflow is successful or not.

```yaml
workflow:
  steps:
    - type: deploy
      name: deploy-app
    - name: notificationA
      if: always
      type: notification
```

### Case 7: step group

In this case, the user runs multiple workflow steps in the `step-group` workflow type. The subSteps in a step group will be executed in DAG mode.
```yaml
workflow:
  steps:
    - type: step-group
      name: run-step-group1
      subSteps:
        - name: sub-step1
          type: ...
          ...
        - name: sub-step2
          type: ...
          ...

```

The process is as follows:

- When executing a `step-group` step, the subSteps in the step group are executed in DAG mode. A step group will only complete when all subSteps have been executed to completion.

## Considerations

### Comparison with Argo Workflow/Tekton

The workflow defined here is k8s resource based and is a very simple one-direction workflow. It's mainly used to customize Vela control logic to do more complex deployment operations.

While Argo Workflow/Tekton share a similar idea of providing workflow functionality, they are container based and provide more complex features like parameter sharing (using volumes and sidecars). More importantly, these projects couldn't satisfy our needs. Otherwise we could just use them in our implementation.

[1]: https://github.com/crossplane/oam-kubernetes-runtime/issues/222
[2]: https://cuetorials.com/overview/scope-and-visibility/#hidden-fields-and-values