sigs.k8s.io/cluster-api@v1.7.1/docs/proposals/20220712-cluster-api-addon-orchestration.md

     1  ---
     2  title: Cluster API Add-Ons Orchestration
     3  authors:
     4    - "@fabriziopandini"
     5    - "@jont828"
     6    - "@jackfrancis"
     7  reviewers:
     8    - "@CecileRobertMichon"
     9    - "@elmiko"
    10    - "@g-gaston"
    11    - "@enxebre"
    12    - "@dlipovetsky"
    13    - "@sbueringer"
    14    - "@fabriziopandini"
    16    - "@killianmuldoon"
    17  creation-date: 2022-07-12
    18  last-updated: 2022-09-29
    19  status: implementable
    20  see-also:
    21  replaces:
    22  superseded-by:
    23  ---
    24  
    25  # Cluster API Add-On Orchestration
    26  
    27  - Ref: https://github.com/kubernetes-sigs/cluster-api/issues/5491
    28  - Ref: https://docs.google.com/document/d/1TdbfXC2_Hhg0mH7-7hXcT1Gg8h6oXKrKbnJqbpFFvjw/edit?usp=sharing
    29  
    30  ## Table of Contents
    31  
    32  <!-- START doctoc generated TOC please keep comment here to allow auto update -->
    33  <!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
    34  
    35  - [Glossary](#glossary)
    36  - [Summary](#summary)
    37  - [Motivation](#motivation)
    38    - [Goals](#goals)
    39    - [Non goals](#non-goals)
    40    - [Future work](#future-work)
    41  - [Proposal](#proposal)
    42    - [User Stories](#user-stories)
    43      - [Story 1](#story-1)
    44      - [Story 2](#story-2)
    45      - [Story 3](#story-3)
    46      - [Story 4](#story-4)
    47      - [Story 5](#story-5)
    48    - [ClusterAddonProvider Functionality](#clusteraddonprovider-functionality)
    49    - [ClusterAddonProvider Design](#clusteraddonprovider-design)
    50      - [ClusterAddonProvider for Helm](#clusteraddonprovider-for-helm)
    51      - [HelmChartProxy](#helmchartproxy)
    52      - [HelmReleaseProxy](#helmreleaseproxy)
    53      - [Controller Design](#controller-design)
    54      - [Defining the list of Cluster add-ons to install](#defining-the-list-of-cluster-add-ons-to-install)
    55      - [Installing and lifecycle managing the package manager](#installing-and-lifecycle-managing-the-package-manager)
    56      - [Mapping HelmChartProxies to Helm charts](#mapping-helmchartproxies-to-helm-charts)
    57      - [Mapping Kubernetes version to add-ons version](#mapping-kubernetes-version-to-add-ons-version)
    58      - [Generating package configurations](#generating-package-configurations)
    59      - [Maintaining an inventory of add-ons installed in a Cluster](#maintaining-an-inventory-of-add-ons-installed-in-a-cluster)
    60      - [Surface Cluster add-ons status](#surface-cluster-add-ons-status)
    61      - [Upgrade strategy](#upgrade-strategy)
    62      - [Managing dependencies](#managing-dependencies)
    63    - [Example: Installing Calico CNI](#example-installing-calico-cni)
    64    - [Additional considerations](#additional-considerations)
    65      - [HelmChartProxy design](#helmchartproxy-design)
    66      - [Provider contracts](#provider-contracts)
    67  - [Cluster API enhancements](#cluster-api-enhancements)
    68    - [clusterctl](#clusterctl)
    69  - [Alternatives](#alternatives)
    70    - [Why not ClusterResourceSets?](#why-not-clusterresourcesets)
    71    - [Existing package management controllers](#existing-package-management-controllers)
    72  
    73  <!-- END doctoc generated TOC please keep comment here to allow auto update -->
    74  
    75  ## Glossary
    76  
    77  **Package management tool:** a tool ultimately responsible[^1] for installing Kubernetes applications, e.g. [Helm](https://helm.sh/) or [carvel-kapp](https://carvel.dev/kapp/).
    78  
    79  **Add-on:** an application that extends the functionality of Kubernetes.
    80  
    81  **Cluster add-on:** an add-on that is essential to the proper functioning of the Cluster, e.g. networking (CNI) or integration with the underlying infrastructure provider (CPI, CSI). In some cases the lifecycle of a Cluster add-on is strictly linked to the Cluster lifecycle (e.g. CPI must be upgraded in lock step with the Cluster upgrade).
    82  
    83  **Helm:** a CNCF graduated project that serves as a package manager widely used in the community. Note that Helm packages are called Helm charts.
    84  
    85  [^1]: Package management tools usually have a broader scope that goes beyond the act of installing a Kubernetes application, e.g. defining formats for packaging Kubernetes applications, defining repositories/marketplaces where users can find libraries of existing applications, templating systems for customizing applications, etc.
    86  
    87  ## Summary
    88  
    89  Cluster API is designed to manage the lifecycle of Kubernetes Clusters through declarative specs, and currently users rely on their own package management tool of choice for Kubernetes application management (e.g. Helm, kapp, etc.).
    90  
    91  This proposal aims to provide a solution for orchestrating Cluster add-ons around the Cluster lifecycle that is consistent with the Cluster API design philosophy by using a declarative spec.
    92  
    93  This solution will likely require at least three iterations:
    94  
    95  1. The foundation (described in this document) will aim to provide a minimal viable solution for this space: integration with a package management tool of choice (Helm in this case), and exploration of basic orchestration concerns, such as identifying a single source of truth for add-on configuration and a reconciliation mechanism to ensure the expected configuration is applied.
    96  
    97  2. Advanced orchestration will build on the foundation from the first iteration and will explore and add scenarios like configuring add-ons to be upgraded before the Cluster upgrade starts. This work will leverage the lifecycle hooks introduced in Cluster API v1.2.
    98  
    99  3. Add-on dependencies can be explored after the orchestration becomes more mature. This iteration can consider use cases where there are dependencies between add-ons and eventually introduce capabilities to account for that in add-on orchestration. For example, add-ons might need to be installed or upgraded in a certain order and certain versions of add-ons might be incompatible with each other. The scope of this work is TBD.
   100  
   101  ## Motivation
   102  
   103  ClusterResourceSets are the current solution for managing add-ons in Cluster API. However, ClusterResourceSets have been designed as a temporary measure until a better alternative is available. In particular, an add-on solution in Cluster API should be in line with the [goals of Cluster API](https://cluster-api.sigs.k8s.io/#goals) by aiming “to reuse and integrate existing ecosystem components rather than duplicating their functionality.” These components include other projects in SIG Cluster Lifecycle such as [Cluster Add-ons](https://github.com/kubernetes-sigs/cluster-addons) and well established tools like Helm, kapp, ArgoCD, and Flux. 
   104  
   105  This proposal does not intend to solve the issue of add-ons in all of Kubernetes. Rather, it aims to bring add-ons into the Cluster API cluster lifecycle by managing add-ons using a declarative spec and orchestrating existing package management tools.
   106  
   107  ### Goals
   108  
   109  - To design a solution for orchestrating Cluster add-ons.
   110  - To leverage existing package management tools such as Helm for all the foundational capabilities of add-on management, e.g. add-on packages/repositories, templating/configuration, add-on creation, upgrade, and deletion.
   111  - To make add-on management in Cluster API modular and pluggable, and to make it simple for developers to build a Cluster API Add-on Provider based on any package management tool, just like with infrastructure and bootstrap providers.
   112  
   113  ### Non goals
   114  
   115  - To implement a full fledged package management tool in Cluster API; there are already several awesome package management tools in the ecosystem, and CAPI should not reinvent the wheel. 
   116  - To provide a mechanism for altering, customizing, or dealing with the single Kubernetes resources defining a Cluster add-on, e.g. Deployments, Services, ServiceAccounts. Cluster API should treat add-ons as opaque components and delegate all the operations impacting add-on internals to the package management tool.
   117  - To expect users to use a specific package management tool.
   118  - To implement a solution for installing add-ons on the management cluster itself.
   119  
   120  ### Future work
   121  
   122  - Handle the upgrade of Cluster add-ons when the workload Cluster is upgraded to a new Kubernetes version.
   123  - Provide support for advanced orchestration use cases leveraging Cluster API’s lifecycle hooks.
   124  - Introduce capabilities to reconcile add-ons in an order that respects dependencies between add-ons.
   125  - Deprecate or remove the ClusterResourceSet experimental feature.
   126  
   127  ## Proposal
   128  
   129  This document introduces the concept of a ClusterAddonProvider, a new pluggable component to be deployed on the Management Cluster. A ClusterAddonProvider is implemented on top of a specific package management tool and acts as an intermediary/proxy between Cluster API and the chosen package management solution.
   130  
   131  This proposal also introduces a ClusterAddonProvider for [Helm](https://helm.sh/) as a concrete example of how a ClusterAddonProvider would work with a package management tool. Users can configure ClusterAddonProvider for Helm through a new Kubernetes CRD called HelmChartProxy to be deployed on the Management Cluster. HelmChartProxy will act as an intermediary (proxy) between Cluster API and Helm to install packages onto workload clusters. The implementation of ClusterAddonProvider for Helm can be used as a reference to implement ClusterAddonProviders for other package managers, e.g. a ClusterAddonProvider for kapp.
   132  
   133  ### User Stories
   134  
   135  #### Story 1
   136  
   137  As a developer or Cluster operator, I would like Cluster add-on providers to automatically include or install the package management tool.
   138  
   139  #### Story 2
   140  
   141  As a developer or Cluster operator, I would like Cluster add-on providers to expose configurations native to the package management tool to install Cluster add-ons into the newly created Workload Cluster.
   142  
   143  #### Story 3
   144  
   145  As a developer or Cluster operator, I would like Cluster add-on providers to leverage existing capabilities of the package management tool, such as identifying the proper version of a Cluster add-on package to install onto the workload Cluster.
   146  
   147  #### Story 4
   148  
   149  As a developer or Cluster operator, I would like Cluster add-on providers to use information available in the Management Cluster (e.g. service or pod CIDR on the Cluster object) for customizing add-ons configurations specifically for the newly created workload Cluster.
   150  
   151  #### Story 5
   152  
   153  As a developer or Cluster operator, I would like Cluster add-on providers to update add-on configurations when information available in the Management Cluster (e.g. add-on configuration) changes.
   154  
   155  ### ClusterAddonProvider Functionality
   156  
   157  A ClusterAddonProvider is a new component for bringing add-ons into the Cluster API lifecycle. It serves as a general solution for Cluster API add-ons with multiple concrete implementations, similar to the relationship between Cluster API and infrastructure providers. It aims to handle add-on lifecycle management in a holistic way by providing the following capabilities:
   158  
   159  - Provide or install a package management tool
   160  - Determine which add-ons to install or uninstall on each Cluster
   161  - Provide a default configuration for each add-on
   162  - Configure an add-on by resolving or inheriting information from the Cluster configuration (e.g. service or pod CIDR for CNI)
   163  - Install add-ons during Cluster creation (immediately after the API server is available). This operation usually brings in some additional requirements such as: 
   164    - Resolve the add-on version for the Cluster’s Kubernetes version
   165    - Wait for the add-on to be actually running
   166  - Provide a mechanism for upgrading an add-on before, during, or after the Cluster upgrade and cleaning up the previous version
   167  - Support out-of-band upgrades for add-ons not linked to the Cluster lifecycle, e.g. a CVE fix
   168  - Configure pre- and post-installation steps for each add-on, allowing management of migrations/operations such as:
   169    - CPI migration from in-tree to out-of-tree
   170    - CoreDNS configuration migration from one version to another
   171  - Orchestrate add-on deletion before Cluster deletion
   172  
   173  This proposal addresses a subset of these problems as part of a first iteration by designing a way for Cluster API to orchestrate an external package management tool for add-on management. An external package management tool is expected to provide all the foundational capabilities for add-on management, like add-on packages and repositories, templating/configuration, and management of the creation, upgrade, and deletion workflows.
   174  
   175  ### ClusterAddonProvider Design
   176  
   177  When discussing the design of a ClusterAddonProvider, ClusterAddonProvider for Helm will be used as an example of a concrete implementation.
   178  
   179  #### ClusterAddonProvider for Helm
   180  
   181  ClusterAddonProvider for Helm proposes two new CRDs named HelmChartProxy and HelmReleaseProxy. HelmChartProxy is used to specify a Helm chart with configuration values and a selector identifying the Clusters the add-on applies to. HelmReleaseProxy contains information about the underlying Helm release installed on a cluster from a Helm chart. All configuration is done through HelmChartProxy; HelmReleaseProxy is simply used to maintain an inventory and surface information. Note that the specific fields in both CRDs are subject to change as the project evolves.
   182  
   183  This is a link to the prototype for [ClusterAddonProvider for Helm](https://github.com/Jont828/cluster-api-addon-provider-helm), and a demo can be found in this [Cluster API Office Hours recording](https://www.youtube.com/watch?v=dipR1Tzh4jE). The prototype implements HelmChartProxy and HelmReleaseProxy as well as controllers for both CRDs.
   184  
   185  #### HelmChartProxy
   186  
   187  HelmChartProxy is a namespaced CRD that serves as the user interface for defining and configuring Helm charts as well as selecting which workload Clusters they will be installed on.
   188  
   189  ```go
   190  // HelmChartProxySpec defines the desired state of HelmChartProxy.
   191  type HelmChartProxySpec struct {
   192    // ClusterSelector selects Clusters in the same namespace with a label that matches the specified label selector. The Helm 
   193    // chart will be installed on all selected Clusters. If a Cluster is no longer selected, the Helm release will be uninstalled.
   194    ClusterSelector metav1.LabelSelector `json:"clusterSelector"`
   195  
   196    // ChartName is the name of the Helm chart in the repository.
   197    ChartName string `json:"chartName"`
   198  
   199    // RepoURL is the URL of the Helm chart repository.
   200    RepoURL string `json:"repoURL"`
   201  
   202    // ReleaseName is the release name of the installed Helm chart. If it is not specified, a
   203    // name will be generated.
   204    // +optional
   205    ReleaseName string `json:"releaseName,omitempty"`
   206  
   207    // ReleaseNamespace is the namespace the Helm release will be installed on each selected
   208    // Cluster. If it is not specified, it will be set to the default namespace.
   209    // +optional
   210    ReleaseNamespace string `json:"namespace,omitempty"`
   211  
   212    // Version is the version of the Helm chart. If it is not specified, the latest
   213    // version will be used and kept up to date.
   214    // +optional
   215    Version string `json:"version,omitempty"`
   216  
   217    // ValuesTemplate is an inline YAML representing the values for the Helm chart. This YAML supports Go templating to reference
   218    // fields from each selected workload Cluster and programmatically create and set values.
   219    // +optional
   220    ValuesTemplate string `json:"valuesTemplate,omitempty"`
   221  }
   222  
   223  // HelmChartProxyStatus defines the observed state of HelmChartProxy.
   224  type HelmChartProxyStatus struct {
   225    // Conditions defines current state of the HelmChartProxy.
   226    // +optional
   227    Conditions clusterv1.Conditions `json:"conditions,omitempty"`
   228  
   229    // MatchingClusters is the list of references to Clusters selected by the ClusterSelector.
   230    // +optional
   231    MatchingClusters []corev1.ObjectReference `json:"matchingClusters"`
   232  }
   233  ```
   234  
   235  #### HelmReleaseProxy
   236  
   237  HelmReleaseProxy is a namespaced resource representing a single Helm release installed on a selected workload Cluster. HelmReleaseProxies are created and managed by the HelmChartProxy controller. The parent HelmChartProxy is responsible for keeping its HelmReleaseProxy children up to date throughout its lifecycle (for example, if the `values` properties of a HelmChartProxy are updated, its child HelmReleaseProxy resources will be updated as well).
   238  
   239  ```go
   240  // HelmReleaseProxySpec defines the desired state of HelmReleaseProxy.
   241  type HelmReleaseProxySpec struct {
   242    // ClusterRef is a reference to the Cluster to install the Helm release on.
   243    ClusterRef corev1.ObjectReference `json:"clusterRef"`
   244  
   245    // ChartName is the name of the Helm chart in the repository.
   246    ChartName string `json:"chartName"`
   247  
   248    // RepoURL is the URL of the Helm chart repository.
   249    RepoURL string `json:"repoURL"`
   250  
   251    // ReleaseName is the release name of the installed Helm chart. If it is not specified, a
   252    // name will be generated.
   253    // +optional
   254    ReleaseName string `json:"releaseName,omitempty"`
   255  
   256    // ReleaseNamespace is the namespace the Helm release will be installed on the referenced 
   257    // Cluster. If it is not specified, it will be set to the default namespace.
   258    // +optional
   259    ReleaseNamespace string `json:"namespace,omitempty"`
   260  
   261    // Version is the version of the Helm chart. If it is not specified, the latest
   262    // version will be used and kept up to date.
   263    // +optional
   264    Version string `json:"version,omitempty"`
   265  
   266    // Values is an inline YAML representing the values for the Helm chart. This YAML is the result of the rendered
   267    // Go templating with the values from the referenced workload Cluster.
   268    // +optional
   269    Values string `json:"values,omitempty"`
   270  }
   271  
   272  // HelmReleaseProxyStatus defines the observed state of HelmReleaseProxy.
   273  type HelmReleaseProxyStatus struct {
   274    // Conditions defines current state of the HelmReleaseProxy.
   275    // +optional
   276    Conditions clusterv1.Conditions `json:"conditions,omitempty"`
   277  
   278    // Status is the current status of the Helm release.
   279    // +optional
   280    Status string `json:"status,omitempty"`
   281  
   282    // Revision is the current revision of the Helm release.
   283    // +optional
   284    Revision int `json:"revision,omitempty"`
   285  }
   286  ```
   287  
   288  #### Controller Design
   289  
   290  The HelmChartProxy controller is responsible for maintaining the list of HelmReleaseProxies and resolving any referenced Cluster fields. The controller will:
   291  
   292  - Find all workload clusters matching the cluster selector within the same namespace.
   293  - Resolve the referenced Cluster fields in the values map based on each selected cluster.
   294  - Create a HelmReleaseProxy with the resolved values for each selected workload cluster, if one does not exist.
   295  - Update the HelmReleaseProxy for each selected workload cluster if it exists and the version or resolved values have changed.
   296  - Delete the HelmReleaseProxy for each selected workload cluster if the cluster no longer matches the cluster selector.
   297  
   298  The HelmReleaseProxy controller is responsible for creating, updating, and deleting the Helm release installed on the workload cluster. The controller will:
   299  
   300  - Install a Helm release on the workload cluster if it does not exist.
   301  - Update the existing Helm release if the HelmReleaseProxy values or version do not match the existing Helm release.
   302  - Delete the Helm release if the HelmReleaseProxy is deleted.
   303  
   304  This design of HelmReleaseProxy is especially convenient for the controller because a `Create()` maps directly to a `helm install` operation, an `Update()` maps directly to a `helm upgrade` operation, and a `Delete()` maps directly to a `helm uninstall` operation.
   305  
   306  Note that the HelmReleaseProxy controller contains a static configuration and is only responsible for ensuring there is a Helm release matching the HelmReleaseProxy. The HelmReleaseProxy gets created, updated, and deleted by the HelmChartProxy controller.
   307  
   308  #### Defining the list of Cluster add-ons to install
   309  
   310  HelmChartProxy allows users to define which add-ons to install and which Clusters to install them on using the `clusterSelector`. The specified Helm chart is then installed on every Cluster with a label matching the cluster selector, making that Cluster part of the orchestration. If a selected Cluster no longer matches the cluster selector label, the Helm release will be uninstalled.
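
For illustration, a minimal pairing of a HelmChartProxy selector with a matching Cluster might look like this (the `nginxIngress` label, names, and chart choice are hypothetical):

```yaml
apiVersion: addons.cluster.x-k8s.io/v1alpha1
kind: HelmChartProxy
metadata:
  name: nginx-ingress
spec:
  clusterSelector:
    matchLabels:
      nginxIngress: enabled
  repoURL: https://helm.nginx.com/stable
  chartName: nginx-ingress
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: my-cluster
  labels:
    nginxIngress: enabled   # matches the selector above, so the chart is installed
```

Removing the `nginxIngress: enabled` label from the Cluster would cause the release to be uninstalled.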
   311  
   312  #### Installing and lifecycle managing the package manager
   313  
   314  ClusterAddonProvider for Helm imports Helm v3 as a library, making this a no-op. However, it is worth noting that each version of Helm only supports certain versions of Kubernetes. For example, the [Helm docs](https://helm.sh/docs/topics/version_skew/#supported-version-skew) mention that Helm 3.9 supports Kubernetes 1.21 through 1.24. In this case, we will document the version support matrix for ClusterAddonProvider for Helm if it differs from the CAPI support matrix.
   315  
   316  Additionally, other package managers like kapp require in-cluster components such as the kapp-controller. In this case, ClusterAddonProvider should include those in-cluster components as well.
   317  
   318  #### Mapping HelmChartProxies to Helm charts
   319  
   320  A Helm chart is uniquely identified using the URL of its repository, the name of the chart, and the chart version. As a result, each HelmChartProxy uses the `repoURL`, `chartName`, and `version` fields to uniquely identify a Helm chart and determine which version to install. This allows HelmChartProxy to be compatible with any valid Helm repository and chart.
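
For example, the following (illustrative) fields pin a specific cert-manager chart, roughly corresponding to `helm install --repo https://charts.jetstack.io <release> cert-manager --version v1.12.0`:

```yaml
spec:
  repoURL: https://charts.jetstack.io   # identifies the repository
  chartName: cert-manager               # identifies the chart within the repository
  version: v1.12.0                      # identifies the chart version (example value)
```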
   321  
   322  #### Mapping Kubernetes version to add-ons version
   323  
   324  By definition, the Cluster add-on lifecycle is strictly linked to the Cluster lifecycle. As a result, there is a strict correlation between the Cluster’s Kubernetes version and the add-on version to be installed in a Cluster.
   325  
   326  Some add-ons have an explicit version correlation such as Cloud Provider Interface following the “Versioning Policy for External Cloud Providers.” Other add-ons like Calico define a case-by-case compatibility matrix in their [docs](https://projectcalico.docs.tigera.io/getting-started/kubernetes/requirements#supported-versions).
   327  
   328  On top of that, in many Cluster API installations, the add-on version an operator would like to use is limited to a list of certified versions defined by each organization/product vendor.
   329  
   330  We can rely upon existing functionality and capabilities in Helm to enforce Kubernetes version-specific requirements in a chart, including whether or not to install anything at all. It's worth noting that other add-on tools might not provide this functionality out of the box.
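
For instance, a chart can declare the Kubernetes versions it supports via the `kubeVersion` field in its `Chart.yaml`, and Helm will refuse to install the chart on a cluster outside that range (chart name and version range shown are illustrative):

```yaml
# Chart.yaml of a hypothetical add-on chart
apiVersion: v2
name: my-addon
version: 0.1.0
kubeVersion: ">= 1.21.0 < 1.25.0"   # Helm enforces this constraint at install time
```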
   331  
   332  #### Generating package configurations 
   333  
   334  Each package manager has its own specific configuration format. Helm charts, for example, can be configured using the `values.yaml` file where each chart can define its own configurable fields in YAML format.
   335  
   336  For ClusterAddonProvider for Helm, the solution is relatively simple. HelmChartProxy contains a `valuesTemplate` field which allows the contents of a `values.yaml` file to be passed in as an inline YAML string.
   337  
   338  Additionally, the `valuesTemplate` field functions as a Go template. This allows `valuesTemplate` to reference fields from the Cluster definition, e.g. cluster name, namespace, control plane ref, or pod CIDRs. These template fields will be resolved dynamically based on the actual values for each workload cluster selected by HelmChartProxy’s cluster selector.
   339  
   340  In a future iteration we might consider leveraging template variables in ClusterClass. If a HelmChartProxy is selecting clusters with a managed topology, then the Go template configuration can benefit from `Cluster.spec.topology.variables`, and from the strong typing/validation ensured by the variable schema defined in the corresponding ClusterClass.
   341  
   342  #### Maintaining an inventory of add-ons installed in a Cluster
   343  
   344  Maintaining an inventory of installed add-ons is useful for a number of reasons. The most straightforward is letting users see at a glance which add-ons are installed on a cluster. The inventory is also useful for add-on orchestration: it allows the controller to distinguish between an add-on installed through the orchestrator and an add-on installed out of band. This means that on add-on deletion, the controller will only uninstall add-ons it is responsible for and won’t remove existing add-ons.
   345  
   346  The simplest approach is to let the package manager maintain the list. For example, `helm list` lists all the releases on a cluster, which already amounts to an inventory of add-ons. However, it offers no way to distinguish between out-of-band add-ons and add-ons installed through the orchestrator. Additionally, any ClusterAddonProvider would need to rely on its package manager having a suitable implementation for listing packages on a cluster.
   347  
   348  The solution implemented in the ClusterAddonProvider for Helm is to use HelmReleaseProxy to maintain an inventory. Each HelmReleaseProxy corresponds to a Helm release installed through HelmChartProxy. The controller only orchestrates Helm releases with a corresponding HelmReleaseProxy, ensuring that out-of-band Helm releases won’t be affected.
   349  
   350  Additionally, each HelmReleaseProxy includes a `clusterv1.ClusterNameLabel` and an additional label indicating the HelmChartProxy it corresponds to. This provides a way to query for all Helm releases installed on a Cluster and to query for all Helm releases belonging to a HelmChartProxy.
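
A HelmReleaseProxy's labels might then look like the following sketch. The `cluster.x-k8s.io/cluster-name` key is `clusterv1.ClusterNameLabel`; the HelmChartProxy label key and the generated name are hypothetical:

```yaml
apiVersion: addons.cluster.x-k8s.io/v1alpha1
kind: HelmReleaseProxy
metadata:
  name: calico-cni-my-cluster                          # hypothetical generated name
  labels:
    cluster.x-k8s.io/cluster-name: my-cluster          # clusterv1.ClusterNameLabel
    helmchartproxy.addons.cluster.x-k8s.io/name: calico-cni  # hypothetical key for the owning HelmChartProxy
```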
   351  
   352  #### Surface Cluster add-ons status
   353  
   354  ClusterAddonProviders are responsible for installing add-ons on an arbitrary number of workload clusters. As such, it would be beneficial to see the status of an add-on from the management cluster and avoid having to fetch the kubeconfig of the workload cluster.
   355  
   356  In ClusterAddonProvider for Helm, the status of a Helm release can be conveyed through the HelmReleaseProxy status fields like `status` and `revision`. Additional information can be surfaced in the `conditions` field.
   357  
   358  #### Upgrade strategy
   359  
   360  Users will be able to upgrade add-ons across multiple Clusters at once by updating the `version` field in the HelmChartProxy to trigger a Helm upgrade for all the Helm releases it manages. Additionally, HelmChartProxy allows users to upgrade their add-ons in a declarative manner.
   361  
   362  Note that hooks and automatic triggers for add-on upgrades when a Cluster is upgraded to a new Kubernetes version are out of scope for this iteration. Users will need to determine if an add-on version upgrade is needed, and if so, set a compatible version in the HelmChartProxy.
   363  
   364  In future iterations, we plan to handle the upgrade of Cluster add-ons when the workload Cluster is upgraded to a new Kubernetes version by leveraging the new runtime hooks feature.
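
For example, bumping the `version` field of an existing HelmChartProxy is all that is needed to roll the upgrade out declaratively (repository and chart taken from the Calico example in this document; version numbers are illustrative):

```yaml
spec:
  repoURL: https://projectcalico.docs.tigera.io/charts
  chartName: tigera-operator
  version: v3.26.1   # changing this, e.g. from v3.25.0, triggers a Helm upgrade on every matching Cluster
```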
   365  
   366  #### Managing dependencies
   367  
   368  Managing dependencies between HelmChartProxies is out of scope for this project. However, Helm charts define their dependencies in the [Chart.yaml file](https://helm.sh/docs/topics/charts/#the-chartyaml-file) and will install them automatically. If a Helm chart has a critical dependency on another chart, it is the chart owner's responsibility to enable it by default.
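
As a reminder of Helm's native format, a chart declares its dependencies like this (the dependent chart name and repository are hypothetical):

```yaml
# Chart.yaml dependency declaration
dependencies:
  - name: common-library                  # hypothetical dependent chart
    version: "1.x.x"
    repository: https://example.com/charts
    condition: common-library.enabled     # the chart owner can default this to true
```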
   369  
   370  ### Example: Installing Calico CNI
   371  
   372  Developers and Cluster operators often want to install a CNI on their workload clusters. The HelmChartProxy controller can be used to install the Calico CNI on each selected workload cluster. In this example, the value configuration can be leveraged to initialize the `installation.calicoNetwork.ipPools` array by creating an element for every pod CIDR in the Cluster.
   373  
   374  ```yaml
   375  apiVersion: addons.cluster.x-k8s.io/v1alpha1
   376  kind: HelmChartProxy
   377  metadata:
   378    name: calico-cni
   379  spec:
   380    clusterSelector:
   381      matchLabels:
   382        calicoCNI: enabled
   383    releaseName: calico
   384    repoURL: https://projectcalico.docs.tigera.io/charts
   385    chartName: tigera-operator
   386    valuesTemplate: |
   387      installation:
   388        cni:
   389          type: Calico
   390          ipam:
   391            type: HostLocal
   392        calicoNetwork:
   393          bgp: Disabled
   394          mtu: 1350
   395          ipPools:{{range $i, $cidr := .Cluster.Spec.ClusterNetwork.Pods.CIDRBlocks }}
   396          - cidr: {{ $cidr }}
   397            encapsulation: None
   398            natOutgoing: Enabled
   399            nodeSelector: all(){{end}}
   400  ```
   401  
   402  ### Additional considerations
   403  
   404  #### HelmChartProxy design
   405  
   406  The scope of the cluster values accessible using Go templating must be defined along with use cases. An option would be to expose the Builtin struct type used in ClusterClass or construct a similar type. Additionally, while resources like the control plane and cluster have a 1:1 correlation with each selected cluster, resources like MachineDeployments or Machines can exist in any number and can’t be selected in the same way. As a result, additional work is needed to decide whether there’s a use case for including information from those types.
   407  
   408  There has also been discussion about whether HelmChartProxy and HelmReleaseProxy should be cluster-scoped or namespaced. This would impact whether an instance of HelmChartProxy would select Clusters across all namespaces or only Clusters within the same namespace. The CRDs are namespaced for the sake of simplicity as well as consistency with all the other Cluster API CRDs, but it is worth noting that this can be inconvenient if a user has several namespaces and wants to install the same Helm chart on all of them.

#### Provider contracts

As it stands, there is no formalized provider contract that a ClusterAddonProvider must implement. That being said, the design of ClusterAddonProvider for Helm does suggest that certain patterns and components can be reused between different ClusterAddonProviders. These patterns include:

- Designing a CRD like HelmChartProxy to specify an add-on to install on certain Clusters.
- Using a label selector of some kind to install an add-on on a selected workload Cluster and to uninstall an add-on from a Cluster that is no longer selected.
- Using Go templates in a field like `valuesTemplate` to dynamically resolve configurations based on each Cluster's definition.
- Designing a CRD like HelmReleaseProxy to:
  - Maintain an inventory of add-ons and differentiate between managed add-ons and out-of-band add-ons.
  - Provide a modular design such that the HelmChartProxy controller does not need to implement any Helm operations and only makes Kubernetes client calls to create, update, or delete a HelmReleaseProxy.
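
To illustrate the inventory role of the second CRD, a HelmReleaseProxy instance that the HelmChartProxy controller might create for one selected Cluster could look like the following; the field and object names are illustrative, derived from the HelmChartProxy example above, not a definitive API:

```yaml
apiVersion: addons.cluster.x-k8s.io/v1alpha1
kind: HelmReleaseProxy
metadata:
  name: calico-cni-my-cluster   # illustrative generated name
  namespace: default
spec:
  clusterRef:
    name: my-cluster            # the selected workload Cluster
  repoURL: https://projectcalico.docs.tigera.io/charts
  chartName: tigera-operator
  releaseName: calico
  # values holds the result of rendering valuesTemplate against this Cluster
  values: |
    installation:
      cni:
        type: Calico
```

One HelmReleaseProxy per (HelmChartProxy, Cluster) pair is what lets the controller track which releases it manages and reconcile or delete them individually.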

## Cluster API enhancements

Despite the fact that most of the heavy lifting for add-on management is delegated to external ClusterAddonProvider components, over time we would like to provide Cluster API's users an integrated and seamless user experience for add-on orchestration.

### clusterctl

ClusterAddonProvider is a new component to be installed in a Management Cluster. To make this operation simpler, we can extend clusterctl to manage ClusterAddonProvider implementations in `clusterctl init`, `clusterctl upgrade`, etc.

Note: clusterctl assumes all providers are linked to a CAPI contract, while it is not clear at this stage whether the same applies to ClusterAddonProvider. This should be further investigated.

## Alternatives

### Why not ClusterResourceSets?

This proposal introduces ClusterAddonProviders, as opposed to overhauling ClusterResourceSets, for the following reasons:

- ClusterResourceSets do not follow the Cluster API convention of using a declarative spec and having controllers create, delete, or update resources to build the desired state. Even though there has been work to add a Reconcile or an "apply always" mode to ClusterResourceSets, it is [explicitly stated](https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/20200220-cluster-resource-set.md#non-goalsfuture-work) that deleting resources from Clusters is a non-goal.
- Users are already using package management tools, and being able to use their preferred package management tool reduces user friction towards adopting an add-on solution.
- ClusterResourceSets have far fewer expert users than existing package management tools like Helm.
- ClusterResourceSets fail to take advantage of functionality in existing package management tools, like dependency installation, and evolving ClusterResourceSets would require re-inventing the functionality of those tools.
- ClusterResourceSets do not orchestrate deletion of add-ons and were not designed with that in mind. Orchestrating deletion of add-ons would require a significant redesign of the code and interface, and doing so would still not address the issues above compared to designing a new solution with all of this in mind.

### Existing package management controllers

It is worth noting that other projects have integrated a package management tool with a Kubernetes controller. For example, Flux has developed a [Helm controller](https://github.com/fluxcd/helm-controller) as part of its GitOps toolkit for syncing Clusters with sources of configuration. It defines a `HelmRelease` CRD and offers [integration with Cluster API](https://fluxcd.io/flux/components/helm/helmreleases/#remote-clusters--cluster-api) where users can create a `HelmRelease` to install an add-on on a specified Cluster.
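
For comparison, a Flux `HelmRelease` targets a Cluster API workload cluster by pointing `spec.kubeConfig` at the kubeconfig Secret that Cluster API creates for the Cluster. A rough sketch, assuming a `HelmRepository` named `projectcalico` already exists and using an illustrative cluster name:

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: calico
  namespace: default
spec:
  interval: 5m
  kubeConfig:
    # Cluster API stores the workload cluster kubeconfig in a Secret
    # named "<cluster-name>-kubeconfig" in the Cluster's namespace.
    secretRef:
      name: my-cluster-kubeconfig   # illustrative cluster name
  chart:
    spec:
      chart: tigera-operator
      sourceRef:
        kind: HelmRepository
        name: projectcalico          # assumed pre-existing HelmRepository
```

Note that, unlike HelmChartProxy, a `HelmRelease` addresses a single Cluster rather than selecting a set of Clusters by label.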

Given that this proposal aims to explore the topic of orchestration between Cluster API and add-on tools, we still believe that using a simpler, homegrown version of the HelmRelease CRD and avoiding dependencies on other projects allows us to iterate faster while we are still figuring out the best orchestration patterns between the two components.

However, we encourage users to explore both solutions and provide feedback, so we can better figure out next steps.