sigs.k8s.io/cluster-api-provider-azure@v1.14.3/docs/proposals/20220825-azuremanaged-cluster-exp-graduation.md (about)

     1  ---
     2  title: AzureManagedCluster graduation from experimental
     3  authors:
     4    - "@jackfrancis"
     5  reviewers:
     6    - @CecileRobertMichon
     7    - @zmalik
     8    - @NovemberZulu
     9    - @mtougeron
    10    - @nojnhuh
    11  creation-date: 2022-08-25
    12  last-updated: 2022-08-25
    13  status: implementable
    14  see-also:
    15  - https://github.com/kubernetes-sigs/cluster-api-provider-azure/issues/2204
    16  - https://github.com/kubernetes-sigs/cluster-api/pull/6988
    17  - https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/2739
    18  ---
    19  
    20  
    21  # AzureManagedCluster graduation from experimental
    22  
    23  ## Summary
    24  
    25  `AzureManagedCluster` and its corresponding set of CRDs (we will refer to these CRDs as simply "`AzureManagedCluster`" in this document) is a CAPZ-native implementation of Azure Managed Kubernetes (AKS). Because there is no standard set of Cluster API resource definitions for a "Managed Kubernetes cluster", it is left up to the provider to reuse the existing Cluster API specification (for example, the `Cluster` and its to-be-implemented-by-provider properties such as `ControlPlaneEndpoint`, `ControlPlaneRef` and `InfrastructureRef`). As a result, CAPZ implemented "`AzureManagedCluster`" with an API contract designation of "experimental", to allow for rapid prototyping and discovery.
    26  
    27  With the recent adoption of "`AzureManagedCluster`" by the CAPZ community for practical, real-world use, we want to identify the set of outstanding items that may prevent graduation from experimental, and address each one of them, so that future adoption can be unlocked, and users can confidently build resilient systems on top of a stable API.
    28  
    29  ## Issue Tracking
    30  
    31  Issues that emerge as a result of this effort will be tracked here:
    32  
    33  - https://github.com/orgs/kubernetes-sigs/projects/26
    34  
    35  ## Motivation
    36  
    37  ### Goals
    38  - Agree upon a durable, post-experimental specification of the set of CRDs that implement Managed Kubernetes on Azure, e.g., `AzureManagedCluster`, `AzureManagedControlPlane`, `AzureManagedMachinePool`
    39  - Prioritize timeliness of a post-experimental definition so that users can confidently plan for any breaking changes that result from a graduated spec as soon as possible
    40  - Prioritize "architectural affinity" with other Cluster API providers implementing Managed Kubernetes
    41  
    42  ### Non-Goals / Future Work
    43  - Add to, or remove features from, "`AzureManagedCluster`"
    44  - Standardize any opinions about how best to use AKS
    45  - Promote CAPZ-specific opinions about Managed Kubernetes across the Cluster API provider ecosystem
    46  - Define operational support
    47  
    48  ## Post-graduation Prerequisites
    49  
    50  Concrete prereq workstreams are [tracked here](https://github.com/orgs/kubernetes-sigs/projects/26/views/1)
    51  
    52  ### 1. Land Managed Cluster in Cluster API Proposal (Status: DONE)
    53  
    54  See:
    55  
    56  - https://github.com/kubernetes-sigs/cluster-api/pull/6988
    57  
    58  The above proposal defines a set of recommendations for Cluster API providers implementing Managed Kubernetes solutions. Because this proposal is an opt-in collection of architectural recommendations, it is not required that "`AzureManagedCluster`" strictly agrees with everything therein. However, by contributing to the successful landing of that proposal in Cluster API, we can best ensure that any CAPZ-specific opinions or learnings are reflected, for the benefit of the larger ecosystem, and for the maximum happiness of AKS customers in particular.
    59  
    60  Ref:
    61  
    62  - https://github.com/kubernetes-sigs/cluster-api/pull/6988
    63  
    64  ### 2. Consider Cluster API Proposal Recommendations (Status: Finalizing Consensus)
    65  
    66  Once the proposal is accepted and merged into the Cluster API project as an endorsed set of provider recommendations, CAPZ can audit the existing "`AzureManagedCluster`" experimental implementation for areas of disagreement with said recommendations. For each discovery of disagreement, there should be a defensible reason to matriculate a divergent CAPZ Managed Kubernetes implementation into a post-experimental definition of "`AzureManagedCluster`", and ideally some form of sign-off from other provider maintainers. Where community consensus cannot be reached, CAPZ should strongly consider evolving the experimental "`AzureManagedCluster`" implementation to meet the Cluster API Managed Kubernetes recommendation as a pre-requisite to graduating from experimental.
    67  
    68  Ref:
    69  
    70  - https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/2739
    71  - https://github.com/kubernetes-sigs/cluster-api/issues/7494
    72  
    73  
    74  ### 3. Pathway Towards Full AKS Feature Support (Status: DONE)
    75  
    76  "`AzureManagedCluster`" should be able to accommodate the entire feature set of AKS. Rather than require the concrete list of features of AKS to be fully implemented as a prerequisite to graduation, we should instead audit the AKS feature matrix against the "`AzureManagedCluster`" architectural surface area and ensure that CAPZ is well prepared to continually integrate existing and new AKS features into "`AzureManagedCluster`" along a non-breaking path forward.
    77  
    78  ### 4. Affinity between CAPZ / Cluster API resource lifecycle enforcement and AKS lifecycle enforcement. (Status: Adding E2E Tests)
    79  
    80  Not unrelated to the above audit of AKS features, we should also overlay the Cluster API lifecycle combinatorics (simply speaking: Create, Read, Update, Delete) against the set of AKS cluster primitives to ensure that sane interfaces exist in CAPZ in order to effectively enforce (for example, add an AKS node pool), or passively delegate authority (for example, defer to the built-in AKS autoscaler when enabled to enforce `MachinePool` replica count), if appropriate.
    81  
    82  ### 5. Azure API Request Optimizations (Status: DONE)
    83  
    84  Prior to graduation from experimental, we should ensure that the sufficient set of configurable interfaces exist to allow "`AzureManagedCluster`" users to effectively tune CAPZ so that no unnecessary Azure API requests are introduced into their AKS environment. Essentially: CAPZ + AKS offers an opportunity for more AKS API requests from at least two dimensions:
    85  
    86  - the implementation details of eventual consistency introduce net new "let me check if the current state matches my goal state" GET calls that aren't there if you aren't running an eventual consistency controller on top of your AKS environment
    87  - if we do our jobs correctly CAPZ will optimize for multi-cluster stories, which may yield customer scenarios where more clusters are running on single subscriptions, which is at least _n_x (where n is the number of clusters) calls to the AKS API
    88  
    89  We should ship sane, optimized defaults, but allow for flexible overrides to anticipate a wide variety of operational use cases.
    90  
    91  Ref:
    92  
    93  - https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/2688
    94  
    95  ### 6. Audit Webhooks (Status: DONE)
    96  
    97  Ensure that the CAPZ webhooks that enforce input validation and other configuration requirements are complete for each feature, and match the authoritative AKS API requirements.
    98  
    99  Ref:
   100  
   101  - https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/2626
   102  - https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/2717
   103  - https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/2741
   104  - https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/2795
   105  
   106  ### 7. E2E Tests (Status: In Progress)
   107  
   108  For every supported AKS feature, and for every supported lifecycle mutation of said feature, CAPZ should have thorough, regular E2E test coverage.
   109  
   110  ### 8. Documentation
   111  
   112  The current `AzureManagedCluster` is ad hoc, and in a cupboard hidden away in the docs structure. We should make this definitive, and in a more easily discoverable place.
   113  
   114  Ref:
   115  
   116  - https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/2776
   117  
   118  ## Open Questions (Status: DONE)
   119  
   120  ### Dependency upon (currently experimental) MachinePool spec (Status: DONE, won't take dependency)
   121  
   122  At present we reuse the Cluster API `MachinePool` specification (as `AzureManagedMachinePool`) to implement AKS node pools running on Azure VMSS. A consideration here is that `MachinePool` is considered experimental, and behind a feature flag, by Cluster API. Do we want to add the graduation of `MachinePool` out of experimental as a prerequisite for graduating "`AzureManagedCluster`" out of experimental?
   123  
   124  We are tracking a concrete Cluster API implementation of `MachinePoolMachine` [here](https://github.com/kubernetes-sigs/cluster-api/pull/6089).
   125  
   126  The CAPZ community has declared an intention to graduate `AzureManagedCluster` independently from `MachinePool`. In practice, momentum suggests `AzureManagedCluster` will graduate prior to `MachinePool`. Based on conversations with folks who have worked on `MachinePool`, we don't expect any API changes; in the unlikely event that API changes to `MachinePool` emerge prior to its graduation, we the `AzureManagedCluster` solution has its own abstraction (`AzureManagedMachinePool` to protect its users from those changes.