sigs.k8s.io/cluster-api@v1.7.1/docs/staging-use-cases.md


---
title: Cluster API Reference Use Cases
creation-date: 2019-04-16
last-updated: 2019-04-16
---

- [Cluster API Reference Use Cases](#cluster-api-reference-use-cases)
  - [Role Glossary](#role-glossary)
  - [Icon Glossary](#icon-glossary)
  - [Operator of Workload Cluster](#operator-of-workload-cluster)
    - [Creating Clusters](#creating-clusters)
    - [Staged Adoption of Cluster API By Operators](#staged-adoption-of-cluster-api-by-operators)
    - [Deleting Clusters](#deleting-clusters)
    - [Scaling](#scaling)
    - [Configuration Updates](#configuration-updates)
    - [Security](#security)
    - [Upgrades](#upgrades)
    - [Monitoring](#monitoring)
    - [Adoption](#adoption)
    - [Multitenancy Management](#multitenancy-management)
    - [Disaster Recovery](#disaster-recovery)
  - [Operator of Management Cluster](#operator-of-management-cluster)
    - [Versioning and Upgrades](#versioning-and-upgrades)
    - [Removing Cluster API](#removing-cluster-api)
    - [Cross-cluster Metrics](#cross-cluster-metrics)
    - [Specific Architecture Approaches](#specific-architecture-approaches)
    - [Multitenancy Management](#multitenancy-management-1)
  - [Multi-cluster/Multi-provider](#multi-clustermulti-provider)
    - [Managing Providers](#managing-providers)
    - [Creating Workload Clusters](#creating-workload-clusters)
    - [Provider Implementors](#provider-implementors)
  - [Cluster Health Checking](#cluster-health-checking)

# Cluster API Reference Use Cases

This is a living document that serves as a reference and a staging area for use cases collected from the community during the post-v1alpha1 project redesign.

## Role Glossary
- __User__: Consumer of a Kubernetes-conformant cluster created by Cluster API.
  - Does not use Cluster API.
- __Operator__: Administrator responsible for creating and managing a Kubernetes cluster deployed by Cluster API.
  - Uses Cluster API.
- __Multi-cluster operator__: An operator responsible for multiple Kubernetes clusters deployed by Cluster API.
  - Uses Cluster API.
  - Cares about keeping config similar between many clusters.


## Icon Glossary
- 🔭 Out of scope for Cluster API itself, but should be possible via a higher-level tool.
  - These are use cases that we should take care not to prevent.

## Operator of Workload Cluster

### Creating Clusters

- As an operator, given that I have a cluster running Cluster API, I want to be able to use declarative APIs to manage another Kubernetes cluster (create, upgrade, scale, delete).

- As an operator, given that I have a cluster running Cluster API, I want to be able to use declarative APIs to manage a vendor’s Kubernetes-conformant cluster (create, upgrade, scale, delete).

- As an operator, when I create a new cluster using Cluster API, I want Cluster API to automatically create and manage any supporting provider infrastructure needed for my new cluster to function.

- As an operator, when I create a new cluster using Cluster API, I want to be able to use existing infrastructure (e.g. VPCs, SecurityGroups, veth, GPUs).

- 🔭 As an operator, when I create a new cluster using Cluster API, I want to be able to take advantage of resource-aware topology (e.g. compute, device availability, etc.) to place machines.

- As an operator, I need to have a way to make minor customizations before kubelet starts while using a standard node image and standard boot script. Example: write a file, disable hyperthreading, load a kernel module.

- As an operator, I need to have a way to apply labels to Nodes created through Cluster API. This will allow me to partition workloads across Nodes/Machines and MachineDeployments. Examples of labels include datacenter, subnet, and hypervisor, which applications can use in affinity policies.

- As an operator, I want to be able to provision the nodes of a workload cluster on an existing vnet that I don’t have admin control of.

### Staged Adoption of Cluster API By Operators

- As an operator, I would like to use some features of Cluster API without using all features of Cluster API.

- As an operator, given that I have a management cluster and a pre-existing control plane, I would like to manage the lifecycle of a group of worker nodes without managing the control plane those nodes join.

### Deleting Clusters

- As an operator, when I delete a Cluster object, I want Cluster API to delete all the infrastructure it created for that cluster.

- As an operator, when I delete a Machine object, I want Cluster API to gracefully shut down (drain) that Node and delete all the infrastructure it created for that Machine.

### Scaling

- As an operator, given that I have deployed a cluster using Cluster API, I want to configure the cluster-autoscaler to drive scaling operations.

- As an operator, given that I have a management cluster and a workload cluster, I want to retrieve, set, and change the number of worker Nodes or control plane Nodes in my workload cluster.

- As an operator, given I have a management cluster and a workload cluster, I want to control the sizing, scaling, and optimizing of the workload cluster’s control plane in terms of Kubernetes primitives (e.g. HPA, VPA, resource limits, etc.). I would like to import the best practices, sizing metrics, and knowledge from specialized groups such as SIG Scalability; the information should be expressed uniformly in Kubernetes terms.

- As an operator, I expect Cluster API to maintain the number and type of Nodes that I have currently requested as members of the cluster.

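The last use case above amounts to level-based reconciliation: compare the requested number of Nodes with what exists and correct the difference. The following is a minimal Python sketch of that idea only. The `plan_scaling` function and the machine names are hypothetical illustrations, not Cluster API types or APIs.

```python
from typing import List, Tuple


def plan_scaling(
    desired: int, existing: List[str], prefix: str = "worker"
) -> Tuple[List[str], List[str]]:
    """Return (machines_to_create, machines_to_delete) so that the
    resulting set has exactly `desired` members.

    Hypothetical helper for illustration; not part of Cluster API.
    """
    if desired < 0:
        raise ValueError("desired replica count must be non-negative")

    if len(existing) > desired:
        # Too many machines: delete the surplus, last listed first.
        surplus = len(existing) - desired
        return [], list(reversed(existing))[:surplus]

    # Too few (or exactly enough): create machines with unused names.
    taken = set(existing)
    to_create: List[str] = []
    index = 0
    while len(existing) + len(to_create) < desired:
        name = f"{prefix}-{index}"
        if name not in taken:
            to_create.append(name)
            taken.add(name)
        index += 1
    return to_create, []
```

Running the function repeatedly against the observed state converges on the requested count, which is the behavior the use case asks for regardless of how the difference arose.
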
### Configuration Updates

- As an operator, given that I have a management cluster and a workload cluster, I want to update the IaaS credentials used to lifecycle manage my workload cluster because the correct credentials have changed.

- As an operator, given I have a management cluster and a workload cluster, I want to apply configuration changes before kubelet starts.

- As an operator, given that I have deployed a workload cluster via Cluster API, I want to change config in my workload cluster for which Cluster API is authoritative and have a Cluster API controller manage the deployment of that new configuration over the workload cluster.

- As an operator, given that I have deployed a workload cluster via Cluster API and used a Cluster API controller to manage the deployment of a new (broken) configuration, I want to revert to the previous working configuration.

- As an operator, I want to declare the attributes (OS, kernel, CRI) of the nodes I want my workload to run on. The CAPI provider/controller should select an appropriate image that satisfies my constraints/attributes.
  - If the provider does not support the attributes I have specified or cannot find an appropriate image, it should fail with an appropriate error.

### Security

- As an operator, given I have a management cluster and a workload cluster, I want to know when my cluster’s/machines’ certificates will expire, so that I can plan to rotate them.

- As an operator, given I have a management cluster and a workload cluster, I want to automatically, periodically repave all the nodes of my cluster to reduce the risk of unauthorized software running on my machines.

- As an operator, I want an external CA to sign certificates for the workload cluster control plane.

- 🔭 As an operator, given I have a management cluster and a workload cluster, I want to rotate all the certificates and credentials my machines are using.
  - Some certificates might get rotated as part of a machine upgrade, but otherwise the above is out of scope.

- 🔭 As an operator, given I have a management cluster and a workload cluster, I want to rotate/change the CA used to sign certificates for my workload cluster.

- 🔭 As an operator, I want an external CA to sign certificates for workload cluster kubelets.

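The first use case in this section, knowing when certificates will expire, can be illustrated with a small Python sketch. It assumes the expiry timestamps have already been collected by some other means; the `expiring_certificates` function and the certificate names in the usage note are hypothetical and are not part of Cluster API.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def expiring_certificates(
    not_after: Dict[str, datetime],
    within: timedelta,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return the names of certificates that expire within `within` of
    `now` (already-expired certificates included), soonest first.

    All datetimes are assumed to be timezone-aware.
    """
    now = now or datetime.now(timezone.utc)
    deadline = now + within
    due = [(expiry, name) for name, expiry in not_after.items() if expiry <= deadline]
    return [name for _, name in sorted(due)]
```

For example, calling it with `within=timedelta(days=30)` over a mapping such as `{"apiserver": ..., "etcd-peer": ...}` yields the certificates an operator would need to plan rotation for in the next month.
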
### Upgrades
- As an operator, given I have a management cluster and a workload cluster, I want to patch the OS running on all of my machines (e.g. for a CVE).

- As an operator, given I have a management cluster and a workload cluster, I want to upgrade my workload cluster (control plane and nodes) to a new version of Kubernetes. I want the workload cluster control plane to be available during the upgrade.

- As an operator, given I have a management cluster and a workload cluster, I want to upgrade my workload cluster control plane to a new version of Kubernetes and also update my etcd version at the same time. I want to know in advance if the upgrade will require control plane downtime.

- As an operator, given I have a management cluster and a workload cluster, I want to upgrade the version of the CNI plugin and network daemon that my workload cluster is using. I want to know in advance if the upgrade could cause application downtime.

- 🔭 As an operator, given I have a management cluster and a workload cluster, I want to upgrade my workload cluster to a new version of etcd without upgrading the Kubernetes control plane. I want to know in advance if the upgrade will require control plane downtime.

### Monitoring
- 🔭 As an operator, given I have a management cluster and a workload cluster, I want to retrieve metrics about the underlying machines (e.g. CPU usage, memory) in the workload cluster.

- 🔭 As an operator, given I have a management cluster, a workload cluster, and permission to open interactive shells on that workload cluster, I want to open an interactive shell on the machines in my workload cluster.

- 🔭 As an operator, given I have a management cluster and a workload cluster, I want to monitor the cleanup of persistent disks/volumes used by my workload cluster.

- 🔭 As an operator, given I have a management cluster and a workload cluster, I want to monitor the cleanup of resources created by my workload cluster.

- 🔭 As an operator, given I have a management cluster and a workload cluster, I want to ensure the etcd database in my workload cluster is backed up.

### Adoption
- 🔭 As an operator, given I have created a Kubernetes-conformant cluster without Cluster API, I want to use Cluster API to manage it. In order to do so, I need to know the requirements for adopting/importing this cluster in terms of required CRDs and operators (e.g. Machine and Cluster objects).

### Multitenancy Management
- 🔭 As an operator, given I have a management cluster and a workload cluster, I want to set up roles, role bindings, users, and usage quotas on my workload cluster.

### Disaster Recovery
- 🔭 As an operator, I want to be able to recover from the complete loss of all the control plane replicas of a workload cluster. This excludes etcd.

- 🔭 As an operator, I want to be able to recover the etcd cluster of a workload cluster from an irrecoverable failure. I will provide the etcd snapshot required by the recovery mechanism.

## Operator of Management Cluster

- As an operator, given I have a Kubernetes-conformant cluster, I would like to install Cluster API and a provider on it in a straightforward process.

- As an operator, given I have a management cluster that was deployed by Cluster API (via the pivot workflow), I want to manage the lifecycle of my management cluster using Cluster API.

- As an operator, given I am following the instructions in the Cluster API (/provider) README, I expect the instructions to work and that I will end up with a working management cluster.

### Versioning and Upgrades
- As an operator, when I choose a version of Cluster API and provider to use, I want to know what version(s) of Kubernetes and other software (CNI, Docker, OS, etc.) can be managed by a specific Cluster API and/or provider version.

- As an operator of a management cluster, given that I have a cluster running Cluster API, I would like to upgrade Cluster API and its provider(s) without the users of Cluster API noticing (e.g. due to their API requests not working).

- As an operator of a management cluster, I want to know what versions of kubelet, control plane, OS, etc. all of the associated workload clusters are running, so that I can plan upgrades to the management cluster that will not break anyone’s ability to manage their workload clusters.

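The planning step in the last use case above can be sketched as a check of workload cluster versions against the range a target release is assumed to manage. This is a minimal Python sketch: the function names are hypothetical, and the version range passed in is an input chosen by the caller, not a statement of the real Cluster API support matrix.

```python
from typing import Dict, List, Tuple


def parse_minor(version: str) -> Tuple[int, int]:
    """Parse 'v1.28.3' or '1.28' into (major, minor)."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def unsupported_clusters(
    cluster_versions: Dict[str, str],
    min_supported: str,
    max_supported: str,
) -> List[str]:
    """Return, sorted by name, the clusters whose Kubernetes minor
    version falls outside [min_supported, max_supported]."""
    low, high = parse_minor(min_supported), parse_minor(max_supported)
    return sorted(
        name
        for name, version in cluster_versions.items()
        if not (low <= parse_minor(version) <= high)
    )
```

An empty result suggests the management cluster upgrade would not strand any workload cluster; a non-empty one lists the clusters to upgrade (or exclude) first.
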
### Removing Cluster API
- As an operator of a management cluster, given that I have a management cluster that I have used to deploy several workload clusters, I want to remove the Cluster/Machine objects representing one workload cluster from my management cluster without deprovisioning that workload cluster.

- As an operator of a management cluster, given that I have a management cluster that I have used to deploy several workload clusters, I want to uninstall Cluster API from my management cluster without deprovisioning my workload clusters.

- As an operator of a management cluster, given that I have a management cluster, I want to use it to manage workload clusters that were created by a different management cluster.

### Cross-cluster Metrics
- As an operator of a management cluster, I want to query my resource allocation on an infrastructure. For example, in an on-prem case, I do not have an infinite capacity cloud, so I need to be able to determine my reservation before deploying a workload cluster.

### Specific Architecture Approaches
- As an operator of a management cluster, given that I give operators of workload clusters access to my management cluster, I want them to be able to launch new workload clusters with control planes that run in the management cluster while the nodes of those workload clusters run elsewhere.

- As a multi-cluster operator, I would like to provide an EKS-like experience in which the workload control plane nodes are joined to the management cluster and the control plane config isn’t exposed to the consumer of the workload cluster. This enables me as an operator to manage the control plane nodes for all clusters using tooling like Prometheus and Fluentd. I can also control the configuration of the workload control plane in accordance with business policy.

### Multitenancy Management
- As an operator of a management cluster, I want to control which users of the management cluster can deploy new workload clusters, how many clusters they can deploy, and how many nodes/resources those clusters can use.

- As an operator of a management cluster, I want to ensure that only the user who creates a new workload cluster (and some specific other users) can manage and access the workload cluster.

- As an operator of a management cluster, I want the user who creates a new workload cluster to be able to give permission to other users to manage that cluster.

- As an operator of a management cluster, I want to configure whether operators of workload clusters are allowed to open interactive shells onto those clusters’ machines.

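The first use case in this list (who may deploy clusters, how many, and how large) can be read as an admission check against per-user quotas. The Python sketch below illustrates that reading only; `Quota`, `may_create_cluster`, and the data shapes are hypothetical and do not correspond to any Cluster API or Kubernetes object.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Quota:
    """Hypothetical per-user limits set by the management cluster operator."""
    max_clusters: int
    max_nodes: int


def may_create_cluster(
    user: str,
    requested_nodes: int,
    quotas: Dict[str, Quota],
    owned: Dict[str, List[int]],
) -> bool:
    """Decide whether `user` may create a cluster with `requested_nodes` nodes.

    `owned` maps each user to the node counts of the clusters they already own.
    """
    quota = quotas.get(user)
    if quota is None:
        # Users without a quota entry may not deploy clusters at all.
        return False
    current = owned.get(user, [])
    if len(current) + 1 > quota.max_clusters:
        return False
    return sum(current) + requested_nodes <= quota.max_nodes
```

Denying by default for users with no quota entry keeps the "which users can deploy" decision explicit rather than implied.
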
## Multi-cluster/Multi-provider

### Managing Providers
- As an operator, given I have a management cluster with at least one provider, I would like to install a new provider.

- As an operator, given I have a management cluster with at least one provider, I would like to remove one of those providers and orphan any clusters provisioned by that provider.

- As a multi-cluster operator, given that I have a single management cluster and that I have installed multiple providers and that one of those providers is malicious, I want that provider not to see IaaS secrets provided to any of the other providers.

- As an operator, if I have a management cluster running a particular Cluster API version and a particular set of providers, then I want to plan an upgrade of Cluster API and the providers so that I upgrade one at a time and always end up with a compatible set of versions.

### Creating Workload Clusters
- As a multi-cluster operator, given that I have a management cluster, I want to create workload clusters across multiple providers with a consistent interface. For example, if I can create clusters on AWS without any manual intervention, I should have the same level of automation and lack of gotchas when using the vSphere provider.

- As a multi-cluster operator, given that I have a management cluster, I want to create workload clusters across multiple providers that are all similarly configured.

- As a multi-cluster operator, given that I have deployed my clusters via Cluster API, I want to find general information (name, status, access details) about my clusters across multiple providers.

- As a multi-cluster operator, I want to know what versions of Kubernetes all of my workload clusters are running across multiple providers.

- As a multi-cluster operator, given that I deploy workload clusters via several providers, I want to see a health and status summary from different providers. The detailed information can be provider specific, but a general, common status for generic phases must be given.

- As a multi-cluster operator, given that I have deployed my clusters via Cluster API, I want to view the configuration of all my clusters across multiple providers.

- As a multi-cluster operator, given that I have a single management cluster and that I have installed multiple providers, I want to lifecycle manage multiple workload clusters on each installed provider.

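The "general, common status for generic phases" use case above could be approached by mapping provider-specific states onto a small shared vocabulary. The Python sketch below shows that idea under stated assumptions: the state names, the phase names, and the `summarize` function are all hypothetical and are not taken from any provider or from Cluster API.

```python
from collections import Counter
from typing import Dict

# Hypothetical mapping from provider-specific states to common phases.
COMMON_PHASE = {
    "creating": "Provisioning",
    "pending": "Provisioning",
    "active": "Ready",
    "running": "Ready",
    "error": "Failed",
}


def summarize(clusters: Dict[str, str]) -> Dict[str, int]:
    """Count clusters per common phase.

    `clusters` maps a cluster name to its provider-specific state.
    States with no mapping are reported as 'Unknown'.
    """
    counts = Counter(
        COMMON_PHASE.get(state.lower(), "Unknown") for state in clusters.values()
    )
    return dict(counts)
```

Provider-specific detail stays available in the raw state, while the summary only ever exposes the common phases.
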
### Provider Implementors
- As a provider, I want the machine controller to reconcile a Machine in response to an event from some other resource in the cluster. Other controllers do this on a regular basis, so it is nothing unusual. However, once I have written a machine actuator, there is no easy way to get access to the machine controller object in order to call its `Watch` method.

## Cluster Health Checking

Cluster Health Checking is a service that provides the health status of a Kubernetes cluster and its components.

- As an operator, given I have created a Kubernetes-conformant cluster with Cluster API, I want to check the Kubernetes cluster node status.
  - Describe nodes and report whether they are ready/healthy or not.
  - List conditions for any nodes which are `NotReady`, and list information about allocated resources.

- As an operator, given I have created a Kubernetes-conformant cluster with Cluster API, I want to check the kube-apiserver status.

- As an operator, given I have created a Kubernetes-conformant cluster with Cluster API, I want to check the etcd status.

- 🔭 As an operator, given I have created a Kubernetes-conformant cluster with Cluster API, I want to check the status of Kubernetes components, such as the ingress controller and other add-on components.

- 🔭 As an operator, given I have created a Kubernetes-conformant cluster with Cluster API, I want to check the statuses of unhealthy Pods in a configured namespace.
  - Provide the details on any pods which are unhealthy in the `kube-system` namespace. Filter the unhealthy pods by their status (`kubectl get pods --show-labels -n kube-system | grep -vE "Running|Completed"`).
  - Describe any Pods which are not `Completed` or `Running`, and list the Events to provide hints on the failure.
  - Look for Pods which don't have all of their containers running.
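
The Pod checks in the last use case can be sketched in Python as a filter over the table that `kubectl get pods` prints. This is an illustration only, assuming the default column order (`NAME`, `READY`, `STATUS`, ...); the `unhealthy_pods` function is hypothetical. Unlike the `grep -vE` filter, which matches anywhere on the line, it inspects the `STATUS` column directly and also flags `Running` Pods whose containers are not all ready.

```python
from typing import List, Tuple

HEALTHY_STATUSES = {"Running", "Completed"}


def unhealthy_pods(kubectl_output: str) -> List[Tuple[str, str]]:
    """Return (pod name, reason) pairs from `kubectl get pods` table text."""
    findings: List[Tuple[str, str]] = []
    lines = kubectl_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        columns = line.split()
        if len(columns) < 3:
            continue
        name, ready, status = columns[0], columns[1], columns[2]
        if status not in HEALTHY_STATUSES:
            findings.append((name, f"status {status}"))
            continue
        # READY is printed as "<ready>/<total>"; Completed Pods are
        # expected to show 0 ready containers, so only check Running ones.
        ready_count, _, total = ready.partition("/")
        if status == "Running" and ready_count != total:
            findings.append((name, f"only {ready} containers ready"))
    return findings
```

The function takes text rather than calling the cluster, so it can be fed the captured output of `kubectl get pods -n kube-system` and tested without cluster access.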