# Machine Infrastructure Provider Specification

## Overview

A machine infrastructure provider is responsible for managing the lifecycle of provider-specific machine instances.
These may be physical or virtual instances, and they represent the infrastructure for Kubernetes nodes.

## Data Types

A machine infrastructure provider must define an API type for "infrastructure machine" resources. The type:

1. Must belong to an API group served by the Kubernetes apiserver.
2. Must be implemented as a CustomResourceDefinition.
    1. The CRD name must have the format produced by `sigs.k8s.io/cluster-api/util/contract.CalculateCRDName(Group, Kind)`.
3. Must be namespace-scoped.
4. Must have the standard Kubernetes "type metadata" and "object metadata".
5. Must have a `spec` field with the following:
    1. Required fields:
        1. `providerID` (string): the identifier for the provider's machine instance. This field is expected to match
           the value that the KCM (kube-controller-manager) cloud provider sets on the corresponding Node.
           The Machine controller copies it to the Machine resource, where it is used to find the matching Node.
           Any other consumer can use the providerID as the source of truth to match Machines and Nodes.
    2. Optional fields:
        1. `failureDomain` (string): the identifier of the failure domain the instance is running in. It exists for
           backwards compatibility and for migrating to the v1alpha3 FailureDomain support, where the failure domain
           is specified in `Machine.Spec.FailureDomain`. The field is meant to be temporary, to aid migration of data
           that was previously defined on the provider type. Providers are expected to remove it in the next version
           that makes breaking API changes, in favor of the value defined on `Machine.Spec.FailureDomain`. If the
           provider supports conversions from previous types, it needs to convert the provider-specific field that
           was previously used into the `failureDomain` field to support the automated migration path.
6. Must have a `status` field with the following:
    1. Required fields:
        1. `ready` (boolean): indicates the provider-specific infrastructure has been provisioned and is ready
    2. Optional fields:
        1. `failureReason` (string): indicates there is a fatal problem reconciling the provider's infrastructure;
            meant to be suitable for programmatic interpretation
        2. `failureMessage` (string): indicates there is a fatal problem reconciling the provider's infrastructure;
            meant to be a more descriptive value than `failureReason`
        3. `addresses` (`MachineAddresses`): a list of the host names, external IP addresses, internal IP addresses,
            external DNS names, and/or internal DNS names for the provider's machine instance. Each entry is a
            `MachineAddress`, defined as:
            - `type` (string): one of `Hostname`, `ExternalIP`, `InternalIP`, `ExternalDNS`, `InternalDNS`
            - `address` (string)
7. Should have a `conditions` field with the following:
    1. A `Ready` condition to represent the overall operational state of the component. It can be based on the summary
       of more detailed conditions existing on the same object, for example `InstanceReady` or `SecurityGroupsReady`.

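As an illustration of the `spec` and `status` contract above, here is a minimal sketch of the fields for a hypothetical `FooMachine` type. It uses only the standard library, so the Kubernetes type and object metadata are left out, and the names `FooMachine`, `FooMachineSpec`, `FooMachineStatus`, and `ToJSON` are assumptions made for this example. The pointer types and `omitempty` tags are one possible choice, reflecting that the provider fills in `providerID` only after the instance exists.

```go
package main

import "encoding/json"

// MachineAddress mirrors the shape described above: an address type plus the address itself.
type MachineAddress struct {
	Type    string `json:"type"`    // Hostname, ExternalIP, InternalIP, ExternalDNS or InternalDNS
	Address string `json:"address"` // the address value
}

// FooMachineSpec is a hypothetical spec for an infrastructure machine.
type FooMachineSpec struct {
	// ProviderID identifies the provider's machine instance. It is a pointer
	// here because the provider sets it once the instance has been created.
	ProviderID *string `json:"providerID,omitempty"`
	// FailureDomain is the optional, migration-only field described above.
	FailureDomain *string `json:"failureDomain,omitempty"`
}

// FooMachineStatus is a hypothetical status for an infrastructure machine.
type FooMachineStatus struct {
	// Ready reports that the infrastructure has been provisioned.
	Ready bool `json:"ready"`
	// FailureReason and FailureMessage describe a fatal reconciliation problem.
	FailureReason  *string `json:"failureReason,omitempty"`
	FailureMessage *string `json:"failureMessage,omitempty"`
	// Addresses lists the known addresses of the instance.
	Addresses []MachineAddress `json:"addresses,omitempty"`
}

// FooMachine combines spec and status. A real type would also embed the
// standard type metadata and object metadata, omitted in this sketch.
type FooMachine struct {
	Spec   FooMachineSpec   `json:"spec,omitempty"`
	Status FooMachineStatus `json:"status,omitempty"`
}

// ToJSON serializes the object so the resulting field names can be inspected.
func ToJSON(m FooMachine) (string, error) {
	b, err := json.Marshal(m)
	return string(b), err
}
```

Serializing a value with `ToJSON` shows the field names that consumers of the contract read, such as `spec.providerID` and `status.ready`.
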
### InfraMachineTemplate Resources

For a given InfraMachine resource, you should also add a corresponding InfraMachineTemplate resource:

```go
// Assumed imports for this fragment:
//   metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
//   clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"

// InfraMachineTemplateSpec defines the desired state of InfraMachineTemplate.
type InfraMachineTemplateSpec struct {
	Template InfraMachineTemplateResource `json:"template"`
}

// +kubebuilder:object:root=true
// +kubebuilder:resource:path=inframachinetemplates,scope=Namespaced,categories=cluster-api,shortName=imt
// +kubebuilder:storageversion

// InfraMachineTemplate is the Schema for the inframachinetemplates API.
type InfraMachineTemplate struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec InfraMachineTemplateSpec `json:"spec,omitempty"`
}

// InfraMachineTemplateResource holds the metadata and spec used to create an InfraMachine from the template.
type InfraMachineTemplateResource struct {
	// Standard object's metadata.
	// More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata
	// +optional
	ObjectMeta clusterv1.ObjectMeta `json:"metadata,omitempty"`
	Spec       InfraMachineSpec     `json:"spec"`
}
```

The CRD name of the template must also have the format produced by `sigs.k8s.io/cluster-api/util/contract.CalculateCRDName(Group, Kind)`.

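To make the naming format concrete, the sketch below approximates it with the standard library only. It assumes the format is the lowercased, pluralized kind followed by a dot and the API group, and it pluralizes by simply appending `s`, which is only right for kinds with a regular plural. The function `crdNameSketch` is hypothetical and is not part of Cluster API; provider code should call the real helper instead.

```go
package main

import (
	"fmt"
	"strings"
)

// crdNameSketch approximates the "<plural lowercase kind>.<group>" naming format.
// Assumption: the plural is formed by appending "s". Kinds with irregular
// plurals need a proper pluralization step, which this sketch does not have.
func crdNameSketch(group, kind string) string {
	return fmt.Sprintf("%ss.%s", strings.ToLower(kind), group)
}
```

Under these assumptions, `crdNameSketch("infrastructure.foo.com", "FooMachineTemplate")` yields `foomachinetemplates.infrastructure.foo.com`.
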
### List Resources

For each resource, also add a corresponding list resource, for example:

```go
// +kubebuilder:object:root=true

// InfraMachineList contains a list of InfraMachines.
type InfraMachineList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []InfraMachine `json:"items"`
}

// +kubebuilder:object:root=true

// InfraMachineTemplateList contains a list of InfraMachineTemplates.
type InfraMachineTemplateList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []InfraMachineTemplate `json:"items"`
}
```

## Behavior

A machine infrastructure provider must respond to changes to its "infrastructure machine" resources. This process is
typically called reconciliation. The provider must watch for new, updated, and deleted resources and respond
accordingly.

The following diagram shows the typical logic for a machine infrastructure provider:

![Machine infrastructure provider activity diagram](../../images/machine-infra-provider.png)

### Normal resource

1. If the resource does not have a `Machine` owner, exit the reconciliation
    1. The Cluster API `Machine` reconciler populates this based on the value in the `Machine`'s
       `spec.infrastructureRef` field
1. If the resource has `status.failureReason` or `status.failureMessage` set, exit the reconciliation
1. If the `Cluster` to which this resource belongs cannot be found, exit the reconciliation
1. Add the provider-specific finalizer, if needed
1. If the associated `Cluster`'s `status.infrastructureReady` is `false`, exit the reconciliation
    1. **Note**: This check should not block any further delete reconciliation flows.
    1. **Note**: This check should only be performed after appropriate owner references (if any) are updated.
1. If the associated `Machine`'s `spec.bootstrap.dataSecretName` is `nil`, exit the reconciliation
1. Reconcile provider-specific machine infrastructure
    1. If any errors are encountered:
        1. If they are terminal failures, set `status.failureReason` and `status.failureMessage`
        1. Exit the reconciliation
    1. If this is a control plane machine, register the instance with the provider's control plane load balancer
       (optional)
1. Set `spec.providerID` to the provider-specific identifier for the provider's machine instance
1. Set `status.ready` to `true`
1. Set `status.addresses` to the provider-specific set of instance addresses (optional)
1. Set `spec.failureDomain` to the provider-specific failure domain the instance is running in (optional)
1. Patch the resource to persist changes

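The ordering of the checks above can be sketched as a plain function. This is an illustrative outline under stated assumptions, not the code of any real provider: `Cluster`, `Machine`, `InfraMachine`, `Provision`, `ErrTerminal`, the finalizer name, and the `CreateError` reason are all hypothetical stand-ins built on the standard library. The optional steps (load balancer registration, addresses, failure domain) and the final patch are left to the caller.

```go
package main

import "errors"

// Hypothetical stand-ins for the Cluster API and provider types.
type Cluster struct{ InfrastructureReady bool }

type Machine struct{ DataSecretName *string }

type InfraMachine struct {
	Finalizers     []string
	ProviderID     *string
	Ready          bool
	FailureReason  *string
	FailureMessage *string
}

// machineFinalizer is a made-up finalizer name for this sketch.
const machineFinalizer = "foomachine.infrastructure.foo.com"

// ErrTerminal marks a provisioning failure that retrying cannot fix.
var ErrTerminal = errors.New("terminal provisioning failure")

// Provision creates or looks up the instance and returns its provider ID.
type Provision func() (providerID string, err error)

// ReconcileNormal applies the checks in the order listed above and returns a
// short description of where it stopped. The caller persists any changes.
func ReconcileNormal(im *InfraMachine, owner *Machine, cluster *Cluster, provision Provision) (string, error) {
	if owner == nil {
		return "waiting for Machine owner", nil
	}
	if im.FailureReason != nil || im.FailureMessage != nil {
		return "terminal failure already recorded", nil
	}
	if cluster == nil {
		return "cluster not found", nil
	}
	addFinalizer(im)
	if !cluster.InfrastructureReady {
		return "waiting for cluster infrastructure", nil
	}
	if owner.DataSecretName == nil {
		return "waiting for bootstrap data", nil
	}
	id, err := provision()
	if err != nil {
		if errors.Is(err, ErrTerminal) {
			// Only terminal failures are recorded on the status.
			reason, msg := "CreateError", err.Error()
			im.FailureReason, im.FailureMessage = &reason, &msg
		}
		return "provisioning failed", err
	}
	im.ProviderID = &id
	im.Ready = true
	return "ready", nil
}

// addFinalizer adds the provider finalizer if it is not already present.
func addFinalizer(im *InfraMachine) {
	for _, f := range im.Finalizers {
		if f == machineFinalizer {
			return
		}
	}
	im.Finalizers = append(im.Finalizers, machineFinalizer)
}
```

Note that the finalizer is added before the cluster infrastructure check, so a resource that is still waiting already carries the finalizer needed for a later delete reconciliation.
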
### Deleted resource

1. If the resource has a `Machine` owner
    1. Perform deletion of provider-specific machine infrastructure
    1. If this is a control plane machine, deregister the instance from the provider's control plane load balancer
       (optional)
    1. If any errors are encountered, exit the reconciliation
1. Remove the provider-specific finalizer from the resource
1. Patch the resource to persist changes

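The delete flow above can be sketched in the same style. Again this is a hedged outline with hypothetical names (`InfraMachine`, `DeleteInstance`, `ErrInstanceStillExists`, and the finalizer name are assumptions), using only the standard library. The key point it illustrates is that the finalizer stays in place when deletion fails, so the reconciliation is retried.

```go
package main

import "errors"

// InfraMachine is a hypothetical stand-in holding only what the delete flow needs.
type InfraMachine struct {
	Finalizers      []string
	HasMachineOwner bool
}

// machineFinalizer is a made-up finalizer name for this sketch.
const machineFinalizer = "foomachine.infrastructure.foo.com"

// ErrInstanceStillExists is an example of an error a deletion attempt may return.
var ErrInstanceStillExists = errors.New("instance still exists")

// DeleteInstance removes the provider-specific machine infrastructure.
type DeleteInstance func() error

// ReconcileDelete deletes the infrastructure when the resource has a Machine
// owner and removes the finalizer only after deletion succeeded. The caller
// patches the resource afterwards to persist the change.
func ReconcileDelete(im *InfraMachine, deleteInstance DeleteInstance) error {
	if im.HasMachineOwner {
		if err := deleteInstance(); err != nil {
			// Keep the finalizer so that deletion is attempted again.
			return err
		}
	}
	removeFinalizer(im)
	return nil
}

// removeFinalizer drops the provider finalizer and keeps all others.
func removeFinalizer(im *InfraMachine) {
	kept := make([]string, 0, len(im.Finalizers))
	for _, f := range im.Finalizers {
		if f != machineFinalizer {
			kept = append(kept, f)
		}
	}
	im.Finalizers = kept
}
```

A resource without a `Machine` owner skips the infrastructure deletion entirely and only has its finalizer removed.
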
## RBAC

### Provider controller

A machine infrastructure provider must have RBAC permissions for the types it defines. If you are using `kubebuilder` to
generate new API types, these permissions should be configured for you automatically. For example, the AWS provider has
the following configuration for its `AWSMachine` type:

```
// +kubebuilder:rbac:groups=infrastructure.cluster.x-k8s.io,resources=awsmachines,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=infrastructure.cluster.x-k8s.io,resources=awsmachines/status,verbs=get;update;patch
```

A machine infrastructure provider may also need RBAC permissions for other types, such as `Cluster` and `Machine`. If
you need read-only access, you can limit the permissions to `get`, `list`, and `watch`. You can use the following
configuration for retrieving `Cluster` and `Machine` resources:

```
// +kubebuilder:rbac:groups=cluster.x-k8s.io,resources=clusters;clusters/status,verbs=get;list;watch
// +kubebuilder:rbac:groups=cluster.x-k8s.io,resources=machines;machines/status,verbs=get;list;watch
```

### Cluster API controllers

The Cluster API controller for `Machine` resources is configured with full read/write RBAC
permissions for all resources in the `infrastructure.cluster.x-k8s.io` API group. This group
represents all machine infrastructure providers for SIG Cluster Lifecycle-sponsored provider
subprojects. If you are writing a provider not sponsored by the SIG, you must grant full read/write
RBAC permissions for the "infrastructure machine" resource in your API group to the Cluster API
manager's `ServiceAccount`. `ClusterRoles` can be granted using the [aggregation label]
`cluster.x-k8s.io/aggregate-to-manager: "true"`. The following is an example `ClusterRole` for a
`FooMachine` resource:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: capi-foo-machines
  labels:
    cluster.x-k8s.io/aggregate-to-manager: "true"
rules:
- apiGroups:
  - infrastructure.foo.com
  resources:
  - foomachines
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
```

Note that the write permissions allow the `Machine` controller to set owner references and labels on the
"infrastructure machine" resources; they are not used for general mutations of these resources.

[aggregation label]: https://kubernetes.io/docs/reference/access-authn-authz/rbac/#aggregated-clusterroles