---
title: IP Address Management Integration
authors:
  - "@schrej"
reviewers:
  - "@enxebre"
  - "@fabriziopandini"
  - "@furkatgofurov7"
  - "@lubronzhan"
  - "@maelk"
  - "@neolit123"
  - "@randomvariable"
  - "@sbueringer"
  - "@srm09"
  - "@yastij"
creation-date: 2022-01-25
last-updated: 2022-04-20
status: provisional
---

# IP Address Management Integration

## Table of Contents

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->

- [Glossary](#glossary)
  - [IPAM Provider](#ipam-provider)
  - [IPAddressClaim](#ipaddressclaim)
  - [IPAddress](#ipaddress)
  - [IP Pool](#ip-pool)
- [Summary](#summary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals/Future Work](#non-goalsfuture-work)
- [Proposal](#proposal)
  - [User Stories](#user-stories)
    - [Story 1](#story-1)
    - [Story 2](#story-2)
    - [Story 3](#story-3)
    - [Story 4](#story-4)
  - [IPAM API Contract](#ipam-api-contract)
  - [Pools & IPAM Providers](#pools--ipam-providers)
    - [Examples](#examples)
  - [Consumption](#consumption)
    - [Example](#example)
  - [Implementation Details/Notes/Constraints](#implementation-detailsnotesconstraints)
    - [New API Types](#new-api-types)
      - [IPAddressClaim](#ipaddressclaim-1)
      - [IPAddress](#ipaddress-1)
      - [New Reference Type](#new-reference-type)
    - [Implementing an IPAM Provider](#implementing-an-ipam-provider)
    - [Consuming as an Infrastructure Provider](#consuming-as-an-infrastructure-provider)
    - [Additional Notes](#additional-notes)
  - [Security Model](#security-model)
  - [Risks and Mitigations](#risks-and-mitigations)
- [Alternatives](#alternatives)
- [Implementation History](#implementation-history)

<!-- END doctoc generated TOC please keep comment here to allow
auto update -->

## Glossary

Refer to the [Cluster API Book Glossary](https://cluster-api.sigs.k8s.io/reference/glossary.html).

### IPAM Provider

A controller that watches `IPAddressClaims` and fulfils them with `IPAddresses` that it allocates from an IP Pool. It comes with its own IP Pool custom resource definition to provide parameters for allocating addresses. Providers can work in-cluster relying only on custom resources, or integrate with external systems like Netbox or Infoblox.

### IPAddressClaim

An `IPAddressClaim` is a custom resource that can be used by infrastructure providers to allocate IP addresses from IP Pools that are managed by IPAM providers.

### IPAddress

An `IPAddress` is a custom resource that gets created by an IPAM provider to fulfil an `IPAddressClaim`, which then gets consumed by the infrastructure provider that created the claim.

### IP Pool

An IP Pool is a custom resource that is provided by an IPAM provider. It holds the configuration for a single pool of IP addresses, from which addresses can be allocated using an `IPAddressClaim`.

## Summary

This proposal adds an API contract for pluggable IP Address Management to CAPI that any infrastructure provider can choose to implement. It allows them to dynamically allocate and release IP addresses from IP address pools, following a concept similar to Kubernetes' PersistentVolumes. Pools are managed by IPAM providers, which make it possible to integrate with external systems like Netbox or Infoblox.

## Motivation

IP address management for machines is currently left to the infrastructure providers for CAPI. Most on-premise providers (e.g. CAPV) allow the use of either DHCP or statically configured IPs. Since Machines are created from templates, static allocation would require a separate template for each machine, which prevents dynamic scaling of nodes without custom controllers that create new templates.

While DHCP is a viable solution for dynamic IP assignment which also enables scaling, it can cause problems in conjunction with CAPI. Especially in smaller networks, rolling cluster upgrades can exhaust the network in use. When multiple machines are replaced in quick succession, each of them will get a new DHCP lease. Unless the lease time is very short, at least twice as many IPs as the maximum number of nodes have to be allocated to a cluster.

Metal3 has an ip-address-manager component that allows for in-cluster management of IP addresses through custom resources, which makes it possible to avoid DHCP. CAPV allows omitting the address from the template while DHCP is disabled, and will wait until it is set manually, which allows custom controllers to take care of IP assignments. At DTAG we've extended metal3's ip-address-manager and wrote a custom controller for CAPV to integrate both with Infoblox, our IPAM solution of choice. The CAPV project has shown interest, and there has been a proposal to allow integrating metal3's ip-address-manager with external systems.

All on-premise providers have a need for IP Address Management, since they can't leverage SDN features that are available in clouds such as AWS, Azure or GCP. We therefore propose to add an API contract to CAPI, which allows infrastructure providers to integrate with different IPAM systems. An approach similar to Kubernetes' PersistentVolumes should be used, where infrastructure providers create Claims that reference a specific IP Pool, and IPAM providers fulfil claims to Pools managed by them.

### Goals

- Define an API contract that
  - allows infrastructure providers to dynamically allocate and release addresses
  - allows IPAM providers to be written that integrate with any IPAM solution
  - supports both IPv4 and IPv6
  - allows multiple IPAM providers to run in parallel

### Non-Goals/Future Work

- IPAM providers should not be added to CAPI directly, but live as individual projects instead.
- `IPAddress` and `IPAddressClaim` will only have limited validation (e.g. that the referenced Pool exists and that IP address strings are valid addresses). Advanced validation (whether an IP is actually part of the correct subnet for example) needs to be handled by IPAM providers.
- This proposal only focuses on integrating Machine creation with an IPAM provider.
- Allocation of pools and addresses during cluster creation (e.g. the node network, the overlay network, API address etc.) is currently out of scope of this proposal.
  - Support for allocating Pools could be added by extending the proposal with an `IPPoolClaim` that can be used to divide IP Pools into smaller Pools. It would work the same way as `IPAddressClaims`, but yield new Pools of the same kind instead of an `IPAddress`.
- Moving network configuration from infrastructure templates to `MachineDeployments` should be discussed in a separate proposal. The IPAM integration proposed here can be reused when network configuration is moved, and would allow `IPAddressClaims` to be created by CAPI instead of having to re-implement it in all infrastructure providers that want to support it.

## Proposal

The chosen approach was heavily inspired by metal3's ip-address-manager, which in turn seems to have been inspired by Kubernetes' PersistentVolumes.

The proposal suggests a way for infrastructure providers to integrate with different IPAM providers.
It allows them to allocate and release IP addresses in sync with the lifecycle of the Machines they create. Releasing addresses as soon as Machines are deleted solves the shortcomings of DHCP and its time-based release of addresses, which can lead to temporary pool exhaustion.

The proposal is twofold and consists of the API contract that should be added to CAPI and two examples to better illustrate how the contract is used by IPAM and infrastructure providers.

### User Stories

#### Story 1

As a cluster operator I want to use static IP addresses with my Machines to avoid problems with DHCP lease duration during rolling cluster upgrades.

#### Story 2

As a cluster operator I want to integrate machines provisioned by CAPI with an external IPAM system and allocate IP addresses from it.

#### Story 3

As an infrastructure provider I want a single interface to integrate with multiple different IPAM providers.

#### Story 4

As an IPAM provider I want a single interface to integrate with all infrastructure providers that can benefit from IPAM integration.

### IPAM API Contract

The IPAM API contract provides a generic interface between CAPI infrastructure providers and to-be-developed IPAM providers. Any interested infrastructure provider is supposed to be able to request IP addresses from a defined pool of addresses. IPAM providers should then be able to find and fulfil such requests using any custom logic.

The contract consists of three resources:

- An `IPAddressClaim`, which is used to allocate addresses
- An `IPAddress` that IPAM providers use to fulfil `IPAddressClaims`
- IP Pools, which hold provider-specific configuration on how addresses should be allocated, e.g. a specific subnet to use

Both **IPAddressClaims** and **IPAddresses** should be part of Cluster API, while **IPPools** are defined by the different IPAM providers.

An **IPAddressClaim** is used by infrastructure providers to request an IP address. The claim contains a reference to an IP Pool. Because the pool is provider-specific, the IPAM controllers can decide whether to handle a claim by inspecting the group and kind of the pool reference.

If an IPAM controller detects a Claim that references a Pool it controls, it allocates an IP address from that pool and creates an **IPAddress** to fulfil the claim. It also updates the status of the `IPAddressClaim` with a reference to the created `IPAddress`.

### Pools & IPAM Providers

In order for IPAM providers to fulfil `IPAddressClaims`, they likely require a few parameters, e.g. the network from which to allocate an IP address. If there are multiple different IPAM providers, they also need to know which claims they are supposed to fulfil, and which to ignore.

Therefore, IPAM providers are required to bring a custom IP Pool resource. The resource can have an arbitrary structure with the specific parameters for the IPAM system in use.

By inspecting the API group and kind of the pool reference held by `IPAddressClaims`, the provider can determine whether it is supposed to handle a claim or not. When an IPAM provider finds an `IPAddressClaim` that references one of its Pools, it allocates an IP address, creates an `IPAddress` resource and updates the status of the `IPAddressClaim` with a reference to this `IPAddress`.
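
The group and kind check described above can be sketched roughly as follows. This is a minimal illustration, not the contract's implementation: the types are simplified local stand-ins for the resources proposed in this document, and `poolGroup`, `poolKind` and `handlesClaim` are hypothetical names chosen for the example.

```go
package main

import "fmt"

// LocalObjectReference mirrors the reference type proposed in this document.
type LocalObjectReference struct {
	Group string
	Kind  string
	Name  string
}

// IPAddressClaim is a simplified stand-in that only carries the pool reference.
type IPAddressClaim struct {
	Name string
	Pool LocalObjectReference
}

// Hypothetical group and kind owned by an example in-cluster provider.
const (
	poolGroup = "ipam.cluster.x-k8s.io"
	poolKind  = "InClusterIPPool"
)

// handlesClaim reports whether the claim references a pool type owned by this provider.
func handlesClaim(claim IPAddressClaim) bool {
	return claim.Pool.Group == poolGroup && claim.Pool.Kind == poolKind
}

func main() {
	claims := []IPAddressClaim{
		{Name: "example-1-1", Pool: LocalObjectReference{Group: poolGroup, Kind: poolKind, Name: "some-pool"}},
		{Name: "example-2-1", Pool: LocalObjectReference{Group: poolGroup, Kind: "InfobloxIPPool", Name: "ib-pool"}},
	}
	for _, c := range claims {
		// Only the first claim belongs to this provider; the second is left to another one.
		fmt.Printf("%s handled=%t\n", c.Name, handlesClaim(c))
	}
}
```

In a real controller such a check would typically run as an event filter, so that claims for other providers never reach the reconcile loop.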

#### Examples

A simple example for an IP Pool could look like this:

```yaml
apiVersion: ipam.cluster.x-k8s.io/v1alpha1
kind: InClusterIPPool
metadata:
  name: some-pool
spec:
  pools:
    - subnet: 10.10.10.0/24
      start: 10.10.10.100
      end: 10.10.10.200
```

A pool for an Infoblox Provider could look like this:

```yaml
apiVersion: ipam.cluster.x-k8s.io/v1alpha1
kind: InfobloxIPPool
metadata:
  name: ib-pool
spec:
  networkView: "some-view"
  dnsZone: "test.example.com."
  network: 10.10.10.0/24
```

### Consumption

The consumers of this API contract are the infrastructure providers. As mentioned in the introduction, not all providers will require integration with IPAM systems. In addition, DHCP may be a viable or even better option in some environments. Integration with this API should therefore be implemented as an optional feature.

To support IPAM integration, two things need to be implemented:

1. Users need to be able to specify the IP Pool to allocate addresses from.
2. The infrastructure provider needs to create `IPAddressClaims` for all IP addresses it requires, and then needs to wait until they are fulfilled.

The relationships between the provider and IPAM resources are shown in the diagram below.



More implementation details can be found in [Consuming as an Infrastructure Provider](#consuming-as-an-infrastructure-provider).
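
The second requirement (create claims, then wait for them) could be sketched as below. This is only an illustration under assumptions: the types are simplified local stand-ins, the helpers `claimName`, `buildClaims` and `allFulfilled` are hypothetical, and the `<machine>-<index>` naming is just one possible deterministic scheme.

```go
package main

import "fmt"

// LocalObjectReference mirrors the reference type proposed in this document.
type LocalObjectReference struct {
	Group string
	Kind  string
	Name  string
}

// IPAddressClaim is a simplified stand-in for the proposed resource.
type IPAddressClaim struct {
	Name    string
	Pool    LocalObjectReference // corresponds to spec.pool
	Address LocalObjectReference // corresponds to status.address, empty until fulfilled
}

// claimName derives a deterministic claim name from the machine name and a 1-based device index.
func claimName(machine string, device int) string {
	return fmt.Sprintf("%s-%d", machine, device)
}

// buildClaims returns one claim per network device that references a pool.
func buildClaims(machine string, pools []LocalObjectReference) []IPAddressClaim {
	claims := make([]IPAddressClaim, 0, len(pools))
	for i, pool := range pools {
		claims = append(claims, IPAddressClaim{Name: claimName(machine, i+1), Pool: pool})
	}
	return claims
}

// allFulfilled reports whether every claim has an address reference in its status.
func allFulfilled(claims []IPAddressClaim) bool {
	for _, c := range claims {
		if c.Address.Name == "" {
			return false
		}
	}
	return true
}

func main() {
	pool := LocalObjectReference{Group: "ipam.cluster.x-k8s.io", Kind: "InClusterIPPool", Name: "some-pool"}
	claims := buildClaims("example-1", []LocalObjectReference{pool})
	fmt.Println(claims[0].Name, allFulfilled(claims))

	// Simulate the IPAM provider fulfilling the claim by setting the address reference.
	claims[0].Address = LocalObjectReference{Group: "cluster.x-k8s.io", Kind: "IPAddress", Name: claims[0].Name}
	fmt.Println(claims[0].Name, allFulfilled(claims))
}
```

A real infrastructure provider would create the claims through the Kubernetes API and requeue its reconciliation until the check passes, rather than polling in-memory values.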

#### Example

Using CAPV as an example, a template *could* look like this:

```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: VSphereMachineTemplate
metadata:
  name: example
  namespace: vsphere-site1
spec:
  template:
    spec:
      cloneMode: FullClone
      numCPUs: 8
      memoryMiB: 8192
      diskGiB: 45
      network:
        devices:
          - dhcp4: false
            fromPool: # reference to the pool
              group: ipam.cluster.x-k8s.io/v1alpha1
              kind: IPPool
              name: testpool
```

A Machine generated from that template could then look like this:

```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: VSphereMachine
metadata:
  name: example-1
  namespace: vsphere-site1
spec:
  cloneMode: FullClone
  numCPUs: 8
  memoryMiB: 8192
  diskGiB: 45
  network:
    devices:
      - dhcp4: false
        fromPool: # reference to the pool
          group: ipam.cluster.x-k8s.io/v1alpha1
          kind: IPPool
          name: testpool
status:
  network:
    devices:
      - claim:
          name: example-1-1
```

### Implementation Details/Notes/Constraints

#### New API Types

The following API Types should be added to cluster-api. In the first iteration they will be added to the experimental API and will therefore live in the `cluster.x-k8s.io` group.

##### IPAddressClaim

The `IPAddressClaim` is used to request IP addresses from a pool. It gets reconciled by IPAM providers, which can filter based on the `Spec.Pool` reference.

After an `IPAddressClaim` is created, the `Spec.Pool` reference, and therefore essentially the entire `Spec`, is immutable. Otherwise handing over a claim between providers would need to be supported, and that is an unlikely and complex use-case.

```go
// IPAddressClaimSpec describes the desired state of an IPAddressClaim
type IPAddressClaimSpec struct {
	// Pool is a reference to the pool from which an IP address should be allocated.
	Pool LocalObjectReference `json:"pool,omitempty"`
}

// IPAddressClaimStatus contains the status of an IPAddressClaim
type IPAddressClaimStatus struct {
	// Address is a reference to the address that was allocated for this claim.
	Address LocalObjectReference `json:"address,omitempty"`

	// Conditions provide details about the status of the claim.
	// +optional
	Conditions clusterv1.Conditions `json:"conditions,omitempty"`
}

// IPAddressClaim can be used to allocate IPAddresses from an IP Pool.
type IPAddressClaim struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   IPAddressClaimSpec   `json:"spec,omitempty"`
	Status IPAddressClaimStatus `json:"status,omitempty"`
}
```

##### IPAddress

`IPAddress` resources are created to fulfil an `IPAddressClaim`. They are created by the IPAM provider that reconciles the claim.

The `Spec` of `IPAddresses` is immutable.

```go
// IPAddressSpec describes an IPAddress
type IPAddressSpec struct {
	// Claim is a reference to the claim this IPAddress was created for.
	Claim LocalObjectReference `json:"claim,omitempty"`

	// Pool is a reference to the pool that this IPAddress was created from.
	Pool LocalObjectReference `json:"pool,omitempty"`

	// Address is the IP address.
	Address string `json:"address"`

	// Prefix is the prefix of the address.
	Prefix int `json:"prefix,omitempty"`

	// Gateway is the network gateway of the network the address is from.
	Gateway string `json:"gateway,omitempty"`
}

// IPAddress is a representation of an IP Address that was allocated from an IP Pool.
type IPAddress struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec IPAddressSpec `json:"spec,omitempty"`
}
```

##### New Reference Type

```go
// LocalObjectReference references an object in the same namespace by group, kind and name.
type LocalObjectReference struct {
	// Group is the API group of the referenced object.
	Group string
	// Kind is the kind of the referenced object.
	Kind string
	// Name is the name of the referenced object.
	Name string
}
```

#### Implementing an IPAM Provider

IPAM providers have to provide an IP Pool resource. It serves two purposes: selecting which provider to use, and configuring that provider. How configuration works exactly is up to the provider. The resource can allow selecting a specific network to allocate addresses from, where different clusters might use different pools. Alternatively, there could be only a single object shared by all clusters, with the provider deciding on its own which network to use.

The primary task of the IPAM provider is fulfilling `IPAddressClaims`, usually issued by infrastructure providers. The provider therefore has to reconcile `IPAddressClaims` that are referencing one of its pools. It has to allocate IPs and create `IPAddress` objects for each claim.

A claim is fulfilled once an `IPAddress` object is created, and the status of the claim is updated with a reference to that `IPAddress` object. During the allocation process, the `.Status.Conditions` array of the `IPAddressClaim` should be used to reflect the status of the allocation process. Errors in particular have to be visible.

The IPAM provider needs to ensure that address allocation is correct. Most importantly, it must not assign the same IP address more than once within a pool. There is no validation of uniqueness on `IPAddress` objects.

When an `IPAddressClaim` that is reconciled by a provider is deleted, the provider needs to release the IP address belonging to the claim and delete the related `IPAddress` object. To do so, a finalizer should be added to the claim to prevent its deletion until the address is released.
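
The uniqueness and release requirements above can be illustrated with a small in-memory allocator. This is a sketch under stated assumptions, not how any particular provider works: the `pool` type and its `allocate` and `release` methods are hypothetical, state is kept in memory only, and the pool is modelled as a simple inclusive start/end range.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
)

// pool is a hypothetical in-memory pool covering the inclusive range [start, end].
type pool struct {
	start, end netip.Addr
	byClaim    map[string]netip.Addr // claim name -> allocated address
	inUse      map[netip.Addr]bool   // addresses currently handed out
}

func newPool(start, end string) *pool {
	return &pool{
		start:   netip.MustParseAddr(start),
		end:     netip.MustParseAddr(end),
		byClaim: map[string]netip.Addr{},
		inUse:   map[netip.Addr]bool{},
	}
}

// allocate returns the address for a claim and hands out each address at most once.
// Calling it again for the same claim returns the same address, so repeated
// reconciliations of one claim do not consume additional addresses.
func (p *pool) allocate(claim string) (netip.Addr, error) {
	if addr, ok := p.byClaim[claim]; ok {
		return addr, nil
	}
	for addr := p.start; addr.IsValid() && addr.Compare(p.end) <= 0; addr = addr.Next() {
		if !p.inUse[addr] {
			p.inUse[addr] = true
			p.byClaim[claim] = addr
			return addr, nil
		}
	}
	return netip.Addr{}, errors.New("pool exhausted")
}

// release frees the address held by a claim. A provider would do this before
// removing its finalizer from a claim that is being deleted.
func (p *pool) release(claim string) {
	if addr, ok := p.byClaim[claim]; ok {
		delete(p.inUse, addr)
		delete(p.byClaim, claim)
	}
}

func main() {
	p := newPool("10.10.10.100", "10.10.10.101")
	a, _ := p.allocate("example-1-1")
	b, _ := p.allocate("example-2-1")
	_, err := p.allocate("example-3-1") // fails: both addresses are taken
	fmt.Println(a, b, err)

	p.release("example-1-1") // the first machine was deleted
	c, _ := p.allocate("example-3-1")
	fmt.Println(c)
}
```

A provider backed by an external IPAM system would delegate both operations to that system, but the ordering stays the same: release the address first, then remove the finalizer from the claim.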

`IPAddress` objects (as well as claims) are immutable. For each `IPAddressClaim`, one corresponding `IPAddress` object gets created. As long as the claim exists, that `IPAddress` object must not change. When an `IPAddress` object is deleted while its claim still exists, there are two options:

- The provider can re-create an `IPAddress` object with the same values as the previous object.
- If the provider is unable to reconstruct the object, it can add a finalizer to the `IPAddress` to block its deletion indefinitely. It then needs to ensure that the finalizer of the `IPAddress` is removed **before** removing the finalizer of the `IPAddressClaim` to avoid orphaned `IPAddress` objects. This is similar to the [In Use Protection](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#storage-object-in-use-protection) on PersistentVolumes.

#### Consuming as an Infrastructure Provider

In order to consume IP addresses from an IPAM provider, the infrastructure provider needs to know from which pool to allocate addresses. It is therefore necessary to allow referencing a pool in the Infrastructure Machine templates. The Pool must be in the same namespace as the cluster, which should be enforced by not providing a namespace field on the reference.

To consume IP addresses from a pool, the infrastructure machine controller then has to create an `IPAddressClaim` for each address it needs. The claims should use a deterministic naming scheme that is derived from the name of the Infrastructure Machine they are created for, so it is easy to identify them, e.g. `<infra machine name>-<interface name>-<address index>`.

After the claim is created, the infrastructure provider needs to watch it until its status contains a reference to the created `IPAddress`. The IP address obtained through a claim will not change. Both the `IPAddressClaim` and the `IPAddress` are immutable.
If a new address is required, the claim needs to be deleted and recreated. When a machine is deleted, the machine controller needs to delete the `IPAddressClaim` to release the address.

It is the responsibility of the Infrastructure Provider to ensure the `IPAddressClaim` is not deleted as long as the IP address is in use. It should block the deletion of the `IPAddressClaim` using a finalizer until the Infrastructure Machine is destroyed. It has to make sure the finalizer is removed before the Machine is deleted.

The `IPAddressClaim` must contain an owner reference to the Infrastructure Machine it was created for.

The Infrastructure Machine has to reflect IP address allocation status in its conditions, and this has to be incorporated into its `Ready` condition. This is likely to be the case anyway, as the IP address will probably be required to provision the machine.

The used IP addresses must be reflected in `.Status.Addresses` of a Machine.

See the following picture for a sequence diagram of the allocation process.



#### Additional Notes

- An example provider that provides in-cluster IP Address management (also useful for testing, or when no other IPAM solution is available) should be implemented as a separate project in a separate repository.
- A shared library for providers and consumers should be implemented as part of that in-cluster IP Address provider.
  - helpers to create IPAddresses
  - a predicate to filter IPAddressClaims
  - ... anything else that can be shared between providers and consumers

### Security Model

- `IPAddressClaim`, `IPAddress` and Pools are required to be in the same namespace, and also in the same namespace as the Cluster.
  - For `IPAddressClaim` and `IPAddress` this is enforced by the lack of a namespace field on the reference.
  - Infrastructure Providers need to ensure that only Pools in the same namespace can be referenced, ideally by excluding namespace from the reference.

### Risks and Mitigations

- Unresponsive IPAM providers can prevent successful creation of a cluster
- Wrong IP address allocations (e.g. duplicates) can brick a cluster’s network

## Alternatives

As an alternative to an official API contract, all interested providers could agree on an external contract, or an existing one like metal3-io/ip-address-manager. This would create a dependency on that project, which isn’t owned by the Kubernetes community (although an offer has been made to donate that project to the community).

## Implementation History

- [x] 07/01/2021: Compile a Google Doc following the CAEP template (link here)
- [x] 08/18/2021: Present proposal at CAPI office hours
- [x] 01/26/2022: Open proposal PR