# KEP-1136: ProvisioningRequest support

<!-- toc -->
- [Summary](#summary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
- [Proposal](#proposal)
  - [User Stories (Optional)](#user-stories-optional)
    - [Story 1](#story-1)
    - [Story 2](#story-2)
  - [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
  - [Test Plan](#test-plan)
      - [Prerequisite testing updates](#prerequisite-testing-updates)
    - [Unit Tests](#unit-tests)
    - [Integration tests](#integration-tests)
  - [Graduation Criteria](#graduation-criteria)
- [Implementation History](#implementation-history)
- [Alternatives](#alternatives)
<!-- /toc -->

## Summary

Introduce an [AdmissionCheck](https://github.com/kubernetes-sigs/kueue/tree/main/keps/993-two-phase-admission)
that will use [`ProvisioningRequest`](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/provisioning-request.md)
to ensure that there is enough capacity in the cluster before
admitting a workload.

## Motivation

Currently Kueue admits workloads based on the quota check alone.
This works reasonably well in most cases, but doesn't guarantee
that an admitted workload will actually schedule
in full in the cluster. With `ProvisioningRequest`, the SIG-Autoscaling owned
[ClusterAutoscaler](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler)
opens a way for stronger (but still not hard-guaranteed) all-or-nothing
scheduling in an autoscaled cloud environment.

Before admission, CA will check whether there are enough resources and
provision more if there are not (details
depend on the exact engine used with `ProvisioningRequest`).

### Goals

* Provide Kueue integration with `ProvisioningRequest`.
* Define how users can configure what Kueue puts into `ProvisioningRequest`.

### Non-Goals

* Define how Cluster Autoscaler handles `ProvisioningRequest`.
* Define underlying cloud-specific behavior.

## Proposal

* Introduce a new controller in Kueue that will act as an AdmissionCheck based on
  the status of the created `ProvisioningRequest`.

* Introduce a new cluster-scoped CRD to configure how `ProvisioningRequest` should be used.

### User Stories (Optional)

#### Story 1

I want to admit workloads only after ClusterAutoscaler running on my cloud provider
expands a dedicated node group on which the workload will run.

#### Story 2

I want to admit workloads only after a CheckCapacity request to ClusterAutoscaler
succeeds.

### Risks and Mitigations

No significant risks have been identified. The
[two-phase admission process](https://github.com/kubernetes-sigs/kueue/tree/main/keps/993-two-phase-admission)
was added specifically for use cases like this.

## Design Details

The new ProvisioningRequest controller will:

* Watch for all workloads that require an `AdmissionCheck` with controller
  name set to `"kueue.x-k8s.io/provisioning-request"`. For that it will also need
  to watch all `AdmissionCheck` definitions to understand whether a particular
  check is in fact a `ProvisioningRequest` check.

* For each such workload, create a `ProvisioningRequest` (and accompanying
  PodTemplates) requesting capacity for the podsets of interest from the workload.
  A podset is considered "of interest" if it requires at least one resource listed
  in the `ProvisioningRequestConfig` `managedResources` field or `managedResources`
  is empty. If the workload has no podsets of interest it is considered `Ready`.
  The `ProvisioningRequest` should have the owner reference set to the workload.
  To know what details to put into the `ProvisioningRequest`, the controller
  will also need to watch `ProvisioningRequestConfigs`.

* Watch all changes CA makes to `ProvisioningRequests`. If the `Provisioned`
  or `CapacityAvailable` condition is set to `True`, finish the `AdmissionCheck`
  with success and propagate the `ProvisioningRequest` name to the
  workload pods ([KEP #1145](https://github.com/kubernetes-sigs/kueue/blob/main/keps/1145-additional-labels/kep.yaml)) under `"cluster-autoscaler.kubernetes.io/consume-provisioning-request"`.
  If the `ProvisioningRequest` fails, fail the `AdmissionCheck`.

* Watch the admission of the workload: if it is suspended again or finished,
  the provisioning request should also be deleted (the latter can be achieved via
  the OwnerReference).

* Retry ProvisioningRequests according to the `RetryConfig` configuration in
  the `ProvisioningRequestConfig`. For each attempt a new provisioning request is
  created with a suffix indicating the attempt number. The corresponding admission
  check will remain in the `Pending` state until the retries end. The maximum number
  of retries is 3, and the interval between attempts grows exponentially, starting
  from 1 min (1, 2, 4 min).

The definition of `ProvisioningRequestConfig` is relatively simple and is based on
what can be set in `ProvisioningRequest`.

```go
// ProvisioningRequestConfig is the Schema for the provisioningrequestconfig API
type ProvisioningRequestConfig struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec ProvisioningRequestConfigSpec `json:"spec,omitempty"`
}

type ProvisioningRequestConfigSpec struct {
	// ProvisioningClassName describes the different modes of provisioning the resources.
	// Check autoscaling.x-k8s.io ProvisioningRequestSpec.ProvisioningClassName for details.
	//
	// +kubebuilder:validation:Required
	// +kubebuilder:validation:Pattern=`^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$`
	// +kubebuilder:validation:MaxLength=253
	ProvisioningClassName string `json:"provisioningClassName"`

	// Parameters contains all other parameters classes may require.
	//
	// +optional
	// +kubebuilder:validation:MaxProperties=100
	Parameters map[string]Parameter `json:"parameters,omitempty"`

	// managedResources contains the list of resources managed by the autoscaling.
	//
	// If empty, all resources are considered managed.
	//
	// If not empty, the ProvisioningRequest will contain only the podsets that are
	// requesting at least one of them.
	//
	// If none of the workload's podsets requests at least one managed resource,
	// the workload is considered ready.
	//
	// +optional
	// +listType=set
	// +kubebuilder:validation:MaxItems=100
	ManagedResources []corev1.ResourceName `json:"managedResources,omitempty"`
}
```

`AdmissionCheck` will point to this configuration:

```yaml
kind: AdmissionCheck
metadata:
  name: "SuperProvider"
spec:
  controllerName: "kueue.x-k8s.io/provisioning-request"
  parameters:
    apiGroup: "kueue.x-k8s.io"
    kind: "ProvisioningRequestConfig"
    name: "SuperProviderConfig"
---
kind: ProvisioningRequestConfig
metadata:
  name: "SuperProviderConfig"
spec:
  provisioningClassName: "SuperSpot"
  parameters:
    "Priority": "TopTier"
  managedResources:
  - cpu
```

### Test Plan

[x] I/we understand the owners of the involved components may require updates to
existing tests to make this code solid enough prior to committing the changes necessary
to implement this enhancement.

##### Prerequisite testing updates

None.

#### Unit Tests

Regular unit tests covering the new controller should suffice.
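
Two rules from Design Details lend themselves to isolated unit tests: the "podset of interest" selection and the retry schedule (at most 3 retries, waiting 1, 2, then 4 minutes). The sketch below is only an illustration of those rules, not the controller's code. The `PodSet` type, the function names `podSetsOfInterest` and `retryDelay`, the use of plain strings for resource names, and the `example.com/gpu` resource are all hypothetical simplifications introduced here.

```go
package main

import (
	"fmt"
	"time"
)

// PodSet is a hypothetical, simplified stand-in for a workload podset:
// a name plus the names of the resources it requests.
type PodSet struct {
	Name      string
	Resources []string
}

// podSetsOfInterest returns the podsets that request at least one managed
// resource. An empty managedResources list means all resources are managed,
// so every podset is of interest.
func podSetsOfInterest(podSets []PodSet, managedResources []string) []PodSet {
	if len(managedResources) == 0 {
		return podSets
	}
	managed := make(map[string]struct{}, len(managedResources))
	for _, r := range managedResources {
		managed[r] = struct{}{}
	}
	var selected []PodSet
	for _, ps := range podSets {
		for _, r := range ps.Resources {
			if _, ok := managed[r]; ok {
				selected = append(selected, ps)
				break
			}
		}
	}
	return selected
}

const (
	maxRetries = 3           // maximum number of retries stated in this KEP
	baseDelay  = time.Minute // first retry interval stated in this KEP
)

// retryDelay returns the wait before the given retry (1-based), doubling
// each time: 1, 2, 4 minutes. The bool is false once retries are exhausted.
func retryDelay(retry int) (time.Duration, bool) {
	if retry < 1 || retry > maxRetries {
		return 0, false
	}
	return baseDelay << (retry - 1), true
}

func main() {
	podSets := []PodSet{
		{Name: "driver", Resources: []string{"cpu", "memory"}},
		{Name: "workers", Resources: []string{"cpu", "example.com/gpu"}},
	}
	// Only podsets requesting the managed resource go into the request.
	for _, ps := range podSetsOfInterest(podSets, []string{"example.com/gpu"}) {
		fmt.Println("of interest:", ps.Name)
	}
	for retry := 1; ; retry++ {
		delay, ok := retryDelay(retry)
		if !ok {
			break
		}
		fmt.Println("retry", retry, "after", delay)
	}
}
```

When `podSetsOfInterest` returns an empty slice, the rule above says the check is considered `Ready` and no `ProvisioningRequest` is needed. Keeping both rules as pure functions means they can be tested without a running cluster.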

#### Integration tests

Integration tests should be done without an actual Cluster Autoscaler running
(but with integration tests flipping the `ProvisioningRequest` state)
to cover possible error scenarios.

The tests should start with a job going to a queue with a `kueue.x-k8s.io/provisioning-request` based `AdmissionCheck`.
The appropriate `ProvisioningRequest` should be created, with the right `ProvisioningClassName` set (taken from `ProvisioningRequestConfig`).
The following scenarios should be tested:

* `ProvisioningRequest` is completed successfully. Then:
  * Workload runs to successful completion.
  * Workload is preempted and goes back to suspended.
  * Workload is deleted.
* `ProvisioningRequest` fails.
* Workload is deleted.
* Workload is suspended.
* Queue definition changes and doesn't require any `AdmissionChecks` anymore.
* `ProvisioningRequestConfig` changes.
* `ProvisioningRequestConfig` is removed.

### Graduation Criteria

User feedback is positive.

## Implementation History

2023-09-21: KEP

## Alternatives

Do not integrate with `ProvisioningRequest`.