---
title: Kueue Configuration API
content_type: tool-reference
package: /v1beta1
auto_generated: true
description: Generated API reference documentation for Kueue Configuration.
---

## Resource Types

- [Configuration](#Configuration)

## `ClientConnection` {#ClientConnection}

**Appears in:**

- [Configuration](#Configuration)

<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>

<tr><td><code>qps</code> <B>[Required]</B><br/>
<code>float32</code>
</td>
<td>
<p>QPS controls the number of queries per second allowed for the Kubernetes API server
connection.</p>
</td>
</tr>
<tr><td><code>burst</code> <B>[Required]</B><br/>
<code>int32</code>
</td>
<td>
<p>Burst allows extra queries to accumulate when a client is exceeding its rate.</p>
</td>
</tr>
</tbody>
</table>

## `ClusterQueueVisibility` {#ClusterQueueVisibility}

**Appears in:**

- [QueueVisibility](#QueueVisibility)

<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>

<tr><td><code>maxCount</code> <B>[Required]</B><br/>
<code>int32</code>
</td>
<td>
<p>MaxCount indicates the maximal number of pending workloads exposed in the
cluster queue status. When the value is set to 0, then ClusterQueue
visibility updates are disabled.
The maximal value is 4000.
Defaults to 10.</p>
</td>
</tr>
</tbody>
</table>

## `Configuration` {#Configuration}

<p>Configuration is the Schema for the kueueconfigurations API</p>

<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>

<tr><td><code>namespace</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>Namespace is the namespace in which kueue is deployed. It is used as part of DNSName of the webhook Service.
If not set, the value is set from the file /var/run/secrets/kubernetes.io/serviceaccount/namespace.
If the file doesn't exist, the default value is kueue-system.</p>
</td>
</tr>
<tr><td><code>ControllerManager</code> <B>[Required]</B><br/>
<a href="#ControllerManager"><code>ControllerManager</code></a>
</td>
<td>(Members of <code>ControllerManager</code> are embedded into this type.)
<p>ControllerManager returns the configurations for controllers</p>
</td>
</tr>
<tr><td><code>manageJobsWithoutQueueName</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>ManageJobsWithoutQueueName controls whether or not Kueue reconciles
batch/v1.Jobs that don't set the annotation kueue.x-k8s.io/queue-name.
If set to true, then those jobs will be suspended and never started unless
they are assigned a queue and eventually admitted. This also applies to
jobs created before starting the kueue controller.
Defaults to false; therefore, those jobs are not managed and if they are created
unsuspended, they will start immediately.</p>
</td>
</tr>
<tr><td><code>internalCertManagement</code> <B>[Required]</B><br/>
<a href="#InternalCertManagement"><code>InternalCertManagement</code></a>
</td>
<td>
<p>InternalCertManagement is configuration for internalCertManagement</p>
</td>
</tr>
<tr><td><code>waitForPodsReady</code> <B>[Required]</B><br/>
<a href="#WaitForPodsReady"><code>WaitForPodsReady</code></a>
</td>
<td>
<p>WaitForPodsReady is configuration to provide simple all-or-nothing
scheduling semantics for jobs to ensure they get resources assigned.
This is achieved by blocking the start of new jobs until the previously
started job has all pods running (ready).</p>
</td>
</tr>
<tr><td><code>clientConnection</code> <B>[Required]</B><br/>
<a href="#ClientConnection"><code>ClientConnection</code></a>
</td>
<td>
<p>ClientConnection provides additional configuration options for Kubernetes
API server client.</p>
</td>
</tr>
<tr><td><code>integrations</code> <B>[Required]</B><br/>
<a href="#Integrations"><code>Integrations</code></a>
</td>
<td>
<p>Integrations provide configuration options for AI/ML/Batch frameworks
integrations (including K8S job).</p>
</td>
</tr>
<tr><td><code>queueVisibility</code> <B>[Required]</B><br/>
<a href="#QueueVisibility"><code>QueueVisibility</code></a>
</td>
<td>
<p>QueueVisibility is configuration to expose the information about the top
pending workloads.</p>
</td>
</tr>
<tr><td><code>multiKueue</code> <B>[Required]</B><br/>
<a href="#MultiKueue"><code>MultiKueue</code></a>
</td>
<td>
<p>MultiKueue controls the behaviour of the MultiKueue AdmissionCheck Controller.</p>
</td>
</tr>
</tbody>
</table>

## `ControllerConfigurationSpec` {#ControllerConfigurationSpec}

**Appears in:**

- [ControllerManager](#ControllerManager)

<p>ControllerConfigurationSpec defines the global configuration for
controllers registered with the manager.</p>

<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>

<tr><td><code>groupKindConcurrency</code><br/>
<code>map[string]int</code>
</td>
<td>
<p>GroupKindConcurrency is a map from a Kind to the number of concurrent reconciliations
allowed for that controller.</p>
<p>When a controller is registered within this manager using the builder utilities,
users have to specify the type the controller reconciles in the For(...) call.
If the object's kind passed matches one of the keys in this map, the concurrency
for that controller is set to the number specified.</p>
<p>The key is expected to be consistent in form with GroupKind.String(),
e.g. ReplicaSet in apps group (regardless of version) would be <code>ReplicaSet.apps</code>.</p>
</td>
</tr>
<tr><td><code>cacheSyncTimeout</code><br/>
<a href="https://pkg.go.dev/time#Duration"><code>time.Duration</code></a>
</td>
<td>
<p>CacheSyncTimeout refers to the time limit set to wait for syncing caches.
Defaults to 2 minutes if not set.</p>
</td>
</tr>
</tbody>
</table>

## `ControllerHealth` {#ControllerHealth}

**Appears in:**

- [ControllerManager](#ControllerManager)

<p>ControllerHealth defines the health configs.</p>

<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>

<tr><td><code>healthProbeBindAddress</code><br/>
<code>string</code>
</td>
<td>
<p>HealthProbeBindAddress is the TCP address that the controller should bind to
for serving health probes.
It can be set to "0" or "" to disable serving the health probe.</p>
</td>
</tr>
<tr><td><code>readinessEndpointName</code><br/>
<code>string</code>
</td>
<td>
<p>ReadinessEndpointName, defaults to "readyz"</p>
</td>
</tr>
<tr><td><code>livenessEndpointName</code><br/>
<code>string</code>
</td>
<td>
<p>LivenessEndpointName, defaults to "healthz"</p>
</td>
</tr>
</tbody>
</table>

## `ControllerManager` {#ControllerManager}

**Appears in:**

- [Configuration](#Configuration)

<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>

<tr><td><code>webhook</code><br/>
<a href="#ControllerWebhook"><code>ControllerWebhook</code></a>
</td>
<td>
<p>Webhook contains the controller's webhook configuration</p>
</td>
</tr>
<tr><td><code>leaderElection</code><br/>
<a href="https://pkg.go.dev/k8s.io/component-base/config/v1alpha1#LeaderElectionConfiguration"><code>k8s.io/component-base/config/v1alpha1.LeaderElectionConfiguration</code></a>
</td>
<td>
<p>LeaderElection is the LeaderElection config to be used when configuring
the manager.Manager leader election</p>
</td>
</tr>
<tr><td><code>metrics</code><br/>
<a href="#ControllerMetrics"><code>ControllerMetrics</code></a>
</td>
<td>
<p>Metrics contains the controller metrics configuration</p>
</td>
</tr>
<tr><td><code>health</code><br/>
<a href="#ControllerHealth"><code>ControllerHealth</code></a>
</td>
<td>
<p>Health contains the controller health configuration</p>
</td>
</tr>
<tr><td><code>pprofBindAddress</code><br/>
<code>string</code>
</td>
<td>
<p>PprofBindAddress is the TCP address that the controller should bind to
for serving pprof.
It can be set to "" or "0" to disable the pprof serving.
Since pprof may contain sensitive information, make sure to protect it
before exposing it to the public.</p>
</td>
</tr>
<tr><td><code>controller</code><br/>
<a href="#ControllerConfigurationSpec"><code>ControllerConfigurationSpec</code></a>
</td>
<td>
<p>Controller contains global configuration options for controllers
registered within this manager.</p>
</td>
</tr>
</tbody>
</table>

## `ControllerMetrics` {#ControllerMetrics}

**Appears in:**

- [ControllerManager](#ControllerManager)

<p>ControllerMetrics defines the metrics configs.</p>

<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>

<tr><td><code>bindAddress</code><br/>
<code>string</code>
</td>
<td>
<p>BindAddress is the TCP address that the controller should bind to
for serving Prometheus metrics.
It can be set to "0" to disable the metrics serving.</p>
</td>
</tr>
<tr><td><code>enableClusterQueueResources</code><br/>
<code>bool</code>
</td>
<td>
<p>EnableClusterQueueResources, if true the cluster queue resource usage and quotas
metrics will be reported.</p>
</td>
</tr>
</tbody>
</table>

## `ControllerWebhook` {#ControllerWebhook}

**Appears in:**

- [ControllerManager](#ControllerManager)

<p>ControllerWebhook defines the webhook server for the controller.</p>

<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>

<tr><td><code>port</code><br/>
<code>int</code>
</td>
<td>
<p>Port is the port that the webhook server serves at.
It is used to set webhook.Server.Port.</p>
</td>
</tr>
<tr><td><code>host</code><br/>
<code>string</code>
</td>
<td>
<p>Host is the hostname that the webhook server binds to.
It is used to set webhook.Server.Host.</p>
</td>
</tr>
<tr><td><code>certDir</code><br/>
<code>string</code>
</td>
<td>
<p>CertDir is the directory that contains the server key and certificate.
If not set, the webhook server looks up the server key and certificate in
{TempDir}/k8s-webhook-server/serving-certs. The server key and certificate
must be named tls.key and tls.crt, respectively.</p>
</td>
</tr>
</tbody>
</table>

## `Integrations` {#Integrations}

**Appears in:**

- [Configuration](#Configuration)

<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>

<tr><td><code>frameworks</code> <B>[Required]</B><br/>
<code>[]string</code>
</td>
<td>
<p>List of framework names to be enabled.
Possible options:</p>
<ul>
<li>"batch/job"</li>
<li>"kubeflow.org/mpijob"</li>
<li>"ray.io/rayjob"</li>
<li>"ray.io/raycluster"</li>
<li>"jobset.x-k8s.io/jobset"</li>
<li>"kubeflow.org/mxjob"</li>
<li>"kubeflow.org/paddlejob"</li>
<li>"kubeflow.org/pytorchjob"</li>
<li>"kubeflow.org/tfjob"</li>
<li>"kubeflow.org/xgboostjob"</li>
<li>"pod"</li>
</ul>
</td>
</tr>
<tr><td><code>podOptions</code> <B>[Required]</B><br/>
<a href="#PodIntegrationOptions"><code>PodIntegrationOptions</code></a>
</td>
<td>
<p>PodOptions defines kueue controller behaviour for pod objects</p>
</td>
</tr>
</tbody>
</table>

## `InternalCertManagement` {#InternalCertManagement}

**Appears in:**

- [Configuration](#Configuration)

<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>

<tr><td><code>enable</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>Enable controls whether to enable internal cert management or not.
Defaults to true. If you want to use a third-party management, e.g. cert-manager,
set it to false. See the user guide for more information.</p>
</td>
</tr>
<tr><td><code>webhookServiceName</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>WebhookServiceName is the name of the Service used as part of the DNSName.
Defaults to kueue-webhook-service.</p>
</td>
</tr>
<tr><td><code>webhookSecretName</code> <B>[Required]</B><br/>
<code>string</code>
</td>
<td>
<p>WebhookSecretName is the name of the Secret used to store CA and server certs.
Defaults to kueue-webhook-server-cert.</p>
</td>
</tr>
</tbody>
</table>

## `MultiKueue` {#MultiKueue}

**Appears in:**

- [Configuration](#Configuration)

<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>

<tr><td><code>gcInterval</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#duration-v1-meta"><code>k8s.io/apimachinery/pkg/apis/meta/v1.Duration</code></a>
</td>
<td>
<p>GCInterval defines the time interval between two consecutive garbage collection runs.
Defaults to 1min. If 0, the garbage collection is disabled.</p>
</td>
</tr>
<tr><td><code>origin</code><br/>
<code>string</code>
</td>
<td>
<p>Origin defines a label value used to track the creator of workloads in the worker
clusters.
This is used by multikueue in components like its garbage collector to identify
remote objects that were created by this multikueue manager cluster and delete
them if their local counterpart no longer exists.</p>
</td>
</tr>
</tbody>
</table>

## `PodIntegrationOptions` {#PodIntegrationOptions}

**Appears in:**

- [Integrations](#Integrations)

<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>

<tr><td><code>namespaceSelector</code> <B>[Required]</B><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#labelselector-v1-meta"><code>k8s.io/apimachinery/pkg/apis/meta/v1.LabelSelector</code></a>
</td>
<td>
<p>NamespaceSelector can be used to omit some namespaces from pod reconciliation</p>
</td>
</tr>
<tr><td><code>podSelector</code> <B>[Required]</B><br/>
<a
href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#labelselector-v1-meta"><code>k8s.io/apimachinery/pkg/apis/meta/v1.LabelSelector</code></a>
</td>
<td>
<p>PodSelector can be used to choose what pods to reconcile</p>
</td>
</tr>
</tbody>
</table>

## `QueueVisibility` {#QueueVisibility}

**Appears in:**

- [Configuration](#Configuration)

<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>

<tr><td><code>clusterQueues</code> <B>[Required]</B><br/>
<a href="#ClusterQueueVisibility"><code>ClusterQueueVisibility</code></a>
</td>
<td>
<p>ClusterQueues is configuration to expose the information
about the top pending workloads in the cluster queue.</p>
</td>
</tr>
<tr><td><code>updateIntervalSeconds</code> <B>[Required]</B><br/>
<code>int32</code>
</td>
<td>
<p>UpdateIntervalSeconds specifies the time interval for updates to the structure
of the top pending workloads in the queues.
The minimum value is 1.
Defaults to 5.</p>
</td>
</tr>
</tbody>
</table>

## `RequeuingStrategy` {#RequeuingStrategy}

**Appears in:**

- [WaitForPodsReady](#WaitForPodsReady)

<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>

<tr><td><code>timestamp</code><br/>
<a href="#RequeuingTimestamp"><code>RequeuingTimestamp</code></a>
</td>
<td>
<p>Timestamp defines the timestamp used for re-queuing a Workload
that was evicted due to Pod readiness.
The possible values are:</p>
<ul>
<li><code>Eviction</code> (default) indicates from Workload <code>Evicted</code> condition with <code>PodsReadyTimeout</code> reason.</li>
<li><code>Creation</code> indicates from Workload .metadata.creationTimestamp.</li>
</ul>
</td>
</tr>
<tr><td><code>backoffLimitCount</code><br/>
<code>int32</code>
</td>
<td>
<p>BackoffLimitCount defines the maximum number of re-queuing retries.
Once the number is reached, the workload is deactivated (<code>.spec.active</code>=<code>false</code>).
When it is null, the workload is re-queued repeatedly and endlessly.</p>
<p>Each backoff duration is approximately "1.41284738^(n-1)+Rand" seconds, where "n" is the "workloadStatus.requeueState.count"
and "Rand" is the random jitter. During this time, the workload is treated as inadmissible and
other workloads have a chance to be admitted.
For example, when the "waitForPodsReady.timeout" is the default, the workload deactivation time is as follows:
{backoffLimitCount, workloadDeactivationSeconds}
~= {1, 601}, {2, 902}, ..., {5, 1811}, ..., {10, 3374}, ..., {20, 8730}, ..., {30, 86400 (= 24 hours)}, ...</p>
<p>Defaults to null.</p>
</td>
</tr>
</tbody>
</table>

## `RequeuingTimestamp` {#RequeuingTimestamp}

(Alias of `string`)

**Appears in:**

- [RequeuingStrategy](#RequeuingStrategy)

## `WaitForPodsReady` {#WaitForPodsReady}

**Appears in:**

- [Configuration](#Configuration)

<table class="table">
<thead><tr><th width="30%">Field</th><th>Description</th></tr></thead>
<tbody>

<tr><td><code>enable</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>Enable when true, indicates that each admitted workload
blocks the admission of all other workloads from all queues until it is in the
<code>PodsReady</code> condition.
If false, all workloads start as soon as they are
admitted and do not block admission of other workloads. The PodsReady
condition is only added if this setting is enabled. It defaults to false.</p>
</td>
</tr>
<tr><td><code>timeout</code><br/>
<a href="https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#duration-v1-meta"><code>k8s.io/apimachinery/pkg/apis/meta/v1.Duration</code></a>
</td>
<td>
<p>Timeout defines the time for an admitted workload to reach the
PodsReady=true condition. When the timeout is reached, the workload's admission
is cancelled and the workload is requeued in the same cluster queue. Defaults to 5min.</p>
</td>
</tr>
<tr><td><code>blockAdmission</code> <B>[Required]</B><br/>
<code>bool</code>
</td>
<td>
<p>BlockAdmission when true, cluster queue will block admissions for all subsequent jobs
until the jobs reach the PodsReady=true condition. It defaults to false if Enable is false
and defaults to true otherwise.</p>
</td>
</tr>
<tr><td><code>requeuingStrategy</code><br/>
<a href="#RequeuingStrategy"><code>RequeuingStrategy</code></a>
</td>
<td>
<p>RequeuingStrategy defines the strategy for requeuing a Workload.</p>
</td>
</tr>
</tbody>
</table>
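
The deactivation times listed under `backoffLimitCount` in [RequeuingStrategy](#RequeuingStrategy) can be approximated with a few lines of arithmetic. The sketch below is an illustration only, not Kueue's implementation: the function names are hypothetical, the random jitter ("Rand") is ignored, and it assumes that a workload with `backoffLimitCount = k` goes through `k + 1` attempts that each run into the default 5min (300 seconds) `waitForPodsReady.timeout`, with one backoff wait after each of the first `k` attempts.

```python
def backoff_seconds(n: int, base: float = 1.41284738) -> float:
    """Backoff before the n-th re-queue (n starts at 1), without the random jitter."""
    return base ** (n - 1)


def deactivation_seconds(backoff_limit_count: int, timeout_seconds: float = 300.0) -> float:
    """Approximate time until a workload is deactivated.

    Assumption (not taken from Kueue's source): the workload times out
    backoff_limit_count + 1 times, and each of the first backoff_limit_count
    timeouts is followed by one backoff wait.
    """
    attempts = backoff_limit_count + 1
    waits = sum(backoff_seconds(n) for n in range(1, backoff_limit_count + 1))
    return attempts * timeout_seconds + waits


if __name__ == "__main__":
    # Print the approximate deactivation time for the limits used in the table above.
    for limit in (1, 2, 5, 10, 20, 30):
        print(limit, round(deactivation_seconds(limit)))
```

Under these assumptions the results stay within about a second of the documented pairs, which suggests the base 1.41284738 was chosen so that 30 retries take roughly 24 hours with the default timeout.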