---
title: "Run A Job"
date: 2022-02-14
weight: 5
description: >
  Run a Job in a Kubernetes cluster with Kueue enabled.
---

This page shows you how to run a Job in a Kubernetes cluster with Kueue enabled.

The intended audience for this page is [batch users](/docs/tasks#batch-user).

## Before you begin

Make sure the following conditions are met:

- A Kubernetes cluster is running.
- The kubectl command-line tool can communicate with your cluster.
- [Kueue is installed](/docs/installation).
- The cluster has [quotas configured](/docs/tasks/administer_cluster_quotas).

The following picture shows all the concepts you will interact with in this tutorial:

## 0. Identify the queues available in your namespace

Run the following command to list the `LocalQueues` available in your namespace.

```shell
kubectl -n default get localqueues
# Or use the 'queues' alias.
kubectl -n default get queues
```

The output is similar to the following:

```bash
NAME         CLUSTERQUEUE    PENDING WORKLOADS
user-queue   cluster-queue   3
```

The [ClusterQueue](/docs/concepts/cluster_queue) defines the quotas for the
Queue.

## 1. Define the Job

Running a Job in Kueue is similar to [running a Job in a Kubernetes cluster](https://kubernetes.io/docs/tasks/job/)
without Kueue. However, you must consider the following differences:

- You should create the Job in a [suspended state](https://kubernetes.io/docs/concepts/workloads/controllers/job/#suspending-a-job),
  as Kueue will decide when it's the best time to start the Job.
- You have to set the Queue you want to submit the Job to. Use the
  `kueue.x-k8s.io/queue-name` label.
- You should include the resource requests for each Job Pod.

Here is a sample Job with three Pods that just sleep for a few seconds.

{{< include "examples/jobs/sample-job.yaml" "yaml" >}}

## 2. Run the Job

You can run the Job with the following command:

```shell
kubectl create -f sample-job.yaml
```

Internally, Kueue will create a corresponding [Workload](/docs/concepts/workload)
for this Job with a matching name.

```shell
kubectl -n default get workloads
```

The output will be similar to the following:

```shell
NAME               QUEUE         ADMITTED BY     AGE
sample-job-sl4bm   user-queue                    1s
```

## 3. (Optional) Monitor the status of the workload

You can see the Workload status with the following command:

```shell
kubectl -n default describe workload sample-job-sl4bm
```

If the ClusterQueue doesn't have enough quota to run the Workload, the output
will be similar to the following:

```shell
Name:         sample-job-sl4bm
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  kueue.x-k8s.io/v1beta1
Kind:         Workload
Metadata:
  ...
Spec:
  ...
Status:
  Conditions:
    Last Probe Time:       2022-03-28T19:43:03Z
    Last Transition Time:  2022-03-28T19:43:03Z
    Message:               workload didn't fit
    Reason:                Pending
    Status:                False
    Type:                  Admitted
Events:                    <none>
```

When the ClusterQueue has enough quota to run the Workload, it will admit
the Workload. To see if the Workload was admitted, run the following command:

```shell
kubectl -n default get workloads
```

The output is similar to the following:

```shell
NAME               QUEUE         ADMITTED BY     AGE
sample-job-sl4bm   user-queue    cluster-queue   45s
```

To view the event for the Workload admission, run the following command:

```shell
kubectl -n default describe workload sample-job-sl4bm
```

The output is similar to the following:

```shell
...
Events:
  Type    Reason    Age   From           Message
  ----    ------    ----  ----           -------
  Normal  Admitted  50s   kueue-manager  Admitted by ClusterQueue cluster-queue
```

To continue monitoring the Workload progress, you can run the following command:

```shell
kubectl -n default describe workload sample-job-sl4bm
```

Once the Workload has finished running, the output is similar to the following:

```shell
...
Status:
  Conditions:
    ...
    Last Probe Time:       2022-03-28T19:43:37Z
    Last Transition Time:  2022-03-28T19:43:37Z
    Message:               Job finished successfully
    Reason:                JobFinished
    Status:                True
    Type:                  Finished
...
```

To review more details about the Job status, run the following command:

```shell
kubectl -n default describe job sample-job-sl4bm
```

The output is similar to the following:

```shell
Name:             sample-job-sl4bm
Namespace:        default
...
Start Time:       Mon, 28 Mar 2022 15:45:17 -0400
Completed At:     Mon, 28 Mar 2022 15:45:49 -0400
Duration:         32s
Pods Statuses:    0 Active / 3 Succeeded / 0 Failed
Pod Template:
  ...
Events:
  Type    Reason            Age   From                  Message
  ----    ------            ----  ----                  -------
  Normal  Suspended         22m   job-controller        Job suspended
  Normal  CreatedWorkload   22m   kueue-job-controller  Created Workload: default/sample-job-sl4bm
  Normal  SuccessfulCreate  19m   job-controller        Created pod: sample-job-sl4bm-7bqld
  Normal  Started           19m   kueue-job-controller  Admitted by clusterQueue cluster-queue
  Normal  SuccessfulCreate  19m   job-controller        Created pod: sample-job-sl4bm-7jw4z
  Normal  SuccessfulCreate  19m   job-controller        Created pod: sample-job-sl4bm-m7wgm
  Normal  Resumed           19m   job-controller        Job resumed
  Normal  Completed         18m   job-controller        Job completed
```

Since events have a timestamp with a resolution of seconds, the events might
be listed in a slightly different order from the one in which they actually occurred.

## Partial admission

From version v0.4.0, Kueue lets a batch user create Jobs that ideally run with a parallelism `P0` but can accept a smaller parallelism, `Pn`, if the Job does not fit within the available quota.

Kueue only attempts to decrease the parallelism after both _borrowing_ and _preemption_ have been taken into account in the admission process and neither is feasible.

To allow partial admission, provide the minimum acceptable parallelism `Pmin` in the `kueue.x-k8s.io/job-min-parallelism` annotation of the Job. `Pmin` should be greater than 0 and less than `P0`. When a Job is partially admitted, its parallelism is set to `Pn`, the largest value between `Pmin` and `P0` that fits within the available quota. The Job's completions count is not changed.

For example, a Job defined by the following manifest:

{{< include "examples/jobs/sample-job-partial-admission.yaml" "yaml" >}}

When queued in a ClusterQueue with only 9 CPUs available, it will be admitted with `parallelism=9`. Note that the number of completions doesn't change.

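The arithmetic behind that example can be sketched in a few lines of shell. This is only an illustration of the rule described above, not Kueue's actual admission code: the function `partial_parallelism` is a hypothetical helper, the sample values (`P0=20`, `Pmin=5`, one CPU requested per Pod) are assumptions because the manifest is not reproduced here, and the sketch considers CPU quota only, ignoring borrowing and preemption.

```shell
# Hypothetical helper, not part of Kueue or kubectl.
# Usage: partial_parallelism P0 PMIN CPU_PER_POD CPU_AVAILABLE (whole CPUs only)
partial_parallelism() {
  local p0=$1 pmin=$2 cpu_per_pod=$3 cpu_available=$4

  # The minimum must be greater than 0 and less than the requested parallelism.
  if (( pmin <= 0 || pmin >= p0 )); then
    echo "invalid minimum parallelism" >&2
    return 1
  fi

  # Largest number of Pods whose combined CPU request fits, capped at P0.
  local pn=$(( cpu_available / cpu_per_pod ))
  if (( pn > p0 )); then
    pn=$p0
  fi

  # Below the minimum, the Job stays queued in this sketch.
  if (( pn < pmin )); then
    echo "pending"
  else
    echo "$pn"
  fi
}

# Assumed values: P0=20, Pmin=5, 1 CPU per Pod, 9 CPUs of free quota.
partial_parallelism 20 5 1 9   # prints 9
```

With fewer free CPUs than `Pmin` Pods need (for example, 3 CPUs with the values above), the sketch prints `pending` instead of a parallelism value.
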
**NOTE:** PartialAdmission is an `Alpha` feature that is disabled by default. For details, check the [Change the feature gates configuration](/docs/installation/#change-the-feature-gates-configuration) section of the [Installation](/docs/installation/) page.