## Introduction

This feature allows Volcano to schedule workloads based on the `[min,max]` config to improve resource utilization and shorten the execution time of training jobs.

For example, suppose a K8s cluster has 10 GPUs, and we want to use Volcano to schedule training jobs (tfjob/pytorchjob/vcjob) in two queues, queue1 and queue2:

|queue|weight|reclaimable|deserved GPUs|
|---|---|---|---|
|queue1|1|false|5|
|queue2|1|false|5|

![](images/elastic-scheduler-job1-1.png)

If job1-1 is running in queue1, we treat pod6 through pod10 as elastic pods/resources, which can be preempted when queue1 is short of resources. The elastic pods have the lowest priority. Specifically, **these pods will be created last and be preempted first**.

1. Elastic pods can be created only when there are free resources.
2. Elastic pods will be preempted if there are not enough resources to run the minAvailable pods.

```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: job1-1
spec:
  minAvailable: 5 #min
  queue: queue1
  tasks:
    - replicas: 10 #max
      name: job1-1
      template:
        metadata:
          name: job1-1
        spec:
          containers:
            - image: train_script
              name: xx
              resources:
                limits:
                  cpu: 1
                  nvidia.com/gpu: 1
```

In detail, elastic scheduling follows these principles:

1. If job1-1 and job1-2 are submitted at the same time, the `minAvailable` pods of both jobs will be created first. Their elastic pods will then be created if extra resources remain.
   ![](images/elastic-scheduler-job1-1-2.png)
2. If job1-1 is submitted and then job1-2 is submitted in queue1, the elastic pods in job1-1 will be preempted.
   ![](images/elastic-scheduler-job1-2.png)
3. If job1-1 is submitted and then job2-1 is submitted in queue2, the elastic pods in job1-1 will be preempted.
   ![](images/elastic-scheduler-job2-1.png)

## Design

1. Enqueue action
   - Modify the logic of the job enqueue process. Because elastic pods can be preempted at any time, elastic resources count as free resources in a queue. So we will modify the `jobEnqueueableFns` in the `overcommit` and `proportion` plugins. Note that if the total elastic resources cannot meet the new job's minRequest, the new job should stay pending.

   ![](images/elastic-scheduler-enqueue.png)

2. Allocate action (already implemented)
   - All pods will be created initially (by the controller/operator), but minAvailable pods will be scheduled first, and elastic pods will be scheduled afterwards if there are free resources.

3. Preempt action in queue scope
   - Preempt elastic pods if there is a starving job in the same queue (already implemented).
   - It is not necessary to preempt elastic pods if the total elastic resources cannot meet the starving job's minRequest.

4. Reclaim action in cluster scope
   - If a queue is overused, reclaim its elastic resources regardless of whether the queue's `reclaimable` field is true or false.
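
The enqueue rule from the Design section (elastic resources count as free resources, and a new job stays pending if they cannot cover its minRequest) can be sketched as follows. This is a minimal illustration under simplifying assumptions, not Volcano's plugin code: the `Job` type and the functions `totalElastic` and `canEnqueue` are hypothetical names, and resources are reduced to a single GPU count with one GPU per pod.

```go
package main

import "fmt"

// Job is a hypothetical, simplified view of a running job.
// Resources are reduced to a GPU count, one GPU per pod.
type Job struct {
	Name         string
	MinAvailable int // pods that must run (the "min")
	Running      int // pods currently running (at most the "max" replicas)
}

// Elastic returns the number of running pods beyond MinAvailable,
// i.e. the pods that may be preempted.
func (j Job) Elastic() int {
	if j.Running > j.MinAvailable {
		return j.Running - j.MinAvailable
	}
	return 0
}

// totalElastic sums the elastic pods of all running jobs.
func totalElastic(jobs []Job) int {
	sum := 0
	for _, j := range jobs {
		sum += j.Elastic()
	}
	return sum
}

// canEnqueue reports whether a new job may be enqueued: idle GPUs plus
// GPUs held by elastic pods must cover the new job's minRequest.
// Otherwise the new job stays pending.
func canEnqueue(idle int, running []Job, minRequest int) bool {
	return idle+totalElastic(running) >= minRequest
}

func main() {
	// job1-1 from the example: 5 minAvailable pods plus 5 elastic pods,
	// occupying all 10 GPUs, so no GPU is idle.
	running := []Job{{Name: "job1-1", MinAvailable: 5, Running: 10}}

	fmt.Println(canEnqueue(0, running, 5)) // true: 5 elastic GPUs cover minRequest 5
	fmt.Println(canEnqueue(0, running, 6)) // false: the new job stays pending
}
```

The same comparison applies to the preempt action: if the total elastic resources cannot meet the starving job's minRequest, preempting elastic pods would not help, so they are left running.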