volcano.sh/volcano@v1.9.0/docs/design/proportional.md (about) 1 ## Background 2 3 Volcano scheduler handles jobs requiring different types of resources, such as GPU, CPU, memory. Under particular circumstances, we may specify a 'primary' resource(e.g., GPU in deep learning), and preserve the amount of associated 'secondary' resources by a pre-set proportion. This plugin works in the phase of predicates, dedicates to ensure the node's idle resource is enough for the proportion after jobs requiring secondary resources are scheduled. 4 5 ## Scenario of default scheduler 6 7 Considering we have a node with 74CPUs, 8GPUs, 128G memory. As no job is submitted, resource NodeAllocatable is equal to NodeIdle. 8 9 Node | NodeAllocatable | NodeIdle 10 ---|---|--- 11 nodeC0-0 | cpu 74, memory 128G, nvidia.com/gpu 8 | cpu 74, memory 128G, nvidia.com/gpu 8 | 12 13 Then two jobs requiring 8CPUs, 8G memory are submitted, and scheduled to the node; the resource status is as below: 14 15 Job | Pod | Resource | Node | NodeAllocatable | NodeIdle 16 ---|---|---|---|---|--- 17 default/single-1000-0 | single-1000-0 | cpu 8, memory 8G, nvidia.com/gpu 0 | nodeC0-0 | cpu 74, memory 128G, nvidia.com/gpu 8 | cpu 66, memory 120G, nvidia.com/gpu 8 | 18 default/single-1000-1 | single-1000-1 | cpu 8, memory 8G, nvidia.com/gpu 0 | nodeC0-0 | cpu 66, memory 120G, nvidia.com/gpu 8 | cpu 58, memory 112G, nvidia.com/gpu 8 | 19 20 If we take GPU as primary resource and want to use 1GPU 'binded' with 8CPUs, the left 58 CPUs are insufficent for 8 GPUs; the proportion plugin is designed to solve this problem. 21 22 ## with proportion plugin 23 24  25 26 Firstly set the proportion binding in volcano-scheduler.conf: 27 28 ```yaml 29 actions: "enqueue, allocate, backfill" 30 tiers: 31 - plugins: 32 - name: predicates 33 arguments: 34 predicate.ProportionalEnable: true 35 predicate.resources: nvidia.com/gpu,nvidia.com/v100-sxm2-16gb 36 predicate.resources.nvidia.com/gpu.cpu: 8 37 predicate.resources.nvidia.com/gpu.memory: 8 38 predicate.resources.nvidia.com/v100-sxm2-16gb.cpu: 16 39 predicate.resources.nvidia.com/v100-sxm2-16gb.memory: 16 40 ``` 41 42 The proportion is GPU:CPU:MEMORY=1:8:8, and let the test scenario just as above: 43 44 Node | NodeAllocatable | NodeIdle 45 ---|---|--- 46 nodeC0-0 | cpu 74, memory 128G, nvidia.com/gpu 8 | cpu 74, memory 128G, nvidia.com/gpu 8 | 47 48 Job | Pod | Resource | Node | NodeAllocatable | NodeIdle 49 ---|---|---|---|---|--- 50 default/single-1000-0 | single-1000-0 | cpu 8, memory 8G, nvidia.com/gpu 0 | nodeC0-0 | cpu 74, memory 128G, nvidia.com/gpu 8 | cpu 66, memory 120G, nvidia.com/gpu 8 | 51 default/single-1000-1 | single-1000-1 | cpu 8, memory 8G, nvidia.com/gpu 0 | - | - | - | 52 53 After job single-1000-0 is scheduled, the Idel resouce is 8GPUs, 66CPUs, 120G memory. During the predicate phase, this plugin caculates the resource left if job single-1000-1 is scheduled`(node.Idel.CPU - task.Resreq.CPU < node.Idel.GPU * cpuRatio || 54 node.Idel.Memory - task.Resreq.Memory < node.Idel.GPU * memoryRatio)`; the result is 8GPUs, 58CPUs, 112G memory, that unsatisfies the 1:8:8 proportion. Therefore nodeC0-0 is removed from the predicateNodes, and NodeIdle remains 8GPUs, 66CPUs, 120G memory. 55 56 57