# Queue

[@k82cn](http://github.com/k82cn); April 17, 2019

## Motivation

`Queue` was introduced in [kube-batch](http://github.com/kubernetes-sigs/kube-batch) a long time ago as an internal feature, under which all jobs are submitted to the same queue, named `default`. As more and more users would like to share resources with each other through queues, this proposal covers the primary features of `Queue` needed to achieve that.

## Function Specification

A queue is a cluster-level resource, so users from different namespaces can share resources within a `Queue`. The following section defines the API of the queue.

### API

```go
type Queue struct {
	metav1.TypeMeta `json:",inline"`

	metav1.ObjectMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`

	// Specification of the desired behavior of a queue
	// +optional
	Spec QueueSpec `json:"spec,omitempty" protobuf:"bytes,2,opt,name=spec"`

	// Current status of Queue
	// +optional
	Status QueueStatus `json:"status,omitempty" protobuf:"bytes,3,opt,name=status"`
}

type QueueSpec struct {
	// The weight of the queue, used to share resources between queues.
	Weight int32 `json:"weight,omitempty" protobuf:"bytes,1,opt,name=weight"`
}

type QueueStatus struct {
	// The number of jobs in Unknown status
	Unknown int32 `json:"unknown,omitempty" protobuf:"bytes,1,opt,name=unknown"`
	// The number of jobs in Running status
	Running int32 `json:"running,omitempty" protobuf:"bytes,2,opt,name=running"`
	// The number of jobs in Pending status
	Pending int32 `json:"pending,omitempty" protobuf:"bytes,3,opt,name=pending"`
	// The number of jobs in Completed status
	Completed int32 `json:"completed,omitempty" protobuf:"bytes,4,opt,name=completed"`
	// The number of jobs in Failed status
	Failed int32 `json:"failed,omitempty" protobuf:"bytes,5,opt,name=failed"`
	// The number of jobs in Aborted status
	Aborted int32 `json:"aborted,omitempty" protobuf:"bytes,6,opt,name=aborted"`
}
```

### QueueController

The `QueueController` manages the lifecycle of a queue:

1. It watches `PodGroup`/`Job` for status.
2. If a `Queue` is deleted, it also deletes all related `PodGroup`/`Job` objects in the queue.

### Admission Controller

The admission controller checks the queue of a `PodGroup`/`Job` at creation time:

1. If the queue does not exist, the creation is rejected.
2. If the queue is releasing, the creation is also rejected.

### Feature Interaction

#### Customized Job/PodGroup

If a `PodGroup` is created by a customized controller, the `QueueController` counts it under the `Unknown` status, because a `PodGroup` focuses on the scheduling specification, which does not include the customized job's status.

#### cli

The command line is also enhanced for operations engineers. Three sub-commands are introduced as follows:

__create__:

The `create` command is used to create a queue with a weight; for example, the following command creates a queue named `myqueue` with weight 10.

```shell
$ vcctl queue create --name myqueue --weight 10
```

__view__:

The `view` command is used to show the details of a queue, e.g. its creation time; the following command shows the details of the queue `myqueue`.

```shell
$ vcctl queue view myqueue
```

__list__:

The `list` command is used to show all queues available to the current user.

```shell
$ vcctl queue list
Name       Weight  Total  Pending  Running ...
myqueue    10      10     5        5
```

#### Scheduler

* Proportion plugin:

  The proportion plugin is used to share resources between `Queue`s by weight. The deserved resource of a queue is `(weight/total-weight) * total-resource`. When allocating resources, the plugin will not allocate more to a queue than its deserved resources.

* Reclaim action:

  The `reclaim` action goes through all queues and reclaims resources from other queues according to the return value of `ReclaimableFn`; the time complexity is `O(n^2)`. In `ReclaimableFn`, both `proportion` and `gang` take effect: 1. `proportion` makes sure a queue will not be under-used after the reclaim, 2. `gang` makes sure a job will not be reclaimed if its `minAvailable` > 1.

* Backfill action:

  When the `allocate` action assigns resources to each queue, there is a case ([kube-batch#492](<https://github.com/kubernetes-sigs/kube-batch/issues/492>)) in which resources may sit unnecessarily idle because of the `proportion` plugin: each of two queues has one pending job, and the deserved resources of each queue cannot meet the requirements of its job. In such a case, the `backfill` action ignores the deserved guarantee of the queues to fill idle resources as much as possible. This introduces another potential case in which an incoming smaller job is blocked; this case will be handled by reserved resources of each queue in another project.
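
As a rough illustration of the deserved-share formula given for the proportion plugin, `(weight/total-weight) * total-resource`, the following sketch computes each queue's share and checks a request against it. The names `queueInfo`, `deservedShares`, and `canAllocate` are hypothetical and not part of the scheduler, and the resource is reduced to a single scalar (for example milli-CPU) for brevity; the actual plugin may compute shares differently.

```go
package main

import "fmt"

// queueInfo is a hypothetical, simplified stand-in for a Queue;
// only the weight from QueueSpec is modeled.
type queueInfo struct {
	Name   string
	Weight int32
}

// deservedShares applies (weight/total-weight) * total-resource to each queue.
// Assumption: the resource is a single scalar value.
func deservedShares(queues []queueInfo, totalResource float64) map[string]float64 {
	var totalWeight int32
	for _, q := range queues {
		totalWeight += q.Weight
	}

	shares := make(map[string]float64, len(queues))
	if totalWeight == 0 {
		return shares // no weights, so nothing to divide
	}
	for _, q := range queues {
		shares[q.Name] = float64(q.Weight) / float64(totalWeight) * totalResource
	}
	return shares
}

// canAllocate reports whether a new request keeps the queue within
// its deserved share.
func canAllocate(allocated, request, deserved float64) bool {
	return allocated+request <= deserved
}

func main() {
	queues := []queueInfo{
		{Name: "q1", Weight: 10},
		{Name: "q2", Weight: 30},
	}
	shares := deservedShares(queues, 4000)

	fmt.Println(shares["q1"], shares["q2"])           // 1000 3000
	fmt.Println(canAllocate(800, 300, shares["q1"])) // false: 1100 exceeds 1000
}
```

With weights 10 and 30 over 4000 units, `q1` deserves a quarter of the cluster and `q2` the rest; a request that would push `q1` past its share is refused, which is the situation the `backfill` action is meant to relieve when resources would otherwise stay idle.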