# Queue

[@k82cn](http://github.com/k82cn); April 17, 2019

## Motivation

`Queue` was introduced in [kube-batch](http://github.com/kubernetes-sigs/kube-batch) a long time ago as an internal feature: all jobs are submitted to the same queue, named `default`. As more and more users want to share resources with each other through queues, this proposal covers the primary queue features needed to achieve that.

## Function Specification

A queue is a cluster-level resource, so users from different namespaces can share resources within a `Queue`. The following section defines the queue API.

### API

```go
import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

type Queue struct {
    metav1.TypeMeta `json:",inline"`

    metav1.ObjectMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`

    // Specification of the desired behavior of a queue
    // +optional
    Spec QueueSpec `json:"spec,omitempty" protobuf:"bytes,2,opt,name=spec"`

    // Current status of Queue
    // +optional
    Status QueueStatus `json:"status,omitempty" protobuf:"bytes,3,opt,name=status"`
}

type QueueSpec struct {
    // The weight of the queue, used to share resources between queues.
    Weight int32 `json:"weight,omitempty" protobuf:"bytes,1,opt,name=weight"`
}

type QueueStatus struct {
    // The number of jobs in Unknown status
    Unknown int32 `json:"unknown,omitempty" protobuf:"bytes,1,opt,name=unknown"`
    // The number of jobs in Running status
    Running int32 `json:"running,omitempty" protobuf:"bytes,2,opt,name=running"`
    // The number of jobs in Pending status
    Pending int32 `json:"pending,omitempty" protobuf:"bytes,3,opt,name=pending"`
    // The number of jobs in Completed status
    Completed int32 `json:"completed,omitempty" protobuf:"bytes,4,opt,name=completed"`
    // The number of jobs in Failed status
    Failed int32 `json:"failed,omitempty" protobuf:"bytes,5,opt,name=failed"`
    // The number of jobs in Aborted status
    Aborted int32 `json:"aborted,omitempty" protobuf:"bytes,6,opt,name=aborted"`
}
```

### QueueController

The `QueueController` manages the lifecycle of a queue:

1. It watches `PodGroup`/`Job` objects to track their status.
2. If a `Queue` is deleted, it also deletes all related `PodGroup`/`Job` objects in that queue.

### Admission Controller

The admission controller checks the queue of a `PodGroup`/`Job` at creation time:

1. If the queue does not exist, the creation is rejected.
2. If the queue is releasing, the creation is also rejected.

### Feature Interaction

#### Customized Job/PodGroup

If a `PodGroup` is created by a customized controller, the `QueueController` counts it under the `Unknown` status, because a `PodGroup` only carries the scheduling specification and does not include the customized job's status.

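A minimal sketch of how such counting might look, assuming job status arrives as a plain string. The phase names and the `countJobs` helper are hypothetical; only the counter names come from `QueueStatus` in the API section.

```go
package main

import "fmt"

// QueueStatus mirrors the counters defined in the API section.
type QueueStatus struct {
	Unknown, Running, Pending, Completed, Failed, Aborted int32
}

// countJobs tallies job phases into a QueueStatus. Any phase that is empty
// or not recognised (for example, a PodGroup created by a customized
// controller) is counted as Unknown.
func countJobs(phases []string) QueueStatus {
	var s QueueStatus
	for _, p := range phases {
		switch p {
		case "Running":
			s.Running++
		case "Pending":
			s.Pending++
		case "Completed":
			s.Completed++
		case "Failed":
			s.Failed++
		case "Aborted":
			s.Aborted++
		default:
			s.Unknown++
		}
	}
	return s
}

func main() {
	// The empty string stands for a PodGroup whose owner reports no status.
	status := countJobs([]string{"Running", "Pending", "", "Running"})
	fmt.Printf("%+v\n", status)
}
```
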
#### cli

The command line is also enhanced for operators. Three sub-commands are introduced as follows:

__create__:

The `create` command creates a queue with a weight; for example, the following command creates a queue named `myqueue` with weight 10.

```shell
$ vcctl queue create --name myqueue --weight 10
```

__view__:

The `view` command shows the details of a queue, e.g. its creation time; the following command shows the details of queue `myqueue`.

```shell
$ vcctl queue view myqueue
```

__list__:

The `list` command shows all queues available to the current user.

```shell
$ vcctl queue list
Name      Weight  Total  Pending  Running ...
myqueue   10      10     5        5
```

#### Scheduler

* Proportion plugin:

  The proportion plugin shares resources between `Queue`s by weight. The deserved resource of a queue is `(weight/total-weight) * total-resource`. When allocating resources, the scheduler will not allocate a queue more than its deserved resources.

* Reclaim action:

  The `reclaim` action goes through all queues and reclaims resources from other queues based on the return value of `ReclaimableFn`; the time complexity is `O(n^2)`. In `ReclaimableFn`, both `proportion` and `gang` take effect: 1. `proportion` makes sure the queue will not be under-used after reclaim, 2. `gang` makes sure the job will not be reclaimed if its `minAvailable` > 1.

* Backfill action:

  When the `allocate` action assigns resources to each queue, resources may sit idle unnecessarily because of the `proportion` plugin ([kube-batch#492](<https://github.com/kubernetes-sigs/kube-batch/issues/492>)): for example, two queues each have one pending job, and the deserved resources of each queue cannot meet the requirement of its job. In such a case, the `backfill` action ignores the deserved guarantee of each queue and fills idle resources as much as possible. This introduces another potential problem, namely that a smaller job arriving later is blocked; that case will be handled by reserved resources of each queue in another project.
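
The deserved-share formula and the idle-resource case can be illustrated with a toy single-resource model. This is a sketch under simplifying assumptions, not the scheduler's implementation: the `queue` struct and the `deserved`, `allocate` and `backfill` functions are hypothetical, each queue holds exactly one pending job, and only CPU is considered.

```go
package main

import "fmt"

// queue is a toy model: one pending job per queue, one resource dimension.
type queue struct {
	name    string
	weight  int32
	request float64 // CPU requested by the queue's single pending job
	placed  bool
}

// deserved returns (weight/totalWeight) * totalResource for one queue.
func deserved(weight, totalWeight int32, totalResource float64) float64 {
	if totalWeight <= 0 {
		return 0
	}
	return float64(weight) / float64(totalWeight) * totalResource
}

// allocate places a job only if it fits within its queue's deserved share
// and within the remaining idle resources. It returns the idle amount left.
func allocate(queues []queue, total float64) float64 {
	var totalWeight int32
	for _, q := range queues {
		totalWeight += q.weight
	}
	idle := total
	for i := range queues {
		q := &queues[i]
		if q.request <= deserved(q.weight, totalWeight, total) && q.request <= idle {
			q.placed = true
			idle -= q.request
		}
	}
	return idle
}

// backfill ignores the deserved share and places any still-pending job that
// fits into the idle resources. It returns the idle amount left.
func backfill(queues []queue, idle float64) float64 {
	for i := range queues {
		q := &queues[i]
		if !q.placed && q.request <= idle {
			q.placed = true
			idle -= q.request
		}
	}
	return idle
}

func main() {
	// Two equally weighted queues on a 10-CPU cluster; each job needs 6 CPUs.
	queues := []queue{
		{name: "q1", weight: 1, request: 6},
		{name: "q2", weight: 1, request: 6},
	}
	idle := allocate(queues, 10)
	fmt.Println("idle after allocate:", idle)
	idle = backfill(queues, idle)
	fmt.Println("idle after backfill:", idle)
}
```

With these numbers each queue deserves 5 CPUs, so `allocate` places neither 6-CPU job and all 10 CPUs stay idle; `backfill` then places the job of `q1` and leaves 4 CPUs idle.
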