# Kueue Performance Testing

## Measurements

### Job startup latency

How quickly do jobs transition from the `created` state to the `started` state?
This is the time elapsed between `job.CreationTimestamp.Time` and `job.Status.StartTime.Time`.

High Job startup latency in Kueue is expected when the total quota is not enough to schedule all jobs immediately, because the jobs have to wait in the queue.
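
As an illustration, the latency of a single Job can be derived from the two timestamps named above. The sketch below is not part of the test suite. It assumes the Job has been exported as JSON (for example with `kubectl get job <name> -o json`), where the fields appear as `metadata.creationTimestamp` and `status.startTime` in RFC 3339 UTC format. The helper names and the example Job are hypothetical.

```python
from datetime import datetime, timezone

# Kubernetes serializes these timestamps as UTC strings such as "2024-01-01T10:00:00Z".
K8S_TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_k8s_time(value):
    """Parse a Kubernetes UTC timestamp string into a timezone-aware datetime."""
    return datetime.strptime(value, K8S_TIME_FORMAT).replace(tzinfo=timezone.utc)


def startup_latency_seconds(job):
    """Return the seconds between Job creation and Job start, or None if not started.

    `job` is a dict shaped like the JSON output of `kubectl get job -o json`.
    """
    start = job.get("status", {}).get("startTime")
    if start is None:
        return None  # the Job has not started yet (for example, it is still queued)
    created = job["metadata"]["creationTimestamp"]
    return (parse_k8s_time(start) - parse_k8s_time(created)).total_seconds()


# Hypothetical Job that waited 42 seconds between creation and start.
example_job = {
    "metadata": {"creationTimestamp": "2024-01-01T10:00:00Z"},
    "status": {"startTime": "2024-01-01T10:00:42Z"},
}
print(startup_latency_seconds(example_job))
```

A Job without `status.startTime` yields `None` rather than a latency, so queued Jobs can be told apart from started ones.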

### Job startup throughput

The highest workload admission rate (workloads per second), computed over 1-minute windows.
The rate is evaluated every 5 seconds using a PromQL subquery (see [PromQL examples](https://prometheus.io/docs/prometheus/latest/querying/examples/#subquery) for details):

`max_over_time(sum(rate(kueue_admitted_workloads_total{cluster_queue="{{$clusterQueue}}"}[1m]))[{{$testTimeout}}:5s])`

This measurement is not accurate if the cluster quota is large enough to schedule all of the test's workloads immediately, because Kueue admits all the workloads at once and the `kueue_admitted_workloads_total` counter never increases. In this case, the PromQL query returns 0.
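
To make the shape of the query concrete, the following sketch imitates it on in-memory counter samples: a per-second rate over a 1-minute window, re-evaluated every 5 seconds, with the maximum kept. It is a simplified approximation only. Prometheus's `rate()` also extrapolates to the window boundaries and handles counter resets, which this sketch does not. The function name `max_admission_rate` and the sample data are hypothetical.

```python
def max_admission_rate(samples, window=60.0, step=5.0):
    """Return the highest per-second increase seen in any `window`-second interval.

    `samples` is a list of (timestamp_seconds, counter_value) pairs sorted by time.
    The window is re-evaluated every `step` seconds, like a PromQL subquery.
    """
    if not samples:
        return 0.0
    best = 0.0
    t = samples[0][0] + window  # first evaluation time with a full window behind it
    end = samples[-1][0]
    while t <= end:
        in_window = [(ts, v) for ts, v in samples if t - window <= ts <= t]
        if len(in_window) >= 2:
            (t0, v0), (t1, v1) = in_window[0], in_window[-1]
            if t1 > t0:
                best = max(best, (v1 - v0) / (t1 - t0))
        t += step
    return best


# Hypothetical counter scraped every 5 s: 10 admissions per scrape for 60 s, then flat.
busy = [(i * 5, min(i, 12) * 10) for i in range(25)]
# Hypothetical counter that does not change while it is being observed.
flat = [(i * 5, 50) for i in range(25)]

print(max_admission_rate(busy))
print(max_admission_rate(flat))
```

The flat series mirrors the caveat above: when the counter does not change within the observed windows, the computed maximum rate is 0.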

## How to run the test?

### Prerequisites

1. Deploy [Kueue](https://github.com/kubernetes-sigs/kueue/blob/main/docs/setup/install.md).
2. Make sure you have `kubectl`, [jq](https://stedolan.github.io/jq/download/), the [Go version](https://github.com/mikefarah/yq) of `yq`, and `go` installed.
3. Check out the `Clusterloader2` framework from https://github.com/kubernetes/perf-tests and build the `clusterloader` binary:

    * change to the `clusterloader2` directory
    * run `go build -o clusterloader './cmd/'`

### Run the test

1. Copy the example environment file to `.env`:

    * `cp .env.example .env`

2. Edit the environment variables:

| Variable          | Description |
| -----------       | ----------- |
| CL2_HOME_DIR      | Clusterloader home directory (a checkout of https://github.com/kubernetes/perf-tests) |
| USE_KUEUE         | Whether to run the performance test with Kueue (this requires Kueue to be pre-deployed to the cluster) or without Kueue |
| EXPERIMENTS       | Configuration of the test iterations (see the configuration example in the file) |
| KUBECONFIG        | Kubeconfig file location |
| PROVIDER          | Kubernetes provider (tested on `gke` only) |

3. Run the `run-test.sh` script.

### Test results

Every test execution creates a `report_<timestamp>` directory inside `TEST_CONFIG_DIR` containing a `summary.csv` file with the following metrics:

* P50 Job Create to start latency (ms)
* P90 Job Create to start latency (ms)
* P50 Job Start to complete latency (ms)
* P90 Job Start to complete latency (ms)
* Max Job Throughput (max jobs/s)
* Total Jobs
* Total Pods
* Duration (s)
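
The P50 and P90 figures are percentiles of the per-Job latencies. How the test suite computes them is not stated here, so the sketch below assumes the nearest-rank definition purely for illustration. The function name `percentile` and the sample latencies are hypothetical.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of the data at or below it."""
    if not values:
        raise ValueError("percentile of an empty list is undefined")
    ordered = sorted(values)
    # Rank is 1-based; clamp to 1 so that p=0 returns the minimum.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical create-to-start latencies in milliseconds.
latencies_ms = [120, 150, 180, 200, 250, 300, 400, 650, 900, 2500]

print("P50:", percentile(latencies_ms, 50))
print("P90:", percentile(latencies_ms, 90))
```

Other percentile definitions (for example, with linear interpolation) can give slightly different values on small samples, so results from this sketch may not match `summary.csv` exactly.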

Additionally, the following metrics are included in the results for reference only. Kueue doesn't influence them directly.

* Avg Pod Waiting time (s)
* P90 Pod Waiting time (s)
* Avg Pod Completion time (s)
* P90 Pod Completion time (s)