k8s.io/test-infra@v0.0.0-20240520184403-27c6b4c223d8/metrics/README.md

# BigQuery metrics

The `metrics-bigquery` job generates metrics that summarize data in our BigQuery
test result database. Each metric is defined by a config file that is consumed
by the `metrics-bigquery` periodic Prow job. Each metric config is a YAML file
like the following:

```yaml
# Metric name
metric: failures
# BigQuery query
query: |
  #standardSQL
  select /* find the most recent time each job passed (may not be this week) */
    job,
    max(started) latest_pass
  from `k8s-gubernator.build.all`
  where
    result = 'SUCCESS'
  group by job

# JQ filter to make daily results from raw query results
jqfilter: |
  [(.[] | select((.latest_pass|length) > 0)
  | {(.job): {
      latest_pass: (.latest_pass)
  }})] | add
```

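To make the effect of the example `jqfilter` easier to see, here is a rough Python equivalent. It is only an illustrative sketch: the function name `apply_example_filter` and the sample rows are hypothetical, and it assumes `latest_pass` arrives as a string or null in the raw query results.

```python
def apply_example_filter(rows):
    """Mimic the example jq filter on a list of raw result rows."""
    merged = {}
    for row in rows:
        latest_pass = row.get("latest_pass")
        # jq's `length` is 0 for null and for empty strings, so both are dropped.
        if not latest_pass:
            continue
        # `{(.job): {...}}` builds one single-key object per row; `add` merges them.
        merged[row["job"]] = {"latest_pass": latest_pass}
    # jq's `add` yields null for an empty array.
    return merged or None


# Hypothetical rows, shaped like the columns the example query selects.
rows = [
    {"job": "job-a", "latest_pass": "2024-05-01 12:00:00"},
    {"job": "job-b", "latest_pass": None},
]
print(apply_example_filter(rows))
```

The result is a single object keyed by job name, which is the shape the filter writes to the daily and latest files.
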
## Metrics

* build-stats - number of daily builds and pass rate
    - [Config](configs/build-stats.yaml)
    - [build-stats-latest.json](http://storage.googleapis.com/k8s-metrics/build-stats-latest.json)
* presubmit-health - presubmit failure rate and timing across PRs
    - [Config](configs/presubmit-health.yaml)
    - [presubmit-health-latest.json](http://storage.googleapis.com/k8s-metrics/presubmit-health-latest.json)
* failures - find jobs that have been failing the longest
    - [Config](configs/failures-config.yaml)
    - [failures-latest.json](http://storage.googleapis.com/k8s-metrics/failures-latest.json)
* flakes - find the flakiest jobs this week (and the flakiest tests in each job)
    - [Config](configs/flakes-config.yaml)
    - [flakes-latest.json](http://storage.googleapis.com/k8s-metrics/flakes-latest.json)
* flakes-daily - find the flakiest jobs in the last 24h (and the flakiest tests in each job)
    - [Config](configs/flakes-daily-config.yaml)
    - [flakes-daily-latest.json](http://storage.googleapis.com/k8s-metrics/flakes-daily-latest.json)
* job-health - compute daily health metrics for jobs (runs, tests, failure rate for each, duration percentiles)
    - [Config](configs/job-health.yaml)
    - [job-health-latest.json](http://storage.googleapis.com/k8s-metrics/job-health-latest.json)
* job-flakes - compute consistency of all jobs
    - [Config](configs/job-flakes-config.yaml)
    - [job-flakes-latest.json](http://storage.googleapis.com/k8s-metrics/job-flakes-latest.json)
* pr-consistency - calculate PR flakiness for the previous day
    - [Config](configs/pr-consistency-config.yaml)
    - [pr-consistency-latest.json](http://storage.googleapis.com/k8s-metrics/pr-consistency-latest.json)
* weekly-consistency - compute overall weekly consistency for PRs
    - [Config](configs/weekly-consistency-config.yaml)
    - [weekly-consistency-latest.json](http://storage.googleapis.com/k8s-metrics/weekly-consistency-latest.json)

## Adding a new metric

To add a new metric, create a PR that adds a new YAML config file
specifying the metric name (`metric`), the BigQuery query to execute (`query`), and a
jq filter that reduces the raw results to the daily and latest files (`jqfilter`).

Run `./bigquery.py --config configs/my-new-config.yaml` and verify that the
output is what you expect.

Add the new metric to the list above.

After the PR merges, the new metric should appear on GCS within 24 hours.

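Before opening the PR, it can help to confirm that the new config defines all three fields. The check below is a minimal sketch and is not part of `bigquery.py`; `REQUIRED_FIELDS` and `missing_fields` are hypothetical names, and the config is assumed to have already been parsed from YAML into a dict.

```python
REQUIRED_FIELDS = ("metric", "query", "jqfilter")


def missing_fields(config):
    """Return the required keys that are absent or empty in a metric config."""
    return [name for name in REQUIRED_FIELDS if not config.get(name)]


# Hypothetical config, as it might look after parsing the YAML file into a dict.
config = {
    "metric": "my-new-metric",
    "query": "#standardSQL\nselect job from `k8s-gubernator.build.all`",
}
print(missing_fields(config))  # ['jqfilter']
```

An empty list means the config has every field this README describes; it says nothing about whether the query or filter is correct.
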
## Testing Metrics

The data that these queries run against is stored in [BigQuery](https://cloud.google.com/bigquery). The tables that hold k8s test data are populated by [Kettle](https://github.com/kubernetes/test-infra/blob/master/kettle/README.md) and live in the `k8s-gubernator` project of [Big Query Tables].

From that page, open `k8s-gubernator` -> `build` -> `<table you care about>`, then:
- Click `Query Table`
- Write or paste a query into the editor
- Click `> Run`
- The results appear in a table at the bottom

To see when a table was last updated, select the table and open the `Details` tab. The `Last modified` field shows the time of the last update. If the data is stale, please create an issue against `Kettle`.

## Details

Each query runs every 24 hours and produces a JSON
file containing the complete raw query results, named
`raw-yyyy-mm-dd.json`. The raw file is then filtered with the associated
jq filter and the results are stored in `daily-yyyy-mm-dd.json`. These
files are stored in the k8s-metrics GCS bucket in a directory named after
the metric and persist for a year after their creation. Additionally,
the latest filtered results for a metric are stored in the root of the
k8s-metrics bucket as `METRICNAME-latest.json`.

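The naming scheme above can be sketched as a small helper. This is illustrative only: `metric_paths` is a hypothetical function, and writing the bucket as a `gs://` URL is an assumption (the text only names the bucket `k8s-metrics`).

```python
from datetime import date

# Bucket name comes from the text; the gs:// form is an assumption.
BUCKET = "gs://k8s-metrics"


def metric_paths(metric, day):
    """Build the raw, daily, and latest object paths for one metric and day."""
    stamp = day.isoformat()  # yyyy-mm-dd
    return {
        "raw": f"{BUCKET}/{metric}/raw-{stamp}.json",
        "daily": f"{BUCKET}/{metric}/daily-{stamp}.json",
        "latest": f"{BUCKET}/{metric}-latest.json",
    }


for kind, path in metric_paths("failures", date(2024, 5, 20)).items():
    print(kind, path)
```

Only the `latest` file sits in the bucket root, which matches the `-latest.json` links in the Metrics list.
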
## Query structure

The `query` is written in [BigQuery Standard SQL](https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax), which supports working with arrays and repeated fields. Each sub-query, from the most deeply indented outward, builds a subtable that the enclosing query runs against. Any one of the sub-query blocks can be run independently from the BigQuery console, or added to a test query config and run with the same `bigquery.py` command shown under "Adding a new metric".

## Consistency

Consistency means the test, job, or PR always produced the same answer. For
example, suppose we run a build of a job 5 times at the same commit:
* 5 passing runs, 0 failing runs: consistent
* 0 passing runs, 5 failing runs: consistent
* 1-4 passing runs, 1-4 failing runs: inconsistent, aka flaked

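The rule in the list above can be written as a short function. This is a sketch of the definition only, not the logic the metric queries use; `classify_runs` is a hypothetical name, and it treats a commit with no runs as an error because the definition does not cover that case.

```python
def classify_runs(passed, failed):
    """Label a set of runs at one commit as 'consistent' or 'flaked'."""
    if passed < 0 or failed < 0 or passed + failed == 0:
        raise ValueError("need at least one run and non-negative counts")
    # All runs agree (all pass or all fail): consistent. Any mix: flaked.
    return "consistent" if passed == 0 or failed == 0 else "flaked"


for passed, failed in [(5, 0), (0, 5), (3, 2)]:
    print(passed, failed, classify_runs(passed, failed))
```
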
[Big Query Tables]: https://console.cloud.google.com/bigquery?utm_source=bqui&utm_medium=link&utm_campaign=classic&project=k8s-gubernator