# TestGrid Configuration

## Table of Contents

* [Prow Job Configuration](#prow-job-configuration)
* [Configuration](#configuration)
* [Testing your configuration](#testing-your-configuration)
* [Advanced configuration](#advanced-configuration)

TestGrid is composed of:

* A list of **test groups** that contain results for a job over time.
* A list of **dashboard tabs** that each display a test group.
* A list of **dashboards**, or collections of dashboard tabs.
* A list of **dashboard groups** of related dashboards.

Most of these objects are simply listed in a [YAML config file][configuration] for TestGrid to consume.

## Prow Job Configuration

If you just have a [Prow job](/prow/jobs.md) configuration you want to appear in an existing
dashboard, add annotations to that Prow job.

Add this to your Prow job:

```yaml
annotations:
  testgrid-dashboards: dashboard-name      # a dashboard already defined in a config.yaml.
  testgrid-tab-name: some-short-name       # optionally, a shorter name for the tab. If omitted, just uses the job name.
  testgrid-alert-email: me@me.com          # optionally, an alert email that will be applied to the tab created in the
                                           # first dashboard specified in testgrid-dashboards.
  description: Words about your job.       # optionally, a description of your job. If omitted, just uses the job name.

  testgrid-num-columns-recent: "10"        # optionally, the number of runs a row can be omitted from before it is
                                           # considered stale. Currently defaults to 10.
  testgrid-num-failures-to-alert: "3"      # optionally, the number of continuous failures before sending an email.
                                           # Currently defaults to 3.
  testgrid-alert-stale-results-hours: "12" # optionally, send an email if this many hours pass with no results at all.
  testgrid-in-cell-metric: coverage        # optionally, text property metric value to be evaluated, with the resulting
                                           # numeric value placed visually inside the test result cells.
  testgrid-base-options: base-options      # optionally, sets 'base_options' tab option.
```
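
For context, here is a minimal sketch of where these annotations might sit in a periodic Prow job. The job name, interval, image, command, and dashboard names are hypothetical placeholders, and the sketch assumes `testgrid-dashboards` accepts a comma-separated list; both dashboards would need to exist in a TestGrid config already.

```yaml
periodics:
- name: ci-example-unit-tests                # hypothetical job name
  interval: 1h
  annotations:
    # Hypothetical dashboards; assumed to be a comma-separated list.
    testgrid-dashboards: example-dashboard, example-dashboard-extra
    testgrid-tab-name: unit-tests            # shorter tab name than the job name
    testgrid-alert-email: me@me.com          # applies to the tab in the first dashboard listed
    testgrid-num-failures-to-alert: "3"
  spec:
    containers:
    - image: golang:1.22                     # hypothetical image
      command:
      - go
      args:
      - test
      - ./...
```

Note that the numeric annotation values are quoted, as in the listing above.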

This functionality is provided by [Configurator](cmd/configurator). If you have Prow jobs in a new
instance of Prow, you may also have to set up [Config Merger](./merging.md).

This is sufficient if you use TestGrid with Prow. If you're using TestGrid independently of Prow,
read on.

## Configuration

Open or create a TestGrid config file [(example)][configuration] in your favorite editor and:

1. Configure the test groups.
2. Add those test groups to one or more tabs in one or more dashboards.
3. Consider using dashboard groups if multiple dashboards are needed.

### Defaults

#### Overall Default.yaml

For testgrid.k8s.io, there is a default.yaml file containing configuration that applies to all test groups and dashboard tabs.
This will rarely need to be changed. To override these defaults, or to set defaults for your own test groups and dashboard tabs, use
a directory default.

#### Directory Default.yaml

To override the overall default.yaml for the configs in a directory, add a file named `default.yaml` to that directory.
This file is then applied to the config files in that directory instead of the overall default.

A directory default does NOT apply to:
- Configs in subdirectories
- Groups or tabs only defined in Prow job configuration

Ex:

Overall default:
```yaml
default_test_group:
  days_of_results: 5
default_dashboard_tab:
  display_local_time: true
```

foo/default.yaml:
```yaml
default_test_group:
  days_of_results: 10
default_dashboard_tab:
  display_local_time: false
```

foo/config.yaml:
```yaml
dashboards:
- name: dash_1
  dashboard_tab:
  - name: tab_1
test_groups:
- name: testgroup_1
```

resulting config:
```yaml
dashboards:
- name: dash_1
  dashboard_tab:
  - name: tab_1
    display_local_time: false
test_groups:
- name: testgroup_1
  days_of_results: 10
```

The overall default was overridden by foo/default.yaml for the other config files in the foo directory.

### Test groups

Test groups contain a set of test results across time for the same job.
Each group backs one or more dashboard tabs.

Add a new test group under `test_groups:`, specifying the group's name,
and where the logs are located.

Ex:

```yaml
test_groups:
- name: {test_group_name}
  gcs_prefix: kubernetes-jenkins/logs/{test_group_name}
```

See the `TestGroup` message in [`config.proto`] for additional fields to
configure like `days_of_results`, `tests_name_policy`, `notifications`, etc.

### Dashboard Tabs

A dashboard tab is a particular view of a test group. Multiple dashboard tabs can view the same
test group in different ways, via different configuration options. All dashboard tabs belong under
a dashboard (see below).
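
As a rough sketch of two tabs viewing one test group differently, the fragment below defines one unfiltered tab and one tab that only shows tests with 'Kubectl' or 'kubectl' in the name. The dashboard and tab names are hypothetical; the `dashboards` and `base_options` fields are the ones described elsewhere in this document.

```yaml
dashboards:
- name: example-dashboard                # hypothetical dashboard name
  dashboard_tab:
  - name: gce-all                        # hypothetical tab: every test in the group
    test_group_name: ci-kubernetes-e2e-gce
  - name: gce-kubectl                    # hypothetical tab: filtered view of the same group
    test_group_name: ci-kubernetes-e2e-gce
    base_options: 'include-filter-by-regex=Kubectl%7Ckubectl'
```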

### Dashboards

A dashboard is a set of related dashboard tabs.  The dashboard name shows up as the top-level link
when viewing TestGrid.

Add a new dashboard under `dashboards` and a new dashboard tab under that.

Ex:

```yaml
dashboards:
- name: {dashboard-name}
  dashboard_tab:
  - name: {dashboard-tab-name}
    test_group_name: {test-group-name}
```

See the `Dashboard` and `DashboardTab` messages in [`config.proto`] for
additional configuration options, such as `notifications`, `file_bug_template`,
`description`, `code_search_url_template`, etc.

### Dashboard groups

A dashboard group is a set of related dashboards. When viewing a dashboard's tabs, you'll see the
other dashboards in the Dashboard Group at the top of the client.

Add a new dashboard group, specifying names for all the dashboards that fall under this group.

Ex:

```yaml
dashboard_groups:
- name: {dashboard-group-name}
  dashboard_names:
  - {dashboard-1}
  - {dashboard-2}
  - {dashboard-3}
```
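
Putting the pieces of this section together, a complete config file might look roughly like the sketch below. Every name and the GCS path are hypothetical placeholders; only the field names come from the examples above.

```yaml
test_groups:
- name: example-job                               # hypothetical test group
  gcs_prefix: example-bucket/logs/example-job     # hypothetical log location

dashboards:
- name: example-group-main                        # hypothetical dashboard
  dashboard_tab:
  - name: example-tab
    test_group_name: example-job                  # must match a test group name above
- name: example-group-extra                       # second hypothetical dashboard
  dashboard_tab:
  - name: example-tab-again
    test_group_name: example-job                  # the same group can back several tabs

dashboard_groups:
- name: example-group                             # hypothetical dashboard group
  dashboard_names:                                # must match dashboard names above
  - example-group-main
  - example-group-extra
```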

## Testing your configuration

Run [`go test ./config/tests/testgrids`](/config/tests/testgrids) to ensure the configuration is valid.

## Advanced configuration

See [`config.proto`] for an extensive list of configuration options. Here are some commonly-used ones.

### More/Fewer Results

Specify `days_of_results` in a test group to increase or decrease the number of days of results shown.

```yaml
test_groups:
- name: kubernetes-build
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-build
  days_of_results: 7
```

### Tab descriptions

Add a short description to a dashboard tab describing its purpose.

```yaml
  dashboard_tab:
  - name: gce
    test_group_name: ci-kubernetes-e2e-gce
    base_options: 'include-filter-by-regex=Kubectl%7Ckubectl'
    description: 'kubectl gce e2e tests for master branch'
```

### Link/Bug/Regression-Search Templates

`(DashboardTab) open_test_template` `(DashboardTab) open_bug_template`
`(DashboardTab) file_bug_template` `(DashboardTab) results_url_template`
`(DashboardTab) code_search_url_template`

Need to change what links TestGrid opens for tests, bugs, or regression search?
Customize them with templates!

* `open_test_template`: The test result link when clicking on a cell (e.g. Spyglass)
* `open_bug_template`: The bug link when clicking on associated bugs (e.g. GitHub)
* `file_bug_template`: The default info when filing an issue through the client (e.g. GitHub)
* `attach_bug_template`: The default info when attaching a target to an existing bug (e.g. GitHub)
* `results_url_template`: The link to all test runs (e.g. Deck)
* `code_search_url_template`: The link when searching a code base for a regression (e.g. GitHub)

Link templates can include fields, which are replaced with the corresponding value when the link is built.

To URL-encode part of a template, such as a field (see JavaScript's `encodeURIComponent()`),
specify `<encode: [what-to-encode]>`.

e.g. `url = "http://test/<encode:<test-name>>"`

Fields for `open_test_template`, `open_bug_template`, `file_bug_template`,
`results_url_template`:

* `<environment>`: The tab name.
* `<test-status>`: String description of the cell's test status (e.g. 'Failed').
* `<test-id>`: Run ID for a cell.
* `<test-name>`: The test name.
* `<display-name>`: The name of the test, as displayed in TestGrid.
* `<gcs_prefix>`: `gcs_prefix` (as defined in your test_group's config).
* `<custom-N>`: The value of the Nth [custom column header](#column-headers) (as defined in
    your test_group's config).
* `<results-explorer>`: The current URL (e.g. `https://testgrid.k8s.io/some-dash#some-tab`).
* `<test-url>`: The resulting URL from applying `open_test_template` on this cell.
* `<cs-path>`: `code_search_path` (as defined in your test_group's config).
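
A sketch of how these fields might be used in a dashboard tab follows. The `results.example.com` URLs are hypothetical, and the `options` key/value list under `file_bug_template` is an assumption about the `LinkTemplate` message; check [`config.proto`] before relying on it.

```yaml
  dashboard_tab:
  - name: gce
    test_group_name: ci-kubernetes-e2e-gce
    open_test_template:
      # Hypothetical results viewer; the fields are filled in per cell.
      url: https://results.example.com/view/<gcs_prefix>/<test-id>
    results_url_template:
      # Hypothetical search page; the test name is URL-encoded before substitution.
      url: https://results.example.com/search?test=<encode:<test-name>>
    file_bug_template:
      url: https://github.com/kubernetes/kubernetes/issues/new
      options:                       # assumed: extra parameters for the new-issue link
      - key: title
        value: '<test-name> is <test-status>'
      - key: body
        value: <test-url>
```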

Fields for `code_search_url_template` (compared between two columns in
TestGrid):

* `<start-cl>`: The earlier CL/build ID in the comparison
* `<end-cl>`: The later CL/build ID
* `<start-custom-N>`: The earlier custom column header value (see `<custom-N>` above)
* `<end-custom-N>`: The later custom column header value

### Column headers

TestGrid shows date, build number, and k8s and test-infra commit SHAs above
each run's results by default. To add your own custom column headers, add a
key-value pair in your tests' metadata (see [metadata for
finished.json](https://github.com/kubernetes/test-infra/tree/master/gubernator#job-artifact-gcs-layout)),
and add the key for that pair as a `configuration_value` under `column_header`
for your test group. Example:

```yaml
test_groups:
- name: ci-kubernetes-e2e-gce-ubuntudev-k8sdev-default
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-ubuntudev-k8sdev-default
  column_header:
  - configuration_value: Commit
  - configuration_value: my_custom_key
```

### Email alerts

In TestGroup, set `num_failures_to_alert` (alerts for consistent failures)
and/or `alert_stale_results_hours` (alerts when tests haven't run recently).
You can also set `num_passes_to_disable_alert`.

In DashboardTab, set `alert_mail_to_addresses` (comma-separated list of email
addresses to send mail to).

Additional options for DashboardTab alerts:

* `num_passes_to_disable_alert`: the number of consecutive test passes to close the alert
* `subject`: custom subject for alert mails
* `debug_url`: custom link for further context/instructions on debugging this alert
* `debug_message`: custom text to show for the debug link; `debug_url` is required for `debug_message` to appear

These alerts will send whenever new failures are detected (or whenever the
dashboard tab goes stale), and will stop when `num_passes_to_disable_alert`
consecutive passes are found (or no failure is found in `num_columns_recent`
runs).

```yaml
# Send alerts to foo@bar.com whenever a test fails 3 times in a row, or tests
# haven't run in the last day.
test_groups:
- name: ci-kubernetes-e2e-gce
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-e2e-gce
  alert_stale_results_hours: 24
  num_failures_to_alert: 3

dashboards:
- name: google-gce
  dashboard_tab:
  - name: gce
    test_group_name: ci-kubernetes-e2e-gce
    alert_options:
      alert_mail_to_addresses: 'foo@bar.com'
```
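
The additional DashboardTab alert options listed above could be combined along these lines. This is a sketch that assumes they sit under `alert_options` next to `alert_mail_to_addresses`; the second address, subject, debug URL, and message are hypothetical values.

```yaml
dashboards:
- name: google-gce
  dashboard_tab:
  - name: gce
    test_group_name: ci-kubernetes-e2e-gce
    alert_options:
      alert_mail_to_addresses: 'foo@bar.com,oncall@bar.com'   # comma-separated; second address is hypothetical
      num_passes_to_disable_alert: 2                          # close the alert after 2 consecutive passes
      subject: 'gce e2e tests are failing'                    # hypothetical subject line
      debug_url: https://example.com/playbooks/gce-e2e        # hypothetical debugging guide
      debug_message: 'How to debug gce e2e failures'          # shown only because debug_url is set
```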

### Base options

Default to a set of client modifiers when viewing this dashboard tab.

```yaml
# Show test cases from ci-kubernetes-e2e-gce, but only if the test has 'Kubectl' or 'kubectl' in the name.
  dashboard_tab:
  - name: gce
    test_group_name: ci-kubernetes-e2e-gce
    base_options: 'include-filter-by-regex=Kubectl%7Ckubectl'
    description: 'kubectl gce e2e tests for master branch'
```

### More informative test names

If you run multiple versions of a test against different parameters, show which parameters they ran with after the test name.

```yaml
# Show a test case as "{test_case_name} [{Context}]"
test_groups:
- name: ci-kubernetes-node-kubelet-benchmark
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-node-kubelet-benchmark
  test_name_config:
    name_elements:
    - target_config: Tests name
    - target_config: Context
    name_format: '%s [%s]'
```

### Customize regression search

Narrow down where to search when searching for a regression between two builds/commits.

```yaml
  dashboard_tab:
  - name: bazel
    description: Runs bazel test //... on the test-infra repo.
    test_group_name: ci-test-infra-bazel
    code_search_url_template:
      url: https://github.com/kubernetes/test-infra/compare/<start-custom-0>...<end-custom-0>
```

### Notifications

TestGrid supports notifications, which appear as a yellow
butter bar / toast message at the top of the screen.

This is an effective way to broadcast system-wide information (all
FOO suites are failing due to blah, upgrade frobber to vX before the
weekend, etc.).

Configure the list of `notifications:` under a dashboard or test group.
Each notification includes a `summary:` that defines the text displayed.
Notifications benefit from including a `context_link:` URL that can be clicked
to provide more information.

Ex:

```yaml
dashboards:
- name: k8s
  dashboard_tab:
  - name: build
    test_group_name: kubernetes-build
  notifications:  # Attach to a specific dashboard
  - summary: Hello world (first notification).
  - summary: Tests are failing to start (second notification).
    context_link: https://github.com/kubernetes/kubernetes/issues/123
```

or

```yaml
test_groups:  # Attach to a specific test_group
- name: kubernetes-build
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-build
  notifications:
  - summary: Hello world (first notification)
  - summary: Tests are failing to start (second notification).
    context_link: https://github.com/kubernetes/kubernetes/issues/123
```

### What Counts as 'Recent'

Configure `num_columns_recent` to change how many columns TestGrid should consider 'recent' for results.
TestGrid uses this to calculate things like 'is this test stale?' (and hides the test).

```yaml
test_groups:
- name: kubernetes-build
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-build
  num_columns_recent: 3
```

### Long-Running Tests

If your tests run for a very long time (more than 24 hours), set
`max_test_runtime_hours`.

```yaml
# This test group has tests that run for 48 hours; set a high max runtime.
test_groups:
- name: some-tests
  gcs_prefix: path/to/test/logs/some-tests
  max_test_runtime_hours: 50  # Leave a small buffer just in case.
```

### Ignore Pending Results

`ignore_pending` is false by default, which means that in-progress results will
be shown if we have data for them. If you want to have these not show up, add:

```yaml
test_groups:
- name: kubernetes-build
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-build
  ignore_pending: true
```

### Showing a metric in the cells

Specify `short_text_metric` to display a custom numeric metric in the TestGrid cells. Example:

```yaml
test_groups:
- name: ci-kubernetes-coverage-conformance
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-coverage-conformance
  short_text_metric: coverage
```

[`config.proto`]: ./config/config.proto
[configuration]: /config/testgrids