# Testgrid

The testgrid site is accessible at https://testgrid.k8s.io. The site is
configured by [`config.yaml`].
Updates to the config are automatically tested and pushed to production.

Testgrid is composed of:
* A list of test groups that contain results for a job over time.
* A list of dashboards that are composed of tabs that display a test group.
* A list of dashboard groups of related dashboards.

## Configuration
Open [`config.yaml`] in your favorite editor and:
1. Configure the test groups.
2. Add those test groups to one or more tabs in one or more dashboards.
3. Consider using dashboard groups if multiple dashboards are needed.

### Test groups
Test groups contain a set of test results across time for the same job. Each group backs one or more dashboard tabs.

Add a new test group under `test_groups:`, specifying the group's name and where the logs are located.

Ex:

```
test_groups:
- name: {test_group_name}
  gcs_prefix: kubernetes-jenkins/logs/{test_group_name}
```

See the `TestGroup` message in [`config.proto`] for additional fields to
configure like `days_of_results`, `tests_name_policy`, `notifications`, etc.

### Dashboards
#### Tabs
A dashboard tab is a particular view of a test group. Multiple dashboard tabs can view the same test group in different ways, via different configuration options. All dashboard tabs belong under a dashboard (see below).

#### Dashboards

A dashboard is a set of related dashboard tabs. The dashboard name shows up as the top-level link when viewing TestGrid.

Add a new dashboard under `dashboards` and a new dashboard tab under that.

Ex:

```
dashboards:
- name: {dashboard-name}
  dashboard_tab:
  - name: {dashboard-tab-name}
    test_group_name: {test-group-name}
```

See the `Dashboard` and `DashboardTab` messages in [`config.proto`] for
additional configuration options, such as `notifications`, `file_bug_template`,
`description`, `code_search_url_template`, etc.

#### Dashboard groups
A dashboard group is a set of related dashboards. When viewing a dashboard's tabs, you'll see the other dashboards in the Dashboard Group at the top of the client.

Add a new dashboard group, specifying names for all the dashboards that fall under this group.

Ex:

```
dashboard_groups:
- name: {dashboard-group-name}
  dashboard_names:
  - {dashboard-1}
  - {dashboard-2}
  - {dashboard-3}
```

## Advanced configuration
See [`config.proto`] for an extensive list of configuration options. Here are some commonly-used ones.

### More/Fewer Results
Specify `days_of_results` in a test group to increase or decrease the number of days of results shown.

```
test_groups:
- name: kubernetes-build
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-build
  days_of_results: 7
```

### Tab descriptions
Add a short description to a dashboard tab describing its purpose.

```
  dashboard_tab:
  - name: gce
    test_group_name: ci-kubernetes-e2e-gce
    base_options: 'include-filter-by-regex=Kubectl%7Ckubectl'
    description: 'kubectl gce e2e tests for master branch'
```

### Email alerts
In TestGroup, set `num_failures_to_alert` (alerts for consistent failures)
and/or `alert_stale_results_hours` (alerts when tests haven't run recently).

In DashboardTab, set `alert_mail_to_addresses` under `alert_options`
(comma-separated list of email addresses to send mail to).

These alerts are sent whenever new failures are detected (or whenever the
dashboard tab goes stale).

```
# Send alerts to foo@bar.com whenever a test fails 3 times in a row, or tests
# haven't run in the last day.
test_groups:
- name: ci-kubernetes-e2e-gce
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-e2e-gce
  alert_stale_results_hours: 24
  num_failures_to_alert: 3

dashboards:
- name: google-gce
  dashboard_tab:
  - name: gce
    test_group_name: ci-kubernetes-e2e-gce
    alert_options:
      alert_mail_to_addresses: 'foo@bar.com'
```

### Base options
Set `base_options` to apply a default set of client modifiers whenever this dashboard tab is viewed. The value is written like a URL query string; in the example below, `%7C` is a URL-encoded `|`.

```
# Show test cases from ci-kubernetes-e2e-gce, but only if the test has 'Kubectl' or 'kubectl' in the name.
  dashboard_tab:
  - name: gce
    test_group_name: ci-kubernetes-e2e-gce
    base_options: 'include-filter-by-regex=Kubectl%7Ckubectl'
    description: 'kubectl gce e2e tests for master branch'
```
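
The encoded regular expression in `base_options` can be produced with any URL-encoding tool. The following is a minimal Python sketch using only the standard library; `build_base_options` is a hypothetical helper name used for illustration and is not part of TestGrid.

```python
from urllib.parse import quote, unquote


def build_base_options(option: str, value: str) -> str:
    """Return 'option=value' with the value percent-encoded.

    Hypothetical helper: it only illustrates how the string in the
    example above could be derived.
    """
    # safe="" percent-encodes every reserved character, including '|'.
    return f"{option}={quote(value, safe='')}"


encoded = build_base_options("include-filter-by-regex", "Kubectl|kubectl")
print(encoded)  # include-filter-by-regex=Kubectl%7Ckubectl

# Decoding the value recovers the original regular expression.
print(unquote(encoded.split("=", 1)[1]))  # Kubectl|kubectl
```

Paste the printed string into the `base_options` field, quoted as in the YAML example.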

### More informative test names
If you run multiple versions of a test against different parameters, show which parameters they ran with after the test name.

```
# Show a test case as "{test_case_name} [{Context}]"
- name: ci-kubernetes-node-kubelet-benchmark
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-node-kubelet-benchmark
  test_name_config:
    name_elements:
    - target_config: Tests name
    - target_config: Context
    name_format: '%s [%s]'
```
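
To see how `name_format` and `name_elements` combine, here is a small Python sketch. It assumes each `%s` is filled, in order, with the value of the corresponding name element, which matches the comment in the example above. `render_test_name` and the sample values are hypothetical.

```python
def render_test_name(name_format: str, elements: list[str], values: dict[str, str]) -> str:
    """Fill each %s in name_format with the value of the matching element.

    Assumption: elements are substituted in the order they are listed
    under name_elements.
    """
    return name_format % tuple(values[element] for element in elements)


# Hypothetical values reported for a single test result.
values = {"Tests name": "resource tracking", "Context": "n1-standard-1"}

print(render_test_name("%s [%s]", ["Tests name", "Context"], values))
# resource tracking [n1-standard-1]
```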

### Customize regression search
Narrow down where to search when searching for a regression between two builds/commits.

```
  dashboard_tab:
  - name: bazel
    description: Runs bazel test //... on the test-infra repo.
    test_group_name: ci-test-infra-bazel
    code_search_url_template:
      url: https://github.com/kubernetes/test-infra/compare/<start-custom-0>...<end-custom-0>
```
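
The sketch below shows how such a template might expand into a concrete link. It assumes that `<start-custom-0>` and `<end-custom-0>` are replaced with values taken from the two builds being compared (commit hashes here); both the helper name and the sample hashes are hypothetical, so check [`config.proto`] for the exact placeholder semantics.

```python
def regression_search_url(template: str, start: str, end: str) -> str:
    """Substitute the start/end placeholders in a code search URL template.

    Assumption: the placeholders carry the values of the earlier and later
    build, respectively.
    """
    return template.replace("<start-custom-0>", start).replace("<end-custom-0>", end)


template = (
    "https://github.com/kubernetes/test-infra/compare/"
    "<start-custom-0>...<end-custom-0>"
)

# Hypothetical commit hashes for the last passing and first failing build.
print(regression_search_url(template, "abc123", "def456"))
# https://github.com/kubernetes/test-infra/compare/abc123...def456
```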

### Notifications
Testgrid supports notifications, which appear as a yellow
butter bar / toast message at the top of the screen.

This is an effective way to broadcast system-wide information (all
FOO suites are failing due to blah, upgrade frobber to vX before the
weekend, etc.).

Configure the list of `notifications:` under a dashboard or test group.
Each notification includes a `summary:` that defines the text displayed.
Notifications benefit from including a `context_link:` URL that can be clicked
to provide more information.

Ex:

```
dashboards:
- name: k8s
  dashboard_tab:
  - name: build
    test_group_name: kubernetes-build
  notifications:  # Attach to a specific dashboard
  - summary: Hello world (first notification).
  - summary: Tests are failing to start (second notification).
    context_link: https://github.com/kubernetes/kubernetes/issues/123
```

or

```
test_groups:  # Attach to a specific test_group
- name: kubernetes-build
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-build
  notifications:
  - summary: Hello world (first notification).
  - summary: Tests are failing to start (second notification).
    context_link: https://github.com/kubernetes/kubernetes/issues/123
```

### What Counts as 'Recent'
Configure `num_columns_recent` to change how many columns TestGrid considers 'recent' for results.
TestGrid uses this value for calculations such as deciding whether a test is stale (stale tests are hidden).

```
test_groups:
- name: kubernetes-build
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-build
  num_columns_recent: 3
```

### Ignore Pending Results
`ignore_pending` is false by default, which means that in-progress results are
shown if data exists for them. To hide in-progress results, add:

```
test_groups:
- name: kubernetes-build
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-build
  ignore_pending: true
```

## Using the client

Here are some quick tips and clarifications for using the TestGrid site!

### Tab Statuses

TestGrid assigns dashboard tabs a status based on recent test runs.

 *  **PASSING**: No failures found in recent (`num_columns_recent`) test runs.
 *  **FAILING**: One or more consistent failures in recent test runs.
 *  **FLAKY**: The tab is neither PASSING nor FAILING. There is at least one
    recent failed result that is not a consistent failure.
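
The three statuses can be sketched as a small classification function. This is not TestGrid's implementation: it assumes a "consistent failure" means a test whose most recent `num_failures_to_alert` results all failed (with `num_failures_to_alert` at least 1), and it models each result as the string `"PASS"` or `"FAIL"`. The function name and data layout are hypothetical.

```python
def tab_status(rows: dict[str, list[str]], num_columns_recent: int,
               num_failures_to_alert: int) -> str:
    """Classify a tab from per-test results listed newest first.

    Assumption: a consistent failure is a test whose newest
    num_failures_to_alert results (within the recent window) all failed.
    """
    # Only the newest num_columns_recent results of each test count.
    recent = {name: results[:num_columns_recent] for name, results in rows.items()}

    if not any("FAIL" in results for results in recent.values()):
        return "PASSING"

    for results in recent.values():
        streak = results[:num_failures_to_alert]
        if len(streak) == num_failures_to_alert and all(r == "FAIL" for r in streak):
            return "FAILING"

    # Some recent failure exists, but none of them is consistent.
    return "FLAKY"


print(tab_status({"a": ["PASS", "PASS", "PASS"]}, 3, 2))  # PASSING
print(tab_status({"a": ["FAIL", "FAIL", "PASS"]}, 3, 2))  # FAILING
print(tab_status({"a": ["PASS", "FAIL", "PASS"]}, 3, 2))  # FLAKY
```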

### Customizing Test Result Sizes

Change the size of the test result rectangles.

The three sizes are Standard, Compact, and Super Compact. You can also specify
`width=X` in the URL (X > 3) to customize the width. For small widths, this may
mean the date and/or changelist, or other custom headers, are no longer
visible.

### Filtering Tests

You can repeatedly add filters to include/exclude test rows. Under **Options**:

*   **Include/Exclude Filter by RegEx**: Specify a regular expression that
    matches test names for rows you'd like to include/exclude.
*   **Exclude non-failed Tests**: Omit rows with no failing results.
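
The effect of these filters can be approximated in a few lines of Python. This is a sketch for reasoning about which rows remain, not the client's code; it assumes a filter matches anywhere in the test name and supports one include and one exclude pattern at a time. `filter_rows` is a hypothetical name.

```python
import re


def filter_rows(rows, include=None, exclude=None, exclude_non_failed=False):
    """Return the rows that survive the given filters.

    rows maps a test name to its list of results ("PASS" or "FAIL").
    Assumption: patterns are matched anywhere in the name (re.search).
    """
    kept = {}
    for name, results in rows.items():
        if include and not re.search(include, name):
            continue  # Not matched by the include filter.
        if exclude and re.search(exclude, name):
            continue  # Matched by the exclude filter.
        if exclude_non_failed and "FAIL" not in results:
            continue  # Row has no failing results.
        kept[name] = results
    return kept


rows = {
    "Kubectl logs": ["PASS", "FAIL"],
    "kubectl apply": ["PASS", "PASS"],
    "Networking dns": ["FAIL", "FAIL"],
}

print(list(filter_rows(rows, include="Kubectl|kubectl")))
# ['Kubectl logs', 'kubectl apply']
print(list(filter_rows(rows, exclude_non_failed=True)))
# ['Kubectl logs', 'Networking dns']
```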

### Grouping Tests

Grouped tests are summarized in a single row that is collapsible/expandable by
clicking on the test name (shown as a triangle on the left). Under **Options**:

*   **Group by RegEx Mask**: Specify a regular expression to mask a portion of
    the test name. Any test names that match after applying this mask will be
    grouped together.
*   **Group by Target**: Any tests that contain the same target will be
    grouped together.
*   **Group by Hierarchy Pattern**: Specify a regular expression that matches
    one or more parts of the tests' names and the tests will be grouped
    hierarchically. For example, if you have these tests in your dashboard:

    ```text
    /test/dir1/target1
    /test/dir1/target2
    /test/dir2/target3
    ```

    By specifying the regular expression "\w+", the tests will be organized into:

    ```text
    ▼test
      ▼dir1
        target1
        target2
      ▼dir2
        target3
    ```
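
The mask and hierarchy options can be illustrated with Python's `re` module. The sketch assumes that masking means removing the matched text before comparing names, and that the hierarchy is built from the successive matches of the pattern; both function names are hypothetical and the real client may differ in details.

```python
import re
from collections import defaultdict


def group_by_mask(names, mask):
    """Group names that are identical once text matching `mask` is removed."""
    groups = defaultdict(list)
    for name in names:
        groups[re.sub(mask, "", name)].append(name)
    return dict(groups)


def group_by_hierarchy(names, pattern):
    """Nest names by the successive parts that `pattern` matches.

    Note: patterns with capture groups change what re.findall returns,
    so this sketch expects a pattern without groups.
    """
    tree = {}
    for name in names:
        node = tree
        for part in re.findall(pattern, name):
            node = node.setdefault(part, {})
    return tree


tests = ["/test/dir1/target1", "/test/dir1/target2", "/test/dir2/target3"]

print(group_by_hierarchy(tests, r"\w+"))
# {'test': {'dir1': {'target1': {}, 'target2': {}}, 'dir2': {'target3': {}}}}

print(group_by_mask(tests, r"\d+"))
# {'/test/dir/target': ['/test/dir1/target1', '/test/dir1/target2', '/test/dir2/target3']}
```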

### Sorting Tests

Under **Options**:

*   **Sort by Failures**: Tests with more recent failures will appear before
    other tests.
*   **Sort by Flakiness**: Tests with a higher flakiness score will appear
    before tests with a lower flakiness score. The flakiness score, which is not
    reported, is based on the number of transitions from passing to failing (and
    vice versa) with more weight given to more recent transitions.
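
Since the flakiness score is not reported, its exact formula is unknown. The following Python sketch shows one way such a score could behave, assuming an exponential decay so that each older transition counts half as much as the next newer one. The function, the decay factor, and the sample rows are all hypothetical.

```python
def flakiness_score(results: list[str], decay: float = 0.5) -> float:
    """Score a row of results (newest first) by its pass/fail transitions.

    Assumed formula, not TestGrid's: a transition between positions i and
    i + 1 contributes decay ** i, so recent transitions weigh more.
    """
    score = 0.0
    for i in range(len(results) - 1):
        if results[i] != results[i + 1]:
            score += decay ** i
    return score


rows = {
    "steady": ["PASS", "PASS", "PASS", "PASS"],
    "old_flake": ["PASS", "PASS", "PASS", "FAIL"],
    "new_flake": ["FAIL", "PASS", "PASS", "PASS"],
    "very_flaky": ["FAIL", "PASS", "FAIL", "PASS"],
}

# Highest score first, mirroring "Sort by Flakiness".
ordered = sorted(rows, key=lambda name: flakiness_score(rows[name]), reverse=True)
print(ordered)  # ['very_flaky', 'new_flake', 'old_flake', 'steady']
```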

## Unit testing

Run `bazel test //testgrid/...` to ensure the config is valid.

This finds common problems such as malformed yaml, a tab referring to a
non-existent test group, a test group never appearing on any tab, etc.
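
To make two of these checks concrete, here is a Python sketch that inspects a config already loaded into a dictionary. It is an illustration only and does not replace the bazel tests; `validate` is a hypothetical function, and the dictionary mirrors the YAML fields shown earlier (`test_groups`, `dashboards`, `dashboard_tab`, `test_group_name`).

```python
def validate(config: dict) -> list[str]:
    """Report dangling tab references and test groups that no tab uses."""
    groups = {group["name"] for group in config.get("test_groups", [])}
    used = set()
    problems = []

    for dashboard in config.get("dashboards", []):
        for tab in dashboard.get("dashboard_tab", []):
            ref = tab["test_group_name"]
            used.add(ref)
            if ref not in groups:
                problems.append(
                    f"tab {dashboard['name']}/{tab['name']} refers to "
                    f"unknown test group {ref}"
                )

    # Sorted so the report is stable from run to run.
    for name in sorted(groups - used):
        problems.append(f"test group {name} does not appear on any tab")
    return problems


config = {
    "test_groups": [{"name": "kubernetes-build"}, {"name": "unused-group"}],
    "dashboards": [
        {
            "name": "k8s",
            "dashboard_tab": [
                {"name": "build", "test_group_name": "kubernetes-build"},
                {"name": "gce", "test_group_name": "missing-group"},
            ],
        }
    ],
}

for problem in validate(config):
    print(problem)
```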

Run `bazel test //...` for slightly more advanced testing, such as ensuring that
every job in our CI system appears somewhere in testgrid, etc.

All PRs updating the configuration must pass these tests prior to merging.

## Merging changes

Updates to the testgrid configuration are pushed automatically as soon as a
change merges.

It may take some time (around an hour) after merging a change for test results
to first appear.

If for some reason you want to run this manually, do the following:
```
go build ./yaml2proto  # Build the yaml2proto library
go install .  # Install the config converter
config --yaml=config.yaml --output=config.pb.txt  # Run the conversion
```

## Changing `config.proto`
Contact #sig-testing on slack before changing [`config.proto`].

Devs - `config.proto` changes require rebuilding the golang module:

1. Install [`protoc`].
2. Output the go library with `protoc --go_out=pb config.proto`.
3. Search-replace all `json:"foo,omitempty"` with `yaml:"foo,omitempty"`.
   ```
   # Be sure to add back the header
   sed -i -e 's/json:/yaml:/g' pb/config.pb.go
   ```
4. Commit both `config.proto` and `config.pb.go`.

[`config.proto`]: https://github.com/kubernetes/test-infra/blob/master/testgrid/config/config.proto
[`config.yaml`]: https://github.com/kubernetes/test-infra/blob/master/testgrid/config/config.yaml
[`protoc`]: https://github.com/golang/protobuf