# Testgrid

### Table of Contents
* [Configuration](#configuration)
* [Advanced Configuration](#advanced-configuration)
* [Using the Client](#using-the-client)
* [Unit Testing](#unit-testing)
* [Merging Changes](#merging-changes)


The testgrid site is accessible at https://testgrid.k8s.io. The site is
configured by [`config.yaml`].
Updates to the config are automatically tested and pushed to production.

Testgrid is composed of:
* A list of test groups that contain results for a job over time.
* A list of dashboards that are composed of tabs that display a test group.
* A list of dashboard groups of related dashboards.

## Tips and Tricks

We have a short [video] from the testgrid session at the 2018 contributor summit.

The video demos power features of testgrid, including:
* Sorting
* Filtering
* Graphing
* Grouping
* Dashboard groups
* Summaries

Please have a look!

## Configuration
Open [`config.yaml`] in your favorite editor and:
1. Configure the test groups.
2. Add those test groups to one or more tabs in one or more dashboards.
3. Consider using dashboard groups if multiple dashboards are needed.

### Test groups
Test groups contain a set of test results across time for the same job. Each group backs one or more dashboard tabs.

Add a new test group under `test_groups:`, specifying the group's name and where the logs are located.

Ex:

```
test_groups:
- name: {test_group_name}
  gcs_prefix: kubernetes-jenkins/logs/{test_group_name}
```

See the `TestGroup` message in [`config.proto`] for additional fields to
configure like `days_of_results`, `tests_name_policy`, `notifications`, etc.

### Dashboards
#### Tabs
A dashboard tab is a particular view of a test group.
Multiple dashboard tabs can view the same test group in different ways, via different configuration options. All dashboard tabs belong under a dashboard (see below).

#### Dashboards

A dashboard is a set of related dashboard tabs. The dashboard name shows up as the top-level link when viewing TestGrid.

Add a new dashboard under `dashboards` and a new dashboard tab under that.

Ex:

```
dashboards:
- name: {dashboard-name}
  dashboard_tab:
  - name: {dashboard-tab-name}
    test_group_name: {test-group-name}
```

See the `Dashboard` and `DashboardTab` messages in [`config.proto`] for
additional configuration options, such as `notifications`, `file_bug_template`,
`description`, `code_search_url_template`, etc.

#### Dashboard groups
A dashboard group is a set of related dashboards. When viewing a dashboard's tabs, you'll see the other dashboards in the Dashboard Group at the top of the client.

Add a new dashboard group, specifying names for all the dashboards that fall under this group.

Ex:

```
dashboard_groups:
- name: {dashboard-group-name}
  dashboard_names:
  - {dashboard-1}
  - {dashboard-2}
  - {dashboard-3}
```

## Advanced configuration
See [`config.proto`] for an extensive list of configuration options. Here are some commonly-used ones.

### More/Fewer Results
Specify `days_of_results` in a test group to increase or decrease the number of days of results shown.

```
test_groups:
- name: kubernetes-build
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-build
  days_of_results: 7
```

### Tab descriptions
Add a short description to a dashboard tab describing its purpose.

```
dashboard_tab:
- name: gce
  test_group_name: ci-kubernetes-e2e-gce
  base_options: 'include-filter-by-regex=Kubectl%7Ckubectl'
  description: 'kubectl gce e2e tests for master branch'
```

### Column headers
TestGrid shows date, build number, and k8s and test-infra commit shas above
each run's results by default. To add your own custom column headers, add a
key-value pair in your tests' metadata (see [metadata for
finished.json](https://github.com/kubernetes/test-infra/tree/master/gubernator#job-artifact-gcs-layout)),
and add the key for that pair as a `configuration_value` under `column_header`
for your test group. Example:

```
test_groups:
- name: ci-kubernetes-e2e-gce-ubuntudev-k8sdev-default
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-ubuntudev-k8sdev-default
  column_header:
  - configuration_value: node_os_image
  - configuration_value: master_os_image
  - configuration_value: Commit
  - configuration_value: infra-commit
```

### Email alerts
In TestGroup, set `num_failures_to_alert` (alerts for consistent failures)
and/or `alert_stale_results_hours` (alerts when tests haven't run recently).
You can also set `num_passes_to_disable_alert`.

In DashboardTab, set `alert_mail_to_addresses` (comma-separated list of email
addresses to send mail to).

These alerts are sent whenever new failures are detected (or whenever the
dashboard tab goes stale), and stop when `num_passes_to_disable_alert`
consecutive passes are found (or no failure is found in `num_columns_recent`
runs).

```
# Send alerts to foo@bar.com whenever a test fails 3 times in a row, or tests
# haven't run in the last day.
test_groups:
- name: ci-kubernetes-e2e-gce
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-e2e-gce
  alert_stale_results_hours: 24
  num_failures_to_alert: 3

dashboards:
- name: google-gce
  dashboard_tab:
  - name: gce
    test_group_name: ci-kubernetes-e2e-gce
    alert_options:
      alert_mail_to_addresses: 'foo@bar.com'
```


### Base options
Set default client modifiers that are applied when viewing this dashboard tab.

```
# Show test cases from ci-kubernetes-e2e-gce, but only if the test has 'Kubectl' or 'kubectl' in the name.
dashboard_tab:
- name: gce
  test_group_name: ci-kubernetes-e2e-gce
  base_options: 'include-filter-by-regex=Kubectl%7Ckubectl'
  description: 'kubectl gce e2e tests for master branch'
```

### More informative test names
If you run multiple versions of a test against different parameters, show which parameters they ran with after the test name.

```
# Show a test case as "{test_case_name} [{Context}]"
- name: ci-kubernetes-node-kubelet-benchmark
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-node-kubelet-benchmark
  test_name_config:
    name_elements:
    - target_config: Tests name
    - target_config: Context
    name_format: '%s [%s]'
```

### Customize regression search
Narrow down where to search when searching for a regression between two builds/commits.

```
dashboard_tab:
- name: bazel
  description: Runs bazel test //... on the test-infra repo.
  test_group_name: ci-test-infra-bazel
  code_search_url_template:
    url: https://github.com/kubernetes/test-infra/compare/<start-custom-0>...<end-custom-0>
```

### Notifications
Testgrid supports notifications, which appear as a yellow
butter bar / toast message at the top of the screen.

This is an effective way to broadcast system wide information (all
FOO suites are failing due to blah, upgrade frobber to vX before the
weekend, etc.)

Configure the list of `notifications:` under dashboard or testgroup:
Each notification includes a `summary:` that defines the text displayed.
Notifications benefit from including a `context_link:` url that can be clicked
to provide more information.

Ex:

```
dashboards:
- name: k8s
  dashboard_tab:
  - name: build
    test_group_name: kubernetes-build
  notifications:  # Attach to a specific dashboard
  - summary: Hello world (first notification).
  - summary: Tests are failing to start (second notification).
    context_link: https://github.com/kubernetes/kubernetes/issues/123
```

or

```
test_groups:  # Attach to a specific test_group
- name: kubernetes-build
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-build
  notifications:
  - summary: Hello world (first notification)
  - summary: Tests are failing to start (second notification).
    context_link: https://github.com/kubernetes/kubernetes/issues/123
```

### What Counts as 'Recent'
Configure `num_columns_recent` to change how many columns TestGrid should consider 'recent' for results.
TestGrid uses this to calculate things like 'is this test stale?' (and hides the test).

```
test_groups:
- name: kubernetes-build
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-build
  num_columns_recent: 3
```

### Ignore Pending Results
`ignore_pending` is false by default, which means that in-progress results will
be shown if we have data for them.
To hide in-progress results, add:

```
test_groups:
- name: kubernetes-build
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-build
  ignore_pending: true
```

### Showing a metric in the cells
Specify `short_text_metric` to display a custom numeric metric in the TestGrid cells. Example:

```
test_groups:
- name: ci-kubernetes-coverage-conformance
  gcs_prefix: kubernetes-jenkins/logs/ci-kubernetes-coverage-conformance
  short_text_metric: coverage
```

## Using the client

Here are some quick tips and clarifications for using the TestGrid site!

### Tab Statuses

TestGrid assigns dashboard tabs a status based on recent test runs.

* **PASSING**: No failures found in recent (`num_columns_recent`) test runs.
* **FAILING**: One or more consistent failures in recent test runs.
* **FLAKY**: The tab is neither PASSING nor FAILING. There is at least one
  recent failed result that is not a consistent failure.

### Summary Widget

You can get a small widget showing the status of your dashboard tab, based on
the tab statuses above! For example:

`sig-testing-misc#bazel`: [![sig-testing-misc/bazel](https://testgrid.k8s.io/q/summary/sig-testing-misc/bazel/tests_status?style=svg)](https://testgrid.k8s.io/sig-testing-misc#bazel)

Inline it with:

```
<!-- Inline with a link to your tab -->
[![<dashboard_name>/<tab_name>](https://testgrid.k8s.io/q/summary/<dashboard_name>/<tab_name>/tests_status?style=svg)](https://testgrid.k8s.io/<dashboard_name>#<tab_name>)
```

### Customizing Test Result Sizes

Change the size of the test result rectangles.

The three sizes are Standard, Compact, and Super Compact. You can also specify
`width=X` in the URL (X > 3) to customize the width. For small widths, this may
mean the date and/or changelist, or other custom headers, are no longer
visible.

### Filtering Tests

You can repeatedly add filters to include/exclude test rows. Under **Options**:

* **Include/Exclude Filter by RegEx**: Specify a regular expression that
  matches test names for rows you'd like to include/exclude.
* **Exclude non-failed Tests**: Omit rows with no failing results.

### Grouping Tests

Grouped tests are summarized in a single row that is collapsible/expandable by
clicking on the test name (shown as a triangle on the left). Under **Options**:

* **Group by RegEx Mask**: Specify a regular expression to mask a portion of
  the test name. Any test names that match after applying this mask will be
  grouped together.
* **Group by Target**: Any tests that contain the same target will be
  grouped together.
* **Group by Hierarchy Pattern**: Specify a regular expression that matches
  one or more parts of the tests' names and the tests will be grouped
  hierarchically. For example, if you have these tests in your dashboard:

  ```text
  /test/dir1/target1
  /test/dir1/target2
  /test/dir2/target3
  ```

  By specifying regular expression "\w+", the tests will be organized into:

  ```text
  ▼test
    ▼dir1
      target1
      target2
    ▼dir2
      target3
  ```

### Sorting Tests

Under **Options**:

* **Sort by Failures**: Tests with more recent failures will appear before
  other tests.
* **Sort by Flakiness**: Tests with a higher flakiness score will appear
  before tests with a lower flakiness score. The flakiness score, which is not
  reported, is based on the number of transitions from passing to failing (and
  vice versa) with more weight given to more recent transitions.
* **Sort by Name**: Sort alphabetically.

## Unit testing

Run `bazel test //testgrid/...` to ensure the config is valid.

This finds common problems such as malformed yaml, a tab referring to a
non-existent test group, a test group never appearing on any tab, etc.

Run `bazel test //...` for slightly more advanced testing, such as ensuring that
every job in our CI system appears somewhere in testgrid, etc.

All PRs updating the configuration must pass these tests prior to merging.


## Merging changes

Updates to the testgrid configuration are automatically pushed immediately when
merging a change.

Manually convert the yaml file to the config proto with:
```
bazel run //testgrid/cmd/config -- \
  --yaml=testgrid/config.yaml \
  --print-text \
  --oneshot \
  --output=/tmp/config.pb
  # Or push to gcs
  # --output=gs://my-bucket/config
  # --gcp-service-account=/path/to/foo.json
```

[`config.proto`]: ./config.proto
[`config.yaml`]: ./config.yaml
[video]: https://www.youtube.com/watch?v=jm2l2SLq_yE
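
To make the cross-reference checks mentioned under Unit testing more concrete, here is a minimal Python sketch of two of them: a tab referring to a non-existent test group, and a test group that never appears on any tab. It is not the code behind `bazel test //testgrid/...`. The function name `find_config_problems` and the sample data are hypothetical, and the sketch assumes the config has already been parsed into plain dictionaries shaped like the YAML examples in this document.

```python
from typing import Any, Dict, List


def find_config_problems(config: Dict[str, Any]) -> List[str]:
    """Return readable problems found in a TestGrid-style config dict.

    Hypothetical helper for illustration only; it covers just two of the
    checks the real tests are described as performing.
    """
    problems: List[str] = []
    group_names = {group["name"] for group in config.get("test_groups", [])}

    # Check 1: every dashboard tab must point at a defined test group.
    used_groups = set()
    for dashboard in config.get("dashboards", []):
        for tab in dashboard.get("dashboard_tab", []):
            group = tab.get("test_group_name")
            used_groups.add(group)
            if group not in group_names:
                problems.append(
                    f"tab {dashboard['name']}/{tab['name']} refers to "
                    f"unknown test group {group!r}"
                )

    # Check 2: every test group must back at least one tab.
    for name in sorted(group_names - used_groups):
        problems.append(f"test group {name!r} does not appear on any tab")

    return problems


# Made-up sample config with one misspelled reference and one unused group.
EXAMPLE_CONFIG = {
    "test_groups": [
        {"name": "kubernetes-build",
         "gcs_prefix": "kubernetes-jenkins/logs/ci-kubernetes-build"},
        {"name": "orphan-group",
         "gcs_prefix": "kubernetes-jenkins/logs/orphan-group"},
    ],
    "dashboards": [
        {"name": "k8s",
         "dashboard_tab": [
             {"name": "build", "test_group_name": "kubernetes-build"},
             {"name": "typo", "test_group_name": "kubernetes-biuld"},
         ]},
    ],
}

if __name__ == "__main__":
    for problem in find_config_problems(EXAMPLE_CONFIG):
        print(problem)
```

Running the sketch on the sample data reports the misspelled `kubernetes-biuld` reference and the unused `orphan-group`. The bazel tests remain the authoritative check.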