# Kata Containers metrics

* [Kata Containers metrics](#kata-containers-metrics)
    * [Goals](#goals)
        * [PR regression checks](#pr-regression-checks)
        * [Developer pre-checking](#developer-pre-checking)
    * [Stability or Performance?](#stability-or-performance)
    * [Requirements](#requirements)
        * [For PR checks](#for-pr-checks)
    * [Categories](#categories)
        * [Time (Speed)](#time-speed)
        * [Density](#density)
        * [Networking](#networking)
        * [Storage](#storage)
    * [Saving Results](#saving-results)
        * [JSON API](#json-api)
            * [`metrics_json_init()`](#metrics_json_init)
            * [`metrics_json_save()`](#metrics_json_save)
            * [`metrics_json_add_fragment(json)`](#metrics_json_add_fragmentjson)
            * [`metrics_json_start_array()`](#metrics_json_start_array)
            * [`metrics_json_add_array_element(json)`](#metrics_json_add_array_elementjson)
            * [`metrics_json_add_array_fragment(json)`](#metrics_json_add_array_fragmentjson)
            * [`metrics_json_close_array_element()`](#metrics_json_close_array_element)
            * [`metrics_json_end_array(name)`](#metrics_json_end_arrayname)
    * [Preserving results](#preserving-results)
    * [Report generator](#report-generator)

This directory contains the metrics tests for Kata Containers.

The tests within this directory have a number of potential use cases:
- CI checks for regressions on PRs
- CI data gathering for master branch merges
- Developer use for pre-checking code changes before raising a PR
- As part of report generation

## Goals

This section details some of the goals for the potential use cases.

### PR regression checks

The goal of the PR CI regression checking is to provide a relatively quick
metrics check and to feed the results directly back to the GitHub PR.

This fast-feedback requirement generally forces a compromise between metric
precision and execution time.

Therefore, there is a separate [script](../.ci/run_metrics_PR_ci.sh) which invokes a
subset of the metrics tests, configured for speed over accuracy.

Having said that, accuracy is still important. If the tests are very noisy, the
CI will either fail to spot regressions smaller than that noise, or will produce
false failures, which are very undesirable in a CI.

### Developer pre-checking

The PR regression check scripts can be executed "by hand", and are thus available
for developers to use as a "pre-check" before submitting a PR. Following this
procedure is particularly prudent for large architectural changes or version
changes of components.

## Stability or Performance?

When creating or configuring metrics tests, we often have to choose, or compromise,
between a more repeatable measurement (less noise and variance) and measuring the
"best performance".

Generally for CI regression checking, we prefer stability over performance, as that
allows a more accurate (narrower bound) check for regressions.

> **NOTE** If you are gathering data to discuss performance, you may therefore *not*
> want to use the CI metrics data, as it may favour stable results over best
> performance.

## Requirements

To maintain the quality of the gathered metrics data and the accuracy of the CI
regression checking, we try to define and stick to some "quality measures" for our metrics.

### For PR checks

The PR CI is generally required to execute within a "reasonable time" to provide timely
feedback to the developers (and not stall the review and development process). To that
end, we relax the quality requirements of the PR CI. The quality requirements are:
- <= 5% run to run variance
- <= 5 minutes runtime per test

## Categories

Kata Containers metrics tend to fall into a set of categories, and we organise the tests
within this folder accordingly.

Each sub-folder contains its own `README` detailing its own tests.

### Time (Speed)

Generally tests that measure the "speed" of the runtime itself, such as the time to
boot into a workload or to kill a container.

This directory does *not* contain "speed" tests that measure network or storage
performance, for instance.

### Density

Tests that measure the size and overheads of the runtime. Generally this is looking at
memory footprint sizes, but could also cover disk space or even CPU consumption.

For further details see the [density tests documentation](density).

### Networking

Tests relating to networking. General items could include:
- bandwidth
- jitter
- latency

For further details see the [network tests documentation](network).

### Storage

Tests relating to the storage (graph, volume) drivers. Measures may include:
- bandwidth
- latency
- jitter
- conformance (to any relevant standards)

For further details see the [storage tests documentation](storage).

## Saving Results

To ensure continuity, and thus enable consistent testing and historical tracking of
results, we provide a bash API to aid storing results in a uniform manner.

### JSON API

The preferred way to store results is through the provided JSON API.

The API provides the following groups of functions:
- A set of functions to init/save the data and add "top level" JSON fragments
- A set of functions to construct arrays of JSON fragments, which are then added as a top level fragment when complete
- A set of functions to construct elements of an array from sub-fragments, and then finalize that element when all fragments are added

Constructing JSON data under bash can be relatively complex. This API does not pretend
to support all possible data constructs or features, and individual tests may find they need
to do some JSON handling themselves before injecting their JSON into the API.

> If you find a common use case that many tests are implementing themselves, then please
> factor out that functionality and consider extending this API.

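A minimal sketch of how these calls typically fit together is shown below. The
library path, the `TEST_NAME` value, and the fragment contents are illustrative
assumptions, not part of the API itself:

```bash
#!/bin/bash
# Illustrative sketch of the basic JSON API flow.
# The library path below is an assumption - adjust it to wherever
# the metrics JSON library lives in your checkout.
source "./lib/json.bash"

# metrics_json_init derives the results file name from TEST_NAME.
TEST_NAME="my-example-test"

metrics_json_init

# Register a fully formed fragment at the top level
# (the field names here are purely illustrative).
metrics_json_add_fragment '{"config": {"runtime": "kata", "units": "ms"}}'

# Write all registered fragments out to the results file.
metrics_json_save
```
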
#### `metrics_json_init()`

Initialise the API. Must be called before all other JSON API calls.
Should be matched by a final call to `metrics_json_save`.

Relies upon the `TEST_NAME` variable to derive the file name the final JSON
data is stored in (under the `metrics/results` directory). If your test generates
multiple `TEST_NAME` sets of data then:
- Ensure you have a matching JSON init/save call pair for each of those sets.
- These sets could be a hangover from a previous CSV-based test - consider using a single JSON file, if possible, to store all the results.

This function may add system level information to the results file as top level
fragments, for example:
- `env` - A fragment containing system level environment information
- `time` - A fragment containing a nanosecond timestamp of when the test was executed

Consider these top level JSON section names to be reserved by the API.

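The exact layout depends on the library version, but conceptually these reserved
fragments sit alongside your test's own fragments at the top level of the results
file. The shape below is purely illustrative:

```json
{
	"env": {
		"Note": "system level environment information"
	},
	"time": {
		"Note": "nanosecond timestamp of when the test was executed"
	},
	"my-test-fragment": {
		"Note": "top level fragments added by the test itself"
	}
}
```
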
#### `metrics_json_save()`

This function saves all registered JSON fragments out to the JSON results file.

> Note: this function will not save any partially registered array fragments. They
> will be lost.

#### `metrics_json_add_fragment(json)`

Add a JSON formatted fragment at the top level.

| Arg    | Description |
| ------ | ----------- |
| `json` | A fully formed JSON fragment |

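For example, a test might build a fragment with a heredoc and then register it.
This is a sketch - the `boot-time` field names and the measurement variable are
illustrative, not part of the API:

```bash
boot_time_ms=42    # illustrative measurement

# Build a fully formed fragment, then add it at the top level.
json="$(cat <<EOF
{
	"boot-time": {
		"Result": ${boot_time_ms},
		"Units": "ms"
	}
}
EOF
)"
metrics_json_add_fragment "$json"
```
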
#### `metrics_json_start_array()`

Initialise the JSON array API subsystem, ready to accept JSON fragments via
`metrics_json_add_array_element`.

This JSON array API subset allows accumulation of multiple entries into a
JSON `[]` array, to later be added as a top level fragment.

#### `metrics_json_add_array_element(json)`

Add a fully formed JSON fragment to the JSON array store.

| Arg    | Description |
| ------ | ----------- |
| `json` | A fully formed JSON fragment |

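A common pattern is to add one element per test iteration and then publish the
whole array as a named top level fragment (see `metrics_json_end_array` below).
This is a sketch - the loop, field names, and array name are illustrative:

```bash
metrics_json_init
metrics_json_start_array

for i in 1 2 3; do
	result=$((i * 10))    # illustrative measurement

	# Each element is itself a fully formed JSON fragment.
	json="{\"iteration\": ${i}, \"Result\": ${result}}"
	metrics_json_add_array_element "$json"
done

# Publish the accumulated array as a top level fragment named "Results".
metrics_json_end_array "Results"
metrics_json_save
```
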
#### `metrics_json_add_array_fragment(json)`

Add a fully formed JSON fragment to the current array element.

| Arg    | Description |
| ------ | ----------- |
| `json` | A fully formed JSON fragment |

#### `metrics_json_close_array_element()`

Finalize (close) the current array element. This incorporates
any array_fragment parts into the current array element, closes that
array element, and resets the in-flight array_fragment store.

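Together with `metrics_json_add_array_fragment`, this supports building a single
array element from several separately gathered measurements. A sketch, assuming
the fragments are key/value parts that the API combines into one element (the
field names and values are illustrative):

```bash
metrics_json_start_array

# Accumulate sub-fragments for one array element...
metrics_json_add_array_fragment '"memory-kb": 10240'
metrics_json_add_array_fragment '"cpu-percent": 3.2'

# ...then seal them into a single element of the array.
metrics_json_close_array_element

# Publish the array as a top level fragment named "usage".
metrics_json_end_array "usage"
```
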
#### `metrics_json_end_array(name)`

Save the accumulated JSON array as a top level fragment, with the
name `name`.

| Arg    | Description |
| ------ | ----------- |
| `name` | The name to be given to the generated top level fragment array |

## Preserving results

The JSON library contains a hook that enables results to be injected to a
data store at the same time they are saved to the results files.

The hook supports transmission via [`curl`](https://curl.haxx.se/) or
[`socat`](http://www.dest-unreach.org/socat/). Configuration is via environment
variables.

| Variable          | Description |
| ----------------- | ----------- |
| `JSON_HOST`       | Destination host address for use with `socat` |
| `JSON_SOCKET`     | Destination socket number for use with `socat` |
| `JSON_URL`        | Destination URL for use with `curl` |
| `JSON_TX_ONELINE` | If set, the JSON will be sent as a single line (CRs and tabs stripped) |

`socat` transmission will only happen if `JSON_HOST` is set. `curl` transmission will only
happen if `JSON_URL` is set. The settings are not mutually exclusive, and both can be
set if necessary.

`JSON_TX_ONELINE` applies to both types of transmission.

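For example, to stream results to both an HTTP endpoint and a raw socket while
running a test (the host name, port, URL, and test invocation below are
illustrative):

```bash
# Send each saved results file to an HTTP endpoint via curl...
export JSON_URL="http://metrics.example.com:8080/api/results"

# ...and to a raw socket via socat.
export JSON_HOST="metrics.example.com"
export JSON_SOCKET="1234"

# Strip CRs and tabs so each transmission is a single line.
export JSON_TX_ONELINE=1

# Run the test as usual; results are transmitted as they are saved.
./run-your-metrics-test.sh
```
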
## Report generator

See the [report generator](report) documentation.