github.com/kubeflow/training-operator@v1.7.0/docs/testing/e2e_testing.md

# How to Write an E2E Test for Kubeflow Training Operator

TODO (andreyvelich): This doc is outdated. Currently, E2Es are located here:
[`sdk/python/test/e2e`](../../sdk/python/test/e2e)

The E2E tests for the Kubeflow Training Operator are implemented as Argo workflows. For more background and details
about Argo (not required for understanding the rest of this document), see the
[kubeflow/testing README](https://github.com/kubeflow/testing/blob/master/README.md).

Test results can be monitored on the [Prow dashboard](http://prow.kubeflow-testing.com/?repo=kubeflow%2Ftraining-operator).

At a high level, the E2E test suites are structured as Python test classes. Each test class contains
one or more tests. A test typically performs the following steps:

- Creates a ksonnet component from a TFJob spec;
- Creates the specified TFJob;
- Verifies some expected results (e.g. number of pods started, job status);
- Deletes the TFJob.
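
The create, verify, delete flow above can be sketched as follows. This is a minimal illustration only: `FakeTFJobClient` and `run_lifecycle_test` are hypothetical names, the client is an in-memory stand-in rather than the real helper modules, and the ksonnet step is omitted.

```python
class FakeTFJobClient:
  """Hypothetical in-memory stand-in for a TFJob API client."""

  def __init__(self):
    self.jobs = {}

  def create(self, name, spec):
    # Pretend the job starts one pod per requested worker and then succeeds.
    self.jobs[name] = {"spec": spec, "pods": spec["workers"], "status": "Succeeded"}

  def get(self, name):
    return self.jobs[name]

  def delete(self, name):
    del self.jobs[name]


def run_lifecycle_test(client, name, spec):
  """Create a job, verify expected results, and always delete it afterwards."""
  client.create(name, spec)
  try:
    job = client.get(name)
    assert job["pods"] == spec["workers"], "unexpected number of pods"
    assert job["status"] == "Succeeded", "job did not succeed"
  finally:
    # Clean up even when a verification fails, so later tests start clean.
    client.delete(name)
  return name not in client.jobs


client = FakeTFJobClient()
cleaned_up = run_lifecycle_test(client, "simple-tfjob", {"workers": 2})
```

Putting the deletion in a `finally` block is a design choice of this sketch: a failed verification should not leave a job behind on a shared test cluster.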

## Adding a Test Method

An example can be found in [simple_tfjob_tests.py](https://github.com/kubeflow/training-operator/blob/master/py/kubeflow/tf_operator/simple_tfjob_tests.py).

A test class can have several test methods. Each method executes a series of user actions (e.g.
starting or deleting a TFJob) and verifies the expected results (e.g. the TFJob exits with the
correct status, pods are deleted, etc.).

Test classes should follow this pattern:

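```python
# Import paths as used in the linked example; adjust if your layout differs.
from kubeflow.testing import test_util
from kubeflow.tf_operator import test_runner


class MyTest(test_util.TestCase):
  def __init__(self, args):
    # Initialize environment
    pass

  def test_case_1(self):
    # Test code
    pass

  def test_case_2(self):
    # Test code
    pass


if __name__ == "__main__":
  test_runner.main(module=__name__)
```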

Ideally, the test code should contain only API calls. Any common functionality used by the test code should
be added to one of the helper modules:

- `k8s_util` - for K8s operations such as querying or deleting a pod
- `ks_util` - for ksonnet operations
- `tf_job_client` - for TFJob-specific operations, such as waiting for the job to reach a certain phase
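
To illustrate the kind of logic that belongs in a helper module rather than in a test method, here is a minimal sketch of a "wait for phase" helper. The function name `wait_for_phase`, its parameters, and the phase strings are assumptions made for this example; they are not the actual `tf_job_client` API.

```python
import time


def wait_for_phase(get_phase, expected, timeout_s=5.0, interval_s=0.1):
  """Poll get_phase() until it returns a value in `expected` or time runs out."""
  deadline = time.monotonic() + timeout_s
  while True:
    phase = get_phase()
    if phase in expected:
      return phase
    if time.monotonic() >= deadline:
      raise TimeoutError(f"last phase {phase!r}, wanted one of {sorted(expected)}")
    time.sleep(interval_s)


# Simulated job that reports Running twice before it reports Succeeded.
phases = iter(["Running", "Running", "Succeeded"])
final_phase = wait_for_phase(lambda: next(phases), {"Succeeded", "Failed"}, interval_s=0.01)
```

Passing the status lookup in as a callable keeps the polling loop independent of any particular client, which also makes it easy to exercise without a cluster.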

## Adding a TFJob Spec

This is needed only if you want to use your own TFJob spec instead of an existing one. An example can be found in
[simple_tfjob_v1.jsonnet](https://github.com/kubeflow/training-operator/tree/master/test/workflows/components/simple_tfjob_v1.jsonnet).
All TFJob specs should be placed in the same directory as the example.

These are similar to actual TFJob specs. Note that many of them use the
[training-operator-test-server](https://github.com/kubeflow/training-operator/tree/master/test/test-server) as the test image.
This gives us more control over when each replica exits and allows us to send specific requests, such as fetching the
runtime TensorFlow config.
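
For orientation, the overall shape of a TFJob manifest can be sketched as a plain Python dict. This is not the jsonnet component format used by the test suite, and the function name `build_tfjob`, the job name, the namespace, and the image reference are all placeholders chosen for this example.

```python
def build_tfjob(name, namespace, image, workers=2):
  """Return a TFJob manifest as a plain dict (a sketch, not a complete spec)."""
  return {
    "apiVersion": "kubeflow.org/v1",
    "kind": "TFJob",
    "metadata": {"name": name, "namespace": namespace},
    "spec": {
      "tfReplicaSpecs": {
        "Worker": {
          "replicas": workers,
          "restartPolicy": "Never",
          "template": {
            "spec": {
              # TFJob expects the training container to be named "tensorflow".
              "containers": [{"name": "tensorflow", "image": image}],
            },
          },
        },
      },
    },
  }


# The image below is a placeholder, not a published test-server image.
job = build_tfjob("simple-tfjob", "kubeflow", "example.com/test-server:latest")
worker = job["spec"]["tfReplicaSpecs"]["Worker"]
```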

## Adding a New Test Class

This is needed only if you are creating a new test class. Creating a new test class is recommended if you are implementing
a new feature and want to group all relevant E2E tests together.

New test classes should be added as Argo workflow steps to the
[workflows.libsonnet](https://github.com/kubeflow/training-operator/blob/master/test/workflows/components/workflows.libsonnet) file.

Under the templates section, add the following to the dag:

```
  {
    name: "my-test",
    template: "my-test",
    dependencies: ["setup-kubeflow"],
  },
```

This configures Argo to run `my-test` after setting up the Kubeflow cluster.

Next, add the following lines toward the end of the file:

```
  $.parts(namespace, name, overrides).e2e(prow_env, bucket).buildTestTemplate(
    "my-test"),
```

This assumes that there is a corresponding Python file named `my_test.py` (note that the template name uses dashes
while the file name uses underscores).
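
The naming convention can be expressed as a one-line mapping. The helper `template_to_module` below is hypothetical and only illustrates the dash-to-underscore rule; it is not the logic used by `buildTestTemplate`.

```python
def template_to_module(template_name):
  """Map an Argo template name such as "my-test" to its Python file name."""
  return template_name.replace("-", "_") + ".py"


module_file = template_to_module("my-test")
```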