# Stonesoup CI Failure Investigation Process

## Investigating CI Infra issues
1. Review the job results
    (You should be able to tell from the Prow job log whether the failure is related to OpenShift CI or to the tests.)
    1. Was the cluster provisioned, and were the tests run?
        - No
            1. Review the log
                - The failure could be related to OpenShift CI itself: no cluster available from the cluster pool, the job failed to provision a cluster, a quay.io outage, and so on.
                - OpenShift CI failures are usually discussed in these channels: **#forum-testplatform**, **#4-dev-triage**, **#announce-testplatform**
            2. Restart the run
                - A restart will not help during an OpenShift CI outage, but it can help when OpenShift CI is under high load, typically when there are no clusters in the cluster pool.
        - Yes, and you can see the test results
2. Review the test results:
    1. Some of the tests failed
        - Review the test logs
            1. Check whether the failures could be related to your PR
            2. Review the issues marked with the label **ci-fail**
                - You can list these issues with this [Jira filter](https://issues.redhat.com/issues/?filter=12405699)
                - The failure could be a known, already reported issue
            3. Use the stack trace and the source code to determine which component or part of the test failed
                - Investigate the OpenShift CI cluster logs
3. Investigate the OpenShift CI cluster logs
    - Every Prow job executed by the CI system generates an artifacts directory containing information about that execution and its results.
    1. From your PR, open the link to the Prow job, then open the **Artifacts** link
    2. Review the logs in these folders:
        - **redhat-appstudio_e2e-tests/redhat-appstudio-e2e/**
            - Stores the xunit files related to the appstudio e2e-tests.
        - **/artifacts/appstudio-e2e-tests/redhat-appstudio-gather/artifacts**
            - Contains information about pipelineruns, pipelines, operators, configuration, Stonesoup Kube APIs, components, applications, environments, and so on.
        - **/artifacts/appstudio-e2e-tests/redhat-appstudio-hypershift-gather/artifacts/**
            - Contains information about PVCs, roles, bindings, configmaps, and so on.
            - Also contains the folder **pods**, which holds the logs from all pods and running services.
                - For example, the log from application-service is named like: application-service_application-service-application-service-controller-manager
            - These artifacts are present only with the hypershift installer.
        - **redhat-appstudio_e2e-tests/gather-extra/**
            - Stores the logs of all cluster pods, events, configmaps, etc.
            - These artifacts are present only when hypershift is not used.
        - More details on all artifacts can be found in the [OpenShift CI documentation](https://docs.ci.openshift.org/docs/how-tos/artifacts/)
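To see which tests failed without reading the whole job log, the xunit files stored under **redhat-appstudio_e2e-tests/redhat-appstudio-e2e/** can be parsed locally. The following is a minimal sketch, assuming the files follow the common JUnit XML layout (`testcase` elements with `failure` or `error` children). The function name `failed_tests` and the sample report are illustrative and are not part of the e2e-tests repository.

```python
import xml.etree.ElementTree as ET


def failed_tests(xunit_xml: str) -> list[tuple[str, str]]:
    """Return (test name, failure message) pairs from a JUnit-style XML report."""
    root = ET.fromstring(xunit_xml.strip())
    failures = []
    # iter() walks the whole tree, so this handles both a single
    # <testsuite> root and a <testsuites> wrapper element.
    for case in root.iter("testcase"):
        for kind in ("failure", "error"):
            problem = case.find(kind)
            if problem is not None:
                # Prefer the message attribute, fall back to the element text.
                message = problem.get("message") or (problem.text or "").strip()
                failures.append((case.get("name", "<unnamed>"), message))
                break
    return failures


# Hypothetical report content, used only for illustration.
SAMPLE_REPORT = """
<testsuites>
  <testsuite name="e2e" tests="2" failures="1">
    <testcase name="creates an application" classname="e2e"/>
    <testcase name="builds a component" classname="e2e">
      <failure message="timed out waiting for the pipelinerun">stack trace here</failure>
    </testcase>
  </testsuite>
</testsuites>
"""

if __name__ == "__main__":
    for name, message in failed_tests(SAMPLE_REPORT):
        print(f"{name}: {message}")
```

The failure messages collected this way can be compared against the known **ci-fail** issues before a new one is reported.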
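Once the artifacts are downloaded, the pod logs can be scanned for suspicious lines instead of opening each file by hand. This is a rough sketch, assuming the gather artifacts were copied to a local directory and that the log files are plain text. The helper name `find_log_lines` and the default keyword `error` are assumptions chosen for illustration.

```python
from pathlib import Path


def find_log_lines(artifacts_dir: str, keyword: str = "error") -> dict[str, list[str]]:
    """Map each file under artifacts_dir to its lines containing keyword (case-insensitive)."""
    matches: dict[str, list[str]] = {}
    needle = keyword.lower()
    base = Path(artifacts_dir)
    for path in sorted(base.rglob("*")):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a log holds non-UTF-8 bytes.
        text = path.read_text(encoding="utf-8", errors="replace")
        hits = [line for line in text.splitlines() if needle in line.lower()]
        if hits:
            # Keys are paths relative to the artifacts directory, with "/" separators.
            matches[path.relative_to(base).as_posix()] = hits
    return matches
```

For example, calling `find_log_lines("./artifacts/pods", "panic")` on a hypothetical local copy of the **pods** folder would return only the log files that mention a panic, together with the matching lines.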
## Reporting and escalating CI Issue
1. Create a Jira issue
    If you don't know the correct component or service, report the issue in the STONEBUGS Jira project with the labels **ci-fail** and **quality**. If you know which component is responsible for the failure, use that component's project and also add the label **ci-fail**.
    The QE team gets a notification when a new issue is created with this label.
    Please include:
    - **Link to the prow job**
    - **Failure message**
    - **Relevant logs**
    - A link to the related Slack thread, if there is one (this can also be helpful)
2. Post the issue in the **#forum-stonesoup-qe** channel (ping **@ic-appstudio-qe**) and in the relevant component channel.
    - You can also raise the issue in the **#wg-developer-stonesoup** channel, and your lead can raise it on the SoS call (and on the PM call and the architects call, if necessary).
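The reporting rules above can be captured in a small helper that assembles the issue fields. This is a sketch only: the nested `fields` layout mirrors the shape commonly used when creating issues through the Jira REST API, but the function name `build_ci_fail_issue`, the `Bug` issue type, and the summary format are assumptions, and nothing here sends a request.

```python
from typing import Optional


def build_ci_fail_issue(
    prow_job_url: str,
    failure_message: str,
    relevant_logs: str,
    slack_thread_url: Optional[str] = None,
    component_project: Optional[str] = None,
) -> dict:
    """Assemble issue fields for a CI failure report (hypothetical helper)."""
    # Unknown component: STONEBUGS with the labels ci-fail and quality.
    # Known component: that component's project with the label ci-fail.
    if component_project is None:
        project, labels = "STONEBUGS", ["ci-fail", "quality"]
    else:
        project, labels = component_project, ["ci-fail"]

    description = [
        f"Link to the prow job: {prow_job_url}",
        f"Failure message: {failure_message}",
        "Relevant logs:",
        relevant_logs,
    ]
    if slack_thread_url:
        description.append(f"Slack thread: {slack_thread_url}")

    return {
        "fields": {
            "project": {"key": project},
            "issuetype": {"name": "Bug"},  # assumption: the issue type is not stated in this document
            "summary": f"CI failure: {failure_message[:80]}",
            "description": "\n".join(description),
            "labels": labels,
        }
    }
```

Keeping the label choice in one place makes it harder to forget **ci-fail**, which is the label that triggers the QE notification.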