<!--
SPDX-FileCopyrightText: 2023 The Crossplane Authors <https://crossplane.io>

SPDX-License-Identifier: CC-BY-4.0
-->
# Testing resources by using Uptest

`Uptest` provides a framework to test resources in an end-to-end pipeline during
the resource configuration process. Together with the example manifest
generation tool, it allows us to avoid manual interventions and shortens testing
processes.

These integration tests are costly because they create real resources in cloud
providers, so they are not executed by default. Instead, a comment must be
posted to the PR to trigger the tests.

Tests can be triggered by adding an expression like the following anywhere in
the comment:

- `/test-examples="provider-azure/examples/kubernetes/cluster.yaml"`
- `/test-examples="provider-aws/examples/s3/bucket.yaml, provider-aws/examples/eks/cluster.yaml"`

A test job can be triggered for only one provider at a time. The provider that
the tests run against is determined by the first element of the comma-separated
list. If the comment contains resources from other providers, those resources
are skipped. So, if you want to run tests for more than one provider, you must
post a separate comment for each provider.

## Debugging Failed Test

After a test fails, it is important to understand what went wrong. To help with
debugging, we push some collected logs to GitHub Action artifacts.
These artifacts contain the following data:

- Dump of Kind Cluster
- Kuttl input files (Applied manifests, assertion files)
- Managed resource yaml outputs

To download the artifacts, first go to the `Summary` page of the relevant job.

Then click the `1` under the `Artifacts` button in the upper right. If the
automated tests ran for more than one provider, this number will be higher.

When you click this, you can see the `Artifacts` list of the job. You can
download the artifact you are interested in by clicking it.

When a test fails, the first place to look is the provider container's logs. In
the test environment, we run the provider with the `-d` flag to see the debug
logs. In the provider logs, it is possible to see all errors caused by the
content of the resource manifest, caused by the configuration, or returned by
the cloud provider.

The yaml output of the managed resources (located in `managed.yaml` at the root
level of the artifact archive) is also very useful for catching errors.

If you have any doubts about the generated kuttl files, please check the
`kuttl-inputs.yaml` file in the archive's root.

## Running Uptest locally

For a faster feedback loop, you might want to run `uptest` locally in your
development setup.

To do so, run the special `uptest-local` target, which accepts `PROVIDER_NAME`
and `EXAMPLE_LIST` arguments, as in the example below.

```bash
make uptest-local PROVIDER_NAME=provider-azure EXAMPLE_LIST="provider-azure/examples/resource/resourcegroup.yaml"
```

You may also provide all the files in a folder like below:

```bash
make uptest-local PROVIDER_NAME=provider-aws EXAMPLE_LIST=$(find provider-aws/examples/secretsmanager/*.yaml | tr '\n' ',')
```

The local invocation is intentionally lightweight and skips the local cluster,
credentials and ProviderConfig setup, assuming you already have it all
configured in your environment.

For a more heavyweight setup, see the `run_automated_tests` target, which is
used in a centralized GitHub Actions invocation.

## Testing Instructions and Known Error Cases

While configuring resources, testing takes the longest, because the
characteristics of cloud providers and services can change.
This testing can be done in two main ways. The first is testing the resources
manually, and the second is using `Uptest`, an automated test tool for
Official Providers.

### Testing Methods

#### Manual Test

Configured resources can be tested manually. This method generally consists of
preparing the environment and then creating the example manifests in the
Kubernetes cluster. The following steps prepare the environment:

1. Obtaining a Kubernetes Cluster: For manual/local effort, generally a Kind
   cluster is sufficient and can be used. For detailed information about Kind see
   [this repo].
   An alternative way to obtain a cluster is: [k3d]

2. Registering the CRDs (Custom Resource Definitions) to Cluster: We need to
   apply the CRD manifests to the cluster. The relevant manifests are located in
   the `package/crds` folder of provider subdirectories such as:
   `provider-aws/package/crds`. To register them, please run the following
   command: `kubectl apply -f package/crds`

3. Create ProviderConfig: The ProviderConfig Custom Resource contains some
   configurations and credentials for the provider. For example, to connect to
   the cloud provider, we use the credentials field of ProviderConfig. To create
   the ProviderConfig with correct credentials, please see [the documentation].

4. Start Provider: For every Custom Resource, there is a controller and these
   controllers are part of the provider. So, to start the reconciliations for
   Custom Resources, we need to run the provider (a collection of controllers).
   There are two ways to run the provider:
   - `make run`: This make target starts the controllers.
   - Running the provider in an IDE: Especially for debugging, you may want to
     use an IDE. To run the provider in an IDE, some program arguments must be
     passed. The following example is for `provider-aws`.
     Values of the `--terraform-version`, `--terraform-provider-source` and
     `--terraform-provider-version` options can be collected from the Makefile
     of the provider: `provider-aws/Makefile`
     - `-d`: Enables debug-level logs. `make run` also runs the provider in
       debug mode.
     - `--terraform-version 1.2.1`: Terraform version.
     - `--terraform-provider-source hashicorp/aws`: Provider source name.
     - `--terraform-provider-version 4.15.1`: Provider version.

Now our preparation steps are completed. It is time to test:

- Create Examples and Start Testing: After completing the steps above, your
  environment is ready for testing. For testing, we need to apply some example
  manifests to the cluster. The manifests in the `examples-generated` folder can
  be used as a first step. Before starting to change these manifests, you should
  move them from the `examples-generated` folder to the `examples` folder. There
  are two main reasons for this. The first is that these manifests are
  regenerated on every `make generate` run to catch the latest changes in the
  resources, so moving them is necessary to preserve your changes. The second
  reason is that we use the `examples` folder as the source for keeping these
  manifests and using them in our automated test effort.

In some cases, these manifests need manual intervention, so you may need to do
some work before they can be applied to a cluster successfully (that is, before
they pass the Kubernetes schema validation). Possible problems you might face:

- The generated manifest is missing one or more required fields.
  So before creating the resource, you must set the required field in the
  manifest.
- In some fields of the generated manifest, the types of the values do not
  match. For example, field X expects a string but the manifest provides an
  integer. In these cases you need to provide the correct type in your example
  YAML manifest.

Successfully applying these example manifests to the cluster is only the
first step. After successfully creating these Managed Resources, we need to
check whether they become ready. So we expect a `True` value for the `Synced`
and `Ready` conditions. To check the statuses of all created example manifests
quickly, you can run the `kubectl get managed` command. We wait for all values
in this list to be `True`.

When all of the `Synced` and `Ready` fields are `True`, the test has completed
successfully! However, if some resources have `False` values, you need to debug
the situation. The main debugging approaches are described in the following
sections.

> [!NOTE]
> To follow the test processes more accurately, we have the `UpToDate`
> status condition. This status condition is visible when you set the
> annotation `upjet.upbound.io/test=true`. Without this annotation you
> cannot see the mentioned condition. Uptest adds this annotation to the tested
> resources, but if you want to see the value of the condition in your local
> environment (during manual tests), you need to add this annotation
> manually. For the goal and details of this status condition please see this
> PR: https://github.com/crossplane/upjet/pull/23

> [!NOTE]
> The resources being created may have dependencies. For example,
> you might actually need resources Y and Z while trying to test resource X.
> Many of the generated examples include these dependencies.
> However, in some cases, there may be missing dependencies. In these cases,
> please add the relevant dependencies to your example manifest. This is
> important both for passing the tests and for providing correct manifests.

#### Automated Tests - Uptest

Configured resources can also be tested by using `Uptest`. This part can be
separated into two main application methods:

##### Using Uptest in GitHub Actions

We have a GitHub workflow called `Automated Tests`. This is an integration test
for Official Providers. This workflow prepares the environment (provisioning a
Kind cluster, creating the ProviderConfig, installing the Provider, etc.) and
runs Uptest with the input manifest list given by the person who triggers the
test.

This `Automated Tests` job can be triggered from the PR that contains the
configuration test work for the related resources/groups. To trigger the
test, you need to leave a comment in the PR in the following format:

`/test-examples="provider-aws/examples/s3/bucket.yaml, provider-aws/examples/eks/cluster.yaml"`

We test using the API group approach for `Automated Tests`. So, we expect the
entire API group's resources to pass the test in a single test run. This means
that when triggering tests, the following type of comment is expected:

`/test-examples="provider-aws/examples/s3"`

This comment will test all the examples of the `s3` group.

**Ignoring Some Resources in Automated Tests**

Some resources require manual intervention such as providing valid public keys
or using on-the-fly values.
These cases can be handled in manual tests, but in
cases where we cannot provide generic values for automated tests, we can skip
some resources in the tests of the relevant group via an annotation:

```yaml
upjet.upbound.io/manual-intervention: "The Certificate needs to be provisioned successfully which requires a real domain."
```

The key is what triggers the skip: we check for the
`upjet.upbound.io/manual-intervention` annotation key and, if it is present, we
skip the related resource. The value is also important because it records why
we skip this resource.

> [!NOTE]
> For resources that are ignored during Automated Tests, manual testing is a
> must, because we need to make sure that all resources published in the
> `v1beta1` version are working.

At the end of the tests, Uptest provides a report. In addition, every GitHub
Actions run has an artifact that contains logs for debugging. For details
please see [here].

##### Using Uptest in Local Dev Environment

The main difference between running `Uptest` from your local environment and
running it in GitHub Actions is that GitHub Actions also prepares the
environment. During your local tests, `Uptest` is only responsible for creating
instance manifests and their assertions. Therefore, all the preparation steps
mentioned in the Manual Test section are also necessary for tests performed
using `Uptest` locally.
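
The `uptest-local` examples in this document pass `EXAMPLE_LIST` as a
comma-separated list of manifest paths, and the folder form
(`find ... | tr '\n' ','`) leaves a trailing comma on that list. The helper
below is a minimal sketch, not part of Uptest or the provider Makefile:
`build_example_list` is a hypothetical name, and it assumes `find`, `sort` and
`paste` are available. It lists the `*.yaml` files directly inside one folder
and joins them with commas, without a trailing separator.

```bash
#!/usr/bin/env bash

# build_example_list DIR
# Hypothetical helper (not provided by Uptest): prints the *.yaml files found
# directly in DIR as a single comma-separated line, sorted by path.
build_example_list() {
  local dir="$1"
  # List only regular *.yaml files at the top level of the folder,
  # sort them for a stable order, then join the lines with commas.
  find "$dir" -maxdepth 1 -type f -name '*.yaml' | sort | paste -sd, -
}

# Possible usage, assuming the folder exists in your checkout:
#   make uptest-local PROVIDER_NAME=provider-aws \
#     EXAMPLE_LIST="$(build_example_list provider-aws/examples/secretsmanager)"
```

Quoting the command substitution keeps the list as one argument even if a path
contains spaces.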

After preparing the testing environment, run one of the following commands to
trigger tests locally by using `Uptest`:

Example for single file test:

```bash
make uptest-local PROVIDER_NAME=provider-aws EXAMPLE_LIST=provider-aws/examples/secretsmanager/secret.yaml
```

Example of whole API Group test:

```bash
make uptest-local PROVIDER_NAME=provider-aws EXAMPLE_LIST=$(find provider-aws/examples/secretsmanager/*.yaml | tr '\n' ',')
```

### Debugging Tests

Whether the tests fail using `Uptest` or when testing manually, the steps to
follow are the same. What ultimately failed was a Managed Resource tested
against Official Providers. In this case, the first thing to do is to check the
manifest of the failing resource (where the value of the `Synced` or `Ready`
condition is `False`) in the cluster.

If the test was in your local environment, you can check the current state of
the resource by using the following command:
`kubectl get network.compute.gcp.upbound.io/example-network-1 -o yaml`

If the test ran in GitHub Actions, you need to check the action artifact
mentioned in the previous part of the documentation.

The second important source for understanding the problem is the provider
logs. If the test was in your local environment, you need to check the
`make run` or IDE logs. If testing was in GitHub Actions, you need to check the
action artifact. It contains the cluster dump that has the provider logs.

## Known Error Cases

1. `prevent_destroy` Case: In some cases, when unexpected changes or situations
   occur in the resources, Terraform tries to delete the related resource and
   create it again. However, resources can be configured to prevent this. The
   name of the field that provides this control is `prevent_destroy`.
   Please see details of [Terraform Resource Lifecycle].
   For resources in Official Providers, this value defaults to `true`. So the
   deletion of the resource is blocked.

   Encountering this situation (i.e. Terraform trying to delete and recreate the
   resource) is not normal and may indicate a specific error. Some possible
   problems could be:

   - As a result of overriding the constructed ID after Terraform calls,
     Terraform could not match the IDs and tries to recreate the resource.
     Please see [this issue] for details. In this type of case, you need to
     review your external name configuration.
   - Crossplane's concept of [Late Initialization] may cause some side effects.
     One of them is that, during late initialization, filling a field that was
     not initially set in the manifest may cause the resource to be destroyed
     and recreated. In such a case, you should determine which field, once set,
     causes the error. During this evaluation, it will be necessary to consult
     the Terraform registry documentation. In the end, the field whose
     exclusion is thought to solve the problem is put into the ignore list
     using the [late initialization configuration] and the test is repeated
     from the beginning.
   - Some resources fall into the `tainted` state because certain steps in the
     creation process fail. Please see [tainted issue] for details.

2. External Name Configuration Related Errors: The most common known issue is
   errors in the external name configuration. A clear error message regarding
   this situation may not be visible. Many error messages can be related to an
   incorrect external name configuration, for example a field that cannot be
   read properly from the parameter map, or unexpected fields in the generated
   `main.tf.json` file.

   Therefore, when debugging a non-ready resource, if you do not see errors
   returned by the Cloud API related to the constraints or characteristics of
   the service (for example, you have hit the creation limit of this resource
   in this region, or the use of a field for this resource depends on certain
   conditions), the first point to check is the external name configuration.

3. Late Initialization Errors: Late Initialization is one of the key concepts of
   Crossplane. It allows some values that are not initially set in the
   resource's manifest to be filled with the values returned by the cloud
   providers.

   As a side effect of this, some fields conflict with each other. In this
   case, a detailed error message is usually displayed about which fields
   conflict. The relevant field should then be skipped by following
   [these steps].

4. Provider Service Specific Errors: Every cloud provider and every service has
   its own features and behavior. Therefore, you may see special error messages
   in the status of the resources from time to time. These may say that you are
   outside the allowed values in some fields of the resource, or that you need
   to enable the relevant service, etc. In such cases, please review your
   example manifest and try to find the appropriate example.

> [!IMPORTANT]
> `make reviewable` and `kubectl apply -f package/crds` commands must be run
> after any change that will affect the schema or controller of the
> configured/tested resource. In addition, the provider needs to be restarted
> after changes in the controllers, because a controller change is a change to
> the running code.
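
Most of the error cases above start from a resource whose `Synced` or `Ready`
condition is not `True`. As a rough sketch for spotting those resources, the
hypothetical `not_ready` function below filters the table printed by
`kubectl get managed`. It assumes the table has `READY` and `SYNCED` header
columns and that no cell before them is empty; the function name is an
illustration and is not part of Uptest or kubectl.

```bash
#!/usr/bin/env bash

# not_ready: reads `kubectl get managed` table output on stdin and prints the
# NAME of every row whose SYNCED or READY column is not "True".
# Assumption: columns are whitespace-separated and the cells up to READY and
# SYNCED are never empty; an empty cell would shift the fields.
not_ready() {
  awk '
    $1 == "NAME" {                 # header row: locate the columns by title
      synced = ready = 0
      for (i = 1; i <= NF; i++) {
        if ($i == "SYNCED") synced = i
        if ($i == "READY")  ready  = i
      }
      next
    }
    NF == 0 { next }               # blank line between resource kinds
    synced && ready && ($synced != "True" || $ready != "True") { print $1 }
  '
}

# Possible usage against a running cluster:
#   kubectl get managed | not_ready
```

Each printed name can then be inspected with `kubectl get <name> -o yaml`, as
described in the Debugging Tests section.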

[this repo]: https://github.com/kubernetes-sigs/kind
[the documentation]: https://crossplane.io/docs/v1.9/getting-started/install-configure.html#install-configuration-package
[here]: https://github.com/upbound/official-providers/blob/main/docs/testing-resources-by-using-uptest.md#debugging-failed-test
[these steps]: https://github.com/upbound/crossplane/blob/main/docs/configuring-a-resource.md#late-initialization-configuration
[late initialization configuration]: https://github.com/upbound/crossplane/blob/main/docs/configuring-a-resource.md#late-initialization-configuration
[Terraform Resource Lifecycle]: https://learn.hashicorp.com/tutorials/terraform/resource-lifecycle
[this issue]: https://github.com/upbound/crossplane/issues/32
[Late Initialization]: https://crossplane.io/docs/v1.9/concepts/managed-resources.html#late-initialization
[tainted issue]: https://github.com/upbound/crossplane/issues/80
[k3d]: https://k3d.io/