# gVisor Runtime Tests

These tests execute language runtime test suites inside gVisor. They serve as
high-level integration tests for the various runtimes.

## Runtime Test Components

The runtime tests have the following components:

- [`images`](../../images/runtimes) - These are Docker images for each language
  runtime we test. The images contain all the particular runtime tests, and
  whatever other libraries or utilities are required to run the tests.
- [`proctor`](proctor) - This is a binary that acts as an agent inside the
  container and provides a uniform command-line API to list and run the
  various language tests.
- [`runner`](runner) - This is the test entrypoint invoked by `bazel run`.
  This binary spawns Docker (using the `runsc` runtime) and runs the language
  image with the `proctor` binary mounted.
- [`exclude`](exclude) - Holds a CSV file for each language runtime containing
  the full paths of tests that should be excluded from running, along with a
  reason for each exclusion.

## Testing Locally

The following `make` targets will run an entire runtime test suite locally.

Note: the Java runtime tests take 1+ hours with 16 cores.

Language | Version | Running the test suite
-------- | ------- | ----------------------------------
Go       | 1.22    | `make go1.22-runtime-tests`
Java     | 21      | `make java21-runtime-tests`
NodeJS   | 16.13.2 | `make nodejs16.13.2-runtime-tests`
PHP      | 8.1.1   | `make php8.1.1-runtime-tests`
Python   | 3.10.2  | `make python3.10.2-runtime-tests`

You can modify the runtime test behavior by passing in the following `make`
variables:

*   `RUNTIME_TESTS_FILTER`: Comma-separated list of tests to run, even if they
    are otherwise excluded. Useful to debug a single failing test case.
*   `RUNTIME_TESTS_PER_TEST_TIMEOUT`: Modify the per-test timeout. Useful when
    debugging a test that has a tendency to get stuck, in order to make it fail
    faster.
*   `RUNTIME_TESTS_RUNS_PER_TEST`: Number of times to run each test. Useful to
    find flaky tests.
*   `RUNTIME_TESTS_FLAKY_IS_ERROR`: Boolean indicating whether tests found to
    be flaky (i.e. running them multiple times has sometimes succeeded and
    sometimes failed) should be considered a test suite failure (`true`) or
    success (`false`).
*   `RUNTIME_TESTS_FLAKY_SHORT_CIRCUIT`: If `true`, when running tests multiple
    times and a test has been found flaky (i.e. running it multiple times has
    succeeded at least once and failed at least once), exit immediately rather
    than running all `RUNTIME_TESTS_RUNS_PER_TEST` attempts.

Example invocation:

```shell
$ make php8.1.1-runtime-tests \
    RUNTIME_TESTS_FILTER=ext/standard/tests/file/bug60120.phpt \
    RUNTIME_TESTS_PER_TEST_TIMEOUT=10s \
    RUNTIME_TESTS_RUNS_PER_TEST=100
```

### Clean Up

Sometimes when runtime tests fail, or when the testing container itself crashes
unexpectedly, the containers are not removed or sometimes do not even exit.
This can cause some docker commands like `docker system prune` to hang forever.

Here are some helpful commands (they should be executed in order):

```bash
docker ps -a                    # Lists all containers (running and exited); useful when investigating hanging containers.
docker kill $(docker ps -a -q)  # Kills all running containers.
docker rm $(docker ps -a -q)    # Removes all exited containers.
docker system prune             # Removes unused data.
```
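The `kill` and `rm` commands above act on every container on the machine. If
the host also runs containers unrelated to the runtime tests, a narrower
variant such as the sketch below may be safer. The `name=` filter value here is
an assumption: replace it with whatever name prefix `docker ps -a` actually
shows for your runtime-test containers.

```bash
# Hedged sketch: only clean up containers whose name matches an assumed prefix.
docker ps -a --filter "name=runtimes"                   # Inspect what would be affected.
docker kill $(docker ps -q --filter "name=runtimes")    # Kill only the matching running containers.
docker rm $(docker ps -a -q --filter "name=runtimes")   # Remove only the matching stopped containers.
```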
## Updating Runtime Tests

To bump the version of an existing runtime test:

1.  Update the [Docker image](../../images/runtimes) with the new runtime
    version. Rename the `Dockerfile` directory and update any packages or
    download URLs to point to the new version. Test building the image with
    `docker build images/runtimes/<new_runtime>`.

2.  Update the [`runtime_test`](BUILD) target. The `name` field must be the
    directory name for the `Dockerfile` created in Step 1.

3.  Update the [Buildkite pipeline](../../.buildkite/pipeline.yaml).

4.  Run the tests and triage any failures. Some language tests are flaky (or
    never pass at all); other failures may indicate a gVisor bug or a
    divergence from Linux behavior.

5.  Update the [exclude](exclude) file by renaming it to the right version and
    adding any failing tests to it, each with a reason.

### Cleaning up exclude files

Usually when the runtime is updated, a lot has changed. Tests may have been
deleted, modified (fixed or broken) or added. After you have an exclude list
from step 5 above with which all runtime tests pass, it is useful to clean up
the exclude files with the following steps:

1.  Check that the excluded tests still exist in the runtime image. See how
    each runtime lists all its tests (see the `ListTests()` implementations in
    the `proctor/lib` directory). Then compare against that list and remove any
    excluded tests that no longer exist.
2.  Run all excluded tests with runc (native) for each runtime. If a test
    fails, we can consider it broken. Such tests should be marked with `Broken
    test` in the reason column. These tests don't provide a compatibility-gap
    signal for gVisor, so we can happily ignore them. Some tests that were
    previously broken may now be unbroken; for them, the reason field should be
    cleared.
3.  Run all the unbroken and non-flaky tests on runsc (gVisor). If a test now
    passes, remove it from the exclude list. This effectively increases our
    testing surface: the test used to fail, now it passes, so something was
    fixed in between, and enabling the test is equivalent to adding a
    regression test for that fix.
4.  Some tests are excluded and marked flaky. Run these tests 100 times on
    runsc (gVisor). If a test does not flake, you can remove it from the
    exclude list.
5.  Finally, close all corresponding bugs for tests that are now passing. Those
    bugs are stale.

Creating new runtime tests for an entirely new language is similar to the
above, except that Step 1 is a bit harder. You have to figure out how to
download and run the language tests in a Docker container. Once you have that,
you must also implement the [`proctor/TestRunner`](proctor/lib/lib.go)
interface for that language, so that proctor can list and run the tests in the
image you created.
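When bootstrapping that harder Step 1 for a new language, it can help to
iterate on the image by hand before wiring up `proctor` and the `runner`. A
minimal sketch, using a hypothetical new runtime directory
`images/runtimes/foo1.0` (the directory, image tag, and shell path are
placeholders, not real targets in this repository):

```shell
# Build the candidate image from its (hypothetical) Dockerfile directory.
$ docker build -t foo1.0-runtime-test images/runtimes/foo1.0
# Open an interactive shell in the image to confirm the language toolchain and
# its test suite are present and runnable before adding proctor support.
# Adjust the shell path (or use --entrypoint) to match whatever the image provides.
$ docker run --rm -it foo1.0-runtime-test /bin/bash
```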