gvisor.dev/gvisor@v0.0.0-20240520182842-f9d4d51c7e0f/test/runtimes/README.md

# gVisor Runtime Tests

These tests execute language runtime test suites inside gVisor. They serve as
high-level integration tests for the various runtimes.

## Runtime Test Components

The runtime tests have the following components:

- [`images`][runtime-images] - These are Docker images for each language
  runtime we test. Each image contains all of that runtime's tests, plus
  whatever other libraries or utilities are required to run them.
- [`proctor`](proctor) - This is a binary that acts as an agent inside the
  container and provides a uniform command-line API to list and run the
  various language tests.
- [`runner`](runner) - This is the test entrypoint invoked by `bazel run`.
  This binary spawns Docker (using the `runsc` runtime) and runs the language
  image with the `proctor` binary mounted.
- [`exclude`](exclude) - Holds a CSV file for each language runtime containing
  the full path of each test that should be excluded from running, along with
  a reason for the exclusion.

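As a rough illustration of how an exclude file can be inspected, the sketch below prints the excluded test paths from one CSV. It is not part of the repository: the function name `excluded_tests` is hypothetical, and the sketch assumes the file starts with a header row, keeps the test path in its first column, and contains no quoted commas.

```shell
# excluded_tests FILE
# Hypothetical helper: print the excluded test paths from one exclude CSV.
# Assumptions (not confirmed by this README): the first row is a header, the
# test path is the first column, and no field contains a quoted comma.
excluded_tests() {
  tail -n +2 "$1" |   # skip the assumed header row
    cut -d, -f1 |     # keep only the first column (the test path)
    sed '/^$/d'       # drop empty lines
}
```

Call it with the path of one of the CSV files in the `exclude` directory, for example `excluded_tests path/to/exclude.csv` (a placeholder path).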
## Testing Locally

The following `make` targets will run an entire runtime test suite locally.

Note: The Java runtime tests take over an hour with 16 cores.

Language | Version | Running the test suite
-------- | ------- | ----------------------------------
Go       | 1.22    | `make go1.22-runtime-tests`
Java     | 21      | `make java21-runtime-tests`
Node.js  | 16.13.2 | `make nodejs16.13.2-runtime-tests`
PHP      | 8.1.1   | `make php8.1.1-runtime-tests`
Python   | 3.10.2  | `make python3.10.2-runtime-tests`

You can modify the behavior of the runtime tests by passing in the following
`make` variables:

* `RUNTIME_TESTS_FILTER`: Comma-separated list of tests to run, even if
  otherwise excluded. Useful for debugging a single failing test case.
* `RUNTIME_TESTS_PER_TEST_TIMEOUT`: Modify the per-test timeout. Useful when
  debugging a test that has a tendency to get stuck, in order to make it fail
  faster.
* `RUNTIME_TESTS_RUNS_PER_TEST`: Number of times to run each test. Useful for
  finding flaky tests.
* `RUNTIME_TESTS_FLAKY_IS_ERROR`: Boolean indicating whether tests found flaky
  (i.e. running them multiple times has sometimes succeeded, sometimes failed)
  should be considered a test suite failure (`true`) or success (`false`).
* `RUNTIME_TESTS_FLAKY_SHORT_CIRCUIT`: If true, when running tests multiple
  times, and a test has been found flaky (i.e. running it multiple times has
  succeeded at least once and failed at least once), exit immediately, rather
  than running all `RUNTIME_TESTS_RUNS_PER_TEST` attempts.

Example invocation:

```shell
$ make php8.1.1-runtime-tests \
    RUNTIME_TESTS_FILTER=ext/standard/tests/file/bug60120.phpt \
    RUNTIME_TESTS_PER_TEST_TIMEOUT=10s \
    RUNTIME_TESTS_RUNS_PER_TEST=100
```

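The flakiness-related variables above can be combined when hunting for a flaky test. The helper below is a hypothetical sketch, not part of the gVisor repository: `flaky_hunt_cmd` only prints the `make` command line so you can review it before running it yourself, and it assumes both boolean variables accept the literal value `true`.

```shell
# flaky_hunt_cmd TARGET FILTER [RUNS]
# Hypothetical helper: print (do not run) a make command that repeats the
# filtered tests, treats flakiness as an error, and stops at the first flake.
flaky_hunt_cmd() {
  target="$1"       # a make target from the table above
  filter="$2"       # comma-separated list of test paths
  runs="${3:-100}"  # number of runs per test; defaults to 100
  printf 'make %s %s %s %s %s\n' \
    "$target" \
    "RUNTIME_TESTS_FILTER=$filter" \
    "RUNTIME_TESTS_RUNS_PER_TEST=$runs" \
    "RUNTIME_TESTS_FLAKY_IS_ERROR=true" \
    "RUNTIME_TESTS_FLAKY_SHORT_CIRCUIT=true"
}

# Print the command for the PHP test used in the example invocation.
flaky_hunt_cmd php8.1.1-runtime-tests ext/standard/tests/file/bug60120.phpt
```

Printing rather than executing keeps the sketch safe to paste into a shell; copy the printed line to actually start the run.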
### Clean Up

Sometimes, when runtime tests fail or the testing container itself crashes
unexpectedly, the containers are not removed, or do not even exit. This can
cause some Docker commands, such as `docker system prune`, to hang forever.

Here are some helpful commands (they should be executed in order):

```bash
docker ps -a                  # Lists all containers, running or exited; useful when investigating hanging containers.
docker kill $(docker ps -q)   # Kills all running containers.
docker rm $(docker ps -a -q)  # Removes all containers, which have all exited by now.
docker system prune           # Removes unused data.
```

## Updating Runtime Tests

To bump the version of an existing runtime test:

1. Update the [Docker image](../../images/runtimes) with the new runtime
   version. Rename the directory that contains the `Dockerfile` and update any
   packages or download URLs to point to the new version. Test building the
   image with `docker build images/runtimes/<new_runtime>`.

2. Update the [`runtime_test`](BUILD) target. The `name` field must be the
   name of the `Dockerfile` directory from Step 1.

3. Update the [Buildkite pipeline](../../.buildkite/pipeline.yaml).

4. Run the tests and triage any failures. Some language tests are flaky (or
   never pass at all); other failures may indicate a gVisor bug or a
   divergence from Linux behavior.

5. Update the [exclude](exclude) file by renaming it to match the new version
   and adding any failing tests to it, each with a reason.

### Cleaning up exclude files

Usually when the runtime is updated, a lot has changed. Tests may have been
deleted, modified (fixed or broken) or added. After you have an exclude list
from Step 5 above with which all runtime tests pass, it is useful to clean up
the exclude files with the following steps:

1. Check that each excluded test still exists in the runtime image. See how
   each runtime lists all its tests (see the `ListTests()` implementations in
   the `proctor/lib` directory). Then compare the exclude list against that
   list and remove any excluded tests that no longer exist.
2. Run all excluded tests with runc (native) for each runtime. If a test fails
   there, we can consider it broken. Such tests should be marked with
   `Broken test` in the reason column. They do not provide a compatibility gap
   signal for gVisor, so we can safely ignore them. Some tests that were
   previously broken may now be unbroken; for those, the reason field should
   be cleared.
3. Run all the unbroken and non-flaky tests on runsc (gVisor). If a test is
   now passing, remove it from the exclude list. This effectively increases
   our testing surface: the test used to fail and now passes, so something was
   fixed in between. Enabling the test is equivalent to adding a regression
   test for the fix.
4. Some tests are excluded and marked flaky. Run these tests 100 times on runsc
   (gVisor). If a test does not flake, you can remove it from the exclude
   list.
5. Finally, close the bugs corresponding to tests that are now passing. These
   bugs are stale.

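Step 1 of the cleanup above (finding excluded tests that no longer exist) can be sketched as a small shell helper. This is a minimal sketch, not part of the repository: the name `stale_excludes` is hypothetical, and it assumes that the exclude CSV has a header row, that the test path is the first column with no quoted commas, and that you have already saved the runtime's list of available tests to a plain-text file with one test per line. Called as `stale_excludes <exclude_csv> <all_tests_file>`, it prints every excluded test that is missing from the list of available tests.

```shell
# stale_excludes EXCLUDE_CSV ALL_TESTS_FILE
# Hypothetical helper: print excluded tests that are absent from the list of
# tests available in the runtime image.
# Assumptions: EXCLUDE_CSV has a header row and the test path in column 1;
# ALL_TESTS_FILE holds one test path per line.
stale_excludes() {
  exclude_csv="$1"
  all_tests="$2"
  tmp_excluded="$(mktemp)"
  tmp_all="$(mktemp)"

  # Extract column 1 without the header; sort both inputs the same way,
  # because comm requires sorted input.
  tail -n +2 "$exclude_csv" | cut -d, -f1 | sed '/^$/d' |
    LC_ALL=C sort -u > "$tmp_excluded"
  LC_ALL=C sort -u "$all_tests" > "$tmp_all"

  # comm -23 keeps lines that appear only in the first file, that is,
  # excluded tests that the runtime no longer lists.
  LC_ALL=C comm -23 "$tmp_excluded" "$tmp_all"

  rm -f "$tmp_excluded" "$tmp_all"
}
```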
Creating runtime tests for an entirely new language is similar to bumping the
version of an existing runtime, except that Step 1 is a bit harder. You have to
figure out how to download and run the language tests in a Docker container.
Once you have that, you must also implement the
[`proctor/TestRunner`](proctor/lib/lib.go) interface for that language, so that
`proctor` can list and run the tests in the image you created.