golang.org/toolchain@v0.0.1-go1.9rc2.windows-amd64/blog/content/subtests.article (about) 1 Using Subtests and Sub-benchmarks 2 03 Oct 2016 3 Tags: testing, hierarchy, table-driven, subtests, sub-benchmarks 4 5 Marcel van Lohuizen 6 7 * Introduction 8 9 In Go 1.7, the `testing` package introduces a Run method on the 10 [[https://golang.org/pkg/testing/#T.Run][`T`]] and 11 [[https://golang.org/pkg/testing/#B.Run][`B`]] types 12 that allows for the creation of subtests and sub-benchmarks. 13 The introduction of subtests and sub-benchmarks enables better handling of 14 failures, fine-grained control of which tests to run from the command line, 15 control of parallelism, and often results in simpler and more maintainable code. 16 17 * Table-driven tests basics 18 19 Before digging into the details, let's first discuss a common 20 way of writing tests in Go. 21 A series of related checks can be implemented by looping over a slice of test 22 cases: 23 24 func TestTime(t *testing.T) { 25 testCases := []struct { 26 gmt string 27 loc string 28 want string 29 }{ 30 {"12:31", "Europe/Zuri", "13:31"}, // incorrect location name 31 {"12:31", "America/New_York", "7:31"}, // should be 07:31 32 {"08:08", "Australia/Sydney", "18:08"}, 33 } 34 for _, tc := range testCases { 35 loc, err := time.LoadLocation(tc.loc) 36 if err != nil { 37 t.Fatalf("could not load location %q", tc.loc) 38 } 39 gmt, _ := time.Parse("15:04", tc.gmt) 40 if got := gmt.In(loc).Format("15:04"); got != tc.want { 41 t.Errorf("In(%s, %s) = %s; want %s", tc.gmt, tc.loc, got, tc.want) 42 } 43 } 44 } 45 46 This approach, commonly referred to as table-driven tests, reduces the amount 47 of repetitive code compared to repeating the same code for each test 48 and makes it straightforward to add more test cases. 49 50 * Table-driven benchmarks 51 52 Before Go 1.7 it was not possible to use the same table-driven approach for 53 benchmarks. 54 A benchmark tests the performance of an entire function, so iterating over 55 benchmarks would just measure all of them as a single benchmark. 56 57 A common workaround was to define separate top-level benchmarks 58 that each call a common function with different parameters. 59 For instance, before 1.7 the `strconv` package's benchmarks for `AppendFloat` 60 looked something like this: 61 62 func benchmarkAppendFloat(b *testing.B, f float64, fmt byte, prec, bitSize int) { 63 dst := make([]byte, 30) 64 b.ResetTimer() // Overkill here, but for illustrative purposes. 65 for i := 0; i < b.N; i++ { 66 AppendFloat(dst[:0], f, fmt, prec, bitSize) 67 } 68 } 69 70 func BenchmarkAppendFloatDecimal(b *testing.B) { benchmarkAppendFloat(b, 33909, 'g', -1, 64) } 71 func BenchmarkAppendFloat(b *testing.B) { benchmarkAppendFloat(b, 339.7784, 'g', -1, 64) } 72 func BenchmarkAppendFloatExp(b *testing.B) { benchmarkAppendFloat(b, -5.09e75, 'g', -1, 64) } 73 func BenchmarkAppendFloatNegExp(b *testing.B) { benchmarkAppendFloat(b, -5.11e-95, 'g', -1, 64) } 74 func BenchmarkAppendFloatBig(b *testing.B) { benchmarkAppendFloat(b, 123456789123456789123456789, 'g', -1, 64) } 75 ... 76 77 Using the `Run` method available in Go 1.7, the same set of benchmarks is now 78 expressed as a single top-level benchmark: 79 80 func BenchmarkAppendFloat(b *testing.B) { 81 benchmarks := []struct{ 82 name string 83 float float64 84 fmt byte 85 prec int 86 bitSize int 87 }{ 88 {"Decimal", 33909, 'g', -1, 64}, 89 {"Float", 339.7784, 'g', -1, 64}, 90 {"Exp", -5.09e75, 'g', -1, 64}, 91 {"NegExp", -5.11e-95, 'g', -1, 64}, 92 {"Big", 123456789123456789123456789, 'g', -1, 64}, 93 ... 94 } 95 dst := make([]byte, 30) 96 for _, bm := range benchmarks { 97 b.Run(bm.name, func(b *testing.B) { 98 for i := 0; i < b.N; i++ { 99 AppendFloat(dst[:0], bm.float, bm.fmt, bm.prec, bm.bitSize) 100 } 101 }) 102 } 103 } 104 105 Each invocation of the `Run` method creates a separate benchmark. 106 An enclosing benchmark function that calls a `Run` method is only run once and 107 is not measured. 108 109 The new code has more lines of code, but is more maintainable, more readable, 110 and consistent with the table-driven approach commonly used for testing. 111 Moreover, common setup code is now shared between runs while eliminating the 112 need to reset the timer. 113 114 115 116 * Table-driven tests using subtests 117 118 Go 1.7 also introduces a `Run` method for creating subtests. 119 This test is a rewritten version of our earlier example using subtests: 120 121 func TestTime(t *testing.T) { 122 testCases := []struct { 123 gmt string 124 loc string 125 want string 126 }{ 127 {"12:31", "Europe/Zuri", "13:31"}, 128 {"12:31", "America/New_York", "7:31"}, 129 {"08:08", "Australia/Sydney", "18:08"}, 130 } 131 for _, tc := range testCases { 132 t.Run(fmt.Sprintf("%s in %s", tc.gmt, tc.loc), func(t *testing.T) { 133 loc, err := time.LoadLocation(tc.loc) 134 if err != nil { 135 t.Fatal("could not load location") 136 } 137 gmt, _ := time.Parse("15:04", tc.gmt) 138 if got := gmt.In(loc).Format("15:04"); got != tc.want { 139 t.Errorf("got %s; want %s", got, tc.want) 140 } 141 }) 142 } 143 } 144 145 The first thing to note is the difference in output from the two implementations. 146 The original implementation prints: 147 148 --- FAIL: TestTime (0.00s) 149 time_test.go:62: could not load location "Europe/Zuri" 150 151 Even though there are two errors, execution of the test halts on the call to 152 `Fatalf` and the second test never runs. 153 154 The implementation using `Run` prints both: 155 156 --- FAIL: TestTime (0.00s) 157 --- FAIL: TestTime/12:31_in_Europe/Zuri (0.00s) 158 time_test.go:84: could not load location 159 --- FAIL: TestTime/12:31_in_America/New_York (0.00s) 160 time_test.go:88: got 07:31; want 7:31 161 162 `Fatal` and its siblings causes a subtest to be skipped but not its parent or 163 subsequent subtests. 164 165 Another thing to note is the shorter error messages in the new implementation. 166 Since the subtest name uniquely identifies the subtest there is no need to 167 identify the test again within the error messages. 168 169 There are several other benefits to using subtests or sub-benchmarks, 170 as clarified by the following sections. 171 172 173 * Running specific tests or benchmarks 174 175 Both subtests and sub-benchmarks can be singled out on the command line using 176 the [[https://golang.org/cmd/go/#hdr-Description_of_testing_flags][`-run` or `-bench` flag]]. 177 Both flags take a slash-separated list of regular expressions that match the 178 corresponding parts of the full name of the subtest or sub-benchmark. 179 180 The full name of a subtest or sub-benchmark is a slash-separated list of 181 its name and the names of all of its parents, starting with the top-level. 182 The name is the corresponding function name for top-level tests and benchmarks, 183 and the first argument to `Run` otherwise. 184 To avoid display and parsing issues, a name is sanitized by replacing spaces 185 with underscores and escaping non-printable characters. 186 The same sanitizing is applied to the regular expressions passed to 187 the `-run` or `-bench` flags. 188 189 A few examples: 190 191 Run tests that use a timezone in Europe: 192 193 $ go test -run=TestTime/"in Europe" 194 --- FAIL: TestTime (0.00s) 195 --- FAIL: TestTime/12:31_in_Europe/Zuri (0.00s) 196 time_test.go:85: could not load location 197 198 Run only tests for times after noon: 199 200 $ go test -run=Time/12:[0-9] -v 201 === RUN TestTime 202 === RUN TestTime/12:31_in_Europe/Zuri 203 === RUN TestTime/12:31_in_America/New_York 204 --- FAIL: TestTime (0.00s) 205 --- FAIL: TestTime/12:31_in_Europe/Zuri (0.00s) 206 time_test.go:85: could not load location 207 --- FAIL: TestTime/12:31_in_America/New_York (0.00s) 208 time_test.go:89: got 07:31; want 7:31 209 210 Perhaps a bit surprising, using `-run=TestTime/New_York` won't match any tests. 211 This is because the slash present in the location names is treated as 212 a separator as well. 213 Instead use: 214 215 $ go test -run=Time//New_York 216 --- FAIL: TestTime (0.00s) 217 --- FAIL: TestTime/12:31_in_America/New_York (0.00s) 218 time_test.go:88: got 07:31; want 7:31 219 220 Note the `//` in the string passed to `-run`. 221 The `/` in time zone name `America/New_York` is handled as if it were 222 a separator resulting from a subtest. 223 The first regular expression of the pattern (`TestTime`) matches the top-level 224 test. 225 The second regular expression (the empty string) matches anything, in this case 226 the time and the continent part of the location. 227 The third regular expression (`New_York`) matches the city part of the location. 228 229 Treating slashes in names as separators allows the user to refactor 230 hierarchies of tests without the need to change the naming. 231 It also simplifies the escaping rules. 232 The user should escape slashes in names, for instance by replacing them with 233 backslashes, if this poses a problem. 234 235 A unique sequence number is appended to test names that are not unique. 236 So one could just pass an empty string to `Run` 237 if there is no obvious naming scheme for subtests and the subtests 238 can easily be identified by their sequence number. 239 240 * Setup and Tear-down 241 242 Subtests and sub-benchmarks can be used to manage common setup and tear-down code: 243 244 func TestFoo(t *testing.T) { 245 // <setup code> 246 t.Run("A=1", func(t *testing.T) { ... }) 247 t.Run("A=2", func(t *testing.T) { ... }) 248 t.Run("B=1", func(t *testing.T) { 249 if !test(foo{B:1}) { 250 t.Fail() 251 } 252 }) 253 // <tear-down code> 254 } 255 256 The setup and tear-down code will run if any of the enclosed subtests are run 257 and will run at most once. 258 This applies even if any of the subtests calls `Skip`, `Fail`, or `Fatal`. 259 260 * Control of Parallelism 261 262 Subtests allow fine-grained control over parallelism. 263 To understand how to use subtests in the way 264 it is important to understand the semantics of parallel tests. 265 266 Each test is associated with a test function. 267 A test is called a parallel test if its test function calls the Parallel 268 method on its instance of `testing.T`. 269 A parallel test never runs concurrently with a sequential test and its execution 270 is suspended until its calling test function, that of the parent test, 271 has returned. 272 The `-parallel` flag defines the maximum number of parallel tests that can run 273 in parallel. 274 275 A test blocks until its test function returns and all of its subtests 276 have completed. 277 This means that the parallel tests that are run by a sequential test will 278 complete before any other consecutive sequential test is run. 279 280 This behavior is identical for tests created by `Run` and top-level tests. 281 In fact, under the hood top-level tests are implemented as subtests of 282 a hidden master test. 283 284 ** Run a group of tests in parallel 285 286 The above semantics allows for running a group of tests in parallel with 287 each other but not with other parallel tests: 288 289 func TestGroupedParallel(t *testing.T) { 290 for _, tc := range testCases { 291 tc := tc // capture range variable 292 t.Run(tc.Name, func(t *testing.T) { 293 t.Parallel() 294 if got := foo(tc.in); got != tc.out { 295 t.Errorf("got %v; want %v", got, tc.out) 296 } 297 ... 298 }) 299 } 300 } 301 302 The outer test will not complete until all parallel tests started by `Run` 303 have completed. 304 As a result, no other parallel tests can run in parallel to these parallel tests. 305 306 Note that we need to capture the range variable to ensure that `tc` gets bound to 307 the correct instance. 308 309 310 ** Cleaning up after a group of parallel tests 311 312 In the previous example we used the semantics to wait on a group of parallel 313 tests to complete before commencing other tests. 314 The same technique can be used to clean up after a group of parallel tests 315 that share common resources: 316 317 func TestTeardownParallel(t *testing.T) { 318 // <setup code> 319 // This Run will not return until its parallel subtests complete. 320 t.Run("group", func(t *testing.T) { 321 t.Run("Test1", parallelTest1) 322 t.Run("Test2", parallelTest2) 323 t.Run("Test3", parallelTest3) 324 }) 325 // <tear-down code> 326 } 327 328 The behavior of waiting on a group of parallel tests is identical to that 329 of the previous example. 330 331 * Conclusion 332 333 Go 1.7's addition of subtests and sub-benchmarks allows you to write structured 334 tests and benchmarks in a natural way that blends nicely into the existing 335 tools. 336 One way to think about this is that earlier versions of the testing package had 337 a 1-level hierarchy: the package-level test was structured as a set of 338 individual tests and benchmarks. 339 Now that structure has been extended to those individual tests and benchmarks, 340 recursively. 341 In fact, in the implementation, the top-level tests and benchmarks are tracked 342 as if they were subtests and sub-benchmarks of an implicit master test and 343 benchmark: the treatment really is the same at all levels. 344 345 The ability for tests to define this structure enables fine-grained execution of 346 specific test cases, shared setup and teardown, and better control over test 347 parallelism. 348 We are excited to see what other uses people find. Enjoy.