# failpoint

[![LICENSE](https://img.shields.io/github/license/pingcap/failpoint.svg)](https://github.com/pingcap/failpoint/blob/master/LICENSE)
[![Language](https://img.shields.io/badge/Language-Go-blue.svg)](https://golang.org/)
[![Go Report Card](https://goreportcard.com/badge/github.com/pingcap/failpoint)](https://goreportcard.com/report/github.com/pingcap/failpoint)
[![Build Status](https://github.com/pingcap/failpoint/actions/workflows/suite.yml/badge.svg?branch=master)](https://github.com/pingcap/failpoint/actions/workflows/suite.yml?query=event%3Apush+branch%3Amaster)
[![Coverage Status](https://codecov.io/gh/pingcap/failpoint/branch/master/graph/badge.svg)](https://codecov.io/gh/pingcap/failpoint)
[![Mentioned in Awesome Go](https://awesome.re/mentioned-badge.svg)](https://github.com/avelino/awesome-go)

An implementation of [failpoints][failpoint] for Golang. Failpoints are used to add code points where errors may be injected in a user-controlled fashion. A failpoint is a code snippet that is only executed when the corresponding failpoint is active.

[failpoint]: http://www.freebsd.org/cgi/man.cgi?query=fail

## Quick Start (use `failpoint-ctl`)

1. Build `failpoint-ctl` from source

    ``` bash
    git clone https://github.com/pingcap/failpoint.git
    cd failpoint
    make
    ls bin/failpoint-ctl
    ```

2. Inject failpoints to your program, e.g.:

    ``` go
    package main

    import "github.com/pingcap/failpoint"

    func main() {
        failpoint.Inject("testPanic", func() {
            panic("failpoint triggered")
        })
    }
    ```

3. Transform your code with `failpoint-ctl enable`

4. Build with `go build`

5. Enable failpoints with the `GO_FAILPOINTS` environment variable

    ``` bash
    GO_FAILPOINTS="main/testPanic=return(true)" ./your-program
    ```

6. If you use `go run` to run the test, don't forget to add the generated `binding__failpoint_binding__.go` to your command, like:

    ```bash
    GO_FAILPOINTS="main/testPanic=return(true)" go run your-program.go binding__failpoint_binding__.go
    ```

## Quick Start (use `failpoint-toolexec`)

1. Build `failpoint-toolexec` from source

    ``` bash
    git clone https://github.com/pingcap/failpoint.git
    cd failpoint
    make
    ls bin/failpoint-toolexec
    ```

2. Inject failpoints to your program, e.g.:

    ``` go
    package main

    import "github.com/pingcap/failpoint"

    func main() {
        failpoint.Inject("testPanic", func() {
            panic("failpoint triggered")
        })
    }
    ```

3. Use a separate build cache, so that it does not mix with caches from builds without `failpoint-toolexec`, and build

    `GOCACHE=/tmp/failpoint-cache go build -toolexec path/to/failpoint-toolexec`

4. Enable failpoints with the `GO_FAILPOINTS` environment variable

    ``` bash
    GO_FAILPOINTS="main/testPanic=return(true)" ./your-program
    ```

5. You can also use `go run` or `go test`, like:

    ```bash
    GOCACHE=/tmp/failpoint-cache GO_FAILPOINTS="main/testPanic=return(true)" go run -toolexec path/to/failpoint-toolexec your-program.go
    ```

## Design principles

- Define failpoints in valid Golang code, not in comments or anything else
- Failpoints do not have any extra cost

    - Will not take effect on regular logic
    - Will not cause performance regression in regular code
    - Failpoint code will not appear in the final binary

- Failpoint routines are easy to write and read, and are checked by the compiler
- Code generated from failpoint definitions is easy to read
- Keep the line numbers the same as in the injecting code (easier to debug)
- Support parallel tests with `context.Context`

## Key concepts

- Failpoint

    A failpoint is a code snippet that is only executed when the corresponding failpoint is active.
    The closure is never executed once `failpoint.Disable("failpoint-name-for-demo")` has been called.

    ```go
    var outerVar = "declare in outer scope"
    failpoint.Inject("failpoint-name-for-demo", func(val failpoint.Value) {
        fmt.Println("unit-test", val, outerVar)
    })
    ```

- Marker functions

    - It is just an empty function

        - To hint the rewriter to rewrite with an equivalent statement
        - To receive some parameters as the rewrite rule
        - It will be inlined at compile time and emit nothing to the binary (zero cost)
        - The variables in the outer scope can be accessed in the closure by capturing, and the converted code is still legal
          because all the captured variables are located in the outer scope of the IF statement.

    - It is easy to write/read
    - Introduce a compiler check for failpoints: the code cannot compile in regular mode if the failpoint code is invalid

- Marker function list

    - `func Inject(fpname string, fpblock func(val Value)) {}`
    - `func InjectContext(ctx context.Context, fpname string, fpblock func(val Value)) {}`
    - `func Break(label ...string) {}`
    - `func Goto(label string) {}`
    - `func Continue(label ...string) {}`
    - `func Fallthrough() {}`
    - `func Return(results ...interface{}) {}`
    - `func Label(label string) {}`

- Supported failpoint environment variable

    A failpoint can be enabled by exporting environment variables with the following pattern, which is quite similar to [FreeBSD failpoint SYSCTL VARIABLES](https://www.freebsd.org/cgi/man.cgi?query=fail)

    ```regexp
    [<percent>%][<count>*]<type>[(args...)][-><more terms>]
    ```

    The `<type>` argument specifies which action to take; it can be one of:

    - off: Take no action (does not trigger failpoint code)
    - return: Trigger failpoint with specified argument
    - sleep: Sleep the specified number of milliseconds
    - panic: Panic
    - break: Execute gdb and break into debugger
    - print: Print failpoint path for inject variable
    - pause: Pause until the failpoint is disabled

## How to inject a failpoint to your program

- You can call `failpoint.Inject` to inject a failpoint to the call site, where `failpoint-name` is
  used to trigger the failpoint and `failpoint-closure` will be expanded as the body of the IF statement.

    ```go
    failpoint.Inject("failpoint-name", func(val failpoint.Value) {
        failpoint.Return("unit-test", val)
    })
    ```

    The converted code looks like:

    ```go
    if val, _err_ := failpoint.Eval(_curpkg_("failpoint-name")); _err_ == nil {
        return "unit-test", val
    }
    ```

- `failpoint.Value` is the value passed by `failpoint.Enable("failpoint-name", "return(5)")`,
  which can be ignored.

    ```go
    failpoint.Inject("failpoint-name", func(_ failpoint.Value) {
        fmt.Println("unit-test")
    })
    ```

    OR

    ```go
    failpoint.Inject("failpoint-name", func() {
        fmt.Println("unit-test")
    })
    ```

    And the converted code looks like:

    ```go
    if _, _err_ := failpoint.Eval(_curpkg_("failpoint-name")); _err_ == nil {
        fmt.Println("unit-test")
    }
    ```

- Also, the failpoint closure can be a function which takes `context.Context`. You can
  do some customized things with `context.Context`, like controlling whether a failpoint is
  active in parallel tests or other cases. For example,

    ```go
    failpoint.InjectContext(ctx, "failpoint-name", func(val failpoint.Value) {
        fmt.Println("unit-test", val)
    })
    ```

    The converted code looks like:

    ```go
    if val, _err_ := failpoint.EvalContext(ctx, _curpkg_("failpoint-name")); _err_ == nil {
        fmt.Println("unit-test", val)
    }
    ```

- You can ignore `context.Context` by passing `nil`, and this will behave the same as the non-context version above.
  For example,

    ```go
    failpoint.InjectContext(nil, "failpoint-name", func(val failpoint.Value) {
        fmt.Println("unit-test", val)
    })
    ```

    Becomes

    ```go
    if val, _err_ := failpoint.EvalContext(nil, _curpkg_("failpoint-name")); _err_ == nil {
        fmt.Println("unit-test", val)
    }
    ```

- You can control a failpoint by `failpoint.WithHook`

    ```go
    func (s *dmlSuite) TestCRUDParallel() {
        sctx := failpoint.WithHook(context.Background(), func(ctx context.Context, fpname string) bool {
            return ctx.Value(fpname) != nil // Decide by context key
        })
        insertFailpoints := map[string]struct{}{
            "insert-record-fp": {},
            "insert-index-fp":  {},
            "on-duplicate-fp":  {},
        }
        ictx := failpoint.WithHook(context.Background(), func(ctx context.Context, fpname string) bool {
            _, found := insertFailpoints[fpname] // Only enables these failpoints.
            return found
        })
        deleteFailpoints := map[string]struct{}{
            "tikv-is-busy-fp":   {},
            "fetch-tso-timeout": {},
        }
        dctx := failpoint.WithHook(context.Background(), func(ctx context.Context, fpname string) bool {
            _, found := deleteFailpoints[fpname] // Only disables these failpoints.
            return !found
        })
        // other DML parallel test cases.
        s.RunParallel(buildSelectTests(sctx))
        s.RunParallel(buildInsertTests(ictx))
        s.RunParallel(buildDeleteTests(dctx))
    }
    ```

- If you use a failpoint in a loop context, you may need other marker functions.

    ```go
    failpoint.Label("outer")
    for i := 0; i < 100; i++ {
        failpoint.Label("inner")
        for j := 0; j < 1000; j++ {
            switch rand.Intn(j) + i {
            case j / 5:
                failpoint.Break()
            case j / 7:
                failpoint.Continue("outer")
            case j / 9:
                failpoint.Fallthrough()
            case j / 10:
                failpoint.Goto("outer")
            default:
                failpoint.Inject("failpoint-name", func(val failpoint.Value) {
                    fmt.Println("unit-test", val.(int))
                    if val == j/11 {
                        failpoint.Break("inner")
                    } else {
                        failpoint.Goto("outer")
                    }
                })
            }
        }
    }
    ```

    The above code block will generate the following code:

    ```go
    outer:
        for i := 0; i < 100; i++ {
        inner:
            for j := 0; j < 1000; j++ {
                switch rand.Intn(j) + i {
                case j / 5:
                    break
                case j / 7:
                    continue outer
                case j / 9:
                    fallthrough
                case j / 10:
                    goto outer
                default:
                    if val, _err_ := failpoint.Eval(_curpkg_("failpoint-name")); _err_ == nil {
                        fmt.Println("unit-test", val.(int))
                        if val == j/11 {
                            break inner
                        } else {
                            goto outer
                        }
                    }
                }
            }
        }
    ```

- You may wonder why we do not use `label`, `break`, `continue`, and `fallthrough` directly
  instead of using failpoint marker functions.

    - Any unused symbol, like an identifier or a label, is not permitted in Golang. The code will be invalid if a
      label is only used in the failpoint closure. For example,

        ```go
        label1: // compiler error: unused label1
            failpoint.Inject("failpoint-name", func(val failpoint.Value) {
                if val.(int) == 1000 {
                    goto label1 // illegal to use goto here
                }
                fmt.Println("unit-test", val)
            })
        ```

    - `break` and `continue` can only be used in a loop context, so it is not legal Golang code
      if we use them in a closure directly.
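
To make the point above concrete, here is a minimal, self-contained sketch of the shape the converted code takes at runtime. It does not use the real library: `evalStub` is a hypothetical stand-in for `failpoint.Eval`, and the `active` map is an assumed way of toggling a failpoint purely for illustration. Because the closure body ends up as the body of an ordinary `if` statement inside the loop, a labeled `continue` is legal there and the label counts as used.

```go
package main

import (
	"errors"
	"fmt"
)

// active is a hypothetical registry standing in for the failpoint runtime.
var active = map[string]int{"skip-row": 2}

// evalStub is a stand-in for failpoint.Eval (an assumption, not the real API):
// it returns the configured value, or an error when the failpoint is off.
func evalStub(name string) (int, error) {
	v, ok := active[name]
	if !ok {
		return 0, errors.New("failpoint disabled")
	}
	return v, nil
}

// visit walks a 3x3 grid and returns the cells it reached.
func visit() []string {
	var cells []string
outer:
	for i := 0; i < 3; i++ {
		for j := 0; j < 3; j++ {
			// Shape of the converted code: the closure body has become the
			// body of an if statement, so "continue outer" compiles here.
			if val, err := evalStub("skip-row"); err == nil {
				if j == val {
					continue outer
				}
			}
			cells = append(cells, fmt.Sprintf("%d,%d", i, j))
		}
	}
	return cells
}

func main() {
	fmt.Println(len(visit())) // failpoint active: column 2 is skipped in every row
	delete(active, "skip-row") // "disable" the stand-in failpoint
	fmt.Println(len(visit())) // failpoint off: the whole grid is visited
}
```

Writing the same `continue outer` inside a real closure passed to a function would not compile, which is why the marker functions exist for the unconverted source.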

### Some complicated failpoints demo

- Inject a failpoint to the IF INITIAL statement or CONDITIONAL expression

    ```go
    if a, b := func() {
        failpoint.Inject("failpoint-name", func(val failpoint.Value) {
            fmt.Println("unit-test", val)
        })
    }, func() int { return rand.Intn(200) }(); b > func() int {
        failpoint.Inject("failpoint-name", func(val failpoint.Value) int {
            return val.(int)
        })
        return rand.Intn(3000)
    }() && b < func() int {
        failpoint.Inject("failpoint-name-2", func(val failpoint.Value) int {
            return rand.Intn(val.(int))
        })
        return rand.Intn(6000)
    }() {
        a()
        failpoint.Inject("failpoint-name-3", func(val failpoint.Value) {
            fmt.Println("unit-test", val)
        })
    }
    ```

    The above code block will generate something like this:

    ```go
    if a, b := func() {
        if val, _err_ := failpoint.Eval(_curpkg_("failpoint-name")); _err_ == nil {
            fmt.Println("unit-test", val)
        }
    }, func() int { return rand.Intn(200) }(); b > func() int {
        if val, _err_ := failpoint.Eval(_curpkg_("failpoint-name")); _err_ == nil {
            return val.(int)
        }
        return rand.Intn(3000)
    }() && b < func() int {
        if val, _err_ := failpoint.Eval(_curpkg_("failpoint-name-2")); _err_ == nil {
            return rand.Intn(val.(int))
        }
        return rand.Intn(6000)
    }() {
        a()
        if val, _err_ := failpoint.Eval(_curpkg_("failpoint-name-3")); _err_ == nil {
            fmt.Println("unit-test", val)
        }
    }
    ```

- Inject a failpoint to the SELECT statement to make it block one CASE if the failpoint is active

    ```go
    func (s *StoreService) ExecuteStoreTask() {
        select {
        case <-func() chan *StoreTask {
            failpoint.Inject("priority-fp", func(_ failpoint.Value) chan *StoreTask {
                return make(chan *StoreTask)
            })
            return s.priorityHighCh
        }():
            fmt.Println("execute high priority task")

        case <-s.priorityNormalCh:
            fmt.Println("execute normal priority task")

        case <-s.priorityLowCh:
            fmt.Println("execute low priority task")
        }
    }
    ```

    The above code block will generate something like this:

    ```go
    func (s *StoreService) ExecuteStoreTask() {
        select {
        case <-func() chan *StoreTask {
            if _, _err_ := failpoint.Eval(_curpkg_("priority-fp")); _err_ == nil {
                return make(chan *StoreTask)
            }
            return s.priorityHighCh
        }():
            fmt.Println("execute high priority task")

        case <-s.priorityNormalCh:
            fmt.Println("execute normal priority task")

        case <-s.priorityLowCh:
            fmt.Println("execute low priority task")
        }
    }
    ```

- Inject a failpoint to dynamically extend SWITCH CASE arms

    ```go
    switch opType := operator.Type(); {
    case opType == "balance-leader":
        fmt.Println("create balance leader steps")

    case opType == "balance-region":
        fmt.Println("create balance region steps")

    case opType == "scatter-region":
        fmt.Println("create scatter region steps")

    case func() bool {
        failpoint.Inject("dynamic-op-type", func(val failpoint.Value) bool {
            return strings.Contains(val.(string), opType)
        })
        return false
    }():
        fmt.Println("do something")

    default:
        panic("unsupported operator type")
    }
    ```

    The above code block will generate something like this:

    ```go
    switch opType := operator.Type(); {
    case opType == "balance-leader":
        fmt.Println("create balance leader steps")

    case opType == "balance-region":
        fmt.Println("create balance region steps")

    case opType == "scatter-region":
        fmt.Println("create scatter region steps")

    case func() bool {
        if val, _err_ := failpoint.Eval(_curpkg_("dynamic-op-type")); _err_ == nil {
            return strings.Contains(val.(string), opType)
        }
        return false
    }():
        fmt.Println("do something")

    default:
        panic("unsupported operator type")
    }
    ```

- More complicated failpoints

    - Failpoints can also be injected at more complicated sites, such as
        - the INITIAL statement, CONDITIONAL expression and POST statement of a `for` loop
        - the `for` RANGE statement
        - the SWITCH INITIAL statement
        - …
    - Anywhere you can call a function

## Failpoint name best practice

As you see above, `_curpkg_` automatically wraps the original failpoint name in the `failpoint.Eval` call.
You can think of `_curpkg_` as a macro that automatically prepends the current package path to the failpoint name. For example,

```go
package ddl // whose parent package is `github.com/pingcap/tidb`

func demo() {
	// _curpkg_("the-original-failpoint-name") will be expanded as `github.com/pingcap/tidb/ddl/the-original-failpoint-name`
	if val, _err_ := failpoint.Eval(_curpkg_("the-original-failpoint-name")); _err_ == nil {...}
}
```

You do not need to care about `_curpkg_` in your application. It is automatically generated after running `failpoint-ctl enable`
and is deleted with `failpoint-ctl disable`.

Because all failpoints in a package share the same namespace, we need to be careful to
avoid name conflicts. The following naming rules are recommended:

- Keep the name unique in the current subpackage
- Use a self-explanatory name for the failpoint

You can enable failpoints by environment variables; multiple failpoints are separated by `;`:

```shell
GO_FAILPOINTS="github.com/pingcap/tidb/ddl/renameTableErr=return(100);github.com/pingcap/tidb/planner/core/illegalPushDown=return(true);github.com/pingcap/pd/server/schedulers/balanceLeaderFailed=return(true)"
```

## Implementation details

1. Define a group of marker functions
2. Parse imports and skip source files that do not import the failpoint package
3. Traverse the AST to find marker function calls
4. Marker function calls are rewritten as an IF statement, which calls `failpoint.Eval` to determine whether a
   failpoint is active and executes the failpoint code if the failpoint is enabled

![rewrite-demo](./media/rewrite-demo.png)

## Acknowledgments

- Thanks to [gofail](https://github.com/etcd-io/gofail) for providing the initial implementation.
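
## Appendix: sketch of the marker-call scan

As a rough illustration of steps 2 and 3 of the implementation details (skip files that do not import the failpoint package, then traverse the AST for marker function calls), the following sketch uses only the Go standard library. It is not the code used by `failpoint-ctl`; `findMarkers` and `demoSrc` are hypothetical names, and the sketch only recognizes calls whose first argument is a string literal, so a call such as `InjectContext(ctx, ...)` is deliberately not handled.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
)

const failpointPath = "github.com/pingcap/failpoint"

// findMarkers returns "Func(name)" for each <pkg>.<Func>("name", ...) call in
// src, or nil if the file does not import the failpoint package.
// Hypothetical helper for illustration only.
func findMarkers(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return nil, err
	}

	// Step 2: skip files that do not import the failpoint package.
	alias := ""
	for _, imp := range file.Imports {
		path, _ := strconv.Unquote(imp.Path.Value)
		if path != failpointPath {
			continue
		}
		alias = "failpoint"
		if imp.Name != nil {
			alias = imp.Name.Name // the import was renamed
		}
	}
	if alias == "" {
		return nil, nil
	}

	// Step 3: traverse the AST looking for calls on the package identifier.
	var found []string
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		pkg, ok := sel.X.(*ast.Ident)
		if !ok || pkg.Name != alias || len(call.Args) == 0 {
			return true
		}
		if lit, ok := call.Args[0].(*ast.BasicLit); ok && lit.Kind == token.STRING {
			name, _ := strconv.Unquote(lit.Value)
			found = append(found, sel.Sel.Name+"("+name+")")
		}
		return true
	})
	return found, nil
}

const demoSrc = `package main

import "github.com/pingcap/failpoint"

func main() {
	failpoint.Inject("testPanic", func() {
		panic("failpoint triggered")
	})
}
`

func main() {
	markers, err := findMarkers(demoSrc)
	if err != nil {
		panic(err)
	}
	fmt.Println(markers)
}
```

A real rewriter would go on to replace each matched call with the equivalent `if` statement (step 4); this sketch stops at locating the call sites.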