# failpoint
[![LICENSE](https://img.shields.io/github/license/pingcap/failpoint.svg)](https://github.com/pingcap/failpoint/blob/master/LICENSE)
[![Language](https://img.shields.io/badge/Language-Go-blue.svg)](https://golang.org/)
[![Go Report Card](https://goreportcard.com/badge/github.com/pingcap/failpoint)](https://goreportcard.com/report/github.com/pingcap/failpoint)
[![Build Status](https://github.com/pingcap/failpoint/actions/workflows/suite.yml/badge.svg?branch=master)](https://github.com/pingcap/failpoint/actions/workflows/suite.yml?query=event%3Apush+branch%3Amaster)
[![Coverage Status](https://codecov.io/gh/pingcap/failpoint/branch/master/graph/badge.svg)](https://codecov.io/gh/pingcap/failpoint)
[![Mentioned in Awesome Go](https://awesome.re/mentioned-badge.svg)](https://github.com/avelino/awesome-go)

An implementation of [failpoints][failpoint] for Golang. Failpoints are used to add code points where errors may be injected in a user-controlled fashion. A failpoint is a code snippet that is only executed when the corresponding failpoint is active.

[failpoint]: http://www.freebsd.org/cgi/man.cgi?query=fail

## Quick Start (use `failpoint-ctl`)

1.  Build `failpoint-ctl` from source

    ``` bash
    git clone https://github.com/pingcap/failpoint.git
    cd failpoint
    make
    ls bin/failpoint-ctl
    ```

2.  Inject failpoints into your program, e.g.:

    ``` go
    package main

    import "github.com/pingcap/failpoint"

    func main() {
        failpoint.Inject("testPanic", func() {
            panic("failpoint triggered")
        })
    }
    ```

3.  Transform your code with `failpoint-ctl enable`

4.  Build with `go build`

5.  Enable failpoints with the `GO_FAILPOINTS` environment variable

    ``` bash
    GO_FAILPOINTS="main/testPanic=return(true)" ./your-program
    ```

6.  If you use `go run` to run the test, don't forget to add the generated `binding__failpoint_binding__.go` to your command, like:

    ```bash
    GO_FAILPOINTS="main/testPanic=return(true)" go run your-program.go binding__failpoint_binding__.go
    ```

## Quick Start (use `failpoint-toolexec`)

1.  Build `failpoint-toolexec` from source

    ``` bash
    git clone https://github.com/pingcap/failpoint.git
    cd failpoint
    make
    ls bin/failpoint-toolexec
    ```

2.  Inject failpoints into your program, e.g.:

    ``` go
    package main

    import "github.com/pingcap/failpoint"

    func main() {
        failpoint.Inject("testPanic", func() {
            panic("failpoint triggered")
        })
    }
    ```

3.  Use a separate build cache so that it does not mix with caches from builds made without `failpoint-toolexec`, then build

    `GOCACHE=/tmp/failpoint-cache go build -toolexec path/to/failpoint-toolexec`

4.  Enable failpoints with the `GO_FAILPOINTS` environment variable

    ``` bash
    GO_FAILPOINTS="main/testPanic=return(true)" ./your-program
    ```

5.  You can also use `go run` or `go test`, like:

    ```bash
    GOCACHE=/tmp/failpoint-cache GO_FAILPOINTS="main/testPanic=return(true)" go run -toolexec path/to/failpoint-toolexec your-program.go
    ```

## Design principles

- Define failpoints in valid Golang code, not in comments or anything else
- Failpoints do not have any extra cost

    - Will not take effect on regular logic
    - Will not cause regular code performance regression
    - Failpoint code will not appear in the final binary

- Failpoint routines are writable/readable and are checked by the compiler
- Code generated from a failpoint definition is easy to read
- Keep line numbers the same as in the original code (easier to debug)
- Support parallel tests with `context.Context`

## Key concepts

- Failpoint

    A failpoint is a code snippet that is only executed when the corresponding failpoint is active.
    The closure will never be executed if `failpoint.Disable("failpoint-name-for-demo")` is executed.

    ```go
    var outerVar = "declare in outer scope"
    failpoint.Inject("failpoint-name-for-demo", func(val failpoint.Value) {
        fmt.Println("unit-test", val, outerVar)
    })
    ```

- Marker functions

    - A marker function is just an empty function

        - It hints the rewriter to rewrite the call with an equivalent statement
        - It receives some parameters as the rewrite rule
        - It is inlined at compile time and emits nothing to the binary (zero cost)
        - Variables in the external scope can be accessed in the closure by capturing, and the converted code is still legal
        because all captured variables are located in the outer scope of the IF statement.

    - It is easy to write/read
    - It introduces a compiler check for failpoints: invalid failpoint code fails to compile even in regular mode

- Marker function list

    - `func Inject(fpname string, fpblock func(val Value)) {}`
    - `func InjectContext(ctx context.Context, fpname string, fpblock func(val Value)) {}`
    - `func Break(label ...string) {}`
    - `func Goto(label string) {}`
    - `func Continue(label ...string) {}`
    - `func Fallthrough() {}`
    - `func Return(results ...interface{}) {}`
    - `func Label(label string) {}`

- Supported failpoint environment variable

    Failpoints can be enabled by exporting environment variables with the following pattern, which is quite similar to [freebsd failpoint SYSCTL VARIABLES](https://www.freebsd.org/cgi/man.cgi?query=fail)

    ```regexp
    [<percent>%][<count>*]<type>[(args...)][-><more terms>]
    ```

    The `<type>` argument specifies which action to take; it can be one of:

    - off: Take no action (does not trigger failpoint code)
    - return: Trigger failpoint with specified argument
    - sleep: Sleep the specified number of milliseconds
    - panic: Panic
    - break: Execute gdb and break into debugger
    - print: Print failpoint path for inject variable
    - pause: Pause until the failpoint is disabled

## How to inject a failpoint into your program

- You can call `failpoint.Inject` to inject a failpoint at the call site, where `failpoint-name` is
used to trigger the failpoint and `failpoint-closure` will be expanded as the body of the IF statement.

    ```go
    failpoint.Inject("failpoint-name", func(val failpoint.Value) {
        failpoint.Return("unit-test", val)
    })
    ```

    The converted code looks like:

    ```go
    if val, _err_ := failpoint.Eval(_curpkg_("failpoint-name")); _err_ == nil {
        return "unit-test", val
    }
    ```

- `failpoint.Value` is the value passed by `failpoint.Enable("failpoint-name", "return(5)")`,
and it can be ignored.

    ```go
    failpoint.Inject("failpoint-name", func(_ failpoint.Value) {
        fmt.Println("unit-test")
    })
    ```

    OR

    ```go
    failpoint.Inject("failpoint-name", func() {
        fmt.Println("unit-test")
    })
    ```

    And the converted code looks like:

    ```go
    if _, _err_ := failpoint.Eval(_curpkg_("failpoint-name")); _err_ == nil {
        fmt.Println("unit-test")
    }
    ```

- Also, you can use `failpoint.InjectContext`, which takes a `context.Context`. You can
use the `context.Context` for customized control, such as deciding whether a failpoint is
active in parallel tests or other cases. For example,

    ```go
    failpoint.InjectContext(ctx, "failpoint-name", func(val failpoint.Value) {
        fmt.Println("unit-test", val)
    })
    ```

    The converted code looks like:

    ```go
    if val, _err_ := failpoint.EvalContext(ctx, _curpkg_("failpoint-name")); _err_ == nil {
        fmt.Println("unit-test", val)
    }
    ```

- You can pass `nil` as the `context.Context`; the generated code then behaves like the non-context version above. For example,

    ```go
    failpoint.InjectContext(nil, "failpoint-name", func(val failpoint.Value) {
        fmt.Println("unit-test", val)
    })
    ```

    Becomes

    ```go
    if val, _err_ := failpoint.EvalContext(nil, _curpkg_("failpoint-name")); _err_ == nil {
        fmt.Println("unit-test", val)
    }
    ```

- You can control a failpoint with `failpoint.WithHook`

    ```go
    func (s *dmlSuite) TestCRUDParallel() {
        sctx := failpoint.WithHook(context.Background(), func(ctx context.Context, fpname string) bool {
            return ctx.Value(fpname) != nil // Determine by ctx key
        })
        insertFailpoints := map[string]struct{}{
            "insert-record-fp": {},
            "insert-index-fp": {},
            "on-duplicate-fp": {},
        }
        ictx := failpoint.WithHook(context.Background(), func(ctx context.Context, fpname string) bool {
            _, found := insertFailpoints[fpname] // Enables only these failpoints.
            return found
        })
        deleteFailpoints := map[string]struct{}{
            "tikv-is-busy-fp": {},
            "fetch-tso-timeout": {},
        }
        dctx := failpoint.WithHook(context.Background(), func(ctx context.Context, fpname string) bool {
            _, found := deleteFailpoints[fpname] // Disables only these failpoints.
            return !found
        })
        // other DML parallel test cases.
        s.RunParallel(buildSelectTests(sctx))
        s.RunParallel(buildInsertTests(ictx))
        s.RunParallel(buildDeleteTests(dctx))
    }
    ```

- If you use a failpoint in a loop context, you may need other marker functions.

    ```go
    failpoint.Label("outer")
    for i := 0; i < 100; i++ {
        failpoint.Label("inner")
        for j := 0; j < 1000; j++ {
            switch rand.Intn(j) + i {
            case j / 5:
                failpoint.Break()
            case j / 7:
                failpoint.Continue("outer")
            case j / 9:
                failpoint.Fallthrough()
            case j / 10:
                failpoint.Goto("outer")
            default:
                failpoint.Inject("failpoint-name", func(val failpoint.Value) {
                    fmt.Println("unit-test", val.(int))
                    if val == j/11 {
                        failpoint.Break("inner")
                    } else {
                        failpoint.Goto("outer")
                    }
                })
            }
        }
    }
    ```

    The above code block will generate the following code:

    ```go
    outer:
        for i := 0; i < 100; i++ {
        inner:
            for j := 0; j < 1000; j++ {
                switch rand.Intn(j) + i {
                case j / 5:
                    break
                case j / 7:
                    continue outer
                case j / 9:
                    fallthrough
                case j / 10:
                    goto outer
                default:
                    if val, _err_ := failpoint.Eval(_curpkg_("failpoint-name")); _err_ == nil {
                        fmt.Println("unit-test", val.(int))
                        if val == j/11 {
                            break inner
                        } else {
                            goto outer
                        }
                    }
                }
            }
        }
    ```

- You may wonder why we do not use `label`, `break`, `continue`, and `fallthrough` directly
instead of using failpoint marker functions.

    - Golang does not permit unused symbols such as identifiers or labels, so a label that is
    only used in the failpoint closure is invalid. For example,

        ```go
        label1: // compiler error: unused label1
            failpoint.Inject("failpoint-name", func(val failpoint.Value) {
                if val.(int) == 1000 {
                    goto label1 // illegal to use goto here
                }
                fmt.Println("unit-test", val)
            })
        ```

    - `break` and `continue` can only be used in a loop context, so using them directly
    in a closure is not legal Golang code.

### Some complicated failpoints demo

- Inject a failpoint to the IF INITIAL statement or CONDITIONAL expression

    ```go
    if a, b := func() {
        failpoint.Inject("failpoint-name", func(val failpoint.Value) {
            fmt.Println("unit-test", val)
        })
    }, func() int { return rand.Intn(200) }(); b > func() int {
        failpoint.Inject("failpoint-name", func(val failpoint.Value) int {
            return val.(int)
        })
        return rand.Intn(3000)
    }() && b < func() int {
        failpoint.Inject("failpoint-name-2", func(val failpoint.Value) int {
            return rand.Intn(val.(int))
        })
        return rand.Intn(6000)
    }() {
        a()
        failpoint.Inject("failpoint-name-3", func(val failpoint.Value) {
            fmt.Println("unit-test", val)
        })
    }
    ```

    The above code block will generate something like this:

    ```go
    if a, b := func() {
        if val, _err_ := failpoint.Eval(_curpkg_("failpoint-name")); _err_ == nil {
            fmt.Println("unit-test", val)
        }
    }, func() int { return rand.Intn(200) }(); b > func() int {
        if val, _err_ := failpoint.Eval(_curpkg_("failpoint-name")); _err_ == nil {
            return val.(int)
        }
        return rand.Intn(3000)
    }() && b < func() int {
        if val, _err_ := failpoint.Eval(_curpkg_("failpoint-name-2")); _err_ == nil {
            return rand.Intn(val.(int))
        }
        return rand.Intn(6000)
    }() {
        a()
        if val, _err_ := failpoint.Eval(_curpkg_("failpoint-name-3")); _err_ == nil {
            fmt.Println("unit-test", val)
        }
    }
    ```

- Inject a failpoint to the SELECT statement to make it block one CASE if the failpoint is active

    ```go
    func (s *StoreService) ExecuteStoreTask() {
        select {
        case <-func() chan *StoreTask {
            failpoint.Inject("priority-fp", func(_ failpoint.Value) {
                return make(chan *StoreTask)
            })
            return s.priorityHighCh
        }():
            fmt.Println("execute high priority task")

        case <- s.priorityNormalCh:
            fmt.Println("execute normal priority task")

        case <- s.priorityLowCh:
            fmt.Println("execute low priority task")
        }
    }
    ```

    The above code block will generate something like this:

    ```go
    func (s *StoreService) ExecuteStoreTask() {
        select {
        case <-func() chan *StoreTask {
            if _, _err_ := failpoint.Eval(_curpkg_("priority-fp")); _err_ == nil {
                return make(chan *StoreTask)
            }
            return s.priorityHighCh
        }():
            fmt.Println("execute high priority task")

        case <- s.priorityNormalCh:
            fmt.Println("execute normal priority task")

        case <- s.priorityLowCh:
            fmt.Println("execute low priority task")
        }
    }
    ```

- Inject a failpoint to dynamically extend SWITCH CASE arms

    ```go
    switch opType := operator.Type(); {
    case opType == "balance-leader":
        fmt.Println("create balance leader steps")

    case opType == "balance-region":
        fmt.Println("create balance region steps")

    case opType == "scatter-region":
        fmt.Println("create scatter region steps")

    case func() bool {
        failpoint.Inject("dynamic-op-type", func(val failpoint.Value) bool {
            return strings.Contains(val.(string), opType)
        })
        return false
    }():
        fmt.Println("do something")

    default:
        panic("unsupported operator type")
    }
    ```

    The above code block will generate something like this:

    ```go
    switch opType := operator.Type(); {
    case opType == "balance-leader":
        fmt.Println("create balance leader steps")

    case opType == "balance-region":
        fmt.Println("create balance region steps")

    case opType == "scatter-region":
        fmt.Println("create scatter region steps")

    case func() bool {
        if val, _err_ := failpoint.Eval(_curpkg_("dynamic-op-type")); _err_ == nil {
            return strings.Contains(val.(string), opType)
        }
        return false
    }():
        fmt.Println("do something")

    default:
        panic("unsupported operator type")
    }
    ```

- More complicated failpoints

    - There are more complicated sites where failpoints can be injected
        - the FOR loop INITIAL statement, CONDITIONAL expression and POST statement
        - the RANGE statement
        - the SWITCH INITIAL statement
        - …
    - Anywhere you can call a function

## Failpoint name best practice

As shown above, the rewriter automatically wraps the original failpoint name with `_curpkg_` in the `failpoint.Eval` call.
You can think of `_curpkg_` as a macro that automatically prepends the current package path to the failpoint name. For example,

```go
package ddl // whose parent package is `github.com/pingcap/tidb`

func demo() {
    // _curpkg_("the-original-failpoint-name") will be expanded as `github.com/pingcap/tidb/ddl/the-original-failpoint-name`
    if val, _err_ := failpoint.Eval(_curpkg_("the-original-failpoint-name")); _err_ == nil {...}
}
```

You do not need to care about `_curpkg_` in your application. It is automatically generated after running `failpoint-ctl enable`
and is deleted with `failpoint-ctl disable`.

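
As a rough illustration of the "prepend the current package path" behavior described above, the following standard-library sketch derives the package path by reflection and joins it with the failpoint name. The names `bindingType`, `pkgPath`, and `curpkg` are hypothetical, and the real generated binding file may differ in its details.

```go
package main

import (
	"fmt"
	"reflect"
)

// bindingType exists only so that reflection can report
// the package in which it is declared.
type bindingType struct{}

// pkgPath is the path of the package that contains this file.
var pkgPath = reflect.TypeOf(bindingType{}).PkgPath()

// curpkg is a hypothetical stand-in for the generated _curpkg_ helper:
// it prefixes the failpoint name with the current package path.
func curpkg(name string) string {
	return pkgPath + "/" + name
}

func main() {
	fmt.Println(curpkg("testPanic"))
}
```

When this file is built as a `main` package, the printed path should have the same `main/testPanic` shape used with `GO_FAILPOINTS` in the Quick Start sections.
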
Because all failpoints in a package share the same namespace, we need to be careful to
avoid name conflicts. The following naming rules are recommended:

- Keep the name unique within the current subpackage
- Use a self-explanatory name for the failpoint

    You can enable failpoints by environment variables:
    ```shell
    GO_FAILPOINTS="github.com/pingcap/tidb/ddl/renameTableErr=return(100);github.com/pingcap/tidb/planner/core/illegalPushDown=return(true);github.com/pingcap/pd/server/schedulers/balanceLeaderFailed=return(true)"
    ```

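
The `GO_FAILPOINTS` value above is a `;`-separated list of `<failpoint path>=<terms>` entries. A minimal sketch of splitting such a value, using only the standard library, might look like the following; `parseFailpoints` is a hypothetical helper for illustration, not a function of this package.

```go
package main

import (
	"fmt"
	"strings"
)

// parseFailpoints splits a GO_FAILPOINTS-style string into a map from
// failpoint path to its raw terms. Empty entries are skipped.
func parseFailpoints(env string) (map[string]string, error) {
	out := make(map[string]string)
	for _, entry := range strings.Split(env, ";") {
		if entry == "" {
			continue
		}
		// Cut at the first '=' only, so the terms may contain '=' themselves.
		name, terms, ok := strings.Cut(entry, "=")
		if !ok || name == "" {
			return nil, fmt.Errorf("malformed failpoint entry %q", entry)
		}
		out[name] = terms
	}
	return out, nil
}

func main() {
	env := "github.com/pingcap/tidb/ddl/renameTableErr=return(100);" +
		"github.com/pingcap/tidb/planner/core/illegalPushDown=return(true)"
	fps, err := parseFailpoints(env)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for name, terms := range fps {
		fmt.Printf("%s -> %s\n", name, terms)
	}
}
```

Because every key is a full package path plus the failpoint name, two packages can reuse the same short name without colliding.
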
## Implementation details

1. Define a group of marker functions
2. Parse imports and skip source files that do not import the failpoint package
3. Traverse the AST to find marker function calls
4. Rewrite each marker function call as an IF statement, which calls `failpoint.Eval` to determine whether the
failpoint is active and executes the failpoint code if it is enabled

![rewrite-demo](./media/rewrite-demo.png)

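
The traversal step can be sketched with the standard `go/parser` and `go/ast` packages. The example below only locates `failpoint.Inject` calls and collects their name literals; it does not rewrite anything, it assumes the package is imported under its default name `failpoint`, and `findInjectCalls` is a hypothetical helper rather than the rewriter's actual code.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
)

// src is a small input file; it is parsed only, never compiled,
// so the failpoint package does not need to be installed.
const src = `package demo

import "github.com/pingcap/failpoint"

func f() {
	failpoint.Inject("first-fp", func() {})
	failpoint.Inject("second-fp", func(val failpoint.Value) {})
}
`

// findInjectCalls returns the string literals passed as the first
// argument of failpoint.Inject calls, in source order.
func findInjectCalls(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return nil, err
	}
	var names []string
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok || sel.Sel.Name != "Inject" {
			return true
		}
		if pkg, ok := sel.X.(*ast.Ident); !ok || pkg.Name != "failpoint" {
			return true
		}
		if len(call.Args) > 0 {
			if lit, ok := call.Args[0].(*ast.BasicLit); ok && lit.Kind == token.STRING {
				if name, err := strconv.Unquote(lit.Value); err == nil {
					names = append(names, name)
				}
			}
		}
		return true
	})
	return names, nil
}

func main() {
	names, err := findInjectCalls(src)
	fmt.Println(names, err)
}
```

A real rewriter would also have to resolve import aliases and replace each matched call with the corresponding IF statement.
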
## Acknowledgments

- Thanks to [gofail](https://github.com/etcd-io/gofail) for providing the initial implementation.