github.com/badrootd/nibiru-cometbft@v0.37.5-0.20240307173500-2a75559eee9b/docs/rfc/rfc-008-do-not-panic.md (about) 1 # RFC 008: Don't Panic 2 3 ## Changelog 4 5 - 2021-12-17: initial draft (@tychoish) 6 7 ## Abstract 8 9 Today, the Tendermint core codebase has panics in a number of cases as 10 a response to exceptional situations. These panics complicate testing, 11 and might make tendermint components difficult to use as a library in 12 some circumstances. This document outlines a project of converting 13 panics to errors and describes the situations where its safe to 14 panic. 15 16 ## Background 17 18 Panics in Go are a great mechanism for aborting the current execution 19 for truly exceptional situations (e.g. memory errors, data corruption, 20 processes initialization); however, because they resemble exceptions 21 in other languages, it can be easy to over use them in the 22 implementation of software architectures. This certainly happened in 23 the history of Tendermint, and as we embark on the project of 24 stabilizing the package, we find ourselves in the right moment to 25 reexamine our use of panics, and largely where panics happen in the 26 code base. 27 28 There are still some situations where panics are acceptable and 29 desireable, but it's important that Tendermint, as a project, comes to 30 consensus--perhaps in the text of this document--on the situations 31 where it is acceptable to panic. 32 33 ### References 34 35 - [Defer Panic and Recover](https://go.dev/blog/defer-panic-and-recover) 36 - [Why Go gets exceptions right](https://dave.cheney.net/tag/panic) 37 - [Don't panic](https://dave.cheney.net/practical-go/presentations/gophercon-singapore-2019.html#_dont_panic) 38 39 ## Discussion 40 41 ### Acceptable Panics 42 43 #### Initialization 44 45 It is unambiguously safe (and desireable) to panic in `init()` 46 functions in response to any kind of error. These errors are caught by 47 tests, and occur early enough in process initialization that they 48 won't cause unexpected runtime crashes. 49 50 Other code that is called early in process initialization MAY panic, 51 in some situations if it's not possible to return an error or cause 52 the process to abort early, although these situations should be 53 vanishingly slim. 54 55 #### Data Corruption 56 57 If Tendermint code encounters an inconsistency that could be 58 attributed to data corruption or a logical impossibility it is safer 59 to panic and crash the process than continue to attempt to make 60 progress in these situations. 61 62 Examples including reading data out of the storage engine that 63 is invalid or corrupt, or encountering an ambiguous situation where 64 the process should halt. Generally these forms of corruption are 65 detected after interacting with a trusted but external data source, 66 and reflect situations where the author thinks its safer to terminate 67 the process immediately rather than allow execution to continue. 68 69 #### Unrecoverable Consensus Failure 70 71 In general, a panic should be used in the case of unrecoverable 72 consensus failures. If a process detects that the network is 73 behaving in an incoherent way and it does not have a clearly defined 74 and mechanism for recovering, the process should panic. 75 76 #### Static Validity 77 78 It is acceptable to panic for invariant violations, within a library 79 or package, in situations that should be statically impossible, 80 because there is no way to make these kinds of assertions at compile 81 time. 82 83 For example, type-asserting `interface{}` values returned by 84 `container/list` and `container/heap` (and similar), is acceptable, 85 because package authors should have exclusive control of the inputs to 86 these containers. Packages should not expose the ability to add 87 arbitrary values to these data structures. 88 89 #### Controlled Panics Within Libraries 90 91 In some algorithms with highly recursive structures or very nested 92 call patterns, using a panic, in combination with conditional recovery 93 handlers results in more manageable code. Ultimately this is a limited 94 application, and implementations that use panics internally should 95 only recover conditionally, filtering out panics rather than ignoring 96 or handling all panics. 97 98 #### Request Handling 99 100 Code that handles responses to incoming/external requests 101 (e.g. `http.Handler`) should avoid panics, but practice this isn't 102 totally possible, and it makes sense that request handlers have some 103 kind of default recovery mechanism that will prevent one request from 104 terminating a service. 105 106 ### Unacceptable Panics 107 108 In **no** other situation is it acceptable for the code to panic: 109 110 - there should be **no** controlled panics that callers are required 111 to handle across library/package boundaries. 112 - callers of library functions should not expect panics. 113 - ensuring that arbitrary go routines can't panic. 114 - ensuring that there are no arbitrary panics in core production code, 115 espically code that can run at any time during the lifetime of a 116 process. 117 - all test code and fixture should report normal test assertions with 118 a mechanism like testify's `require` assertion rather than calling 119 panic directly. 120 121 The goal of this increased "panic rigor" is to ensure that any escaped 122 panic is reflects a fixable bug in Tendermint. 123 124 ### Removing Panics 125 126 The process for removing panics involve a few steps, and will be part 127 of an ongoing process of code modernization: 128 129 - converting existing explicit panics to errors in cases where it's 130 possible to return an error, the errors can and should be handled, and returning 131 an error would not lead to data corruption or cover up data 132 corruption. 133 134 - increase rigor around operations that can cause runtime errors, like 135 type assertions, nil pointer errors, array bounds access issues, and 136 either avoid these situations or return errors where possible. 137 138 - remove generic panic handlers which could cover and hide known 139 panics.