github.com/badrootd/nibiru-cometbft@v0.37.5-0.20240307173500-2a75559eee9b/docs/rfc/rfc-008-do-not-panic.md

github.com/badrootd/nibiru-cometbft@v0.37.5-0.20240307173500-2a75559eee9b/docs/rfc/rfc-008-do-not-panic.md (about)

     1  # RFC 008: Don't Panic
     2  
     3  ## Changelog
     4  
     5  - 2021-12-17: initial draft (@tychoish)
     6  
     7  ## Abstract
     8  
     9  Today, the Tendermint core codebase has panics in a number of cases as
    10  a response to exceptional situations. These panics complicate testing,
    11  and might make tendermint components difficult to use as a library in
    12  some circumstances. This document outlines a project of converting
    13  panics to errors and describes the situations where its safe to
    14  panic.
    15  
    16  ## Background
    17  
    18  Panics in Go are a great mechanism for aborting the current execution
    19  for truly exceptional situations (e.g. memory errors, data corruption,
    20  processes initialization); however, because they resemble exceptions
    21  in other languages, it can be easy to over use them in the
    22  implementation of software architectures. This certainly happened in
    23  the history of Tendermint, and as we embark on the project of
    24  stabilizing the package, we find ourselves in the right moment to
    25  reexamine our use of panics, and largely where panics happen in the
    26  code base.
    27  
    28  There are still some situations where panics are acceptable and
    29  desireable, but it's important that Tendermint, as a project, comes to
    30  consensus--perhaps in the text of this document--on the situations
    31  where it is acceptable to panic.
    32  
    33  ### References
    34  
    35  - [Defer Panic and Recover](https://go.dev/blog/defer-panic-and-recover)
    36  - [Why Go gets exceptions right](https://dave.cheney.net/tag/panic)
    37  - [Don't panic](https://dave.cheney.net/practical-go/presentations/gophercon-singapore-2019.html#_dont_panic)
    38  
    39  ## Discussion
    40  
    41  ### Acceptable Panics
    42  
    43  #### Initialization
    44  
    45  It is unambiguously safe (and desireable) to panic in `init()`
    46  functions in response to any kind of error. These errors are caught by
    47  tests, and occur early enough in process initialization that they
    48  won't cause unexpected runtime crashes.
    49  
    50  Other code that is called early in process initialization MAY panic,
    51  in some situations if it's not possible to return an error or cause
    52  the process to abort early, although these situations should be
    53  vanishingly slim.
    54  
    55  #### Data Corruption
    56  
    57  If Tendermint code encounters an inconsistency that could be
    58  attributed to data corruption or a logical impossibility it is safer
    59  to panic and crash the process than continue to attempt to make
    60  progress in these situations.
    61  
    62  Examples including reading data out of the storage engine that
    63  is invalid or corrupt, or encountering an ambiguous situation where
    64  the process should halt. Generally these forms of corruption are
    65  detected after interacting with a trusted but external data source,
    66  and reflect situations where the author thinks its safer to terminate
    67  the process immediately rather than allow execution to continue.
    68  
    69  #### Unrecoverable Consensus Failure
    70  
    71  In general, a panic should be used in the case of unrecoverable
    72  consensus failures. If a process detects that the network is
    73  behaving in an incoherent way and it does not have a clearly defined
    74  and mechanism for recovering, the process should panic.
    75  
    76  #### Static Validity
    77  
    78  It is acceptable to panic for invariant violations, within a library
    79  or package, in situations that should be statically impossible,
    80  because there is no way to make these kinds of assertions at compile
    81  time.
    82  
    83  For example, type-asserting `interface{}` values returned by
    84  `container/list` and `container/heap` (and similar), is acceptable,
    85  because package authors should have exclusive control of the inputs to
    86  these containers. Packages should not expose the ability to add
    87  arbitrary values to these data structures.
    88  
    89  #### Controlled Panics Within Libraries
    90  
    91  In some algorithms with highly recursive structures or very nested
    92  call patterns, using a panic, in combination with conditional recovery
    93  handlers results in more manageable code. Ultimately this is a limited
    94  application, and implementations that use panics internally should
    95  only recover conditionally, filtering out panics rather than ignoring
    96  or handling all panics.
    97  
    98  #### Request Handling
    99  
   100  Code that handles responses to incoming/external requests
   101  (e.g. `http.Handler`) should avoid panics, but practice this isn't
   102  totally possible, and it makes sense that request handlers have some
   103  kind of default recovery mechanism that will prevent one request from
   104  terminating a service.
   105  
   106  ### Unacceptable Panics
   107  
   108  In **no** other situation is it acceptable for the code to panic:
   109  
   110  - there should be **no** controlled panics that callers are required
   111    to handle across library/package boundaries.
   112  - callers of library functions should not expect panics.
   113  - ensuring that arbitrary go routines can't panic.
   114  - ensuring that there are no arbitrary panics in core production code,
   115    espically code that can run at any time during the lifetime of a
   116    process.
   117  - all test code and fixture should report normal test assertions with
   118    a mechanism like testify's `require` assertion rather than calling
   119    panic directly.
   120  
   121  The goal of this increased "panic rigor" is to ensure that any escaped
   122  panic is reflects a fixable bug in Tendermint.
   123  
   124  ### Removing Panics
   125  
   126  The process for removing panics involve a few steps, and will be part
   127  of an ongoing process of code modernization:
   128  
   129  - converting existing explicit panics to errors in cases where it's
   130    possible to return an error, the errors can and should be handled, and returning
   131    an error would not lead to data corruption or cover up data
   132    corruption.
   133  
   134  - increase rigor around operations that can cause runtime errors, like
   135    type assertions, nil pointer errors, array bounds access issues, and
   136    either avoid these situations or return errors where possible.
   137  
   138  - remove generic panic handlers which could cover and hide known
   139    panics.