
# Minority split 2021-08-27 post mortem

This is a post-mortem concerning the minority chain split that occurred on Ethereum mainnet at block [13107518](https://etherscan.io/block/13107518).

## Timeline

- 2021-08-17: Guido Vranken submitted a bounty report. Investigation started, root cause identified, patch variations discussed.
- 2021-08-18: Made a public announcement over Twitter about a security release the upcoming Tuesday [1]. Downstream projects were also notified about the upcoming patch release.
- 2021-08-24: Released [v1.10.8](https://github.com/theQRL/go-zond/releases/tag/v1.10.8) containing the fix on Tuesday morning (CET). Erigon released [v2021.08.04](https://github.com/ledgerwatch/erigon/releases/tag/v2021.08.04).
- 2021-08-27: At 12:50:07 UTC, the issue was exploited. Analysis started roughly 30 minutes later.

## Bounty report

### 2021-08-17 RETURNDATA corruption via datacopy

On 2021-08-17, Guido Vranken submitted a report to bounty@ethereum.org. This coincided with a geth meetup in Berlin, so the geth team could analyse the issue fairly quickly.

He submitted a proof of concept which called the `dataCopy` precompile with input and output slices that overlapped but were shifted relative to each other. Doing a `copy` where `src` and `dest` overlap is not a problem in itself; however, the `returnData` slice was _also_ using the same memory as its backing array.

#### Technical details

During CALL variants, `geth` does not copy the input. This was changed at one point in response to a DoS attack reported by Hubert Ritzdorf: avoiding repeated copies of the data on repeated `CALL`s essentially combats a DoS via `malloc`. Further, the datacopy precompile also does not copy the data, but just returns the same slice. This is fine so far.

After the execution of `dataCopy`, we copy the `ret` into the designated memory area, and this is what causes the problem. We are copying a slice of memory over a slice of memory, and this operation modifies (shifts) the data in the source, the `ret`. This means we wind up with corrupted returndata.

```
1. Calling datacopy

  memory: [0, 1, 2, 3, 4]
  in (mem[0:4]) : [0,1,2,3]
  out (mem[1:5]): [1,2,3,4]

2. dataCopy returns

  returndata (==in, mem[0:4]): [0,1,2,3]

3. Copy in -> out

  => memory: [0,0,1,2,3]
  => returndata: [0,0,1,2]
```
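
The same effect can be reproduced with ordinary Go slices. The following is a minimal sketch, not geth's code: `dataCopy` and `callDataCopy` are hypothetical stand-ins for the precompile and the CALL handling described above, with the offsets taken from the walkthrough.

```go
package main

import "fmt"

// dataCopy mimics the datacopy precompile as described above:
// it returns its input slice without copying it.
func dataCopy(in []byte) []byte {
	return in
}

// callDataCopy mimics the vulnerable flow (hypothetical helper):
// the return value aliases memory and is then written into an
// overlapping output area of that same memory.
func callDataCopy(memory []byte, inOff, outOff, size int) []byte {
	in := memory[inOff : inOff+size]
	ret := dataCopy(in) // ret shares memory's backing array
	// Writing the output shifts the very bytes that ret points at.
	copy(memory[outOff:outOff+size], ret)
	return ret
}

func main() {
	memory := []byte{0, 1, 2, 3, 4}
	returnData := callDataCopy(memory, 0, 1, 4)
	fmt.Println("memory:    ", memory)     // [0 0 1 2 3]
	fmt.Println("returndata:", returnData) // [0 0 1 2], not the expected [0 1 2 3]
}
```

The memory contents are what any client would produce; only the returndata differs, which is why the bug stays invisible until a contract reads the `RETURNDATA` buffer.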

#### Summary

A memory-corruption bug within the EVM can cause a consensus error, where vulnerable nodes obtain a different `stateRoot` when processing a maliciously crafted transaction. This, in turn, would lead to a chain split, with mainnet dividing into two forks.

#### Handling

On the evening of the 17th, we discussed options on how to handle it. We made a state test to reproduce the issue, verified that neither `openethereum`, `nethermind` nor `besu` were affected by the same vulnerability, and started a full sync with a patched version of `geth`.

It was decided that in this specific instance, it would be possible to make a public announcement and a patch release:

- The fix can be made pretty 'generically', e.g. always copying data on input to precompiles.
- The flaw is pretty difficult to find, given a generic fix in the call. The attacker needs to figure out that it concerns the precompiles, specifically the datacopy, that it concerns the `RETURNDATA` buffer rather than the regular memory, and lastly the special circumstances needed to trigger it (overlapping but shifted input/output).

Since we had merged the removal of `ETH65`, if the entire network were to upgrade, then nodes which have not yet implemented `ETH66` would be cut off from the network. After further discussions, we decided to:

- Announce an upcoming security release on Tuesday (August 24th), via Twitter and official channels, plus reach out to downstream projects.
- Temporarily revert the `ETH65`-removal.
- Place the fix into the PR optimizing the jumpdest analysis [23381](https://github.com/theQRL/go-zond/pull/23381).
- After 4-8 weeks, release details about the vulnerability.

## Exploit

At block [13107518](https://etherscan.io/block/13107518), mined at Aug-27-2021 12:50:07 PM +UTC, a minority chain split occurred. The Discord user @AlexSSD7 notified the allcoredevs channel on the Eth R&D Discord at 13:09 UTC on Aug 27.

At 14:09 UTC, it was confirmed that the transaction `0x1cb6fb36633d270edefc04d048145b4298e67b8aa82a9e5ec4aa1435dd770ce4` had triggered the bug, leading to a minority split of the chain. The term 'minority split' means that the majority of miners continued to mine on the correct chain.

At 14:17 UTC, @mhswende tweeted out about the issue [2].

The attack was sent from an account funded from Tornado Cash.

It was also found that the same attack had been carried out on the BSC chain at roughly the same time, in a block mined [12 minutes earlier](https://bscscan.com/tx/0xf667f820631f6adbd04a4c92274374034a3e41fa9057dc42cb4e787535136dce), at Aug-27-2021 12:38:30 PM +UTC.

The blocks on the 'bad' chain were investigated, and Tim Beiko reached out to those mining operators on the minority chain who could be identified via block extradata.

## Lessons learned

### Disclosure decision

The geth team has an official policy regarding [vulnerability disclosure](https://geth.ethereum.org/docs/vulnerabilities/vulnerabilities).

> The primary goal for the Geth team is the health of the Ethereum network as a whole, and the decision whether or not to publish details about a serious vulnerability boils down to minimizing the risk and/or impact of discovery and exploitation.

In this case, it was decided that a public pre-announcement followed by a patch release would likely give a critical mass of nodes/miners a sufficient window to upgrade before the issue could be exploited. In hindsight, this was a dangerous decision, and it's unlikely that the same decision would be reached were a similar incident to happen again.

### Disclosure path

Several subprojects were informed about the upcoming security patch:

- Polygon/Matic
- MEV
- Avalanche
- Erigon
- BSC
- EWF
- Quorum
- ETC
- xDAI

However, some were 'lost', and only notified later:

- Optimism
- Summa
- Harmony

Action point: create a low-volume geth-announce@ethereum.org email list where dependent projects/operators can receive public announcements.
- This has been done. If you wish to receive release- and security announcements, sign up [here](https://groups.google.com/a/ethereum.org/g/geth-announce/about)

### Fork monitoring

The fork monitor behaved 'ok' during the incident, but had to be restarted during the evening.

Action point: improve the resiliency of the forkmon, which currently does not perform well when many nodes are connected.

Action point: enable push-based alerts to be sent from the forkmon, to speed up the fork detection.
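
One way such an alert could be triggered is to compare the block hash that each monitored node reports at the same height and to raise an alert as soon as they disagree. The sketch below is an assumption about a possible approach, not a description of how the forkmon works; `headReport`, `detectSplit`, and the node names and hashes are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// headReport is a hypothetical record of the block hash one node
// reports at a given height.
type headReport struct {
	Node string
	Hash string
}

// detectSplit groups nodes by the hash they report for the same height.
// It returns the groups when more than one distinct hash is seen,
// and nil when all nodes agree.
func detectSplit(reports []headReport) map[string][]string {
	groups := make(map[string][]string)
	for _, r := range reports {
		groups[r.Hash] = append(groups[r.Hash], r.Node)
	}
	if len(groups) <= 1 {
		return nil
	}
	for _, nodes := range groups {
		sort.Strings(nodes) // stable ordering for the alert text
	}
	return groups
}

func main() {
	reports := []headReport{
		{"node-a", "0xaaa"},
		{"node-b", "0xaaa"},
		{"node-c", "0xbbb"},
	}
	if split := detectSplit(reports); split != nil {
		// A real monitor would push this to an alerting channel.
		fmt.Printf("ALERT: %d competing blocks at the same height: %v\n", len(split), split)
	}
}
```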

## Links

- [1] https://twitter.com/go_ethereum/status/1428051458763763721
- [2] https://twitter.com/mhswende/status/1431259601530458112

## Appendix

### Subprojects

The projects were sent variations of the following text:

```
We have identified a security issue with go-ethereum, and will issue a
new release (v1.10.8) on Tuesday next week.

At this point, we will not disclose details about the issue, but
recommend downstream/dependent projects to be ready to take actions to
upgrade to the latest go-ethereum codebase. More information about the
issue will be disclosed at a later date.

https://twitter.com/go_ethereum/status/1428051458763763721
```

### Patch

```diff
diff --git a/core/vm/instructions.go b/core/vm/instructions.go
index f7ef2f900e..6c8c6e6e6f 100644
--- a/core/vm/instructions.go
+++ b/core/vm/instructions.go
@@ -669,6 +669,7 @@ func opCall(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([]byt
        }
        stack.push(&temp)
        if err == nil || err == ErrExecutionReverted {
+               ret = common.CopyBytes(ret)
                scope.Memory.Set(retOffset.Uint64(), retSize.Uint64(), ret)
        }
        scope.Contract.Gas += returnGas
@@ -703,6 +704,7 @@ func opCallCode(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext) ([
        }
        stack.push(&temp)
        if err == nil || err == ErrExecutionReverted {
+               ret = common.CopyBytes(ret)
                scope.Memory.Set(retOffset.Uint64(), retSize.Uint64(), ret)
        }
        scope.Contract.Gas += returnGas
@@ -730,6 +732,7 @@ func opDelegateCall(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext
        }
        stack.push(&temp)
        if err == nil || err == ErrExecutionReverted {
+               ret = common.CopyBytes(ret)
                scope.Memory.Set(retOffset.Uint64(), retSize.Uint64(), ret)
        }
        scope.Contract.Gas += returnGas
@@ -757,6 +760,7 @@ func opStaticCall(pc *uint64, interpreter *EVMInterpreter, scope *ScopeContext)
        }
        stack.push(&temp)
        if err == nil || err == ErrExecutionReverted {
+               ret = common.CopyBytes(ret)
                scope.Memory.Set(retOffset.Uint64(), retSize.Uint64(), ret)
        }
        scope.Contract.Gas += returnGas
diff --git a/core/vm/interpreter.go b/core/vm/interpreter.go
index 9cf0c4e2c1..9fb83799c9 100644
--- a/core/vm/interpreter.go
+++ b/core/vm/interpreter.go
@@ -262,7 +262,7 @@ func (in *EVMInterpreter) Run(contract *Contract, input []byte, readOnly bool) (
                // if the operation clears the return data (e.g. it has returning data)
                // set the last return to the result of the operation.
                if operation.returns {
-                       in.returnData = common.CopyBytes(res)
+                       in.returnData = res
                }
 
                switch {
```
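
As a rough illustration of why the added `common.CopyBytes(ret)` is sufficient, the sketch below replays the walkthrough from the technical details with plain slices. `copyBytes` and `patchedCall` are hypothetical helpers written for this example; they only mimic the behaviour of the patched opcodes and are not geth code.

```go
package main

import "fmt"

// copyBytes returns a fresh copy of b. It plays the role that
// common.CopyBytes plays in the patch.
func copyBytes(b []byte) []byte {
	if b == nil {
		return nil
	}
	c := make([]byte, len(b))
	copy(c, b)
	return c
}

// patchedCall mimics the patched flow (hypothetical helper): the
// return value is detached from memory before memory is written.
func patchedCall(memory []byte, inOff, outOff, size int) []byte {
	ret := memory[inOff : inOff+size] // what the datacopy precompile hands back
	ret = copyBytes(ret)              // ret no longer aliases memory
	copy(memory[outOff:outOff+size], ret)
	return ret
}

func main() {
	memory := []byte{0, 1, 2, 3, 4}
	returnData := patchedCall(memory, 0, 1, 4)
	fmt.Println("memory:    ", memory)     // [0 0 1 2 3]
	fmt.Println("returndata:", returnData) // [0 1 2 3]
}
```

Because the copy now happens once per call rather than once per returning operation, the interpreter hunk can drop its own `common.CopyBytes(res)`.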

### Statetest to test for the issue

```json
{
  "trigger-issue": {
    "env": {
      "currentCoinbase": "b94f5374fce5edbc8e2a8697c15331677e6ebf0b",
      "currentDifficulty": "0x20000",
      "currentGasLimit": "0x26e1f476fe1e22",
      "currentNumber": "0x1",
      "currentTimestamp": "0x3e8",
      "previousHash": "0x0000000000000000000000000000000000000000000000000000000000000000"
    },
    "pre": {
      "0x00000000000000000000000000000000000000bb": {
        "code": "0x6001600053600260015360036002536004600353600560045360066005536006600260066000600060047f7ef0367e633852132a0ebbf70eb714015dd44bc82e1e55a96ef1389c999c1bcaf13d600060003e596000208055",
        "storage": {},
        "balance": "0x5",
        "nonce": "0x0"
      },
      "0xa94f5374fce5edbc8e2a8697c15331677e6ebf0b": {
        "code": "0x",
        "storage": {},
        "balance": "0xffffffff",
        "nonce": "0x0"
      }
    },
    "transaction": {
      "gasPrice": "0x1",
      "nonce": "0x0",
      "to": "0x00000000000000000000000000000000000000bb",
      "data": [
        "0x"
      ],
      "gasLimit": [
        "0x7a1200"
      ],
      "value": [
        "0x01"
      ],
      "secretKey": "0x45a915e4d060149eb4365960e6a7a45f334393093061116b197e3240065ff2d8"
    },
    "out": "0x",
    "post": {
      "Berlin": [
        {
          "hash": "2a38a040bab1e1fa499253d98b2fd363e5756ecc52db47dd59af7116c068368c",
          "logs": "1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347",
          "indexes": {
            "data": 0,
            "gas": 0,
            "value": 0
          }
        }
      ]
    }
  }
}
```
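
To see what the contract at `0x00000000000000000000000000000000000000bb` does, its `code` field can be decoded instruction by instruction. The disassembler below is a minimal sketch written for this document: it only knows the opcodes that appear in this bytecode, and `disassemble` and `names` are hypothetical names, not part of geth's tooling.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"
)

// names covers only the non-PUSH opcodes that occur in the statetest code.
var names = map[byte]string{
	0x20: "KECCAK256", 0x3d: "RETURNDATASIZE", 0x3e: "RETURNDATACOPY",
	0x53: "MSTORE8", 0x55: "SSTORE", 0x59: "MSIZE", 0x80: "DUP1", 0xf1: "CALL",
}

// disassemble turns hex-encoded EVM bytecode into one string per instruction.
func disassemble(code string) ([]string, error) {
	raw, err := hex.DecodeString(strings.TrimPrefix(code, "0x"))
	if err != nil {
		return nil, err
	}
	var out []string
	for pc := 0; pc < len(raw); pc++ {
		op := raw[pc]
		if op >= 0x60 && op <= 0x7f { // PUSH1..PUSH32 carry 1..32 immediate bytes
			n := int(op-0x60) + 1
			end := pc + 1 + n
			if end > len(raw) {
				end = len(raw) // truncated immediate at end of code
			}
			out = append(out, fmt.Sprintf("PUSH%d 0x%x", n, raw[pc+1:end]))
			pc += n
			continue
		}
		name, ok := names[op]
		if !ok {
			name = fmt.Sprintf("UNKNOWN(0x%02x)", op)
		}
		out = append(out, name)
	}
	return out, nil
}

func main() {
	code := "0x6001600053600260015360036002536004600353600560045360066005536006600260066000600060047f7ef0367e633852132a0ebbf70eb714015dd44bc82e1e55a96ef1389c999c1bcaf13d600060003e596000208055"
	ins, err := disassemble(code)
	if err != nil {
		panic(err)
	}
	for _, line := range ins {
		fmt.Println(line)
	}
}
```

Read this way, the code stores the bytes 1 through 6 at memory offsets 0 through 5, then issues a `CALL` to address 4 (the datacopy precompile) with the input at offset 0 and the output at offset 2, both of size 6. That is the overlapping-but-shifted layout described in the bounty report. It then copies the returndata back into memory at offset 0, hashes the memory and writes the hash to storage, so corrupted returndata ends up changing the state root.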