github.com/decred/dcrlnd@v0.7.6/docs/safety.md

     1  # lnd Operational Safety Guidelines
     2  
     3  ## Table of Contents
     4  
     5  * [Overview](#overview)
     6    - [aezeed](#aezeed)
     7    - [Wallet password](#wallet-password)
     8    - [TLS](#tls)
     9    - [Macaroons](#macaroons)
    10    - [Static Channel Backups (SCBs)](#static-channel-backups-scbs)
    11    - [Static remote keys](#static-remote-keys)
    12  * [Best practices](#best-practices)
    13    - [aezeed storage](#aezeed-storage)
    14    - [File based backups](#file-based-backups)
    15    - [Keeping Static Channel Backups (SCB) safe](#keeping-static-channel-backups-scb-safe)
    16    - [Keep `lnd` updated](#keep-lnd-updated)
    17    - [Zombie channels](#zombie-channels)
    18    - [Migrating a node to a new device](#migrating-a-node-to-a-new-device)
    19    - [Migrating a node from clearnet to Tor](#migrating-a-node-from-clearnet-to-tor)
    20    - [Prevent data corruption](#prevent-data-corruption)
    21    - [Don't interrupt `lncli` commands](#dont-interrupt-lncli-commands)
    22    - [Regular accounting/monitoring](#regular-accountingmonitoring)
    23    - [Pruned bitcoind node](#pruned-bitcoind-node)
    24    - [The `--noseedbackup` flag](#the---noseedbackup-flag)
    25    
    26  ## Overview
    27  
    28  This chapter describes the security/safety mechanisms that are implemented in
    29  `lnd`. We encourage everyone who plans to put mainnet funds into
    30  a Lightning Network channel using `lnd` to read this guide carefully.   
    31  As of this writing, `lnd` is still in beta and it is considered `#reckless` to
    32  put any life-altering amounts of BTC into the network.   
    33  That said, we constantly put in a lot of effort to make `lnd` safer to use and
    34  more secure. We will update this documentation with each safety mechanism that
    35  we implement.
    36  
    37  The first part of this document describes the security elements that are used in
    38  `lnd` and how they work on a high level.   
    39  The second part is a list of best practices that has crystallized from bug
    40  reports, developer recommendations and experiences from a lot of individuals
    41  running mainnet `lnd` nodes during the last 18 months and counting.
    42  
    43  ### aezeed
    44  
    45  This is what all the on-chain private keys are derived from. `aezeed` is similar
    46  to BIP39 as it uses the same word list to encode the seed as a mnemonic phrase.
    47  But this is where the similarities end, because `aezeed` is _not_ compatible
    48  with BIP39. The 24 words of `aezeed` encode 128 bits of entropy (the seed itself),
    49  a wallet birthday (days since the BTC genesis block) and a version.   
    50  This data is _encrypted_ with a password using the AEZ cipher suite (hence the
    51  name). Encrypting the content instead of using the password to derive the HD
    52  extended root key has the advantage that the password can actually be checked
    53  for correctness and can also be changed without affecting any of the derived
    54  keys.  
    55  A BIP for the `aezeed` scheme is being written and should be published soon.
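
The fields mentioned above can be illustrated with a little arithmetic. The
following is a minimal sketch, not the `aezeed` implementation: it assumes the
birthday is counted in whole calendar days from the date of the Bitcoin genesis
block (2009-01-03) and that it is stored in a 16 bit field. The function name
`wallet_birthday_days` is made up for this example.

```python
from datetime import date

# Assumption: the birthday counts whole days since the calendar date of the
# Bitcoin genesis block.
GENESIS_DATE = date(2009, 1, 3)

# Each word of the BIP39 word list encodes 11 bits, so a 24 word mnemonic
# carries 264 bits (33 bytes). Only 128 of those bits are the entropy; the
# rest is used for the other fields and the encryption overhead.
MNEMONIC_BITS = 24 * 11


def wallet_birthday_days(created: date) -> int:
    """Return the number of days between the genesis date and `created`."""
    days = (created - GENESIS_DATE).days
    # Assumption: the birthday has to fit into an unsigned 16 bit field.
    if not 0 <= days < 2**16:
        raise ValueError("birthday does not fit into 16 bits")
    return days
```

A wallet created ten days after the genesis block would get the birthday `10`.
The birthday is what allows a restored wallet to skip scanning blocks that are
older than the seed.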
    56  
    57  Important to know:
    58  * As with any bitcoin seed phrase, never reveal this to any person and store
    59    the 24 words (and the password) in a safe place.
    60  * You should never run two different `lnd` nodes with the same seed! Even if
    61    they aren't running at the same time. This will lead to strange/unpredictable
    62    behavior or even loss of funds. To migrate an `lnd` node to a new device,
    63    please see the [node migration section](#migrating-a-node-to-a-new-device).
    64  * For more technical information [see the aezeed README](../aezeed/README.md).
    65  
    66  ### Wallet password
    67  
    68  The wallet password is one of the first things that has to be entered if a new
    69  wallet is created using `lnd`. It is completely independent from the `aezeed`
    70  cipher seed passphrase (which is optional). The wallet password is used to
    71  encrypt the sensitive parts of `lnd`'s databases, currently some parts of
    72  `wallet.db` and `macaroons.db`. Loss of this password does not necessarily
    73  mean loss of funds, as long as the `aezeed` seed (and its passphrase, if one
    74  was set) is still available. But the node will need to be restored using the
    75  [SCB restore procedure](recovery.md).
    76  
    77  ### TLS
    78  
    79  By default the two API connections `lnd` offers (gRPC on port 10009 and REST on
    80  port 8080) use TLS with a self-signed certificate for transport level security.
    81  Specifying the certificate on the client side (for example `lncli`) is only a
    82  protection against man-in-the-middle attacks and does not provide any
    83  authentication. In fact, `lnd` will never even see the certificate that is
    84  supplied to `lncli` with the `--tlscertpath` argument. `lncli` only uses that
    85  certificate to verify it is talking to the correct gRPC server.   
    86  If the key/certificate pair (`tls.cert` and `tls.key` in the main `lnd` data
    87  directory) is missing on startup, a new self-signed key/certificate pair is
    88  generated. Clients connecting to `lnd` then have to use the new certificate
    89  to verify they are talking to the correct server.
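
One way to notice that a client still holds an outdated copy of `tls.cert` is
to compare certificate fingerprints. This is a minimal sketch using only the
Python standard library. The helper names are made up for this example, and it
assumes both files are PEM encoded.

```python
import hashlib
import ssl


def cert_fingerprint(pem_text: str) -> str:
    """Return the SHA-256 fingerprint (hex) of a PEM encoded certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem_text)
    return hashlib.sha256(der).hexdigest()


def client_copy_is_current(server_pem: str, client_pem: str) -> bool:
    """True if the client's copy matches the certificate the server uses."""
    return cert_fingerprint(server_pem) == cert_fingerprint(client_pem)
```

If the fingerprints differ after `lnd` has regenerated its key/certificate
pair, the new `tls.cert` needs to be copied to the client before it can
verify the server again.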
    90  
    91  ### Macaroons
    92  
    93  Macaroons are used as the main authentication method in `lnd`. A macaroon is a
    94  cryptographically verifiable token, comparable to a [JWT](https://jwt.io/)
    95  or other form of API access token. In `lnd` this token consists of a _list of
    96  permissions_ (what operations does the user of the token have access to) and a
    97  set of _restrictions_ (e.g. token expiration timestamp, IP address restriction).
    98  `lnd` does not keep track of the individual macaroons issued, only the key that
    99  was used to create (and later verify) them. That means, individual tokens cannot
   100  currently be invalidated, only all of them at once.   
   101  See the [high-level macaroons documentation](macaroons.md) or the [technical
   102  README](../macaroons/README.md) for more information.
   103  
   104  Important to know:
   105  * Deleting the `*.macaroon` files in the `<lnddir>/data/chain/bitcoin/mainnet/`
   106    folder will trigger `lnd` to recreate the default macaroons. But this does
   107    **NOT** invalidate clients that use an old macaroon. To make sure all
   108    previously generated macaroons are invalidated, the `macaroons.db` file has to
   109    be deleted along with all `*.macaroon` files.
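
Why all tokens depend on a single key can be shown with a toy model of the
HMAC chaining that macaroons are built on. This is a simplified sketch and
**not** the token format `lnd` uses: the functions `mint` and `verify` and the
permission and caveat strings are made up for this example, and `verify` only
checks the signature, not whether the caveats actually hold.

```python
import hashlib
import hmac
from typing import List


def _mac(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()


def mint(root_key: bytes, permissions: List[str], caveats: List[str]) -> dict:
    """Create a toy token: the signature chains over permissions and caveats."""
    sig = _mac(root_key, ",".join(permissions).encode())
    for caveat in caveats:
        # Each caveat is signed with the previous signature as the key.
        sig = _mac(sig, caveat.encode())
    return {"permissions": permissions, "caveats": caveats, "sig": sig}


def verify(root_key: bytes, token: dict) -> bool:
    """Recompute the chain from the root key and compare the signatures."""
    expected = mint(root_key, token["permissions"], token["caveats"])["sig"]
    return hmac.compare_digest(expected, token["sig"])
```

The server only needs the root key to check any token, so it does not have to
remember which tokens it issued. The flip side is visible in the model as well:
a token minted under an old root key fails `verify` as soon as the root key is
replaced, and so does every other token minted under that key.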
   110  
   111  ### Static Channel Backups (SCBs)
   112  
   113  A Static Channel Backup is a piece of data that contains all _static_
   114  information about a channel, like funding transaction, capacity, key derivation
   115  paths, remote node public key, remote node last known network addresses and
   116  some static settings like CSV timeout and min HTLC setting.   
   117  Such a backup can either be obtained as a file containing entries for multiple
   118  channels or by calling RPC methods to get individual (or all) channel data.
   119  See the section on [keeping SCBs safe](#keeping-static-channel-backups-scb-safe)
   120  for more information.
   121  
   122  What the SCB does **not** contain is the current channel balance (or the
   123  associated commitment transaction). So how can a channel be restored using
   124  SCBs?   
   125  That's the important part: _A channel cannot be restored using SCBs_, but the
   126  funds that are in the channel can be claimed. The restore procedure relies on
   127  the Data Loss Prevention (DLP) protocol which works by connecting to the remote
   128  node and asking them to **force close** the channel and hand over the needed
   129  information to sweep the on-chain funds that belong to the local node.   
   130  Because of this, [restoring a node from SCB](recovery.md) should be seen as an
   131  emergency measure, as all channels will be closed and on-chain fees are incurred
   132  by the party that originally opened the channel.   
   133  To migrate an existing, working node to a new device, SCBs are _not_ the way to
   134  do it. See the section about
   135  [migrating a node](#migrating-a-node-to-a-new-device) on how to do it correctly.
   136  
   137  Important to know:
   138  * [Restoring a node from SCB](recovery.md) will force-close all channels
   139    contained in that file.
   140  * Restoring a node from SCB relies on the remote node of each channel being
   141    online and responding to the DLP protocol. That's why it's important to
   142    [get rid of zombie channels](#zombie-channels) because they cannot be
   143    recovered using SCBs.
   144  * The SCB data is encrypted with a key from the seed the node was created with.
   145    A node can therefore only be restored from SCB if the seed is also known.
   146  
   147  ### Static remote keys
   148  
   149  Since version `v0.8.0-beta`, `lnd` supports the `option_static_remote_key` (also
   150  known as "safu commitments"). All new channels will be opened with this option
   151  enabled by default, if the other node also supports it.  
   152  In essence, this change makes it possible for a node to sweep their channel
   153  funds if the remote node force-closes, without any further communication between
   154  the nodes. Before this change, your node needed to get a random channel
   155  secret (called the `per_commit_point`) from the remote node even if they
   156  force-closed the channel, which could make recovery very difficult.
   157  
   158  ## Best practices
   159  
   160  ### aezeed storage
   161  
   162  When creating a new wallet, `lnd` will print out 24 words to write down, which
   163  is the wallet's seed (in the [aezeed](#aezeed) format). That seed is optionally
   164  encrypted with a passphrase, also called the _cipher seed passphrase_.   
   165  It is essential to write down both the seed and, if set, the passphrase, and
   166  to store them in a safe place, as **there is no way of exporting the seed from an
   167  lnd wallet**. When creating the wallet, after printing the seed to the command
   168  line, it is hashed and only the hash (or to be more exact, the BIP32 extended
   169  root key) is stored in the `wallet.db` file.   
   170  There is
   171  [a tool being worked on](https://github.com/lightningnetwork/lnd/pull/2373)
   172  that can extract the BIP32 extended root key but currently you cannot restore
   173  lnd with only this root key.
   174  
   175  Important to know:
   176  * Setting a password/passphrase for the aezeed is meant to protect it from
   177    an attacker that finds the paper/storage device. Writing down the password
   178    alongside the 24 seed words does not enhance the security in any way.
   179    Therefore the password should be stored in a separate place.
   180  
   181  ### File based backups
   182  
   183  There is a lot of confusion and also some myths about how best to back up the
   184  off-chain funds of an `lnd` node. Making a mistake here is also still the single
   185  biggest risk of losing off-chain funds, even though we do everything to mitigate
   186  those risks.
   187  
   188  **What files can/should I regularly back up?**   
   189  The single most important file that needs to be backed up whenever it changes
   190  is the `<lnddir>/data/chain/bitcoin/mainnet/channel.backup` file which holds
   191  the Static Channel Backups (SCBs). This file is only updated every time `lnd`
   192  starts, a channel is opened or a channel is closed.
   193  
   194  Most consumer Lightning wallet apps upload the file to the cloud automatically.
   195  
   196  See the [SCB chapter](#static-channel-backups-scbs) for more
   197  information on how to use the file to restore channels.
   198  
   199  **What files should never be backed up to avoid problems?**   
   200  This is a bit of a trick question, as making the backup is not the problem.
   201  Restoring/using an old version of a specific file called
   202  `<lnddir>/data/graph/mainnet/channel.db` is what is very risky and should
   203  _never_ be done!   
   204  This requires some explanation:    
   205  The way LN channels are currently set up (until `eltoo` is implemented) is that
   206  both parties agree on a current balance. To make sure neither of the two peers
   207  in a channel ever tries to publish an old state of that balance, each hands the
   208  other peer the revocation keys for every outdated state. Those keys give the
   209  other peer the means to take _all_ funds (not just their agreed upon part) from
   210  the channel if an _old_ state is ever published. Therefore, restoring an old
   211  state of a channel basically means forfeiting the balance to the other party.   
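
The consequence of publishing an old state can be made concrete with a toy
model. This sketch is purely illustrative: it ignores fees and time locks,
assumes the other peer is online and reacts in time, and the function `settle`
is made up for this example.

```python
from typing import List, Tuple


def settle(capacity: int, states: List[Tuple[int, int]],
           published_index: int) -> Tuple[int, int]:
    """Return (publisher_payout, counterparty_payout) for a unilateral close.

    `states` lists (publisher_balance, counterparty_balance) pairs, oldest
    first. Only the last entry is the current, non-revoked state.
    """
    latest = len(states) - 1
    if published_index == latest:
        # Current state: both sides get their agreed upon balance.
        return states[latest]
    # Revoked state: the counterparty can claim the whole channel.
    return (0, capacity)
```

With `states = [(100, 0), (60, 40)]`, publishing index `1` pays out the agreed
upon balances, while publishing index `0` (for example after restoring an old
`channel.db`) leaves the publisher with nothing.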
   212  
   213  As payments in `lnd` can be made multiple times a second, it's very hard to
   214  make a backup of the channel database every time it is updated. And even if it
   215  can be technically done, the confidence that a particular state is certainly the
   216  most up-to-date can never be very high. That's why the focus should be on
   217  [making sure the channel database is not corrupted](#prevent-data-corruption),
   218  [closing out the zombie channels](#zombie-channels) and keeping your SCBs safe.
   219  
   220  ### Keeping Static Channel Backups (SCB) safe
   221  
   222  As mentioned in the previous chapter, there is a file where `lnd` stores and
   223  updates a backup of all channels whenever the node is restarted, a new channel
   224  is opened or a channel is closed:
   225  `<lnddir>/data/chain/bitcoin/mainnet/channel.backup`
   226  
   227  One straightforward way of backing that file up is to create a file watcher and
   228  react whenever the file is changed. Here is an example script that
   229  [automatically makes a copy of the file whenever it changes](https://gist.github.com/alexbosworth/2c5e185aedbdac45a03655b709e255a3).
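
For setups without a file watcher, the same idea can be approximated by
polling. The following is a minimal sketch and not the linked script: the
function `backup_if_changed`, the naming scheme of the copies and the polling
approach are assumptions made for this example.

```python
import hashlib
import time
from pathlib import Path
from typing import Optional


def backup_if_changed(src: Path, dest_dir: Path,
                      last_digest: Optional[str]) -> str:
    """Copy `src` into `dest_dir` if its content differs from `last_digest`.

    Returns the SHA-256 digest of the current content so the caller can pass
    it back in on the next call.
    """
    data = src.read_bytes()
    digest = hashlib.sha256(data).hexdigest()
    if digest != last_digest:
        dest_dir.mkdir(parents=True, exist_ok=True)
        # Timestamped name: older copies are kept instead of overwritten.
        target = dest_dir / f"channel-{time.time_ns()}-{digest[:8]}.backup"
        tmp = target.with_suffix(".tmp")
        tmp.write_bytes(data)
        tmp.replace(target)  # rename, so a partial copy never looks complete
    return digest
```

A caller would keep the returned digest and invoke the function again every
few seconds, for example in a loop with `time.sleep`. The destination
directory should live on a different device than the node's data directory.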
   230  
   231  Other ways of obtaining SCBs for a node's channels are
   232  [described in the recovery documentation](recovery.md#obtaining-scbs).
   233  
   234  Because the backup file is encrypted with a key from the seed the node was
   235  created with, it can safely be stored in cloud storage or on any other storage
   236  medium. Many consumer-focused smartphone wallet apps automatically store a
   237  backup file to the cloud, if the phone is set up to allow it.
   238  
   239  ### Keep `lnd` updated
   240  
   241  With every larger update of `lnd`, new security features are added. Users are
   242  always encouraged to update their nodes as soon as possible. This also helps the
   243  network in general as new safety features that require compatibility among nodes
   244  can be used sooner.
   245  
   246  ### Zombie channels
   247  
   248  Zombie channels are channels that are most likely dead but are still around.
   249  This can happen if one of the channel peers has gone offline for good (possibly
   250  due to a failure of some sort) and didn't close its channels. The other, still
   251  online node doesn't necessarily know that its partner will never come back
   252  online.
   253  
   254  Funds that are in such channels are at great risk, as is described quite
   255  dramatically in
   256  [this article](https://medium.com/@gcomxx/get-rid-of-those-zombie-channels-1267d5a2a708?).
   258  
   259  The TL;DR of the article is that if you have funds in a zombie channel and you
   260  need to recover your node after a failure, SCBs won't be able to recover those
   261  funds, because SCB restore
   262  [relies on the remote node cooperating](#static-channel-backups-scbs).
   263  
   264  That's why it's important to **close channels with peers that have been
   265  offline** for an extended period of time as a precautionary measure.
   266  
   267  Of course this might not be good advice for a routing node operator that wants
   268  to support mobile users and route for them. Nodes running on a mobile device
   269  tend to be offline for long periods of time. It would be bad for those users if
   270  they needed to open a new channel every time they want to use the wallet.
   271  Most mobile wallets only open private channels as they do not intend to route
   272  payments through them. A routing node operator should therefore take into
   273  account if a channel is public or private when thinking about closing it.
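
The trade-off described above can be written down as a simple selection rule.
This is a sketch under loud assumptions: `ChannelRecord` is a hypothetical
record that the operator maintains (it is not an `lnd` data structure), and
both thresholds are arbitrary example values, not recommendations.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChannelRecord:
    """Hypothetical bookkeeping entry kept by the node operator."""
    chan_point: str
    private: bool
    days_peer_offline: int


def close_candidates(channels: List[ChannelRecord],
                     max_offline_days: int = 30,
                     private_grace_days: int = 180) -> List[str]:
    """Return the channel points whose peer has been offline for too long.

    Private channels (typically mobile wallets) get a longer grace period.
    """
    candidates = []
    for ch in channels:
        limit = private_grace_days if ch.private else max_offline_days
        if ch.days_peer_offline > limit:
            candidates.append(ch.chan_point)
    return candidates
```

The result is only a list of candidates to review manually. Whether a channel
is really a zombie remains a judgment call for the operator.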
   274  
   275  ### Migrating a node to a new device
   276  
   277  As mentioned in the chapters [aezeed](#aezeed) and
   278  [SCB](#static-channel-backups-scbs) you should never use the same seed on two
   279  different nodes and restoring from SCB is not a migration but an emergency
   280  procedure.   
   281  What is the correct way to migrate an existing node to a new device? There is
   282  an easy way that should work for most people and there's the harder/costlier
   283  fallback way to do it.
   284  
   285  **Option 1: Move the whole data directory to the new device**   
   286  This option works very well if the new device runs the same operating system on
   287  the same (or at least very similar) architecture. If that is the case, the whole
   288  `/home/<user>/.lnd` directory in Linux (or
   289  `$HOME/Library/Application Support/lnd` in MacOS, `%LOCALAPPDATA%\lnd` in
   290  Windows) can be moved to the new device and `lnd` started there. It is important
   291  to shut down `lnd` on the old device before moving the directory!   
   292  **Not supported/untested** is moving the data directory between different
   293  operating systems (for example `MacOS` <-> `Linux` or `Windows` <-> `Linux`) or
   294  different system architectures (for example `ARM` -> `amd64`). Data
   295  corruption or unexpected behavior can be the result. Users switching between
   296  operating systems or architectures should always use Option 2!
   297  
   298  Migrating between 32bit and 64bit of the same architecture (e.g. `ARM32` -> 
   299  `ARM64`) is known to be safe. To avoid issues with the main channel database
   300  (`channel.db`) becoming too large for 32bit systems, it is in fact recommended
   301  for Raspberry Pi users (for example RaspiBlitz or myNode) to migrate to the
   302  latest version that supports running 64bit `lnd`.
   303  
   304  **Option 2: Start from scratch**   
   305  If option 1 does not work or is too risky, the safest course of action is to
   306  initialize the existing node again from scratch. Unfortunately this incurs some
   307  on-chain fee costs as all channels will need to be closed. Using the same seed
   308  means restoring the same network node identity as before. If a new identity
   309  should be created, a new seed needs to be created.   
   310  Follow these steps to create the **same node (with the same seed)** from
   311  scratch:   
   312  1. On the old device, close all channels (`lncli closeallchannels`). The
   313     command can take up to several minutes depending on the number of channels.
   314     **Do not interrupt the command!**
   315  1. Wait for all channels to be fully closed. If some nodes don't respond to the
   316     close request it can be that `lnd` will go ahead and force close those
   317     channels. This means that the local balance will be time locked for up to
   318     two weeks (depending on the channel size). Check `lncli pendingchannels` to
   319     see if any channels are still in the process of being force closed.
   320  1. After all channels are fully closed (and `lncli pendingchannels` lists zero
   321     channels), `lnd` can be shut down on the old device.
   322  1. Start `lnd` on the new device and create a new wallet with the existing seed
   323     that was used on the old device (answer "yes" when asked if an existing seed
   324     should be used).
   325  1. Wait for the wallet to rescan the blockchain. This can take up to several
   326     hours depending on the age of the seed and the speed of the chain backend.
   327  1. After the chain is fully synced (`lncli getinfo` shows
   328     `"synced_to_chain": true`) the on-chain funds from the previous device should
   329     now be visible on the new device as well and new channels can be opened.
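
The two checks in the steps above (no pending channels left, chain fully
synced) can be scripted against the JSON that `lncli` prints. This is a
minimal sketch: the helper names are made up, and instead of relying on exact
field names it assumes that every top-level list in the `pendingchannels`
output is a list of pending channels.

```python
import json


def count_pending(pending_json: str) -> int:
    """Count the entries of every top-level list in the given JSON document."""
    doc = json.loads(pending_json)
    return sum(len(value) for value in doc.values() if isinstance(value, list))


def ready_for_shutdown(pending_json: str) -> bool:
    """True if the output of `lncli pendingchannels` lists zero channels."""
    return count_pending(pending_json) == 0


def chain_synced(getinfo_json: str) -> bool:
    """True if the output of `lncli getinfo` shows "synced_to_chain": true."""
    return json.loads(getinfo_json).get("synced_to_chain") is True
```

The functions take the JSON text as a string, so the output of the two
commands has to be captured first. Only when `ready_for_shutdown` returns
`True` on the old device is it safe to continue with the shutdown step.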
   330  
   331  **What to do after the move**   
   332  If things don't work as expected on the moved or re-created node, consider this
   333  list of things that possibly need to be changed to work on a new device:
   334  * In case the new device has a different hostname and TLS connection problems
   335    occur, delete the `tls.key` and `tls.cert` files in the data directory and
   336    restart `lnd` to recreate them.
   337  * If an external IP is set (either with `--externalip` or `--tlsextraip`) these
   338    might need to be changed if the new machine has a different address. Changing
   339    the `--tlsextraip` setting also means regenerating the certificate pair. See
   340    the first item in this list.
   341  * If port `9735` (or `10009` for gRPC) was forwarded on the router, these
   342    forwarded ports need to point to the new device. The same applies to firewall
   343    rules.
   344  * It might take more than 24 hours for a new IP address to be visible on
   345    network explorers.
   346  * If channels show as offline after several hours, try to manually connect to
   347    the remote peer. They might still try to reach `lnd` on the old address.
   348  
   349  ### Migrating a node from clearnet to Tor
   350  
   351  If an `lnd` node has already been connected to the internet with an IPv4 or IPv6
   352  (clearnet) address and has any non-private channels, this connection between
   353  channels and IP address is known to the network and cannot be deleted.   
   354  A node that is started over Tor with the same identity and channels is trivial
   355  to link back to any previously used clearnet IP address and therefore does not
   356  gain any privacy benefits.   
   357  The following steps are recommended to cut all links between the old clearnet
   358  node and the new Tor node:
   359  1. Close all channels on the old node and wait for them to fully close.
   360  1. Send all on-chain funds of the old node through a CoinJoin service (like
   361     Wasabi or Samourai/Whirlpool) until a sufficiently high anonymity set is
   362     reached.
   363  1. Create a new `lnd` node with a **new seed** that is only connected to Tor
   364     and generate an on-chain address on the new node.
   365  1. Send the mixed/coinjoined coins to the address of the new node.
   366  1. Start opening channels.
   367  1. Check an online network explorer that no IPv4 or IPv6 address is associated
   368     with the new node's identity.
   369  
   370  ### Prevent data corruption
   371  
   372  Many problems while running an `lnd` node can be prevented by avoiding data
   373  corruption in the channel database (`<lnddir>/data/graph/mainnet/channel.db`).
   374  
   375  The following (non-exhaustive) list of things can lead to data corruption:
   376  * A spinning hard drive gets a physical shock.
   377  * `lnd`'s main data directory being written on an SD card or USB thumb drive
   378    (SD cards and USB thumb drives _must_ be considered unsafe for critical files
   379    that are written to very often, as the channel DB is).
   380  * `lnd`'s main data directory being written to a network drive without
   381    `fsync` support.
   382  * Unclean shutdown of `lnd`.
   383  * Aborting channel operation commands (see next chapter).
   384  * Not enough disk space for a growing channel DB file.
   385  * Moving `lnd`'s main data directory between different operating systems/
   386    architectures.
   387  
   388  To avoid most of these factors, it is recommended to store `lnd`'s main data
   389  directory on a Solid State Drive (SSD) from a reliable manufacturer.
   390  An alternative or extension to that is to use a replicated disk setup. Making
   391  sure a power failure does not interrupt the node by running a UPS
   392  (uninterruptible power supply) might also make sense depending on the reliability
   393  of the local power grid and the amount of funds at stake.
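
The "not enough disk space" item from the list above is easy to monitor. This
is a small sketch using the Python standard library. The safety factor of `2`
is an arbitrary assumption for this example, not a documented requirement, and
the function names are made up.

```python
import shutil
from pathlib import Path


def has_headroom(free_bytes: int, db_bytes: int, factor: float = 2.0) -> bool:
    """True if the free space is at least `factor` times the database size."""
    return free_bytes >= factor * db_bytes


def check_disk(db_path: Path, factor: float = 2.0) -> bool:
    """Compare the free space of the volume holding `db_path` to its size."""
    free = shutil.disk_usage(db_path.parent).free
    return has_headroom(free, db_path.stat().st_size, factor)
```

`check_disk` would be called with the path to `channel.db`, for example from a
periodic monitoring job that alerts the operator when it returns `False`.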
   394  
   395  ### Don't interrupt `lncli` commands
   396  
   397  Things can start to take a while to execute if a node has more than 50 to 100
   398  channels. It is extremely important to **never interrupt an `lncli` command**
   399  if it is manipulating the channel database, which is true for the following
   400  commands:
   401   - `openchannel`
   402   - `closechannel` and `closeallchannels`
   403   - `abandonchannel`
   404   - `updatechanpolicy`
   405   - `restorechanbackup`
   406  
   407  Interrupting any of those commands can lead to an inconsistent state of the
   408  channel database and unpredictable behavior. If it is uncertain if a command
   409  is really stuck or if the node is still working on it, a look at the log file
   410  can help to get an idea.
   411  
   412  ### Regular accounting/monitoring
   413  
   414  Regular monitoring of a node and keeping track of the movement of funds can help
   415  prevent problems. Tools like [`lndmon`](https://github.com/lightninglabs/lndmon)
   416  can assist with these tasks.
   417  
   418  ### Pruned bitcoind node
   419  
   420  Running `lnd` connected to a `bitcoind` node that is running in prune mode is
   421  not supported! `lnd` needs to verify the funding transaction of every channel
   422  in the network and be able to retrieve that information from `bitcoind` which
   423  it cannot deliver when that information is pruned away.
   424  
   425  In theory pruning away all blocks _before_ the SegWit activation would work
   426  as LN channels rely on SegWit. But this has neither been tested nor would it
   427  be recommended/supported.
   428  
   429  In addition to not running a pruned node, it is recommended to run `bitcoind`
   430  with the `-txindex` flag for performance reasons, though this is not strictly
   431  required.
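
Both settings can be checked in `bitcoin.conf` before connecting `lnd`. The
following is a minimal sketch: the function `check_bitcoind_conf` is made up
for this example, it only looks at the configuration file text (command line
flags are not seen), and it treats a missing `prune` entry as `prune=0`.

```python
from typing import Dict, List


def check_bitcoind_conf(conf_text: str) -> Dict[str, List[str]]:
    """Report settings in a bitcoin.conf text that conflict with this guide."""
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()

    report = {"errors": [], "warnings": []}
    # Any value other than 0 turns on some form of pruning.
    if settings.get("prune", "0") != "0":
        report["errors"].append("pruning is enabled")
    # txindex is recommended for performance but not strictly required.
    if settings.get("txindex", "0") != "1":
        report["warnings"].append("txindex is not enabled")
    return report
```

A configuration containing `prune=550` would be reported as an error, while a
configuration that merely lacks `txindex=1` only produces a warning.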
   432  
   433  Multiple `lnd` nodes can run off of a single `bitcoind` instance. There will be
   434  connection/thread/performance limits at some number of `lnd` nodes but in
   435  practice running 2 or 3 `lnd` instances per `bitcoind` node has not shown any
   436  problems.
   437  
   438  ### The `--noseedbackup` flag
   439  
   440  This is a flag that is only used for integration tests and should **never** be
   441  used on mainnet! Turning this flag on means that the 24 word seed will not be
   442  shown when creating a wallet. The seed is required to restore a node in case
   443  of data corruption and without it all funds (on-chain and off-chain) are
   444  put at risk.