# lnd Operational Safety Guidelines

## Table of Contents

* [Overview](#overview)
  - [aezeed](#aezeed)
  - [Wallet password](#wallet-password)
  - [TLS](#tls)
  - [Macaroons](#macaroons)
  - [Static Channel Backups (SCBs)](#static-channel-backups-scbs)
  - [Static remote keys](#static-remote-keys)
* [Best practices](#best-practices)
  - [aezeed storage](#aezeed-storage)
  - [File based backups](#file-based-backups)
  - [Keeping Static Channel Backups (SCBs) safe](#keeping-static-channel-backups-scb-safe)
  - [Keep `lnd` updated](#keep-lnd-updated)
  - [Zombie channels](#zombie-channels)
  - [Migrating a node to a new device](#migrating-a-node-to-a-new-device)
  - [Migrating a node from clearnet to Tor](#migrating-a-node-from-clearnet-to-tor)
  - [Prevent data corruption](#prevent-data-corruption)
  - [Don't interrupt `lncli` commands](#dont-interrupt-lncli-commands)
  - [Regular accounting/monitoring](#regular-accountingmonitoring)
  - [Pruned bitcoind node](#pruned-bitcoind-node)
  - [The `--noseedbackup` flag](#the---noseedbackup-flag)

## Overview

This chapter describes the security/safety mechanisms that are implemented in
`lnd`. We encourage every person that is planning on putting mainnet funds into
a Lightning Network channel using `lnd` to read this guide carefully.
As of this writing, `lnd` is still in beta and it is considered `#reckless` to
put any life altering amounts of BTC into the network.
That said, we constantly put in a lot of effort to make `lnd` safer to use and
more secure. We will update this documentation with each safety mechanism that
we implement.

The first part of this document describes the security elements that are used in
`lnd` and how they work on a high level.
The second part is a list of best practices that has crystallized from bug
reports, developer recommendations and experiences from a lot of individuals
running mainnet `lnd` nodes during the last 18 months and counting.

### aezeed

This is what all the on-chain private keys are derived from. `aezeed` is similar
to BIP39 as it uses the same word list to encode the seed as a mnemonic phrase.
But this is where the similarities end, because `aezeed` is _not_ compatible
with BIP39. The 24 words of `aezeed` encode 128 bits of entropy (the seed itself),
a wallet birthday (days since BTC genesis block) and a version.
This data is _encrypted_ with a password using the AEZ cipher suite (hence the
name). Encrypting the content instead of using the password to derive the HD
extended root key has the advantage that the password can actually be checked
for correctness and can also be changed without affecting any of the derived
keys.
A BIP for the `aezeed` scheme is being written and should be published soon.

Important to know:
* As with any bitcoin seed phrase, never reveal this to any person and store
  the 24 words (and the password) in a safe place.
* You should never run two different `lnd` nodes with the same seed, even if
  they aren't running at the same time. This will lead to strange/unpredictable
  behavior or even loss of funds. To migrate an `lnd` node to a new device,
  please see the [node migration section](#migrating-a-node-to-a-new-device).
* For more technical information [see the aezeed README](../aezeed/README.md).

### Wallet password

The wallet password is one of the first things that has to be entered if a new
wallet is created using `lnd`. It is completely independent from the `aezeed`
cipher seed passphrase (which is optional). The wallet password is used to
encrypt the sensitive parts of `lnd`'s databases, currently some parts of
`wallet.db` and `macaroons.db`.
Loss of this password does not necessarily
mean loss of funds, as long as the `aezeed` passphrase is still available.
But the node will need to be restored using the
[SCB restore procedure](recovery.md).

### TLS

By default the two API connections `lnd` offers (gRPC on port 10009 and REST on
port 8080) use TLS with a self-signed certificate for transport level security.
Specifying the certificate on the client side (for example `lncli`) is only a
protection against man-in-the-middle attacks and does not provide any
authentication. In fact, `lnd` will never even see the certificate that is
supplied to `lncli` with the `--tlscertpath` argument. `lncli` only uses that
certificate to verify it is talking to the correct gRPC server.
If the key/certificate pair (`tls.cert` and `tls.key` in the main `lnd` data
directory) is missing on startup, a new self-signed key/certificate pair is
generated. Clients connecting to `lnd` then have to use the new certificate
to verify they are talking to the correct server.

### Macaroons

Macaroons are used as the main authentication method in `lnd`. A macaroon is a
cryptographically verifiable token, comparable to a [JWT](https://jwt.io/)
or other form of API access token. In `lnd` this token consists of a _list of
permissions_ (what operations does the user of the token have access to) and a
set of _restrictions_ (e.g. token expiration timestamp, IP address restriction).
`lnd` does not keep track of the individual macaroons issued, only the key that
was used to create (and later verify) them. That means, individual tokens cannot
currently be invalidated, only all of them at once.
See the [high-level macaroons documentation](macaroons.md) or the [technical
README](../macaroons/README.md) for more information.
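
The reason only the key needs to be stored can be illustrated with a small
sketch of the general macaroon idea: the token's signature is an HMAC chain
over its contents, so the server can re-derive it from the root key alone.
This is a simplified illustration, not `lnd`'s actual macaroon format or
serialization. The function names, the fixed token identifier and the
permission/restriction strings are all made up for this example.

```python
import hashlib
import hmac

# Hypothetical fixed identifier; real macaroons carry their own identifier.
TOKEN_ID = b"example-token-id"


def _chain(key: bytes, message: bytes) -> bytes:
    """One HMAC-SHA256 step of the signature chain."""
    return hmac.new(key, message, hashlib.sha256).digest()


def mint(root_key: bytes, permissions: list[str], restrictions: list[str]) -> dict:
    """Create a token whose signature covers every permission and restriction."""
    sig = _chain(root_key, TOKEN_ID)
    for item in permissions + restrictions:
        sig = _chain(sig, item.encode())
    return {"permissions": permissions, "restrictions": restrictions, "sig": sig}


def verify(root_key: bytes, token: dict) -> bool:
    """Recompute the chain from the root key; no per-token record is needed."""
    expected = _chain(root_key, TOKEN_ID)
    for item in token["permissions"] + token["restrictions"]:
        expected = _chain(expected, item.encode())
    return hmac.compare_digest(expected, token["sig"])
```

Because `verify` depends only on the root key, every token minted with that
key stays valid until the key itself is replaced, which matches the
"all of them at once" behavior described above.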

Important to know:
* Deleting the `*.macaroon` files in the `<lnd-dir>/data/chain/bitcoin/mainnet/`
  folder will trigger `lnd` to recreate the default macaroons. But this does
  **NOT** invalidate clients that use an old macaroon. To make sure all
  previously generated macaroons are invalidated, the `macaroons.db` has to be
  deleted as well as all `*.macaroon` files.

### Static Channel Backups (SCBs)

A Static Channel Backup is a piece of data that contains all _static_
information about a channel, like funding transaction, capacity, key derivation
paths, remote node public key, remote node last known network addresses and
some static settings like CSV timeout and min HTLC setting.
Such a backup can either be obtained as a file containing entries for multiple
channels or by calling RPC methods to get individual (or all) channel data.
See the section on [keeping SCBs safe](#keeping-static-channel-backups-scb-safe)
for more information.

What the SCB does **not** contain is the current channel balance (or the
associated commitment transaction). So how can a channel be restored using
SCBs?
That's the important part: _A channel cannot be restored using SCBs_, but the
funds that are in the channel can be claimed. The restore procedure relies on
the Data Loss Prevention (DLP) protocol which works by connecting to the remote
node and asking them to **force close** the channel and hand over the needed
information to sweep the on-chain funds that belong to the local node.
Because of this, [restoring a node from SCB](recovery.md) should be seen as an
emergency measure as all channels will be closed and on-chain fees are incurred
by the party that opened the channel initially.
To migrate an existing, working node to a new device, SCBs are _not_ the way to
do it. See the section about
[migrating a node](#migrating-a-node-to-a-new-device) on how to do it correctly.
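
To make the split between static and dynamic channel data more tangible, here
is a rough sketch of the kind of record an SCB entry corresponds to, based only
on the fields listed above. The class name, field names and units are
hypothetical and do not mirror `lnd`'s real backup encoding.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class StaticChannelEntry:
    """Illustrative only: static facts that never change during a channel's life."""

    funding_outpoint: str          # funding transaction reference
    capacity: int                  # total channel capacity (unit assumed)
    key_derivation_paths: tuple    # where the channel keys live in the HD wallet
    remote_node_pubkey: str        # identity of the channel peer
    remote_node_addresses: tuple   # last known network addresses of the peer
    csv_timeout: int               # CSV timeout setting
    min_htlc: int                  # minimum HTLC setting


def describes_balance() -> bool:
    """Return True if any field would carry balance or commitment state."""
    return any(
        "balance" in f.name or "commitment" in f.name
        for f in fields(StaticChannelEntry)
    )
```

Nothing in such a record says who currently owns which share of the channel,
which is why a restore has to ask the remote node to force close instead of
resuming the channel.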

Important to know:
* [Restoring a node from SCB](recovery.md) will force-close all channels
  contained in that file.
* Restoring a node from SCB relies on the remote node of each channel to be
  online and respond to the DLP protocol. That's why it's important to
  [get rid of zombie channels](#zombie-channels) because they cannot be
  recovered using SCBs.
* The SCB data is encrypted with a key from the seed the node was created with.
  A node can therefore only be restored from SCB if the seed is also known.

### Static remote keys

Since version `v0.8.0-beta`, `lnd` supports the `option_static_remote_key` (also
known as "safu commitments"). All new channels will be opened with this option
enabled by default, if the other node also supports it.
In essence, this change makes it possible for a node to sweep their channel
funds if the remote node force-closes, without any further communication between
the nodes. Before this change, your node needed to get a random channel
secret (called the `per_commit_point`) from the remote node even if they
force-closed the channel, which could make recovery very difficult.

## Best practices

### aezeed storage

When creating a new wallet, `lnd` will print out 24 words to write down, which
is the wallet's seed (in the [aezeed](#aezeed) format). That seed is optionally
encrypted with a passphrase, also called the _cipher seed passphrase_.
It is absolutely important to write both the seed and, if set, the password down
and store them in a safe place as **there is no way of exporting the seed from an
lnd wallet**. When creating the wallet, after printing the seed to the command
line, it is hashed and only the hash (or to be more exact, the BIP32 extended
root key) is stored in the `wallet.db` file.
There is
[a tool being worked on](https://github.com/lightningnetwork/lnd/pull/2373)
that can extract the BIP32 extended root key but currently you cannot restore
lnd with only this root key.

Important to know:
* Setting a password/passphrase for the aezeed is meant to protect it from
  an attacker that finds the paper/storage device. Writing down the password
  alongside the 24 seed words does not enhance the security in any way.
  Therefore the password should be stored in a separate place.

### File based backups

There is a lot of confusion and also some myths about how to best back up the
off-chain funds of an `lnd` node. Making a mistake here is also still the single
biggest risk of losing off-chain funds, even though we do everything to mitigate
those risks.

**What files can/should I regularly back up?**
The single most important file that needs to be backed up whenever it changes
is the `<lnddir>/data/chain/bitcoin/mainnet/channel.backup` file which holds
the Static Channel Backups (SCBs). This file is only updated every time `lnd`
starts, a channel is opened or a channel is closed.

Most consumer Lightning wallet apps upload the file to the cloud automatically.

See the [SCB chapter](#static-channel-backups-scbs) for more
information on how to use the file to restore channels.

**What files should never be backed up to avoid problems?**
This is a bit of a trick question, as making the backup is not the problem.
Restoring/using an old version of a specific file called
`<lnddir>/data/graph/mainnet/channel.db` is what is very risky and should
_never_ be done!
This requires some explanation:
The way LN channels are currently set up (until `eltoo` is implemented) is that
both parties agree on a current balance.
To make sure neither of the two peers in
a channel ever tries to publish an old state of that balance, they each hand
the other peer keys that give it the means to take _all_ funds (not
just its agreed upon part) from a channel, if an _old_ state is ever
published. Therefore, having an old state of a channel basically means
forfeiting the balance to the other party.

As payments in `lnd` can be made multiple times a second, it's very hard to
make a backup of the channel database every time it is updated. And even if it
can be technically done, the confidence that a particular state is certainly the
most up-to-date can never be very high. That's why the focus should be on
[making sure the channel database is not corrupted](#prevent-data-corruption),
[closing out the zombie channels](#zombie-channels) and keeping your SCBs safe.

### Keeping Static Channel Backups (SCB) safe

As mentioned in the previous chapter, there is a file where `lnd` stores and
updates a backup of all channels whenever the node is restarted, a new channel
is opened or a channel is closed:
`<lnddir>/data/chain/bitcoin/mainnet/channel.backup`

One straight-forward way of backing that file up is to create a file watcher and
react whenever the file is changed. Here is an example script that
[automatically makes a copy of the file whenever it changes](https://gist.github.com/alexbosworth/2c5e185aedbdac45a03655b709e255a3).

Other ways of obtaining SCBs for a node's channels are
[described in the recovery documentation](recovery.md#obtaining-scbs).

Because the backup file is encrypted with a key from the seed the node was
created with, it can safely be stored on a cloud storage or any other storage
medium. Many consumer focused wallet smartphone apps automatically store a
backup file to the cloud, if the phone is set up to allow it.
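
As a minimal sketch of the file watcher idea, the following polls the backup
file's modification time and keeps a timestamped copy whenever it changes. It
is not the linked script. The function names, the copy naming scheme and the
polling interval are assumptions made for this example, and the destination
directory is expected to be one that is synced elsewhere (for example a cloud
folder).

```python
import shutil
import time
from pathlib import Path


def backup_if_changed(source: Path, dest_dir: Path, last_mtime_ns: int) -> int:
    """Copy `source` into `dest_dir` if its mtime differs from `last_mtime_ns`.

    Returns the modification time that was observed, so the caller can pass
    it back in on the next check. Raises FileNotFoundError if `source` does
    not exist yet.
    """
    mtime_ns = source.stat().st_mtime_ns
    if mtime_ns != last_mtime_ns:
        dest_dir.mkdir(parents=True, exist_ok=True)
        # Keep every version; older copies can be pruned separately.
        target = dest_dir / f"channel-{mtime_ns}.backup"
        shutil.copy2(source, target)
    return mtime_ns


def watch(source: Path, dest_dir: Path, interval_s: float = 5.0) -> None:
    """Poll forever, copying the backup file whenever it changes."""
    last = 0
    while True:
        last = backup_if_changed(source, dest_dir, last)
        time.sleep(interval_s)
```

A call such as `watch(Path("<lnddir>/data/chain/bitcoin/mainnet/channel.backup"), Path("/some/synced/folder"))`
would be run as a background service, with both paths adjusted to the actual
setup. Polling can skip intermediate versions, which is acceptable here since
only the most recent backup matters.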

### Keep `lnd` updated

With every larger update of `lnd`, new security features are added. Users are
always encouraged to update their nodes as soon as possible. This also helps the
network in general as new safety features that require compatibility among nodes
can be used sooner.

### Zombie channels

Zombie channels are channels that are most likely dead but are still around.
This can happen if one of the channel peers has gone offline for good (possibly
due to a failure of some sort) and didn't close its channels. The other, still
online node doesn't necessarily know that its partner will never come back
online.

Funds that are in such channels are at great risk, as is described quite
dramatically in
[this article](https://medium.com/@gcomxx/get-rid-of-those-zombie-channels-1267d5a2a708?).

The TL;DR of the article is that if you have funds in a zombie channel and you
need to recover your node after a failure, SCBs won't be able to recover those
funds, because SCB restore
[relies on the remote node cooperating](#static-channel-backups-scbs).

That's why it's important to **close channels with peers that have been
offline** for an extended length of time as a precautionary measure.

Of course this might not be good advice for a routing node operator that wants
to support mobile users and route for them. Nodes running on a mobile device
tend to be offline for long periods of time. It would be bad for those users if
they needed to open a new channel every time they want to use the wallet.
Most mobile wallets only open private channels as they do not intend to route
payments through them. A routing node operator should therefore take into
account if a channel is public or private when thinking about closing it.
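
As a starting point for spotting candidates, the sketch below groups currently
inactive channels by public/private from the JSON printed by
`lncli listchannels`. It assumes the output has a top-level `channels` list
whose entries carry `active`, `private` and `channel_point` fields. The helper
name is made up. Note that an inactive channel only means the peer is offline
right now, so how long it has been offline still has to be tracked separately
before deciding to close anything.

```python
import json


def inactive_channels(listchannels_json: str) -> dict:
    """Group the channel points of inactive channels by visibility.

    `listchannels_json` is the raw JSON text printed by `lncli listchannels`
    (field names are assumed, see the note above).
    """
    data = json.loads(listchannels_json)
    result = {"public": [], "private": []}
    for chan in data.get("channels", []):
        # A missing `active` field is treated as inactive.
        if chan.get("active", False):
            continue
        key = "private" if chan.get("private", False) else "public"
        result[key].append(chan["channel_point"])
    return result
```

The output of `lncli listchannels` could be saved to a file or piped into a
small script that calls this function at regular intervals and records which
channels keep showing up.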

### Migrating a node to a new device

As mentioned in the chapters [aezeed](#aezeed) and
[SCB](#static-channel-backups-scbs) you should never use the same seed on two
different nodes and restoring from SCB is not a migration but an emergency
procedure.
What is the correct way to migrate an existing node to a new device? There is
an easy way that should work for most people and there's the harder/costlier
fallback way to do it.

**Option 1: Move the whole data directory to the new device**
This option works very well if the new device runs the same operating system on
the same (or at least very similar) architecture. If that is the case, the whole
`/home/<user>/.lnd` directory in Linux (or
`$HOME/Library/Application Support/lnd` in MacOS, `%LOCALAPPDATA%\lnd` in
Windows) can be moved to the new device and `lnd` started there. It is important
to shut down `lnd` on the old device before moving the directory!
**Not supported/untested** is moving the data directory between different
operating systems (for example `MacOS` <-> `Linux` or `Windows` <-> `Linux`) or
different system architectures (for example `ARM` -> `amd64`). Data
corruption or unexpected behavior can be the result. Users switching between
operating systems or architectures should always use Option 2!

Migrating between 32bit and 64bit of the same architecture (e.g. `ARM32` ->
`ARM64`) is known to be safe. To avoid issues with the main channel database
(`channel.db`) becoming too large for 32bit systems, it is in fact recommended
for Raspberry Pi users (for example RaspiBlitz or myNode) to migrate to the
latest version that supports running 64bit `lnd`.

**Option 2: Start from scratch**
If option 1 does not work or is too risky, the safest course of action is to
initialize the existing node again from scratch.
Unfortunately this incurs some
on-chain fee costs as all channels will need to be closed. Using the same seed
means restoring the same network node identity as before. If a new identity
should be created, a new seed needs to be created.
Follow these steps to create the **same node (with the same seed)** from
scratch:
1. On the old device, close all channels (`lncli closeallchannels`). The
   command can take up to several minutes depending on the number of channels.
   **Do not interrupt the command!**
1. Wait for all channels to be fully closed. If some nodes don't respond to the
   close request it can be that `lnd` will go ahead and force close those
   channels. This means that the local balance will be time locked for up to
   two weeks (depending on the channel size). Check `lncli pendingchannels` to
   see if any channels are still in the process of being force closed.
1. After all channels are fully closed (and `lncli pendingchannels` lists zero
   channels), `lnd` can be shut down on the old device.
1. Start `lnd` on the new device and create a new wallet with the existing seed
   that was used on the old device (answer "yes" when asked if an existing seed
   should be used).
1. Wait for the wallet to rescan the blockchain. This can take up to several
   hours depending on the age of the seed and the speed of the chain backend.
1. After the chain is fully synced (`lncli getinfo` shows
   `"synced_to_chain": true`) the on-chain funds from the previous device should
   now be visible on the new device as well and new channels can be opened.
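
The two checks in the steps above (no pending channels left on the old device,
chain fully synced on the new one) can be sketched as small helpers that
inspect the JSON printed by `lncli pendingchannels` and `lncli getinfo`. The
helper names are made up, and the list names in `PENDING_KEYS` are assumed to
match the output of the `lncli` version in use, so they should be compared
against real output before relying on them.

```python
import json

# Assumed names of the channel lists in the `lncli pendingchannels` output.
PENDING_KEYS = (
    "pending_open_channels",
    "pending_closing_channels",
    "pending_force_closing_channels",
    "waiting_close_channels",
)


def count_pending(pendingchannels_json: str) -> int:
    """Count channels that are still opening, closing or force closing."""
    data = json.loads(pendingchannels_json)
    return sum(len(data.get(key) or []) for key in PENDING_KEYS)


def is_synced(getinfo_json: str) -> bool:
    """Return True only if `getinfo` reports `"synced_to_chain": true`."""
    return json.loads(getinfo_json).get("synced_to_chain") is True
```

The old device should only be shut down once `count_pending` returns zero, and
new channels should only be opened once `is_synced` returns `True` on the new
device.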

**What to do after the move**
If things don't work as expected on the moved or re-created node, consider this
list of things that possibly need to be changed to work on a new device:
* In case the new device has a different hostname and TLS connection problems
  occur, delete the `tls.key` and `tls.cert` files in the data directory and
  restart `lnd` to recreate them.
* If an external IP is set (either with `--externalip` or `--tlsextraip`) these
  might need to be changed if the new machine has a different address. Changing
  the `--tlsextraip` setting also means regenerating the certificate pair. See
  the first item in this list.
* If port `9735` (or `10009` for gRPC) was forwarded on the router, these
  forwarded ports need to point to the new device. The same applies to firewall
  rules.
* It might take more than 24 hours for a new IP address to be visible on
  network explorers.
* If channels show as offline after several hours, try to manually connect to
  the remote peer. They might still try to reach `lnd` on the old address.

### Migrating a node from clearnet to Tor

If an `lnd` node has already been connected to the internet with an IPv4 or IPv6
(clearnet) address and has any non-private channels, this connection between
channels and IP address is known to the network and cannot be deleted.
Starting the same node with the same identity and channels using Tor is trivial
to link back to any previously used clearnet IP address and does therefore not
provide any privacy benefits.
The following steps are recommended to cut all links between the old clearnet
node and the new Tor node:
1. Close all channels on the old node and wait for them to fully close.
1. Send all on-chain funds of the old node through a Coin Join service (like
   Wasabi or Samourai/Whirlpool) until a sufficiently high anonymity set is
   reached.
1. Create a new `lnd` node with a **new seed** that is only connected to Tor
   and generate an on-chain address on the new node.
1. Send the mixed/coinjoined coins to the address of the new node.
1. Start opening channels.
1. Check an online network explorer that no IPv4 or IPv6 address is associated
   with the new node's identity.

### Prevent data corruption

Many problems while running an `lnd` node can be prevented by avoiding data
corruption in the channel database (`<lnddir>/data/graph/mainnet/channel.db`).

The following (non-exhaustive) list of things can lead to data corruption:
* A spinning hard drive gets a physical shock.
* `lnd`'s main data directory being written on an SD card or USB thumb drive
  (SD cards and USB thumb drives _must_ be considered unsafe for critical files
  that are written to very often, as the channel DB is).
* `lnd`'s main data directory being written to a network drive without
  `fsync` support.
* Unclean shutdown of `lnd`.
* Aborting channel operation commands (see next chapter).
* Not enough disk space for a growing channel DB file.
* Moving `lnd`'s main data directory between different operating
  systems/architectures.

To avoid most of these factors, it is recommended to store `lnd`'s main data
directory on a Solid State Drive (SSD) of a reliable manufacturer.
An alternative or extension to that is to use a replicated disk setup. Making
sure a power failure does not interrupt the node by running a UPS
(uninterruptible power supply) might also make sense depending on the reliability
of the local power grid and the amount of funds at stake.

### Don't interrupt `lncli` commands

Things can start to take a while to execute if a node has more than 50 to 100
channels.
It is extremely important to **never interrupt an `lncli` command**
if it is manipulating the channel database, which is true for the following
commands:
  - `openchannel`
  - `closechannel` and `closeallchannels`
  - `abandonchannel`
  - `updatechanpolicy`
  - `restorechanbackup`

Interrupting any of those commands can lead to an inconsistent state of the
channel database and unpredictable behavior. If it is uncertain if a command
is really stuck or if the node is still working on it, a look at the log file
can help to get an idea.

### Regular accounting/monitoring

Regular monitoring of a node and keeping track of the movement of funds can help
prevent problems. Tools like [`lndmon`](https://github.com/lightninglabs/lndmon)
can assist with these tasks.

### Pruned bitcoind node

Running `lnd` connected to a `bitcoind` node that is running in prune mode is
not supported! `lnd` needs to verify the funding transaction of every channel
in the network and be able to retrieve that information from `bitcoind` which
it cannot deliver when that information is pruned away.

In theory pruning away all blocks _before_ the SegWit activation would work
as LN channels rely on SegWit. But this has neither been tested nor would it
be recommended/supported.

In addition to not running a pruned node, it is recommended to run `bitcoind`
with the `-txindex` flag for performance reasons, though this is not strictly
required.

Multiple `lnd` nodes can run off of a single `bitcoind` instance. There will be
connection/thread/performance limits at some number of `lnd` nodes but in
practice running 2 or 3 `lnd` instances per `bitcoind` node didn't show any
problems.

### The `--noseedbackup` flag

This is a flag that is only used for integration tests and should **never** be
used on mainnet!
Turning this flag on means that the 24 word seed will not be
shown when creating a wallet. The seed is required to restore a node in case
of data corruption and without it all funds (on-chain and off-chain) are
being put at risk.