..
  Normally, there are no heading levels assigned to certain characters as the structure is
  determined from the succession of headings. However, this convention is used in Python’s
  Style Guide for documenting which you may follow:

  # with overline, for parts
  * for chapters
  = for sections
  - for subsections
  ^ for subsubsections
  " for paragraphs

##########
References
##########

******
Design
******

Terminology
===========

This section introduces terminology used in this document.

*Repository*: All data produced during a backup is sent to and stored in
a repository in a structured form, for example in a file system
hierarchy with several subdirectories. A repository implementation must
be able to fulfill a number of operations, e.g. list the contents.

*Blob*: A Blob combines a number of data bytes with identifying
information like the SHA-256 hash of the data and its length.

*Pack*: A Pack combines one or more Blobs, e.g. in a single file.

*Snapshot*: A Snapshot stands for the state of a file or directory that
has been backed up at some point in time. The state here means the
content and meta data like the name and modification time for the file
or the directory and its contents.

*Storage ID*: A storage ID is the SHA-256 hash of the content stored in
the repository. This ID is required in order to load the file from the
repository.

Repository Format
=================

All data is stored in a restic repository. A repository is able to store
data of several different types, which can later be requested based on
an ID. This so-called "storage ID" is the SHA-256 hash of the content of
a file. All files in a repository are only written once and never
modified afterwards.
This allows accessing and even writing to the
repository with multiple clients in parallel. Only the delete operation
removes data from the repository.

Repositories consist of several directories and a top-level file called
``config``. For all other files stored in the repository, the name for
the file is the lower case hexadecimal representation of the storage ID,
which is the SHA-256 hash of the file's contents. This allows for easy
verification of files for accidental modifications, like disk read
errors, by simply running the program ``sha256sum`` on the file and
comparing its output to the file name. If the prefix of a filename is
unique amongst all the other files in the same directory, the prefix may
be used instead of the complete filename.

Apart from the files stored within the ``keys`` directory, all files are
encrypted with AES-256 in counter mode (CTR). The integrity of the
encrypted data is secured by a Poly1305-AES message authentication code
(sometimes also referred to as a "signature").

In the first 16 bytes of each encrypted file the initialisation vector
(IV) is stored. It is followed by the encrypted data and completed by
the 16 byte MAC. The format is: ``IV || CIPHERTEXT || MAC``. The
complete encryption overhead is 32 bytes. For each file, a new random IV
is selected.

The file ``config`` is encrypted this way and contains a JSON document
like the following:

.. code:: json

    {
      "version": 1,
      "id": "5956a3f67a6230d4a92cefb29529f10196c7d92582ec305fd71ff6d331d6271b",
      "chunker_polynomial": "25b468838dcb75"
    }

After decryption, restic first checks that the version field contains a
version number that it understands, otherwise it aborts. At the moment,
the version is expected to be 1. The field ``id`` holds a unique ID
which consists of 32 random bytes, encoded in hexadecimal.
This uniquely
identifies the repository, regardless of whether it is accessed via SFTP or
locally. The field ``chunker_polynomial`` contains a parameter that is
used for splitting large files into smaller chunks (see below).

Repository Layout
-----------------

The ``local`` and ``sftp`` backends are implemented using files and
directories stored in a file system. The directory layout is the same
for both backend types.

The basic layout of a repository stored in a ``local`` or ``sftp``
backend is shown here:

::

    /tmp/restic-repo
    ├── config
    ├── data
    │   ├── 21
    │   │   └── 2159dd48f8a24f33c307b750592773f8b71ff8d11452132a7b2e2a6a01611be1
    │   ├── 32
    │   │   └── 32ea976bc30771cebad8285cd99120ac8786f9ffd42141d452458089985043a5
    │   ├── 59
    │   │   └── 59fe4bcde59bd6222eba87795e35a90d82cd2f138a27b6835032b7b58173a426
    │   ├── 73
    │   │   └── 73d04e6125cf3c28a299cc2f3cca3b78ceac396e4fcf9575e34536b26782413c
    │   [...]
    ├── index
    │   ├── c38f5fb68307c6a3e3aa945d556e325dc38f5fb68307c6a3e3aa945d556e325d
    │   └── ca171b1b7394d90d330b265d90f506f9984043b342525f019788f97e745c71fd
    ├── keys
    │   └── b02de829beeb3c01a63e6b25cbd421a98fef144f03b9a02e46eff9e2ca3f0bd7
    ├── locks
    ├── snapshots
    │   └── 22a5af1bdc6e616f8a29579458c49627e01b32210d09adb288d1ecda7c5711ec
    └── tmp

A local repository can be initialized with the ``restic init`` command,
e.g.:

.. code-block:: console

    $ restic -r /tmp/restic-repo init

The local and sftp backends will auto-detect and accept all layouts described
in the following sections, so that remote repositories mounted locally e.g. via
fuse can be accessed. The layout auto-detection can be overridden by specifying
the option ``-o local.layout=default``; valid values are ``default`` and
``s3legacy``. The option for the sftp backend is named ``sftp.layout``, for the
s3 backend ``s3.layout``.
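
The default layout shown above can be sketched in a few lines of Python. This is only an illustration derived from the directory listing: the helper name ``repo_path`` is hypothetical and not part of restic, and the rule that data files live in a subdirectory named after the first two hex characters of the ID is read off the example tree.

```python
from pathlib import PurePosixPath

# File types that have their own directory in the default layout,
# as seen in the directory listing above.
TYPE_DIRS = ("data", "index", "keys", "locks", "snapshots")


def repo_path(file_type: str, storage_id: str = "") -> PurePosixPath:
    """Return the repository-relative path of a file (hypothetical helper)."""
    if file_type == "config":
        # The config file sits at the top level and has no storage ID in its name.
        return PurePosixPath("config")
    if file_type not in TYPE_DIRS:
        raise ValueError(f"unknown file type: {file_type!r}")
    if file_type == "data":
        # Data files are grouped in a subdirectory named after the
        # first two hex characters of the storage ID.
        return PurePosixPath("data") / storage_id[:2] / storage_id
    return PurePosixPath(file_type) / storage_id


pack_id = "2159dd48f8a24f33c307b750592773f8b71ff8d11452132a7b2e2a6a01611be1"
print(repo_path("data", pack_id))
print(repo_path("snapshots", "22a5af1bdc6e616f8a29579458c49627e01b32210d09adb288d1ecda7c5711ec"))
```

The two printed paths correspond to entries in the listing above, relative to ``/tmp/restic-repo``.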

S3 Legacy Layout
----------------

Unfortunately during development the AWS S3 backend used slightly different
paths (directory names use singular instead of plural for ``key``,
``lock``, and ``snapshot`` files), and the data files are stored directly below
the ``data`` directory. The S3 Legacy repository layout looks like this:

::

    /config
    /data
     ├── 2159dd48f8a24f33c307b750592773f8b71ff8d11452132a7b2e2a6a01611be1
     ├── 32ea976bc30771cebad8285cd99120ac8786f9ffd42141d452458089985043a5
     ├── 59fe4bcde59bd6222eba87795e35a90d82cd2f138a27b6835032b7b58173a426
     ├── 73d04e6125cf3c28a299cc2f3cca3b78ceac396e4fcf9575e34536b26782413c
    [...]
    /index
     ├── c38f5fb68307c6a3e3aa945d556e325dc38f5fb68307c6a3e3aa945d556e325d
     └── ca171b1b7394d90d330b265d90f506f9984043b342525f019788f97e745c71fd
    /key
     └── b02de829beeb3c01a63e6b25cbd421a98fef144f03b9a02e46eff9e2ca3f0bd7
    /lock
    /snapshot
     └── 22a5af1bdc6e616f8a29579458c49627e01b32210d09adb288d1ecda7c5711ec

The S3 backend understands and accepts both forms; new backends are
always created with the default layout for compatibility reasons.

Pack Format
===========

All files in the repository except Key and Pack files just contain raw
data, stored as ``IV || Ciphertext || MAC``. Pack files may contain one
or more Blobs of data.

A Pack's structure is as follows:

::

    EncryptedBlob1 || ... || EncryptedBlobN || EncryptedHeader || Header_Length

At the end of the Pack file is a header, which describes the content.
The header is encrypted and authenticated. ``Header_Length`` is the
length of the encrypted header encoded as a four byte integer in
little-endian encoding. Placing the header at the end of a file allows
writing the blobs in a continuous stream as soon as they are read during
the backup phase.
This reduces code complexity and avoids having to
re-write a file once the pack is complete and the content and length of
the header is known.

All the blobs (``EncryptedBlob1``, ``EncryptedBlobN`` etc.) are
authenticated and encrypted independently. This enables repository
reorganisation without having to touch the encrypted Blobs. In addition
it also allows efficient indexing, for only the header needs to be read
in order to find out which Blobs are contained in the Pack. Since the
header is authenticated, authenticity of the header can be checked
without having to read the complete Pack.

After decryption, a Pack's header consists of the following elements:

::

    Type_Blob1 || Length(EncryptedBlob1) || Hash(Plaintext_Blob1) ||
    [...]
    Type_BlobN || Length(EncryptedBlobN) || Hash(Plaintext_BlobN) ||

This is enough to calculate the offsets for all the Blobs in the Pack.
Length is the length of a Blob as a four byte integer in little-endian
format. The type field is a one byte field and labels the content of a
blob according to the following table:

+--------+-----------+
| Type   | Meaning   |
+========+===========+
| 0      | data      |
+--------+-----------+
| 1      | tree      |
+--------+-----------+

All other types are invalid; more types may be added in the future.

For reconstructing the index or parsing a pack without an index, first
the last four bytes must be read in order to find the length of the
header. Afterwards, the header can be read and parsed, which yields all
plaintext hashes, types, offsets and lengths of all included blobs.

Indexing
========

Index files contain information about Data and Tree Blobs and the Packs
they are contained in and store this information in the repository.
When
the local cached index is not accessible any more, the index files can
be downloaded and used to reconstruct the index. The files are encrypted
and authenticated like Data and Tree Blobs, so the outer structure is
``IV || Ciphertext || MAC`` again. The plaintext consists of a JSON
document like the following:

.. code:: json

    {
      "supersedes": [
        "ed54ae36197f4745ebc4b54d10e0f623eaaaedd03013eb7ae90df881b7781452"
      ],
      "packs": [
        {
          "id": "73d04e6125cf3c28a299cc2f3cca3b78ceac396e4fcf9575e34536b26782413c",
          "blobs": [
            {
              "id": "3ec79977ef0cf5de7b08cd12b874cd0f62bbaf7f07f3497a5b1bbcc8cb39b1ce",
              "type": "data",
              "offset": 0,
              "length": 25
            },
            {
              "id": "9ccb846e60d90d4eb915848add7aa7ea1e4bbabfc60e573db9f7bfb2789afbae",
              "type": "tree",
              "offset": 38,
              "length": 100
            },
            {
              "id": "d3dc577b4ffd38cc4b32122cabf8655a0223ed22edfd93b353dc0c3f2b0fdf66",
              "type": "data",
              "offset": 150,
              "length": 123
            }
          ]
        }, [...]
      ]
    }

This JSON document lists Packs and the blobs contained therein. In this
example, the Pack ``73d04e61`` contains two data Blobs and one Tree
blob, the plaintext hashes are listed afterwards.

The field ``supersedes`` lists the storage IDs of index files that have
been replaced with the current index file. This happens when index files
are repacked, for example when old snapshots are removed and Packs are
recombined.

There may be an arbitrary number of index files, containing information
on non-disjoint sets of Packs. The number of packs described in a single
file is chosen so that the file size is kept below 8 MiB.

Keys, Encryption and MAC
========================

All data stored by restic in the repository is encrypted with AES-256 in
counter mode and authenticated using Poly1305-AES.
For encrypting new
data, first 16 bytes are read from a cryptographically secure
pseudorandom number generator as a random nonce. This is used both as
the IV for counter mode and the nonce for Poly1305. This operation needs
three keys: a 32 byte key for AES-256 encryption, a 16 byte AES key and
a 16 byte key for Poly1305. For details see the original paper `The
Poly1305-AES message-authentication
code <http://cr.yp.to/mac/poly1305-20050329.pdf>`__ by Dan Bernstein.
The data is then encrypted with AES-256 and afterwards a message
authentication code (MAC) is computed over the ciphertext, everything is
then stored as IV \|\| CIPHERTEXT \|\| MAC.

The directory ``keys`` contains key files. These are simple JSON
documents which contain all data that is needed to derive the
repository's master encryption and message authentication keys from a
user's password. The JSON document from the repository can be
pretty-printed for example by using the Python module ``json``
(shortened to increase readability):

::

    $ python -mjson.tool /tmp/restic-repo/keys/b02de82*
    {
        "hostname": "kasimir",
        "username": "fd0",
        "kdf": "scrypt",
        "N": 65536,
        "r": 8,
        "p": 1,
        "created": "2015-01-02T18:10:13.48307196+01:00",
        "data": "tGwYeKoM0C4j4/9DFrVEmMGAldvEn/+iKC3te/QE/6ox/V4qz58FUOgMa0Bb1cIJ6asrypCx/Ti/pRXCPHLDkIJbNYd2ybC+fLhFIJVLCvkMS+trdywsUkglUbTbi+7+Ldsul5jpAj9vTZ25ajDc+4FKtWEcCWL5ICAOoTAxnPgT+Lh8ByGQBH6KbdWabqamLzTRWxePFoYuxa7yXgmj9A==",
        "salt": "uW4fEI1+IOzj7ED9mVor+yTSJFd68DGlGOeLgJELYsTU5ikhG/83/+jGd4KKAaQdSrsfzrdOhAMftTSih5Ux6w=="
    }

When the repository is opened by restic, the user is prompted for the
repository password. This is then used with ``scrypt``, a key derivation
function (KDF), and the supplied parameters (``N``, ``r``, ``p`` and
``salt``) to derive 64 key bytes.
The first 32 bytes are used as the
encryption key (for AES-256) and the last 32 bytes are used as the
message authentication key (for Poly1305-AES). These last 32 bytes are
divided into a 16 byte AES key ``k`` followed by 16 bytes of secret key
``r``. The key ``r`` is then masked for use with Poly1305 (see the paper
for details).

Those keys are used to authenticate and decrypt the bytes contained in
the JSON field ``data`` with AES-256 and Poly1305-AES as if they were
any other blob (after removing the Base64 encoding). If the
password is incorrect or the key file has been tampered with, the
computed MAC will not match the last 16 bytes of the data, and restic
exits with an error. Otherwise, the data yields a JSON document
which contains the master encryption and message authentication keys for
this repository (encoded in Base64). The command
``restic cat masterkey`` can be used as follows to decrypt and
pretty-print the master key:

.. code-block:: console

    $ restic -r /tmp/restic-repo cat masterkey
    {
        "mac": {
            "k": "evFWd9wWlndL9jc501268g==",
            "r": "E9eEDnSJZgqwTOkDtOp+Dw=="
        },
        "encrypt": "UQCqa0lKZ94PygPxMRqkePTZnHRYh1k1pX2k2lM2v3Q="
    }

All data in the repository is encrypted and authenticated with these
master keys. For encryption, the AES-256 algorithm in Counter mode is
used. For message authentication, Poly1305-AES is used as described
above.

A repository can have several different passwords, with a key file for
each. This way, the password can be changed without having to re-encrypt
all data.

Snapshots
=========

A snapshot represents a directory with all files and sub-directories at
a given point in time. For each backup that is made, a new snapshot is
created. A snapshot is a JSON document that is stored in an encrypted
file below the directory ``snapshots`` in the repository.
The filename
is the storage ID. This string is unique and used within restic to
uniquely identify a snapshot.

The command ``restic cat snapshot`` can be used as follows to decrypt
and pretty-print the contents of a snapshot file:

.. code-block:: console

    $ restic -r /tmp/restic-repo cat snapshot 251c2e58
    enter password for repository:
    {
      "time": "2015-01-02T18:10:50.895208559+01:00",
      "tree": "2da81727b6585232894cfbb8f8bdab8d1eccd3d8f7c92bc934d62e62e618ffdf",
      "dir": "/tmp/testdata",
      "hostname": "kasimir",
      "username": "fd0",
      "uid": 1000,
      "gid": 100,
      "tags": [
        "NL"
      ]
    }

Here it can be seen that this snapshot represents the contents of the
directory ``/tmp/testdata``. The most important field is ``tree``. When
the meta data (e.g. the tags) of a snapshot change, the snapshot needs
to be re-encrypted and saved. This will change the storage ID, so in
order to relate these seemingly different snapshots, a field
``original`` is introduced which contains the ID of the original
snapshot, e.g. after adding the tag ``DE`` to the snapshot above it
becomes:

.. code-block:: console

    $ restic -r /tmp/restic-repo cat snapshot 22a5af1b
    enter password for repository:
    {
      "time": "2015-01-02T18:10:50.895208559+01:00",
      "tree": "2da81727b6585232894cfbb8f8bdab8d1eccd3d8f7c92bc934d62e62e618ffdf",
      "dir": "/tmp/testdata",
      "hostname": "kasimir",
      "username": "fd0",
      "uid": 1000,
      "gid": 100,
      "tags": [
        "NL",
        "DE"
      ],
      "original": "251c2e5841355f743f9d4ffd3260bee765acee40a6229857e32b60446991b837"
    }

Once introduced, the ``original`` field is not modified when the
snapshot's meta data is changed again.

All content within a restic repository is referenced according to its
SHA-256 hash. Before saving, each file is split into variable sized
Blobs of data.
The SHA-256 hashes of all Blobs are saved in an ordered
list which then represents the content of the file.

In order to relate these plaintext hashes to the actual location within
a Pack file, an index is used. If the index is not available, the
header of all data Blobs can be read.

Trees and Data
==============

A snapshot references a tree by the SHA-256 hash of the JSON string
representation of its contents. Trees and data are saved in pack files
in a subdirectory of the directory ``data``.

The command ``restic cat blob`` can be used to inspect the tree
referenced above (piping the output of the command to ``jq .`` so that
the JSON is indented):

.. code-block:: console

    $ restic -r /tmp/restic-repo cat blob 2da81727b6585232894cfbb8f8bdab8d1eccd3d8f7c92bc934d62e62e618ffdf | jq .
    enter password for repository:
    {
      "nodes": [
        {
          "name": "testdata",
          "type": "dir",
          "mode": 493,
          "mtime": "2014-12-22T14:47:59.912418701+01:00",
          "atime": "2014-12-06T17:49:21.748468803+01:00",
          "ctime": "2014-12-22T14:47:59.912418701+01:00",
          "uid": 1000,
          "gid": 100,
          "user": "fd0",
          "inode": 409704562,
          "content": null,
          "subtree": "b26e315b0988ddcd1cee64c351d13a100fedbc9fdbb144a67d1b765ab280b4dc"
        }
      ]
    }

A tree contains a list of entries (in the field ``nodes``) which contain
meta data like a name and timestamps. When the entry references a
directory, the field ``subtree`` contains the plain text ID of another
tree object.

When the command ``restic cat blob`` is used, the plaintext ID is needed
to print a tree. The tree referenced above can be dumped as follows:

.. code-block:: console

    $ restic -r /tmp/restic-repo cat blob b26e315b0988ddcd1cee64c351d13a100fedbc9fdbb144a67d1b765ab280b4dc
    enter password for repository:
    {
      "nodes": [
        {
          "name": "testfile",
          "type": "file",
          "mode": 420,
          "mtime": "2014-12-06T17:50:23.34513538+01:00",
          "atime": "2014-12-06T17:50:23.338468713+01:00",
          "ctime": "2014-12-06T17:50:23.34513538+01:00",
          "uid": 1000,
          "gid": 100,
          "user": "fd0",
          "inode": 416863351,
          "size": 1234,
          "links": 1,
          "content": [
            "50f77b3b4291e8411a027b9f9b9e64658181cc676ce6ba9958b95f268cb1109d"
          ]
        },
        [...]
      ]
    }

This tree contains a file entry. This time, the ``subtree`` field is not
present and the ``content`` field contains a list with one plain text
SHA-256 hash.

The command ``restic cat blob`` can also be used to extract and decrypt
data given a plaintext ID, e.g. for the data mentioned above:

.. code-block:: console

    $ restic -r /tmp/restic-repo cat blob 50f77b3b4291e8411a027b9f9b9e64658181cc676ce6ba9958b95f268cb1109d | sha256sum
    enter password for repository:
    50f77b3b4291e8411a027b9f9b9e64658181cc676ce6ba9958b95f268cb1109d  -

As can be seen from the output of the program ``sha256sum``, the hash
matches the plaintext hash from the ``content`` list in the tree above, so
the correct data has been returned.

Locks
=====

The restic repository structure is designed in a way that allows
parallel access of multiple instances of restic and even parallel writes.
However, there are some functions that work more efficiently or even
require exclusive access to the repository. In order to implement these
functions, restic processes are required to create a lock on the
repository before doing anything.

Locks come in two types: Exclusive and non-exclusive locks.
At most one
process can have an exclusive lock on the repository, and during that
time there must not be any other locks (exclusive or non-exclusive).
There may be multiple non-exclusive locks in parallel.

A lock is a file in the subdirectory ``locks`` whose filename is the storage
ID of the contents. It is encrypted and authenticated the same way as
other files in the repository and contains the following JSON structure:

.. code:: json

    {
      "time": "2015-06-27T12:18:51.759239612+02:00",
      "exclusive": false,
      "hostname": "kasimir",
      "username": "fd0",
      "pid": 13607,
      "uid": 1000,
      "gid": 100
    }

The field ``exclusive`` defines the type of lock. When a new lock is to
be created, restic checks all locks in the repository. When a lock is
found, it is tested if the lock is stale, which is the case for locks
with timestamps older than 30 minutes. If the lock was created on the
same machine, even for younger locks it is tested whether the process is
still alive by sending a signal to it. If that fails, restic assumes
that the process is dead and considers the lock to be stale.

When a new lock is to be created and no other conflicting locks are
detected, restic creates a new lock, waits, and checks if other locks
appeared in the repository. Depending on the type of the other locks and
the lock to be created, restic either continues or fails.

Backups and Deduplication
=========================

For creating a backup, restic scans the source directory for all files,
sub-directories and other entries. The data from each file is split into
variable length Blobs cut at offsets defined by a sliding window of 64
bytes. The implementation uses Rabin Fingerprints for implementing this
Content Defined Chunking (CDC).
An irreducible polynomial is selected at
random and saved in the file ``config`` when a repository is
initialized, so that watermark attacks are much harder.

Files smaller than 512 KiB are not split; Blobs are between 512 KiB and
8 MiB in size. The implementation aims for 1 MiB Blob size on average.

For modified files, only modified Blobs have to be saved in a subsequent
backup. This even works if bytes are inserted or removed at arbitrary
positions within the file.

Threat Model
============

The design goals for restic include being able to securely store backups
in a location that is not completely trusted, e.g. a shared system where
others can potentially access the files or (in the case of the system
administrator) even modify or delete them.

General assumptions:

-  The host system a backup is created on is trusted. This is the most
   basic requirement, and essential for creating trustworthy backups.

The restic backup program guarantees the following:

-  Accessing the unencrypted content of stored files and metadata should
   not be possible without a password for the repository. Everything
   except the metadata included for informational purposes in the key
   files is encrypted and authenticated.

-  Modifications (intentional or unintentional) can be detected
   automatically on several layers:

   1. For all accesses of data stored in the repository it is checked
      whether the cryptographic hash of the contents matches the storage
      ID (the file's name). This way, modifications (bad RAM, broken
      harddisk) can be detected easily.

   2. Before decrypting any data, the MAC on the encrypted data is
      checked. If there has been a modification, the MAC check will
      fail. This step happens even before the data is decrypted, so data
      that has been tampered with is not decrypted at all.
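
The first detection layer, and the framing needed for the second, can be sketched as follows. This is an illustration only, not restic's code: the function names are hypothetical, and the Poly1305-AES check itself is left out because the Python standard library does not provide it. The sizes come from the ``IV || CIPHERTEXT || MAC`` format described earlier.

```python
import hashlib

IV_SIZE = 16   # bytes of IV at the start of every encrypted file
MAC_SIZE = 16  # bytes of MAC at the end of every encrypted file


def matches_storage_id(name: str, raw: bytes) -> bool:
    """Layer 1: the file name must equal the SHA-256 hash of the file contents."""
    return hashlib.sha256(raw).hexdigest() == name


def split_envelope(raw: bytes) -> tuple[bytes, bytes, bytes]:
    """Split a stored file into (IV, ciphertext, MAC) for the MAC check.

    Verifying the MAC (layer 2) needs the repository keys and a
    Poly1305-AES implementation, which this sketch does not include.
    """
    if len(raw) < IV_SIZE + MAC_SIZE:
        raise ValueError("file is shorter than the 32 byte encryption overhead")
    return raw[:IV_SIZE], raw[IV_SIZE:-MAC_SIZE], raw[-MAC_SIZE:]


# Stand-in bytes for a stored file; real files contain ciphertext.
stored = bytes(range(48))
name = hashlib.sha256(stored).hexdigest()
assert matches_storage_id(name, stored)
assert not matches_storage_id(name, stored + b"\x00")  # any change is detected
iv, ciphertext, mac = split_envelope(stored)
```

Only after both checks pass would the ciphertext be handed to the decryption step.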

However, the restic backup program is not designed to protect against
attackers deleting files at the storage location. There is nothing that
can be done about this. If this needs to be guaranteed, get a secure
location without any access from third parties. If you assume that
attackers have write access to your files at the storage location,
attackers are able to figure out (e.g. based on the timestamps of the
stored files) which files belong to what snapshot. When only these files
are deleted, the particular snapshot vanishes and all snapshots
depending on data that has been added in the snapshot cannot be restored
completely. Restic is not designed to detect this attack.

Local Cache
===========

In order to speed up certain operations, restic manages a local cache of data.
This document describes the data structures for the local cache with version 1.

Versions
--------

The cache directory is selected according to the `XDG base dir specification
<http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html>`__.
Each repository has its own cache sub-directory, consisting of the repository ID
which is chosen at ``init``. All cache directories for different repos are
independent of each other.

The cache dir for a repo contains a file named ``version``, which contains a
single ASCII integer line that stands for the current version of the cache. If
a lower version number is found the cache is recreated with the current
version. If a higher version number is found the cache is ignored and left as
is.

Snapshots and Indexes
---------------------

Snapshot, Data and Index files are cached in the sub-directories ``snapshots``,
``data`` and ``index``, as read from the repository.
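
The version rule for the cache directory can be written down as a small decision function. This is a minimal sketch rather than restic's implementation: the name ``cache_action`` and the returned strings are hypothetical, and treating a missing or unparsable ``version`` file like an outdated cache is an assumption the text does not state.

```python
from pathlib import Path

CACHE_VERSION = 1  # the cache version described in this document


def cache_action(cache_dir: Path, current: int = CACHE_VERSION) -> str:
    """Decide what to do with a repository's cache directory (hypothetical helper)."""
    try:
        found = int((cache_dir / "version").read_text(encoding="ascii").strip())
    except (FileNotFoundError, ValueError):
        # Assumption: no usable version file is handled like an outdated cache.
        return "recreate"
    if found < current:
        return "recreate"  # lower version: recreate with the current version
    if found > current:
        return "ignore"    # higher version: leave the cache untouched
    return "use"
```

For example, a directory whose ``version`` file contains ``2`` yields ``"ignore"`` while the current version is 1.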


************
REST Backend
************

Restic can interact with an HTTP backend that respects the following REST
API. The following values are valid for ``{type}``: ``data``, ``keys``,
``locks``, ``snapshots``, ``index``, ``config``. ``{path}`` is a path to
the repository, so that multiple different repositories can be accessed.
The default path is ``/``.

POST {path}?create=true
=======================

This request is used to initially create a new repository. The server
responds with "200 OK" if the repository structure was created
successfully or already exists, otherwise an error is returned.

DELETE {path}
=============

Deletes the repository on the server side. The server responds with "200
OK" if the repository was successfully removed. If this function is not
implemented the server returns "501 Not Implemented", if it is
denied by the server it returns "403 Forbidden".

HEAD {path}/config
==================

Returns "200 OK" if the repository has a configuration, an HTTP error
otherwise.

GET {path}/config
=================

Returns the content of the configuration file if the repository has a
configuration, an HTTP error otherwise.

Response format: binary/octet-stream

POST {path}/config
==================

Returns "200 OK" if the configuration of the request body has been
saved, an HTTP error otherwise.

GET {path}/{type}/
==================

Returns a JSON array containing the names of all the blobs stored for a
given type.

Response format: JSON

HEAD {path}/{type}/{name}
=========================

Returns "200 OK" if the blob with the given name and type is stored in
the repository, "404 not found" otherwise. If the blob exists, the HTTP
header ``Content-Length`` is set to the file size.
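
To make the URL scheme concrete, the following sketch builds (but does not send) requests for some of the endpoints described so far, using Python's ``urllib.request``. The helper names and the base URL ``http://localhost:8000/`` are assumptions made for illustration; only the paths, query string and methods come from the API description.

```python
from urllib.request import Request

VALID_TYPES = {"data", "keys", "locks", "snapshots", "index", "config"}


def blob_url(base: str, file_type: str, name: str = "") -> str:
    """Build {path}/{type}/{name}; an empty name gives the listing URL."""
    if file_type not in VALID_TYPES:
        raise ValueError(f"invalid type: {file_type!r}")
    base = base.rstrip("/")
    if file_type == "config":
        return f"{base}/config"
    return f"{base}/{file_type}/{name}"


def create_repo_request(base: str) -> Request:
    """POST {path}?create=true"""
    return Request(f"{base}?create=true", method="POST")


def head_blob_request(base: str, file_type: str, name: str) -> Request:
    """HEAD {path}/{type}/{name}"""
    return Request(blob_url(base, file_type, name), method="HEAD")


base = "http://localhost:8000/"  # hypothetical server, repository at the default path
req = head_blob_request(base, "data", "2159dd48")
print(req.get_method(), req.full_url)
```

Passing a request object to ``urllib.request.urlopen`` would actually send it; a "404 not found" answer then surfaces as an ``HTTPError`` exception.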

GET {path}/{type}/{name}
========================

Returns the content of the blob with the given name and type if it is
stored in the repository, "404 not found" otherwise.

If the request specifies a partial read with a Range header field, then
the status code of the response is 206 instead of 200 and the response
only contains the specified range.

Response format: binary/octet-stream

POST {path}/{type}/{name}
=========================

Saves the content of the request body as a blob with the given name and
type; returns an HTTP error otherwise.

Request format: binary/octet-stream

DELETE {path}/{type}/{name}
===========================

Returns "200 OK" if the blob with the given name and type has been
deleted from the repository, an HTTP error otherwise.


*****
Talks
*****

The following talks will be or have been given about restic:

-  2016-01-31: Lightning Talk at the Go Devroom at FOSDEM 2016,
   Brussels, Belgium
-  2016-01-29: `restic - Backups mal
   richtig <https://media.ccc.de/v/c4.openchaos.2016.01.restic>`__:
   Public lecture in German at `CCC Cologne
   e.V. <https://koeln.ccc.de>`__ in Cologne, Germany
-  2015-08-23: `A Solution to the Backup
   Inconvenience <https://programm.froscon.de/2015/events/1515.html>`__:
   Lecture at `FROSCON 2015 <https://www.froscon.de>`__ in Bonn, Germany
-  2015-02-01: `Lightning Talk at FOSDEM
   2015 <https://www.youtube.com/watch?v=oM-MfeflUZ8&t=11m40s>`__: A
   short introduction (with slightly outdated command line)
-  2015-01-27: `Talk about restic at CCC
   Aachen <https://videoag.fsmpi.rwth-aachen.de/?view=player&lectureid=4442#content>`__
   (in German)