# Salad Extension to ORAS Go Library

## TL;DR

Restarting a failed download by deleting the partially downloaded layers is costly on the
Salad network, especially when some of these layers may exceed 10GB.  Resuming partial
downloads is an important part of a robust, resilient, and performant distributed compute
node with limited bandwidth.

This repository is a fork of https://github.com/oras-project/oras-go/ just after the v2.4.0
tag.  The branch `resume` contains Salad's download changes.  The only change required to
build the ORAS CLI (`oras`) (https://github.com/oras-project/oras) is to use this fork as the
replacement for `oras-go`.

## Summary

### Resumable Downloads

This resumable download implementation is contained entirely within oras-go, in the code path
below `oras.doCopyNode()`.  The existing external interfaces are left unaltered where possible,
although some new ones have been added.  Resume is always enabled, but the conditions are
carefully evaluated and the code falls back to the original code path when resuming is not
possible.  This implementation does not include any way to force resume to be enabled (fail if
not possible) or disabled (do not attempt even when possible).

Resumable downloads are limited to remote registry source targets and local storage destination
targets.

Implementing this with minimally changed interfaces requires a way to pass some state
between areas of the code.  The `Annotations` field of the `Descriptor` was selected, using
Salad-named keys defined in `internal/spec/artifact.go` as constants with names beginning with
`AnnotationResume`.

* `Annotations` key constants (`internal/spec/artifact.go`)
  * `AnnotationResume*` - the keys used in the `Annotations[]` map

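As a rough illustration of carrying resume state in a descriptor's annotations, the sketch below round-trips a filename and an offset through the map. The `Descriptor` struct is a local stand-in for the OCI descriptor type, and the key names and helper functions are hypothetical; the real keys are the `AnnotationResume*` constants in `internal/spec/artifact.go`.

```go
package main

import (
	"fmt"
	"strconv"
)

// Descriptor is a local stand-in for the OCI descriptor type; only the
// fields needed for this sketch are included.
type Descriptor struct {
	Digest      string
	Size        int64
	Annotations map[string]string
}

// Hypothetical key names, for illustration only.
const (
	annotationResumeFilename = "example.resume.filename"
	annotationResumeOffset   = "example.resume.offset"
)

// setResumeState records the partial ingest file and its size on the descriptor.
func setResumeState(desc *Descriptor, filename string, offset int64) {
	if desc.Annotations == nil {
		desc.Annotations = make(map[string]string)
	}
	desc.Annotations[annotationResumeFilename] = filename
	desc.Annotations[annotationResumeOffset] = strconv.FormatInt(offset, 10)
}

// resumeState reads the state back; ok is false when nothing usable is set.
func resumeState(desc Descriptor) (filename string, offset int64, ok bool) {
	filename = desc.Annotations[annotationResumeFilename]
	offset, err := strconv.ParseInt(desc.Annotations[annotationResumeOffset], 10, 64)
	if filename == "" || err != nil || offset <= 0 {
		return "", 0, false
	}
	return filename, offset, true
}

func main() {
	desc := Descriptor{Digest: "sha256:abc", Size: 1024}
	setResumeState(&desc, "/tmp/ingest/abc_123", 512)
	fmt.Println(resumeState(desc))
}
```

Because the state travels inside the descriptor itself, code further down the call chain can read it without any change to existing function signatures.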
* `oras.doCopyNode()` (`copy.go`)
  * Look for files in the ingest directory that match the current `Descriptor` being downloaded
    * if found: save full filename and file size to the `Annotations` map for the `Descriptor`
    * if not found: nothing to see here, proceed as normal

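The lookup step can be sketched as follows. This sketch assumes that ingest files are named `<encoded digest>_<random suffix>`; the function name `findIngestFile` and the sample file name are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findIngestFile returns the first regular file in ingestDir whose name
// starts with "<encodedDigest>_", together with its size. The naming
// scheme is an assumption of this sketch.
func findIngestFile(ingestDir, encodedDigest string) (path string, size int64, found bool) {
	matches, err := filepath.Glob(filepath.Join(ingestDir, encodedDigest+"_*"))
	if err != nil {
		return "", 0, false
	}
	for _, m := range matches {
		info, err := os.Stat(m)
		if err != nil || !info.Mode().IsRegular() {
			continue
		}
		return m, info.Size(), true
	}
	return "", 0, false
}

// example creates a fake partial download and looks it up again.
func example() (name string, size int64, found bool) {
	dir, err := os.MkdirTemp("", "ingest")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	partial := filepath.Join(dir, "abc123_0001")
	if err := os.WriteFile(partial, []byte("partial"), 0o600); err != nil {
		panic(err)
	}
	path, size, found := findIngestFile(dir, "abc123")
	return filepath.Base(path), size, found
}

func main() {
	fmt.Println(example())
}
```

The returned path and size are the two values that would be saved to the `Annotations` map when a match is found.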
* `remote.FetcherHead` (`registry/remote/repository.go`)
  * interface defining `FetchHead()`

* `remote.BlobStoreHead` (`registry/remote/repository.go`)
  * interface combining `registry.BlobStore` with `FetcherHead`

* `remote.Repository.FetchHead()` (new) (`registry/remote/repository.go`)
  * call `FetchHead()` when `BlobStoreHead` is implemented

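The optional-interface pattern described by these three items might look roughly like this. The types are local stand-ins, `BlobStore` is reduced to a single method, and the `FetchHead()` signature is assumed, since the README does not state it.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Descriptor is a local stand-in for the OCI descriptor type.
type Descriptor struct {
	Digest string
	Size   int64
}

// BlobStore stands in for registry.BlobStore, reduced to one method.
type BlobStore interface {
	Exists(ctx context.Context, target Descriptor) (bool, error)
}

// FetcherHead is the optional capability; this signature is an assumption.
type FetcherHead interface {
	FetchHead(ctx context.Context, target Descriptor) error
}

// BlobStoreHead combines the blob store with the HEAD capability.
type BlobStoreHead interface {
	BlobStore
	FetcherHead
}

var errHeadUnsupported = errors.New("blob store does not implement FetchHead")

// fetchHead forwards to FetchHead only when the store implements it.
func fetchHead(ctx context.Context, store BlobStore, target Descriptor) error {
	if h, ok := store.(BlobStoreHead); ok {
		return h.FetchHead(ctx, target)
	}
	return errHeadUnsupported
}

// plainStore implements BlobStore only.
type plainStore struct{}

func (plainStore) Exists(context.Context, Descriptor) (bool, error) { return true, nil }

// headStore adds FetchHead and counts how often it is called.
type headStore struct {
	plainStore
	calls int
}

func (s *headStore) FetchHead(context.Context, Descriptor) error {
	s.calls++
	return nil
}

// example calls fetchHead once per store kind.
func example() (headCalls int, plainErr error) {
	ctx := context.Background()
	target := Descriptor{Digest: "sha256:abc", Size: 42}
	plainErr = fetchHead(ctx, plainStore{}, target) // store without FetchHead
	hs := &headStore{}
	_ = fetchHead(ctx, hs, target) // the fake never returns an error
	return hs.calls, plainErr
}

func main() {
	fmt.Println(example())
}
```

Checking for the capability with a type assertion lets stores that lack `FetchHead()` keep working on the original code path.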
* `remote.blobStore` (`registry/remote/repository.go`)
  * `blobStore.Fetch()`
    * call `FetchHead()` to check whether the server supports the `Range` header
      * FALSE:
        * reset resume flag and proceed as usual
      * TRUE:
        * Set `Range` header
    * after the GET request to the remote repository, when resuming
      * `StatusPartialContent`:
        * check response `ContentLength` against `target.Size - ingestSize`
      * `StatusOK`:
        * check response `ContentLength` against `target.Size`
  * `blobStore.FetchHead()` (new)
    * do HEAD call to src
      * `StatusOK`:
        * check response `ContentLength` against `target.Size`
        * check response header `Accept-Ranges` has value `bytes`
          * TRUE:
            * Set resume flag

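The HEAD probe and ranged GET outlined above can be sketched with the standard `net/http` package alone. All function names here are hypothetical, and the in-process test server only stands in for a registry; this is not the fork's actual code.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// canResume reports whether a HEAD response permits a ranged GET: the length
// must match the descriptor size and the server must accept byte ranges.
func canResume(status int, contentLength, size int64, acceptRanges string) bool {
	return status == http.StatusOK && contentLength == size && acceptRanges == "bytes"
}

// expectedLength returns the Content-Length a GET response should carry when
// offset bytes are already on disk.
func expectedLength(status int, size, offset int64) (int64, bool) {
	switch status {
	case http.StatusPartialContent:
		return size - offset, true // server honored the Range header
	case http.StatusOK:
		return size, true // full body; the caller must restart from zero
	}
	return 0, false
}

// fetchRest probes with HEAD, then issues a GET that carries a Range header
// only when resuming is possible. partial is true for a 206 response.
func fetchRest(c *http.Client, url string, size, offset int64) (data []byte, partial bool, err error) {
	head, err := c.Head(url)
	if err != nil {
		return nil, false, err
	}
	head.Body.Close()
	resume := offset > 0 &&
		canResume(head.StatusCode, head.ContentLength, size, head.Header.Get("Accept-Ranges"))

	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, false, err
	}
	if resume {
		req.Header.Set("Range", fmt.Sprintf("bytes=%d-", offset))
	}
	resp, err := c.Do(req)
	if err != nil {
		return nil, false, err
	}
	defer resp.Body.Close()

	want, ok := expectedLength(resp.StatusCode, size, offset)
	if !ok || resp.ContentLength != want {
		return nil, false, fmt.Errorf("unexpected status %d or length %d", resp.StatusCode, resp.ContentLength)
	}
	data, err = io.ReadAll(resp.Body)
	return data, resp.StatusCode == http.StatusPartialContent, err
}

// example serves a blob from an in-process test server and resumes at byte 7.
func example() (string, bool, error) {
	const blob = "hello, resumable world"
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// http.ServeContent implements HEAD and Range handling.
		http.ServeContent(w, r, "blob", time.Time{}, strings.NewReader(blob))
	}))
	defer srv.Close()
	rest, partial, err := fetchRest(srv.Client(), srv.URL, int64(len(blob)), 7)
	return string(rest), partial, err
}

func main() {
	fmt.Println(example())
}
```

A `200 OK` reply to a ranged request means the server ignored the `Range` header, so the sketch reports `partial` as false and leaves it to the caller to discard the existing bytes.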
* `content.Storage.Push()` (`content/oci/storage.go`)
  * call `Storage.ingest()` as usual

* `content.Storage.ingest()` (`content/oci/storage.go`)
  * if resume conditions are all met
    * TRUE:
      * open existing ingest file
      * seek to 0 in ingest file
      * make a new Hash to contain the running hash of the ingest file
      * save encoded Hash to `Annotations[hash]`
    * FALSE:
      * if not found: `CreateTemp()` a new ingest file as usual
  * if `0 <= ingest size < content-length`
    * TRUE:
      * call `ioutil.CopyBuffer()` as usual

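A simplified sketch of the resume branch follows: hash what is already on disk, then append the remainder while extending the same hash. It collapses the hand-off through `Annotations` and `ioutil.CopyBuffer()` into a single hypothetical function, `resumeIngest`, and assumes SHA-256.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// resumeIngest hashes the bytes already in the ingest file, appends the
// remaining content, and returns the hex SHA-256 of the whole file.
func resumeIngest(path string, rest io.Reader) (string, error) {
	f, err := os.OpenFile(path, os.O_RDWR, 0o600)
	if err != nil {
		return "", err
	}
	defer f.Close()

	// Seek to 0 and feed the existing bytes into the running hash.
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		return "", err
	}
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}

	// The file offset now sits at the end of the existing data, so the
	// remainder is appended while the same hash keeps running.
	if _, err := io.Copy(io.MultiWriter(f, h), rest); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// example resumes a two-part download and compares against a one-shot hash.
func example() (got, want string, err error) {
	dir, err := os.MkdirTemp("", "ingest")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "partial")
	if err := os.WriteFile(path, []byte("hello, "), 0o600); err != nil {
		return "", "", err
	}
	if got, err = resumeIngest(path, strings.NewReader("world")); err != nil {
		return "", "", err
	}
	data, err := os.ReadFile(path)
	if err != nil {
		return "", "", err
	}
	if string(data) != "hello, world" {
		return "", "", fmt.Errorf("unexpected file content %q", data)
	}
	sum := sha256.Sum256([]byte("hello, world"))
	return got, hex.EncodeToString(sum[:]), nil
}

func main() {
	fmt.Println(example())
}
```

Re-reading the existing bytes costs local disk I/O only, which is cheap compared with downloading them again over a limited link.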
* `ioutil.CopyBuffer()` (`internal/ioutil/io.go`)
  * call `content.NewVerifyReader()` as usual
  * handle `io.ErrUnexpectedEOF`: check `bytes read == desc.Size - ingestSize`

* `content.NewVerifyReader()` (`content/reader.go`)
  * Add resume field to `VerifyReader` struct
  * if `Annotations[offset]` > 0
    * TRUE:
      * decode `Annotations[hash]`
      * create a new `content.hashVerifier` with the new `Hash` and the original `desc.Digest`
    * FALSE:
      * create a new `digest.hashVerifier` from `desc.Digest`

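One way to pass a running hash through a string-valued annotation is to serialize its internal state. The sketch below relies on the documented fact that the hash returned by `sha256.New()` implements `encoding.BinaryMarshaler` and `encoding.BinaryUnmarshaler`. The base64 encoding and the helper names are assumptions of this sketch, not necessarily what the fork does.

```go
package main

import (
	"crypto/sha256"
	"encoding"
	"encoding/base64"
	"encoding/hex"
	"errors"
	"fmt"
	"hash"
)

// encodeHashState serializes the running state of h into a string that fits
// in a map[string]string annotation.
func encodeHashState(h hash.Hash) (string, error) {
	m, ok := h.(encoding.BinaryMarshaler)
	if !ok {
		return "", errors.New("hash does not support state marshaling")
	}
	state, err := m.MarshalBinary()
	if err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(state), nil
}

// decodeHashState restores a SHA-256 hash from encodeHashState output.
func decodeHashState(s string) (hash.Hash, error) {
	state, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return nil, err
	}
	h := sha256.New()
	u, ok := h.(encoding.BinaryUnmarshaler)
	if !ok {
		return nil, errors.New("hash does not support state unmarshaling")
	}
	if err := u.UnmarshalBinary(state); err != nil {
		return nil, err
	}
	return h, nil
}

// example hashes a prefix, moves the state through a string, and finishes
// the hash on the restored copy.
func example() (got, want string, err error) {
	h := sha256.New()
	h.Write([]byte("hello, ")) // bytes already on disk
	encoded, err := encodeHashState(h)
	if err != nil {
		return "", "", err
	}
	restored, err := decodeHashState(encoded)
	if err != nil {
		return "", "", err
	}
	restored.Write([]byte("world")) // bytes fetched after resuming
	sum := sha256.Sum256([]byte("hello, world"))
	return hex.EncodeToString(restored.Sum(nil)), hex.EncodeToString(sum[:]), nil
}

func main() {
	fmt.Println(example())
}
```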
* `content.hashVerifier` (new) (`content/verifiers.go`)
  * `digest.hashVerifier` is copied here from `opencontainers/go-digest/blob/master/verifiers.go`
    because it is private and we need to construct one with our new `Hash` and the original `Digest`.

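A verifier of this shape can be sketched with the standard library alone. The version below uses a plain `sha256:<hex>` string in place of the go-digest `Digest` type, so it is an approximation of the copied type rather than the real one.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"hash"
)

// hashVerifier pairs an expected digest with a running hash. The hash may
// already contain the bytes of a partial download.
type hashVerifier struct {
	digest string    // expected digest, written as "sha256:<hex>"
	hash   hash.Hash // running hash, possibly pre-seeded
}

// Write feeds newly downloaded bytes into the running hash.
func (hv hashVerifier) Write(p []byte) (int, error) { return hv.hash.Write(p) }

// Verified reports whether everything hashed so far matches the digest.
func (hv hashVerifier) Verified() bool {
	return hv.digest == "sha256:"+hex.EncodeToString(hv.hash.Sum(nil))
}

// example verifies a resumed download and a mismatching one.
func example() (resumedOK, wrongOK bool) {
	sum := sha256.Sum256([]byte("hello, world"))
	want := "sha256:" + hex.EncodeToString(sum[:])

	seeded := sha256.New()
	seeded.Write([]byte("hello, ")) // bytes from the partial ingest file
	resumed := hashVerifier{digest: want, hash: seeded}
	resumed.Write([]byte("world")) // bytes fetched after resuming

	wrong := hashVerifier{digest: want, hash: sha256.New()}
	wrong.Write([]byte("world")) // prefix missing, so the digest differs

	return resumed.Verified(), wrong.Verified()
}

func main() {
	fmt.Println(example())
}
```

Constructing the verifier from an existing `Hash` is the step the private upstream type does not allow, which is why a copy is needed.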
* `content.Resumer` (new) (`content/storage.go`)
  * Interface to get ingest filenames, also used to determine support for resumable downloads

* `content.Store.IngestFile()` (new) (`content/oci/storage.go`)
  * Provide access to `content.Store.storage.IngestFile()`

* `content.Storage.IngestFile()` (new) (`content/oci/storage.go`)
  * Locate and return the first matching ingest file, if any