# Snapshots

Docker containers have, from the beginning, been built on a snapshotting
methodology known as _layers_. _Layers_ provide the ability to fork a
filesystem, make changes, then save the changeset back to a new layer.

Historically, these have been tightly integrated into the Docker daemon as a
component called the `graphdriver`. The `graphdriver` allows one to run the
Docker daemon on several different operating systems while still maintaining
roughly similar snapshot semantics for committing and distributing changes to
images.

The `graphdriver` is deeply integrated with the import and export of images,
including managing layer relationships and container runtime filesystems. The
behavior of the `graphdriver` informs the transport of image formats.

In this document, we propose a more flexible model for managing layers. It
focuses on providing an API for the base snapshotting functionality without
coupling so tightly to the structure of images and their identification. The
minimal API simplifies behavior without sacrificing power. This makes the
surface area for driver implementations smaller, ensuring that behavior is more
consistent between implementations.

This model differs from the concept of the `graphdriver` in that the
_Snapshotter_ has no knowledge of images or containers. Users simply prepare
and commit directories. We also avoid the integration between graph drivers and
the tar format used to represent the changesets.

The best aspect is that we can get to this model by refactoring the existing
graphdrivers, minimizing the need for new code and sprawling tests.

## Scope

In the past, the `graphdriver` component has provided quite a lot of
functionality in Docker. This includes serialization, hashing, unpacking,
packing, and mounting.

The _Snapshotter_ will only provide mount-oriented snapshot
access with minimal metadata. Serialization, hashing, unpacking, packing and
mounting are not included in this design; we opt for common implementations
shared between graphdrivers rather than specialized ones. This is less of a
problem for performance since direct access to changesets is provided in the
interface.

## Architecture

The _Snapshotter_ provides an API for allocating, snapshotting and mounting
abstract, layer-based filesystems. The model works by building up sets of
directories with parent-child relationships, known as _Snapshots_.

A _Snapshot_ represents a filesystem state. Every snapshot has a parent,
where the empty parent is represented by the empty string. A diff can be taken
between a parent and its snapshot to create a classic layer.

Snapshots are best understood by their lifecycle. _Active_ snapshots are always
created with `Prepare` or `View` from a _Committed_ snapshot (including the
empty snapshot). _Committed_ snapshots are always created with
`Commit` from an _Active_ snapshot. Active snapshots never become committed
snapshots and vice versa. All snapshots may be removed.

After mounting an _Active_ snapshot, changes can be made to the snapshot. The
act of committing creates a _Committed_ snapshot. The committed snapshot will
inherit the parent of the active snapshot. The committed snapshot can then be
used as a parent. Active snapshots can never be used as a parent.

The following diagram demonstrates the relationships of snapshots:

![snapshot model diagram, showing active snapshots on the left and
committed snapshots on the right](snapshot_model.png)

In this diagram, you can see that the active snapshot _a_ is created by calling
`Prepare` with the committed snapshot _P<sub>0</sub>_. After modification, _a_
becomes _a'_ and a committed snapshot _P<sub>1</sub>_ is created by calling
`Commit`. _a'_ can be further modified as _a''_ and a second committed snapshot
can be created as _P<sub>2</sub>_ by calling `Commit` again. Note here that
_P<sub>2</sub>_'s parent is _P<sub>0</sub>_ and not _P<sub>1</sub>_.
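The lifecycle rules above can be modeled with a small in-memory sketch. This is
not containerd's implementation: the `graph` type, its fields and its method
signatures are hypothetical, chosen only to mirror the prose and the diagram.
In particular, it assumes that `Commit` leaves the active snapshot in place, as
the diagram suggests. It illustrates that a committed snapshot inherits the
parent of the active snapshot it was created from, and that an active snapshot
is rejected as a parent.

```go
package main

import (
	"errors"
	"fmt"
)

// kind distinguishes the two snapshot states described in the text.
type kind int

const (
	active kind = iota
	committed
)

// snapshot records only state and parent; a real driver would also track
// on-disk resources.
type snapshot struct {
	kind   kind
	parent string // "" is the empty parent
}

// graph is a hypothetical in-memory stand-in for a Snapshotter's metadata.
type graph map[string]snapshot

// checkParent enforces that only committed snapshots (or "") can be parents.
func (g graph) checkParent(parent string) error {
	if parent == "" {
		return nil
	}
	p, ok := g[parent]
	if !ok {
		return fmt.Errorf("parent %q does not exist", parent)
	}
	if p.kind != committed {
		return fmt.Errorf("parent %q is not committed", parent)
	}
	return nil
}

// Prepare creates an active snapshot identified by key on top of parent.
func (g graph) Prepare(key, parent string) error {
	if _, exists := g[key]; exists {
		return fmt.Errorf("snapshot %q already exists", key)
	}
	if err := g.checkParent(parent); err != nil {
		return err
	}
	g[key] = snapshot{kind: active, parent: parent}
	return nil
}

// Commit creates a committed snapshot called name from the active snapshot
// key. The new snapshot inherits the active snapshot's parent.
func (g graph) Commit(name, key string) error {
	a, ok := g[key]
	if !ok || a.kind != active {
		return errors.New("commit requires an active snapshot: " + key)
	}
	if _, exists := g[name]; exists {
		return fmt.Errorf("snapshot %q already exists", name)
	}
	g[name] = snapshot{kind: committed, parent: a.parent}
	return nil
}

func main() {
	g := graph{"P0": {kind: committed}}
	steps := []error{
		g.Prepare("a", "P0"), // active snapshot a on top of P0
		g.Commit("P1", "a"),  // a' committed as P1
		g.Commit("P2", "a"),  // a'' committed as P2
	}
	for _, err := range steps {
		if err != nil {
			panic(err)
		}
	}
	fmt.Println(g["P1"].parent, g["P2"].parent) // both inherit P0
	fmt.Println(g.Prepare("b", "a"))            // rejected: a is active
}
```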

### Operations

The manifestation of _snapshots_ is facilitated by the `Mount` object and
user-defined directories used for opaque data storage. When creating a new
active snapshot, the caller provides an identifier called the _key_. This
operation returns a list of mounts that, if mounted, will have the fully
prepared snapshot at the mounted path. We call this the _prepare_ operation.

Once a snapshot is _prepared_ and mounted, the caller may write new data to the
snapshot. Depending on the application, a user may want to capture these changes
or not.

For a read-only view of a snapshot, the _view_ operation can be used. Like
_prepare_, _view_ will return a list of mounts that, if mounted, will have the
fully prepared snapshot at the mounted path.

If the user wants to keep the changes, the _commit_ operation is employed. The
_commit_ operation takes the _key_ identifier, which represents an active
snapshot, and a _name_ identifier. A successful result will create a _committed_
snapshot that can be used as the parent of new _active_ snapshots when
referenced by the _name_.

If the user wants to discard the changes in an active snapshot, the _remove_
operation will release any resources associated with the snapshot. The mounts
provided by _prepare_ or _view_ should be unmounted before calling this method.

If the user wants to discard committed snapshots, the _remove_ operation can
also be used, but any children must be removed before proceeding.
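The ordering constraint on _remove_ can be sketched as follows, assuming a
hypothetical in-memory `store` that maps each snapshot to its parent. It only
illustrates the children-first rule; unmounting and resource cleanup are out of
scope, and none of these names come from containerd.

```go
package main

import "fmt"

// store maps a snapshot's key or name to its parent ("" for the empty
// parent). It is a hypothetical stand-in for driver metadata.
type store map[string]string

// Remove deletes a snapshot, refusing while another snapshot still names it
// as its parent.
func (s store) Remove(name string) error {
	if _, ok := s[name]; !ok {
		return fmt.Errorf("snapshot %q not found", name)
	}
	for child, parent := range s {
		if parent == name {
			return fmt.Errorf("cannot remove %q: child %q still exists", name, child)
		}
	}
	delete(s, name)
	return nil
}

func main() {
	s := store{"base": "", "app": "base", "container-1": "app"}
	fmt.Println(s.Remove("base"))        // error: "app" still depends on it
	fmt.Println(s.Remove("container-1")) // <nil>
	fmt.Println(s.Remove("app"))         // <nil>
	fmt.Println(s.Remove("base"))        // <nil>
}
```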

For detailed usage information, see the
[GoDoc](https://godoc.org/github.com/containerd/containerd/snapshots#Snapshotter).

### Graph metadata

As snapshots are imported into the container system, a "graph" of snapshots and
their parents will form. Queries over this graph must be a supported operation.
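One query such a graph would likely need is resolving the chain of parents for
a snapshot. A minimal sketch, assuming the metadata can be viewed as a map from
snapshot name to parent name; the `parents` function and the sample names are
illustrative only.

```go
package main

import "fmt"

// parents returns the ancestors of name, nearest first, by following parent
// links until the empty parent "" is reached. graph maps name -> parent.
func parents(graph map[string]string, name string) ([]string, error) {
	var chain []string
	seen := map[string]bool{name: true}
	for {
		parent, ok := graph[name]
		if !ok {
			return nil, fmt.Errorf("snapshot %q not found", name)
		}
		if parent == "" {
			return chain, nil
		}
		if seen[parent] {
			// Guard against corrupt metadata so the walk always terminates.
			return nil, fmt.Errorf("cycle detected at %q", parent)
		}
		seen[parent] = true
		chain = append(chain, parent)
		name = parent
	}
}

func main() {
	graph := map[string]string{"base": "", "deps": "base", "app": "deps"}
	chain, err := parents(graph, "app")
	fmt.Println(chain, err) // [deps base] <nil>
}
```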

## How snapshots work

To flesh out the _Snapshots_ terminology, we are going to demonstrate the use of
the _Snapshotter_ from the perspective of importing layers. We'll use a Go API
to represent the process.

### Importing a Layer

To import a layer, we simply have the _Snapshotter_ provide a list of
mounts to be applied such that our destination will capture a changeset. We start
out by getting a path to the layer tar file and creating a temp location to
unpack it to:

	layerPath, tmpDir := getLayerPath(), mkTmpDir() // path to the layer tar file, temp dir to unpack into

Next, we use a _Snapshotter_ to _Prepare_ a new snapshot transaction, using
a _key_ and descending from the empty parent "":

	mounts, err := snapshotter.Prepare(key, "")
	if err != nil { ... }

We get back a list of mounts from `Snapshotter.Prepare`, with the `key`
identifying the active snapshot. Mount these to the temporary location with the
following:

	if err := mount.All(mounts, tmpDir); err != nil { ... }

Once the mounts are performed, our temporary location is ready to capture
a diff. In practice, this works similarly to a filesystem transaction. The
next step is to unpack the layer. We have a special function `unpackLayer`
that applies the contents of the layer to the target location and calculates the
`DiffID` of the unpacked layer (this is a requirement of the Docker
implementation):

	layer, err := os.Open(layerPath)
	if err != nil { ... }
	defer layer.Close()
	digest, err := unpackLayer(tmpDir, layer) // unpack into the mounted temp location
	if err != nil { ... }

When the above completes, we should have a filesystem that represents the
contents of the layer. Careful implementations should verify that `digest`
matches the expected `DiffID`. When completed, we unmount the mounts:

	unmount(mounts) // optional, for now

Now that we've verified and unpacked our layer, we commit the active
snapshot to a _name_. For this example, we are just going to use the layer
digest, but in practice, this will probably be the `ChainID`:

	if err := snapshotter.Commit(digest.String(), key); err != nil { ... }

Now, we have a layer in the _Snapshotter_ that can be accessed with the digest
provided during commit. Once you have committed the snapshot, the active
snapshot can be removed with the following:

	if err := snapshotter.Remove(key); err != nil { ... }
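The digest verification recommended in this walkthrough can be sketched as
follows. It assumes the `DiffID` is the SHA-256 digest of the uncompressed
layer tar stream, written as `sha256:<hex>`, as defined by the OCI image
specification. The `diffID`, `diffIDOfString` and `verify` helpers are
hypothetical, and the `unpack` callback stands in for whatever applies the
layer to the mounted location.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// diffID hands the uncompressed layer stream r to unpack while hashing the
// bytes that pass through, and returns the digest as "sha256:<hex>".
func diffID(r io.Reader, unpack func(io.Reader) error) (string, error) {
	h := sha256.New()
	tee := io.TeeReader(r, h)
	if err := unpack(tee); err != nil {
		return "", err
	}
	// Drain anything unpack left unread so the digest covers the whole stream.
	if _, err := io.Copy(io.Discard, tee); err != nil {
		return "", err
	}
	return "sha256:" + hex.EncodeToString(h.Sum(nil)), nil
}

// diffIDOfString is a convenience for the example: it treats s as the layer
// stream and discards the "unpacked" bytes.
func diffIDOfString(s string) (string, error) {
	return diffID(strings.NewReader(s), func(r io.Reader) error {
		_, err := io.Copy(io.Discard, r)
		return err
	})
}

// verify reports an error when the computed digest differs from expected.
func verify(got, expected string) error {
	if got != expected {
		return fmt.Errorf("layer digest mismatch: got %s, expected %s", got, expected)
	}
	return nil
}

func main() {
	got, err := diffIDOfString("abc") // "abc" stands in for a layer tar stream
	if err != nil {
		panic(err)
	}
	fmt.Println(got)
	// Only commit the active snapshot when verification succeeds.
	fmt.Println(verify(got, "sha256:ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"))
}
```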

### Importing the Next Layer

Making a layer depend on the one above follows the same process, except that
the committed layer is passed as `parent` when calling `Snapshotter.Prepare`,
assuming a clean `tmpDir`:

	mounts, err := snapshotter.Prepare(key, parentDigest)
	if err != nil { ... }

We then mount, apply and commit, as we did above. The new snapshot will be
based on the content of the previous one.
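The walkthrough notes that the committed name will probably be the `ChainID`
rather than the layer digest. A sketch of that computation, assuming the
definition from the OCI image specification: the `ChainID` of the first layer
is its `DiffID`, and each later `ChainID` is the SHA-256 digest of the previous
`ChainID`, a space, and the next `DiffID`. The function names and sample
inputs are illustrative.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digestOf returns the SHA-256 digest of s in "sha256:<hex>" form.
func digestOf(s string) string {
	sum := sha256.Sum256([]byte(s))
	return "sha256:" + hex.EncodeToString(sum[:])
}

// chainIDs returns one ChainID per layer, given DiffIDs ordered from the
// base layer upward.
func chainIDs(diffIDs []string) []string {
	chain := make([]string, len(diffIDs))
	for i, id := range diffIDs {
		if i == 0 {
			chain[i] = id // the base layer's ChainID is its DiffID
			continue
		}
		chain[i] = digestOf(chain[i-1] + " " + id)
	}
	return chain
}

func main() {
	// Sample DiffIDs; real ones come from hashing each unpacked layer.
	diffIDs := []string{digestOf("layer 0"), digestOf("layer 1")}
	chain := chainIDs(diffIDs)
	for i := range chain {
		parent := ""
		if i > 0 {
			parent = chain[i-1]
		}
		// Each layer would be prepared on parent and committed as chain[i].
		fmt.Printf("layer %d: parent=%q name=%q\n", i, parent, chain[i])
	}
}
```

Naming committed snapshots this way means the name identifies a layer together
with everything beneath it, so the same layer applied on different parents gets
different names.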

### Running a Container

To run a container, we simply pass the committed image snapshot to
`Snapshotter.Prepare` as the parent. After mounting, the prepared path can
be used directly as the container's filesystem:

	mounts, err := snapshotter.Prepare(containerKey, imageRootFSChainID)

The returned mounts can then be passed directly to the container runtime. If
one would like to create a new image from the filesystem, `Snapshotter.Commit`
is called:

	if err := snapshotter.Commit(newImageSnapshot, containerKey); err != nil { ... }

Alternatively, for most container runs, `Snapshotter.Remove` will be called to
signal the Snapshotter to abandon the changes.
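The common run-then-discard case can be sketched with a helper that always
removes the active snapshot afterwards. The `snapshotter` interface below is a
simplified, hypothetical one that matches the calls used in this document
(mounts are reduced to strings), and the `fake` type exists only to make the
sketch runnable. The real interface is described in the GoDoc linked earlier.

```go
package main

import "fmt"

// snapshotter is a simplified, hypothetical subset of the operations used
// in this document. Mounts are reduced to plain strings.
type snapshotter interface {
	Prepare(key, parent string) ([]string, error)
	Remove(key string) error
}

// runEphemeral prepares an active snapshot on top of image, passes the
// mounts to run, and removes the snapshot afterwards to abandon changes.
func runEphemeral(s snapshotter, key, image string, run func(mounts []string) error) (err error) {
	mounts, err := s.Prepare(key, image)
	if err != nil {
		return err
	}
	defer func() {
		// Report a cleanup failure only if run itself succeeded.
		if rmErr := s.Remove(key); rmErr != nil && err == nil {
			err = rmErr
		}
	}()
	return run(mounts)
}

// fake records active keys so the sketch can run without a real driver.
type fake map[string]bool

func (f fake) Prepare(key, parent string) ([]string, error) {
	f[key] = true
	return []string{fmt.Sprintf("%s on top of %s", key, parent)}, nil
}

func (f fake) Remove(key string) error {
	delete(f, key)
	return nil
}

func main() {
	f := fake{}
	err := runEphemeral(f, "container-1", "image-chain-id", func(mounts []string) error {
		fmt.Println("running with", mounts)
		return nil
	})
	fmt.Println(err, len(f)) // <nil> 0
}
```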