# Automatic conversion of docker images into the thin format

This utility will automatically convert normal docker images into the thin
format.

## Vocabulary

Several concepts come into play in this process, and none of them is very
common, so before diving in let us agree on a shared vocabulary.

**Registry** refers to the docker image registry, with protocol extensions.
Common examples are:

    * https://registry.hub.docker.com
    * https://gitlab-registry.cern.ch

**Repository** specifies a class of images. Each image inside a repository is
then indexed by tag or digest. Common examples are:

    * library/redis
    * library/ubuntu

**Tag** is a way to identify an image inside a repository. Tags are mutable
and may change in the future. Common examples are:

    * 4
    * 3-alpine

**Digest** is another way to identify images inside a repository. Digests are
**immutable**, since they are the result of a hash function applied to the
content of the image. Thanks to this technique the images are content
addressable. Common examples are:

    * sha256:2aa24e8248d5c6483c99b6ce5e905040474c424965ec866f7decd87cb316b541
    * sha256:d582aa10c3355604d4133d6ff3530a35571bd95f97aadc5623355e66d92b6d2c
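
To illustrate why a digest is immutable while a tag is not, here is a minimal
Python sketch that derives a digest string from some bytes. The helper name
`content_digest` and the sample bytes are hypothetical, and the exact bytes a
registry hashes are not covered by this document.

```python
import hashlib


def content_digest(content: bytes) -> str:
    """Return a digest in the `sha256:<hex>` form shown above.

    Hypothetical helper, for illustration only.
    """
    return "sha256:" + hashlib.sha256(content).hexdigest()


# The same content always yields the same digest,
# while any change in the content yields a different one.
first = content_digest(b"layer-data")
second = content_digest(b"layer-data")
changed = content_digest(b"layer-data-modified")
```

Since the digest is a function of the content alone, two references with the
same digest always point to the same content, which is what makes the images
content addressable.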

An **image** belongs to a repository, which in turn belongs to a registry, and
it is identified by a tag, a digest, or both. If you can choose, it is always
better to identify the image using at least the digest.

To uniquely identify an image we need to provide all of the following
information:

    1. registry
    2. repository
    3. tag or digest or tag + digest

We will use the slash (`/`) to separate the `registry` from the `repository`,
the colon (`:`) to separate the `repository` from the `tag`, and the at sign
(`@`) to separate the `digest` from the `tag` or from the `repository`.

The final syntax will be:

    REGISTRY/REPOSITORY[:TAG][@DIGEST]

Examples of images are:

    * https://registry.hub.docker.com/library/redis:4
    * https://registry.hub.docker.com/minio/minio@sha256:b1e5dd4a7be831107822243a0675ceb5eabe124356a9815f2519fe02beb3f167
    * https://registry.hub.docker.com/wurstmeister/kafka:1.1.0@sha256:3a63b48894bce633fb2f0d2579e162163367113d79ea12ca296120e90952b463
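
The syntax above can be split into its parts with a few lines of Python. This
is only a sketch of the naming scheme, not the parser used by the utility;
`ImageRef` and `parse_image` are hypothetical names.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class ImageRef(NamedTuple):
    scheme: str
    registry: str
    repository: str
    tag: Optional[str]
    digest: Optional[str]


def parse_image(reference: str) -> ImageRef:
    """Split REGISTRY/REPOSITORY[:TAG][@DIGEST] into its components."""
    parts = urlsplit(reference)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"missing scheme or registry in {reference!r}")
    path = parts.path.lstrip("/")
    # The digest, if any, follows the at sign. It is split off first
    # because the digest itself contains a colon (sha256:...).
    path, _, digest = path.partition("@")
    # The tag, if any, follows the colon.
    repository, _, tag = path.partition(":")
    if not repository:
        raise ValueError(f"missing repository in {reference!r}")
    return ImageRef(parts.scheme, parts.netloc, repository,
                    tag or None, digest or None)


ref = parse_image("https://registry.hub.docker.com/library/redis:4")
```

Here `ref.repository` is `library/redis` and `ref.tag` is `4`, while
`ref.digest` is `None` because the reference does not carry a digest.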

## Concepts

The converter takes a declarative approach: you specify your end goal and it
tries to reach it.

The main component of this approach is the **wish**, a triplet composed of
the input image, the output image, and the cvmfs repository in which you want
to store the data.

    wish => (input_image, output_image, cvmfs_repository)

The input image in your wish should be as specific as possible, ideally
specifying both the tag and the digest.

On the other hand, you cannot be as specific for the output image, simply
because it is impossible to know the digest before generating the image
itself.

Finally, we model the repository as an append-only structure, since deleting
layers could break images that are currently running.
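
As a rough illustration of the declarative approach, a wish can be modelled as
an immutable triplet, and the work left to do is simply the set of wishes that
have not been satisfied yet. The names `Wish` and `pending` are hypothetical,
and the output image below is only an example value.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Wish:
    input_image: str
    output_image: str
    cvmfs_repository: str


def pending(wishes: Iterable[Wish], satisfied: Set[Wish]) -> List[Wish]:
    """Return the wishes that still need to be converted, in order."""
    return [wish for wish in wishes if wish not in satisfied]


redis = Wish(
    input_image="https://registry.hub.docker.com/library/redis:4",
    output_image="https://registry.gitlab.cern.ch/thin/library/redis:4",
    cvmfs_repository="unpacked.cern.ch",
)
```

Declaring the wish says nothing about how to reach it: running the converter
twice on the same wish should leave nothing pending the second time.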

## Recipes

Recipes are a way to describe the wishes we are interested in converting.

### Recipe Syntax v1

An example of a complete recipe file is shown below; let's go over each key.

``` yaml
version: 1
user: smosciat
cvmfs_repo: unpacked.cern.ch
output_format: '$(scheme)://registry.gitlab.cern.ch/thin/$(image)'
input:
        - 'https://registry.hub.docker.com/econtal/numpy-mkl:latest'
        - 'https://registry.hub.docker.com/agladstein/simprily:version1'
        - 'https://registry.hub.docker.com/library/fedora:latest'
        - 'https://registry.hub.docker.com/library/debian:stable'
```

**version**: indicates which version of the recipe syntax we are using; at the
moment only `1` is supported.

**user**: the user that will push the thin docker images into the registry;
the password must be stored in the `DOCKER2CVMFS_DOCKER_REGISTRY_PASS`
environment variable.

**cvmfs_repo**: the CVMFS repository in which to store the layers and the
singularity images.

**output_format**: how to name the thin images. It accepts a few "variables"
that refer to the input image:

* `$(scheme)`, the very first part of the image URL, most likely `http` or `https`
* `$(registry)`, the registry in which the image is located; in the example above it would be `registry.hub.docker.com`
* `$(repository)`, the repository of the input image, something like `library/ubuntu` or `atlas/athena`
* `$(tag)`, the tag of the image, for example `latest`, `stable` or `v0.1.4`
* `$(image)`, the `$(repository)` plus the `$(tag)`

**input**: the list of docker images to convert.

This recipe format can express only some wishes: specifically, all the images
must be stored in the same CVMFS repository and share the same output format.
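
To make the relation between a recipe and its wishes concrete, the following
Python sketch expands a recipe, given here as a plain dictionary rather than
parsed from YAML, into `(input_image, output_image, cvmfs_repository)`
triplets. The function names are hypothetical, and joining `$(repository)` and
`$(tag)` with a colon to build `$(image)` is an assumption.

```python
from typing import List, Tuple
from urllib.parse import urlsplit

recipe = {
    "version": 1,
    "user": "smosciat",
    "cvmfs_repo": "unpacked.cern.ch",
    "output_format": "$(scheme)://registry.gitlab.cern.ch/thin/$(image)",
    "input": [
        "https://registry.hub.docker.com/library/fedora:latest",
        "https://registry.hub.docker.com/library/debian:stable",
    ],
}


def output_name(output_format: str, input_image: str) -> str:
    """Expand the $(...) variables of `output_format` for one input image."""
    parts = urlsplit(input_image)
    # Drop any digest, then split the repository from the tag.
    path = parts.path.lstrip("/").partition("@")[0]
    repository, _, tag = path.partition(":")
    variables = {
        "scheme": parts.scheme,
        "registry": parts.netloc,
        "repository": repository,
        "tag": tag,
        # Assumption: "repository plus tag" means joined by a colon.
        "image": f"{repository}:{tag}" if tag else repository,
    }
    result = output_format
    for name, value in variables.items():
        result = result.replace(f"$({name})", value)
    return result


def wishes_from_recipe(recipe: dict) -> List[Tuple[str, str, str]]:
    """Build one wish per input image; they all share repo and format."""
    if recipe["version"] != 1:
        raise ValueError("only recipe version 1 is supported")
    return [
        (image, output_name(recipe["output_format"], image), recipe["cvmfs_repo"])
        for image in recipe["input"]
    ]


wishes = wishes_from_recipe(recipe)
```

Every wish produced this way shares the same `cvmfs_repo` and the same
`output_format`, which is exactly the limitation of the v1 syntax.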

## Commands

### convert

```
convert recipe.yaml
```

This command will try to convert all the wishes in the recipe.

### loop

```
loop recipe.yaml
```

This command is equivalent to calling `convert` in an infinite loop, which is
useful to make sure that all the images are up to date.
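
The relation between the two commands can be sketched in Python as follows.
The `convert` argument stands for whatever performs one conversion pass;
`pause_seconds` and `max_rounds` are hypothetical parameters added only to
keep the sketch testable and are not options of the real command.

```python
import time
from typing import Callable, Optional


def loop(convert: Callable[[], None],
         pause_seconds: float = 0.0,
         max_rounds: Optional[int] = None) -> int:
    """Call `convert` repeatedly; run forever when `max_rounds` is None."""
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        convert()
        rounds += 1
        time.sleep(pause_seconds)
    return rounds
```

Since tags are mutable, repeating the conversion is what picks up an input
image whose tag has moved to new content since the previous pass.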

## convert workflow

The goal of `convert` is to actually create the thin images starting from the
regular ones.

To do so, we iterate over every wish in the recipe.

In general, some wishes will already have been converted, while others will
need to be converted from scratch.

The first step is therefore to check whether the wish has already been
converted. To perform this check, we download the input image manifest and
check in the repository whether that specific image has already been
converted; if it has, we safely skip the conversion.

Every image is made of different layers, some of which could already be in
the repository. To avoid expensive CVMFS transactions, before downloading and
ingesting a layer we check whether it is already in the repository; if it is,
we neither download nor ingest it.

The conversion simply ingests every layer of an image, creates a thin image,
and finally pushes the thin image to the registry.

Such images can be used by docker with the thin image plugins.

The daemon also transforms the images into singularity images and stores them
in the repository.

The layers are stored in the `.layer` subdirectory, while the singularity
images are stored in the `singularity` subdirectory.
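
The two skip checks described above can be sketched in Python. In this sketch
plain in-memory sets stand in for the CVMFS repository, and no download,
ingestion or push actually happens; all names are hypothetical.

```python
from typing import Iterable, List, Set


def layers_to_ingest(image_layers: Iterable[str],
                     stored_layers: Set[str]) -> List[str]:
    """Return the layers of an image that are not yet in the repository."""
    missing: List[str] = []
    for digest in image_layers:
        if digest not in stored_layers and digest not in missing:
            missing.append(digest)
    return missing


def convert_image(image_id: str,
                  image_layers: Iterable[str],
                  converted_images: Set[str],
                  stored_layers: Set[str]) -> List[str]:
    """Simulate one conversion and return the layers that were ingested."""
    # First check: the whole image was already converted, skip it.
    if image_id in converted_images:
        return []
    # Second check: only the layers missing from the repository are ingested.
    ingested = layers_to_ingest(image_layers, stored_layers)
    # The repository is append-only: layers are added, never removed.
    stored_layers.update(ingested)
    converted_images.add(image_id)
    return ingested
```

Images that share base layers benefit the most: once a layer has been
ingested for one image, every later image that contains it skips both the
download and the CVMFS transaction for that layer.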

## General workflow

This section explains how this utility is intended to be used.

Internally this utility invokes the `cvmfs_server`, `docker` and `singularity`
commands, so it must run on a stratum0 that also has docker installed.

The conversion is quite straightforward: we first download the input image,
store each layer in the cvmfs repository, create the output image and unpack
the singularity one, and finally upload the output image to the registry.

It does not support downloading images that are not public.

In order to publish images to a repository it is necessary to sign up to the
docker hub. The utility uses the user from the recipe and reads the password
from the `DOCKER2CVMFS_DOCKER_REGISTRY_PASS` environment variable.
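
How the credentials come together can be sketched in Python. Only the
environment variable name and the `user` recipe key come from this document;
the function name and the error handling are assumptions.

```python
import os
from typing import Mapping, Tuple

PASSWORD_VARIABLE = "DOCKER2CVMFS_DOCKER_REGISTRY_PASS"


def registry_credentials(recipe: dict,
                         environ: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (user, password): user from the recipe, password from the env."""
    password = environ.get(PASSWORD_VARIABLE)
    if not password:
        raise RuntimeError(f"{PASSWORD_VARIABLE} is not set")
    return recipe["user"], password
```

Keeping the password out of the recipe means that the recipe file can be
shared or committed without exposing the registry credentials.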