# Automatic conversion of docker images into the thin format

This utility automatically converts normal docker images into the thin
format.

## Vocabulary

There are several concepts to keep track of in this process, and none of them
is very common, so before diving in we should agree on a shared vocabulary.

**Registry** refers to the docker image registry, with protocol extensions.
Common examples are:

* https://registry.hub.docker.com
* https://gitlab-registry.cern.ch

**Repository** specifies a class of images; each image is then indexed by tag
or digest. Common examples are:

* library/redis
* library/ubuntu

**Tag** is a way to identify an image inside a repository. Tags are mutable
and may change in the future. Common examples are:

* 4
* 3-alpine

**Digest** is another way to identify images inside a repository. Digests are
**immutable**, since they are the result of a hash function applied to the
content of the image. Thanks to this technique the images are content
addressable. Common examples are:

* sha256:2aa24e8248d5c6483c99b6ce5e905040474c424965ec866f7decd87cb316b541
* sha256:d582aa10c3355604d4133d6ff3530a35571bd95f97aadc5623355e66d92b6d2c

An **image** belongs to a repository, which in turn belongs to a registry,
and it is identified by a tag, a digest, or both. If you can choose, it is
always better to identify the image using at least the digest.

To uniquely identify an image we need to provide all of the following
information:

1. registry
2. repository
3. tag or digest or tag + digest

We will use the slash (`/`) to separate the `registry` from the `repository`,
the colon (`:`) to separate the `repository` from the `tag`, and the at (`@`)
to separate the `digest` from the `tag` or from the `repository`.
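
As a rough illustration of these separator rules, the following Python sketch splits an image reference into its parts. It is not the utility's own parser: the `ImageRef` type, the `parse_image` function and the choice to keep the scheme separate from the registry host are assumptions made for this example.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class ImageRef(NamedTuple):
    """Hypothetical container for the parts of an image reference."""
    scheme: str
    registry: str
    repository: str
    tag: Optional[str]
    digest: Optional[str]


def parse_image(reference: str) -> ImageRef:
    """Split a reference such as https://host/repo:tag@digest into its parts."""
    parts = urlsplit(reference)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"missing scheme or registry in {reference!r}")

    # Everything after the registry host is repository[:tag][@digest].
    rest = parts.path.lstrip("/")

    # The digest itself contains a colon (sha256:...), so split on '@' first.
    rest, _, digest = rest.partition("@")
    repository, _, tag = rest.partition(":")
    if not repository:
        raise ValueError(f"missing repository in {reference!r}")

    return ImageRef(parts.scheme, parts.netloc, repository,
                    tag or None, digest or None)
```

For example, `parse_image("https://registry.hub.docker.com/library/redis:4")` yields the repository `library/redis` and the tag `4`, with no digest.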

The final syntax will be:

    REGISTRY/REPOSITORY[:TAG][@DIGEST]

Examples of images are:

* https://registry.hub.docker.com/library/redis:4
* https://registry.hub.docker.com/minio/minio@sha256:b1e5dd4a7be831107822243a0675ceb5eabe124356a9815f2519fe02beb3f167
* https://registry.hub.docker.com/wurstmeister/kafka:1.1.0@sha256:3a63b48894bce633fb2f0d2579e162163367113d79ea12ca296120e90952b463

## Concepts

The converter takes a declarative approach: you specify your end goal and
it tries to reach it.

The main component of this approach is the **wish**, a triplet composed of
the input image, the output image, and the cvmfs repository in which you
want to store the data.

    wish => (input_image, output_image, cvmfs_repository)

The input image in your wish should be as specific as possible,
ideally specifying both the tag and the digest.

On the other hand, you cannot be as specific for the output image, simply
because it is impossible to know the digest before generating the image
itself.

Finally, we model the repository as an append-only structure, since deleting
layers could break images that are currently running.

## Recipes

Recipes are a way to describe the wishes we are interested in converting.

### Recipe Syntax v1

An example of a complete recipe file is below; let's go over each key.

``` yaml
version: 1
user: smosciat
cvmfs_repo: unpacked.cern.ch
output_format: '$(scheme)://registry.gitlab.cern.ch/thin/$(image)'
input:
    - 'https://registry.hub.docker.com/econtal/numpy-mkl:latest'
    - 'https://registry.hub.docker.com/agladstein/simprily:version1'
    - 'https://registry.hub.docker.com/library/fedora:latest'
    - 'https://registry.hub.docker.com/library/debian:stable'
```

**version**: indicates which version of the recipe format we are using. At
the moment only `1` is supported.

**user**: the user that will push the thin docker images into the registry.
The password must be stored in the `DOCKER2CVMFS_DOCKER_REGISTRY_PASS`
environment variable.

**cvmfs_repo**: the CVMFS repository in which to store the layers and the
singularity images.

**output_format**: how to name the thin images. It accepts a few "variables"
that refer to the input image.

* `$(scheme)`, the very first part of the image url, most likely `http` or `https`
* `$(registry)`, the registry in which the image is located; in the example it would be `registry.hub.docker.com`
* `$(repository)`, the repository of the input image, so something like `library/ubuntu` or `atlas/athena`
* `$(tag)`, the tag of the image; examples could be `latest`, `stable` or `v0.1.4`
* `$(image)`, the `$(repository)` plus the `$(tag)`

**input**: list of docker images to convert

This recipe format can express only a subset of wishes: all the images
must be stored in the same CVMFS repository and share the same output format.

## Commands

### convert

```
convert recipe.yaml
```

This command will try to convert all the wishes in the recipe.

### loop

```
loop recipe.yaml
```

This command is equivalent to calling `convert` in an infinite loop, which is
useful to make sure that all the images are up to date.

## convert workflow

The goal of convert is to actually create the thin images starting from the
regular ones.

To convert, we iterate over every wish in the recipe.

In general, some wishes will already be converted while others will need to
be converted from scratch.

The first step is therefore to check whether the wish has already been
converted. To do this check, we download the input image manifest and check
in the repository whether that specific image has already been converted; if
it has, we safely skip the conversion.
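
The skip check described here can be pictured with the following Python sketch. It only illustrates the control flow: `Wish`, `wishes_to_convert` and the two callables passed in (`manifest_digest`, `already_converted`) are hypothetical names introduced for this example, and the utility itself may organise the check differently.

```python
from typing import Callable, Iterable, List, NamedTuple


class Wish(NamedTuple):
    """Hypothetical representation of the (input, output, repository) triplet."""
    input_image: str
    output_image: str
    cvmfs_repo: str


def wishes_to_convert(
    wishes: Iterable[Wish],
    manifest_digest: Callable[[str], str],
    already_converted: Callable[[str, str], bool],
) -> List[Wish]:
    """Return only the wishes whose input image is not yet in the repository.

    manifest_digest(image) stands in for downloading the input image manifest;
    already_converted(repo, digest) stands in for the lookup in the CVMFS
    repository. Both are assumptions supplied by the caller.
    """
    pending = []
    for wish in wishes:
        digest = manifest_digest(wish.input_image)
        if already_converted(wish.cvmfs_repo, digest):
            continue  # this exact image was converted before, skip it
        pending.append(wish)
    return pending
```

Passing the two lookups in as functions keeps the sketch independent of how manifests are fetched or how the repository is inspected.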

Then, every image is made of different layers, some of which could already be
in the repository.
To avoid expensive CVMFS transactions, before downloading and ingesting a
layer we check whether it is already in the repository; if it is, we neither
download nor ingest the layer.

The conversion simply ingests every layer of an image, creates a thin image
and finally pushes the thin image to the registry.

Such images can be used by docker with the thin image plugins.

The daemon also transforms the images into singularity images and stores them
in the repository.

The layers are stored in the `.layer` subdirectory, while the singularity
images are stored in the `singularity` subdirectory.

## General workflow

This section explains how this utility is intended to be used.

Internally this utility invokes the `cvmfs_server`, `docker` and
`singularity` commands, so it must be used on a stratum0 that also has docker
installed.

The conversion is quite straightforward: we first download the input image,
store each layer in the cvmfs repository, create the output image and unpack
the singularity one, and finally upload the output image to the registry.

It does not support downloading images that are not public.

In order to publish images to a repository it is necessary to sign up on
docker hub. The utility will use the user from the recipe, and it will read
the password from the `DOCKER2CVMFS_DOCKER_REGISTRY_PASS` environment
variable.
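
Before launching a conversion it can help to verify that both pieces of the credentials are present. The Python sketch below is one way to do that, assuming the recipe has already been loaded into a dictionary; the `registry_credentials` helper is hypothetical and not part of the utility.

```python
import os
from typing import Mapping, Tuple

# Name of the environment variable documented above.
PASSWORD_ENV = "DOCKER2CVMFS_DOCKER_REGISTRY_PASS"


def registry_credentials(
    recipe: Mapping[str, object],
    environ: Mapping[str, str] = os.environ,
) -> Tuple[str, str]:
    """Return (user, password), failing early if either one is missing.

    `recipe` is assumed to be the parsed recipe file; only its `user`
    key is read here.
    """
    user = recipe.get("user")
    if not user:
        raise ValueError("the recipe does not define a 'user' key")

    password = environ.get(PASSWORD_ENV)
    if not password:
        raise RuntimeError(f"environment variable {PASSWORD_ENV} is not set")

    return str(user), password
```

The `environ` parameter defaults to the real process environment and exists mainly so the check can be exercised without touching it.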