# Docker Image Specification v1.0.0

An *Image* is an ordered collection of root filesystem changes and the
corresponding execution parameters for use within a container runtime. This
specification outlines the format of these filesystem changes and corresponding
parameters and describes how to create and use them with a container runtime
and execution tool.

## Terminology

This specification uses the following terms:

<dl>
    <dt>
        Layer
    </dt>
    <dd>
        Images are composed of <i>layers</i>. <i>Image layer</i> is a general
        term which may be used to refer to one or both of the following:
        <ol>
            <li>The metadata for the layer, described in the JSON format.</li>
            <li>The filesystem changes described by a layer.</li>
        </ol>
        To refer to the former you may use the term <i>Layer JSON</i> or
        <i>Layer Metadata</i>. To refer to the latter you may use the term
        <i>Image Filesystem Changeset</i> or <i>Image Diff</i>.
    </dd>
    <dt>
        Image JSON
    </dt>
    <dd>
        Each layer has an associated JSON structure which describes some
        basic information about the image such as date created, author, and the
        ID of its parent image as well as execution/runtime configuration like
        its entry point, default arguments, CPU/memory shares, networking, and
        volumes.
    </dd>
    <dt>
        Image Filesystem Changeset
    </dt>
    <dd>
        Each layer has an archive of the files which have been added, changed,
        or deleted relative to its parent layer. Using a layer-based or union
        filesystem such as AUFS, or by computing the diff from filesystem
        snapshots, the filesystem changeset can be used to present a series of
        image layers as if they were one cohesive filesystem.
    </dd>
    <dt>
        Image ID <a name="id_desc"></a>
    </dt>
    <dd>
        Each layer is given an ID upon its creation.
        It is represented as a hexadecimal encoding of 256 bits, e.g.,
        <code>a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9</code>.
        Image IDs should be sufficiently random so as to be globally unique.
        32 bytes read from <code>/dev/urandom</code> is sufficient for all
        practical purposes. Alternatively, an image ID may be derived as a
        cryptographic hash of image contents as the result is considered
        indistinguishable from random. The choice is left up to implementors.
    </dd>
    <dt>
        Image Parent
    </dt>
    <dd>
        Most layer metadata structs contain a <code>parent</code> field which
        refers to the Image from which another directly descends. An image
        contains a separate JSON metadata file and set of changes relative to
        the filesystem of its parent image. <i>Image Ancestor</i> and
        <i>Image Descendant</i> are also common terms.
    </dd>
    <dt>
        Image Checksum
    </dt>
    <dd>
        Layer metadata structs contain a cryptographic hash of the contents of
        the layer's filesystem changeset. Though the set of changes exists as a
        simple Tar archive, two archives with identical filenames and content
        will have different SHA digests if the last-access or last-modified
        times of any entries differ. For this reason, image checksums are
        generated using the TarSum algorithm which produces a cryptographic
        hash of file contents and selected headers only. Details of this
        algorithm are described in the separate <a href="https://github.com/demonoid81/moby/blob/master/pkg/tarsum/tarsum_spec.md">TarSum specification</a>.
    </dd>
    <dt>
        Tag
    </dt>
    <dd>
        A tag serves to map a descriptive, user-given name to any single image
        ID. An image name suffix (the name component after <code>:</code>) is
        often referred to as a tag as well, though it strictly refers to the
        full name of an image.
        Acceptable values for a tag suffix are
        implementation specific, but they SHOULD be limited to the set of
        alphanumeric characters <code>[a-zA-Z0-9]</code> and punctuation
        characters <code>[._-]</code>, and MUST NOT contain a <code>:</code>
        character.
    </dd>
    <dt>
        Repository
    </dt>
    <dd>
        A collection of tags grouped under a common prefix (the name component
        before <code>:</code>). For example, in an image tagged with the name
        <code>my-app:3.1.4</code>, <code>my-app</code> is the <i>Repository</i>
        component of the name. Acceptable values for a repository name are
        implementation specific, but they SHOULD be limited to the set of
        alphanumeric characters <code>[a-zA-Z0-9]</code> and punctuation
        characters <code>[._-]</code>. A repository name MAY, however, contain
        additional <code>/</code> and <code>:</code> characters for
        organizational purposes, with the last <code>:</code> character being
        interpreted as dividing the repository component of the name from the
        tag suffix component.
    </dd>
</dl>

## Image JSON Description

Here is an example image JSON file:

```
{
    "id": "a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9",
    "parent": "c6e3cedcda2e3982a1a6760e178355e8e65f7b80e4e5248743fa3549d284e024",
    "checksum": "tarsum.v1+sha256:e58fcf7418d2390dec8e8fb69d88c06ec07039d651fedc3aa72af9972e7d046b",
    "created": "2014-10-13T21:19:18.674353812Z",
    "author": "Alyssa P. Hacker <alyspdev@example.com>",
    "architecture": "amd64",
    "os": "linux",
    "Size": 271828,
    "config": {
        "User": "alice",
        "Memory": 2048,
        "MemorySwap": 4096,
        "CpuShares": 8,
        "ExposedPorts": {
            "8080/tcp": {}
        },
        "Env": [
            "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
            "FOO=docker_is_a_really",
            "BAR=great_tool_you_know"
        ],
        "Entrypoint": [
            "/bin/my-app-binary"
        ],
        "Cmd": [
            "--foreground",
            "--config",
            "/etc/my-app.d/default.cfg"
        ],
        "Volumes": {
            "/var/job-result-data": {},
            "/var/log/my-app-logs": {}
        },
        "WorkingDir": "/home/alice"
    }
}
```

### Image JSON Field Descriptions

<dl>
    <dt>
        id <code>string</code>
    </dt>
    <dd>
        Randomly generated, 256-bit, hexadecimal encoded. Uniquely identifies
        the image.
    </dd>
    <dt>
        parent <code>string</code>
    </dt>
    <dd>
        ID of the parent image. If there is no parent image then this field
        should be omitted. A collection of images may share many of the same
        ancestor layers. This organizational structure is strictly a tree with
        any one layer having either no parent or a single parent and zero or
        more descendant layers. Cycles are not allowed and implementations
        should be careful to avoid creating them or iterating through a cycle
        indefinitely.
    </dd>
    <dt>
        created <code>string</code>
    </dt>
    <dd>
        ISO-8601 formatted combined date and time at which the image was
        created.
    </dd>
    <dt>
        author <code>string</code>
    </dt>
    <dd>
        Gives the name and/or email address of the person or entity which
        created and is responsible for maintaining the image.
    </dd>
    <dt>
        architecture <code>string</code>
    </dt>
    <dd>
        The CPU architecture which the binaries in this image are built to run
        on.
        Possible values include:
        <ul>
            <li>386</li>
            <li>amd64</li>
            <li>arm</li>
        </ul>
        More values may be supported in the future and any of these may or may
        not be supported by a given container runtime implementation.
    </dd>
    <dt>
        os <code>string</code>
    </dt>
    <dd>
        The name of the operating system which the image is built to run on.
        Possible values include:
        <ul>
            <li>darwin</li>
            <li>freebsd</li>
            <li>linux</li>
        </ul>
        More values may be supported in the future and any of these may or may
        not be supported by a given container runtime implementation.
    </dd>
    <dt>
        checksum <code>string</code>
    </dt>
    <dd>
        Image Checksum of the filesystem changeset associated with the image
        layer.
    </dd>
    <dt>
        Size <code>integer</code>
    </dt>
    <dd>
        The size in bytes of the filesystem changeset associated with the image
        layer.
    </dd>
    <dt>
        config <code>struct</code>
    </dt>
    <dd>
        The execution parameters which should be used as a base when running a
        container using the image. This field can be <code>null</code>, in
        which case any execution parameters should be specified at creation of
        the container.
        <h4>Container RunConfig Field Descriptions</h4>
        <dl>
            <dt>
                User <code>string</code>
            </dt>
            <dd>
                <p>The username or UID which the process in the container should
                run as.
                This acts as a default value to use when the value is
                not specified when creating a container.</p>
                <p>All of the following are valid:</p>
                <ul>
                    <li><code>user</code></li>
                    <li><code>uid</code></li>
                    <li><code>user:group</code></li>
                    <li><code>uid:gid</code></li>
                    <li><code>uid:group</code></li>
                    <li><code>user:gid</code></li>
                </ul>
                <p>If <code>group</code>/<code>gid</code> is not specified, the
                default group and supplementary groups of the given
                <code>user</code>/<code>uid</code> in <code>/etc/passwd</code>
                from the container are applied.</p>
            </dd>
            <dt>
                Memory <code>integer</code>
            </dt>
            <dd>
                Memory limit (in bytes). This acts as a default value to use
                when the value is not specified when creating a container.
            </dd>
            <dt>
                MemorySwap <code>integer</code>
            </dt>
            <dd>
                Total memory usage (memory + swap); set to <code>-1</code> to
                disable swap. This acts as a default value to use when the
                value is not specified when creating a container.
            </dd>
            <dt>
                CpuShares <code>integer</code>
            </dt>
            <dd>
                CPU shares (relative weight vs. other containers). This acts as
                a default value to use when the value is not specified when
                creating a container.
            </dd>
            <dt>
                ExposedPorts <code>struct</code>
            </dt>
            <dd>
                A set of ports to expose from a container running this image.
                This JSON structure value is unusual because it is a direct
                JSON serialization of the Go type
                <code>map[string]struct{}</code> and is represented in JSON as
                an object mapping its keys to an empty object.
                Here is an example:
<pre>{
    "8080": {},
    "53/udp": {},
    "2356/tcp": {}
}</pre>
                Its keys can be in the format of:
                <ul>
                    <li>
                        <code>"port/tcp"</code>
                    </li>
                    <li>
                        <code>"port/udp"</code>
                    </li>
                    <li>
                        <code>"port"</code>
                    </li>
                </ul>
                with the default protocol being <code>"tcp"</code> if not
                specified. These values act as defaults and are merged with any
                specified when creating a container.
            </dd>
            <dt>
                Env <code>array of strings</code>
            </dt>
            <dd>
                Entries are in the format of <code>VARNAME="var value"</code>.
                These values act as defaults and are merged with any specified
                when creating a container.
            </dd>
            <dt>
                Entrypoint <code>array of strings</code>
            </dt>
            <dd>
                A list of arguments to use as the command to execute when the
                container starts. This value acts as a default and is replaced
                by an entrypoint specified when creating a container.
            </dd>
            <dt>
                Cmd <code>array of strings</code>
            </dt>
            <dd>
                Default arguments to the entry point of the container. These
                values act as defaults and are replaced with any specified when
                creating a container. If an <code>Entrypoint</code> value is
                not specified, then the first entry of the <code>Cmd</code>
                array should be interpreted as the executable to run.
            </dd>
            <dt>
                Volumes <code>struct</code>
            </dt>
            <dd>
                A set of directories which should be created as data volumes in
                a container running this image. This JSON structure value is
                unusual because it is a direct JSON serialization of the Go
                type <code>map[string]struct{}</code> and is represented in
                JSON as an object mapping its keys to an empty object.
                Here is an example:
<pre>{
    "/var/my-app-data/": {},
    "/etc/some-config.d/": {}
}</pre>
            </dd>
            <dt>
                WorkingDir <code>string</code>
            </dt>
            <dd>
                Sets the current working directory of the entry point process
                in the container. This value acts as a default and is replaced
                by a working directory specified when creating a container.
            </dd>
        </dl>
    </dd>
</dl>

Any extra fields in the Image JSON struct are considered implementation
specific and should be ignored by any implementations which are unable to
interpret them.

## Creating an Image Filesystem Changeset

An example of creating an Image Filesystem Changeset follows.

An image root filesystem is first created as an empty directory named with the
ID of the image being created. Implementations need not name the rootfs
directory in this way, but it may be convenient for keeping a record of a large
number of image layers. Here is the initial empty directory structure for the
changeset for an image with ID `c3167915dc9d` ([real IDs are much
longer](#id_desc); this example uses a truncated one for brevity):

```
c3167915dc9d/
```

Files and directories are then created:

```
c3167915dc9d/
    etc/
        my-app-config
    bin/
        my-app-binary
        my-app-tools
```

The `c3167915dc9d` directory is then committed as a plain Tar archive with
entries for the following files:

```
etc/my-app-config
bin/my-app-binary
bin/my-app-tools
```

The TarSum checksum for the archive file is then computed and placed in the
JSON metadata along with the execution parameters.
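
As an illustration of the commit step above, here is a minimal Python sketch that writes the files of a rootfs directory into a plain (uncompressed) Tar archive with entry names relative to that directory. The function name `commit_rootfs` is hypothetical, the sketch records regular files only (empty directories are dropped), and it does not compute the TarSum checksum; it is not how any particular implementation performs the commit.

```python
import os
import tarfile


def commit_rootfs(rootfs_dir, archive_path):
    """Write every file under rootfs_dir into a plain tar archive.

    Entry names are relative to rootfs_dir (e.g. "etc/my-app-config").
    Returns the sorted list of entry names that were written.
    """
    names = []
    # Mode "w" produces an uncompressed tar archive.
    with tarfile.open(archive_path, "w") as tar:
        for dirpath, dirnames, filenames in os.walk(rootfs_dir):
            dirnames.sort()  # visit subdirectories in a stable order
            for filename in sorted(filenames):
                full_path = os.path.join(dirpath, filename)
                rel_path = os.path.relpath(full_path, rootfs_dir)
                rel_path = rel_path.replace(os.sep, "/")
                tar.add(full_path, arcname=rel_path, recursive=False)
                names.append(rel_path)
    return sorted(names)
```

Calling `commit_rootfs("c3167915dc9d", "layer.tar")` on the example directory would record the three files listed above, though not necessarily in the same order.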

To make changes to the filesystem of this container image, create a new
directory named with a new ID, such as `f60c56784b83`, and initialize it with
a snapshot of the parent image's root filesystem, so that the directory is
identical to that of `c3167915dc9d`. NOTE: a copy-on-write or union filesystem
can make this very efficient:

```
f60c56784b83/
    etc/
        my-app-config
    bin/
        my-app-binary
        my-app-tools
```

This example change is going to add a configuration directory at `/etc/my-app.d`
which contains a default config file. There's also a change to the
`my-app-tools` binary to handle the config layout change. The `f60c56784b83`
directory then looks like this:

```
f60c56784b83/
    etc/
        my-app.d/
            default.cfg
    bin/
        my-app-binary
        my-app-tools
```

This reflects the removal of `/etc/my-app-config` and creation of a file and
directory at `/etc/my-app.d/default.cfg`. `/bin/my-app-tools` has also been
replaced with an updated version. Before committing this directory to a
changeset, because it has a parent image, it is first compared with the
directory tree of the parent snapshot, `c3167915dc9d`, looking for files and
directories that have been added, modified, or removed. The following changeset
is found:

```
Added:      /etc/my-app.d/default.cfg
Modified:   /bin/my-app-tools
Deleted:    /etc/my-app-config
```

A Tar Archive is then created which contains *only* this changeset: the added
and modified files and directories in their entirety, and for each deleted item
an entry for an empty file at the same location but with the basename of the
deleted file or directory prefixed with `.wh.`. The filenames prefixed with
`.wh.` are known as "whiteout" files.
NOTE: For this reason, it is not possible
to create an image root filesystem which contains a file or directory with a
name beginning with `.wh.`. The resulting Tar archive for `f60c56784b83` has
the following entries:

```
/etc/my-app.d/default.cfg
/bin/my-app-tools
/etc/.wh.my-app-config
```

Any given image is likely to be composed of several of these Image Filesystem
Changeset tar archives.

## Combined Image JSON + Filesystem Changeset Format

There is also a format for a single archive which contains complete information
about an image, including:

 - repository names/tags
 - all image layer JSON files
 - the tar archive of each layer's filesystem changeset

For example, here's what the full archive of `library/busybox` looks like
(displayed in `tree` format):

```
.
├── 5785b62b697b99a5af6cd5d0aabc804d5748abbb6d3d07da5d1d3795f2dcc83e
│   ├── VERSION
│   ├── json
│   └── layer.tar
├── a7b8b41220991bfc754d7ad445ad27b7f272ab8b4a2c175b9512b97471d02a8a
│   ├── VERSION
│   ├── json
│   └── layer.tar
├── a936027c5ca8bf8f517923169a233e391cbb38469a75de8383b5228dc2d26ceb
│   ├── VERSION
│   ├── json
│   └── layer.tar
├── f60c56784b832dd990022afc120b8136ab3da9528094752ae13fe63a2d28dc8c
│   ├── VERSION
│   ├── json
│   └── layer.tar
└── repositories
```

There are one or more directories named with the ID for each layer in a full
image. Each of these directories contains 3 files:

 * `VERSION` - The schema version of the `json` file
 * `json` - The JSON metadata for an image layer
 * `layer.tar` - The Tar archive of the filesystem changeset for an image
   layer.
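
To make the per-layer layout concrete, the following Python sketch reads an already unpacked archive directory and collects the `VERSION` string and parsed `json` metadata for each layer directory. The function name `read_layers` and the returned dictionary shape are assumptions made for illustration; the specification does not prescribe any particular API.

```python
import json
import os


def read_layers(bundle_dir):
    """Return {layer_id: {"version": str, "metadata": dict}} for a bundle.

    bundle_dir is assumed to be the unpacked top-level directory of the
    combined archive. Top-level files such as `repositories` are skipped.
    """
    layers = {}
    for name in sorted(os.listdir(bundle_dir)):
        layer_dir = os.path.join(bundle_dir, name)
        if not os.path.isdir(layer_dir):
            continue  # not a layer directory
        # Each layer directory must contain exactly these three files.
        for required in ("VERSION", "json", "layer.tar"):
            if not os.path.isfile(os.path.join(layer_dir, required)):
                raise ValueError(f"{name}: missing {required}")
        with open(os.path.join(layer_dir, "VERSION")) as f:
            version = f.read().strip()
        with open(os.path.join(layer_dir, "json")) as f:
            metadata = json.load(f)
        layers[name] = {"version": version, "metadata": metadata}
    return layers
```

Raising an error on a missing file is a design choice of this sketch; an implementation could equally skip incomplete layer directories.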

The content of the `VERSION` files is simply the semantic version of the JSON
metadata schema:

```
1.0
```

And the `repositories` file is another JSON file which describes names/tags:

```
{
    "busybox":{
        "latest":"5785b62b697b99a5af6cd5d0aabc804d5748abbb6d3d07da5d1d3795f2dcc83e"
    }
}
```

Every key in this object is the name of a repository, and maps to a collection
of tag suffixes. Each tag maps to the ID of the image represented by that tag.

## Loading an Image Filesystem Changeset

Unpacking a bundle of image layer JSON files and their corresponding filesystem
changesets can be done using a series of steps:

1. Follow the parent IDs of image layers to find the root ancestor (an image
   with no parent ID specified).
2. For every image layer, in order from root ancestor and descending down,
   extract the contents of that layer's filesystem changeset archive into a
   directory which will be used as the root of a container filesystem.

    - Extract all contents of each archive.
    - Walk the directory tree once more, removing any files with the prefix
      `.wh.` and the corresponding file or directory named without this prefix.


## Implementations

This specification is an admittedly imperfect description of an
imperfectly-understood problem. The Docker project is, in turn, an attempt to
implement this specification. Our goal and our execution toward it will evolve
over time, but our primary concern in this specification and in our
implementation is compatibility and interoperability.
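
As a rough illustration of the steps in "Loading an Image Filesystem Changeset", the Python sketch below covers step 1 (ordering layers from the root ancestor down) and the whiteout pass of step 2. The names `order_layers` and `apply_whiteouts` are hypothetical, extraction of each `layer.tar` (for example with Python's `tarfile` module) is left out, and `apply_whiteouts` is assumed to be called once after each layer has been extracted. Treat it as a sketch under these assumptions, not as the Docker implementation.

```python
import os
import shutil

WHITEOUT_PREFIX = ".wh."


def order_layers(metadata_by_id, image_id):
    """Return layer IDs ordered from the root ancestor down to image_id.

    metadata_by_id maps a layer ID to its parsed JSON metadata. A layer
    without a "parent" field is the root ancestor.
    """
    chain, seen = [], set()
    current = image_id
    while current is not None:
        if current in seen:  # cycles are not allowed by the specification
            raise ValueError("cycle in parent chain at " + current)
        seen.add(current)
        chain.append(current)
        current = metadata_by_id[current].get("parent")
    chain.reverse()
    return chain


def apply_whiteouts(rootfs):
    """Remove each whiteout file and the path it marks as deleted.

    Returns the sorted list of deleted paths, relative to rootfs.
    """
    removed = []
    # Bottom-up walk: children are visited before their parent directory,
    # so deleting a subdirectory never invalidates a pending visit.
    for dirpath, dirnames, filenames in os.walk(rootfs, topdown=False):
        for filename in filenames:
            if not filename.startswith(WHITEOUT_PREFIX):
                continue
            marker = os.path.join(dirpath, filename)
            if not os.path.lexists(marker):
                continue  # already removed as another marker's target
            target = os.path.join(dirpath, filename[len(WHITEOUT_PREFIX):])
            os.remove(marker)
            if os.path.isdir(target) and not os.path.islink(target):
                shutil.rmtree(target)
            elif os.path.lexists(target):
                os.remove(target)
            removed.append(os.path.relpath(target, rootfs).replace(os.sep, "/"))
    return sorted(removed)
```

A loader built on this sketch would iterate over `order_layers(...)`, extract each layer's `layer.tar` into the same root directory, and call `apply_whiteouts` on that directory after every extraction.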