# Docker Image Specification v1.0.0

An *Image* is an ordered collection of root filesystem changes and the
corresponding execution parameters for use within a container runtime. This
specification outlines the format of these filesystem changes and corresponding
parameters and describes how to create and use them with a container runtime
and execution tool.

## Terminology

This specification uses the following terms:

<dl>
    <dt>
        Layer
    </dt>
    <dd>
        Images are composed of <i>layers</i>. <i>Image layer</i> is a general
        term which may be used to refer to one or both of the following:

        <ol>
            <li>The metadata for the layer, described in the JSON format.</li>
            <li>The filesystem changes described by a layer.</li>
        </ol>

        To refer to the former you may use the term <i>Layer JSON</i> or
        <i>Layer Metadata</i>. To refer to the latter you may use the term
        <i>Image Filesystem Changeset</i> or <i>Image Diff</i>.
    </dd>
    <dt>
        Image JSON
    </dt>
    <dd>
        Each layer has an associated JSON structure which describes some
        basic information about the image such as date created, author, and the
        ID of its parent image as well as execution/runtime configuration like
        its entry point, default arguments, CPU/memory shares, networking, and
        volumes.
    </dd>
    <dt>
        Image Filesystem Changeset
    </dt>
    <dd>
        Each layer has an archive of the files which have been added, changed,
        or deleted relative to its parent layer. Using a layer-based or union
        filesystem such as AUFS, or by computing the diff from filesystem
        snapshots, the filesystem changeset can be used to present a series of
        image layers as if they were one cohesive filesystem.
    </dd>
    <dt>
        Image ID <a name="id_desc"></a>
    </dt>
    <dd>
        Each layer is given an ID upon its creation.
        It is represented as a hexadecimal encoding of 256 bits, e.g.,
        <code>a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9</code>.
        Image IDs should be sufficiently random so as to be globally unique.
        32 bytes read from <code>/dev/urandom</code> is sufficient for all
        practical purposes. Alternatively, an image ID may be derived as a
        cryptographic hash of image contents as the result is considered
        indistinguishable from random. The choice is left up to implementors.
    </dd>
    <dt>
        Image Parent
    </dt>
    <dd>
        Most layer metadata structs contain a <code>parent</code> field which
        refers to the Image from which another directly descends. An image
        contains a separate JSON metadata file and set of changes relative to
        the filesystem of its parent image. <i>Image Ancestor</i> and
        <i>Image Descendant</i> are also common terms.
    </dd>
    <dt>
        Image Checksum
    </dt>
    <dd>
        Layer metadata structs contain a cryptographic hash of the contents of
        the layer's filesystem changeset. Though the set of changes exists as a
        simple Tar archive, two archives with identical filenames and content
        will have different SHA digests if the last-access or last-modified
        times of any entries differ. For this reason, image checksums are
        generated using the TarSum algorithm which produces a cryptographic
        hash of file contents and selected headers only. Details of this
        algorithm are described in the separate <a href="https://github.com/docker/docker/blob/master/pkg/tarsum/tarsum_spec.md">TarSum specification</a>.
    </dd>
    <dt>
        Tag
    </dt>
    <dd>
        A tag serves to map a descriptive, user-given name to any single image
        ID. An image name suffix (the name component after <code>:</code>) is
        often referred to as a tag as well, though it strictly refers to the
        full name of an image.
        Acceptable values for a tag suffix are
        implementation specific, but they SHOULD be limited to the set of
        alphanumeric characters <code>[a-zA-Z0-9]</code> and punctuation
        characters <code>[._-]</code>, and MUST NOT contain a <code>:</code>
        character.
    </dd>
    <dt>
        Repository
    </dt>
    <dd>
        A collection of tags grouped under a common prefix (the name component
        before <code>:</code>). For example, in an image tagged with the name
        <code>my-app:3.1.4</code>, <code>my-app</code> is the <i>Repository</i>
        component of the name. Acceptable values for a repository name are
        implementation specific, but they SHOULD be limited to the set of
        alphanumeric characters <code>[a-zA-Z0-9]</code> and punctuation
        characters <code>[._-]</code>. A repository name MAY also contain
        additional <code>/</code> and <code>:</code> characters for
        organizational purposes, with the last <code>:</code> character
        interpreted as dividing the repository component of the name from the
        tag suffix component.
    </dd>
</dl>

## Image JSON Description

Here is an example image JSON file:

```
{
    "id": "a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9",
    "parent": "c6e3cedcda2e3982a1a6760e178355e8e65f7b80e4e5248743fa3549d284e024",
    "checksum": "tarsum.v1+sha256:e58fcf7418d2390dec8e8fb69d88c06ec07039d651fedc3aa72af9972e7d046b",
    "created": "2014-10-13T21:19:18.674353812Z",
    "author": "Alyssa P. Hacker <alyspdev@example.com>",
    "architecture": "amd64",
    "os": "linux",
    "Size": 271828,
    "config": {
        "User": "alice",
        "Memory": 2048,
        "MemorySwap": 4096,
        "CpuShares": 8,
        "ExposedPorts": {
            "8080/tcp": {}
        },
        "Env": [
            "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
            "FOO=docker_is_a_really",
            "BAR=great_tool_you_know"
        ],
        "Entrypoint": [
            "/bin/my-app-binary"
        ],
        "Cmd": [
            "--foreground",
            "--config",
            "/etc/my-app.d/default.cfg"
        ],
        "Volumes": {
            "/var/job-result-data": {},
            "/var/log/my-app-logs": {}
        },
        "WorkingDir": "/home/alice"
    }
}
```

### Image JSON Field Descriptions

<dl>
    <dt>
        id <code>string</code>
    </dt>
    <dd>
        Randomly generated, 256-bit, hexadecimal encoded. Uniquely identifies
        the image.
    </dd>
    <dt>
        parent <code>string</code>
    </dt>
    <dd>
        ID of the parent image. If there is no parent image then this field
        should be omitted. A collection of images may share many of the same
        ancestor layers. This organizational structure is strictly a tree with
        any one layer having either no parent or a single parent and zero or
        more descendant layers. Cycles are not allowed and implementations
        should be careful to avoid creating them or iterating through a cycle
        indefinitely.
    </dd>
    <dt>
        created <code>string</code>
    </dt>
    <dd>
        ISO-8601 formatted combined date and time at which the image was
        created.
    </dd>
    <dt>
        author <code>string</code>
    </dt>
    <dd>
        Gives the name and/or email address of the person or entity which
        created and is responsible for maintaining the image.
    </dd>
    <dt>
        architecture <code>string</code>
    </dt>
    <dd>
        The CPU architecture which the binaries in this image are built to run
        on.
        Possible values include:
        <ul>
            <li>386</li>
            <li>amd64</li>
            <li>arm</li>
        </ul>
        More values may be supported in the future and any of these may or may
        not be supported by a given container runtime implementation.
    </dd>
    <dt>
        os <code>string</code>
    </dt>
    <dd>
        The name of the operating system which the image is built to run on.
        Possible values include:
        <ul>
            <li>darwin</li>
            <li>freebsd</li>
            <li>linux</li>
        </ul>
        More values may be supported in the future and any of these may or may
        not be supported by a given container runtime implementation.
    </dd>
    <dt>
        checksum <code>string</code>
    </dt>
    <dd>
        Image Checksum of the filesystem changeset associated with the image
        layer.
    </dd>
    <dt>
        Size <code>integer</code>
    </dt>
    <dd>
        The size in bytes of the filesystem changeset associated with the image
        layer.
    </dd>
    <dt>
        config <code>struct</code>
    </dt>
    <dd>
        The execution parameters which should be used as a base when running a
        container using the image. This field can be <code>null</code>, in
        which case any execution parameters should be specified at creation of
        the container.

        <h4>Container RunConfig Field Descriptions</h4>

        <dl>
            <dt>
                User <code>string</code>
            </dt>
            <dd>
                <p>The username or UID which the process in the container should
                run as.
                This acts as a default value to use when the value is
                not specified when creating a container.</p>

                <p>All of the following are valid:</p>

                <ul>
                    <li><code>user</code></li>
                    <li><code>uid</code></li>
                    <li><code>user:group</code></li>
                    <li><code>uid:gid</code></li>
                    <li><code>uid:group</code></li>
                    <li><code>user:gid</code></li>
                </ul>

                <p>If <code>group</code>/<code>gid</code> is not specified, the
                default group and supplementary groups of the given
                <code>user</code>/<code>uid</code> in <code>/etc/passwd</code>
                from the container are applied.</p>
            </dd>
            <dt>
                Memory <code>integer</code>
            </dt>
            <dd>
                Memory limit (in bytes). This acts as a default value to use
                when the value is not specified when creating a container.
            </dd>
            <dt>
                MemorySwap <code>integer</code>
            </dt>
            <dd>
                Total memory usage (memory + swap); set to <code>-1</code> to
                disable swap. This acts as a default value to use when the
                value is not specified when creating a container.
            </dd>
            <dt>
                CpuShares <code>integer</code>
            </dt>
            <dd>
                CPU shares (relative weight vs. other containers). This acts as
                a default value to use when the value is not specified when
                creating a container.
            </dd>
            <dt>
                ExposedPorts <code>struct</code>
            </dt>
            <dd>
                A set of ports to expose from a container running this image.
                This JSON structure value is unusual because it is a direct
                JSON serialization of the Go type
                <code>map[string]struct{}</code> and is represented in JSON as
                an object mapping its keys to an empty object.
                Here is an
                example:

<pre>{
    "8080": {},
    "53/udp": {},
    "2356/tcp": {}
}</pre>

                Its keys can be in the format of:
                <ul>
                    <li>
                        <code>"port/tcp"</code>
                    </li>
                    <li>
                        <code>"port/udp"</code>
                    </li>
                    <li>
                        <code>"port"</code>
                    </li>
                </ul>
                with the default protocol being <code>"tcp"</code> if not
                specified.

                These values act as defaults and are merged with any specified
                when creating a container.
            </dd>
            <dt>
                Env <code>array of strings</code>
            </dt>
            <dd>
                Entries are in the format of <code>VARNAME="var value"</code>.
                These values act as defaults and are merged with any specified
                when creating a container.
            </dd>
            <dt>
                Entrypoint <code>array of strings</code>
            </dt>
            <dd>
                A list of arguments to use as the command to execute when the
                container starts. This value acts as a default and is replaced
                by an entrypoint specified when creating a container.
            </dd>
            <dt>
                Cmd <code>array of strings</code>
            </dt>
            <dd>
                Default arguments to the entry point of the container. These
                values act as defaults and are replaced with any specified when
                creating a container. If an <code>Entrypoint</code> value is
                not specified, then the first entry of the <code>Cmd</code>
                array should be interpreted as the executable to run.
            </dd>
            <dt>
                Volumes <code>struct</code>
            </dt>
            <dd>
                A set of directories which should be created as data volumes in
                a container running this image. This JSON structure value is
                unusual because it is a direct JSON serialization of the Go
                type <code>map[string]struct{}</code> and is represented in
                JSON as an object mapping its keys to an empty object.
                Here is
                an example:
<pre>{
    "/var/my-app-data/": {},
    "/etc/some-config.d/": {}
}</pre>
            </dd>
            <dt>
                WorkingDir <code>string</code>
            </dt>
            <dd>
                Sets the current working directory of the entry point process
                in the container. This value acts as a default and is replaced
                by a working directory specified when creating a container.
            </dd>
        </dl>
    </dd>
</dl>

Any extra fields in the Image JSON struct are considered implementation
specific and should be ignored by any implementations which are unable to
interpret them.

## Creating an Image Filesystem Changeset

An example of creating an Image Filesystem Changeset follows.

An image root filesystem is first created as an empty directory named with the
ID of the image being created. Here is the initial empty directory structure
for the changeset for an image with ID `c3167915dc9d` ([real IDs are much
longer](#id_desc), but this example uses a truncated one for brevity.
Implementations need not name the rootfs directory in this way, but it may be
convenient for keeping a record of a large number of image layers):

```
c3167915dc9d/
```

Files and directories are then created:

```
c3167915dc9d/
    etc/
        my-app-config
    bin/
        my-app-binary
        my-app-tools
```

The `c3167915dc9d` directory is then committed as a plain Tar archive with
entries for the following files:

```
etc/my-app-config
bin/my-app-binary
bin/my-app-tools
```

The TarSum checksum for the archive file is then computed and placed in the
JSON metadata along with the execution parameters.
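
The commit step above can be sketched in Python with the standard `tarfile`
module. This is only an illustration under stated assumptions: `commit_rootfs`
and `list_entries` are hypothetical helper names, only regular files are
archived (as in the entry list above), and the sketch stops before the
checksum, because TarSum is defined in its own specification and is not
reproduced here.

```python
import io
import os
import tarfile


def commit_rootfs(rootfs_dir):
    """Pack every file under rootfs_dir into a plain (uncompressed) tar archive.

    Hypothetical helper: returns the archive as bytes, with entry names
    relative to rootfs_dir (for example "etc/my-app-config").
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for dirpath, dirnames, filenames in os.walk(rootfs_dir):
            dirnames.sort()  # walk subdirectories in a stable order
            for name in sorted(filenames):
                full_path = os.path.join(dirpath, name)
                entry_name = os.path.relpath(full_path, rootfs_dir)
                entry_name = entry_name.replace(os.sep, "/")
                tar.add(full_path, arcname=entry_name, recursive=False)
    return buf.getvalue()


def list_entries(archive_bytes):
    """Return the entry names stored in a tar archive held in memory."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        return tar.getnames()
```

Calling `list_entries(commit_rootfs("c3167915dc9d"))` on the example directory
would list the three files shown above, though in sorted order rather than the
order printed in this document.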

To make changes to the filesystem of this container image, create a new
directory named with a new ID, such as `f60c56784b83`, and initialize it with
a snapshot of the parent image's root filesystem, so that the directory is
identical to that of `c3167915dc9d`. NOTE: a copy-on-write or union filesystem
can make this very efficient:

```
f60c56784b83/
    etc/
        my-app-config
    bin/
        my-app-binary
        my-app-tools
```

This example change is going to add a configuration directory at `/etc/my-app.d`
which contains a default config file. There's also a change to the
`my-app-tools` binary to handle the config layout change. The `f60c56784b83`
directory then looks like this:

```
f60c56784b83/
    etc/
        my-app.d/
            default.cfg
    bin/
        my-app-binary
        my-app-tools
```

This reflects the removal of `/etc/my-app-config` and creation of a file and
directory at `/etc/my-app.d/default.cfg`. `/bin/my-app-tools` has also been
replaced with an updated version. Before committing this directory to a
changeset, because it has a parent image, it is first compared with the
directory tree of the parent snapshot, `c3167915dc9d`, looking for files and
directories that have been added, modified, or removed. The following changeset
is found:

```
Added:      /etc/my-app.d/default.cfg
Modified:   /bin/my-app-tools
Deleted:    /etc/my-app-config
```

A Tar Archive is then created which contains *only* this changeset: The added
and modified files and directories in their entirety, and for each deleted item
an entry for an empty file at the same location but with the basename of the
deleted file or directory prefixed with `.wh.`. The filenames prefixed with
`.wh.` are known as "whiteout" files.
NOTE: For this reason, it is not possible
to create an image root filesystem which contains a file or directory with a
name beginning with `.wh.`. The resulting Tar archive for `f60c56784b83` has
the following entries:

```
/etc/my-app.d/default.cfg
/bin/my-app-tools
/etc/.wh.my-app-config
```

Any given image is likely to be composed of several of these Image Filesystem
Changeset tar archives.

## Combined Image JSON + Filesystem Changeset Format

There is also a format for a single archive which contains complete information
about an image, including:

 - repository names/tags
 - all image layer JSON files
 - the tar archive of each layer's filesystem changeset

For example, here's what the full archive of `library/busybox` is (displayed in
`tree` format):

```
.
├── 5785b62b697b99a5af6cd5d0aabc804d5748abbb6d3d07da5d1d3795f2dcc83e
│   ├── VERSION
│   ├── json
│   └── layer.tar
├── a7b8b41220991bfc754d7ad445ad27b7f272ab8b4a2c175b9512b97471d02a8a
│   ├── VERSION
│   ├── json
│   └── layer.tar
├── a936027c5ca8bf8f517923169a233e391cbb38469a75de8383b5228dc2d26ceb
│   ├── VERSION
│   ├── json
│   └── layer.tar
├── f60c56784b832dd990022afc120b8136ab3da9528094752ae13fe63a2d28dc8c
│   ├── VERSION
│   ├── json
│   └── layer.tar
└── repositories
```

There are one or more directories named with the ID for each layer in a full
image. Each of these directories contains 3 files:

 * `VERSION` - The schema version of the `json` file
 * `json` - The JSON metadata for an image layer
 * `layer.tar` - The Tar archive of the filesystem changeset for an image
   layer.
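
As a rough illustration of this layout, the following Python sketch checks an
already-unpacked archive directory and reports which of the three files each
layer directory is missing. `check_bundle` and the constant names are
hypothetical, and the sketch assumes layer directories are named with full
64-character lowercase hexadecimal IDs, as in the listing above.

```python
import os
import re

# Assumption: layer directories are named with a 256-bit ID in lowercase hex.
LAYER_ID_PATTERN = re.compile(r"^[0-9a-f]{64}$")
REQUIRED_FILES = ("VERSION", "json", "layer.tar")


def check_bundle(bundle_dir):
    """Map each layer directory in bundle_dir to the list of files it lacks.

    Hypothetical helper: an empty list means the layer directory holds all
    three files described above. Other entries are ignored.
    """
    report = {}
    for entry in sorted(os.listdir(bundle_dir)):
        layer_dir = os.path.join(bundle_dir, entry)
        if not (os.path.isdir(layer_dir) and LAYER_ID_PATTERN.match(entry)):
            continue
        report[entry] = [
            name
            for name in REQUIRED_FILES
            if not os.path.isfile(os.path.join(layer_dir, name))
        ]
    return report
```

A complete bundle yields an empty list for every layer ID; any non-empty list
names the files that would have to be supplied before the layer can be loaded.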

The content of the `VERSION` files is simply the semantic version of the JSON
metadata schema:

```
1.0
```

And the `repositories` file is another JSON file which describes names/tags:

```
{
    "busybox":{
        "latest":"5785b62b697b99a5af6cd5d0aabc804d5748abbb6d3d07da5d1d3795f2dcc83e"
    }
}
```

Every key in this object is the name of a repository, and maps to a collection
of tag suffixes. Each tag maps to the ID of the image represented by that tag.

## Loading an Image Filesystem Changeset

Unpacking a bundle of image layer JSON files and their corresponding filesystem
changesets can be done using a series of steps:

1. Follow the parent IDs of image layers to find the root ancestor (an image
with no parent ID specified).
2. For every image layer, in order from root ancestor and descending down,
extract the contents of that layer's filesystem changeset archive into a
directory which will be used as the root of a container filesystem.

    - Extract all contents of each archive.
    - Walk the directory tree once more, removing any files with the prefix
    `.wh.` and the corresponding file or directory named without this prefix.


## Implementations

This specification is an admittedly imperfect description of an
imperfectly-understood problem. The Docker project is, in turn, an attempt to
implement this specification. Our goal and our execution toward it will evolve
over time, but our primary concern in this specification and in our
implementation is compatibility and interoperability.
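
As one possible illustration, the two loading steps from "Loading an Image
Filesystem Changeset" can be sketched in Python. This is not Docker's
implementation: `layer_chain` and `apply_layer` are hypothetical names, the
layer metadata is assumed to be already parsed into a dictionary keyed by image
ID, and only regular files and directories are extracted (links, devices,
ownership and modes are left out).

```python
import io
import os
import shutil
import tarfile

WHITEOUT_PREFIX = ".wh."


def layer_chain(metadata, leaf_id):
    """Step 1: follow "parent" IDs from leaf_id; return IDs ordered root first."""
    chain, seen = [], set()
    current = leaf_id
    while current is not None:
        if current in seen:  # cycles are not allowed; refuse to loop forever
            raise ValueError("cycle in parent chain at " + current)
        seen.add(current)
        chain.append(current)
        current = metadata[current].get("parent")  # omitted for the root
    chain.reverse()
    return chain


def apply_layer(rootfs, layer_tar):
    """Step 2: extract one changeset (tar bytes) into rootfs, apply whiteouts."""
    with tarfile.open(fileobj=io.BytesIO(layer_tar)) as tar:
        for member in tar:
            # Drop leading "/" and "." components; reject parent traversal.
            parts = [p for p in member.name.split("/") if p not in ("", ".")]
            if not parts:
                continue
            if ".." in parts:
                raise ValueError("unsafe path in archive: " + member.name)
            dest = os.path.join(rootfs, *parts)
            if member.isdir():
                os.makedirs(dest, exist_ok=True)
            elif member.isfile():
                os.makedirs(os.path.dirname(dest), exist_ok=True)
                with tar.extractfile(member) as src, open(dest, "wb") as out:
                    shutil.copyfileobj(src, out)
    # Walk the tree once more and collect whiteout markers before deleting.
    markers = [
        os.path.join(dirpath, name)
        for dirpath, _, names in os.walk(rootfs)
        for name in names
        if name.startswith(WHITEOUT_PREFIX)
    ]
    for marker in markers:
        hidden_name = os.path.basename(marker)[len(WHITEOUT_PREFIX):]
        target = os.path.join(os.path.dirname(marker), hidden_name)
        if os.path.lexists(marker):
            os.remove(marker)
        if os.path.isdir(target) and not os.path.islink(target):
            shutil.rmtree(target)
        elif os.path.lexists(target):
            os.remove(target)
```

A caller would iterate over `layer_chain(metadata, image_id)` and pass each
layer's `layer.tar` bytes to `apply_layer` with the same `rootfs` directory, so
that later layers overwrite or remove what earlier layers created.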