# Docker Image Specification v1.0.0

An *Image* is an ordered collection of root filesystem changes and the
corresponding execution parameters for use within a container runtime. This
specification outlines the format of these filesystem changes and corresponding
parameters and describes how to create and use them with a container runtime
and execution tool.

## Terminology

This specification uses the following terms:

<dl>
    <dt>
        Layer
    </dt>
    <dd>
        Images are composed of <i>layers</i>. <i>Image layer</i> is a general
        term which may be used to refer to one or both of the following:
        <ol>
            <li>The metadata for the layer, described in the JSON format.</li>
            <li>The filesystem changes described by a layer.</li>
        </ol>
        To refer to the former you may use the term <i>Layer JSON</i> or
        <i>Layer Metadata</i>. To refer to the latter you may use the term
        <i>Image Filesystem Changeset</i> or <i>Image Diff</i>.
    </dd>
    <dt>
        Image JSON
    </dt>
    <dd>
        Each layer has an associated JSON structure which describes some
        basic information about the image such as date created, author, and the
        ID of its parent image as well as execution/runtime configuration like
        its entry point, default arguments, CPU/memory shares, networking, and
        volumes.
    </dd>
    <dt>
        Image Filesystem Changeset
    </dt>
    <dd>
        Each layer has an archive of the files which have been added, changed,
        or deleted relative to its parent layer. Using a layer-based or union
        filesystem such as AUFS, or by computing the diff from filesystem
        snapshots, the filesystem changeset can be used to present a series of
        image layers as if they were one cohesive filesystem.
    </dd>
    <dt>
        Image ID <a name="id_desc"></a>
    </dt>
    <dd>
        Each layer is given an ID upon its creation. It is
        represented as a hexadecimal encoding of 256 bits, e.g.,
        <code>a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9</code>.
        Image IDs should be sufficiently random so as to be globally unique.
        32 bytes read from <code>/dev/urandom</code> is sufficient for all
        practical purposes. Alternatively, an image ID may be derived as a
        cryptographic hash of image contents as the result is considered
        indistinguishable from random. The choice is left up to implementors.
    </dd>
    <dt>
        Image Parent
    </dt>
    <dd>
        Most layer metadata structs contain a <code>parent</code> field which
        refers to the Image from which another directly descends. An image
        contains a separate JSON metadata file and set of changes relative to
        the filesystem of its parent image. <i>Image Ancestor</i> and
        <i>Image Descendant</i> are also common terms.
    </dd>
    <dt>
        Image Checksum
    </dt>
    <dd>
        Layer metadata structs contain a cryptographic hash of the contents of
        the layer's filesystem changeset. Though the set of changes exists as a
        simple Tar archive, two archives with identical filenames and content
        will have different SHA digests if the last-access or last-modified
        times of any entries differ. For this reason, image checksums are
        generated using the TarSum algorithm which produces a cryptographic
        hash of file contents and selected headers only. Details of this
        algorithm are described in the separate <a href="https://github.com/demonoid81/moby/blob/master/pkg/tarsum/tarsum_spec.md">TarSum specification</a>.
    </dd>
    <dt>
        Tag
    </dt>
    <dd>
        A tag serves to map a descriptive, user-given name to any single image
        ID. An image name suffix (the name component after <code>:</code>) is
        often referred to as a tag as well, though it strictly refers to the
        full name of an image. Acceptable values for a tag suffix are
        implementation specific, but they SHOULD be limited to the set of
        alphanumeric characters <code>[a-zA-Z0-9]</code> and punctuation
        characters <code>[._-]</code>, and MUST NOT contain a <code>:</code>
        character.
    </dd>
    <dt>
        Repository
    </dt>
    <dd>
        A collection of tags grouped under a common prefix (the name component
        before <code>:</code>). For example, in an image tagged with the name
        <code>my-app:3.1.4</code>, <code>my-app</code> is the <i>Repository</i>
        component of the name. Acceptable values for a repository name are
        implementation specific, but they SHOULD be limited to the set of
        alphanumeric characters <code>[a-zA-Z0-9]</code> and punctuation
        characters <code>[._-]</code>. However, a name MAY contain additional
        <code>/</code> and <code>:</code> characters for organizational
        purposes, with the last <code>:</code> character being interpreted as
        dividing the repository component of the name from the tag suffix
        component.
    </dd>
</dl>
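
The Image ID and name definitions above can be illustrated with a minimal Python sketch. The function names are hypothetical and not part of this specification, SHA-256 is only one possible choice of cryptographic hash, and accepting both upper and lower case hexadecimal digits is an assumption:

```python
import hashlib
import re
import secrets
from typing import Optional, Tuple

# 256 bits encoded as 64 hexadecimal characters (case-insensitivity is assumed).
ID_PATTERN = re.compile(r"[0-9a-fA-F]{64}")


def random_image_id() -> str:
    """Return 32 random bytes from the OS random source, hex encoded."""
    return secrets.token_hex(32)


def content_image_id(content: bytes) -> str:
    """Derive an ID from image contents; SHA-256 is an assumed hash choice."""
    return hashlib.sha256(content).hexdigest()


def is_image_id(value: str) -> bool:
    """Check that a string looks like a full-length image ID."""
    return ID_PATTERN.fullmatch(value) is not None


def split_name(name: str) -> Tuple[str, Optional[str]]:
    """Split an image name at its last ':' into (repository, tag suffix)."""
    repository, separator, tag = name.rpartition(":")
    if not separator:
        return name, None  # no tag suffix present
    return repository, tag
```

For example, `split_name("my-app:3.1.4")` yields the repository `my-app` and the tag suffix `3.1.4`. Because the last `:` is always treated as the divider, a name that uses `:` for organizational purposes but carries no tag suffix would need extra handling that this sketch does not attempt.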

## Image JSON Description

Here is an example image JSON file:

```
{
    "id": "a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9",
    "parent": "c6e3cedcda2e3982a1a6760e178355e8e65f7b80e4e5248743fa3549d284e024",
    "checksum": "tarsum.v1+sha256:e58fcf7418d2390dec8e8fb69d88c06ec07039d651fedc3aa72af9972e7d046b",
    "created": "2014-10-13T21:19:18.674353812Z",
    "author": "Alyssa P. Hacker <alyspdev@example.com>",
    "architecture": "amd64",
    "os": "linux",
    "Size": 271828,
    "config": {
        "User": "alice",
        "Memory": 2048,
        "MemorySwap": 4096,
        "CpuShares": 8,
        "ExposedPorts": {
            "8080/tcp": {}
        },
        "Env": [
            "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
            "FOO=docker_is_a_really",
            "BAR=great_tool_you_know"
        ],
        "Entrypoint": [
            "/bin/my-app-binary"
        ],
        "Cmd": [
            "--foreground",
            "--config",
            "/etc/my-app.d/default.cfg"
        ],
        "Volumes": {
            "/var/job-result-data": {},
            "/var/log/my-app-logs": {}
        },
        "WorkingDir": "/home/alice"
    }
}
```

### Image JSON Field Descriptions

<dl>
    <dt>
        id <code>string</code>
    </dt>
    <dd>
        Randomly generated, 256-bit, hexadecimal encoded. Uniquely identifies
        the image.
    </dd>
    <dt>
        parent <code>string</code>
    </dt>
    <dd>
        ID of the parent image. If there is no parent image then this field
        should be omitted. A collection of images may share many of the same
        ancestor layers. This organizational structure is strictly a tree with
        any one layer having either no parent or a single parent and zero or
        more descendant layers. Cycles are not allowed and implementations
        should be careful to avoid creating them or iterating through a cycle
        indefinitely.
    </dd>
    <dt>
        created <code>string</code>
    </dt>
    <dd>
        ISO-8601 formatted combined date and time at which the image was
        created.
    </dd>
    <dt>
        author <code>string</code>
    </dt>
    <dd>
        Gives the name and/or email address of the person or entity which
        created and is responsible for maintaining the image.
    </dd>
    <dt>
        architecture <code>string</code>
    </dt>
    <dd>
        The CPU architecture which the binaries in this image are built to run
        on. Possible values include:
        <ul>
            <li>386</li>
            <li>amd64</li>
            <li>arm</li>
        </ul>
        More values may be supported in the future and any of these may or may
        not be supported by a given container runtime implementation.
    </dd>
    <dt>
        os <code>string</code>
    </dt>
    <dd>
        The name of the operating system which the image is built to run on.
        Possible values include:
        <ul>
            <li>darwin</li>
            <li>freebsd</li>
            <li>linux</li>
        </ul>
        More values may be supported in the future and any of these may or may
        not be supported by a given container runtime implementation.
    </dd>
    <dt>
        checksum <code>string</code>
    </dt>
    <dd>
        Image Checksum of the filesystem changeset associated with the image
        layer.
    </dd>
    <dt>
        Size <code>integer</code>
    </dt>
    <dd>
        The size in bytes of the filesystem changeset associated with the image
        layer.
    </dd>
    <dt>
        config <code>struct</code>
    </dt>
    <dd>
        The execution parameters which should be used as a base when running a
        container using the image. This field can be <code>null</code>, in
        which case any execution parameters should be specified at creation of
        the container.
        <h4>Container RunConfig Field Descriptions</h4>
        <dl>
            <dt>
                User <code>string</code>
            </dt>
            <dd>
                <p>The username or UID which the process in the container should
                run as. This acts as a default value to use when the value is
                not specified when creating a container.</p>
                <p>All of the following are valid:</p>
                <ul>
                    <li><code>user</code></li>
                    <li><code>uid</code></li>
                    <li><code>user:group</code></li>
                    <li><code>uid:gid</code></li>
                    <li><code>uid:group</code></li>
                    <li><code>user:gid</code></li>
                </ul>
                <p>If <code>group</code>/<code>gid</code> is not specified, the
                default group and supplementary groups of the given
                <code>user</code>/<code>uid</code> in <code>/etc/passwd</code>
                from the container are applied.</p>
            </dd>
            <dt>
                Memory <code>integer</code>
            </dt>
            <dd>
                Memory limit (in bytes). This acts as a default value to use
                when the value is not specified when creating a container.
            </dd>
            <dt>
                MemorySwap <code>integer</code>
            </dt>
            <dd>
                Total memory usage (memory + swap); set to <code>-1</code> to
                disable swap. This acts as a default value to use when the
                value is not specified when creating a container.
            </dd>
            <dt>
                CpuShares <code>integer</code>
            </dt>
            <dd>
                CPU shares (relative weight vs. other containers). This acts as
                a default value to use when the value is not specified when
                creating a container.
            </dd>
            <dt>
                ExposedPorts <code>struct</code>
            </dt>
            <dd>
                A set of ports to expose from a container running this image.
                This JSON structure value is unusual because it is a direct
                JSON serialization of the Go type
                <code>map[string]struct{}</code> and is represented in JSON as
                an object mapping its keys to an empty object. Here is an
                example:
<pre>{
    "8080": {},
    "53/udp": {},
    "2356/tcp": {}
}</pre>
                Its keys can be in the format of:
                <ul>
                    <li>
                        <code>"port/tcp"</code>
                    </li>
                    <li>
                        <code>"port/udp"</code>
                    </li>
                    <li>
                        <code>"port"</code>
                    </li>
                </ul>
                with the default protocol being <code>"tcp"</code> if not
                specified. These values act as defaults and are merged with
                any specified when creating a container.
            </dd>
            <dt>
                Env <code>array of strings</code>
            </dt>
            <dd>
                Entries are in the format of <code>VARNAME="var value"</code>.
                These values act as defaults and are merged with any specified
                when creating a container.
            </dd>
            <dt>
                Entrypoint <code>array of strings</code>
            </dt>
            <dd>
                A list of arguments to use as the command to execute when the
                container starts. This value acts as a default and is replaced
                by an entrypoint specified when creating a container.
            </dd>
            <dt>
                Cmd <code>array of strings</code>
            </dt>
            <dd>
                Default arguments to the entry point of the container. These
                values act as defaults and are replaced with any specified when
                creating a container. If an <code>Entrypoint</code> value is
                not specified, then the first entry of the <code>Cmd</code>
                array should be interpreted as the executable to run.
            </dd>
            <dt>
                Volumes <code>struct</code>
            </dt>
            <dd>
                A set of directories which should be created as data volumes in
                a container running this image. This JSON structure value is
                unusual because it is a direct JSON serialization of the Go
                type <code>map[string]struct{}</code> and is represented in
                JSON as an object mapping its keys to an empty object. Here is
                an example:
<pre>{
    "/var/my-app-data/": {},
    "/etc/some-config.d/": {}
}</pre>
            </dd>
            <dt>
                WorkingDir <code>string</code>
            </dt>
            <dd>
                Sets the current working directory of the entry point process
                in the container. This value acts as a default and is replaced
                by a working directory specified when creating a container.
            </dd>
        </dl>
    </dd>
</dl>

Any extra fields in the Image JSON struct are considered implementation
specific and should be ignored by any implementations which are unable to
interpret them.
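
As a rough illustration of how a consumer might read these fields, the following Python sketch interprets a few of the `config` values described above. The function names are hypothetical, and letting container-supplied `Env` entries override image defaults is an assumed merge policy, since the field description only says the values are merged:

```python
import json
from typing import Dict, List, Optional, Set, Tuple


def load_config(image_json_text: str) -> dict:
    """Parse Image JSON and return its config, treating null as empty.

    Unknown extra fields are simply left unread, which ignores them.
    """
    image = json.loads(image_json_text)
    return image.get("config") or {}


def exposed_ports(config: dict) -> Set[Tuple[int, str]]:
    """Normalize ExposedPorts keys to (port, protocol), defaulting to tcp."""
    ports = set()
    for key in config.get("ExposedPorts") or {}:
        port, _, protocol = key.partition("/")
        ports.add((int(port), protocol or "tcp"))
    return ports


def merge_env(image_env: List[str], container_env: List[str]) -> List[str]:
    """Merge VARNAME=value entries; later (container) entries win (assumed)."""
    merged: Dict[str, str] = {}
    for entry in list(image_env) + list(container_env):
        name, _, value = entry.partition("=")
        merged[name] = value
    return [f"{name}={value}" for name, value in merged.items()]


def command_line(config: dict,
                 entrypoint: Optional[List[str]] = None,
                 cmd: Optional[List[str]] = None) -> List[str]:
    """Build the argument vector; image values are defaults that get replaced."""
    entry = entrypoint if entrypoint is not None else (config.get("Entrypoint") or [])
    args = cmd if cmd is not None else (config.get("Cmd") or [])
    # With no entrypoint, the first Cmd entry naturally becomes the executable.
    return list(entry) + list(args)
```

Concatenating `Entrypoint` and `Cmd` covers both documented cases: when an entry point exists, `Cmd` supplies its default arguments, and when it does not, the first `Cmd` entry is the executable.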

## Creating an Image Filesystem Changeset

An example of creating an Image Filesystem Changeset follows.

An image root filesystem is first created as an empty directory named with the
ID of the image being created. Implementations need not name the rootfs
directory in this way, but it may be convenient for keeping a record of a large
number of image layers. Here is the initial empty directory structure for the
changeset for an image with ID `c3167915dc9d` ([real IDs are much
longer](#id_desc), but this example uses a truncated one for brevity):

```
c3167915dc9d/
```

Files and directories are then created:

```
c3167915dc9d/
    etc/
        my-app-config
    bin/
        my-app-binary
        my-app-tools
```

The `c3167915dc9d` directory is then committed as a plain Tar archive with
entries for the following files:

```
etc/my-app-config
bin/my-app-binary
bin/my-app-tools
```

The TarSum checksum for the archive file is then computed and placed in the
JSON metadata along with the execution parameters.
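
The commit step for a root layer can be sketched with Python's standard `tarfile` module. This is only an illustration under stated assumptions: `commit_rootfs` is a hypothetical name, only regular file entries are archived (as in the listing above), and the TarSum computation is left to the separate TarSum specification rather than reproduced here:

```python
import io
import os
import tarfile


def commit_rootfs(rootfs: str) -> bytes:
    """Pack every file under `rootfs` into a plain, uncompressed tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for dirpath, dirnames, filenames in os.walk(rootfs):
            dirnames.sort()  # visit subdirectories in a stable order
            for filename in sorted(filenames):
                path = os.path.join(dirpath, filename)
                # Entry names are relative to the rootfs, e.g. "etc/my-app-config".
                arcname = os.path.relpath(path, rootfs).replace(os.sep, "/")
                archive.add(path, arcname=arcname, recursive=False)
    return buffer.getvalue()
```

Calling `commit_rootfs("c3167915dc9d")` on the directory shown above would produce an archive whose entries are the three files listed, with names relative to the image root.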

To make changes to the filesystem of this container image, create a new
directory named with a new ID, such as `f60c56784b83`, and initialize it with
a snapshot of the parent image's root filesystem, so that the directory is
identical to that of `c3167915dc9d`. NOTE: a copy-on-write or union filesystem
can make this very efficient:

```
f60c56784b83/
    etc/
        my-app-config
    bin/
        my-app-binary
        my-app-tools
```

This example change is going to add a configuration directory at `/etc/my-app.d`
which contains a default config file. There's also a change to the
`my-app-tools` binary to handle the config layout change. The `f60c56784b83`
directory then looks like this:

```
f60c56784b83/
    etc/
        my-app.d/
            default.cfg
    bin/
        my-app-binary
        my-app-tools
```

This reflects the removal of `/etc/my-app-config` and creation of a file and
directory at `/etc/my-app.d/default.cfg`. `/bin/my-app-tools` has also been
replaced with an updated version. Before committing this directory to a
changeset, because it has a parent image, it is first compared with the
directory tree of the parent snapshot, `c3167915dc9d`, looking for files and
directories that have been added, modified, or removed. The following changeset
is found:

```
Added:      /etc/my-app.d/default.cfg
Modified:   /bin/my-app-tools
Deleted:    /etc/my-app-config
```

A Tar Archive is then created which contains *only* this changeset: the added
and modified files and directories in their entirety, and for each deleted item
an entry for an empty file at the same location but with the basename of the
deleted file or directory prefixed with `.wh.`. The filenames prefixed with
`.wh.` are known as "whiteout" files. NOTE: For this reason, it is not possible
to create an image root filesystem which contains a file or directory with a
name beginning with `.wh.`. The resulting Tar archive for `f60c56784b83` has
the following entries:

```
/etc/my-app.d/default.cfg
/bin/my-app-tools
/etc/.wh.my-app-config
```

Any given image is likely to be composed of several of these Image Filesystem
Changeset tar archives.
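
The comparison and whiteout naming described in this section can be sketched in Python as follows. To stay self-contained, the sketch models each snapshot as a mapping from file path to file content, which is an assumption made for illustration; a real implementation would walk directory trees and would also need to whiteout a removed directory as a single entry. All function names are hypothetical:

```python
import posixpath
from typing import Dict, List

WHITEOUT_PREFIX = ".wh."


def diff_snapshots(parent: Dict[str, bytes],
                   child: Dict[str, bytes]) -> Dict[str, List[str]]:
    """Compare two snapshots given as {path: content} mappings."""
    return {
        "added": sorted(p for p in child if p not in parent),
        "modified": sorted(p for p in child
                           if p in parent and child[p] != parent[p]),
        "deleted": sorted(p for p in parent if p not in child),
    }


def whiteout_path(path: str) -> str:
    """Name the whiteout entry: same location, basename prefixed with .wh."""
    directory, basename = posixpath.split(path)
    return posixpath.join(directory, WHITEOUT_PREFIX + basename)


def changeset_entries(parent: Dict[str, bytes],
                      child: Dict[str, bytes]) -> List[str]:
    """List the tar entry names for the changeset between two snapshots."""
    for path in child:
        if posixpath.basename(path).startswith(WHITEOUT_PREFIX):
            # Such names are reserved for whiteouts and cannot be stored.
            raise ValueError(f"reserved whiteout name in rootfs: {path}")
    changes = diff_snapshots(parent, child)
    entries = changes["added"] + changes["modified"]
    entries += [whiteout_path(p) for p in changes["deleted"]]
    return entries
```

Applied to the two snapshots in the example, `changeset_entries` returns the same three entry names as the archive listing above.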

## Combined Image JSON + Filesystem Changeset Format

There is also a format for a single archive which contains complete information
about an image, including:

 - repository names/tags
 - all image layer JSON files
 - the tar archive of each layer's filesystem changeset

For example, here's what the full archive of `library/busybox` looks like
(displayed in `tree` format):

```
.
├── 5785b62b697b99a5af6cd5d0aabc804d5748abbb6d3d07da5d1d3795f2dcc83e
│   ├── VERSION
│   ├── json
│   └── layer.tar
├── a7b8b41220991bfc754d7ad445ad27b7f272ab8b4a2c175b9512b97471d02a8a
│   ├── VERSION
│   ├── json
│   └── layer.tar
├── a936027c5ca8bf8f517923169a233e391cbb38469a75de8383b5228dc2d26ceb
│   ├── VERSION
│   ├── json
│   └── layer.tar
├── f60c56784b832dd990022afc120b8136ab3da9528094752ae13fe63a2d28dc8c
│   ├── VERSION
│   ├── json
│   └── layer.tar
└── repositories
```

There are one or more directories named with the ID for each layer in a full
image. Each of these directories contains 3 files:

 * `VERSION` - The schema version of the `json` file
 * `json` - The JSON metadata for an image layer
 * `layer.tar` - The Tar archive of the filesystem changeset for an image
   layer.

The content of the `VERSION` files is simply the semantic version of the JSON
metadata schema:

```
1.0
```

And the `repositories` file is another JSON file which describes names/tags:

```
{
    "busybox":{
        "latest":"5785b62b697b99a5af6cd5d0aabc804d5748abbb6d3d07da5d1d3795f2dcc83e"
    }
}
```

Every key in this object is the name of a repository, and maps to a collection
of tag suffixes. Each tag maps to the ID of the image represented by that tag.
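
A lookup against the `repositories` file might be sketched in Python as shown here. The function names are hypothetical, and falling back to a `latest` tag when the name carries no suffix is an assumption of this sketch, not something this format requires:

```python
import json
from typing import Dict, Optional


def resolve_image_id(repositories_text: str, name: str,
                     default_tag: str = "latest") -> Optional[str]:
    """Map a name such as "busybox:latest" to an image ID, or None if unknown."""
    repositories = json.loads(repositories_text)
    repository, separator, tag = name.rpartition(":")
    if not separator:
        # No tag suffix given; the default tag is an assumed convention.
        repository, tag = name, default_tag
    return repositories.get(repository, {}).get(tag)


def layer_members(image_id: str) -> Dict[str, str]:
    """Return the archive paths of the three files kept for one layer."""
    return {
        "version": f"{image_id}/VERSION",
        "json": f"{image_id}/json",
        "layer": f"{image_id}/layer.tar",
    }
```

Once a tag has been resolved to an ID, `layer_members` gives the paths inside the combined archive where that layer's schema version, JSON metadata, and changeset can be found.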

## Loading an Image Filesystem Changeset

Unpacking a bundle of image layer JSON files and their corresponding filesystem
changesets can be done using a series of steps:

1. Follow the parent IDs of image layers to find the root ancestor (an image
with no parent ID specified).
2. For every image layer, in order from the root ancestor down, extract the
contents of that layer's filesystem changeset archive into a directory which
will be used as the root of a container filesystem.

    - Extract all contents of each archive.
    - Walk the directory tree once more, removing any files with the prefix
    `.wh.` and the corresponding file or directory named without this prefix.
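
The steps above can be sketched in Python. As a simplifying assumption, the container root filesystem and each changeset are modeled as in-memory mappings from path to file content instead of real directories and tar archives; the function names are hypothetical:

```python
import posixpath
from typing import Dict, List, Optional

WHITEOUT_PREFIX = ".wh."


def order_layers(parents: Dict[str, Optional[str]], leaf: str) -> List[str]:
    """Return layer IDs from the root ancestor down to `leaf`, rejecting cycles."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = leaf
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at layer {current}")
        seen.add(current)
        chain.append(current)
        current = parents[current]  # None marks the root ancestor
    chain.reverse()
    return chain


def apply_changeset(rootfs: Dict[str, bytes], changeset: Dict[str, bytes]) -> None:
    """Extract every entry, then remove whiteouts and the paths they hide."""
    rootfs.update(changeset)
    whiteouts = [p for p in rootfs
                 if posixpath.basename(p).startswith(WHITEOUT_PREFIX)]
    for path in whiteouts:
        directory, basename = posixpath.split(path)
        target = posixpath.join(directory, basename[len(WHITEOUT_PREFIX):])
        for existing in list(rootfs):
            # Drop the whiteout itself, the hidden file, or a hidden subtree.
            if existing in (path, target) or existing.startswith(target + "/"):
                del rootfs[existing]


def build_rootfs(parents: Dict[str, Optional[str]],
                 changesets: Dict[str, Dict[str, bytes]],
                 leaf: str) -> Dict[str, bytes]:
    """Apply each layer's changeset in order, starting at the root ancestor."""
    rootfs: Dict[str, bytes] = {}
    for layer_id in order_layers(parents, leaf):
        apply_changeset(rootfs, changesets[layer_id])
    return rootfs
```

Tracking visited IDs while following parents keeps the walk from looping forever if the metadata accidentally contains a cycle.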

## Implementations

This specification is an admittedly imperfect description of an
imperfectly-understood problem. The Docker project is, in turn, an attempt to
implement this specification. Our goal and our execution toward it will evolve
over time, but our primary concern in this specification and in our
implementation is compatibility and interoperability.