<!--[metadata]>
+++
aliases = ["/engine/articles/dockerfile_best-practices/", "/docker-cloud/getting-started/intermediate/optimize-dockerfiles/", "/docker-cloud/tutorials/optimize-dockerfiles/"]
title = "Best practices for writing Dockerfiles"
description = "Hints, tips and guidelines for writing clean, reliable Dockerfiles"
keywords = ["Examples, Usage, base image, docker, documentation, dockerfile, best practices, hub,  official repo"]
[menu.main]
parent = "engine_images"
+++
<![end-metadata]-->

# Best practices for writing Dockerfiles

Docker can build images automatically by reading the instructions from a
`Dockerfile`, a text file that contains all the commands, in order, needed to
build a given image. `Dockerfile`s adhere to a specific format and use a
specific set of instructions. You can learn the basics on the
[Dockerfile Reference](../../reference/builder.md) page. If
you’re new to writing `Dockerfile`s, you should start there.

This document covers the best practices and methods recommended by Docker,
Inc. and the Docker community for creating easy-to-use, effective
`Dockerfile`s. We strongly suggest you follow these recommendations (in fact,
if you’re creating an Official Image, you *must* adhere to these practices).

You can see many of these practices and recommendations in action in the [buildpack-deps `Dockerfile`](https://github.com/docker-library/buildpack-deps/blob/master/jessie/Dockerfile).

> Note: for more detailed explanations of any of the Dockerfile commands
> mentioned here, visit the [Dockerfile Reference](../../reference/builder.md) page.

## General guidelines and recommendations

### Containers should be ephemeral

The container produced by the image your `Dockerfile` defines should be as
ephemeral as possible. By “ephemeral,” we mean that it can be stopped and
destroyed and a new one built and put in place with an absolute minimum of
set-up and configuration.

### Use a .dockerignore file

In most cases, it's best to put each Dockerfile in an empty directory. Then,
add to that directory only the files needed for building the Dockerfile. To
increase the build's performance, you can exclude files and directories by
adding a `.dockerignore` file to that directory as well. This file supports
exclusion patterns similar to `.gitignore` files. For information on creating one,
see the [.dockerignore file](../../reference/builder.md#dockerignore-file).

### Avoid installing unnecessary packages

In order to reduce complexity, dependencies, file sizes, and build times, you
should avoid installing extra or unnecessary packages just because they
might be “nice to have.” For example, you don’t need to include a text editor
in a database image.

### Run only one process per container

In almost all cases, you should run only a single process in a single
container. Decoupling applications into multiple containers makes it much
easier to scale horizontally and reuse containers. If a service depends on
another service, make use of [container linking](../../userguide/networking/default_network/dockerlinks.md).

### Minimize the number of layers

You need to find the balance between readability (and thus long-term
maintainability) of the `Dockerfile` and minimizing the number of layers it
uses. Be strategic and cautious about the number of layers you use.

### Sort multi-line arguments

Whenever possible, ease later changes by sorting multi-line arguments
alphanumerically. This will help you avoid duplication of packages and make the
list much easier to update. This also makes PRs a lot easier to read and
review. Adding a space before a backslash (`\`) helps as well.

Here’s an example from the [`buildpack-deps` image](https://github.com/docker-library/buildpack-deps):

    RUN apt-get update && apt-get install -y \
      bzr \
      cvs \
      git \
      mercurial \
      subversion

### Build cache

During the process of building an image, Docker steps through the
instructions in your `Dockerfile`, executing each in the order specified.
As each instruction is examined, Docker looks for an existing image in its
cache that it can reuse, rather than creating a new (duplicate) image.
If you do not want to use the cache at all, you can use the `--no-cache=true`
option on the `docker build` command.

However, if you do let Docker use its cache, then it is very important to
understand when it will, and will not, find a matching image. The basic rules
that Docker follows are outlined below:

* Starting with a base image that is already in the cache, the next
instruction is compared against all child images derived from that base
image to see if one of them was built using the exact same instruction. If
not, the cache is invalidated.

* In most cases, simply comparing the instruction in the `Dockerfile` with one
of the child images is sufficient. However, certain instructions require
a little more examination and explanation.

* For the `ADD` and `COPY` instructions, the contents of the file(s)
in the image are examined and a checksum is calculated for each file.
The last-modified and last-accessed times of the file(s) are not considered in
these checksums. During the cache lookup, the checksum is compared against the
checksum in the existing images. If anything has changed in the file(s), such
as the contents and metadata, then the cache is invalidated.

* Aside from the `ADD` and `COPY` commands, cache checking does not look at the
files in the container to determine a cache match. For example, when processing
a `RUN apt-get -y update` command, the files updated in the container
are not examined to determine if a cache hit exists. In that case, just
the command string itself is used to find a match.

Once the cache is invalidated, all subsequent `Dockerfile` commands
generate new images and the cache is not used.
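
As a rough illustration of these rules, the sketch below annotates where a
rebuild would reuse or miss the cache. The base image tag, the `config.ini`
file, and the `/etc/myapp` paths are placeholders chosen for this example and
do not come from any real project.

```dockerfile
# Hypothetical image used only to illustrate cache lookups.
FROM debian:stable

# Matched on the command string alone: reused while this text is unchanged,
# even if newer packages have been published since the layer was built.
RUN apt-get update && apt-get install -y \
    curl \
 && rm -rf /var/lib/apt/lists/*

# Matched on a checksum of config.ini (a placeholder file in the build
# context): editing the file invalidates the cache from this step onward.
COPY config.ini /etc/myapp/config.ini

# Rebuilt whenever the COPY above misses the cache, although the command
# string itself has not changed.
RUN echo "configured" > /etc/myapp/.stamp
```

With this ordering, editing `config.ini` should leave the package installation
layer cached and rebuild only the last two steps.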

## The Dockerfile instructions

Below you'll find recommendations for the best way to write the
various instructions available for use in a `Dockerfile`.

### FROM

[Dockerfile reference for the FROM instruction](../../reference/builder.md#from)

Whenever possible, use current Official Repositories as the basis for your
image. We recommend the [Debian image](https://hub.docker.com/_/debian/)
since it’s very tightly controlled and kept minimal (currently under 150 MB),
while still being a full distribution.

### LABEL

[Understanding object labels](../labels-custom-metadata.md)

You can add labels to your image to help organize images by project, record
licensing information, to aid in automation, or for other reasons. For each
label, add a line beginning with `LABEL` and with one or more key-value pairs.
The following examples show the different acceptable formats. Explanatory comments
are included inline.

> **Note**: If your string contains spaces, it must be quoted **or** the spaces
> must be escaped. If your string contains inner quote characters (`"`), escape
> them as well.

```dockerfile
# Set one or more individual labels
LABEL com.example.version="0.0.1-beta"
LABEL vendor="ACME Incorporated"
LABEL com.example.release-date="2015-02-12"
LABEL com.example.version.is-production=""

# Set multiple labels on one line
LABEL com.example.version="0.0.1-beta" com.example.release-date="2015-02-12"

# Set multiple labels at once, using line-continuation characters to break long lines
LABEL vendor=ACME\ Incorporated \
      com.example.is-beta= \
      com.example.is-production="" \
      com.example.version="0.0.1-beta" \
      com.example.release-date="2015-02-12"
```

See [Understanding object labels](../labels-custom-metadata.md) for
guidelines about acceptable label keys and values. For information about
querying labels, refer to the items related to filtering in
[Managing labels on objects](../labels-custom-metadata.md#managing-labels-on-objects).

### RUN

[Dockerfile reference for the RUN instruction](../../reference/builder.md#run)

As always, to make your `Dockerfile` more readable, understandable, and
maintainable, split long or complex `RUN` statements on multiple lines separated
with backslashes.

#### apt-get

Probably the most common use-case for `RUN` is an application of `apt-get`. The
`RUN apt-get` command, because it installs packages, has several gotchas to look
out for.

You should avoid `RUN apt-get upgrade` or `dist-upgrade`, as many of the
“essential” packages from the base images won't upgrade inside an unprivileged
container. If a package contained in the base image is out-of-date, you should
contact its maintainers.
If you know there’s a particular package, `foo`, that needs to be updated, use
`apt-get install -y foo` to update automatically.

Always combine `RUN apt-get update` with `apt-get install` in the same `RUN`
statement, for example:

    RUN apt-get update && apt-get install -y \
        package-bar \
        package-baz \
        package-foo

Using `apt-get update` alone in a `RUN` statement causes caching issues, and
subsequent `apt-get install` instructions can fail.
For example, say you have a Dockerfile:

    FROM ubuntu:14.04
    RUN apt-get update
    RUN apt-get install -y curl

After building the image, all layers are in the Docker cache. Suppose you later
modify `apt-get install` by adding an extra package:

    FROM ubuntu:14.04
    RUN apt-get update
    RUN apt-get install -y curl nginx

Docker sees the `RUN apt-get update` instruction as unchanged and reuses the
cached layer from the previous build. As a result, `apt-get update` is *NOT*
executed because the build uses the cached version. Because `apt-get update` is
not run, your build can potentially get an outdated version of the `curl` and
`nginx` packages.

Using `RUN apt-get update && apt-get install -y` ensures your Dockerfile
installs the latest package versions with no further coding or manual
intervention. This technique is known as "cache busting". You can also achieve
cache-busting by specifying a package version. This is known as version pinning,
for example:

    RUN apt-get update && apt-get install -y \
        package-bar \
        package-baz \
        package-foo=1.3.*

Version pinning forces the build to retrieve a particular version regardless of
what’s in the cache. This technique can also reduce failures due to unanticipated changes
in required packages.

Below is a well-formed `RUN` instruction that demonstrates all the `apt-get`
recommendations.

    RUN apt-get update && apt-get install -y \
        aufs-tools \
        automake \
        build-essential \
        curl \
        dpkg-sig \
        libcap-dev \
        libsqlite3-dev \
        mercurial \
        reprepro \
        ruby1.9.1 \
        ruby1.9.1-dev \
        s3cmd=1.1.* \
     && rm -rf /var/lib/apt/lists/*

The `s3cmd` line specifies the version `1.1.*`. If the image previously
used an older version, specifying the new one causes a cache bust of `apt-get
update` and ensures the installation of the new version. Listing packages on
each line can also prevent mistakes in package duplication.

In addition, cleaning up the apt cache and removing `/var/lib/apt/lists` helps
keep the image size down. Since the `RUN` statement starts with
`apt-get update`, the package cache will always be refreshed prior to
`apt-get install`.

> **Note**: The official Debian and Ubuntu images [automatically run `apt-get clean`](https://github.com/docker/docker/blob/03e2923e42446dbb830c654d0eec323a0b4ef02a/contrib/mkimage/debootstrap#L82-L105),
> so explicit invocation is not required.

### CMD

[Dockerfile reference for the CMD instruction](../../reference/builder.md#cmd)

The `CMD` instruction should be used to run the software contained by your
image, along with any arguments. `CMD` should almost always be used in the
form of `CMD ["executable", "param1", "param2"…]`. Thus, if the image is for a
service, such as Apache or Rails, you would run something like
`CMD ["apache2","-DFOREGROUND"]`. Indeed, this form of the instruction is
recommended for any service-based image.

In most other cases, `CMD` should be given an interactive shell, such as bash, python,
or perl. For example, `CMD ["perl", "-de0"]`, `CMD ["python"]`, or
`CMD ["php", "-a"]`. Using this form means that when you execute something like
`docker run -it python`, you’ll get dropped into a usable shell, ready to go.
`CMD` should rarely be used in the manner of `CMD ["param", "param"]` in
conjunction with [`ENTRYPOINT`](../../reference/builder.md#entrypoint), unless
you and your expected users are already quite familiar with how `ENTRYPOINT`
works.
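
A minimal sketch of the interactive-shell case follows. The Debian base image
and the `python3` package are assumptions made for illustration; they are not
prescribed by this guide.

```dockerfile
# Assumed base image and package; any interpreter would work similarly.
FROM debian:stable
RUN apt-get update && apt-get install -y \
    python3 \
 && rm -rf /var/lib/apt/lists/*

# Exec (JSON array) form with straight double quotes: the interpreter is
# started directly rather than through `/bin/sh -c`.
CMD ["python3"]
```

If this were built and tagged as, say, `my-python` (a made-up name),
`docker run -it my-python` should start the Python prompt, and any command
given after the image name would replace the default `CMD`.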

### EXPOSE

[Dockerfile reference for the EXPOSE instruction](../../reference/builder.md#expose)

The `EXPOSE` instruction indicates the ports on which a container will listen
for connections. Consequently, you should use the common, traditional port for
your application. For example, an image containing the Apache web server would
use `EXPOSE 80`, while an image containing MongoDB would use `EXPOSE 27017` and
so on.

For external access, your users can execute `docker run` with a flag indicating
how to map the specified port to the port of their choice.
For container linking, Docker provides environment variables for the path from
the recipient container back to the source (e.g., `MYSQL_PORT_3306_TCP`).

### ENV

[Dockerfile reference for the ENV instruction](../../reference/builder.md#env)

In order to make new software easier to run, you can use `ENV` to update the
`PATH` environment variable for the software your container installs. For
example, `ENV PATH /usr/local/nginx/bin:$PATH` will ensure that `CMD ["nginx"]`
just works.

The `ENV` instruction is also useful for providing required environment
variables specific to services you wish to containerize, such as Postgres’s
`PGDATA`.

Lastly, `ENV` can also be used to set commonly used version numbers so that
version bumps are easier to maintain, as seen in the following example:

    ENV PG_MAJOR 9.3
    ENV PG_VERSION 9.3.4
    RUN curl -SL http://example.com/postgres-$PG_VERSION.tar.xz | tar -xJC /usr/src/postgres && …
    ENV PATH /usr/local/postgres-$PG_MAJOR/bin:$PATH

Similar to having constant variables in a program (as opposed to hard-coding
values), this approach lets you change a single `ENV` instruction to
auto-magically bump the version of the software in your container.

### ADD or COPY

[Dockerfile reference for the ADD instruction](../../reference/builder.md#add)<br/>
[Dockerfile reference for the COPY instruction](../../reference/builder.md#copy)

Although `ADD` and `COPY` are functionally similar, generally speaking, `COPY`
is preferred. That’s because it’s more transparent than `ADD`. `COPY` only
supports the basic copying of local files into the container, while `ADD` has
some features (like local-only tar extraction and remote URL support) that are
not immediately obvious. Consequently, the best use for `ADD` is local tar file
auto-extraction into the image, as in `ADD rootfs.tar.xz /`.

If you have multiple `Dockerfile` steps that use different files from your
context, `COPY` them individually, rather than all at once. This will ensure that
each step's build cache is only invalidated (forcing the step to be re-run) if the
specifically required files change.

For example:

    COPY requirements.txt /tmp/
    RUN pip install --requirement /tmp/requirements.txt
    COPY . /tmp/

This ordering results in fewer cache invalidations for the `RUN` step than if
you put the `COPY . /tmp/` before it.

Because image size matters, using `ADD` to fetch packages from remote URLs is
strongly discouraged; you should use `curl` or `wget` instead. That way you can
delete the files you no longer need after they've been extracted and you won't
have to add another layer in your image. For example, you should avoid doing
things like:

    ADD http://example.com/big.tar.xz /usr/src/things/
    RUN tar -xJf /usr/src/things/big.tar.xz -C /usr/src/things
    RUN make -C /usr/src/things all

And instead, do something like:

    RUN mkdir -p /usr/src/things \
        && curl -SL http://example.com/big.tar.xz \
        | tar -xJC /usr/src/things \
        && make -C /usr/src/things all

For other items (files, directories) that do not require `ADD`’s tar
auto-extraction capability, you should always use `COPY`.

### ENTRYPOINT

[Dockerfile reference for the ENTRYPOINT instruction](../../reference/builder.md#entrypoint)

The best use for `ENTRYPOINT` is to set the image's main command, allowing that
image to be run as though it were that command (and then use `CMD` as the
default flags).

Let's start with an example of an image for the command line tool `s3cmd`:

    ENTRYPOINT ["s3cmd"]
    CMD ["--help"]

Now the image can be run like this to show the command's help:

    $ docker run s3cmd

Or using the right parameters to execute a command:

    $ docker run s3cmd ls s3://mybucket

This is useful because the image name can double as a reference to the binary as
shown in the command above.
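
For context, a complete `Dockerfile` for such an image might look like the
sketch below. It assumes `s3cmd` is installed from the Debian package of the
same name; the base image is likewise an assumption.

```dockerfile
# Assumed base image and installation method, for illustration only.
FROM debian:stable
RUN apt-get update && apt-get install -y \
    s3cmd \
 && rm -rf /var/lib/apt/lists/*

# The image's main command.
ENTRYPOINT ["s3cmd"]
# Default arguments, replaced by anything passed after the image name
# on the `docker run` command line.
CMD ["--help"]
```

Building this with `docker build -t s3cmd .` would give the image the name used
in the `docker run` commands above.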

The `ENTRYPOINT` instruction can also be used in combination with a helper
script, allowing it to function in a similar way to the command above, even
when starting the tool may require more than one step.

For example, the [Postgres Official Image](https://hub.docker.com/_/postgres/)
uses the following script as its `ENTRYPOINT`:

```bash
#!/bin/bash
set -e

if [ "$1" = 'postgres' ]; then
    chown -R postgres "$PGDATA"

    if [ -z "$(ls -A "$PGDATA")" ]; then
        gosu postgres initdb
    fi

    exec gosu postgres "$@"
fi

exec "$@"
```

> **Note**:
> This script uses [the `exec` Bash command](http://wiki.bash-hackers.org/commands/builtin/exec)
> so that the final running application becomes the container's PID 1. This allows
> the application to receive any Unix signals sent to the container.
> See the [`ENTRYPOINT`](../../reference/builder.md#entrypoint)
> help for more details.

The helper script is copied into the container and run via `ENTRYPOINT` on
container start:

    COPY ./docker-entrypoint.sh /
    ENTRYPOINT ["/docker-entrypoint.sh"]

This script allows the user to interact with Postgres in several ways.

It can simply start Postgres:

    $ docker run postgres

Or, it can be used to run Postgres and pass parameters to the server:

    $ docker run postgres postgres --help

Lastly, it could also be used to start a totally different tool, such as Bash:

    $ docker run --rm -it postgres bash

### VOLUME

[Dockerfile reference for the VOLUME instruction](../../reference/builder.md#volume)

The `VOLUME` instruction should be used to expose any database storage area,
configuration storage, or files/folders created by your Docker container. You
are strongly encouraged to use `VOLUME` for any mutable and/or user-serviceable
parts of your image.
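
As a small sketch, a service that keeps mutable data in one directory could
declare it as follows. The base image and the `/var/lib/myapp/data` path are
hypothetical and stand in for whatever your service actually uses.

```dockerfile
FROM debian:stable
# Hypothetical data directory for an imaginary service.
RUN mkdir -p /var/lib/myapp/data
# Declared after the directory has been prepared: build steps that modify
# the volume's contents after this line may have their changes discarded.
VOLUME /var/lib/myapp/data
```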

### USER

[Dockerfile reference for the USER instruction](../../reference/builder.md#user)

If a service can run without privileges, use `USER` to change to a non-root
user. Start by creating the user and group in the `Dockerfile` with something
like `RUN groupadd -r postgres && useradd -r -g postgres postgres`.

> **Note:** Users and groups in an image get a non-deterministic
> UID/GID in that the “next” UID/GID gets assigned regardless of image
> rebuilds. So, if it’s critical, you should assign an explicit UID/GID.

You should avoid installing or using `sudo` since it has unpredictable TTY and
signal-forwarding behavior that can cause more problems than it solves. If
you absolutely need functionality similar to `sudo` (e.g., initializing the
daemon as root but running it as non-root), you may be able to use
[“gosu”](https://github.com/tianon/gosu).

Lastly, to reduce layers and complexity, avoid switching `USER` back
and forth frequently.
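
One way to pin the IDs is sketched below. The value `999` is an arbitrary
choice for this example, and the base image is an assumption; pick IDs that
suit your environment.

```dockerfile
FROM debian:stable
# Explicit GID and UID (999 is arbitrary) so the IDs stay the same across
# rebuilds instead of depending on whatever the "next" free ID happens to be.
RUN groupadd -r -g 999 postgres \
 && useradd -r -u 999 -g postgres postgres
# Switch once, after all steps that need root have run.
USER postgres
```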

### WORKDIR

[Dockerfile reference for the WORKDIR instruction](../../reference/builder.md#workdir)

For clarity and reliability, you should always use absolute paths for your
`WORKDIR`. Also, you should use `WORKDIR` instead of proliferating
instructions like `RUN cd … && do-something`, which are hard to read,
troubleshoot, and maintain.
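
A short sketch of the difference, using a placeholder path and a placeholder
command:

```dockerfile
FROM debian:stable
# Absolute path; WORKDIR creates the directory if it does not exist yet.
WORKDIR /usr/src/app
# Runs inside /usr/src/app, so no `cd` prefix is needed here or in any
# later RUN, CMD, or ENTRYPOINT instruction.
RUN touch build.marker
```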

### ONBUILD

[Dockerfile reference for the ONBUILD instruction](../../reference/builder.md#onbuild)

An `ONBUILD` command executes after the current `Dockerfile` build completes.
`ONBUILD` executes in any child image derived `FROM` the current image. Think
of the `ONBUILD` command as an instruction the parent `Dockerfile` gives
to the child `Dockerfile`.

A Docker build executes `ONBUILD` commands before any command in a child
`Dockerfile`.

`ONBUILD` is useful for images that are going to be built `FROM` a given
image. For example, you would use `ONBUILD` for a language stack image that
builds arbitrary user software written in that language within the
`Dockerfile`, as you can see in [Ruby’s `ONBUILD` variants](https://github.com/docker-library/ruby/blob/master/2.1/onbuild/Dockerfile).

Images built from `ONBUILD` should get a separate tag, for example:
`ruby:1.9-onbuild` or `ruby:2.0-onbuild`.

Be careful when putting `ADD` or `COPY` in `ONBUILD`. The “onbuild” image will
fail catastrophically if the new build's context is missing the resource being
added. Adding a separate tag, as recommended above, will help mitigate this by
allowing the `Dockerfile` author to make a choice.
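
A minimal sketch of a parent image that defers a `COPY` to its children is
shown below. The image name `my-stack:1.0-onbuild` and the paths are made up
for illustration.

```dockerfile
# Parent image, tagged for example as my-stack:1.0-onbuild (a made-up name).
FROM debian:stable
WORKDIR /usr/src/app
# Not executed during this build. Recorded as a trigger that runs right
# after the FROM line of any child Dockerfile, copying from the child's
# build context.
ONBUILD COPY . /usr/src/app
```

A child `Dockerfile` consisting only of `FROM my-stack:1.0-onbuild` would then
copy its own build context into `/usr/src/app` before any of its other
instructions run.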

## Examples for Official Repositories

These Official Repositories have exemplary `Dockerfile`s:

* [Go](https://hub.docker.com/_/golang/)
* [Perl](https://hub.docker.com/_/perl/)
* [Hy](https://hub.docker.com/_/hylang/)
* [Rails](https://hub.docker.com/_/rails)

## Additional resources

* [Dockerfile Reference](../../reference/builder.md)
* [More about Base Images](baseimages.md)
* [More about Automated Builds](https://docs.docker.com/docker-hub/builds/)
* [Guidelines for Creating Official Repositories](https://docs.docker.com/docker-hub/official_repos/)