github.com/vieux/docker@v0.6.3-0.20161004191708-e097c2a938c7/docs/security/security.md

     1  <!--[metadata]>
     2  +++
     3  aliases = ["/engine/articles/security/"]
     4  title = "Docker security"
     5  description = "Review of the Docker Daemon attack surface"
     6  keywords = ["Docker, Docker documentation,  security"]
     7  [menu.main]
     8  parent = "smn_secure_docker"
     9  weight =-99
    10  +++
    11  <![end-metadata]-->
    12  
    13  # Docker security
    14  
    15  There are four major areas to consider when reviewing Docker security:
    16  
    17   - the intrinsic security of the kernel and its support for
    18     namespaces and cgroups;
    19   - the attack surface of the Docker daemon itself;
    20   - loopholes in the container configuration profile, either by default,
    21     or when customized by users;
    22   - the "hardening" security features of the kernel and how they
    23     interact with containers.
    24  
    25  ## Kernel namespaces
    26  
    27  Docker containers are very similar to LXC containers, and they have
    28  similar security features. When you start a container with
    29  `docker run`, behind the scenes Docker creates a set of namespaces and control
    30  groups for the container.
    31  
    32  **Namespaces provide the first and most straightforward form of
    33  isolation**: processes running within a container cannot see, and even
    34  less affect, processes running in another container, or in the host
    35  system.
    36  
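One way to observe this isolation is to compare the namespace identifiers that Linux exposes under `/proc/<pid>/ns`. The following is a minimal sketch, not part of Docker itself: the helper names (`parse_ns_link`, `read_namespaces`, `shared_namespaces`) are hypothetical, and it assumes a Linux host where you are allowed to read the `ns` entries of the processes you inspect. Two processes that report different identifiers for the same kind of namespace cannot see each other's resources of that kind.

```python
import os
import re

# Each entry in /proc/<pid>/ns is a symlink whose target looks like
# "pid:[4026531836]": the namespace kind, then its inode number.
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (kind, inode number)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError("not a namespace link: %r" % (target,))
    return match.group("kind"), int(match.group("inode"))


def read_namespaces(pid="self", proc_root="/proc"):
    """Return {entry name: inode} for the namespaces of one process."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    namespaces = {}
    for name in sorted(os.listdir(ns_dir)):
        _kind, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        namespaces[name] = inode
    return namespaces


def shared_namespaces(first, second):
    """List the entries for which two processes report the same inode."""
    return sorted(k for k in first if k in second and first[k] == second[k])
```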
    37  **Each container also gets its own network stack**, meaning that a
    38  container doesn't get privileged access to the sockets or interfaces
    39  of another container. Of course, if the host system is set up
    40  accordingly, containers can interact with each other through their
    41  respective network interfaces, just like they can interact with
    42  external hosts. When you specify public ports for your containers or use
    43  [*links*](../userguide/networking/default_network/dockerlinks.md)
    44  then IP traffic is allowed between containers. They can ping each other,
    45  send/receive UDP packets, and establish TCP connections, but that can be
    46  restricted if necessary. From a network architecture point of view, all
    47  containers on a given Docker host are sitting on bridge interfaces. This
    48  means that they are just like physical machines connected through a
    49  common Ethernet switch; no more, no less.
    50  
    51  How mature is the code providing kernel namespaces and private
    52  networking? Kernel namespaces were introduced [between kernel versions
    53  2.6.15 and
    54  2.6.26](http://man7.org/linux/man-pages/man7/namespaces.7.html).
    55  This means that since July 2008 (the date of the 2.6.26 release),
    56  namespace code has been exercised and scrutinized on a large
    57  number of production systems. And there is more: the design and
    58  inspiration for the namespaces code are even older. Namespaces are
    59  actually an effort to reimplement the features of [OpenVZ](
    60  http://en.wikipedia.org/wiki/OpenVZ) in such a way that they could be
    61  merged within the mainstream kernel. And OpenVZ was initially released
    62  in 2005, so both the design and the implementation are pretty mature.
    63  
    64  ## Control groups
    65  
    66  Control Groups are another key component of Linux Containers. They
    67  implement resource accounting and limiting. They provide many
    68  useful metrics, but they also help ensure that each container gets
    69  its fair share of memory, CPU, and disk I/O; and, more importantly, that a
    70  single container cannot bring the system down by exhausting one of those
    71  resources.
    72  
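To see which control groups a process has been placed in, you can read `/proc/<pid>/cgroup`, where each line has the form `hierarchy-ID:controller-list:cgroup-path`. The sketch below only parses that format; the function names are hypothetical, and the `/docker/abc123` path in the sample is an assumed layout used for illustration, since the actual path depends on how the host and daemon are configured.

```python
def parse_cgroup_lines(text):
    """Parse 'hierarchy-ID:controller-list:cgroup-path' lines into dicts."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The path itself may contain ':', so split at most twice.
        hierarchy_id, controllers, path = line.split(":", 2)
        entries.append({
            "hierarchy": int(hierarchy_id),
            "controllers": [c for c in controllers.split(",") if c],
            "path": path,
        })
    return entries


def cgroup_path_for(entries, controller):
    """Return the cgroup path governing one controller, or None."""
    for entry in entries:
        if controller in entry["controllers"]:
            return entry["path"]
    return None


# Hypothetical sample, shaped like the contents of /proc/self/cgroup.
SAMPLE = "4:memory:/docker/abc123\n3:cpu,cpuacct:/docker/abc123\n"
entries = parse_cgroup_lines(SAMPLE)
memory_group = cgroup_path_for(entries, "memory")
```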
    73  So while they do not play a role in preventing one container from
    74  accessing or affecting the data and processes of another container, they
    75  are essential to fend off some denial-of-service attacks. They are
    76  particularly important on multi-tenant platforms, like public and
    77  private PaaS, to guarantee a consistent uptime (and performance) even
    78  when some applications start to misbehave.
    79  
    80  Control Groups have been around for a while as well: the code was
    81  started in 2006, and initially merged in kernel 2.6.24.
    82  
    83  ## Docker daemon attack surface
    84  
    85  Running containers (and applications) with Docker implies running the
    86  Docker daemon. This daemon currently requires `root` privileges, and you
    87  should therefore be aware of some important details.
    88  
    89  First of all, **only trusted users should be allowed to control your
    90  Docker daemon**. This is a direct consequence of some powerful Docker
    91  features. Specifically, Docker allows you to share a directory between
    92  the Docker host and a guest container; and it allows you to do so
    93  without limiting the access rights of the container. This means that you
    94  can start a container where the `/host` directory will be the `/` directory
    95  on your host; and the container will be able to alter your host filesystem
    96  without any restriction. This is similar to how virtualization systems
    97  allow filesystem resource sharing. Nothing prevents you from sharing your
    98  root filesystem (or even your root block device) with a virtual machine.
    99  
   100  This has a strong security implication: for example, if you instrument Docker
   101  from a web server to provision containers through an API, you should be
   102  even more careful than usual with parameter checking, to make sure that
   103  a malicious user cannot pass crafted parameters causing Docker to create
   104  arbitrary containers.
   105  
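As an illustration of that kind of parameter checking, a provisioning service could refuse any request that asks to share a host directory outside a small set of approved locations. This is a minimal sketch under stated assumptions: `is_allowed_bind_source` and the `/srv/data` root are hypothetical names, and real input validation would also need to cover the other parameters a caller can influence.

```python
import os


def is_allowed_bind_source(host_path, allowed_roots):
    """Return True only if host_path resolves inside one of allowed_roots.

    Paths are resolved first so that '..' components and symlinks cannot
    be used to step outside an approved directory.
    """
    resolved = os.path.realpath(host_path)
    for root in allowed_roots:
        root = os.path.realpath(root)
        # commonpath compares whole path components, so "/srv/database"
        # is not mistaken for a child of "/srv/data".
        if os.path.commonpath([resolved, root]) == root:
            return True
    return False
```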
   106  For this reason, the REST API endpoint (used by the Docker CLI to
   107  communicate with the Docker daemon) changed in Docker 0.5.2, and now
   108  uses a UNIX socket instead of a TCP socket bound on 127.0.0.1 (the
   109  latter being prone to cross-site request forgery attacks if you happen to run
   110  Docker directly on your local machine, outside of a VM). You can then
   111  use traditional UNIX permission checks to limit access to the control
   112  socket.
   113  
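Because the control socket is an ordinary filesystem object, its owner, group, and mode bits decide who can talk to the daemon. The sketch below summarises those bits for a given path; the function names are hypothetical, and the path you pass in (for example wherever your installation places the Docker socket) is up to you. On Linux, connecting to a UNIX socket requires write permission on it, so a world-writable socket would let any local user control the daemon.

```python
import os
import stat


def summarise_mode(mode):
    """Describe the access-relevant parts of a file mode."""
    return {
        "is_socket": stat.S_ISSOCK(mode),
        "group_writable": bool(mode & stat.S_IWGRP),
        "world_writable": bool(mode & stat.S_IWOTH),
    }


def describe_socket_access(path):
    """Summarise who may use the file at `path`, based on os.stat()."""
    info = os.stat(path)
    report = summarise_mode(info.st_mode)
    report["owner_uid"] = info.st_uid
    report["group_gid"] = info.st_gid
    return report
```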
   114  You can also expose the REST API over HTTP if you explicitly decide to do so.
   115  However, if you do so, keep the security implication described above in
   116  mind: ensure that the API is reachable only from a
   117  trusted network or VPN, or protected with e.g., `stunnel` and client SSL
   118  certificates. You can also secure it with [HTTPS and
   119  certificates](https.md).
   120  
   121  The daemon is also potentially vulnerable to other inputs, such as image
   122  loading from either disk with `docker load`, or from the network with
   123  `docker pull`. As of Docker 1.3.2, images are extracted in a chrooted
   124  subprocess on Linux/Unix platforms, the first step in a wider effort
   125  toward privilege separation. As of Docker 1.10.0, all images are stored and
   126  accessed by the cryptographic checksums of their contents, limiting the
   127  possibility of an attacker causing a collision with an existing image.
   128  
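The idea of addressing content by its cryptographic checksum can be shown with a very small in-memory store. This is a simplified sketch and not Docker's implementation: `content_digest` and `ContentStore` are hypothetical names, and SHA-256 with a `sha256:` prefix is assumed here as the digest format. Because the key is derived from the bytes themselves, different content cannot be stored under an existing key without the mismatch being detectable on read.

```python
import hashlib


def content_digest(data):
    """Return an identifier derived purely from the bytes of `data`."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


class ContentStore:
    """A toy content-addressed store that verifies data on every read."""

    def __init__(self):
        self._blobs = {}

    def put(self, data):
        """Store `data` under its own digest and return that digest."""
        digest = content_digest(data)
        self._blobs[digest] = bytes(data)
        return digest

    def get(self, digest):
        """Return the stored bytes, refusing content that was altered."""
        data = self._blobs[digest]
        if content_digest(data) != digest:
            raise ValueError("stored content does not match " + digest)
        return data
```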
   129  Eventually, it is expected that the Docker daemon will run with restricted
   130  privileges, delegating operations to well-audited sub-processes,
   131  each with its own (very limited) scope of Linux capabilities,
   132  virtual network setup, filesystem management, etc. That is, most likely,
   133  pieces of the Docker engine itself will run inside of containers.
   134  
   135  Finally, if you run Docker on a server, it is recommended to run
   136  only Docker on that server, and to move all other services into
   137  containers controlled by Docker. Of course, it is fine to keep your
   138  favorite admin tools (probably at least an SSH server), as well as
   139  existing monitoring/supervision processes, such as NRPE and collectd.
   140  
   141  ## Linux kernel capabilities
   142  
   143  By default, Docker starts containers with a restricted set of
   144  capabilities. What does that mean?
   145  
   146  Capabilities turn the binary "root/non-root" dichotomy into a
   147  fine-grained access control system. Processes (like web servers) that
   148  just need to bind on a port below 1024 do not have to run as root: they
   149  can just be granted the `net_bind_service` capability instead. And there
   150  are many other capabilities, for almost all the specific areas where root
   151  privileges are usually needed.
   152  
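The kernel reports the capabilities of a process as hexadecimal bitmasks in `/proc/<pid>/status` (for example the `CapEff` line for the effective set). The following sketch decodes such a mask into names. The function names are hypothetical, the name table follows the bit positions in the Linux header `linux/capability.h` up to `audit_read` (newer kernels define further bits that this table ignores), and the mask used at the end is only an example value with 14 bits set.

```python
# Capability names indexed by bit position (bit 0 is "chown").
CAPABILITY_NAMES = [
    "chown", "dac_override", "dac_read_search", "fowner", "fsetid",
    "kill", "setgid", "setuid", "setpcap", "linux_immutable",
    "net_bind_service", "net_broadcast", "net_admin", "net_raw",
    "ipc_lock", "ipc_owner", "sys_module", "sys_rawio", "sys_chroot",
    "sys_ptrace", "sys_pacct", "sys_admin", "sys_boot", "sys_nice",
    "sys_resource", "sys_time", "sys_tty_config", "mknod", "lease",
    "audit_write", "audit_control", "setfcap", "mac_override",
    "mac_admin", "syslog", "wake_alarm", "block_suspend", "audit_read",
]


def decode_capabilities(mask):
    """Return the names of all capabilities whose bit is set in `mask`."""
    return [name for bit, name in enumerate(CAPABILITY_NAMES)
            if mask & (1 << bit)]


def parse_cap_eff(status_text):
    """Extract the effective capability mask from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split(":", 1)[1].strip(), 16)
    raise ValueError("no CapEff line found")


# Example mask (an assumption for illustration, not read from a real host).
example_caps = decode_capabilities(0xA80425FB)
```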
   153  This means a lot for container security; let's see why!
   154  
   155  Your average server (bare metal or virtual machine) needs to run a bunch
   156  of processes as root. Those typically include SSH, cron, syslogd;
   157  hardware management tools (e.g., load modules), network configuration
   158  tools (e.g., to handle DHCP, WPA, or VPNs), and much more. A container is
   159  very different, because almost all of those tasks are handled by the
   160  infrastructure around the container:
   161  
   162   - SSH access will typically be managed by a single server running on
   163     the Docker host;
   164   - `cron`, when necessary, should run as a user
   165     process, dedicated and tailored for the app that needs its
   166     scheduling service, rather than as a platform-wide facility;
   167   - log management will also typically be handled by Docker, or by
   168     third-party services like Loggly or Splunk;
   169   - hardware management is irrelevant, meaning that you never need to
   170     run `udevd` or equivalent daemons within
   171     containers;
   172   - network management happens outside of the containers, enforcing
   173     separation of concerns as much as possible, meaning that a container
   174     should never need to perform `ifconfig`,
   175     `route`, or `ip` commands (except when a container
   176     is specifically engineered to behave like a router or firewall, of
   177     course).
   178  
   179  This means that in most cases, containers will not need "real" root
   180  privileges *at all*. And therefore, containers can run with a reduced
   181  capability set; meaning that "root" within a container has far fewer
   182  privileges than the real "root". For instance, it is possible to:
   183  
   184   - deny all "mount" operations;
   185   - deny access to raw sockets (to prevent packet spoofing);
   186   - deny access to some filesystem operations, like creating new device
   187     nodes, changing the owner of files, or altering attributes (including
   188     the immutable flag);
   189   - deny module loading;
   190   - and many others.
   191  
   192  This means that even if an intruder manages to escalate to root within a
   193  container, it will be much harder to do serious damage, or to escalate
   194  to the host.
   195  
   196  This won't affect regular web apps; but malicious users will find that
   197  the arsenal at their disposal has shrunk considerably! By default Docker
   198  drops all capabilities except [those
   199  needed](https://github.com/docker/docker/blob/master/oci/defaults_linux.go#L64-L79),
   200  a whitelist instead of a blacklist approach. You can see a full list of
   201  available capabilities in [Linux
   202  manpages](http://man7.org/linux/man-pages/man7/capabilities.7.html).
   203  
   204  One primary risk with running Docker containers is that the default set
   205  of capabilities and mounts given to a container may provide incomplete
   206  isolation, either independently, or when used in combination with
   207  kernel vulnerabilities.
   208  
   209  Docker supports the addition and removal of capabilities, allowing use
   210  of a non-default profile. This may make Docker more secure through
   211  capability removal, or less secure through the addition of capabilities.
   212  The best practice for users would be to remove all capabilities except
   213  those explicitly required for their processes.
   214  
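That practice maps onto the `--cap-drop` and `--cap-add` options of `docker run`: drop everything, then add back only what the process needs. The sketch below just assembles such a command line as a list of arguments and does not start a container; `build_run_command` and the image name `example/web` are hypothetical.

```python
def build_run_command(image, required_caps=(), command=()):
    """Build a `docker run` argument list that keeps only required_caps."""
    args = ["docker", "run", "--rm", "--cap-drop=ALL"]
    # Normalise and de-duplicate so the resulting command is predictable.
    for cap in sorted({c.upper() for c in required_caps}):
        args.append("--cap-add=" + cap)
    args.append(image)
    args.extend(command)
    return args


# A web server that only needs to bind to a port below 1024.
web_args = build_run_command("example/web", ["net_bind_service"])
```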
   215  ## Other kernel security features
   216  
   217  Capabilities are just one of the many security features provided by
   218  modern Linux kernels. It is also possible to leverage existing,
   219  well-known systems like TOMOYO, AppArmor, SELinux, GRSEC, etc. with
   220  Docker.
   221  
   222  While Docker currently only enables capabilities, it doesn't interfere
   223  with the other systems. This means that there are many different ways to
   224  harden a Docker host. Here are a few examples.
   225  
   226   - You can run a kernel with GRSEC and PAX. This will add many safety
   227     checks, both at compile-time and run-time; it will also defeat many
   228     exploits, thanks to techniques like address randomization. It doesn't
   229     require Docker-specific configuration, since those security features
   230     apply system-wide, independent of containers.
   231   - If your distribution comes with security model templates for
   232     Docker containers, you can use them out of the box. For instance, we
   233     ship a template that works with AppArmor, and Red Hat comes with SELinux
   234     policies for Docker. These templates provide an extra safety net (even
   235     though they overlap greatly with capabilities).
   236   - You can define your own policies using your favorite access control
   237     mechanism.
   238  
   239  Just like there are many third-party tools to augment Docker containers
   240  with e.g., special network topologies or shared filesystems, you can
   241  expect to see tools to harden existing Docker containers without
   242  affecting Docker's core.
   243  
   244  As of Docker 1.10, User Namespaces are supported directly by the Docker
   245  daemon. This feature allows the root user in a container to be mapped
   246  to a non-uid-0 user outside the container, which can help to mitigate the
   247  risks of container breakout. This facility is available but not enabled
   248  by default.
   249  
   250  Refer to the [daemon command](../reference/commandline/dockerd.md#daemon-user-namespace-options)
   251  in the command line reference for more information on this feature.
   252  Additional information on the implementation of User Namespaces in Docker
   253  can be found in <a href="https://integratedcode.us/2015/10/13/user-namespaces-have-arrived-in-docker/" target="_blank">this blog post</a>.
   254  
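The mapping itself is visible in `/proc/<pid>/uid_map`, where each line lists an ID inside the namespace, the corresponding ID outside it, and the length of the mapped range. The following sketch translates a container uid to a host uid from such text. The function names are hypothetical, and the offset of 100000 in the sample is an example value only; the real range depends on how the host is configured.

```python
def parse_id_map(text):
    """Parse 'inside-id outside-id length' lines into a list of tuples."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside, outside, length = (int(field) for field in fields)
        ranges.append((inside, outside, length))
    return ranges


def to_host_id(ranges, container_id):
    """Translate an ID inside the namespace to the ID seen on the host.

    Returns None when the ID falls outside every mapped range.
    """
    for inside, outside, length in ranges:
        if inside <= container_id < inside + length:
            return outside + (container_id - inside)
    return None


# Hypothetical sample, shaped like the contents of /proc/<pid>/uid_map.
sample_ranges = parse_id_map("         0     100000      65536\n")
host_uid_of_container_root = to_host_id(sample_ranges, 0)
```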
   255  ## Conclusions
   256  
   257  Docker containers are, by default, quite secure; especially if you take
   258  care of running your processes inside the containers as non-privileged
   259  users (i.e., non-`root`).
   260  
   261  You can add an extra layer of safety by enabling AppArmor, SELinux,
   262  GRSEC, or your favorite hardening solution.
   263  
   264  Last but not least, if you see interesting security features in other
   265  containerization systems, these are simply kernel features that may
   266  be implemented in Docker as well. We welcome users to submit issues,
   267  pull requests, and communicate via the mailing list.
   268  
   269  ## Related Information
   270  
   271  * [Use trusted images](../security/trust/index.md)
   272  * [Seccomp security profiles for Docker](../security/seccomp.md)
   273  * [AppArmor security profiles for Docker](../security/apparmor.md)
   274  * [On the Security of Containers (2014)](https://medium.com/@ewindisch/on-the-security-of-containers-2c60ffe25a9e)
   275  * [Docker swarm mode overlay network security model](../userguide/networking/overlay-security-model.md)