github.com/vieux/docker@v0.6.3-0.20161004191708-e097c2a938c7/docs/security/security.md

# Docker security

There are four major areas to consider when reviewing Docker security:

- the intrinsic security of the kernel and its support for
  namespaces and cgroups;
- the attack surface of the Docker daemon itself;
- loopholes in the container configuration profile, either by default,
  or when customized by users;
- the "hardening" security features of the kernel and how they
  interact with containers.

## Kernel namespaces

Docker containers are very similar to LXC containers, and they have
similar security features. When you start a container with
`docker run`, behind the scenes Docker creates a set of namespaces and control
groups for the container.

**Namespaces provide the first and most straightforward form of
isolation**: processes running within a container cannot see, let alone
affect, processes running in another container, or in the host
system.

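One way to observe this isolation from the host is to compare the namespace links that Linux exposes under `/proc/<pid>/ns`. The sketch below is a minimal illustration, not something Docker ships: the helper names (`parse_ns_link`, `namespaces_of`, `shared_namespaces`) are hypothetical, and reading the links of another process usually requires root or ownership of that process. Two processes that share no PID or network namespace will show different inode numbers for `pid` and `net`.

```python
import os
import re

# Namespace links look like "pid:[4026531836]": a type name and an inode number.
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a /proc/<pid>/ns/* link target into (kind, inode)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError("not a namespace link: %r" % (target,))
    return match.group("kind"), int(match.group("inode"))


def namespaces_of(pid="self"):
    """Return {entry name: inode} for a process, or {} if /proc is unavailable."""
    ns_dir = "/proc/%s/ns" % pid
    result = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return result
    for name in names:
        try:
            _kind, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        except (OSError, ValueError):
            continue  # unreadable or unexpected entry; skip it
        result[name] = inode
    return result


def shared_namespaces(a, b):
    """Names of the namespaces that two {name: inode} maps have in common."""
    return sorted(k for k in a if k in b and a[k] == b[k])
```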
**Each container also gets its own network stack**, meaning that a
container doesn't get privileged access to the sockets or interfaces
of another container. Of course, if the host system is set up
accordingly, containers can interact with each other through their
respective network interfaces, just like they can interact with
external hosts. When you specify public ports for your containers or use
[*links*](../userguide/networking/default_network/dockerlinks.md),
IP traffic is allowed between containers. They can ping each other,
send/receive UDP packets, and establish TCP connections, but that can be
restricted if necessary. From a network architecture point of view, all
containers on a given Docker host are sitting on bridge interfaces. This
means that they are just like physical machines connected through a
common Ethernet switch; no more, no less.

How mature is the code providing kernel namespaces and private
networking? Kernel namespaces were introduced [between kernel version
2.6.15 and
2.6.26](http://man7.org/linux/man-pages/man7/namespaces.7.html).
This means that since July 2008 (the date of the 2.6.26 release),
namespace code has been exercised and scrutinized on a large
number of production systems. And there is more: the design and
inspiration for the namespaces code are even older. Namespaces are
actually an effort to reimplement the features of
[OpenVZ](http://en.wikipedia.org/wiki/OpenVZ) in such a way that they could be
merged within the mainstream kernel. And OpenVZ was initially released
in 2005, so both the design and the implementation are pretty mature.

## Control groups

Control Groups are another key component of Linux Containers. They
implement resource accounting and limiting. They provide many
useful metrics, but they also help ensure that each container gets
its fair share of memory, CPU, and disk I/O; and, more importantly, that a
single container cannot bring the system down by exhausting one of those
resources.

So while they do not play a role in preventing one container from
accessing or affecting the data and processes of another container, they
are essential to fend off some denial-of-service attacks. They are
particularly important on multi-tenant platforms, like public and
private PaaS, to guarantee a consistent uptime (and performance) even
when some applications start to misbehave.

Control Groups have been around for a while as well: the code was
started in 2006, and initially merged in kernel 2.6.24.

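The limits described in this section are set per container when it is started. As a minimal sketch, assuming a provisioning script that assembles a `docker run` command line, the hypothetical helpers below (`resource_limit_args` and `docker_run_command` are illustrative names, and `example/app` is a placeholder image) translate limits into the `--memory`, `--cpu-shares`, and `--blkio-weight` flags. The code only builds the argument list; it does not start anything.

```python
def resource_limit_args(memory=None, cpu_shares=None, blkio_weight=None):
    """Build `docker run` flags for cgroup-backed limits (hypothetical helper)."""
    args = []
    if memory is not None:
        # Hard memory limit, written the way the CLI expects it, e.g. "256m".
        args += ["--memory", str(memory)]
    if cpu_shares is not None:
        if cpu_shares <= 0:
            raise ValueError("cpu_shares must be a positive integer")
        # Relative CPU weight; it only matters when CPUs are contended.
        args += ["--cpu-shares", str(cpu_shares)]
    if blkio_weight is not None:
        if not 10 <= blkio_weight <= 1000:
            raise ValueError("blkio_weight must be between 10 and 1000")
        # Relative block I/O weight.
        args += ["--blkio-weight", str(blkio_weight)]
    return args


def docker_run_command(image, command=(), **limits):
    """Return the full argv for a limited container, without running it."""
    return (["docker", "run", "--rm"] + resource_limit_args(**limits)
            + [image] + list(command))
```

For example, `docker_run_command("example/app", memory="256m", cpu_shares=512)` yields an argument list that could be handed to `subprocess.run` by a caller that has already validated its inputs.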
## Docker daemon attack surface

Running containers (and applications) with Docker implies running the
Docker daemon. This daemon currently requires `root` privileges, and you
should therefore be aware of some important details.

First of all, **only trusted users should be allowed to control your
Docker daemon**. This is a direct consequence of some powerful Docker
features. Specifically, Docker allows you to share a directory between
the Docker host and a guest container; and it allows you to do so
without limiting the access rights of the container. This means that you
can start a container where the `/host` directory will be the `/` directory
on your host; and the container will be able to alter your host filesystem
without any restriction. This is similar to how virtualization systems
allow filesystem resource sharing. Nothing prevents you from sharing your
root filesystem (or even your root block device) with a virtual machine.

This has a strong security implication: for example, if you instrument Docker
from a web server to provision containers through an API, you should be
even more careful than usual with parameter checking, to make sure that
a malicious user cannot pass crafted parameters causing Docker to create
arbitrary containers.

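As one example of such parameter checking, a provisioning service can refuse any requested bind mount whose host path falls outside a short allowlist. The following is a minimal sketch under stated assumptions: `is_bind_source_allowed` and the `/srv/app-data` root are hypothetical, paths are POSIX, and a real service would also have to validate every other user-supplied parameter.

```python
import os


def is_bind_source_allowed(host_path, allowed_roots):
    """Return True only if host_path resolves inside one of allowed_roots."""
    if not os.path.isabs(host_path):
        return False  # this sketch rejects relative paths outright
    # realpath collapses ".." components and follows symlinks, so a crafted
    # path such as "/srv/app-data/../../etc" is judged by where it really points.
    resolved = os.path.realpath(host_path)
    for root in allowed_roots:
        root = os.path.realpath(root)
        # commonpath compares whole path components, so "/srv/app-database"
        # is not mistaken for a child of "/srv/app-data".
        if os.path.commonpath([root, resolved]) == root:
            return True
    return False
```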
For this reason, the REST API endpoint (used by the Docker CLI to
communicate with the Docker daemon) changed in Docker 0.5.2, and now
uses a UNIX socket instead of a TCP socket bound on 127.0.0.1 (the
latter being prone to cross-site request forgery attacks if you happen to run
Docker directly on your local machine, outside of a VM). You can then
use traditional UNIX permission checks to limit access to the control
socket.

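Those permission checks can be audited with a few lines of code. The sketch below assumes the socket lives at `/var/run/docker.sock` (a common default, but an assumption here) and uses hypothetical helper names; it only reports the owner, group, and mode, and flags a socket that every local user could connect to.

```python
import os
import stat


def socket_access_problems(mode):
    """List concerns about a control socket, given its st_mode."""
    problems = []
    if not stat.S_ISSOCK(mode):
        problems.append("not a UNIX socket")
    # Connecting to a UNIX socket requires write permission on it, so a
    # world-writable socket is open to every local user.
    if mode & stat.S_IWOTH:
        problems.append("writable by all local users")
    return problems


def check_docker_socket(path="/var/run/docker.sock"):
    """Inspect the socket on disk; returns None if it cannot be examined."""
    try:
        st = os.stat(path)
    except OSError:
        return None
    return {
        "uid": st.st_uid,
        "gid": st.st_gid,
        "mode": oct(stat.S_IMODE(st.st_mode)),
        "problems": socket_access_problems(st.st_mode),
    }
```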
You can also expose the REST API over HTTP if you explicitly decide to do so.
However, if you do that, keep the security implication described above in
mind: ensure that the API is reachable only from a trusted network or VPN,
or protect it with, for example, `stunnel` and client SSL certificates. You
can also secure it with [HTTPS and certificates](https.md).

The daemon is also potentially vulnerable to other inputs, such as image
loading from either disk with `docker load`, or from the network with
`docker pull`. As of Docker 1.3.2, images are extracted in a chrooted
subprocess on Linux/Unix platforms, the first step in a wider effort
toward privilege separation. As of Docker 1.10.0, all images are stored and
accessed by the cryptographic checksums of their contents, limiting the
possibility of an attacker causing a collision with an existing image.

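The idea behind content-addressed storage can be illustrated with a toy store whose keys are derived solely from the stored bytes. This is a simplified sketch, not Docker's implementation: `content_id` and `ContentStore` are hypothetical names, and how Docker actually computes its identifiers is not covered here. What it shows is that identical content always maps to the same key, and that a stored blob can be re-verified against its key on every read.

```python
import hashlib


def content_id(data):
    """Return a "sha256:<hex>" identifier derived only from the bytes given."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


class ContentStore:
    """Toy in-memory store keyed by content digest (illustrative only)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data):
        key = content_id(data)
        self._blobs[key] = data
        return key

    def get(self, key):
        data = self._blobs[key]
        if content_id(data) != key:  # detect corruption or tampering on read
            raise ValueError("content does not match its identifier")
        return data
```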
Eventually, it is expected that the Docker daemon will run with restricted
privileges, delegating operations to well-audited sub-processes,
each with its own (very limited) scope of Linux capabilities,
virtual network setup, filesystem management, etc. That is, most likely,
pieces of the Docker engine itself will run inside of containers.

Finally, if you run Docker on a server, it is recommended to run
Docker exclusively on that server, and move all other services within
containers controlled by Docker. Of course, it is fine to keep your
favorite admin tools (probably at least an SSH server), as well as
existing monitoring/supervision processes, such as NRPE and collectd.

## Linux kernel capabilities

By default, Docker starts containers with a restricted set of
capabilities. What does that mean?

Capabilities turn the binary "root/non-root" dichotomy into a
fine-grained access control system. Processes (like web servers) that
just need to bind on a port below 1024 do not have to run as root: they
can just be granted the `net_bind_service` capability instead. And there
are many other capabilities, for almost all the specific areas where root
privileges are usually needed.

This means a lot for container security; let's see why!

Your average server (bare metal or virtual machine) needs to run a bunch
of processes as root. Those typically include SSH, cron, syslogd;
hardware management tools (e.g., to load modules), network configuration
tools (e.g., to handle DHCP, WPA, or VPNs), and much more. A container is
very different, because almost all of those tasks are handled by the
infrastructure around the container:

- SSH access will typically be managed by a single server running on
  the Docker host;
- `cron`, when necessary, should run as a user
  process, dedicated and tailored for the app that needs its
  scheduling service, rather than as a platform-wide facility;
- log management will also typically be handed to Docker, or to
  third-party services like Loggly or Splunk;
- hardware management is irrelevant, meaning that you never need to
  run `udevd` or equivalent daemons within
  containers;
- network management happens outside of the containers, enforcing
  separation of concerns as much as possible, meaning that a container
  should never need to perform `ifconfig`,
  `route`, or `ip` commands (except when a container
  is specifically engineered to behave like a router or firewall, of
  course).

This means that in most cases, containers will not need "real" root
privileges *at all*. And therefore, containers can run with a reduced
capability set; meaning that "root" within a container has far fewer
privileges than the real "root". For instance, it is possible to:

- deny all "mount" operations;
- deny access to raw sockets (to prevent packet spoofing);
- deny access to some filesystem operations, like creating new device
  nodes, changing the owner of files, or altering attributes (including
  the immutable flag);
- deny module loading;
- and many others.

This means that even if an intruder manages to escalate to root within a
container, it will be much harder to do serious damage, or to escalate
to the host.

This won't affect regular web apps; but malicious users will find that
the arsenal at their disposal has shrunk considerably! By default Docker
drops all capabilities except [those
needed](https://github.com/docker/docker/blob/master/oci/defaults_linux.go#L64-L79),
a whitelist instead of a blacklist approach. You can see a full list of
available capabilities in [Linux
manpages](http://man7.org/linux/man-pages/man7/capabilities.7.html).

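To check which capabilities a process actually holds, you can decode the `CapEff` bitmask that Linux reports in `/proc/<pid>/status`. The sketch below is illustrative: the helper names are hypothetical, the bit numbers follow `<linux/capability.h>`, and the mask `0xa80425fb` used in the note after the code is an assumed sample value, not something taken from this document. The linked source file remains the authoritative list of defaults.

```python
# Capability names indexed by bit number, as defined in <linux/capability.h>.
CAPABILITY_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID",
    "CAP_SETPCAP", "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE",
    "CAP_NET_BROADCAST", "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK",
    "CAP_IPC_OWNER", "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT",
    "CAP_SYS_PTRACE", "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT",
    "CAP_SYS_NICE", "CAP_SYS_RESOURCE", "CAP_SYS_TIME",
    "CAP_SYS_TTY_CONFIG", "CAP_MKNOD", "CAP_LEASE", "CAP_AUDIT_WRITE",
    "CAP_AUDIT_CONTROL", "CAP_SETFCAP", "CAP_MAC_OVERRIDE",
    "CAP_MAC_ADMIN", "CAP_SYSLOG", "CAP_WAKE_ALARM", "CAP_BLOCK_SUSPEND",
    "CAP_AUDIT_READ",
]


def decode_capabilities(mask):
    """Return the names of the capabilities whose bits are set in mask."""
    return [name for bit, name in enumerate(CAPABILITY_NAMES)
            if (mask >> bit) & 1]


def effective_capabilities(status_text):
    """Extract and decode the CapEff line from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return decode_capabilities(int(line.split()[1], 16))
    raise ValueError("no CapEff line found")
```

Decoding the sample mask `0xa80425fb` with `decode_capabilities` yields a list that includes `CAP_NET_BIND_SERVICE` but neither `CAP_SYS_ADMIN` nor `CAP_SYS_MODULE`.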
One primary risk with running Docker containers is that the default set
of capabilities and mounts given to a container may provide incomplete
isolation, either independently, or when used in combination with
kernel vulnerabilities.

Docker supports the addition and removal of capabilities, allowing use
of a non-default profile. This may make Docker more secure through
capability removal, or less secure through the addition of capabilities.
The best practice for users would be to remove all capabilities except
those explicitly required for their processes.

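That practice maps onto the `--cap-drop` and `--cap-add` options of `docker run`. As a minimal sketch, the hypothetical helper `capability_args` below produces the flags for "drop everything, then add back only what is required"; it builds an argument list and does not invoke Docker.

```python
def capability_args(required=()):
    """Drop every capability, then add back only the ones listed."""
    names = set()
    for cap in required:
        name = cap.upper()
        if name.startswith("CAP_"):
            name = name[4:]  # emit names without the CAP_ prefix, e.g. NET_BIND_SERVICE
        names.add(name)
    # Sorting keeps the generated command line stable between runs.
    return ["--cap-drop=ALL"] + ["--cap-add=" + n for n in sorted(names)]
```

For a web server that only needs to bind a port below 1024, `capability_args(["net_bind_service"])` returns `["--cap-drop=ALL", "--cap-add=NET_BIND_SERVICE"]`.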
## Other kernel security features

Capabilities are just one of the many security features provided by
modern Linux kernels. It is also possible to leverage existing,
well-known systems like TOMOYO, AppArmor, SELinux, GRSEC, etc. with
Docker.

While Docker currently only enables capabilities, it doesn't interfere
with the other systems. This means that there are many different ways to
harden a Docker host. Here are a few examples.

- You can run a kernel with GRSEC and PAX. This will add many safety
  checks, both at compile-time and run-time; it will also defeat many
  exploits, thanks to techniques like address randomization. It doesn't
  require Docker-specific configuration, since those security features
  apply system-wide, independent of containers.
- If your distribution comes with security model templates for
  Docker containers, you can use them out of the box. For instance, we
  ship a template that works with AppArmor, and Red Hat ships SELinux
  policies for Docker. These templates provide an extra safety net (even
  though they overlap greatly with capabilities).
- You can define your own policies using your favorite access control
  mechanism.

Just like there are many third-party tools to augment Docker containers
with e.g., special network topologies or shared filesystems, you can
expect to see tools to harden existing Docker containers without
affecting Docker's core.

As of Docker 1.10, User Namespaces are supported directly by the Docker
daemon. This feature allows the root user in a container to be mapped
to a non-uid-0 user outside the container, which can help to mitigate the
risks of container breakout. This facility is available but not enabled
by default.

Refer to the [daemon command](../reference/commandline/dockerd.md#daemon-user-namespace-options)
in the command line reference for more information on this feature.
Additional information on the implementation of User Namespaces in Docker
can be found in <a href="https://integratedcode.us/2015/10/13/user-namespaces-have-arrived-in-docker/" target="_blank">this blog post</a>.

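The effect of a user namespace mapping can be worked through with the format the kernel exposes in `/proc/<pid>/uid_map`, where each line holds the first ID inside the namespace, the first ID outside it, and the length of the range. The sketch below uses hypothetical helper names, and the sample mapping `0 100000 65536` in the note after the code is an assumed value chosen for illustration, not a documented Docker default.

```python
def parse_id_map(text):
    """Parse uid_map/gid_map content into (inside, outside, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 3:
            raise ValueError("malformed map line: %r" % (line,))
        entries.append(tuple(int(f) for f in fields))
    return entries


def host_id(container_id, id_map):
    """Translate an ID inside the namespace to the ID seen outside, or None."""
    for inside, outside, length in id_map:
        if inside <= container_id < inside + length:
            return outside + (container_id - inside)
    return None  # the ID has no mapping outside the namespace
```

With the sample mapping, `host_id(0, parse_id_map("0 100000 65536"))` returns `100000`: root inside the container corresponds to an unprivileged uid on the host, which is the property that limits the impact of a breakout.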
## Conclusions

Docker containers are, by default, quite secure; especially if you take
care of running your processes inside the containers as non-privileged
users (i.e., non-`root`).

You can add an extra layer of safety by enabling AppArmor, SELinux,
GRSEC, or your favorite hardening solution.

Last but not least, if you see interesting security features in other
containerization systems, these are simply kernel features that may
be implemented in Docker as well. We welcome issues, pull requests,
and discussion on the mailing list.

## Related Information

* [Use trusted images](../security/trust/index.md)
* [Seccomp security profiles for Docker](../security/seccomp.md)
* [AppArmor security profiles for Docker](../security/apparmor.md)
* [On the Security of Containers (2014)](https://medium.com/@ewindisch/on-the-security-of-containers-2c60ffe25a9e)
* [Docker swarm mode overlay network security model](../userguide/networking/overlay-security-model.md)