github.com/sijibomii/docker@v0.0.0-20231230191044-5cf6ca554647/docs/security/security.md

# Docker security

There are four major areas to consider when reviewing Docker security:

- the intrinsic security of the kernel and its support for
  namespaces and cgroups;
- the attack surface of the Docker daemon itself;
- loopholes in the container configuration profile, either by default,
  or when customized by users;
- the "hardening" security features of the kernel and how they
  interact with containers.

## Kernel namespaces

Docker containers are very similar to LXC containers, and they have
similar security features. When you start a container with
`docker run`, behind the scenes Docker creates a set of namespaces and control
groups for the container.

**Namespaces provide the first and most straightforward form of
isolation**: processes running within a container cannot see, let
alone affect, processes running in another container or on the host
system.

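As a rough illustration of what "separate namespaces" means in practice, the sketch below compares the namespace identifiers that Linux exposes under `/proc/<pid>/ns`. The helper names (`parse_ns_link`, `read_namespaces`, `shared_namespaces`) are hypothetical and not part of Docker; the code only assumes the usual `kind:[inode]` link format on a Linux host.

```python
import os
import re

# Namespace links under /proc/<pid>/ns have targets like "pid:[4026531836]".
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a namespace link target into (kind, inode)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group("kind"), int(match.group("inode"))


def read_namespaces(pid="self"):
    """Return {entry_name: inode} for one process (Linux only)."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for name in os.listdir(ns_dir):
        _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        result[name] = inode
    return result


def shared_namespaces(a, b):
    """Names of the namespaces that two {name: inode} maps have in common."""
    return sorted(name for name in a.keys() & b.keys() if a[name] == b[name])
```

Two processes are in the same namespace of a given kind when the inode numbers match. Comparing `read_namespaces()` for a host process with the result for a process inside a container would be expected to show few or no shared entries, depending on how the container was started.
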
**Each container also gets its own network stack**, meaning that a
container doesn't get privileged access to the sockets or interfaces
of another container. Of course, if the host system is set up
accordingly, containers can interact with each other through their
respective network interfaces, just as they can interact with
external hosts. When you specify public ports for your containers or use
[*links*](../userguide/networking/default_network/dockerlinks.md),
IP traffic is allowed between containers. They can ping each other,
send/receive UDP packets, and establish TCP connections, but that can be
restricted if necessary. From a network architecture point of view, all
containers on a given Docker host are sitting on bridge interfaces. This
means that they are just like physical machines connected through a
common Ethernet switch; no more, no less.

How mature is the code providing kernel namespaces and private
networking? Kernel namespaces were introduced [between kernel version
2.6.15 and
2.6.26](http://lxc.sourceforge.net/index.php/about/kernel-namespaces/).
This means that since July 2008 (the date of the 2.6.26 release),
namespace code has been exercised and scrutinized on a large
number of production systems. And there is more: the design and
inspiration for the namespaces code are even older. Namespaces are
actually an effort to reimplement the features of
[OpenVZ](http://en.wikipedia.org/wiki/OpenVZ) in such a way that they could be
merged within the mainstream kernel. And OpenVZ was initially released
in 2005, so both the design and the implementation are pretty mature.

## Control groups

Control Groups are another key component of Linux Containers. They
implement resource accounting and limiting. They provide many
useful metrics, but they also help ensure that each container gets
its fair share of memory, CPU, and disk I/O and, more importantly, that a
single container cannot bring the system down by exhausting one of those
resources.

So while they do not play a role in preventing one container from
accessing or affecting the data and processes of another container, they
are essential to fend off some denial-of-service attacks. They are
particularly important on multi-tenant platforms, like public and
private PaaS, to guarantee a consistent uptime (and performance) even
when some applications start to misbehave.

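To make the resource-limiting idea concrete, here is a minimal sketch that assembles a `docker run` command line with per-container limits. The function names are hypothetical; the flags used (`--memory`, `--cpu-shares`, `--pids-limit`) are `docker run` options, though which of them are available depends on the Docker version in use.

```python
def resource_limit_args(memory=None, cpu_shares=None, pids_limit=None):
    """Translate optional limits into `docker run` flags."""
    args = []
    if memory is not None:
        args.append(f"--memory={memory}")              # e.g. "256m"
    if cpu_shares is not None:
        args.append(f"--cpu-shares={int(cpu_shares)}")  # relative CPU weight
    if pids_limit is not None:
        args.append(f"--pids-limit={int(pids_limit)}")  # cap on process count
    return args


def docker_run_command(image, command=(), **limits):
    """Build the full argument list; nothing is executed here."""
    return ["docker", "run", "--rm", *resource_limit_args(**limits), image, *command]
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a host where Docker is installed. Building the list separately keeps the limits easy to review and avoids shell quoting problems.
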
Control Groups have been around for a while as well: the code was
started in 2006, and initially merged in kernel 2.6.24.

## Docker daemon attack surface

Running containers (and applications) with Docker implies running the
Docker daemon. This daemon currently requires `root` privileges, and you
should therefore be aware of some important details.

First of all, **only trusted users should be allowed to control your
Docker daemon**. This is a direct consequence of some powerful Docker
features. Specifically, Docker allows you to share a directory between
the Docker host and a guest container; and it allows you to do so
without limiting the access rights of the container. This means that you
can start a container where the `/host` directory will be the `/` directory
on your host; and the container will be able to alter your host filesystem
without any restriction. This is similar to how virtualization systems
allow filesystem resource sharing. Nothing prevents you from sharing your
root filesystem (or even your root block device) with a virtual machine.

This has a strong security implication: for example, if you instrument Docker
from a web server to provision containers through an API, you should be
even more careful than usual with parameter checking, to make sure that
a malicious user cannot pass crafted parameters causing Docker to create
arbitrary containers.

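The kind of parameter checking meant here could look like the sketch below, which accepts a requested bind-mount source only if it lies under an allowed directory. The allowlisted root `/srv/tenant-data` and the function name are hypothetical, and the check is purely lexical: a real service would also need to account for symlinks on the host.

```python
from pathlib import PurePosixPath

# Hypothetical directory under which tenants may request bind mounts.
ALLOWED_HOST_ROOTS = (PurePosixPath("/srv/tenant-data"),)


def validate_bind_mount(host_path, allowed_roots=ALLOWED_HOST_ROOTS):
    """Return the host path if it is safe to mount, else raise ValueError."""
    path = PurePosixPath(host_path)
    if not path.is_absolute():
        raise ValueError(f"host path must be absolute: {host_path!r}")
    if ".." in path.parts:
        # Reject traversal instead of trying to normalize it away.
        raise ValueError(f"parent references are not allowed: {host_path!r}")
    for root in allowed_roots:
        if path == root or root in path.parents:
            return str(path)
    raise ValueError(f"host path is outside the allowed roots: {host_path!r}")
```

Rejecting anything outside an explicit allowlist is the safer default: a request for `/` or `/etc` fails, as does a sibling such as `/srv/tenant-data-evil` that merely shares a prefix with an allowed root.
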
For this reason, the REST API endpoint (used by the Docker CLI to
communicate with the Docker daemon) changed in Docker 0.5.2, and now
uses a UNIX socket instead of a TCP socket bound on 127.0.0.1 (the
latter being prone to cross-site request forgery attacks if you happen to run
Docker directly on your local machine, outside of a VM). You can then
use traditional UNIX permission checks to limit access to the control
socket.

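A quick way to review those UNIX permissions is to inspect the socket's owner, group, and mode. The sketch below is only an illustration with hypothetical helper names; it assumes the commonly used socket path `/var/run/docker.sock`, which may differ on your system.

```python
import os
import stat


def mode_is_world_accessible(mode):
    """True if users outside the owner and group can read or write."""
    return bool(mode & (stat.S_IROTH | stat.S_IWOTH))


def socket_access_report(path="/var/run/docker.sock"):
    """Summarize who may open the control socket at `path`."""
    st = os.stat(path)
    return {
        "is_socket": stat.S_ISSOCK(st.st_mode),
        "owner_uid": st.st_uid,
        "group_gid": st.st_gid,
        "mode": oct(stat.S_IMODE(st.st_mode)),
        "world_accessible": mode_is_world_accessible(st.st_mode),
    }
```

If `world_accessible` is true, any local user can control the daemon. Membership of the socket's group deserves the same scrutiny, since it grants the same level of control.
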
You can also expose the REST API over HTTP if you explicitly decide to do so.
However, if you do that, keeping the security implication above in
mind, you should ensure that it is reachable only from a
trusted network or VPN, or protected with, e.g., `stunnel` and client SSL
certificates. You can also secure the API endpoint with [HTTPS and
certificates](https.md).

The daemon is also potentially vulnerable to other inputs, such as image
loading from either disk with `docker load`, or from the network with
`docker pull`. This has been a focus of improvement in the community,
especially for `pull` security. While these overlap, it should be noted
that `docker load` is a mechanism for backup and restore and is not
currently considered a secure mechanism for loading images. As of
Docker 1.3.2, images are extracted in a chrooted subprocess on
Linux/Unix platforms, the first step in a wider effort toward
privilege separation.

Eventually, it is expected that the Docker daemon will run with restricted
privileges, delegating operations to well-audited sub-processes,
each with its own (very limited) scope of Linux capabilities,
virtual network setup, filesystem management, etc. That is, most likely,
pieces of the Docker engine itself will run inside of containers.

Finally, if you run Docker on a server, it is recommended to run
exclusively Docker on the server, and move all other services into
containers controlled by Docker. Of course, it is fine to keep your
favorite admin tools (probably at least an SSH server), as well as
existing monitoring/supervision processes (such as NRPE and collectd).

## Linux kernel capabilities

By default, Docker starts containers with a restricted set of
capabilities. What does that mean?

Capabilities turn the binary "root/non-root" dichotomy into a
fine-grained access control system. Processes (like web servers) that
just need to bind on a port below 1024 do not have to run as root: they
can just be granted the `net_bind_service` capability instead. And there
are many other capabilities, for almost all the specific areas where root
privileges are usually needed.

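To see how fine-grained this is, the kernel reports a process's effective capabilities as a hexadecimal bitmask on the `CapEff` line of `/proc/<pid>/status`. The sketch below decodes such a mask; the function names are hypothetical, and the table only covers capability numbers 0 to 31 as listed in the Linux capabilities man page (newer kernels define more).

```python
# Capability names indexed by bit number (0-31), per capabilities(7).
CAP_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID",
    "CAP_SETPCAP", "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE",
    "CAP_NET_BROADCAST", "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK",
    "CAP_IPC_OWNER", "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT",
    "CAP_SYS_PTRACE", "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT",
    "CAP_SYS_NICE", "CAP_SYS_RESOURCE", "CAP_SYS_TIME",
    "CAP_SYS_TTY_CONFIG", "CAP_MKNOD", "CAP_LEASE", "CAP_AUDIT_WRITE",
    "CAP_AUDIT_CONTROL", "CAP_SETFCAP",
]


def decode_capabilities(mask):
    """Return the names of the capabilities whose bits are set in `mask`."""
    return [name for bit, name in enumerate(CAP_NAMES) if mask & (1 << bit)]


def effective_capabilities(status_text):
    """Decode the CapEff line from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return decode_capabilities(int(line.split()[1], 16))
    raise ValueError("no CapEff line found")
```

A web server that holds only the bit for `CAP_NET_BIND_SERVICE` can bind to port 80 but cannot, for example, load kernel modules or change file ownership.
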
This means a lot for container security; let's see why!

Your average server (bare metal or virtual machine) needs to run a bunch
of processes as root. Those typically include SSH, cron, syslogd,
hardware management tools (e.g., to load modules), network configuration
tools (e.g., to handle DHCP, WPA, or VPNs), and much more. A container is
very different, because almost all of those tasks are handled by the
infrastructure around the container:

- SSH access will typically be managed by a single server running on
  the Docker host;
- `cron`, when necessary, should run as a user
  process, dedicated and tailored for the app that needs its
  scheduling service, rather than as a platform-wide facility;
- log management will also typically be handed to Docker, or to
  third-party services like Loggly or Splunk;
- hardware management is irrelevant, meaning that you never need to
  run `udevd` or equivalent daemons within
  containers;
- network management happens outside of the containers, enforcing
  separation of concerns as much as possible, meaning that a container
  should never need to perform `ifconfig`,
  `route`, or `ip` commands (except when a container
  is specifically engineered to behave like a router or firewall, of
  course).

This means that in most cases, containers will not need "real" root
privileges *at all*. And therefore, containers can run with a reduced
capability set, meaning that "root" within a container has far fewer
privileges than the real "root". For instance, it is possible to:

- deny all "mount" operations;
- deny access to raw sockets (to prevent packet spoofing);
- deny access to some filesystem operations, like creating new device
  nodes, changing the owner of files, or altering attributes (including
  the immutable flag);
- deny module loading;
- and many others.

This means that even if an intruder manages to escalate to root within a
container, it will be much harder to do serious damage, or to escalate
to the host.

This won't affect regular web apps; but malicious users will find that
the arsenal at their disposal has shrunk considerably! By default Docker
drops all capabilities except [those
needed](https://github.com/docker/docker/blob/master/oci/defaults_linux.go#L64-L79),
following a whitelist rather than a blacklist approach. You can see a full list of
available capabilities in [Linux
manpages](http://man7.org/linux/man-pages/man7/capabilities.7.html).

One primary risk with running Docker containers is that the default set
of capabilities and mounts given to a container may provide incomplete
isolation, either independently, or when used in combination with
kernel vulnerabilities.

Docker supports the addition and removal of capabilities, allowing use
of a non-default profile. This may make Docker more secure through
capability removal, or less secure through the addition of capabilities.
The best practice for users would be to remove all capabilities except
those explicitly required for their processes.

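The "remove everything, then add back what is required" practice maps onto the `--cap-drop` and `--cap-add` options of `docker run`. The following is a minimal sketch with a hypothetical helper name; it only builds the flags and does not run anything.

```python
def least_privilege_args(required_caps=(), user=None):
    """Build `docker run` flags that drop all capabilities, then add a few back."""
    args = ["--cap-drop=ALL"]
    for cap in required_caps:
        name = cap.upper()
        if name.startswith("CAP_"):
            name = name[len("CAP_"):]   # pass the short form, e.g. NET_BIND_SERVICE
        args.append(f"--cap-add={name}")
    if user is not None:
        args.append(f"--user={user}")   # run the process as a non-root user
    return args
```

For a web server that only needs to bind to a low port, `least_privilege_args(["net_bind_service"])` yields `--cap-drop=ALL --cap-add=NET_BIND_SERVICE`. Whether a given application works with such a reduced set has to be tested case by case.
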
## Other kernel security features

Capabilities are just one of the many security features provided by
modern Linux kernels. It is also possible to leverage existing,
well-known systems like TOMOYO, AppArmor, SELinux, GRSEC, etc. with
Docker.

While Docker currently only enables capabilities, it doesn't interfere
with the other systems. This means that there are many different ways to
harden a Docker host. Here are a few examples.

- You can run a kernel with GRSEC and PAX. This will add many safety
  checks, both at compile-time and run-time; it will also defeat many
  exploits, thanks to techniques like address randomization. It doesn't
  require Docker-specific configuration, since those security features
  apply system-wide, independent of containers.
- If your distribution comes with security model templates for
  Docker containers, you can use them out of the box. For instance, we
  ship a template that works with AppArmor and Red Hat comes with SELinux
  policies for Docker. These templates provide an extra safety net (even
  though they overlap greatly with capabilities).
- You can define your own policies using your favorite access control
  mechanism.

Just like there are many third-party tools to augment Docker containers
with e.g., special network topologies or shared filesystems, you can
expect to see tools to harden existing Docker containers without
affecting Docker's core.

As of Docker 1.10, User Namespaces are supported directly by the Docker
daemon. This feature allows the root user in a container to be mapped
to a non-uid-0 user outside the container, which can help to mitigate the
risks of container breakout. This facility is available but not enabled
by default.

Refer to the [daemon command](../reference/commandline/daemon.md#daemon-user-namespace-options)
in the command line reference for more information on this feature.
Additional information on the implementation of User Namespaces in Docker
can be found in <a href="https://integratedcode.us/2015/10/13/user-namespaces-have-arrived-in-docker/" target="_blank">this blog post</a>.

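The mapping itself is simple arithmetic over ranges. The sketch below assumes the `/proc/<pid>/uid_map` format of the Linux kernel, where each line holds the first ID inside the namespace, the first ID outside, and the range length; the function names and the example range starting at 100000 are hypothetical.

```python
def parse_uid_map(text):
    """Parse uid_map text into a list of (inside_start, outside_start, length)."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            ranges.append(tuple(int(f) for f in fields))
    return ranges


def map_uid(container_uid, uid_map):
    """Translate a uid inside the container to the uid seen on the host.

    Returns None when the uid falls outside every mapped range.
    """
    for inside, outside, length in uid_map:
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None
```

With a map of `0 100000 65536`, root inside the container (uid 0) corresponds to the unprivileged uid 100000 on the host, which is why a breakout from such a container does not directly yield host root.
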
## Conclusions

Docker containers are, by default, quite secure; especially if you take
care of running your processes inside the containers as non-privileged
users (i.e., non-`root`).

You can add an extra layer of safety by enabling AppArmor, SELinux,
GRSEC, or your favorite hardening solution.

Last but not least, if you see interesting security features in other
containerization systems, these are simply kernel features that may
be implemented in Docker as well. We welcome users to submit issues,
pull requests, and communicate via the mailing list.

## Related Information

* [Use trusted images](../security/trust/index.md)
* [Seccomp security profiles for Docker](../security/seccomp.md)
* [AppArmor security profiles for Docker](../security/apparmor.md)
* [On the Security of Containers (2014)](https://medium.com/@ewindisch/on-the-security-of-containers-2c60ffe25a9e)