<!--[metadata]>
+++
title = "Understand images, containers, and storage drivers"
description = "Learn the technologies that support storage drivers."
keywords = ["container, storage, driver, AUFS, btrfs, devicemapper, zfs"]
[menu.main]
parent = "engine_driver"
weight = -2
+++
<![end-metadata]-->


# Understand images, containers, and storage drivers

To use storage drivers effectively, you must understand how Docker builds and
stores images. Then, you need an understanding of how these images are used by
containers. Finally, you'll need a short introduction to the technologies that
enable both image and container operations.

## Images and layers

Each Docker image references a list of read-only layers that represent
filesystem differences. Layers are stacked on top of each other to form a base
for a container's root filesystem. The diagram below shows the Ubuntu 15.04
image comprising 4 stacked image layers.

The Docker storage driver is responsible for stacking these layers and
providing a single unified view.

When you create a new container, you add a new, thin, writable layer on top of
the underlying stack. This layer is often called the "container layer". All
changes made to the running container - such as writing new files, modifying
existing files, and deleting files - are written to this thin writable
container layer. The diagram below shows a container based on the Ubuntu 15.04
image.

### Content addressable storage

Docker 1.10 introduced a new content addressable storage model. This is a
completely new way to address image and layer data on disk. Previously, image
and layer data were referenced and stored using a randomly generated UUID. In
the new model this is replaced by a secure *content hash*.

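
The difference between a randomly generated ID and a content hash can be
illustrated with a minimal Python sketch. This is not Docker's implementation:
the function names `random_id` and `content_id` are hypothetical, and SHA-256
is assumed only because the digests shown in this document carry a `sha256:`
prefix.

```python
import hashlib
import uuid


def random_id() -> str:
    """Random-ID style: the ID is unrelated to the data it names (hypothetical helper)."""
    return uuid.uuid4().hex


def content_id(data: bytes) -> str:
    """Content-addressed style: the ID is a digest of the bytes themselves (hypothetical helper)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


layer = b"example layer contents"

# Two random IDs generated for the same bytes differ, so the ID says nothing about the data.
print(random_id() != random_id())

# A content hash is reproducible: identical data always maps to the same ID.
print(content_id(layer) == content_id(layer))

# Any change to the data changes the ID, which is what allows integrity checks.
print(content_id(layer) != content_id(layer + b"!"))
```

Because the ID is computed from the data, two hosts that hold the same bytes
arrive at the same ID without coordinating with each other.
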

The new model improves security, provides a built-in way to avoid ID
collisions, and guarantees data integrity after pull, push, load, and save
operations. It also enables better sharing of layers by allowing many images to
freely share their layers even if they didn’t come from the same build.

The diagram below shows an updated version of the previous diagram,
highlighting the changes implemented by Docker 1.10.

As can be seen, all image layer IDs are cryptographic hashes, whereas the
container ID is still a randomly generated UUID.

There are several things to note regarding the new model. These include:

1. Migration of existing images
2. Image and layer filesystem structures

Existing images, those created and pulled by earlier versions of Docker, need
to be migrated before they can be used with the new model. This migration
involves calculating new secure checksums and is performed automatically the
first time you start an updated Docker daemon. After the migration is complete,
all images and tags will have brand new secure IDs.

Although the migration is automatic and transparent, it is computationally
intensive. This means it can take time if you have lots of image data.
During this time your Docker daemon will not respond to other requests.

A migration tool exists that allows you to migrate existing images to the new
format before upgrading your Docker daemon. This means that upgraded Docker
daemons do not need to perform the migration in-band, which avoids any
associated downtime. It also provides a way to manually migrate existing images
so that they can be distributed to other Docker daemons in your environment
that are already running the latest versions of Docker.

The migration tool is provided by Docker, Inc., and runs as a container.
You can download it from
[https://github.com/docker/v1.10-migrator/releases](https://github.com/docker/v1.10-migrator/releases).

While running the "migrator" image you need to expose your Docker host's data
directory to the container. If you are using the default Docker data path, the
command to run the container will look like this:

    $ sudo docker run --rm -v /var/lib/docker:/var/lib/docker docker/v1.10-migrator

If you use the `devicemapper` storage driver, you will need to include the
`--privileged` option so that the container has access to your storage devices.

#### Migration example

The following example shows the migration tool in use on a Docker host running
version 1.9.1 of the Docker daemon and the AUFS storage driver. The Docker host
is running on a **t2.micro** AWS EC2 instance with 1 vCPU, 1GB RAM, and a
single 8GB general purpose SSD EBS volume. The Docker data directory
(`/var/lib/docker`) was consuming 2GB of space.

    $ docker images

    REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
    jenkins             latest              285c9f0f9d3d        17 hours ago        708.5 MB
    mysql               latest              d39c3fa09ced        8 days ago          360.3 MB
    mongo               latest              a74137af4532        13 days ago         317.4 MB
    postgres            latest              9aae83d4127f        13 days ago         270.7 MB
    redis               latest              8bccd73928d9        2 weeks ago         151.3 MB
    centos              latest              c8a648134623        4 weeks ago         196.6 MB
    ubuntu              15.04               c8be1ac8145a        7 weeks ago         131.3 MB

    $ sudo du -hs /var/lib/docker

    2.0G    /var/lib/docker

    $ time docker run --rm -v /var/lib/docker:/var/lib/docker docker/v1.10-migrator

    Unable to find image 'docker/v1.10-migrator:latest' locally
    latest: Pulling from docker/v1.10-migrator
    ed1f33c5883d: Pull complete
    b3ca410aa2c1: Pull complete
    2b9c6ed9099e: Pull complete
    dce7e318b173: Pull complete
    Digest: sha256:bd2b245d5d22dd94ec4a8417a9b81bb5e90b171031c6e216484db3fe300c2097
    Status: Downloaded newer image for docker/v1.10-migrator:latest
    time="2016-01-27T12:31:06Z" level=debug msg="Assembling tar data for 01e70da302a553ba13485ad020a0d77dbb47575a31c4f48221137bb08f45878d from /var/lib/docker/aufs/diff/01e70da302a553ba13485ad020a0d77dbb47575a31c4f48221137bb08f45878d"
    time="2016-01-27T12:31:06Z" level=debug msg="Assembling tar data for 07ac220aeeef9febf1ac16a9d1a4eff7ef3c8cbf5ed0be6b6f4c35952ed7920d from /var/lib/docker/aufs/diff/07ac220aeeef9febf1ac16a9d1a4eff7ef3c8cbf5ed0be6b6f4c35952ed7920d"
    <snip>
    time="2016-01-27T12:32:00Z" level=debug msg="layer dbacfa057b30b1feaf15937c28bd8ca0d6c634fc311ccc35bd8d56d017595d5b took 10.80 seconds"

    real    0m59.583s
    user    0m0.046s
    sys     0m0.008s

The Unix `time` command is prepended to the `docker run` command to produce
timings for the operation. As can be seen, migrating 7 images comprising 2GB of
disk space took approximately 1 minute overall. However, this
included the time taken to pull the `docker/v1.10-migrator` image
(approximately 3.5 seconds). The same operation on an m4.10xlarge EC2 instance
with 40 vCPUs, 160GB RAM and an 8GB provisioned IOPS EBS volume resulted in the
following improved timings:

    real    0m9.871s
    user    0m0.094s
    sys     0m0.021s

This shows that the migration operation is affected by the hardware spec of the
machine performing the migration.

## Container and layers

The major difference between a container and an image is the top writable
layer. All writes to the container that add new or modify existing data are
stored in this writable layer. When the container is deleted the writable layer
is also deleted. The underlying image remains unchanged.

Because each container has its own thin writable container layer, and all
changes are stored in this container layer, multiple containers
can share access to the same underlying image and yet have their own data
state.
The diagram below shows multiple containers sharing the same Ubuntu
15.04 image.

The Docker storage driver is responsible for enabling and managing both the
image layers and the writable container layer. How a storage driver
accomplishes this varies between drivers. Two key technologies behind Docker
image and container management are stackable image layers and copy-on-write
(CoW).


## The copy-on-write strategy

Sharing is a good way to optimize resources. People do this instinctively in
daily life. For example, twins Jane and Joseph taking an Algebra class at
different times from different teachers can share the same exercise book by
passing it between each other. Now, suppose Jane gets an assignment to complete
the homework on page 11 in the book. At that point, Jane copies page 11,
completes the homework, and hands in her copy. The original exercise book is
unchanged and only Jane has a copy of the changed page 11.

Copy-on-write is a similar strategy of sharing and copying. In this strategy,
system processes that need the same data share the same instance of that data
rather than having their own copy. At some point, if one process needs to
modify or write to the data, only then does the operating system make a copy of
the data for that process to use. Only the process that needs to write has
access to the data copy. All the other processes continue to use the original
data.

Docker uses a copy-on-write technology with both images and containers. This
CoW strategy optimizes both image disk space usage and the performance of
container start times. The next sections look at how copy-on-write is leveraged
with images and containers through sharing and copying.

### Sharing promotes smaller images

This section looks at image layers and copy-on-write technology. All image and
All image and 200 container layers exist inside the Docker host's *local storage area* and are 201 managed by the storage driver. On Linux-based Docker hosts this is usually 202 located under `/var/lib/docker/`. 203 204 The Docker client reports on image layers when instructed to pull and push 205 images with `docker pull` and `docker push`. The command below pulls the 206 `ubuntu:15.04` Docker image from Docker Hub. 207 208 $ docker pull ubuntu:15.04 209 210 15.04: Pulling from library/ubuntu 211 1ba8ac955b97: Pull complete 212 f157c4e5ede7: Pull complete 213 0b7e98f84c4c: Pull complete 214 a3ed95caeb02: Pull complete 215 Digest: sha256:5e279a9df07990286cce22e1b0f5b0490629ca6d187698746ae5e28e604a640e 216 Status: Downloaded newer image for ubuntu:15.04 217 218 From the output, you'll see that the command actually pulls 4 image layers. 219 Each of the above lines lists an image layer and its UUID or cryptographic 220 hash. The combination of these four layers makes up the `ubuntu:15.04` Docker 221 image. 222 223 Each of these layers is stored in its own directory inside the Docker host's 224 local storage are. 225 226 Versions of Docker prior to 1.10 stored each layer in a directory with the same 227 name as the image layer ID. However, this is not the case for images pulled 228 with Docker version 1.10 and later. For example, the command below shows an 229 image being pulled from Docker Hub, followed by a directory listing on a host 230 running version 1.9.1 of the Docker Engine. 

    $ docker pull ubuntu:15.04

    15.04: Pulling from library/ubuntu
    47984b517ca9: Pull complete
    df6e891a3ea9: Pull complete
    e65155041eed: Pull complete
    c8be1ac8145a: Pull complete
    Digest: sha256:5e279a9df07990286cce22e1b0f5b0490629ca6d187698746ae5e28e604a640e
    Status: Downloaded newer image for ubuntu:15.04

    $ ls /var/lib/docker/aufs/layers

    47984b517ca9ca0312aced5c9698753ffa964c2015f2a5f18e5efa9848cf30e2
    c8be1ac8145a6e59a55667f573883749ad66eaeef92b4df17e5ea1260e2d7356
    df6e891a3ea9cdce2a388a2cf1b1711629557454fd120abd5be6d32329a0e0ac
    e65155041eed7ec58dea78d90286048055ca75d41ea893c7246e794389ecf203

Notice how the four directories match up with the layer IDs of the downloaded
image. Now compare this with the same operations performed on a host running
version 1.10 of the Docker Engine.

    $ docker pull ubuntu:15.04
    15.04: Pulling from library/ubuntu
    1ba8ac955b97: Pull complete
    f157c4e5ede7: Pull complete
    0b7e98f84c4c: Pull complete
    a3ed95caeb02: Pull complete
    Digest: sha256:5e279a9df07990286cce22e1b0f5b0490629ca6d187698746ae5e28e604a640e
    Status: Downloaded newer image for ubuntu:15.04

    $ ls /var/lib/docker/aufs/layers/
    1d6674ff835b10f76e354806e16b950f91a191d3b471236609ab13a930275e24
    5dbb0cbe0148cf447b9464a358c1587be586058d9a4c9ce079320265e2bb94e7
    bef7199f2ed8e86fa4ada1309cfad3089e0542fec8894690529e4c04a7ca2d73
    ebf814eccfe98f2704660ca1d844e4348db3b5ccc637eb905d4818fbfb00a06a

See how the four directories do not match up with the image layer IDs pulled in
the previous step.

Despite the differences between image management before and after version 1.10,
all versions of Docker still allow images to share layers.
For example, if you
`pull` an image that shares some of the same image layers as an image that has
already been pulled, the Docker daemon recognizes this, and only pulls the
layers it doesn't already have stored locally. After the second pull, the two
images will share any common image layers.

You can illustrate this now for yourself. Starting with the `ubuntu:15.04`
image that you just pulled, make a change to it, and build a new image based on
the change. One way to do this is using a `Dockerfile` and the `docker build`
command.

1. In an empty directory, create a simple `Dockerfile` that starts with the
   `ubuntu:15.04` image.

        FROM ubuntu:15.04

2. Add a new file called "newfile" in the image's `/tmp` directory with the
   text "Hello world" in it.

    When you are done, the `Dockerfile` contains two lines:

        FROM ubuntu:15.04

        RUN echo "Hello world" > /tmp/newfile

3. Save and close the file.

4. From a terminal in the same folder as your `Dockerfile`, run the following
   command:

        $ docker build -t changed-ubuntu .

        Sending build context to Docker daemon 2.048 kB
        Step 1 : FROM ubuntu:15.04
         ---> 3f7bcee56709
        Step 2 : RUN echo "Hello world" > /tmp/newfile
         ---> Running in d14acd6fad4e
         ---> 94e6b7d2c720
        Removing intermediate container d14acd6fad4e
        Successfully built 94e6b7d2c720

    > **Note:** The period (.) at the end of the above command is important. It
    > tells the `docker build` command to use the current working directory as
    > its build context.

    The output above shows a new image with image ID `94e6b7d2c720`.

5. Run the `docker images` command to verify the new `changed-ubuntu` image is
   in the Docker host's local storage area.

        REPOSITORY          TAG         IMAGE ID        CREATED           SIZE
        changed-ubuntu      latest      03b964f68d06    33 seconds ago    131.4 MB
        ubuntu              15.04       013f3d01d247    6 weeks ago       131.3 MB

6.
    Run the `docker history` command to see which image layers were used to
    create the new `changed-ubuntu` image.

        $ docker history changed-ubuntu
        IMAGE           CREATED          CREATED BY                                       SIZE        COMMENT
        94e6b7d2c720    2 minutes ago    /bin/sh -c echo "Hello world" > /tmp/newfile     12 B
        3f7bcee56709    6 weeks ago      /bin/sh -c #(nop) CMD ["/bin/bash"]              0 B
        <missing>       6 weeks ago      /bin/sh -c sed -i 's/^#\s*\(deb.*universe\)$/    1.879 kB
        <missing>       6 weeks ago      /bin/sh -c echo '#!/bin/sh' > /usr/sbin/polic    701 B
        <missing>       6 weeks ago      /bin/sh -c #(nop) ADD file:8e4943cd86e9b2ca13    131.3 MB

    The `docker history` output shows the new `94e6b7d2c720` image layer at the
    top. You know that this is the new image layer added because it was created
    by the `echo "Hello world" > /tmp/newfile` command in your `Dockerfile`.
    The 4 image layers below it are the exact same image layers
    that make up the `ubuntu:15.04` image.

> **Note:** Under the content addressable storage model introduced with Docker
> 1.10, image history data is no longer stored in a config file with each image
> layer. It is now stored as a string of text in a single config file that
> relates to the overall image. This can result in some image layers showing as
> "missing" in the output of the `docker history` command. This is normal
> behaviour and can be ignored.
>
> You may hear images like these referred to as *flat images*.

Notice the new `changed-ubuntu` image does not have its own copies of every
layer. As can be seen in the diagram below, the new image is sharing its four
underlying layers with the `ubuntu:15.04` image.

The `docker history` command also shows the size of each image layer. As you
can see, the `94e6b7d2c720` layer is only consuming 12 Bytes of disk space.
This means that the `changed-ubuntu` image we just created is only consuming an
additional 12 Bytes of disk space on the Docker host - all layers below the
`94e6b7d2c720` layer already exist on the Docker host and are shared by other
images.

This sharing of image layers is what makes Docker images and containers so
space efficient.

### Copying makes containers efficient

You learned earlier that a container is a Docker image with a thin, writable
container layer added. The diagram below shows the layers of a container based
on the `ubuntu:15.04` image:

All writes made to a container are stored in the thin writable container layer.
The other layers are read-only (RO) image layers and can't be changed. This
means that multiple containers can safely share a single underlying image. The
diagram below shows multiple containers sharing a single copy of the
`ubuntu:15.04` image. Each container has its own thin RW layer, but they all
share a single instance of the `ubuntu:15.04` image:

When an existing file in a container is modified, Docker uses the storage
driver to perform a copy-on-write operation. The specifics of the operation
depend on the storage driver. For the AUFS and OverlayFS storage drivers, the
copy-on-write operation is roughly as follows:

*  Search through the image layers for the file to update. The process starts
   at the top, newest layer and works down to the base layer one layer at a
   time.
*  Perform a "copy-up" operation on the first copy of the file that is found. A
   "copy-up" copies the file up to the container's own thin writable layer.
*  Modify the *copy of the file* in the container's thin writable layer.

Btrfs, ZFS, and other drivers handle the copy-on-write differently. You can
read more about the methods of these drivers later in their detailed
descriptions.

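
The search, copy-up, and modify steps listed above for AUFS and OverlayFS can
be sketched as a toy model in Python. This is only an illustration under stated
assumptions: each layer is modelled as a dictionary mapping file paths to
contents, and the names `find_in_image` and `write_file` are hypothetical. Real
storage drivers operate on filesystems, not dictionaries.

```python
from typing import Dict, List, Optional

Layer = Dict[str, str]  # toy model: maps a file path to its contents


def find_in_image(image_layers: List[Layer], path: str) -> Optional[str]:
    """Search from the top (newest) layer down to the base layer."""
    for layer in reversed(image_layers):
        if path in layer:
            return layer[path]
    return None


def write_file(image_layers: List[Layer], container_layer: Layer,
               path: str, new_contents: str) -> bool:
    """Modify a file in the container; return True if a copy-up happened."""
    copied_up = False
    if path not in container_layer:
        original = find_in_image(image_layers, path)
        if original is not None:
            # Copy-up: bring the first copy found into the writable layer.
            container_layer[path] = original
            copied_up = True
    # The change is applied only to the container's own copy.
    container_layer[path] = new_contents
    return copied_up


# image[0] is the base layer; the last element is the newest layer.
image = [{"/etc/motd": "base"}, {"/etc/motd": "newer", "/tmp/newfile": "Hello world"}]
container: Layer = {}

print(write_file(image, container, "/etc/motd", "edited"))
print(container["/etc/motd"], image[1]["/etc/motd"])
```

In this model the image layers are never written to, so any number of
containers could be handed the same `image` list while keeping separate
`container` dictionaries.
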

Containers that write a lot of data will consume more space than containers
that do not. This is because most write operations consume new space in the
container's thin writable top layer. If your container needs to write a lot of
data, you should consider using a data volume.

A copy-up operation can incur a noticeable performance overhead. This overhead
is different depending on which storage driver is in use. However, large files,
lots of layers, and deep directory trees can make the impact more noticeable.
Fortunately, the operation only occurs the first time any particular file is
modified. Subsequent modifications to the same file do not cause a copy-up
operation and can operate directly on the file's existing copy already present
in the container layer.

Let's see what happens if we spin up 5 containers based on the `changed-ubuntu`
image we built earlier:

1. From a terminal on your Docker host, run the following `docker run` command
   5 times.

        $ docker run -dit changed-ubuntu bash

        75bab0d54f3cf193cfdc3a86483466363f442fba30859f7dcd1b816b6ede82d4

        $ docker run -dit changed-ubuntu bash

        9280e777d109e2eb4b13ab211553516124a3d4d4280a0edfc7abf75c59024d47

        $ docker run -dit changed-ubuntu bash

        a651680bd6c2ef64902e154eeb8a064b85c9abf08ac46f922ad8dfc11bb5cd8a

        $ docker run -dit changed-ubuntu bash

        8eb24b3b2d246f225b24f2fca39625aaad71689c392a7b552b78baf264647373

        $ docker run -dit changed-ubuntu bash

        0ad25d06bdf6fca0dedc38301b2aff7478b3e1ce3d1acd676573bba57cb1cfef

    This launches 5 containers based on the `changed-ubuntu` image. As each
    container is created, Docker adds a writable layer and assigns it a random
    UUID. This is the value returned from the `docker run` command.

2. Run the `docker ps` command to verify the 5 containers are running.

        $ docker ps
        CONTAINER ID    IMAGE             COMMAND    CREATED               STATUS               PORTS    NAMES
        0ad25d06bdf6    changed-ubuntu    "bash"     About a minute ago    Up About a minute             stoic_ptolemy
        8eb24b3b2d24    changed-ubuntu    "bash"     About a minute ago    Up About a minute             pensive_bartik
        a651680bd6c2    changed-ubuntu    "bash"     2 minutes ago         Up 2 minutes                  hopeful_turing
        9280e777d109    changed-ubuntu    "bash"     2 minutes ago         Up 2 minutes                  backstabbing_mahavira
        75bab0d54f3c    changed-ubuntu    "bash"     2 minutes ago         Up 2 minutes                  boring_pasteur

    The output above shows 5 running containers, all sharing the
    `changed-ubuntu` image. Each `CONTAINER ID` is a shortened form of the UUID
    that was assigned when the container was created.

3. List the contents of the local storage area.

        $ sudo ls /var/lib/docker/containers

        0ad25d06bdf6fca0dedc38301b2aff7478b3e1ce3d1acd676573bba57cb1cfef
        9280e777d109e2eb4b13ab211553516124a3d4d4280a0edfc7abf75c59024d47
        75bab0d54f3cf193cfdc3a86483466363f442fba30859f7dcd1b816b6ede82d4
        a651680bd6c2ef64902e154eeb8a064b85c9abf08ac46f922ad8dfc11bb5cd8a
        8eb24b3b2d246f225b24f2fca39625aaad71689c392a7b552b78baf264647373

Docker's copy-on-write strategy not only reduces the amount of space consumed
by containers, it also reduces the time required to start a container. At start
time, Docker only has to create the thin writable layer for each container.
The diagram below shows these 5 containers sharing a single read-only (RO)
copy of the `changed-ubuntu` image.

If Docker had to make an entire copy of the underlying image stack each time it
started a new container, container start times and disk space used would be
significantly increased.

## Data volumes and the storage driver

When a container is deleted, any data written to the container that is not
stored in a *data volume* is deleted along with the container.


A data volume is a directory or file in the Docker host's filesystem that is
mounted directly into a container. Data volumes are not controlled by the
storage driver. Reads and writes to data volumes bypass the storage driver and
operate at native host speeds. You can mount any number of data volumes into a
container. Multiple containers can also share one or more data volumes.

The diagram below shows a single Docker host running two containers. Each
container exists inside of its own address space within the Docker host's local
storage area (`/var/lib/docker/...`). There is also a single shared data
volume located at `/data` on the Docker host. This is mounted directly into
both containers.

Data volumes reside outside of the local storage area on the Docker host,
further reinforcing their independence from the storage driver's control. When
a container is deleted, any data stored in data volumes persists on the Docker
host.

For detailed information about data volumes, see
[Managing data in containers](https://docs.docker.com/userguide/dockervolumes/).

## Related information

* [Select a storage driver](selectadriver.md)
* [AUFS storage driver in practice](aufs-driver.md)
* [Btrfs storage driver in practice](btrfs-driver.md)
* [Device Mapper storage driver in practice](device-mapper-driver.md)