<!--[metadata]>
+++
title = "OverlayFS storage in practice"
description = "Learn how to optimize your use of OverlayFS driver."
keywords = ["container, storage, driver, OverlayFS "]
[menu.main]
parent = "engine_driver"
+++
<![end-metadata]-->

# Docker and OverlayFS in practice

OverlayFS is a modern *union filesystem* that is similar to AUFS. In comparison
to AUFS, OverlayFS:

* has a simpler design
* has been in the mainline Linux kernel since version 3.18
* is potentially faster

As a result, OverlayFS is rapidly gaining popularity in the Docker community
and is seen by many as a natural successor to AUFS. As promising as OverlayFS
is, it is still relatively young. Therefore caution should be taken before
using it in production Docker environments.

Docker's `overlay` storage driver leverages several OverlayFS features to build
and manage the on-disk structures of images and containers.

Since version 1.12, Docker also provides the `overlay2` storage driver, which is
much more efficient than `overlay` in terms of inode utilization. The `overlay2`
driver is only compatible with Linux kernel 4.0 and later.

For a comparison of `overlay` and `overlay2`, refer to [Select a
storage driver](selectadriver.md#overlay-vs-overlay2).

> **Note**: Since it was merged into the mainline kernel, the OverlayFS *kernel
> module* was renamed from "overlayfs" to "overlay". As a result you may see the
> two terms used interchangeably in some documentation. However, this document
> uses "OverlayFS" to refer to the overall filesystem, and `overlay`/`overlay2`
> to refer to Docker's storage drivers.

## Image layering and sharing with OverlayFS (`overlay`)

OverlayFS takes two directories on a single Linux host, layers one on top of
the other, and provides a single unified view. These directories are often
referred to as *layers* and the technology used to layer them is known as a
*union mount*. The OverlayFS terminology is "lowerdir" for the bottom layer and
"upperdir" for the top layer. The unified view is exposed through its own
directory called "merged".

The diagram below shows how a Docker image and a Docker container are layered.
The image layer is the "lowerdir" and the container layer is the "upperdir".
The unified view is exposed through a directory called "merged" which is
effectively the container's mount point. The diagram shows how Docker constructs
map to OverlayFS constructs.

Notice how the image layer and container layer can contain the same files. When
this happens, the files in the container layer ("upperdir") are dominant and
obscure the existence of the same files in the image layer ("lowerdir"). The
container mount ("merged") presents the unified view.

The `overlay` driver only works with two layers. This means that multi-layered
images cannot be implemented as multiple OverlayFS layers. Instead, each image
layer is implemented as its own directory under `/var/lib/docker/overlay`. Hard
links are then used as a space-efficient way to reference data shared with lower
layers. As of Docker 1.10, image layer IDs no longer correspond to directory
names in `/var/lib/docker/`.

To create a container, the `overlay` driver combines the directory representing
the image's top layer with a new directory for the container. The image's top
layer is the "lowerdir" in the overlay and is read-only. The new directory for
the container is the "upperdir" and is writable.
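
The precedence rule described above (names in the "upperdir" obscure the same names in the "lowerdir") can be illustrated with a small user-space model. This is only a sketch under simplifying assumptions: it looks at top-level names in two ordinary directories, it does not perform a real union mount, and the helper names `merged_view` and `demo` are hypothetical.

```python
import os
import tempfile


def merged_view(lowerdir, upperdir):
    """Map each top-level name to the layer path that would be visible.

    Simplified model: names in upperdir obscure the same names in lowerdir.
    Whiteouts, opaque directories and nested directory merging are ignored.
    """
    view = {}
    for layer in (lowerdir, upperdir):  # upperdir is applied last, so it wins
        for name in os.listdir(layer):
            view[name] = os.path.join(layer, name)
    return view


def demo():
    """Build two throwaway layers and report which layer provides each name."""
    with tempfile.TemporaryDirectory() as root:
        lower = os.path.join(root, "lower")
        upper = os.path.join(root, "upper")
        os.mkdir(lower)
        os.mkdir(upper)
        files = ((lower, "a.txt"), (lower, "b.txt"), (upper, "b.txt"), (upper, "c.txt"))
        for layer, name in files:
            with open(os.path.join(layer, name), "w") as f:
                f.write(os.path.basename(layer))
        view = merged_view(lower, upper)
        return {name: os.path.basename(os.path.dirname(path))
                for name, path in view.items()}
```

In this model, `demo()` reports `b.txt` as coming from the upper layer even though both layers contain it, which mirrors how the "merged" directory presents the container's version of a file.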

### Example: Image and container on-disk constructs (`overlay`)

The following `docker pull` command shows a Docker host downloading a Docker
image comprising five layers.

    $ sudo docker pull ubuntu

    Using default tag: latest
    latest: Pulling from library/ubuntu

    5ba4f30e5bea: Pull complete
    9d7d19c9dc56: Pull complete
    ac6ad7efd0f9: Pull complete
    e7491a747824: Pull complete
    a3ed95caeb02: Pull complete
    Digest: sha256:46fb5d001b88ad904c5c732b086b596b92cfb4a4840a3abd0e35dbb6870585e4
    Status: Downloaded newer image for ubuntu:latest

Each image layer has its own directory under `/var/lib/docker/overlay/`. This
is where the contents of each image layer are stored.

The output of the command below shows the five directories that store the
contents of each image layer just pulled. However, as can be seen, the image
layer IDs do not match the directory names in `/var/lib/docker/overlay`. This
is normal behavior in Docker 1.10 and later.

    $ ls -l /var/lib/docker/overlay/

    total 20
    drwx------ 3 root root 4096 Jun 20 16:11 38f3ed2eac129654acef11c32670b534670c3a06e483fce313d72e3e0a15baa8
    drwx------ 3 root root 4096 Jun 20 16:11 55f1e14c361b90570df46371b20ce6d480c434981cbda5fd68c6ff61aa0a5358
    drwx------ 3 root root 4096 Jun 20 16:11 824c8a961a4f5e8fe4f4243dab57c5be798e7fd195f6d88ab06aea92ba931654
    drwx------ 3 root root 4096 Jun 20 16:11 ad0fe55125ebf599da124da175174a4b8c1878afe6907bf7c78570341f308461
    drwx------ 3 root root 4096 Jun 20 16:11 edab9b5e5bf73f2997524eebeac1de4cf9c8b904fa8ad3ec43b3504196aa3801

The image layer directories contain the files unique to that layer as well as
hard links to the data that is shared with lower layers. This allows for
efficient use of disk space. In the output below, the same inode number is
reported for `/bin/ls` in two different layer directories, which shows that
both paths are hard links to the same file.

    $ ls -i /var/lib/docker/overlay/38f3ed2eac129654acef11c32670b534670c3a06e483fce313d72e3e0a15baa8/root/bin/ls

    19793696 /var/lib/docker/overlay/38f3ed2eac129654acef11c32670b534670c3a06e483fce313d72e3e0a15baa8/root/bin/ls

    $ ls -i /var/lib/docker/overlay/55f1e14c361b90570df46371b20ce6d480c434981cbda5fd68c6ff61aa0a5358/root/bin/ls

    19793696 /var/lib/docker/overlay/55f1e14c361b90570df46371b20ce6d480c434981cbda5fd68c6ff61aa0a5358/root/bin/ls

Containers also exist on-disk in the Docker host's filesystem under
`/var/lib/docker/overlay/`. If you inspect the directory relating to a running
container using the `ls -l` command, you find the following file and
directories.

    $ ls -l /var/lib/docker/overlay/<directory-of-running-container>

    total 16
    -rw-r--r-- 1 root root 64 Jun 20 16:39 lower-id
    drwxr-xr-x 1 root root 4096 Jun 20 16:39 merged
    drwxr-xr-x 4 root root 4096 Jun 20 16:39 upper
    drwx------ 3 root root 4096 Jun 20 16:39 work

These four filesystem objects are all artifacts of OverlayFS. The "lower-id"
file contains the ID of the top layer of the image the container is based on.
This is used by OverlayFS as the "lowerdir".

    $ cat /var/lib/docker/overlay/ec444863a55a9f1ca2df72223d459c5d940a721b2288ff86a3f27be28b53be6c/lower-id

    55f1e14c361b90570df46371b20ce6d480c434981cbda5fd68c6ff61aa0a5358

The "upper" directory is the container's read-write layer. Any changes made to
the container are written to this directory.

The "merged" directory is effectively the container's mount point. This is where
the unified view of the image ("lowerdir") and container ("upperdir") is
exposed. Any changes written to the container are immediately reflected in this
directory.

The "work" directory is required for OverlayFS to function. It is used for
things such as *copy_up* operations.
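
To make the relationship between these four objects concrete, the following sketch assembles OverlayFS mount options from a container directory laid out as described above. It is an illustration only, not Docker's actual implementation: the function name `overlay_mount_options` is hypothetical, and it assumes that the image layer's contents live in a `root` subdirectory, as the `ls -i` paths earlier in this example suggest.

```python
import os


def overlay_mount_options(overlay_root, container_id):
    """Return (mount_point, options) for a container directory that holds
    a "lower-id" file plus "upper", "work" and "merged" directories."""
    container_dir = os.path.join(overlay_root, container_id)

    # "lower-id" names the directory of the image's top layer.
    with open(os.path.join(container_dir, "lower-id")) as f:
        lower_id = f.read().strip()

    options = ",".join([
        # Assumption: image layer contents are stored under "<layer>/root".
        "lowerdir=" + os.path.join(overlay_root, lower_id, "root"),
        "upperdir=" + os.path.join(container_dir, "upper"),
        "workdir=" + os.path.join(container_dir, "work"),
    ])
    return os.path.join(container_dir, "merged"), options
```

The returned option string has the same `lowerdir=...,upperdir=...,workdir=...` shape that OverlayFS expects at mount time. The function only reads `lower-id` and builds strings; it does not mount anything.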

You can verify all of these constructs from the output of the `mount` command.
(Ellipses and line breaks are used in the output below to enhance readability.)

    $ mount | grep overlay

    overlay on /var/lib/docker/overlay/ec444863a55a.../merged
    type overlay (rw,relatime,lowerdir=/var/lib/docker/overlay/55f1e14c361b.../root,
    upperdir=/var/lib/docker/overlay/ec444863a55a.../upper,
    workdir=/var/lib/docker/overlay/ec444863a55a.../work)

The output reflects that the overlay is mounted as read-write ("rw").

## Image layering and sharing with OverlayFS (`overlay2`)

While the `overlay` driver only works with a single lower OverlayFS layer and
hence requires hard links for implementation of multi-layered images, the
`overlay2` driver natively supports multiple lower OverlayFS layers (up to 128).

Hence the `overlay2` driver offers better performance for layer-related Docker
commands (e.g. `docker build` and `docker commit`), and consumes fewer inodes
than the `overlay` driver.

### Example: Image and container on-disk constructs (`overlay2`)

After downloading a five-layer image using `docker pull ubuntu`, you can see
six directories under `/var/lib/docker/overlay2`.

    $ ls -l /var/lib/docker/overlay2

    total 24
    drwx------ 5 root root 4096 Jun 20 07:36 223c2864175491657d238e2664251df13b63adb8d050924fd1bfcdb278b866f7
    drwx------ 3 root root 4096 Jun 20 07:36 3a36935c9df35472229c57f4a27105a136f5e4dbef0f87905b2e506e494e348b
    drwx------ 5 root root 4096 Jun 20 07:36 4e9fa83caff3e8f4cc83693fa407a4a9fac9573deaf481506c102d484dd1e6a1
    drwx------ 5 root root 4096 Jun 20 07:36 e8876a226237217ec61c4baf238a32992291d059fdac95ed6303bdff3f59cff5
    drwx------ 5 root root 4096 Jun 20 07:36 eca1e4e1694283e001f200a667bb3cb40853cf2d1b12c29feda7422fed78afed
    drwx------ 2 root root 4096 Jun 20 07:36 l

The "l" directory contains shortened layer identifiers as symbolic links. These
shortened identifiers are used to avoid hitting the page size limitation on
mount arguments.

    $ ls -l /var/lib/docker/overlay2/l

    total 20
    lrwxrwxrwx 1 root root 72 Jun 20 07:36 6Y5IM2XC7TSNIJZZFLJCS6I4I4 -> ../3a36935c9df35472229c57f4a27105a136f5e4dbef0f87905b2e506e494e348b/diff
    lrwxrwxrwx 1 root root 72 Jun 20 07:36 B3WWEFKBG3PLLV737KZFIASSW7 -> ../4e9fa83caff3e8f4cc83693fa407a4a9fac9573deaf481506c102d484dd1e6a1/diff
    lrwxrwxrwx 1 root root 72 Jun 20 07:36 JEYMODZYFCZFYSDABYXD5MF6YO -> ../eca1e4e1694283e001f200a667bb3cb40853cf2d1b12c29feda7422fed78afed/diff
    lrwxrwxrwx 1 root root 72 Jun 20 07:36 NFYKDW6APBCCUCTOUSYDH4DXAT -> ../223c2864175491657d238e2664251df13b63adb8d050924fd1bfcdb278b866f7/diff
    lrwxrwxrwx 1 root root 72 Jun 20 07:36 UL2MW33MSE3Q5VYIKBRN4ZAGQP -> ../e8876a226237217ec61c4baf238a32992291d059fdac95ed6303bdff3f59cff5/diff

The lowest layer contains a "link" file, which holds the name of the shortened
identifier, and a "diff" directory, which holds the layer's contents.

    $ ls /var/lib/docker/overlay2/3a36935c9df35472229c57f4a27105a136f5e4dbef0f87905b2e506e494e348b/

    diff link

    $ cat /var/lib/docker/overlay2/3a36935c9df35472229c57f4a27105a136f5e4dbef0f87905b2e506e494e348b/link

    6Y5IM2XC7TSNIJZZFLJCS6I4I4

    $ ls /var/lib/docker/overlay2/3a36935c9df35472229c57f4a27105a136f5e4dbef0f87905b2e506e494e348b/diff

    bin boot dev etc home lib lib64 media mnt opt proc root run sbin srv sys tmp usr var

The second layer contains a "lower" file, which denotes the layer composition,
and a "diff" directory for the layer contents. It also contains the "merged"
and "work" directories.

    $ ls /var/lib/docker/overlay2/223c2864175491657d238e2664251df13b63adb8d050924fd1bfcdb278b866f7

    diff link lower merged work

    $ cat /var/lib/docker/overlay2/223c2864175491657d238e2664251df13b63adb8d050924fd1bfcdb278b866f7/lower

    l/6Y5IM2XC7TSNIJZZFLJCS6I4I4

    $ ls /var/lib/docker/overlay2/223c2864175491657d238e2664251df13b63adb8d050924fd1bfcdb278b866f7/diff/

    etc sbin usr var

The directory for a running container has similar files and directories as
well. Note that the entries in the "lower" file are separated by ':' and
ordered from the highest layer to the lowest.

    $ ls -l /var/lib/docker/overlay2/<directory-of-running-container>

    $ cat /var/lib/docker/overlay2/<directory-of-running-container>/lower

    l/DJA75GUWHWG7EWICFYX54FIOVT:l/B3WWEFKBG3PLLV737KZFIASSW7:l/JEYMODZYFCZFYSDABYXD5MF6YO:l/UL2MW33MSE3Q5VYIKBRN4ZAGQP:l/NFYKDW6APBCCUCTOUSYDH4DXAT:l/6Y5IM2XC7TSNIJZZFLJCS6I4I4

The result of `mount` is as follows:

    $ mount | grep overlay

    overlay on /var/lib/docker/overlay2/9186877cdf386d0a3b016149cf30c208f326dca307529e646afce5b3f83f5304/merged
    type overlay (rw,relatime,
    lowerdir=l/DJA75GUWHWG7EWICFYX54FIOVT:l/B3WWEFKBG3PLLV737KZFIASSW7:l/JEYMODZYFCZFYSDABYXD5MF6YO:l/UL2MW33MSE3Q5VYIKBRN4ZAGQP:l/NFYKDW6APBCCUCTOUSYDH4DXAT:l/6Y5IM2XC7TSNIJZZFLJCS6I4I4,
    upperdir=9186877cdf386d0a3b016149cf30c208f326dca307529e646afce5b3f83f5304/diff,
    workdir=9186877cdf386d0a3b016149cf30c208f326dca307529e646afce5b3f83f5304/work)

## Container reads and writes with overlay

Consider three scenarios where a container opens a file for read access with
overlay.

- **The file does not exist in the container layer**. If a container opens a
file for read access and the file does not already exist in the container
("upperdir") it is read from the image ("lowerdir"). This should incur very
little performance overhead.

- **The file only exists in the container layer**. If a container opens a file
for read access and the file exists in the container ("upperdir") and not in
the image ("lowerdir"), it is read directly from the container.

- **The file exists in the container layer and the image layer**. If a
container opens a file for read access and the file exists in the image layer
and the container layer, the file's version in the container layer is read.
This is because files in the container layer ("upperdir") obscure files with
the same name in the image layer ("lowerdir").
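
The three read scenarios above, together with the ':'-separated "lower" list used by `overlay2`, can be modelled in a few lines. The sketch below is a user-space approximation rather than kernel or Docker code: `parse_lower` and `resolve_read` are hypothetical names, the layers are plain directories, and whiteouts and opaque directories are deliberately ignored.

```python
import os


def parse_lower(lower_text):
    """Split the contents of an overlay2 "lower" file into layer references,
    ordered from the highest layer to the lowest."""
    return [entry for entry in lower_text.strip().split(":") if entry]


def resolve_read(relpath, upperdir, lowerdirs):
    """Return the path a read would be served from, or None if absent.

    Simplified model: the container layer is checked first, then each lower
    layer from highest to lowest. Whiteouts and opaque directories are ignored.
    """
    for layer in [upperdir] + list(lowerdirs):
        candidate = os.path.join(layer, relpath)
        if os.path.lexists(candidate):
            return candidate
    return None
```

Given the "lower" file shown earlier, `parse_lower` would yield the shortened identifiers in the order the layers are searched. `resolve_read` then returns the container's copy when one exists and otherwise falls through to the first image layer that has the file.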

Consider some scenarios where files in a container are modified.

- **Writing to a file for the first time**. The first time a container writes
to an existing file, that file does not exist in the container ("upperdir").
The `overlay`/`overlay2` driver performs a *copy_up* operation to copy the file
from the image ("lowerdir") to the container ("upperdir"). The container then
writes the changes to the new copy of the file in the container layer.

    However, OverlayFS works at the file level not the block level. This means
    that all OverlayFS copy-up operations copy entire files, even if the file is
    very large and only a small part of it is being modified. This can have a
    noticeable impact on container write performance. However, two things are
    worth noting:

    * The copy_up operation only occurs the first time any given file is
    written to. Subsequent writes to the same file will operate against the copy of
    the file already copied up to the container.

    * With the `overlay` driver, OverlayFS works with only two layers. This
    means that performance should be better than AUFS, which can suffer
    noticeable latencies when searching for files in images with many layers.

- **Deleting files and directories**. When files are deleted within a container
a *whiteout* file is created in the container's "upperdir". The version of the
file in the image layer ("lowerdir") is not deleted. However, the whiteout file
in the container obscures it.

    Deleting a directory in a container results in an *opaque directory* being
    created in the "upperdir". This has the same effect as a whiteout file and
    effectively masks the existence of the directory in the image's "lowerdir".

- **Renaming directories**. Calling `rename(2)` for a directory is allowed only
when both the source and the destination paths are on the top layer.
Otherwise, it returns `EXDEV` ("cross-device link not permitted").

    So your application has to be designed so that it can handle `EXDEV` and fall
    back to a "copy and unlink" strategy.

## Configure Docker with the `overlay`/`overlay2` storage driver

To configure Docker to use the `overlay` storage driver your Docker host must be
running version 3.18 of the Linux kernel (preferably newer) with the overlay
kernel module loaded. For the `overlay2` driver, the version of your kernel must
be 4.0 or newer. OverlayFS can operate on top of most supported Linux filesystems.
However, ext4 is currently recommended for use in production environments.

The following procedure shows you how to configure your Docker host to use
OverlayFS. The procedure assumes that the Docker daemon is in a stopped state.

> **Caution:** If you have already run the Docker daemon on your Docker host
> and have images you want to keep, `push` them to Docker Hub or your private
> Docker Trusted Registry before attempting this procedure.

1. If it is running, stop the Docker `daemon`.

2. Verify your kernel version and that the overlay kernel module is loaded.

        $ uname -r

        3.19.0-21-generic

        $ lsmod | grep overlay

        overlay

3. Start the Docker daemon with the `overlay`/`overlay2` storage driver.

        $ dockerd --storage-driver=overlay &

        [1] 29403
        root@ip-10-0-0-174:/home/ubuntu# INFO[0000] Listening for HTTP on unix (/var/run/docker.sock)
        INFO[0000] Option DefaultDriver: bridge
        INFO[0000] Option DefaultNetwork: bridge
        <output truncated>

    Alternatively, you can force the Docker daemon to automatically start with
    the `overlay`/`overlay2` driver by editing the Docker config file and adding
    the `--storage-driver=overlay` flag to the `DOCKER_OPTS` line. Once this option
    is set you can start the daemon using normal startup scripts without having
    to manually pass in the `--storage-driver` flag.

4. Verify that the daemon is using the `overlay`/`overlay2` storage driver.

        $ docker info

        Containers: 0
        Images: 0
        Storage Driver: overlay
         Backing Filesystem: extfs
        <output truncated>

    Notice that the *Backing filesystem* in the output above is showing as
    `extfs`. Multiple backing filesystems are supported but `extfs` (ext4) is
    recommended for production use cases.

Your Docker host is now using the `overlay`/`overlay2` storage driver. If you
run the `mount` command, you'll find Docker has automatically created the
`overlay` mount with the required "lowerdir", "upperdir", "merged" and "workdir"
constructs.

## OverlayFS and Docker Performance

As a general rule, the `overlay`/`overlay2` drivers should be fast, almost
certainly faster than `aufs` and `devicemapper`. In certain circumstances they
may also be faster than `btrfs`. That said, there are a few things to be aware
of relative to the performance of Docker using the `overlay`/`overlay2` storage
drivers.

- **Page Caching**. OverlayFS supports page cache sharing. This means multiple
containers accessing the same file can share a single page cache entry (or
entries). This makes the `overlay`/`overlay2` drivers efficient with memory and
a good option for PaaS and other high density use cases.

- **copy_up**. As with AUFS, OverlayFS has to perform copy-up operations any
time a container writes to a file for the first time. This can insert latency
into the write operation, especially if the file being copied up is
large. However, once the file has been copied up, all subsequent writes to that
file occur without the need for further copy-up operations.

    The OverlayFS copy_up operation should be faster than the same operation
    with AUFS.
    This is because AUFS supports more layers than OverlayFS and it is
    possible to incur far larger latencies if searching through many AUFS layers.

- **Inode limits**. Use of the `overlay` storage driver can cause excessive
inode consumption. This is especially so as the number of images and containers
on the Docker host grows. A Docker host with a large number of images and lots
of started and stopped containers can quickly run out of inodes. The `overlay2`
driver does not have this issue.

Unfortunately you can only specify the number of inodes in a filesystem at the
time of creation. For this reason, you may wish to consider putting
`/var/lib/docker` on a separate device with its own filesystem, or manually
specifying the number of inodes when creating the filesystem.

The following generic performance best practices also apply to OverlayFS.

- **Solid State Devices (SSD)**. For best performance it is always a good idea
to use fast storage media such as solid state devices (SSD).

- **Use Data Volumes**. Data volumes provide the best and most predictable
performance. This is because they bypass the storage driver and do not incur
any of the potential overheads introduced by thin provisioning and
copy-on-write. For this reason, you should place heavy write workloads on data
volumes.

## OverlayFS compatibility

To summarize the aspects of OverlayFS that are incompatible with other
filesystems:

- **open(2)**. OverlayFS only implements a subset of the POSIX standards.
This can result in certain OverlayFS operations breaking POSIX standards. One
such operation is the *copy-up* operation. Suppose that your application calls
`fd1=open("foo", O_RDONLY)` and then `fd2=open("foo", O_RDWR)`. In this case,
your application expects `fd1` and `fd2` to refer to the same file.
However, due to a copy-up operation that occurs after the first call to
`open(2)`, the descriptors refer to different files.

    `yum` is known to be affected unless the `yum-plugin-ovl` package is installed.
    If the `yum-plugin-ovl` package is not available in your distribution (e.g.
    RHEL/CentOS prior to 6.8 or 7.2), you may need to run `touch /var/lib/rpm/*`
    before running `yum install`.

- **rename(2)**. OverlayFS does not fully support the `rename(2)` system call.
Your application needs to detect its failure and fall back to a "copy and
unlink" strategy.
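
As an illustration of the "copy and unlink" fallback mentioned for `rename(2)`, the following sketch wraps `os.rename` and handles `EXDEV`. The function name `rename_with_fallback` is hypothetical, and the fallback path is not atomic, so treat it as a starting point rather than a complete solution.

```python
import errno
import os
import shutil


def rename_with_fallback(src, dst):
    """Rename src to dst, falling back to copy-and-unlink on EXDEV."""
    try:
        os.rename(src, dst)
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise  # unrelated failure: let the caller handle it
        if os.path.isdir(src) and not os.path.islink(src):
            # Directory: copy the whole tree, then remove the original.
            shutil.copytree(src, dst, symlinks=True)
            shutil.rmtree(src)
        else:
            # Regular file or symlink: copy with metadata, then unlink.
            shutil.copy2(src, dst, follow_symlinks=False)
            os.unlink(src)
```

Python's standard library already provides `shutil.move`, which falls back to copying when a plain rename is not possible. The explicit version above is shown only to make the strategy visible.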