
     1  <!--[metadata]>
     2  +++
     3  title="Device mapper storage in practice"
     4  description="Learn how to optimize your use of device mapper driver."
     5  keywords=["container, storage, driver, device mapper"]
     6  [menu.main]
     7  parent="engine_driver"
     8  +++
     9  <![end-metadata]-->
    10  
    11  # Docker and the Device Mapper storage driver
    12  
    13  Device Mapper is a kernel-based framework that underpins many advanced
    14  volume management technologies on Linux. Docker's `devicemapper` storage driver
    15  leverages the thin provisioning and snapshotting capabilities of this framework
    16  for image and container management. This article refers to the Device Mapper
    17  storage driver as `devicemapper`, and the kernel framework as `Device Mapper`.
    18  
    19  
    20  >**Note**: The [Commercially Supported Docker Engine (CS-Engine) running on RHEL
    21  and CentOS Linux](https://www.docker.com/compatibility-maintenance) requires
    22  that you use the `devicemapper` storage driver.
    23  
    24  
    25  ## An alternative to AUFS
    26  
    27  Docker originally ran on Ubuntu and Debian Linux and used AUFS for its storage
    28  backend. As Docker became popular, many of the companies that wanted to use it
    29  were using Red Hat Enterprise Linux (RHEL). Unfortunately, because the upstream
    30  mainline Linux kernel did not include AUFS, RHEL did not use AUFS either.
    31  
To correct this, Red Hat developers investigated getting AUFS into the mainline
    33  kernel. Ultimately, though, they decided a better idea was to develop a new
    34  storage backend. Moreover, they would base this new storage backend on existing
    35  `Device Mapper` technology.
    36  
    37  Red Hat collaborated with Docker Inc. to contribute this new driver. As a result
    38  of this collaboration, Docker's Engine was re-engineered to make the storage
    39  backend pluggable. So it was that the `devicemapper` became the second storage
    40  driver Docker supported.
    41  
    42  Device Mapper has been included in the mainline Linux kernel since version
2.6.9. It is a core part of the RHEL family of Linux distributions. This means that
    44  the `devicemapper` storage driver is based on stable code that has a lot of
    45  real-world production deployments and strong community support.
    46  
    47  
    48  ## Image layering and sharing
    49  
    50  The `devicemapper` driver stores every image and container on its own virtual
    51  device. These devices are thin-provisioned copy-on-write snapshot devices.
    52  Device Mapper technology works at the block level rather than the file level.
    53  This means that `devicemapper` storage driver's thin provisioning and
    54  copy-on-write operations work with blocks rather than entire files.
    55  
    56  >**Note**: Snapshots are also referred to as *thin devices* or *virtual
    57  >devices*. They all mean the same thing in the context of the `devicemapper`
    58  >storage driver.
    59  
    60  With `devicemapper` the high level process for creating images is as follows:
    61  
    62  1. The `devicemapper` storage driver creates a thin pool.
    63  
    64      The pool is created from block devices or loop mounted sparse files (more
    65  on this later).
    66  
    67  2. Next it creates a *base device*.
    68  
    69      A base device is a thin device with a filesystem. You can see which
    70  filesystem is in use by running the `docker info` command and checking the
    71  `Backing filesystem` value.
    72  
    73  3. Each new image (and image layer) is a snapshot of this base device.
    74  
    75      These are thin provisioned copy-on-write snapshots. This means that they
    76  are initially empty and only consume space from the pool when data is written
    77  to them.
    78  
    79  With `devicemapper`, container layers are snapshots of the image they are
    80  created from. Just as with images, container snapshots are thin provisioned
    81  copy-on-write snapshots. The container snapshot stores all updates to the
    82  container. The `devicemapper` allocates space to them on-demand from the pool
    83  as and when data is written to the container.
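
If you have a host that is already using the `devicemapper` driver, you can see
the thin pool and its snapshot devices with standard Device Mapper tooling. This
is a quick sketch; substitute the pool name that `docker info` reports on your
own host.

```bash
# List all Device Mapper devices. The thin pool and any active thin
# snapshot devices created by the devicemapper driver appear here.
$ sudo dmsetup ls

# Show the status of the thin pool (use the pool name from `docker info`).
$ sudo dmsetup status docker-202:2-25220302-pool
```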
    84  
    85  The high level diagram below shows a thin pool with a base device and two
    86  images.
    87  
    88  ![](images/base_device.jpg)
    89  
    90  If you look closely at the diagram you'll see that it's snapshots all the way
    91  down. Each image layer is a snapshot of the layer below it. The lowest layer of
    92   each image is a snapshot of the base device that exists in the pool. This
    93  base device is a `Device Mapper` artifact and not a Docker image layer.
    94  
    95  A container is a snapshot of the image it is created from. The diagram below
    96  shows two containers - one based on the Ubuntu image and the other based on the
    97   Busybox image.
    98  
    99  ![](images/two_dm_container.jpg)
   100  
   101  
   102  ## Reads with the devicemapper
   103  
   104  Let's look at how reads and writes occur using the `devicemapper` storage
   105  driver. The diagram below shows the high level process for reading a single
   106  block (`0x44f`) in an example container.
   107  
   108  ![](images/dm_container.jpg)
   109  
   110  1. An application makes a read request for block `0x44f` in the container.
   111  
   112      Because the container is a thin snapshot of an image it does not have the
   113  data. Instead, it has a pointer (PTR) to where the data is stored in the image
   114  snapshot lower down in the image stack.
   115  
   116  2. The storage driver follows the pointer to block `0xf33` in the snapshot
   117  relating to image layer `a005...`.
   118  
   119  3. The `devicemapper` copies the contents of block `0xf33` from the image
   120  snapshot to memory in the container.
   121  
   122  4. The storage driver returns the data to the requesting application.
   123  
   124  ### Write examples
   125  
   126  With the `devicemapper` driver, writing new data to a container is accomplished
   127   by an *allocate-on-demand* operation. Updating existing data uses a
   128  copy-on-write operation. Because Device Mapper is a block-based technology
   129  these operations occur at the block level.
   130  
   131  For example, when making a small change to a large file in a container, the
   132  `devicemapper` storage driver does not copy the entire file. It only copies the
   133   blocks to be modified. Each block is 64KB.
   134  
   135  #### Writing new data
   136  
   137  To write 56KB of new data to a container:
   138  
   139  1. An application makes a request to write 56KB of new data to the container.
   140  
   141  2. The allocate-on-demand operation allocates a single new 64KB block to the
   142  container's snapshot.
   143  
   144      If the write operation is larger than 64KB, multiple new blocks are
   145  allocated to the container's snapshot.
   146  
   147  3. The data is written to the newly allocated block.
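
A rough way to observe this allocation granularity is to compare the pool's
`Data Space Used` value in `docker info` before and after a small write. The
sketch below is illustrative only; the container name and file path are
arbitrary, and other activity on the host also moves the numbers.

```bash
# Check pool usage before the write.
$ sudo docker info | grep "Data Space Used"

# Write 56KB of new data inside a container.
$ sudo docker run --name write-test busybox \
    dd if=/dev/zero of=/newfile bs=1024 count=56

# Check pool usage again; it grows in whole 64KB blocks (plus filesystem
# metadata), not by exactly 56KB.
$ sudo docker info | grep "Data Space Used"

# Clean up.
$ sudo docker rm write-test
```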
   148  
   149  #### Overwriting existing data
   150  
   151  To modify existing data for the first time:
   152  
   153  1. An application makes a request to modify some data in the container.
   154  
   155  2. A copy-on-write operation locates the blocks that need updating.
   156  
   157  3. The operation allocates new empty blocks to the container snapshot and
   158  copies the data into those blocks.
   159  
   160  4. The modified data is written into the newly allocated blocks.
   161  
   162  The application in the container is unaware of any of these
   163  allocate-on-demand and copy-on-write operations. However, they may add latency
   164  to the application's read and write operations.
   165  
   166  ## Configuring Docker with Device Mapper
   167  
   168  The `devicemapper` is the default Docker storage driver on some Linux
   169  distributions. This includes RHEL and most of its forks. Currently, the
   170  following distributions support the driver:
   171  
   172  * RHEL/CentOS/Fedora
   173  * Ubuntu 12.04
   174  * Ubuntu 14.04
   175  * Debian
   176  
   177  Docker hosts running the `devicemapper` storage driver default to a
   178  configuration mode known as `loop-lvm`. This mode uses sparse files to build
   179  the thin pool used by image and container snapshots. The mode is designed to
   180  work out-of-the-box with no additional configuration. However, production
   181  deployments should not run under `loop-lvm` mode.
   182  
   183  You can detect the mode by viewing the `docker info` command:
   184  
   185      $ sudo docker info
   186      Containers: 0
   187      Images: 0
   188      Storage Driver: devicemapper
   189       Pool Name: docker-202:2-25220302-pool
   190       Pool Blocksize: 65.54 kB
   191       Backing Filesystem: xfs
   192       ...
   193       Data loop file: /var/lib/docker/devicemapper/devicemapper/data
   194       Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
   195       Library Version: 1.02.93-RHEL7 (2015-01-28)
   196       ...
   197  
   198  The output above shows a Docker host running with the `devicemapper` storage
driver operating in `loop-lvm` mode. This is indicated by the `Data loop file`
and `Metadata loop file` values, which point to files under
`/var/lib/docker/devicemapper/devicemapper`. These are loopback-mounted sparse
files.
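
You can also inspect these backing files directly. The sketch below is
illustrative; it simply confirms that the files are sparse (large apparent
size, little space actually consumed) and attached to loopback devices.

```bash
# Apparent size of the sparse data and metadata files.
$ sudo ls -lh /var/lib/docker/devicemapper/devicemapper/

# Space actually consumed on disk.
$ sudo du -sh /var/lib/docker/devicemapper/devicemapper/data

# Loop devices attached to the sparse files.
$ sudo losetup -a
```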
   203  
   204  ### Configure direct-lvm mode for production
   205  
The preferred configuration for production deployments is `direct-lvm`. This
   207  mode uses block devices to create the thin pool. The following procedure shows
   208  you how to configure a Docker host to use the `devicemapper` storage driver in
   209  a `direct-lvm` configuration.
   210  
   211  > **Caution:** If you have already run the Engine daemon on your Docker host
> and have images you want to keep, `push` them to Docker Hub or your private
   213  > Docker Trusted Registry before attempting this procedure.
   214  
   215  The procedure below will create a 90GB data volume and 4GB metadata volume to
   216  use as backing for the storage pool. It assumes that you have a spare block
   217  device at `/dev/sdd` with enough free space to complete the task. The device
identifier and volume sizes may be different in your environment, and you
should substitute your own values throughout the procedure.
   220  
   221  The procedure also assumes that the Engine daemon is in the `stopped` state.
   222  Any existing images or data are lost by this process.
   223  
   224  1. Log in to the Docker host you want to configure.
   225  2. If it is running, stop the Engine daemon.
3. Install the logical volume management tools (LVM2).
   227  
   228      ```bash
   229      $ yum install lvm2
   230      ```
   231  4. Create a physical volume replacing `/dev/sdd` with your block device.
   232  
   233      ```bash
   234      $ pvcreate /dev/sdd
    ```
   236  
   237  5. Create a 'docker' volume group.
   238  
   239      ```bash
   240      $ vgcreate docker /dev/sdd
   241      ```
   242  
   243  6. Create a thin pool named `thinpool`.
   244  
    In this example, the data logical volume is 95% of the 'docker' volume group
    size. Leaving this free space allows the data or metadata volume to be
    auto-extended, as a temporary stopgap, if space runs low.
   248  
   249      ```bash
    $ lvcreate --wipesignatures y -n thinpool docker -l 95%VG
    $ lvcreate --wipesignatures y -n thinpoolmeta docker -l 1%VG
    ```
   253  
   254  7. Convert the pool to a thin pool.
   255  
   256      ```bash
   257      $ lvconvert -y --zero n -c 512K --thinpool docker/thinpool --poolmetadata docker/thinpoolmeta
   258      ```
   259  
   260  8. Configure autoextension of thin pools via an `lvm` profile.
   261  
   262      ```bash
   263      $ vi /etc/lvm/profile/docker-thinpool.profile
   264      ```
   265  
9. Specify the `thin_pool_autoextend_threshold` value.
   267  
   268      The value should be the percentage of space used before `lvm` attempts
   269      to autoextend the available space (100 = disabled).
   270  
   271      ```
   272      thin_pool_autoextend_threshold = 80
   273      ```
   274  
10. Set the `thin_pool_autoextend_percent` value, which controls how much space is added when thin pool autoextension occurs.

    The value is the percentage of space by which to increase the thin pool (100 =
    disabled).
   279  
   280      ```
   281      thin_pool_autoextend_percent = 20
   282      ```
   283  
11. Check your work. Your `/etc/lvm/profile/docker-thinpool.profile` file should
    appear similar to the following:
   287  
    ```
    activation {
        thin_pool_autoextend_threshold=80
        thin_pool_autoextend_percent=20
    }
    ```
   294  
12. Apply your new lvm profile.
   296  
   297      ```bash
   298      $ lvchange --metadataprofile docker-thinpool docker/thinpool
    ```
   300  
   301  13. Verify the `lv` is monitored.
   302  
   303      ```bash
   304      $ lvs -o+seg_monitor
   305      ```
   306  
   307  14. If Engine was previously started, clear your graph driver directory.
   308  
   309      Clearing your graph driver removes any images and containers in your Docker
   310      installation.
   311  
   312      ```bash
   313      $ rm -rf /var/lib/docker/*
   314      ```
   315  
15. Configure the Engine daemon with specific devicemapper options.

    There are two ways to do this. You can set options on the command line when you start the daemon:
   319  
   320      ```bash
   321      --storage-driver=devicemapper --storage-opt=dm.thinpooldev=/dev/mapper/docker-thinpool --storage-opt dm.use_deferred_removal=true
   322      ```
   323  
   324      You can also set them for startup in the `daemon.json` configuration, for example:
   325  
   326      ```json
   327       {
   328               "storage-driver": "devicemapper",
   329               "storage-opts": [
   330                       "dm.thinpooldev=/dev/mapper/docker-thinpool",
   331                       "dm.use_deferred_removal=true"
   332               ]
   333       }
   334      ```
16. Start the Engine daemon.
   336  
   337      ```bash
   338      $ systemctl start docker
   339      ```
   340  
   341  After you start the Engine daemon, ensure you monitor your thin pool and volume
   342  group free space. While the volume group will auto-extend, it can still fill
up. To monitor logical volumes, use `lvs` without options or `lvs -a` to see the
data and metadata sizes. To monitor volume group free space, use the `vgs` command.
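
For example (a sketch; wrap these in `watch` or your own monitoring system for
continuous checks):

```bash
# Data% and Meta% show how full the thin pool is.
$ sudo lvs -a

# VFree shows how much of the volume group remains for auto-extension.
$ sudo vgs
```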
   345  
Logs can show the auto-extension of the thin pool when it hits the threshold. To
view the logs, use:
   348  
   349  ```bash
   350  journalctl -fu dm-event.service
   351  ```
   352  
If you run into repeated problems with the thin pool, you can use the
   354  `dm.min_free_space` option to tune the Engine behavior. This value ensures that
   355  operations fail with a warning when the free space is at or near the minimum.
   356  For information, see <a
   357  href="https://docs.docker.com/engine/reference/commandline/daemon/#storage-driver-options"
   358  target="_blank">the storage driver options in the Engine daemon reference</a>.
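
For example, assuming a 10% floor (the value is only an illustration), you could
add the option alongside the other storage options from the earlier step:

```bash
--storage-driver=devicemapper --storage-opt=dm.thinpooldev=/dev/mapper/docker-thinpool --storage-opt dm.use_deferred_removal=true --storage-opt dm.min_free_space=10%
```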
   359  
   360  
   361  ### Examine devicemapper structures on the host
   362  
   363  You can use the `lsblk` command to see the device files created above and the
   364  `pool` that the `devicemapper` storage driver creates on top of them.
   365  
   366      $ sudo lsblk
   367      NAME                       MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
   368      xvda                       202:0    0    8G  0 disk
   369      └─xvda1                    202:1    0    8G  0 part /
   370      xvdf                       202:80   0   10G  0 disk
   371      ├─vg--docker-data          253:0    0   90G  0 lvm
   372      │ └─docker-202:1-1032-pool 253:2    0   10G  0 dm
   373      └─vg--docker-metadata      253:1    0    4G  0 lvm
   374        └─docker-202:1-1032-pool 253:2    0   10G  0 dm
   375  
   376  The diagram below shows the image from prior examples updated with the detail
   377  from the `lsblk` command above.
   378  
   379  ![](http://farm1.staticflickr.com/703/22116692899_0471e5e160_b.jpg)
   380  
In the diagram, the pool is named `docker-202:1-1032-pool` and spans the `data`
and `metadata` devices created earlier. The `devicemapper` constructs the pool
name as follows:
   384  
   385  ```
docker-MAJ:MIN-INO-pool
   387  ```
   388  
   389  `MAJ`, `MIN` and `INO` refer to the major and minor device numbers and inode.
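
To cross-check these values on a host, compare them against the device that
backs `/var/lib/docker` and the inode numbers of Docker's state directories.
This is only a sketch; exactly which directory supplies the inode depends on
the Docker version, so compare against the pool name you actually see.

```bash
# MAJ:MIN of the block device that /var/lib/docker lives on.
$ lsblk -no MAJ:MIN "$(sudo df --output=source /var/lib/docker | tail -n 1)"

# Candidate inode numbers; one of these should match the INO in the pool name.
$ sudo ls -di /var/lib/docker /var/lib/docker/devicemapper
```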
   390  
   391  Because Device Mapper operates at the block level it is more difficult to see
   392  diffs between image layers and containers. Docker 1.10 and later no longer
   393  matches image layer IDs with directory names in `/var/lib/docker`. However,
   394  there are two key directories. The `/var/lib/docker/devicemapper/mnt` directory
   395   contains the mount points for image and container layers. The
`/var/lib/docker/devicemapper/metadata` directory contains one file for every
   397  image layer and container snapshot. The files contain metadata about each
   398  snapshot in JSON format.
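
For example, on a host running the `devicemapper` driver (a sketch; substitute
a real ID from the directory listing for the `<id>` placeholder):

```bash
# Mount points for image and container layers.
$ sudo ls /var/lib/docker/devicemapper/mnt

# One JSON metadata file per image layer and container snapshot.
$ sudo ls /var/lib/docker/devicemapper/metadata

# Pretty-print the metadata for a single snapshot.
$ sudo cat /var/lib/docker/devicemapper/metadata/<id> | python -m json.tool
```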
   399  
   400  ## Device Mapper and Docker performance
   401  
   402  It is important to understand the impact that allocate-on-demand and
   403  copy-on-write operations can have on overall container performance.
   404  
   405  ### Allocate-on-demand performance impact
   406  
   407  The `devicemapper` storage driver allocates new blocks to a container via an
   408  allocate-on-demand operation. This means that each time an app writes to
somewhere new inside a container, one or more empty blocks have to be located
in the pool and mapped into the container.
   411  
   412  All blocks are 64KB. A write that uses less than 64KB still results in a single
   413   64KB block being allocated. Writing more than 64KB of data uses multiple 64KB
   414  blocks. This can impact container performance, especially in containers that
perform lots of small writes. However, once a block is allocated to a container,
subsequent reads and writes can operate directly on that block.
   417  
   418  ### Copy-on-write performance impact
   419  
   420  Each time a container updates existing data for the first time, the
   421  `devicemapper` storage driver has to perform a copy-on-write operation. This
   422  copies the data from the image snapshot to the container's snapshot. This
   423  process can have a noticeable impact on container performance.
   424  
All copy-on-write operations have a 64KB granularity. As a result, updating
   426  32KB of a 1GB file causes the driver to copy a single 64KB block into the
   427  container's snapshot. This has obvious performance advantages over file-level
   428  copy-on-write operations which would require copying the entire 1GB file into
   429  the container layer.
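
A rough way to see this granularity is to modify a small region of a file that
already exists in the image and then check how much pool space the container
consumed. The sketch below is illustrative; the file chosen is arbitrary (any
large file already present in the image works), and the on-disk growth also
includes filesystem metadata.

```bash
# Overwrite the first 32KB of an existing file in the image, in place
# (conv=notrunc keeps the rest of the file intact).
$ sudo docker run --name cow-test ubuntu \
    dd if=/dev/zero of=/bin/bash bs=1024 count=32 conv=notrunc

# Pool usage grows by roughly one 64KB block, not by the size of the file.
$ sudo docker info | grep "Data Space Used"

# Clean up.
$ sudo docker rm cow-test
```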
   430  
   431  In practice, however, containers that perform lots of small block writes
   432  (<64KB) can perform worse with `devicemapper` than with AUFS.
   433  
   434  ### Other device mapper performance considerations
   435  
   436  There are several other things that impact the performance of the
   437  `devicemapper` storage driver.
   438  
   439  - **The mode.** The default mode for Docker running the `devicemapper` storage
   440  driver is `loop-lvm`. This mode uses sparse files and suffers from poor
   441  performance. It is **not recommended for production**. The recommended mode for
   442   production environments is `direct-lvm` where the storage driver writes
   443  directly to raw block devices.
   444  
   445  - **High speed storage.** For best performance you should place the `Data file`
   446   and `Metadata file` on high speed storage such as SSD. This can be direct
   447  attached storage or from a SAN or NAS array.
   448  
   449  - **Memory usage.** `devicemapper` is not the most memory efficient Docker
   450  storage driver. Launching *n* copies of the same container loads *n* copies of
   451  its files into memory. This can have a memory impact on your Docker host. As a
   452  result, the `devicemapper` storage driver may not be the best choice for PaaS
   453  and other high density use cases.
   454  
One final point: data volumes provide the best and most predictable
performance. This is because they bypass the storage driver and do not incur
any of the potential overheads introduced by thin provisioning and
copy-on-write. For this reason, you should place heavy write workloads on
data volumes.
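
For example, a database's data directory can live on a volume so that its
writes never pass through the thin pool (a minimal sketch; the image, volume
name, and path are illustrative):

```bash
# Writes under /var/lib/postgresql/data go to the "pgdata" volume on the
# host filesystem and bypass the devicemapper thin pool entirely.
$ sudo docker run -d --name db -v pgdata:/var/lib/postgresql/data postgres
```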
   460  
   461  ## Related Information
   462  
   463  * [Understand images, containers, and storage drivers](imagesandcontainers.md)
   464  * [Select a storage driver](selectadriver.md)
   465  * [AUFS storage driver in practice](aufs-driver.md)
   466  * [Btrfs storage driver in practice](btrfs-driver.md)
   467  * [daemon reference](../../reference/commandline/daemon#storage-driver-options)