
<!--[metadata]>
+++
title = "Btrfs storage in practice"
description = "Learn how to optimize your use of Btrfs driver."
keywords = ["container, storage, driver, Btrfs "]
[menu.main]
parent = "engine_driver"
+++
<![end-metadata]-->

# Docker and Btrfs in practice

Btrfs is a next-generation copy-on-write filesystem that supports many advanced
storage technologies, making it a good fit for Docker. Btrfs is included in the
mainline Linux kernel and its on-disk format is now considered stable. However,
many of its features are still under heavy development and users should
consider it a fast-moving target.

Docker's `btrfs` storage driver leverages many Btrfs features for image and
container management. Among these features are thin provisioning,
copy-on-write, and snapshotting.

This article refers to Docker's Btrfs storage driver as `btrfs` and the overall
Btrfs filesystem as Btrfs.

>**Note**: The [Commercially Supported Docker Engine (CS-Engine)](https://www.docker.com/compatibility-maintenance) does not currently support the `btrfs` storage driver.

## The future of Btrfs

Btrfs has long been hailed as the future of Linux filesystems. With full
support in the mainline Linux kernel, a stable on-disk format, and active
development with a focus on stability, this is now becoming more of a reality.

As far as Docker on the Linux platform goes, many people see the `btrfs`
storage driver as a potential long-term replacement for the `devicemapper`
storage driver. However, at the time of writing, the `devicemapper` storage
driver should be considered safer, more stable, and more *production ready*.
You should only consider the `btrfs` driver for production deployments if you
understand it well and have existing experience with Btrfs.

## Image layering and sharing with Btrfs

Docker leverages Btrfs *subvolumes* and *snapshots* for managing the on-disk
components of image and container layers. Btrfs subvolumes look and feel like
a normal Unix filesystem. As such, they can have their own internal directory
structure that hooks into the wider Unix filesystem.

Subvolumes are natively copy-on-write and have space allocated to them
on-demand from an underlying storage pool. They can also be nested and snapped.
The diagram below shows 4 subvolumes. 'Subvolume 2' and 'Subvolume 3' are
nested, whereas 'Subvolume 4' shows its own internal directory tree.

![](images/btfs_subvolume.jpg)

A snapshot is a point-in-time read-write copy of an entire subvolume. Snapshots
exist directly below the subvolume they were created from. You can create
snapshots of snapshots, as shown in the diagram below.

![](images/btfs_snapshots.jpg)
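
Outside of Docker, you can experiment with these constructs directly using the
`btrfs` command-line tools. The sketch below is only an illustration and
assumes a Btrfs filesystem is already mounted at a hypothetical `/mnt/btrfs`
mount point: it creates a subvolume, snapshots it, snapshots that snapshot, and
then lists the results.

    # create a subvolume, snapshot it, then snapshot the snapshot
    $ sudo btrfs subvolume create /mnt/btrfs/subvol1
    $ sudo btrfs subvolume snapshot /mnt/btrfs/subvol1 /mnt/btrfs/snap1
    $ sudo btrfs subvolume snapshot /mnt/btrfs/snap1 /mnt/btrfs/snap2

    # list all subvolumes and snapshots in the filesystem
    $ sudo btrfs subvolume list /mnt/btrfs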

Btrfs allocates space to subvolumes and snapshots on demand from an underlying
pool of storage. The unit of allocation is referred to as a *chunk*, and
*chunks* are normally ~1GB in size.

Snapshots are first-class citizens in a Btrfs filesystem. This means that they
look, feel, and operate just like regular subvolumes. The technology required
to create them is built directly into the Btrfs filesystem thanks to its
native copy-on-write design. This means that Btrfs snapshots are space
efficient with little or no performance overhead. The diagram below shows a
subvolume and its snapshot sharing the same data.

![](images/btfs_pool.jpg)

Docker's `btrfs` storage driver stores every image layer and container in its
own Btrfs subvolume or snapshot. The base layer of an image is stored as a
subvolume whereas child image layers and containers are stored as snapshots.
This is shown in the diagram below.

![](images/btfs_container_layer.jpg)

The high-level process for creating images and containers on Docker hosts
running the `btrfs` driver is as follows:

1. The image's base layer is stored in a Btrfs *subvolume* under
`/var/lib/docker/btrfs/subvolumes`.

2. Subsequent image layers are stored as a Btrfs *snapshot* of the parent
layer's subvolume or snapshot.

    The diagram below shows a three-layer image. The base layer is a subvolume.
    Layer 1 is a snapshot of the base layer's subvolume. Layer 2 is a snapshot
    of Layer 1's snapshot.

    ![](images/btfs_constructs.jpg)

As of Docker 1.10, image layer IDs no longer correspond to directory names
under `/var/lib/docker/`.
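
You can see this layering on disk by pulling a multi-layer image and counting
the entries under the driver's subvolumes directory. The commands below are
only an illustration; the image is arbitrary and the number of directories you
see depends on how many layers the image has and on what else the host already
stores.

    $ sudo docker pull ubuntu:14.04
    <output truncated>

    # one directory per image layer (plus any containers) should appear here
    $ sudo ls /var/lib/docker/btrfs/subvolumes | wc -l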

## Image and container on-disk constructs

Image layers and containers are visible in the Docker host's filesystem at
`/var/lib/docker/btrfs/subvolumes/`. However, as previously stated, directory
names no longer correspond to image layer IDs. That said, directories for
containers are present even for containers with a stopped status. This is
because the `btrfs` storage driver mounts a default, top-level subvolume at
`/var/lib/docker/subvolumes`. All other subvolumes and snapshots exist below
that as Btrfs filesystem objects and not as individual mounts.
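
One way to see this, assuming `/var/lib/docker` is backed by Btrfs as described
later in this article, is to compare the output of `mount` with a Btrfs
subvolume listing. `mount` shows only the single Btrfs mount, while
`btrfs subvolume list` enumerates the per-layer and per-container subvolumes
and snapshots that live beneath it.

    # only the top-level Btrfs mount appears as a mount point
    $ mount | grep btrfs

    # the individual layer and container subvolumes are Btrfs objects below it
    $ sudo btrfs subvolume list /var/lib/docker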

Because Btrfs works at the filesystem level and not the block level, each image
and container layer can be browsed in the filesystem using normal Unix
commands. The example below shows a truncated output of an `ls -l` command
against an image layer:

    $ ls -l /var/lib/docker/btrfs/subvolumes/0a17decee4139b0de68478f149cc16346f5e711c5ae3bb969895f22dd6723751/
    total 0
    drwxr-xr-x 1 root root 1372 Oct  9 08:39 bin
    drwxr-xr-x 1 root root    0 Apr 10  2014 boot
    drwxr-xr-x 1 root root  882 Oct  9 08:38 dev
    drwxr-xr-x 1 root root 2040 Oct 12 17:27 etc
    drwxr-xr-x 1 root root    0 Apr 10  2014 home
    ...output truncated...

## Container reads and writes with Btrfs

A container is a space-efficient snapshot of an image. Metadata in the snapshot
points to the actual data blocks in the storage pool. This is the same as with
a subvolume. Therefore, reads performed against a snapshot are essentially the
same as reads performed against a subvolume. As a result, no performance
overhead is incurred from the Btrfs driver.

Writing a new file to a container invokes an allocate-on-demand operation to
allocate a new data block to the container's snapshot. The file is then written
to this new space. The allocate-on-demand operation is native to all writes
with Btrfs and is the same as writing new data to a subvolume. As a result,
writing new files to a container's snapshot operates at native Btrfs speeds.
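
One way to watch allocate-on-demand in action, assuming the host is configured
as described later in this article, is to compare Btrfs space usage before and
after writing data inside a container. The container name, image, and file size
below are arbitrary and used purely for illustration; `sync` is included
because Btrfs may not commit the new data to disk immediately.

    $ sudo btrfs filesystem df /var/lib/docker
    $ sudo docker run -d --name=bigwrite ubuntu:14.04 sleep 600
    $ sudo docker exec bigwrite dd if=/dev/zero of=/bigfile bs=1M count=256
    $ sync
    $ sudo btrfs filesystem df /var/lib/docker
    $ sudo docker rm -f bigwrite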

Updating an existing file in a container causes a copy-on-write operation
(technically *redirect-on-write*). The driver leaves the original data and
allocates new space to the snapshot. The updated data is written to this new
space. Then, the driver updates the filesystem metadata in the snapshot to
point to this new data. The original data is preserved in-place for subvolumes
and snapshots further up the tree. This behavior is native to copy-on-write
filesystems like Btrfs and incurs very little overhead.

With Btrfs, writing and updating lots of small files can result in slow
performance. More on this later.

## Configuring Docker with Btrfs

The `btrfs` storage driver only operates on a Docker host where
`/var/lib/docker` is mounted as a Btrfs filesystem. The following procedure
shows how to configure Btrfs on Ubuntu 14.04 LTS.

### Prerequisites

If you have already used the Docker daemon on your Docker host and have images
you want to keep, `push` them to Docker Hub or your private Docker Trusted
Registry before attempting this procedure.

Stop the Docker daemon. Then, ensure that you have a spare block device at
`/dev/xvdb`. The device identifier may be different in your environment and you
should substitute your own values throughout the procedure.

The procedure also assumes your kernel has the appropriate Btrfs modules
loaded. To verify this, use the following command:

    $ cat /proc/filesystems | grep btrfs
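
If that command returns no output, the Btrfs module is probably not loaded. On
most distributions you can load it manually; the command below is a general
illustration rather than an Ubuntu-specific requirement.

    $ sudo modprobe btrfs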

### Configure Btrfs on Ubuntu 14.04 LTS

Assuming your system meets the prerequisites, do the following:

1. Install the "btrfs-tools" package.

        $ sudo apt-get install btrfs-tools
        Reading package lists... Done
        Building dependency tree
        <output truncated>

2. Create the Btrfs storage pool.

    Btrfs storage pools are created with the `mkfs.btrfs` command. Passing
    multiple devices to the `mkfs.btrfs` command creates a pool across all of
    those devices. Here you create a pool with a single device at `/dev/xvdb`.

        $ sudo mkfs.btrfs -f /dev/xvdb
        WARNING! - Btrfs v3.12 IS EXPERIMENTAL
        WARNING! - see http://btrfs.wiki.kernel.org before using

        Turning ON incompat feature 'extref': increased hardlink limit per file to 65536
        fs created label (null) on /dev/xvdb
            nodesize 16384 leafsize 16384 sectorsize 4096 size 4.00GiB
        Btrfs v3.12

    Be sure to substitute `/dev/xvdb` with the appropriate device(s) on your
    system.

    > **Warning**: Take note of the warning about Btrfs being experimental. As
    noted earlier, Btrfs is not currently recommended for production
    deployments unless you already have extensive experience.

3. If it does not already exist, create a directory for the Docker host's local
storage area at `/var/lib/docker`.

        $ sudo mkdir /var/lib/docker

4. Configure the system to automatically mount the Btrfs filesystem each time the system boots.

    a. Obtain the Btrfs filesystem's UUID.

        $ sudo blkid /dev/xvdb
        /dev/xvdb: UUID="a0ed851e-158b-4120-8416-c9b072c8cf47" UUID_SUB="c3927a64-4454-4eef-95c2-a7d44ac0cf27" TYPE="btrfs"

    b. Create an `/etc/fstab` entry to automatically mount `/var/lib/docker`
    each time the system boots. Either of the following lines will work, just
    remember to substitute the UUID value with the value obtained from the
    previous command.

        /dev/xvdb /var/lib/docker btrfs defaults 0 0
        UUID="a0ed851e-158b-4120-8416-c9b072c8cf47" /var/lib/docker btrfs defaults 0 0

5. Mount the new filesystem and verify the operation.

        $ sudo mount -a
        $ mount
        /dev/xvda1 on / type ext4 (rw,discard)
        <output truncated>
        /dev/xvdb on /var/lib/docker type btrfs (rw)

    The last line in the output above shows `/dev/xvdb` mounted at
    `/var/lib/docker` as Btrfs. You can also check the new filesystem with the
    Btrfs-native command shown after this procedure.
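
The Btrfs-native check referenced in step 5 is a quick sanity test that the new
filesystem is visible to the Btrfs tooling. It is only an illustration; the
device name will be whatever device you used above.

    $ sudo btrfs filesystem show /dev/xvdb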

Now that you have a Btrfs filesystem mounted at `/var/lib/docker`, the daemon
should automatically load with the `btrfs` storage driver.

1. Start the Docker daemon.

        $ sudo service docker start
        docker start/running, process 2315

    The procedure for starting the Docker daemon may differ depending on the
    Linux distribution you are using.

    You can force the Docker daemon to start with the `btrfs` storage driver
    by either passing the `--storage-driver=btrfs` flag to the `docker daemon`
    command at startup, or adding it to the `DOCKER_OPTS` line in the Docker
    config file, as shown in the example after this procedure.

2. Verify the storage driver with the `docker info` command.

        $ sudo docker info
        Containers: 0
        Images: 0
        Storage Driver: btrfs
        [...]
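
The two ways of forcing the `btrfs` driver mentioned in step 1 might look like
the following. The config file location (`/etc/default/docker`) is typical for
Ubuntu's Docker packages but may differ on your distribution, so treat this as
an illustration rather than an exact recipe.

    # option 1: run the daemon in the foreground with an explicit storage driver
    $ sudo docker daemon --storage-driver=btrfs

    # option 2: add the flag to the DOCKER_OPTS line in /etc/default/docker
    DOCKER_OPTS="--storage-driver=btrfs"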

Your Docker host is now configured to use the `btrfs` storage driver.

## Btrfs and Docker performance

There are several factors that influence Docker's performance under the `btrfs`
storage driver.

- **Page caching**. Btrfs does not support page cache sharing. This means that
*n* containers accessing the same file require *n* copies to be cached. As a
result, the `btrfs` driver may not be the best choice for PaaS and other high
density container use cases.

- **Small writes**. Containers performing lots of small writes (including
Docker hosts that start and stop many containers) can lead to poor use of Btrfs
chunks. This can ultimately lead to out-of-space conditions on your Docker
host and stop it working. This is currently a major drawback to using current
versions of Btrfs.

    If you use the `btrfs` storage driver, closely monitor the free space on
    your Btrfs filesystem using the `btrfs filesystem show` command. Do not
    trust the output of normal Unix commands such as `df`; always use the
    Btrfs native commands. An example is shown after this list.

- **Sequential writes**. Btrfs writes data to disk via a journaling technique.
This can impact sequential writes, where performance can be reduced by as much
as half.

- **Fragmentation**. Fragmentation is a natural byproduct of copy-on-write
filesystems like Btrfs. Many small random writes can compound this issue. It
can manifest as CPU spikes on Docker hosts using SSD media and head thrashing
on Docker hosts using spinning media. Both of these result in poor performance.

    Recent versions of Btrfs allow you to specify `autodefrag` as a mount
    option. This mode attempts to detect random writes and defragment them. You
    should perform your own tests before enabling this option on your Docker
    hosts. Some tests have shown this option has a negative performance impact
    on Docker hosts performing lots of small writes (including systems that
    start and stop many containers).

- **Solid State Devices (SSD)**. Btrfs has native optimizations for SSD media.
To enable these, mount with the `-o ssd` mount option. These optimizations
include enhanced SSD write performance by avoiding things like *seek
optimizations* that have no use on SSD media.

    Btrfs also supports the TRIM/Discard primitives. However, mounting with the
    `-o discard` mount option can cause performance issues. Therefore, it is
    recommended you perform your own tests before using this option.

- **Use Data Volumes**. Data volumes provide the best and most predictable
performance. This is because they bypass the storage driver and do not incur
any of the potential overheads introduced by thin provisioning and
copy-on-write. For this reason, you should place heavy write workloads on data
volumes.
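
The Btrfs-native space check referenced in the **Small writes** bullet above
might look like the following on a host whose `/var/lib/docker` is backed by
Btrfs. The `filesystem df` form reports per-chunk allocation (data, metadata,
and system), which a plain `df` cannot show; the device name is whatever you
used when creating the pool.

    $ sudo btrfs filesystem show /dev/xvdb
    $ sudo btrfs filesystem df /var/lib/docker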

## Related Information

* [Understand images, containers, and storage drivers](imagesandcontainers.md)
* [Select a storage driver](selectadriver.md)
* [AUFS storage driver in practice](aufs-driver.md)
* [Device Mapper storage driver in practice](device-mapper-driver.md)