## Using block devices in rkt

Access to block devices from containers is usually restricted by the [device cgroup controller](https://www.kernel.org/doc/Documentation/cgroup-v1/devices.txt).
This prevents users from doing dangerous things such as creating a device node for a physical disk with `mknod` and writing data to it.

```
# ls -l /dev/nvme0n1p1
brw-rw---- 1 root disk 259, 1 Nov  3 15:13 /dev/nvme0n1p1
# rkt run --interactive kinvolk.io/aci/busybox
/ # mknod /dev/nvme0n1p1 b 259 1
mknod: /dev/nvme0n1p1: Operation not permitted
```

When access to a device from inside the container is actually desired, users can set up rkt volumes and mounts.
In that case, rkt automatically configures the device cgroup controller for the container, lifting the aforementioned restriction for that device.

```
# rkt run --volume disk,kind=host,source=/dev/nvme0n1p1,readOnly=true \
          --interactive \
          kinvolk.io/aci/busybox \
          --mount volume=disk,target=/dev/nvme0n1p1
/ # ls -l /dev/nvme0n1p1
brw-rw----    1 root     disk      259,   1 Nov  3 14:13 /dev/nvme0n1p1
/ # head /dev/nvme0n1p1 -c11
�X�mkfs.fat/ # 
/ # echo 1 > /dev/nvme0n1p1
/bin/sh: can't create /dev/nvme0n1p1: Operation not permitted
```

Note that the volume is read-only: rkt sets a read-only policy in the device cgroup, so we can't write to the device.

For completeness, users can also pass `--insecure-options=paths`, which disables any block device protection.
Users can then create device nodes with `mknod`:

```
# rkt run --insecure-options=paths \
          --interactive \
          kinvolk.io/aci/busybox
/ # mknod /dev/nvme0n1p1 b 259 1
/ # ls -l /dev/nvme0n1p1
brw-r--r--    1 root     root      259,   1 Nov  3 15:43 /dev/nvme0n1p1
```

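The device cgroup controller documentation linked above describes each permission as an entry of the form `type major:minor access`, where `type` is `b` or `c` and `access` combines `r` (read), `w` (write) and `m` (mknod).
As a rough illustration, the sketch below derives such an entry from a device node on the host. The helper name `device_rule` is hypothetical and not part of rkt, it assumes a `stat` that supports `-c` with `%t` and `%T`, and how rkt itself programs the controller is not shown in this document. For the partition used in the sessions above (major 259, minor 1), a read-only entry would read `b 259:1 r`.

```shell
# device_rule is a hypothetical helper, not an rkt command.
# It prints a cgroup-v1 style device entry: "<type> <major>:<minor> <access>".
device_rule() {
    path=$1
    access=$2
    if [ -b "$path" ]; then
        type=b          # block device
    elif [ -c "$path" ]; then
        type=c          # character device
    else
        echo "device_rule: $path is not a device node" >&2
        return 1
    fi
    # stat prints the major (%t) and minor (%T) numbers in hexadecimal,
    # so convert them to decimal with shell arithmetic.
    major=$((0x$(stat -L -c '%t' "$path")))
    minor=$((0x$(stat -L -c '%T' "$path")))
    printf '%s %d:%d %s\n' "$type" "$major" "$minor" "$access"
}

# /dev/null is character device 1:3 on Linux, so this prints "c 1:3 r".
device_rule /dev/null r
```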
## Examples

Here are some real-world examples that use block devices.

### FUSE + SSHFS

SSHFS allows mounting remote directories over SSH.
In this example we mount a remote directory on `/mnt` inside the container.
For this to work, we need to be able to mount and unmount filesystems inside the container, so we pass the appropriate seccomp and capability options:

```
# rkt run --insecure-options=image \
          --dns=8.8.8.8 \
          --interactive \
          --volume fuse,kind=host,source=/dev/fuse \
          docker://ubuntu \
          --mount volume=fuse,target=/dev/fuse \
          --seccomp mode=retain,@rkt/default-whitelist,mount,umount2 \
          --caps-retain=CAP_SETUID,CAP_SETGID,CAP_DAC_OVERRIDE,CAP_CHOWN,CAP_FOWNER,CAP_SYS_ADMIN
root@rkt-f2098164-b207-41d0-b62b-745659725aee:/# apt-get update && apt-get install sshfs
[...]
root@rkt-f2098164-b207-41d0-b62b-745659725aee:/# sshfs user@host.com: /mnt
The authenticity of host 'host.com (12.34.56.78)' can't be established.
ECDSA key fingerprint is SHA256:L1/2LPI1J6/YlDzbvH+/SF5gamNusPDSqnCSmaNlolc.
Are you sure you want to continue connecting (yes/no)? yes
user@host.com's password: 
root@rkt-f2098164-b207-41d0-b62b-745659725aee:/# cat /mnt/remote-file.txt
HELLO FROM REMOTE
root@rkt-f2098164-b207-41d0-b62b-745659725aee:/# fusermount -u /mnt/
```

### NVIDIA CUDA

CUDA allows using GPUs for general-purpose processing, and it needs access to the GPU devices.
In this example we also mount the CUDA SDK binaries and the host libraries, and we use a shell substitution that replaces dots with hyphens to produce appc-compliant volume names:

```
# rkt run --insecure-options=image \
          $(for f in /dev/nvidia* /opt/bin/nvidia* /usr/lib/; \
                do echo "--volume $(basename $f | sed 's/\./-/g'),source=$f,kind=host \
                         --mount volume=$(basename $f | sed 's/\./-/g'),target=$f"; \
                done) \
          docker://nvidia/cuda:latest \
          --exec=/opt/bin/nvidia-smi
Wed Sep  7 21:25:22 2016
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.35                 Driver Version: 367.35                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 780     Off  | 0000:01:00.0     N/A |                  N/A |
| 33%   61C    P2    N/A /  N/A |    474MiB /  3018MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+
```

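To see what the command substitution in the CUDA example expands to, the sketch below runs the same `basename` and `sed` transformation outside of `rkt run`. The function name `volume_flags` is hypothetical, and the sample paths are illustrative only (`/opt/bin/nvidia.example` is made up to show a dot being replaced); the paths do not need to exist for the flags to be printed.

```shell
# volume_flags is a hypothetical helper that mirrors the loop in the CUDA example.
# For each path it prints one --volume/--mount pair for rkt.
volume_flags() {
    for f in "$@"; do
        # Derive the volume name from the last path component and
        # replace every "." with "-", as the original loop does.
        name=$(basename "$f" | sed 's/\./-/g')
        echo "--volume $name,source=$f,kind=host --mount volume=$name,target=$f"
    done
}

# Prints one line of flags per path; the second path yields the name "nvidia-example".
volume_flags /dev/nvidia0 /opt/bin/nvidia.example /usr/lib/
```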
### Format /dev/sdX

You can expose a disk block device (for example, an external USB stick) to a container as a volume and format it inside the container.
As before, if you want to mount it inside the container, you need to pass the appropriate seccomp and capability options:

```
# rkt run --insecure-options=image \
          --volume disk,kind=host,source=/dev/sda,readOnly=false \
          --interactive \
          docker://ubuntu \
          --mount volume=disk,target=/dev/sda
root@rkt-72bd9a93-2e89-4515-8b46-44e0e11c4c79:/# mkfs.ext4 /dev/sda
mke2fs 1.42.13 (17-May-2015)
/dev/sda contains a ext4 file system
	last mounted on Fri Nov  3 17:15:56 2017
Proceed anyway? (y,n) y
Creating filesystem with 491520 4k blocks and 122880 inodes
Filesystem UUID: 9ede01b1-e35b-46a0-b224-24e879973582
Superblock backups stored on blocks:
	32768, 98304, 163840, 229376, 294912

Allocating group tables: done
Writing inode tables: done
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done

root@rkt-72bd9a93-2e89-4515-8b46-44e0e11c4c79:/# mount /dev/sda /mnt/
root@rkt-72bd9a93-2e89-4515-8b46-44e0e11c4c79:/# echo HELLO > /mnt/hi.txt
root@rkt-72bd9a93-2e89-4515-8b46-44e0e11c4c79:/# cat /mnt/hi.txt
HELLO
root@rkt-72bd9a93-2e89-4515-8b46-44e0e11c4c79:/# umount /mnt/
```