# Experimental: User namespace support

Linux kernel [user namespace support](http://man7.org/linux/man-pages/man7/user_namespaces.7.html) provides additional security by enabling
a process (and therefore a container) to have a unique range of user and
group IDs which are outside the traditional user and group range utilized by
the host system. Potentially the most important security improvement is that,
by default, container processes running as the `root` user will have the expected
administrative privileges (with some restrictions) inside the container but will
effectively be mapped to an unprivileged `uid` on the host.

In this experimental phase, the Docker daemon creates a single daemon-wide mapping
for all containers running on the same engine instance. The mappings
use the existing subordinate user and group ID feature available on all modern
Linux distributions.
The [`/etc/subuid`](http://man7.org/linux/man-pages/man5/subuid.5.html) and
[`/etc/subgid`](http://man7.org/linux/man-pages/man5/subgid.5.html) files will be
read for the user, and optional group, specified to the `--userns-remap`
parameter. If you do not wish to specify your own user and/or group, you can
provide `default` as the value to this flag, and a user will be created on your behalf
and provided subordinate uid and gid ranges. This default user will be named
`dockremap`, and entries will be created for it in `/etc/passwd` and
`/etc/group` using your distro's standard user and group creation tools.

> **Note**: The single mapping per-daemon restriction exists for this experimental
> phase because Docker shares image layers from its local cache across all
> containers running on the engine instance. Since file ownership must be
> the same for all containers sharing the same layer content, the decision
> was made to map the file ownership on `docker pull` to the daemon's user and
> group mappings so that there is no delay for running containers once the
> content is downloaded. This gives exactly the same performance characteristics as with
> user namespaces disabled.

## Starting the daemon with user namespaces enabled

To enable this experimental user namespace support for a Docker daemon instance,
start the daemon with the aforementioned `--userns-remap` flag, which accepts
values in the following formats:

 - uid
 - uid:gid
 - username
 - username:groupname

If numeric IDs are provided, they will be translated back to valid user or group names
so that the subordinate uid and gid information can be read, given that
these resources are name-based, not id-based. If the numeric ID information
provided does not exist as entries in `/etc/passwd` or `/etc/group`, daemon
startup will fail with an error message.

*An example: starting with default Docker user management:*

```
$ docker daemon --userns-remap=default
```

In this case, Docker will create, or find the existing, user and group
named `dockremap`. If the user is created, and the Linux distribution has
appropriate support, the `/etc/subuid` and `/etc/subgid` files will be populated
with a contiguous range of 65536 subordinate user and group IDs, starting
at an offset based on prior entries in those files. For example, Ubuntu will
create the following range, based on an existing user already having the first
65536 range:

```
$ cat /etc/subuid
user1:100000:65536
dockremap:165536:65536
```

> **Note:** On a fresh Fedora install, we found that we had to `touch` the
> `/etc/subuid` and `/etc/subgid` files to have ranges assigned when users
> were created. Once these files existed, range assignment on user creation
> worked properly.

If you have a preferred/self-managed user with subordinate ID mappings already
configured, you can provide that username or uid to the `--userns-remap` flag.
If you have a group that doesn't match the username, you may provide the `gid`
or group name as well; otherwise the username will be used as the group name
when querying the system for the subordinate group ID range.

## Detailed information on `subuid`/`subgid` ranges

Given there may be advanced use of the subordinate ID ranges by power users, we will
describe how the Docker daemon uses the range entries within these files under the
current experimental user namespace support.

The simplest case exists where only one contiguous range is defined for the
provided user or group. In this case, Docker will use that entire contiguous
range for the mapping of host uids and gids to the container process. This
means that the first ID in the range will be the remapped root user, and the
IDs above that initial ID will map container IDs 1 through the end of the range.

From the example `/etc/subuid` content shown above, that means the remapped root
user would be uid 165536.

If the system administrator has set up multiple ranges for a single user or
group, the Docker daemon will read all the available ranges and use the
following algorithm to create the mapping ranges:

1. The ranges will be sorted by *start ID* ascending.
2. Maps will be created from each range, with the container ID starting at 0 for the first range, 0+*range1* length for the second, and so on. This means that the lowest range start ID will be the remapped root, and all further ranges will map container IDs from 1 up to one less than the sum of all range lengths.
3. Range segments beyond the fifth will be ignored, as the kernel ignores any ID maps after five (in `/proc/self/{u,g}id_map`).

## User namespace known restrictions

The following standard Docker features are currently incompatible when
running a Docker daemon with experimental user namespaces enabled:

 - sharing namespaces with the host (`--pid=host`, `--net=host`, etc.)
 - sharing namespaces with other containers (`--net=container:`*other*)
 - a `--read-only` container filesystem (a Linux kernel restriction on remount with new flags of a currently mounted filesystem when inside a user namespace)
 - external (volume/graph) drivers which are unaware of, or incapable of using, daemon user mappings
 - using `--privileged` mode containers
 - using the lxc execdriver (only the `native` execdriver is enabled to use user namespaces)
 - volume use without pre-arranging proper file ownership in mounted volumes

Additionally, while the `root` user inside a user namespaced container
process has many of the privileges of the administrative root user, the
following operations will fail:

 - use of `mknod`: permission is denied for device creation by the container root
 - others will be listed here when fully tested
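
To make the range-mapping algorithm from "Detailed information on `subuid`/`subgid` ranges" more concrete, here is a minimal Python sketch of the three steps. It is an illustration only, not the daemon's actual code: the names `parse_subid`, `build_id_maps`, `to_host_id` and `MAX_SEGMENTS` are hypothetical, and it assumes that truncation to five segments happens after sorting.

```python
from typing import List, NamedTuple


class SubIDRange(NamedTuple):
    start: int   # first host ID of the subordinate range
    length: int  # number of IDs in the range


class IDMap(NamedTuple):
    container_id: int  # first ID as seen inside the container
    host_id: int       # first ID as seen on the host
    size: int          # number of consecutive IDs covered


# Segment limit as stated in this document (assumption: applied after sorting).
MAX_SEGMENTS = 5


def parse_subid(content: str, name: str) -> List[SubIDRange]:
    """Collect every `name:start:length` entry for `name` from subuid/subgid text."""
    ranges = []
    for line in content.splitlines():
        parts = line.strip().split(":")
        if len(parts) == 3 and parts[0] == name:
            ranges.append(SubIDRange(int(parts[1]), int(parts[2])))
    return ranges


def build_id_maps(ranges: List[SubIDRange]) -> List[IDMap]:
    """Sort by start ID, then assign container IDs cumulatively from 0."""
    maps = []
    container_id = 0
    for r in sorted(ranges, key=lambda r: r.start)[:MAX_SEGMENTS]:
        maps.append(IDMap(container_id, r.start, r.length))
        container_id += r.length
    return maps


def to_host_id(maps: List[IDMap], container_id: int) -> int:
    """Translate an ID inside the container to the corresponding host ID."""
    for m in maps:
        if m.container_id <= container_id < m.container_id + m.size:
            return m.host_id + (container_id - m.container_id)
    raise ValueError(f"container ID {container_id} is not mapped")


# The /etc/subuid example from this document.
SUBUID = "user1:100000:65536\ndockremap:165536:65536\n"
single = build_id_maps(parse_subid(SUBUID, "dockremap"))

# A hypothetical user with two ranges, listed out of order.
multi = build_id_maps([SubIDRange(300000, 1000), SubIDRange(100000, 10)])
```

With the example file, `to_host_id(single, 0)` gives 165536, the remapped root. In the two-range case, container IDs 0 through 9 land in the range starting at 100000, and container ID 10 is the first ID of the range starting at 300000.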