# Terminals and Standard IO #

*Note that the default configuration of `runc` (foreground, new terminal) is
generally the best option for most users. This document exists to help explain
what the purpose of the different modes is, and to try to steer users away from
common mistakes and misunderstandings.*

In general, most processes on Unix (and Unix-like) operating systems have 3
standard file descriptors provided at the start, collectively referred to as
"standard IO" (`stdio`):

* `0`: standard-in (`stdin`), the input stream into the process
* `1`: standard-out (`stdout`), the output stream from the process
* `2`: standard-error (`stderr`), the error stream from the process

When creating and running a container via `runc`, it is important to take care
to structure the `stdio` the new container's process receives. In some ways
containers are just regular processes, while in other ways they're an isolated
sub-partition of your machine (in a similar sense to a VM). This means that the
structure of IO is not as simple as with ordinary programs (which generally
just use the file descriptors you give them).

## Other File Descriptors ##

Before we continue, it is important to note that processes can have more file
descriptors than just `stdio`. By default in `runc` no other file descriptors
will be passed to the spawned container process. If you wish to explicitly pass
file descriptors to the container you have to use the `--preserve-fds` option.
These ancillary file descriptors don't have any of the strange semantics
discussed further in this document (those only apply to `stdio`) -- they are
passed untouched by `runc`.

It should be noted that `--preserve-fds` does not take individual file
descriptors to preserve.
Instead, it takes how many file descriptors (not
including `stdio` or `LISTEN_FDS`) should be passed to the container. In the
following example:

```
% runc run --preserve-fds 5 <container>
```

`runc` will pass the first `5` file descriptors (`3`, `4`, `5`, `6`, and `7` --
assuming that `LISTEN_FDS` has not been configured) to the container.

In addition to `--preserve-fds`, `LISTEN_FDS` file descriptors are passed
automatically to allow for `systemd`-style socket activation. To extend the
above example:

```
% LISTEN_PID=$pid_of_runc LISTEN_FDS=3 runc run --preserve-fds 5 <container>
```

`runc` will now pass the first `8` file descriptors (and it will also pass
`LISTEN_FDS=3` and `LISTEN_PID=1` to the container). The first `3` (`3`, `4`,
and `5`) were passed due to `LISTEN_FDS` and the other `5` (`6`, `7`, `8`, `9`,
and `10`) were passed due to `--preserve-fds`. You should keep this in mind if
you use `runc` directly in something like a `systemd` unit file. To disable
this `LISTEN_FDS`-style passing just unset `LISTEN_FDS`.

**Be very careful when passing file descriptors to a container process.** Due
to some Linux kernel (mis)features, a container with access to certain types of
file descriptors (such as `O_PATH` descriptors) outside of the container's root
file system can use these to break out of the container's pivoted mount
namespace.
[This has resulted in CVEs in the past.][CVE-2016-9962]

[CVE-2016-9962]: https://nvd.nist.gov/vuln/detail/CVE-2016-9962

## <a name="terminal-modes" /> Terminal Modes ##

`runc` supports two distinct methods for passing `stdio` to the container's
primary process:

* [new terminal](#new-terminal) (`terminal: true`)
* [pass-through](#pass-through) (`terminal: false`)

When first using `runc` these two modes will look incredibly similar, but this
can be quite deceptive as these different modes have quite different
characteristics.

By default, `runc spec` will create a configuration that will create a new
terminal (`terminal: true`). However, if the `terminal: ...` line is not
present in `config.json` then pass-through is the default.

*In general we recommend using new terminal, because it means that tools like
`sudo` will work inside your container. But pass-through can be useful if you
know what you're doing, or if you're using `runc` as part of a non-interactive
pipeline.*

### <a name="new-terminal" /> New Terminal ###

In new terminal mode, `runc` will create a brand-new "console" (or more
precisely, a new pseudo-terminal using the container's namespaced
`/dev/pts/ptmx`) for your contained process to use as its `stdio`.

When you start a process in new terminal mode, `runc` will do the following:

1. Create a new pseudo-terminal.
2. Pass the slave end to the container's primary process as its `stdio`.
3. Send the master end to a process to interact with the `stdio` for the
   container's primary process ([details below](#runc-modes)).

It should be noted that since a new pseudo-terminal is being used for
communication with the container, some strange properties of pseudo-terminals
might surprise you. For instance, by default, all new pseudo-terminals
translate the byte `'\n'` to the sequence `'\r\n'` on both `stdout` and
`stderr`.
In addition there are [a whole range of `ioctls(2)` that can only
interact with pseudo-terminal `stdio`][tty_ioctl(4)].

> **NOTE**: In new terminal mode, all three `stdio` file descriptors are the
> same underlying file. The reason for this is to match how a shell's `stdio`
> looks to a process (as well as remove race condition issues with having to
> deal with multiple master pseudo-terminal file descriptors). However this
> means that it is not really possible to uniquely distinguish between `stdout`
> and `stderr` from the caller's perspective.

#### Issues ####

If you see an error like

```
open /dev/tty: no such device or address
```

from runc, it means it can't open a terminal (because there isn't one). This
can happen when stdin (and possibly also stdout and stderr) are redirected,
or in some environments that lack a tty (such as GitHub Actions runners).

The solution to this is to *not* use a terminal for the container, i.e. have
`terminal: false` in `config.json`. If the container really needs a terminal
(some programs require one), you can provide one, using one of the following
methods.

One way is to use `ssh` with the `-tt` flag. The second `t` forces a terminal
allocation even if there's no local one -- and so it is required when stdin is
not a terminal (some `ssh` implementations only look for a terminal on stdin).

Another way is to run `runc` under the `script` utility, like this:

```console
$ script -e -c 'runc run <container>'
```

[tty_ioctl(4)]: https://linux.die.net/man/4/tty_ioctl

### <a name="pass-through" /> Pass-Through ###

If you have already set up some file handles that you wish your contained
process to use as its `stdio`, then you can ask `runc` to pass them through to
the contained process (this is not necessarily the same as `--preserve-fds`'s
passing of file descriptors -- [details below](#runc-modes)). As an example
(assuming that `terminal: false` is set in `config.json`):

```
% echo input | runc run some_container > /tmp/log.out 2> /tmp/log.err
```

Here the container's various `stdio` file descriptors will be substituted with
the following:

* `stdin` will be sourced from the `echo input` pipeline.
* `stdout` will be output into `/tmp/log.out` on the host.
* `stderr` will be output into `/tmp/log.err` on the host.

It should be noted that the actual file handles seen inside the container may
be different [based on the mode `runc` is being used in](#runc-modes) (for
instance, the file referenced by `1` could be `/tmp/log.out` directly or a pipe
which `runc` is using to buffer output, based on the mode). However the net
result will be the same in either case. In principle you could use the [new
terminal mode](#new-terminal) in a pipeline, but the difference will become
more clear when you are introduced to [`runc`'s detached mode](#runc-modes).

## <a name="runc-modes" /> `runc` Modes ##

`runc` itself runs in two modes:

* [foreground](#foreground)
* [detached](#detached)

You can use either [terminal mode](#terminal-modes) with either `runc` mode.
However, there are considerations that may indicate preference for one mode
over another.
It should be noted that while the two types of modes (terminal and
`runc`) are conceptually independent from each other, you should be aware of
the intricacies of which combination you are using.

*In general we recommend using foreground because it's the most
straight-forward to use, with the only downside being that you will have a
long-running `runc` process. Detached mode is difficult to get right and
generally requires having your own `stdio` management.*

### Foreground ###

The default (and most straight-forward) mode of `runc`. In this mode, your
`runc` command remains in the foreground with the container process as a child.
All `stdio` is buffered through the foreground `runc` process (irrespective of
which terminal mode you are using). This is conceptually quite similar to
running a normal process interactively in a shell (and if you are using `runc`
in a shell interactively, this is what you should use).

Because the `stdio` will be buffered in this mode, some very important
peculiarities of this mode should be kept in mind:

* With [new terminal mode](#new-terminal), the container will see a
  pseudo-terminal as its `stdio` (as you might expect). However, the `stdio` of
  the foreground `runc` process will remain the `stdio` that the process was
  started with -- and `runc` will copy all `stdio` between its `stdio` and the
  container's `stdio`. This means that while a new pseudo-terminal has been
  created, the foreground `runc` process manages it over the lifetime of the
  container.

* With [pass-through mode](#pass-through), the foreground `runc`'s `stdio` is
  **not** passed to the container. Instead, the container's `stdio` is a set of
  pipes which are used to copy data between `runc`'s `stdio` and the
  container's `stdio`.
  This means that the container never has direct access to
  host file descriptors (aside from the pipes created by the container runtime,
  but that shouldn't be an issue).

The main drawback of the foreground mode of operation is that it requires a
long-running foreground `runc` process. If you kill the foreground `runc`
process then you will no longer have access to the `stdio` of the container
(and in most cases this will result in the container dying abnormally due to
`SIGPIPE` or some other error). By extension this means that any bug in the
long-running foreground `runc` process (such as a memory leak) or a stray
OOM-kill sweep could result in your container being killed **through no fault
of the user**. In addition, there is no way in foreground mode of passing a
file descriptor directly to the container process as its `stdio` (like
`--preserve-fds` does).

These shortcomings are obviously sub-optimal and are the reason that `runc` has
an additional mode called "detached mode".

### Detached ###

In contrast to foreground mode, in detached mode there is no long-running
foreground `runc` process once the container has started. In fact, there is no
long-running `runc` process at all. However, this means that it is up to the
caller to handle the `stdio` after `runc` has set it up for you. In a shell
this means that the `runc` command will exit and control will return to the
shell, after the container has been set up.

You can run `runc` in detached mode in one of the following ways:

* `runc run -d ...` which operates similarly to `runc run` but is detached.
* `runc create` followed by `runc start` which is the standard container
  lifecycle defined by the OCI runtime specification (`runc create` sets up the
  container completely, waiting for `runc start` to begin execution of user
  code).
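
As a rough illustration of the second form, a wrapper tool might drive the
`create`/`start` lifecycle along the following lines. This is only a sketch:
the function name `create_and_start`, the injectable `run` parameter, and the
example container ID and bundle path are hypothetical, and real wrappers also
have to deal with `stdio` and child reaping, which this sketch leaves out.

```python
import subprocess
from typing import Callable, List, Sequence


def _run_checked(argv: Sequence[str]) -> None:
    """Run one command and raise if it exits with a non-zero status."""
    subprocess.run(list(argv), check=True)


def create_and_start(
    container_id: str,
    bundle_dir: str,
    run: Callable[[Sequence[str]], object] = _run_checked,
) -> List[List[str]]:
    """Run the two-step detached lifecycle and return the commands used.

    `run` is a hypothetical hook so the sequence can be inspected without
    actually invoking runc; by default each command is executed for real.
    """
    commands = [
        # Step 1: set the container up completely, but do not run user code.
        ["runc", "create", "--bundle", bundle_dir, container_id],
        # Step 2: begin execution of the user-specified process.
        ["runc", "start", container_id],
    ]
    for argv in commands:
        run(argv)
    return commands
```

Calling `create_and_start("demo", "/tmp/demo-bundle", run=print)` prints the
two argument lists instead of executing them, which is a convenient way to
check the ordering before pointing the sketch at a real bundle.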

The main use-case of detached mode is for higher-level tools that want to be
wrappers around `runc`. By running `runc` in detached mode, those tools have
far more control over the container's `stdio` without `runc` getting in the
way (most wrappers around `runc` like `cri-o` or `containerd` use detached mode
for this reason).

Unfortunately using detached mode is a bit more complicated and requires more
care than the foreground mode -- mainly because it is now up to the caller to
handle the `stdio` of the container.

Another complication is that the parent process is responsible for acting as
the subreaper for the container. In short, you need to call
`prctl(PR_SET_CHILD_SUBREAPER, 1, ...)` in the parent process and correctly
handle the implications of being a subreaper. Failing to do so may result in
zombie processes being accumulated on your host.

These tasks are usually performed by a dedicated (and minimal) monitor process
per-container. For the sake of comparison, other runtimes such as LXC do not
have an equivalent detached mode and instead integrate this monitor process
into the container runtime itself -- this has several tradeoffs, and `runc` has
opted to support delegating the monitoring responsibility to the parent process
through this detached mode.

#### Detached Pass-Through ####

In detached mode, pass-through actually does what it says on the tin -- the
`stdio` file descriptors of the `runc` process are passed through (untouched)
to the container's `stdio`. The purpose of this option is to allow a user to
set up `stdio` for a container themselves and then force `runc` to just use
their pre-prepared `stdio` (without any pseudo-terminal funny business).
*If you don't see why this would be useful, don't use this option.*

**You must be incredibly careful when using detached pass-through (especially
in a shell).** The reason for this is that by using detached pass-through you
are passing host file descriptors to the container. In the case of a shell,
usually your `stdio` is going to be a pseudo-terminal (on your host). A
malicious container could take advantage of TTY-specific `ioctls` like
`TIOCSTI` to fake input into the **host** shell (remember that in detached
mode, control is returned to your shell and so the terminal you've given the
container is being read by a shell prompt).

There are also several other issues with running non-malicious containers in a
shell with detached pass-through (where you pass your shell's `stdio` to the
container):

* Output from the container will be interleaved with output from your shell (in
  a non-deterministic way), without any real way of distinguishing where a
  particular piece of output came from.

* Any input to `stdin` will be non-deterministically split and given to either
  the container or the shell (because both are blocked on a `read(2)` of the
  same FIFO-style file descriptor).

They are all related to the fact that there is going to be a race when either
your host or the container tries to read from (or write to) `stdio`. This
problem is especially obvious when in a shell, where usually the terminal has
been put into raw mode (where each individual key-press should cause `read(2)`
to return).

> **NOTE**: There is also currently a [known problem][issue-1721] where using
> detached pass-through will result in the container hanging if the `stdout` or
> `stderr` is a pipe (though this should be a temporary issue).
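
To make "passed through (untouched)" more concrete, the sketch below uses an
ordinary child process rather than `runc` (an assumption made so that it runs
anywhere on Linux). It hands a host pipe to the child as its `stdout` and shows
that the child sees the very same underlying file as the host. The names
`CHILD` and `child_stdout_identity` are hypothetical. The same sharing is what
allows terminal `ioctls` issued inside a container to act on a host terminal
that was passed through.

```python
import os
import subprocess
import sys
from typing import Tuple

# Child program: report which file its own stdout (fd 1) refers to.
CHILD = "import os; st = os.fstat(1); print(st.st_dev, st.st_ino)"


def child_stdout_identity() -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Return (host view, child view) of the file used as the child's stdout."""
    read_end, write_end = os.pipe()
    try:
        st = os.fstat(write_end)
        host_view = (st.st_dev, st.st_ino)
        # The write end is handed to the child directly as fd 1. No relay
        # process sits in between, mirroring detached pass-through.
        subprocess.run([sys.executable, "-c", CHILD], stdout=write_end, check=True)
    finally:
        # Close our copy so that the reader below sees end-of-file.
        os.close(write_end)
    with os.fdopen(read_end) as reader:
        dev, ino = reader.read().split()
    return host_view, (int(dev), int(ino))
```

On Linux both tuples are expected to be equal, because the child's `stdout` is
the host's pipe itself and not a copy of its contents.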

[issue-1721]: https://github.com/opencontainers/runc/issues/1721

#### Detached New Terminal ####

When creating a new pseudo-terminal in detached mode, a fairly obvious
problem appears -- how do we use the new terminal that `runc` created? Unlike
in pass-through, `runc` has created a new set of file descriptors that need to
be used by *something* in order for container communication to work.

The way this problem is resolved is through the use of Unix domain sockets.
There is a feature of Unix sockets called `SCM_RIGHTS` which allows a file
descriptor to be sent through a Unix socket to a completely separate process
(which can then use that file descriptor as though it had opened it itself).
When using `runc` in detached new terminal mode, this is how a user gets access
to the pseudo-terminal's master file descriptor.

To this end, there is a new option (which is required if you want to use `runc`
in detached new terminal mode): `--console-socket`. This option takes the path
to a Unix domain socket which `runc` will connect to and send the
pseudo-terminal master file descriptor down. The general process for getting
the pseudo-terminal master is as follows:

1. Create a Unix domain socket at some path, `$socket_path`.
2. Call `runc run` or `runc create` with the argument `--console-socket
   $socket_path`.
3. Using `recvmsg(2)` retrieve the file descriptor sent using `SCM_RIGHTS` by
   `runc`.
4. Now the manager can interact with the `stdio` of the container, using the
   retrieved pseudo-terminal master.

After `runc` exits, the only process with a copy of the pseudo-terminal master
file descriptor is whoever read the file descriptor from the socket.

> **NOTE**: Currently `runc` doesn't support abstract socket addresses (due to
> it not being possible to pass an `argv` with a null-byte as the first
> character).
> In the future this may change, but currently you must use a valid
> path name.

In order to help users make use of detached new terminal mode, we have provided
a [Go implementation in the `go-runc` bindings][containerd/go-runc.Socket], as
well as [a simple client][recvtty].

[containerd/go-runc.Socket]: https://godoc.org/github.com/containerd/go-runc#Socket
[recvtty]: /contrib/cmd/recvtty
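
For readers who want to see the `SCM_RIGHTS` hand-off in isolation, here is a
minimal Python sketch of the mechanism described in this section. It does not
invoke `runc`, and several things are assumptions made for illustration: a
`socketpair` stands in for the `--console-socket` path, the helper names
(`send_pty_master`, `recv_pty_master`, `demo`) are hypothetical, and the
payload bytes are a placeholder rather than what `runc` actually sends. It
relies on `socket.send_fds` and `socket.recv_fds`, which need Python 3.9 or
newer.

```python
import os
import socket


def send_pty_master(sock: socket.socket, master_fd: int) -> None:
    """Sender side (the role runc plays): pass one fd using SCM_RIGHTS."""
    # Some ordinary data has to accompany the ancillary message.
    socket.send_fds(sock, [b"pty-master"], [master_fd])


def recv_pty_master(sock: socket.socket) -> int:
    """Receiver side (the manager): wraps recvmsg(2) and returns the fd."""
    _data, fds, _flags, _addr = socket.recv_fds(sock, 1024, 1)
    if len(fds) != 1:
        raise RuntimeError(f"expected 1 file descriptor, got {len(fds)}")
    return fds[0]


def demo() -> bytes:
    """Send a pty master across a socket, then read the slave's output."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    master_fd, slave_fd = os.openpty()
    try:
        send_pty_master(sender, master_fd)
        received_fd = recv_pty_master(receiver)
        # The sender drops its copy, as happens when runc exits. The
        # receiver now holds the only reference to the pty master.
        os.close(master_fd)
        # Whatever is written to the slave end (the container's stdio)
        # can be read from the received master.
        os.write(slave_fd, b"hello\n")
        output = os.read(received_fd, 1024)
        os.close(received_fd)
        return output
    finally:
        os.close(slave_fd)
        sender.close()
        receiver.close()
```

With the default terminal settings, the bytes returned by `demo()` would be
expected to show the `'\n'` to `'\r\n'` translation mentioned in the new
terminal section. A real manager would instead listen on a socket path and
accept the connection that `runc` makes to it.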