# Changelog
This file documents all notable changes made to this project since runc 1.0.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### Changed

* libcontainer/cgroups users who want to manage cgroup devices need to explicitly
  import libcontainer/cgroups/devices. (#3452, #4248)

## [1.2.0-rc.1] - 2024-04-03

> There's a frood who really knows where his towel is.

`runc` now requires a minimum of Go 1.20 to compile.

> **NOTE**: runc currently will not work properly when compiled with Go 1.22 or
> newer. This is due to some unfortunate glibc behaviour that Go 1.22
> exacerbates in a way that results in containers not being able to start on
> some systems. [See this issue for more information.][runc-4233]

[runc-4233]: https://github.com/opencontainers/runc/issues/4233

### Breaking

* Several aspects of how mount options work have been adjusted in a way that
  could theoretically break users that have very strange mount option strings.
  This was necessary to fix glaring issues in how mount options were being
  treated. The key changes are:

  - Mount options on bind-mounts that clear a mount flag are now always
    applied. Previously, if a user requested a bind-mount with only clearing
    options (such as `rw,exec,dev`) the options would be ignored and the
    original bind-mount options would be set.
    Unfortunately this also means
    that container configurations which specified only clearing mount options
    will now actually get what they asked for, which could break existing
    containers (though it seems unlikely that a user who requested a specific
    mount option would consider it "broken" to get the mount options they
    asked for). This also allows us to
    silently add locked mount flags the user *did not explicitly request to be
    cleared* in rootless mode, allowing for easier use of bind-mounts for
    rootless containers. (#3967)

  - Container configurations using bind-mounts with superblock mount flags
    (i.e. filesystem-specific mount flags, referred to as "data" in
    `mount(2)`, as opposed to VFS generic mount flags like `MS_NODEV`) will
    now return an error. This is because superblock mount flags will also
    affect the host mount (as the superblock is shared when bind-mounting),
    which is obviously not acceptable. Previously, these flags were silently
    ignored so this change simply tells users that runc cannot fulfil their
    request rather than just ignoring it. (#3990)

  If any of these changes cause problems in real-world workloads, please [open
  an issue](https://github.com/opencontainers/runc/issues/new/choose) so we
  can adjust the behaviour to avoid compatibility issues.

### Added

* runc has been updated to OCI runtime-spec 1.2.0, and supports all Linux
  features with a few minor exceptions. See
  [`docs/spec-conformance.md`](https://github.com/opencontainers/runc/blob/v1.2.0-rc.1/docs/spec-conformance.md)
  for more details.
* runc now supports id-mapped mounts for bind-mounts (with no restrictions on
  the mapping used for each mount). Other mount types are not currently
  supported.
  This feature requires `MOUNT_ATTR_IDMAP` kernel support (Linux
  5.12 or newer) as well as kernel support for the underlying filesystem used
  for the bind-mount. See [`mount_setattr(2)`][mount_setattr.2] for a list of
  supported filesystems and other restrictions. (#3717, #3985, #3993)
* Two new mechanisms for reducing the memory usage of our protections against
  [CVE-2019-5736][cve-2019-5736] have been introduced:
  - `runc-dmz` is a minimal binary (~8K) which acts as an additional execve
    stage, allowing us to only need to protect the smaller binary. It should
    be noted that there have been several compatibility issues reported with
    the usage of `runc-dmz` (namely related to capabilities and SELinux). As
    such, this mechanism is **opt-in** and can be enabled by running `runc`
    with the environment variable `RUNC_DMZ=true` (setting this environment
    variable in `config.json` will have no effect). This feature can be
    disabled at build time using the `runc_nodmz` build tag. (#3983, #3987)
  - `contrib/memfd-bind` is a helper daemon which will bind-mount a memfd copy
    of `/usr/bin/runc` on top of `/usr/bin/runc`. This entirely eliminates
    per-container copies of the binary, but requires care to ensure that
    upgrades to runc are handled properly, and requires a long-running daemon
    (unfortunately memfds cannot be bind-mounted directly and thus require a
    daemon to keep them alive). (#3987)
* runc will now use `cgroup.kill` if available to kill all processes in a
  container (such as when doing `runc kill`). (#3135, #3825)
* Add support for setting the umask for `runc exec`. (#3661)
* libct/cg: support `SCHED_IDLE` for runc cgroupfs. (#3377)
* checkpoint/restore: implement `--manage-cgroups-mode=ignore`. (#3546)
* seccomp: refactor flags support; add flags to features, set `SPEC_ALLOW` by
  default. (#3588)
* libct/cg/sd: use systemd v240+ new `MAJOR:*` syntax. (#3843)
* Support CFS bandwidth burst for CPU.
  (#3749, #3145)
* Support time namespaces. (#3876)
* Reduce the `runc` binary size by ~11% by updating
  `github.com/checkpoint-restore/go-criu`. (#3652)
* Add `--pidfd-socket` to `runc run` and `runc exec` to allow for management
  processes to receive a pidfd for the new process, allowing them to avoid pid
  reuse attacks. (#4045)

[mount_setattr.2]: https://man7.org/linux/man-pages/man2/mount_setattr.2.html
[cve-2019-5736]: https://github.com/advisories/GHSA-gxmr-w5mj-v8hh

### Deprecated

* `runc` option `--criu` is now ignored (with a warning), and the option will
  be removed entirely in a future release. Users who need a non-standard
  `criu` binary should rely on the standard way of looking up binaries in
  `$PATH`. (#3316)
* `runc kill` option `-a` is now deprecated. Previously, it had to be specified
  to kill a container (with SIGKILL) which does not have its own private PID
  namespace (so that runc would send SIGKILL to all processes). Now, this is
  done automatically. (#3864, #3825)
* `github.com/opencontainers/runc/libcontainer/user` is now deprecated; please
  use `github.com/moby/sys/user` instead. It will be removed in a future
  release. (#4017)

### Changed

* When the Intel RDT feature is not available, its initialization is skipped,
  resulting in slightly faster `runc exec` and `runc run`. (#3306)
* `runc features` is no longer experimental. (#3861)
* libcontainer users that create and kill containers from a daemon process
  (so that the container init is a child of that process) must now implement
  a proper child reaper in case a container does not have its own private PID
  namespace, as documented in `container.Signal`. (#3825)
* Sum `anon` and `file` from `memory.stat` for cgroupv2 root usage,
  as the root does not have `memory.current` for cgroupv2.
  This aligns cgroupv2 root usage more closely with cgroupv1 reporting.
  Additionally, report root swap usage as sum of swap and memory usage,
  aligned with v1 and existing non-root v2 reporting. (#3933)
* Add `swapOnlyUsage` in `MemoryStats`. This field reports swap-only usage.
  For cgroupv1, `Usage` and `Failcnt` are set by subtracting memory usage
  from memory+swap usage. For cgroupv2, `Usage`, `Limit`, and `MaxUsage`
  are set. (#4010)
* libcontainer: `container.Signal` no longer takes an `all` argument. Whether
  or not it is necessary to kill all processes in the container individually
  is now determined automatically. (#3825, #3885)
* seccomp: enable seccomp binary tree optimization. (#3405)
* `runc run`/`runc exec`: ignore SIGURG. (#3368)
* Remove tun/tap from the default device allowlist. (#3468)
* `runc --root non-existent-dir list` now reports an error for non-existent
  root directory. (#3374)

### Fixed

* In case the runc binary resides on tmpfs, `runc init` no longer re-execs
  itself twice. (#3342)
* Our seccomp `-ENOSYS` stub now correctly handles multiplexed syscalls on
  s390 and s390x. This solves the issue where syscalls the host kernel did not
  support would return `-EPERM` despite the existence of the `-ENOSYS` stub
  code (this was due to how s390x does syscall multiplexing). (#3474)
* Remove tun/tap from the default device rules. (#3468)
* specconv: avoid mapping "acl" to `MS_POSIXACL`. (#3739)
* libcontainer: fix private PID namespace detection when killing the
  container. (#3866, #3825)
* systemd socket notification: fix race where runc exited before systemd
  properly handled the `READY` notification.
  (#3291, #3293)
* The `-ENOSYS` seccomp stub is now always generated for the native
  architecture that `runc` is running on. This is needed to work around some
  arguably specification-incompliant behaviour from Docker on architectures
  such as ppc64le, where the allowed architecture list is set to `null`. This
  ensures that we always generate at least one `-ENOSYS` stub for the native
  architecture even with these weird configs. (#4219)

### Removed

* In order to fix performance issues in the "lightweight" bindfd protection
  against [CVE-2019-5736][cve-2019-5736], the temporary `ro` bind-mount of
  `/proc/self/exe` has been removed. runc now creates a binary copy in all
  cases. See the above notes about `memfd-bind` and `runc-dmz` as well as
  `contrib/cmd/memfd-bind/README.md` for more information about how this
  (minor) change in memory usage can be further reduced. (#3987, #3599, #2532,
  #3931)
* libct/cg: Remove `EnterPid` (a function with no users). (#3797)
* libcontainer: Remove `{Pre,Post}MountCmds` which were never used and are
  obsoleted by more generic container hooks. (#3350)

[cve-2019-5736]: https://github.com/advisories/GHSA-gxmr-w5mj-v8hh

## [1.1.12] - 2024-01-31

> Now you're thinking with Portals™!

### Security

* Fix [CVE-2024-21626][cve-2024-21626], a container breakout attack that took
  advantage of a file descriptor that was leaked internally within runc (but
  never leaked to the container process). In addition to fixing the leak,
  several strict hardening measures were added to ensure that future internal
  leaks could not be used to break out in this manner again.
  Based on our
  research, while no other container runtime had a similar leak, none had any
  of the hardening steps we've introduced (and some runtimes would not check
  for any file descriptors that a calling process may have leaked to them,
  allowing for container breakouts due to basic user error).

[cve-2024-21626]: https://github.com/opencontainers/runc/security/advisories/GHSA-xr7r-f8xq-vfvv

## [1.1.11] - 2024-01-01

> Happy New Year!

### Fixed

* Fix several issues with userns path handling. (#4122, #4124, #4134, #4144)

### Changed

* Support `memory.peak` and `memory.swap.peak` in cgroups v2.
  Add `swapOnlyUsage` in `MemoryStats`. This field reports swap-only usage.
  For cgroupv1, `Usage` and `Failcnt` are set by subtracting memory usage
  from memory+swap usage. For cgroupv2, `Usage`, `Limit`, and `MaxUsage`
  are set. (#4000, #4010, #4131)
* build(deps): bump github.com/cyphar/filepath-securejoin. (#4140)

## [1.1.10] - 2023-10-31

> Śruba, przykręcona we śnie, nie zmieni sytuacji, jaka panuje na jawie.

### Added

* Support for `hugetlb.<pagesize>.rsvd` limiting and accounting. Fixes the
  issue of postgres failing when hugepage limits are set. (#3859, #4077)

### Fixed

* Fixed permissions of newly created directories to not depend on the value
  of umask in tmpcopyup feature implementation. (#3991, #4060)
* libcontainer: cgroup v1 GetStats now ignores missing `kmem.limit_in_bytes`
  (fixes the compatibility with Linux kernel 6.1+). (#4028)
* Fix a semi-arbitrary cgroup write bug when given a malicious hugetlb
  configuration. This issue is not a security issue because it requires a
  malicious `config.json`, which is outside of our threat model. (#4103)
* Various CI fixes. (#4081, #4055)

## [1.1.9] - 2023-08-10

> There is a crack in everything. That's how the light gets in.

### Added

* Added Go 1.21 to the CI matrix; other CI updates. (#3976, #3958)

### Fixed

* Fixed losing sticky bit on tmpfs (a regression in 1.1.8). (#3952, #3961)
* intelrdt: fixed ignoring ClosID on some systems. (#3550, #3978)

### Changed

* Sum `anon` and `file` from `memory.stat` for cgroupv2 root usage,
  as the root does not have `memory.current` for cgroupv2.
  This aligns cgroupv2 root usage more closely with cgroupv1 reporting.
  Additionally, report root swap usage as sum of swap and memory usage,
  aligned with v1 and existing non-root v2 reporting. (#3933)

## [1.1.8] - 2023-07-20

> 海纳百川 有容乃大

### Added

* Support riscv64. (#3905)

### Fixed

* init: do not print environment variable value. (#3879)
* libct: fix a race with systemd removal. (#3877)
* tests/int: increase num retries for oom tests. (#3891)
* man/runc: fixes. (#3892)
* Fix tmpfs mode opts when dir already exists. (#3916)
* docs/systemd: fix a broken link. (#3917)
* ci/cirrus: enable some rootless tests on cs9. (#3918)
* runc delete: call systemd's reset-failed. (#3932)
* libct/cg/sd/v1: do not update non-frozen cgroup after frozen failed. (#3921)

### Changed

* CI: bump Fedora, Vagrant, bats. (#3878)
* `.codespellrc`: update for 2.2.5. (#3909)

## [1.1.7] - 2023-04-26

> Ночевала тучка золотая на груди утеса-великана.

### Fixed

* When used with systemd v240+, systemd cgroup drivers no longer skip
  `DeviceAllow` rules if the device does not exist (a regression introduced
  in runc 1.1.3). This fix also reverts the workaround added in runc 1.1.5,
  removing an extra warning emitted by runc run/start. (#3845, #3708, #3671)

### Added

* The source code now has a new file, `runc.keyring`, which contains the keys
  used to sign runc releases.
  (#3838)

## [1.1.6] - 2023-04-11

> In this world nothing is certain but death and taxes.

### Compatibility

* This release can no longer be built from sources using Go 1.16. Using the
  latest maintained Go 1.20.x or Go 1.19.x release is recommended.
  Go 1.17 can still be used.

### Fixed

* systemd cgroup v1 and v2 drivers were deliberately ignoring `UnitExist` error
  from systemd while trying to create a systemd unit, which in some scenarios
  may result in a container not being added to the proper systemd unit and
  cgroup. (#3780, #3806)
* systemd cgroup v2 driver was incorrectly translating cpuset range from spec's
  `resources.cpu.cpus` to systemd unit property (`AllowedCPUs`) in case of more
  than 8 CPUs, resulting in the wrong AllowedCPUs setting. (#3808)
* systemd cgroup v1 driver was prefixing container's cgroup path with the path
  of PID 1 cgroup, resulting in inability to place PID 1 in a non-root cgroup.
  (#3811)
* runc run/start may return "permission denied" error when starting a rootless
  container when the file to be executed does not have executable bit set for
  the user, not taking the `CAP_DAC_OVERRIDE` capability into account. This is
  a regression in runc 1.1.4, as well as in Go 1.20 and 1.20.1. (#3715, #3817)
* cgroup v1 drivers are now aware of `misc` controller. (#3823)
* Various CI fixes and improvements, mostly to ensure Go 1.19.x and Go 1.20.x
  compatibility.

## [1.1.5] - 2023-03-29

> 囚われた屈辱は
> 反撃の嚆矢だ

### Security

The following CVEs were fixed in this release:

* [CVE-2023-25809][] is a vulnerability involving rootless containers where
  (under specific configurations), the container would have write access to the
  `/sys/fs/cgroup/user.slice/...` cgroup hierarchy. No other hierarchies on the
  host were affected. This vulnerability was discovered by Akihiro Suda.

* [CVE-2023-27561][] was a regression in our protections against tricky `/proc`
  and `/sys` configurations (where the container mountpoint is a symlink)
  causing us to be tricked into incorrectly configuring the container, which
  effectively re-introduced [CVE-2019-19921][]. This regression was present
  from v1.0.0-rc95 to v1.1.4 and was discovered by @Beuc. (#3785)

* [CVE-2023-28642][] is a different attack vector using the same regression
  as in [CVE-2023-27561][]. This was reported by Lei Wang.

[CVE-2019-19921]: https://github.com/advisories/GHSA-fh74-hm69-rqjw
[CVE-2023-25809]: https://github.com/opencontainers/runc/security/advisories/GHSA-m8cg-xc2p-r3fc
[CVE-2023-27561]: https://github.com/advisories/GHSA-vpvm-3wq2-2wvm
[CVE-2023-28642]: https://github.com/opencontainers/runc/security/advisories/GHSA-g2j6-57v7-gm8c

### Fixed

* Fix the inability to use `/dev/null` when inside a container. (#3620)
* Fix changing the ownership of host's `/dev/null` caused by fd redirection
  (a regression in 1.1.1). (#3674, #3731)
* Fix rare runc exec/enter unshare error on older kernels, including
  CentOS < 7.7. (#3776)
* nsexec: Check for errors in `write_log()`. (#3721)
* Various CI fixes and updates. (#3618, #3630, #3640, #3729)

## [1.1.4] - 2022-08-24

> If you look for perfection, you'll never be content.

### Fixed

* Fix mounting via wrong proc fd.
  When the user and mount namespaces are used, and the bind mount is followed by
  the cgroup mount in the spec, the cgroup was mounted using the bind mount's
  mount fd. (#3511)
* Switch `kill()` in `libcontainer/nsenter` to `sane_kill()`. (#3536)
* Fix "permission denied" error from `runc run` on `noexec` fs. (#3541)
* Fix failed exec after `systemctl daemon-reload`.
  Due to a regression in v1.1.3, the `DeviceAllow=char-pts rwm` rule was no
  longer added and was causing an error `open /dev/pts/0: operation not permitted: unknown`
  when systemd was reloaded. (#3554)
* Various CI fixes. (#3538, #3558, #3562)

## [1.1.3] - 2022-06-09

> In the beginning there was nothing, which exploded.

### Fixed
* Our seccomp `-ENOSYS` stub now correctly handles multiplexed syscalls on
  s390 and s390x. This solves the issue where syscalls the host kernel did not
  support would return `-EPERM` despite the existence of the `-ENOSYS` stub
  code (this was due to how s390x does syscall multiplexing). (#3478)
* Retry on dbus disconnect logic in libcontainer/cgroups/systemd now works as
  intended; this fix does not affect runc binary itself but is important for
  libcontainer users such as Kubernetes. (#3476)
* Inability to compile with recent clang due to an issue with duplicate
  constants in libseccomp-golang. (#3477)
* When using systemd cgroup driver, skip adding device paths that don't exist,
  to stop systemd from emitting warnings about those paths. (#3504)
* Socket activation was failing when more than 3 sockets were used. (#3494)
* Various CI fixes. (#3472, #3479)

### Added
* Allow bind-mounting `/proc/sys/kernel/ns_last_pid` inside the container. (#3493)

### Changed
* runc static binaries are now linked against libseccomp v2.5.4. (#3481)


## [1.1.2] - 2022-05-11

> I should think I'm going to be a perpetual student.

### Security
* A bug was found in runc where `runc exec --cap` executed processes with
  non-empty inheritable Linux process capabilities, creating an atypical Linux
  environment. For more information, see [GHSA-f3fp-gc8g-vw66][] and
  CVE-2022-29162.

### Changed
* `runc spec` no longer sets any inheritable capabilities in the created
  example OCI spec (`config.json`) file.
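
Bundles generated before this change may still list inheritable capabilities. As a minimal sketch (not part of runc), the Python snippet below reports whether a `config.json` document still sets them. It assumes the OCI runtime-spec layout `process.capabilities.inheritable`; the function name `inheritable_caps` and the sample document are illustrative only.

```python
import json


def inheritable_caps(config_text):
    """Return the inheritable capabilities listed in an OCI config.json string."""
    config = json.loads(config_text)
    # "process" and "capabilities" may be absent or null, so fall back
    # to empty containers instead of raising.
    process = config.get("process") or {}
    capabilities = process.get("capabilities") or {}
    return list(capabilities.get("inheritable") or [])


# Illustrative sample (hypothetical, not output copied from `runc spec`).
SAMPLE = """
{
  "process": {
    "capabilities": {
      "bounding": ["CAP_KILL"],
      "inheritable": ["CAP_KILL"]
    }
  }
}
"""

if __name__ == "__main__":
    caps = inheritable_caps(SAMPLE)
    if caps:
        print("inheritable capabilities still set:", ", ".join(caps))
    else:
        print("no inheritable capabilities set")
```

To audit a real bundle, read its `config.json` into a string and pass that to the function; an empty result means nothing is inherited.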

[GHSA-f3fp-gc8g-vw66]: https://github.com/opencontainers/runc/security/advisories/GHSA-f3fp-gc8g-vw66


## [1.1.1] - 2022-03-28

> Violence is the last refuge of the incompetent.

### Added
* CI is now also run on centos-stream-9. (#3436)

### Fixed
* `runc run/start` can now run a container with read-only `/dev` in OCI spec,
  rather than error out. (#3355)
* `runc exec` now ensures that `--cgroup` argument is a sub-cgroup. (#3403)
* libcontainer systemd v2 manager no longer errors out if one of the files
  listed in `/sys/kernel/cgroup/delegate` does not exist in container's cgroup.
  (#3387, #3404)
* Loosen OCI spec validation to avoid bogus "Intel RDT is not supported" error.
  (#3406)
* libcontainer/cgroups no longer panics in cgroup v1 managers if `stat`
  of `/sys/fs/cgroup/unified` returns an error other than ENOENT. (#3435)


## [1.1.0] - 2022-01-14

> A plan depends as much upon execution as it does upon concept.

### Changed
* libcontainer will now refuse to build without the nsenter package being
  correctly compiled (specifically this requires CGO to be enabled). This
  should avoid folks accidentally creating broken runc binaries (and
  incorrectly importing our internal libraries into their projects). (#3331)


## [1.1.0-rc.1] - 2021-12-14

> He who controls the spice controls the universe.

### Deprecated
* runc run/start now warns if a new container cgroup is non-empty or frozen;
  this warning will become an error in runc 1.2. (#3132, #3223)
* runc can only be built with Go 1.16 or later from this release onwards.
  (#3100, #3245, #3325)

### Removed
* `cgroup.GetHugePageSizes` has been removed entirely, and been replaced with
  `cgroup.HugePageSizes` which is more efficient. (#3234)
* `intelrdt.GetIntelRdtPath` has been removed.
  Users who were using this
  function to get the intelrdt root should use the new `intelrdt.Root`
  instead. (#2920, #3239)

### Added
* Add support for RDMA cgroup added in Linux 4.11. (#2883)
* runc exec now produces exit code of 255 when the exec failed.
  This may help in distinguishing between runc exec failures
  (such as invalid options, non-running container or non-existent
  binary etc.) and failures of the command being executed. (#3073)
* runc run: new `--keep` option to skip removal of exited containers' artefacts.
  This might be useful to check the state (e.g. of cgroup controllers) after
  the container has exited. (#2817, #2825)
* seccomp: add support for `SCMP_ACT_KILL_PROCESS` and `SCMP_ACT_KILL_THREAD`
  (the latter is just an alias for `SCMP_ACT_KILL`). (#3204)
* seccomp: add support for `SCMP_ACT_NOTIFY` (seccomp actions). This allows
  users to create sophisticated seccomp filters where syscalls can be
  efficiently emulated by privileged processes on the host. (#2682)
* checkpoint/restore: add an option (`--lsm-mount-context`) to set
  a different LSM mount context on restore. (#3068)
* runc releases are now cross-compiled for several architectures. Static
  builds for said architectures will be available for all future releases.
  (#3197)
* intelrdt: support ClosID parameter. (#2920)
* runc exec --cgroup: an option to specify a (non-top) in-container cgroup
  to use for the process being executed. (#3040, #3059)
* cgroup v1 controllers now support hybrid hierarchy (i.e. when on a cgroup v1
  machine a cgroup2 filesystem is mounted to /sys/fs/cgroup/unified, runc
  run/exec now adds the container to the appropriate cgroup under it). (#2087,
  #3059)
* sysctl: allow slashes in sysctl names, to better match `sysctl(8)`'s
  behaviour. (#3254, #3257)
* mounts: add support for bind-mounts which are inaccessible after switching
  the user namespace.
  Note that this does not permit the container any
  additional access to the host filesystem, it simply allows containers to
  have bind-mounts configured for paths the user can access but have
  restrictive access control settings for other users. (#2576)
* Add support for recursive mount attributes using `mount_setattr(2)`. These
  have the same names as the proposed `mount(8)` options -- just prepend `r`
  to the option name (such as `rro`). (#3272)
* Add `runc features` subcommand to allow runc users to detect what features
  runc has been built with. This includes critical information such as
  supported mount flags, hook names, and so on. Note that the output of this
  command is subject to change and will not be considered stable until runc
  1.2 at the earliest. The runtime-spec specification for this feature is
  being developed in [opencontainers/runtime-spec#1130]. (#3296)

[opencontainers/runtime-spec#1130]: https://github.com/opencontainers/runtime-spec/pull/1130

### Changed
* system: improve performance of `/proc/$pid/stat` parsing. (#2696)
* cgroup2: when `/sys/fs/cgroup` is configured as a read-write mount, change
  the ownership of certain cgroup control files (as per
  `/sys/kernel/cgroup/delegate`) to allow for proper deferral to the container
  process. (#3057)
* docs: series of improvements to man pages to make them easier to read and
  use. (#3032)

#### libcontainer API
* internal api: remove internal error types and handling system, switch to Go
  wrapped errors. (#3033)
* New configs.Cgroup structure fields (#3177):
  * Systemd (whether to use systemd cgroup manager); and
  * Rootless (whether to use rootless cgroups).
* New cgroups/manager package aiming to simplify cgroup manager instantiation.
  (#3177)
* All cgroup managers' instantiation methods now initialize cgroup paths and
  can return errors.
  This allows using any cgroup manager method (e.g.
  Exists, Destroy, Set, GetStats) right after instantiation, which was not
  possible before (as paths were initialized in Apply only). (#3178)

### Fixed
* nsenter: do not try to close already-closed fds during container setup and
  bail on close(2) failures. (#3058)
* runc checkpoint/restore: fixed for containers with an external bind mount
  whose destination is a symlink. (#3047)
* cgroup: improve openat2 handling for cgroup directory handle hardening.
  (#3030)
* `runc delete -f` now succeeds (rather than timing out) on a paused
  container. (#3134)
* runc run/start/exec now refuses a frozen cgroup (paused container in case of
  exec). Users can disable this using `--ignore-paused`. (#3132, #3223)
* config: do not permit null bytes in mount fields. (#3287)


## [1.0.3] - 2021-12-06

> If you were waiting for the opportune moment, that was it.

### Security
* A potential vulnerability was discovered in runc (related to an internal
  usage of netlink), however upon further investigation we discovered that
  while this bug was exploitable on the master branch of runc, no released
  version of runc could be exploited using this bug. The exploit required being
  able to create a netlink attribute with a length that would overflow a uint16
  but this was not possible in any released version of runc. For more
  information, see [GHSA-v95c-p5hm-xq8f][] and CVE-2021-43784.

### Fixed
* Fixed inability to start a container with read-write bind mount of a
  read-only fuse host mount. (#3283, #3292)
* Fixed inability to start when read-only /dev is set in spec. (#3276, #3277)
* Fixed not removing sub-cgroups upon container delete, when rootless cgroup v2
  is used with older systemd.
  (#3226, #3297)
* Fixed returning error from GetStats when hugetlb is unsupported (which causes
  excessive logging for Kubernetes). (#3233, #3295)
* Improved an error message when dbus-user-session is not installed and
  rootless + cgroup2 + systemd are used. (#3212)

[GHSA-v95c-p5hm-xq8f]: https://github.com/opencontainers/runc/security/advisories/GHSA-v95c-p5hm-xq8f


## [1.0.2] - 2021-07-16

> Given the right lever, you can move a planet.

### Changed
* Made release builds reproducible from now on. (#3099, #3142)

### Fixed
* Fixed a failure to set CPU quota period in some cases on cgroup v1. (#3090,
  #3115)
* Fixed the inability to start a container with the "adding seccomp filter
  rule for syscall ..." error, caused by redundant seccomp rules (i.e. those
  that have an action equal to the default one). Such redundant rules are now
  skipped. (#3109, #3129)
* Fixed a rare debug log race in runc init, which can result in occasional
  harmful "failed to decode ..." errors from runc run or exec. (#3120, #3130)
* Fixed the check in cgroup v1 systemd manager if a container needs to be
  frozen before Set, and add a setting to skip such freeze unconditionally.
  The previous fix for that issue, done in runc 1.0.1, was not working.
  (#3166, #3167)


## [1.0.1] - 2021-07-16

> If in doubt, Meriadoc, always follow your nose.

### Fixed
* Fixed occasional runc exec/run failure ("interrupted system call") on an
  Azure volume. (#3045, #3074)
* Fixed "unable to find groups ... token too long" error with /etc/group
  containing lines longer than 64K characters. (#3062, #3079)
* cgroup/systemd/v1: fix leaving cgroup frozen after Set if a parent cgroup is
  frozen. This is a regression in 1.0.0, not affecting runc itself but some
  of libcontainer users (e.g. Kubernetes).
  (#3081, #3085)
* cgroupv2: bpf: Ignore inaccessible existing programs in case of
  permission error when handling replacement of existing bpf cgroup
  programs. This fixes a regression in 1.0.0, where some SELinux
  policies would block runc from being able to run entirely. (#3055, #3087)
* cgroup/systemd/v2: don't freeze cgroup on Set. (#3067, #3092)
* cgroup/systemd/v1: avoid unnecessary freeze on Set. (#3082, #3093)


## [1.0.0] - 2021-06-22

> A wizard is never late, nor is he early, he arrives precisely when he means
> to.

As runc follows Semantic Versioning, we will endeavour to not make any
breaking changes without bumping the major version number of runc.
However, it should be noted that Go API usage of runc's internal
implementation (libcontainer) is *not* covered by this policy.

### Removed
* Removed libcontainer/configs.Device* identifiers (deprecated since rc94,
  use libcontainer/devices). (#2999)
* Removed libcontainer/system.RunningInUserNS function (deprecated since
  rc94, use libcontainer/userns). (#2999)

### Deprecated
* The usage of relative paths for mountpoints will now produce a warning
  (such configurations are outside of the spec, and in future runc will
  produce an error when given such configurations). (#2917, #3004)

### Fixed
* cgroupv2: devices: rework the filter generation to produce consistent
  results with cgroupv1, and always clobber any existing eBPF
  program(s) to fix `runc update` and avoid leaking eBPF programs
  (resulting in errors when managing containers). (#2951)
* cgroupv2: correctly convert "number of IOs" statistics in a
  cgroupv1-compatible way. (#2965, #2967, #2968, #2964)
* cgroupv2: support larger than 32-bit IO statistics on 32-bit architectures.
* cgroupv2: wait for freeze to finish before returning from the freezing
  code, optimize the method for checking whether a cgroup is frozen. (#2955)
* cgroups/systemd: fixed "retry on dbus disconnect" logic introduced in rc94
* cgroups/systemd: fixed returning "unit already exists" error from a systemd
  cgroup manager (regression in rc94). (#2997, #2996)

### Added
* cgroupv2: support SkipDevices with systemd driver. (#2958, #3019)
* cgroup1: blkio: support BFQ weights. (#3010)
* cgroupv2: set per-device io weights if BFQ IO scheduler is available.
  (#3022)

### Changed
* cgroup/systemd: return, not ignore, stop unit error from Destroy. (#2946)
* Fix all golangci-lint failures. (#2781, #2962)
* Make `runc --version` output sane even when built with `go get` or
  otherwise outside of our build scripts. (#2962)
* cgroups: set SkipDevices during runc update (so we don't modify
  cgroups at all during `runc update`). (#2994)

<!-- minor releases -->
[Unreleased]: https://github.com/opencontainers/runc/compare/v1.2.0-rc.1...HEAD
[1.1.0]: https://github.com/opencontainers/runc/compare/v1.1.0-rc.1...v1.1.0
[1.0.0]: https://github.com/opencontainers/runc/releases/tag/v1.0.0

<!-- 1.0.z patch releases -->
[Unreleased 1.0.z]: https://github.com/opencontainers/runc/compare/v1.0.3...release-1.0
[1.0.3]: https://github.com/opencontainers/runc/compare/v1.0.2...v1.0.3
[1.0.2]: https://github.com/opencontainers/runc/compare/v1.0.1...v1.0.2
[1.0.1]: https://github.com/opencontainers/runc/compare/v1.0.0...v1.0.1

<!-- 1.1.z patch releases -->
[Unreleased 1.1.z]: https://github.com/opencontainers/runc/compare/v1.1.12...release-1.1
[1.1.12]: https://github.com/opencontainers/runc/compare/v1.1.11...v1.1.12
[1.1.11]: https://github.com/opencontainers/runc/compare/v1.1.10...v1.1.11
[1.1.10]: https://github.com/opencontainers/runc/compare/v1.1.9...v1.1.10
[1.1.9]: https://github.com/opencontainers/runc/compare/v1.1.8...v1.1.9
[1.1.8]: https://github.com/opencontainers/runc/compare/v1.1.7...v1.1.8
[1.1.7]: https://github.com/opencontainers/runc/compare/v1.1.6...v1.1.7
[1.1.6]: https://github.com/opencontainers/runc/compare/v1.1.5...v1.1.6
[1.1.5]: https://github.com/opencontainers/runc/compare/v1.1.4...v1.1.5
[1.1.4]: https://github.com/opencontainers/runc/compare/v1.1.3...v1.1.4
[1.1.3]: https://github.com/opencontainers/runc/compare/v1.1.2...v1.1.3
[1.1.2]: https://github.com/opencontainers/runc/compare/v1.1.1...v1.1.2
[1.1.1]: https://github.com/opencontainers/runc/compare/v1.1.0...v1.1.1
[1.1.0-rc.1]: https://github.com/opencontainers/runc/compare/v1.0.0...v1.1.0-rc.1

<!-- 1.2.z patch releases -->
[1.2.0-rc.1]: https://github.com/opencontainers/runc/compare/v1.1.0...v1.2.0-rc.1