---
layout: docs
page_title: Filesystem
description: |-
  Nomad creates an allocation working directory for every allocation. Learn what
  goes into the working directory and how it interacts with Nomad task drivers.
---

# Filesystem

Nomad creates a working directory for each allocation on a client. This
directory can be found in the Nomad [`data_dir`] at
`./alloc/«alloc_id»`. The allocation working directory is where Nomad
creates task directories and directories shared between tasks, writes logs for
tasks, and downloads artifacts or templates.

An allocation with two tasks (named `task1` and `task2`) will have an
allocation directory like the one below.

```shell-session
.
├── alloc
│   ├── data
│   ├── logs
│   │   ├── task1.stderr.0
│   │   ├── task1.stdout.0
│   │   ├── task2.stderr.0
│   │   └── task2.stdout.0
│   └── tmp
├── task1
│   ├── local
│   ├── secrets
│   └── tmp
└── task2
    ├── local
    ├── secrets
    └── tmp
```

- **alloc/**: This directory is shared across all tasks in an allocation and
  can be used to store data that multiple tasks need, such as files consumed
  by a log shipper. This is the directory that's provided to the task as the
  `NOMAD_ALLOC_DIR`. Note that this `alloc/` directory is not the same as the
  "allocation working directory", which is the top-level directory. All tasks
  in a task group can read and write to the `alloc/` directory. However, the
  full host path may differ depending on the task driver's [filesystem
  isolation mode], so tasks should always use the `NOMAD_ALLOC_DIR`
  environment variable to find this path rather than relying on the specific
  implementation of the [`none`](#none-isolation),
  [`chroot`](#chroot-isolation), or [`image`](#image-isolation)
  modes.
  Within the `alloc/` directory are three standard directories:

  - **alloc/data/**: This directory is the location used by the
    [`ephemeral_disk`] stanza for shared data.

  - **alloc/logs/**: This directory is the location of the log files for every
    task within an allocation. The `nomad alloc logs` command streams these
    files to your terminal.

  - **alloc/tmp/**: A temporary directory used as scratch space by task drivers.

- **«taskname»**: Each task has a **task working directory** with the same name as
  the task. Tasks in a task group can't read each other's task working
  directory. Depending on the task driver's [filesystem isolation mode], a
  task may not be able to access the task working directory. Within the
  task working directory are three standard directories:

  - **«taskname»/local/**: This directory is the location provided to the task as the
    `NOMAD_TASK_DIR`. Note this is not the same as the "task working
    directory". This directory is private to the task.

  - **«taskname»/secrets/**: This directory is the location provided to the task as
    `NOMAD_SECRETS_DIR`. The contents of files in this directory cannot be read
    by the `nomad alloc fs` command. It can be used to store secret data that
    should not be visible outside the task.

  - **«taskname»/tmp/**: A temporary directory used as scratch space by task drivers.

The allocation working directory is the directory you see when using the
`nomad alloc fs` command.
If you were to run `nomad alloc fs` against the
allocation that made the working directory shown above, you'd see the
following:

```shell-session
$ nomad alloc fs c0b2245f
Mode         Size     Modified Time         Name
drwxrwxrwx   4.0 KiB  2020-10-27T18:00:39Z  alloc/
drwxrwxrwx   4.0 KiB  2020-10-27T18:00:32Z  task1/
drwxrwxrwx   4.0 KiB  2020-10-27T18:00:39Z  task2/

$ nomad alloc fs c0b2245f alloc/
Mode         Size     Modified Time         Name
drwxrwxrwx   4.0 KiB  2020-10-27T18:00:32Z  data/
drwxrwxrwx   4.0 KiB  2020-10-27T18:00:39Z  logs/
drwxrwxrwx   4.0 KiB  2020-10-27T18:00:32Z  tmp/

$ nomad alloc fs c0b2245f task1/
Mode         Size     Modified Time         Name
drwxrwxrwx   4.0 KiB  2020-10-27T18:00:33Z  local/
drwxrwxrwx   60 B     2020-10-27T18:00:32Z  secrets/
dtrwxrwxrwx  4.0 KiB  2020-10-27T18:00:32Z  tmp/
```

## Task Drivers and Filesystem Isolation Modes

Depending on the task driver, the task's working directory may also be the
root directory for the running task. This is determined by the task driver's
[filesystem isolation capability].

### `image` isolation

Task drivers like `docker` or `qemu` use `image` isolation, where the task
driver isolates task filesystems as machine images. These filesystems are
owned by the task driver's external process and not by Nomad itself. These
filesystems will not typically be found anywhere in the allocation working
directory. For example, Docker containers will have their overlay filesystem
unpacked to `/var/run/docker/containerd/«container_id»` by default.

Nomad will provide the `NOMAD_ALLOC_DIR`, `NOMAD_TASK_DIR`, and
`NOMAD_SECRETS_DIR` to tasks with `image` isolation, typically by
bind-mounting them to the task driver's filesystem.

You can see an example of `image` isolation by running the following minimal
job:

```hcl
job "example" {
  datacenters = ["dc1"]

  task "task1" {
    driver = "docker"

    config {
      image = "redis:6.0"
    }
  }
}
```

If you look at the allocation working directory from the host, you'll see a
minimal filesystem tree:

```shell-session
.
├── alloc
│   ├── data
│   ├── logs
│   │   ├── task1.stderr.0
│   │   └── task1.stdout.0
│   └── tmp
└── task1
    ├── local
    ├── secrets
    └── tmp
```

The `nomad alloc fs` command shows the same bare directory tree:

```shell-session
$ nomad alloc fs b0686b27
Mode         Size     Modified Time         Name
drwxrwxrwx   4.0 KiB  2020-10-27T18:51:54Z  alloc/
drwxrwxrwx   4.0 KiB  2020-10-27T18:51:54Z  task1/

$ nomad alloc fs b0686b27 task1
Mode         Size     Modified Time         Name
drwxrwxrwx   4.0 KiB  2020-10-27T18:51:54Z  local/
drwxrwxrwx   60 B     2020-10-27T18:51:54Z  secrets/
dtrwxrwxrwx  4.0 KiB  2020-10-27T18:51:54Z  tmp/

$ nomad alloc fs b0686b27 task1/local
Mode         Size     Modified Time         Name
```

If you inspect the Docker container that's created, you'll see three
directories bind-mounted into the container:

```shell-session
$ docker inspect 32e | jq '.[0].HostConfig.Binds'
[
  "/var/nomad/alloc/b0686b27-8af3-8252-028f-af485c81a8b3/alloc:/alloc",
  "/var/nomad/alloc/b0686b27-8af3-8252-028f-af485c81a8b3/task1/local:/local",
  "/var/nomad/alloc/b0686b27-8af3-8252-028f-af485c81a8b3/task1/secrets:/secrets"
]
```

The root filesystem inside the container can see these three mounts, along
with the rest of the container filesystem:

```shell-session
$ docker exec -it 32e /bin/sh
# ls /
alloc  boot  dev  home  lib64  media  opt   root  sbin     srv  tmp  var
bin    data  etc  lib   local  mnt    proc  run   secrets  sys  usr
```

Note that because the three directories
are bind-mounted into the container
filesystem, nothing written outside those three directories elsewhere in the
allocation working directory will be accessible inside the container. This
means templates, artifacts, and dispatch payloads for tasks with `image`
isolation must be written into the `NOMAD_ALLOC_DIR`, `NOMAD_TASK_DIR`, or
`NOMAD_SECRETS_DIR`.

To work around this limitation, you can use the task driver's mounting
capabilities to mount one of the three directories to another location in the
task. For example, with the Docker driver you can use the driver's `mounts`
block to bind a secret written by a `template` block to the
`NOMAD_SECRETS_DIR` into a configuration directory elsewhere in the task:

```hcl
job "example" {
  datacenters = ["dc1"]

  task "task1" {
    driver = "docker"

    config {
      image = "redis:6.0"
      mounts = [{
        type     = "bind"
        source   = "secrets"
        target   = "/etc/redis.d"
        readonly = true
      }]
    }

    # The template block belongs to the task, not to the driver config.
    template {
      destination = "${NOMAD_SECRETS_DIR}/redis.conf"
      data        = <<EOT
{{ with secret "secrets/data/redispass" }}
requirepass {{ .Data.data.passwd }}{{ end }}
EOT
    }
  }
}
```

Note that relative mount source paths are relative to the task working
directory, so to bind the `NOMAD_ALLOC_DIR` as a mount source, you will need
to use a relative path that traverses up into the allocation working directory
(for example, `source = "../alloc"`).

### `chroot` isolation

Task drivers like `exec` or `java` (on Linux) use `chroot` isolation, where
the task driver isolates task filesystems with `chroot` or `pivot_root`. These
isolated filesystems will be built inside the task working directory.

You can see an example of `chroot` isolation by running the following minimal
job on Linux:

```hcl
job "example" {
  datacenters = ["dc1"]

  task "task2" {
    driver = "exec"

    config {
      command = "/bin/sh"
      args    = ["-c", "sleep 600"]
    }
  }
}
```

If you look at the allocation working directory from the host, you'll see a
filesystem tree that has been populated with the task driver's [chroot
contents], in addition to the `NOMAD_ALLOC_DIR`, `NOMAD_TASK_DIR`, and
`NOMAD_SECRETS_DIR`:

```shell-session
.
├── alloc
│   ├── container
│   ├── data
│   ├── logs
│   └── tmp
└── task2
    ├── alloc
    ├── bin
    ├── dev
    ├── etc
    ├── executor.out
    ├── lib
    ├── lib32
    ├── lib64
    ├── local
    ├── proc
    ├── run
    ├── sbin
    ├── secrets
    ├── sys
    ├── tmp
    └── usr
```

Likewise, the root directory of the task is now available in the `nomad alloc fs` command output:

```shell-session
$ nomad alloc fs eebd13a7
Mode         Size     Modified Time         Name
drwxrwxrwx   4.0 KiB  2020-10-27T19:05:24Z  alloc/
drwxrwxrwx   4.0 KiB  2020-10-27T19:05:24Z  task2/

$ nomad alloc fs eebd13a7 task2
Mode         Size     Modified Time         Name
drwxrwxrwx   4.0 KiB  2020-10-27T19:05:24Z  alloc/
drwxr-xr-x   4.0 KiB  2020-10-27T19:05:22Z  bin/
drwxr-xr-x   4.0 KiB  2020-10-27T19:05:24Z  dev/
drwxr-xr-x   4.0 KiB  2020-10-27T19:05:22Z  etc/
-rw-r--r--   297 B    2020-10-27T19:05:24Z  executor.out
drwxr-xr-x   4.0 KiB  2020-10-27T19:05:22Z  lib/
drwxr-xr-x   4.0 KiB  2020-10-27T19:05:22Z  lib32/
drwxr-xr-x   4.0 KiB  2020-10-27T19:05:22Z  lib64/
drwxrwxrwx   4.0 KiB  2020-10-27T19:05:22Z  local/
drwxr-xr-x   4.0 KiB  2020-10-27T19:05:24Z  proc/
drwxr-xr-x   4.0 KiB  2020-10-27T19:05:22Z  run/
drwxr-xr-x   12 KiB   2020-10-27T19:05:22Z  sbin/
drwxrwxrwx   60 B     2020-10-27T19:05:22Z  secrets/
drwxr-xr-x   4.0 KiB  2020-10-27T19:05:24Z  sys/
dtrwxrwxrwx  4.0 KiB  2020-10-27T19:05:22Z  tmp/
drwxr-xr-x   4.0 KiB  2020-10-27T19:05:22Z  usr/
```

Nomad will provide the `NOMAD_ALLOC_DIR`, `NOMAD_TASK_DIR`, and
`NOMAD_SECRETS_DIR` to tasks with `chroot` isolation. But unlike with `image`
isolation, Nomad does not need to bind-mount the `NOMAD_TASK_DIR` directory
because it can be directly created inside the chroot.

```shell-session
$ nomad alloc exec eebd13a7 /bin/sh
$ mount
...
/dev/mapper/root on /alloc type ext4 (rw,relatime,errors=remount-ro,data=ordered)
tmpfs on /secrets type tmpfs (rw,noexec,relatime,size=1024k)
...
```

### `none` isolation

The `raw_exec` task driver (or the `java` task driver on Windows) uses the
`none` filesystem isolation mode. This means the task driver does not isolate
the filesystem for the task, and the task can read and write anywhere the
user that's running Nomad can.

You can see an example of `none` isolation by running the following minimal
`raw_exec` job on Linux or Unix.

```hcl
job "example" {
  datacenters = ["dc1"]

  task "task3" {
    driver = "raw_exec"

    config {
      command = "/bin/sh"
      args    = ["-c", "sleep 600"]
    }
  }
}
```

If you look at the allocation working directory from the host, you'll see a
minimal filesystem tree:

```shell-session
.
├── alloc
│   ├── data
│   ├── logs
│   │   ├── task3.stderr.0
│   │   └── task3.stdout.0
│   └── tmp
└── task3
    ├── executor.out
    ├── local
    ├── secrets
    └── tmp
```

The `nomad alloc fs` command shows the same bare directory tree:

```shell-session
$ nomad alloc fs 87ec7d12 task3
Mode         Size     Modified Time         Name
-rw-r--r--   140 B    2020-10-27T19:15:33Z  executor.out
drwxrwxrwx   4.0 KiB  2020-10-27T19:15:33Z  local/
drwxrwxrwx   60 B     2020-10-27T19:15:33Z  secrets/
dtrwxrwxrwx  4.0 KiB  2020-10-27T19:15:33Z  tmp/
```

But if you use `nomad alloc exec` to view the filesystem from inside the
task, you'll see that the task has access to the entire root
filesystem. The `NOMAD_ALLOC_DIR`, `NOMAD_TASK_DIR`, and `NOMAD_SECRETS_DIR`
point to the filepath on the host, not a path anchored in the task working
directory. And the task is running as `root`, because the Nomad client agent
is running as `root`. This is why the `raw_exec` driver is disabled by
default.

```shell-session
$ nomad alloc exec 87ec7d12 /bin/sh
# ls /
bin   dev  home        lib    lib64   lost+found  mnt  proc  run   snap  sys  usr  vmlinuz
boot  etc  initrd.img  lib32  libx32  media       opt  root  sbin  srv   tmp  var

# echo $NOMAD_SECRETS_DIR
/var/nomad/alloc/87ec7d12-5e35-8fba-96cc-09e5376be15a/task3/secrets

# whoami
root
```

## Templates, Artifacts, and Dispatch Payloads

The other contents of the allocation working directory depend on what features
the job specification uses. The allocation working directory is populated by
other features in a specific order:

- The allocation working directory is created.
- The ephemeral disk data is [migrated] from any previous allocation.
- [CSI volumes] are staged.
- Then, for each task:
  - Task working directories are created.
  - [Dispatch payloads] are written.
  - [Artifacts] are downloaded.
  - [Templates] are rendered.
  - The task is started by the task driver, which includes all bind mounts and
    [volume mounts].

Dispatch payloads, artifacts, and templates are written to the task working
directory before a task can start because the resulting files may be the
binary or image that the task runs. For example, an `artifact` can be used to
download a Docker image or .jar file, or a `template` can be used to render a
shell script that's run by `exec`.

The `artifact` and `template` blocks write their data to a destination
relative to the task working directory, not the `NOMAD_TASK_DIR`. For task
drivers with `image` filesystem isolation, this means the `destination` field
path should be prefixed with either `NOMAD_TASK_DIR` or
`NOMAD_SECRETS_DIR`. Otherwise, the file will not be visible from inside the
resulting container. (The `dispatch_payload` block always writes its data to
the `NOMAD_TASK_DIR`.)

For [CSI volumes], the client will stage the volume before setting up the task
working directory. Staging typically involves mounting the volume into the CSI
plugin's task directory, sending commands to the plugin to format the volume
as required, and making a volume claim to the Nomad server.

The behavior of the `volume_mount` block is controlled by the task driver. The
client builds a mount configuration describing the host volume or CSI volume
and passes it to the task driver to execute. Because the task driver mounts
the volume, it is not possible to have `artifact`, `template`, or
`dispatch_payload` blocks write to a volume.
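
The destination rules for `artifact` and `template` described earlier in this
section can be sketched as the following job fragment. It is a minimal
illustration that assumes a Docker task. The group name, the artifact URL, the
file names, and the template contents are hypothetical placeholders and do not
come from a real deployment.

```hcl
job "example" {
  datacenters = ["dc1"]

  # Hypothetical group name, written out explicitly for clarity.
  group "group1" {
    task "task1" {
      driver = "docker"

      config {
        image = "redis:6.0"
      }

      # Hypothetical artifact URL. The destination is relative to the task
      # working directory, so "local/" resolves to the NOMAD_TASK_DIR and the
      # file is visible inside the container under /local.
      artifact {
        source      = "https://example.com/redis-extra.conf"
        destination = "local/"
      }

      # Rendered into the NOMAD_SECRETS_DIR, which is visible inside the
      # container under /secrets. The contents are a placeholder.
      template {
        destination = "${NOMAD_SECRETS_DIR}/example.env"
        data        = "EXAMPLE_SETTING=placeholder"
      }
    }
  }
}
```

A destination such as `config/example.env` would instead land in the task
working directory outside the bind-mounted directories, so a task with `image`
isolation would not be able to see it.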

[artifacts]: /docs/job-specification/artifact
[csi volumes]: /docs/concepts/plugins/csi
[dispatch payloads]: /docs/job-specification/dispatch_payload
[templates]: /docs/job-specification/template
[`data_dir`]: /docs/configuration#data_dir
[`ephemeral_disk`]: /docs/job-specification/ephemeral_disk
[artifact]: /docs/job-specification/artifact
[chroot contents]: /docs/drivers/exec#chroot
[filesystem isolation capability]: /docs/concepts/plugins/task-drivers#capabilities-capabilities-error
[filesystem isolation mode]: #task-drivers-and-filesystem-isolation-modes
[migrated]: /docs/job-specification/ephemeral_disk#migrate
[template]: /docs/job-specification/template
[volume mounts]: /docs/job-specification/volume_mount