---
layout: docs
page_title: Upgrade Guides
sidebar_title: Specific Version Details
description: |-
  Specific versions of Nomad may have additional information about the upgrade
  process beyond the standard flow.
---

# Upgrade Guides

The [upgrading page](/docs/upgrade) covers the details of doing a standard
upgrade. However, specific versions of Nomad may have more details provided for
their upgrades as a result of new features or changed behavior. This page is
used to document those details separately from the standard upgrade flow.

## Nomad 1.0.3, 0.12.10

Nomad versions 1.0.3 and 0.12.10 change the behavior of the `exec` and `java` drivers so that
tasks are isolated in their own PID and IPC namespaces. As a result, the
process launched by these drivers will be PID 1 in the namespace. This has a
[significant impact](https://man7.org/linux/man-pages/man7/pid_namespaces.7.html)
on the treatment of a process by the Linux kernel. Furthermore, tasks in the
same allocation will no longer be able to coordinate using signals, SystemV IPC
objects, or POSIX message queues. Operators should weigh the potential impact of an
upgrade on their applications against the security consequences inherent in using
the host namespaces.

This is the sole change for Nomad 1.0.3, intended to provide better process
isolation by default. An upcoming version of Nomad will include options for
configuring this behavior.

This change is limited to the `exec` and `java` driver plugins. It does not affect
the Nomad server. It only affects Nomad clients running on Linux that use the
`exec` or `java` drivers or third-party driver plugins that rely on the shared
Nomad executor library.

Upgrading a Nomad client to 1.0.3 or 0.12.10 will not restart existing tasks.
As such, processes from existing `exec`/`java` tasks will need to be manually restarted
(using `alloc stop` or another mechanism) in order to be fully isolated.

## Nomad 1.0.2

#### Dynamic secrets trigger template changes on client restart

Nomad 1.0.2 changed the behavior of template `change_mode` triggers when a
client node restarts. In Nomad 1.0.1 and earlier, the first rendering of a
template after a client restart would not trigger the `change_mode`. For
dynamic secrets such as the Vault PKI secrets engine, this resulted in the
secret being updated without the task being restarted or signalled. When the
secret's lease expired at some later time, the task workload might fail
because of the stale secret. For example, a web server's SSL certificate would
be expired and browsers would be unable to connect.

In Nomad 1.0.2, when a client node is restarted, any task with Vault secrets
that are generated or have expired will have its `change_mode` triggered. If
`change_mode = "restart"`, this will result in the task being restarted, to
avoid the task failing unexpectedly at some point in the future. This change
only impacts tasks using dynamic Vault secrets engines such as [PKI][pki], or
when secrets are rotated. Secrets that don't change in Vault will not trigger
a `change_mode` on client restart.

## Nomad 1.0.1

#### Envoy worker threads

Nomad v1.0.0 changed the default behavior around the number of worker threads
created by Envoy when used as a sidecar for Consul Connect. In Nomad
v1.0.1, the same default setting of [`--concurrency=1`][envoy_concurrency] is set for Envoy when used
as a Connect gateway. As before, the [`meta.connect.proxy_concurrency`][proxy_concurrency]
property can be set in client configuration to override the default value.

## Nomad 1.0.0

### HCL2 for Job specification

Nomad v1.0.0 adopts HCL2 for parsing the job spec.
HCL2 extends HCL with more
expression and reuse support, but enforces a stricter schema for HCL blocks
(a.k.a. stanzas). Check [HCL](/docs/job-specification/hcl2) for more details.

### Signal used when stopping Docker tasks

When stopping tasks running with the Docker task driver, Nomad documents that a
`SIGTERM` will be issued (unless configured with `kill_signal`). However, recent
versions of Nomad would issue `SIGINT` instead. Starting with Nomad v1.0.0,
`SIGTERM` is once again sent by default when stopping Docker tasks.

### Deprecated metrics have been removed

Nomad v0.7.0 added support for tagged metrics and deprecated untagged metrics.
There was support for configuring backwards-compatible metrics. This support has
been removed with v1.0.0, and all metrics will be emitted with tags.

### Null characters in region, datacenter, job name/ID, task group name, and task names

Starting with Nomad v1.0.0, jobs will fail validation if any of the following
contain a null character: the job ID or name, the task group name, or the task
name. Any jobs with a null character in one of these fields should be modified
before an update to v1.0.0. Similarly, client and server config validation will
prohibit either the region or the datacenter from containing null characters.

### EC2 CPU characteristics may be different

Starting with Nomad v1.0.0, the AWS fingerprinter uses data derived from the
official AWS EC2 API to determine default CPU performance characteristics,
including core count and core speed. This data should be accurate for each
instance type per region. Previously, Nomad used a hand-made lookup table that
was not region aware and may have contained inaccurate or incomplete data. As
part of this change, the AWS fingerprinter no longer sets the `cpu.modelname`
attribute.
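
If the newly fingerprinted values differ from what existing jobs were sized
against, the total can be pinned in the client configuration. The fragment
below is a minimal sketch only: the 8000 MHz figure is a hypothetical value
chosen for illustration, not a recommendation for any instance type.

```hcl
client {
  enabled = true

  # Hypothetical value, in MHz. When set, this replaces the total CPU
  # capacity that the fingerprinter would otherwise derive for the node.
  cpu_total_compute = 8000
}
```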

As before, `cpu_total_compute` can be used to override the discovered CPU
resources available to the Nomad client.

### Inclusive language

Starting with Nomad v1.0.0, the terms `blacklist` and `whitelist` have been
deprecated in client configuration and driver configuration. The existing
configuration values are permitted but will be removed in a future version of
Nomad. The specific configuration values replaced are:

- Client `driver.blacklist` is replaced with `driver.denylist`.

- Client `driver.whitelist` is replaced with `driver.allowlist`.

- Client `env.blacklist` is replaced with `env.denylist`.

- Client `fingerprint.blacklist` is replaced with `fingerprint.denylist`.

- Client `fingerprint.whitelist` is replaced with `fingerprint.allowlist`.

- Client `user.blacklist` is replaced with `user.denylist`.

- Client `template.function_blacklist` is replaced with
  `template.function_denylist`.

- Docker driver `docker.caps.whitelist` is replaced with
  `docker.caps.allowlist`.

### Consul Connect

Nomad 1.0's Consul Connect integration works best with Consul 1.9 or later. The
ideal upgrade path is:

1. Create a new Nomad client image with Nomad 1.0 and Consul 1.9 or later.
2. Add new hosts based on the image.
3. [Drain][drain-cli] and shut down old Nomad client nodes.

While in-place upgrades and older versions of Consul are supported by Nomad 1.0,
Envoy proxies will drop and stop accepting connections while the Nomad agent is
restarting. Nomad 1.0 with Consul 1.9 does not have this limitation.

#### Envoy proxy versions

Nomad v1.0.0 changes the behavior around the selection of Envoy version used for
Connect sidecar proxies. Previously, Nomad always defaulted to Envoy v1.11.2 if
neither the `meta.connect.sidecar_image` parameter nor the `sidecar_task` stanza was
explicitly configured.
Likewise, the same version of Envoy would be used for
Connect ingress gateways if `meta.connect.gateway_image` was unset. Starting
with Nomad v1.0.0, each Nomad Client will query Consul for a list of supported
Envoy versions. Nomad will make use of the latest version of Envoy supported by
the Consul agent when launching Envoy as a Connect sidecar proxy. If the version
of the Consul agent is older than v1.7.8, v1.8.4, or v1.9.0, Nomad will fall back
to the v1.11.2 version of Envoy. As before, if the `meta.connect.sidecar_image`,
`meta.connect.gateway_image`, or `sidecar_task` stanza are set, those settings
take precedence.

When upgrading Nomad Clients from a previous version to v1.0.0 and above, it is
recommended to also upgrade the Consul agents to v1.7.8, v1.8.4, or v1.9.0 or
newer. Upgrading Nomad and Consul to versions that support the new behavior
while also doing a full [node drain][] at the time of the upgrade for each node
will ensure Connect workloads are properly rescheduled onto nodes in such a way
that the Nomad Clients, Consul agents, and Envoy sidecar tasks maintain
compatibility with one another.

#### Envoy worker threads

Nomad v1.0.0 changes the default behavior around the number of worker threads
created by the Envoy sidecar proxy when using Consul Connect. Previously, the
Envoy [`--concurrency`][envoy_concurrency] argument was left unset, which caused
Envoy to spawn as many worker threads as logical cores available on the CPU. The
`--concurrency` value now defaults to `1` and can be configured by setting the
[`meta.connect.proxy_concurrency`][proxy_concurrency] property in client
configuration.

## Nomad 0.12.8

### Docker volume mounts

Nomad 0.12.8 includes security fixes for the handling of Docker volume mounts:

- The `docker.volumes.enabled` flag now defaults to `false` as documented.

- Docker driver mounts of type "volume" (but not "bind") were not sandboxed and
  could mount arbitrary locations from the client host. The
  `docker.volumes.enabled` configuration will now disable Docker mounts with
  type "volume" when set to `false` (the default).

This change impacts Docker jobs that use `mounts` with type "volume", as shown
below. This job will fail when placed unless `docker.volumes.enabled = true`.

```hcl
mounts = [
  {
    type   = "volume"
    target = "/path/in/container"
    source = "docker_volume"
    volume_options = {
      driver_config = {
        name = "local"
        options = [
          {
            device = "/"
            o      = "ro,bind"
            type   = "ext4"
          }
        ]
      }
    }
  }
]
```

## Nomad 0.12.6

### Artifact and Template Paths

Nomad 0.12.6 includes security fixes for privilege escalation vulnerabilities
in handling of job `template` and `artifact` stanzas:

- The `template.source` and `template.destination` fields are now protected by
  the file sandbox introduced in 0.9.6. These paths are now restricted to fall
  inside the task directory by default. An operator can opt out of this
  protection with the [`template.disable_file_sandbox`][] field in the client
  configuration.

- The paths for `template.source`, `template.destination`, and
  `artifact.destination` are validated on job submission to ensure the paths do
  not escape the file sandbox. It was possible to use interpolation to bypass
  this validation. The client now interpolates the paths before checking if they
  are in the file sandbox.

~> **Warning:** Due to a [bug][gh-9148] in Nomad v0.12.6, the
`template.destination` and `artifact.destination` paths do not support
absolute paths, including the interpolated `NOMAD_SECRETS_DIR`,
`NOMAD_TASK_DIR`, and `NOMAD_ALLOC_DIR` variables. This bug is fixed in
v0.12.9. To work around the bug, use a relative path.
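
The relative-path workaround can be sketched as follows. This is an
illustrative fragment, not taken from the Nomad documentation: the file name
`app.env` and the template contents are hypothetical, and it assumes the
destination is resolved relative to the task directory, so that `secrets/`
refers to the same location as `NOMAD_SECRETS_DIR`.

```hcl
template {
  # Hypothetical template contents, for illustration only.
  data = <<EOT
LOG_LEVEL=info
EOT

  # Affected by the bug: an interpolated absolute path.
  # destination = "${NOMAD_SECRETS_DIR}/app.env"

  # Workaround: a path relative to the task directory.
  destination = "secrets/app.env"
}
```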

## Nomad 0.12.0

### `mbits` and Task Network Resource deprecation

Starting in Nomad 0.12.0, the `mbits` field of the network resource block has
been deprecated and is no longer considered when making scheduling decisions.
This is in part because we felt that `mbits` didn't accurately account for
network bandwidth as a resource.

Additionally, the use of the `network` block inside of a task's `resources` block
is also deprecated. Users are advised to move their `network` block to the
`group` block. Recent networking features have only been added to group-based
network configuration. If any use case or feature which was available with task
network resource is not fulfilled with group network configuration, please open
an issue detailing the missing capability.

### Enterprise Licensing

Enterprise binaries for Nomad are now publicly available via
[releases.hashicorp.com](https://releases.hashicorp.com/nomad/). By default all
enterprise features are enabled for 6 hours. During that time enterprise users
should apply their license with the [`nomad license put ...`](/docs/commands/license/put) command.

Once the 6-hour demonstration period expires, Nomad will shut down. If restarted,
Nomad will shut down in a very short amount of time unless a valid license is
applied.

~> **Warning:** Due to a [bug][gh-8457] in Nomad v0.12.0, existing clusters
that are upgraded will **not** have 6 hours to apply a license. The minimal
grace period should be sufficient to apply a valid license, but enterprise
users are encouraged to delay upgrading until Nomad v0.12.1 is released and
fixes the issue.

### Docker access host filesystem

Nomad 0.12.0 disables Docker tasks' access to the host filesystem by default.
Prior to Nomad 0.12, Docker tasks could mount and then manipulate any host file,
which posed a security risk.

Operators now must explicitly allow tasks to access the host filesystem. [Host
Volumes](/docs/configuration/client#host_volume-stanza) provide fine-grained
access to individual paths.

To restore pre-0.12.0 behavior, you can enable [Docker
`volume`](/docs/drivers/docker#enabled-1) to allow binding host paths, by adding
the following to the Nomad client config file:

```hcl
plugin "docker" {
  config {
    volumes {
      enabled = true
    }
  }
}
```

### QEMU images

Nomad 0.12.0 restricts the paths from which QEMU tasks can load an image. A QEMU
task may download an image to the allocation directory and load it from there,
but images outside the allocation directories must be explicitly allowed by
operators in the client agent configuration file.

For example, you may allow loading QEMU images from `/mnt/qemu-images` by
adding the following to the agent configuration file:

```hcl
plugin "qemu" {
  config {
    image_paths = ["/mnt/qemu-images"]
  }
}
```

## Nomad 0.11.7

### Docker volume mounts

Nomad 0.11.7 includes a security fix for the handling of Docker volume
mounts. Docker driver mounts of type "volume" (but not "bind") were not
sandboxed and could mount arbitrary locations from the client host. The
`docker.volumes.enabled` configuration will now disable Docker mounts with
type "volume" when set to `false`.

This change impacts Docker jobs that use `mounts` with type "volume", as
shown below. This job will fail when placed unless `docker.volumes.enabled = true`.

```hcl
mounts = [
  {
    type   = "volume"
    target = "/path/in/container"
    source = "docker_volume"
    volume_options = {
      driver_config = {
        name = "local"
        options = [
          {
            device = "/"
            o      = "ro,bind"
            type   = "ext4"
          }
        ]
      }
    }
  }
]
```

## Nomad 0.11.5

### Artifact and Template Paths

Nomad 0.11.5 includes backported security fixes for privilege escalation
vulnerabilities in handling of job `template` and `artifact` stanzas:

- The `template.source` and `template.destination` fields are now protected by
  the file sandbox introduced in 0.9.6. These paths are now restricted to fall
  inside the task directory by default. An operator can opt out of this
  protection with the
  [`template.disable_file_sandbox`](/docs/configuration/client#template-parameters)
  field in the client configuration.
- The paths for `template.source`, `template.destination`, and
  `artifact.destination` are validated on job submission to ensure the paths
  do not escape the file sandbox. It was possible to use interpolation to
  bypass this validation. The client now interpolates the paths before
  checking if they are in the file sandbox.

~> **Warning:** Due to a [bug][gh-9148] in Nomad v0.11.5, the
`template.destination` and `artifact.destination` paths do not support
absolute paths, including the interpolated `NOMAD_SECRETS_DIR`,
`NOMAD_TASK_DIR`, and `NOMAD_ALLOC_DIR` variables. This bug is fixed in
v0.11.6. To work around the bug, use a relative path.

## Nomad 0.11.3

Nomad 0.11.3 fixes a critical bug causing the Nomad agent to become
unresponsive. The issue is due to a [Go 1.14.1 runtime
bug](https://github.com/golang/go/issues/38023) and affects Nomad 0.11.1 and
0.11.2.

## Nomad 0.11.2

### Scheduler Scoring Changes

Prior to Nomad 0.11.2, the scheduler algorithm used a [node's reserved
resources][reserved]
incorrectly during scoring. The result of this bug was that scoring was biased in
favor of nodes with reserved resources vs nodes without reserved resources.

Placements will be more correct but slightly different in v0.11.2 vs earlier
versions of Nomad. Operators do _not_ need to take any actions as the impact of
the bug fix will only minimally affect scoring.

Feasibility (whether a node is capable of running a job at all) is _not_
affected.

### Periodic Jobs and Daylight Saving Time

Nomad 0.11.2 fixed a long-standing bug affecting periodic jobs that are
scheduled to run during Daylight Saving Time transitions.

Nomad 0.11.2 provides a more defined behavior: Nomad evaluates the cron
expression with respect to the specified time zone during the transition. A 2:30am
nightly job with `America/New_York` time zone will not run on the day daylight
saving time starts; similarly, a 1:30am nightly job will run twice on the day
daylight saving time ends. See the [Daylight Saving Time][dst] documentation
for details.

## Nomad 0.11.0

### client.template: `vault_grace` deprecation

Nomad 0.11.0 updates
[consul-template](https://github.com/hashicorp/consul-template) to v0.24.1. This
library deprecates the [`vault_grace`][vault_grace] option for templating
included in Nomad. The feature has been ignored since Vault 0.5 and as long as
you are running a more recent version of Vault, you can safely remove
`vault_grace` from your Nomad jobs.

### Rkt Task Driver Removed

The `rkt` task driver has been deprecated and removed from Nomad.
While the code
is available in an external repository,
<https://github.com/hashicorp/nomad-driver-rkt>, it will not be maintained as
`rkt` is [no longer being developed upstream](https://github.com/rkt/rkt). We
encourage all `rkt` users to find a new task driver as soon as possible.

## Nomad 0.10.8

### Docker volume mounts

Nomad 0.10.8 includes a security fix for the handling of Docker volume mounts.
Docker driver mounts of type "volume" (but not "bind") were not sandboxed and
could mount arbitrary locations from the client host. The
`docker.volumes.enabled` configuration will now disable Docker mounts with type
"volume" when set to `false`.

This change impacts Docker jobs that use `mounts` with type "volume", as shown
below. This job will fail when placed unless `docker.volumes.enabled = true`.

```hcl
mounts = [
  {
    type   = "volume"
    target = "/path/in/container"
    source = "docker_volume"
    volume_options = {
      driver_config = {
        name = "local"
        options = [
          {
            device = "/"
            o      = "ro,bind"
            type   = "ext4"
          }
        ]
      }
    }
  }
]
```

## Nomad 0.10.6

### Artifact and Template Paths

Nomad 0.10.6 includes backported security fixes for privilege escalation
vulnerabilities in handling of job `template` and `artifact` stanzas:

- The `template.source` and `template.destination` fields are now protected by
  the file sandbox introduced in 0.9.6. These paths are now restricted to fall
  inside the task directory by default. An operator can opt out of this
  protection with the
  [`template.disable_file_sandbox`](/docs/configuration/client#template-parameters)
  field in the client configuration.

- The paths for `template.source`, `template.destination`, and
  `artifact.destination` are validated on job submission to ensure the paths
  do not escape the file sandbox.
  It was possible to use interpolation to
  bypass this validation. The client now interpolates the paths before
  checking if they are in the file sandbox.

~> **Warning:** Due to a [bug][gh-9148] in Nomad v0.10.6, the
`template.destination` and `artifact.destination` paths do not support
absolute paths, including the interpolated `NOMAD_SECRETS_DIR`,
`NOMAD_TASK_DIR`, and `NOMAD_ALLOC_DIR` variables. This bug is fixed in
v0.10.7. To work around the bug, use a relative path.

## Nomad 0.10.4

### Same-Node Scheduling Penalty Removed

Nomad 0.10.4 includes a fix to the scheduler that removes the same-node penalty
for allocations that have not previously failed. In earlier versions of Nomad,
the node where an allocation was running was penalized from receiving updated
versions of that allocation, resulting in a higher chance of the allocation
being placed on a new node. This was changed so that the penalty only applies to
nodes where the previous allocation has failed or been rescheduled, to reduce
the risk of correlated failures on a host. Scheduling weighs a number of
factors, but this change should reduce movement of allocations that are being
updated from a healthy state. You can view the placement metrics for an
allocation with `nomad alloc status -verbose`.

### Additional Environment Variable Filtering

By default, Nomad prevents certain environment variables set in the client
process from being passed along into launched tasks. The `CONSUL_HTTP_TOKEN`
environment variable has been added to the default list. More information can
be found in the `env.blacklist` [configuration](/docs/configuration/client#env-blacklist).

## Nomad 0.10.3

### mTLS Certificate Validation

Nomad 0.10.3 includes a fix for a privilege escalation vulnerability in
validating TLS certificates for RPC with mTLS.
Nomad RPC endpoints validated
that TLS client certificates had not expired and were signed by the same CA as
the Nomad node, but did not correctly check the certificate's name for the role
and region as described in the [Securing Nomad with TLS][tls-guide] guide. This
allowed trusted operators with a client certificate signed by the CA to send RPC
calls as a Nomad client or server node, bypassing access control and accessing
any secrets available to a client.

Nomad clusters configured for mTLS following the [Securing Nomad with
TLS][tls-guide] guide or the [Vault PKI Secrets Engine
Integration][tls-vault-guide] guide should already have certificates that will
pass validation. Before upgrading to Nomad 0.10.3, operators using mTLS with
`verify_server_hostname = true` should confirm that the common name or SAN of
all Nomad client node certs is `client.<region>.nomad`, and that the common name
or SAN of all Nomad server node certs is `server.<region>.nomad`.

### Connection Limits Added

Nomad 0.10.3 introduces the [limits][] agent configuration parameters for
mitigating denial of service attacks from users who are not authenticated via
mTLS. The default limits stanza is:

```hcl
limits {
  https_handshake_timeout   = "5s"
  http_max_conns_per_client = 100
  rpc_handshake_timeout     = "5s"
  rpc_max_conns_per_client  = 100
}
```

If your Nomad agent's endpoints are protected from unauthenticated users via
other mechanisms, these limits may be safely disabled by setting them to `0`.

However, the defaults were chosen to be safe for a wide variety of Nomad
deployments and may protect against accidental abuses of the Nomad API that
could cause unintended resource usage.

## Nomad 0.10.2

### Preemption Panic Fixed

Nomad 0.9.7 and 0.10.2 fix a [server crashing bug][gh-6787] present in scheduler
preemption since 0.9.0.
Users unable to immediately upgrade Nomad can [disable
preemption][preemption-api] to avoid the panic.

### Dangling Docker Container Cleanup

Nomad 0.10.2 addresses an issue occurring in heavily loaded clients, where
containers are started without being properly managed by Nomad. Nomad 0.10.2
introduced a reaper that detects and kills such containers.

Operators may opt to run the reaper in dry-run mode or to disable it through
client configuration.

For more information, see [Docker Dangling containers][dangling-containers].

## Nomad 0.10.0

### Deployments

Nomad 0.10 enables rolling deployments for service jobs by default and adds a
default update stanza when a service job is created or updated. This does not
affect jobs with an update stanza.

In pre-0.10 releases, when updating a service job without an update stanza, all
existing allocations were stopped while new allocations started up, which could
cause a service degradation or an outage. You can restore this behavior and
disable deployments by setting `max_parallel` to 0.

For more information, see [`update` stanza][update].

## Nomad 0.9.5

### Template Rendering

Nomad 0.9.5 includes security fixes for privilege escalation vulnerabilities in
handling of job `template` stanzas:

- The client host's environment variables are now cleaned before rendering the
  template. If a template includes the `env` function, the job should include an
  [`env`](/docs/job-specification/env) stanza to allow access to the variable in
  the template.

- The `plugin` function is no longer permitted by default and will raise an
  error if used in a template. Operators can opt in to permitting this function
  with the new
  [`template.function_blacklist`](/docs/configuration/client#template-parameters)
  field in the client configuration.

- The `file` function has been changed to restrict paths to fall inside the task
  directory by default. Paths that used the `NOMAD_TASK_DIR` environment
  variable to prefix file paths should work unchanged. Relative paths or
  symlinks that point outside the task directory will raise an error. An
  operator can opt out of this protection with the new
  [`template.disable_file_sandbox`](/docs/configuration/client#template-parameters)
  field in the client configuration.

## Nomad 0.9.0

### Preemption

Nomad 0.9 adds preemption support for system jobs. If a system job is submitted
that has a higher priority than other running jobs on the node, and the node
does not have capacity remaining, Nomad may preempt those lower priority
allocations to place the system job. See [preemption][preemption] for more
details.

### Task Driver Plugins

All task drivers have become [plugins][plugins] in Nomad 0.9.0. There are two
user-visible differences between 0.8 and 0.9 drivers:

- [LXC][lxc] is now community supported and distributed independently.

- Task driver [`config`][task-config] stanzas are no longer validated by
  the [`nomad job validate`][validate] command. This is a regression that will
  be fixed in a future release.

There is a new method for client driver configuration options, but existing
`client.options` settings are supported in 0.9. See [plugin
configuration][plugin-stanza] for details.

#### LXC

LXC is now an external plugin and must be installed separately. See [the LXC
driver's documentation][lxc] for details.

### Structured Logging

Nomad 0.9.0 switches to structured logging. Any log processing on the pre-0.9
log output will need to be updated to match the structured output.

Structured log lines have the format:

```
# <Timestamp> [<Level>] <Component>: <Message>: <KeyN>=<ValueN> ...

2019-01-29T05:52:09.221Z [INFO ] client.plugin: starting plugin manager: plugin-type=device
```

Values containing whitespace will be quoted:

```
... starting plugin: task=redis args="[/opt/gopath/bin/nomad logmon]"
```

### HCL2 Transition

Nomad 0.9.0 begins a transition to [HCL2][hcl2], the next version of the
HashiCorp configuration language. While Nomad has begun integrating HCL2, users
will need to continue to use HCL1 in Nomad 0.9.0 as the transition is
incomplete.

If you interpolate variables in your [`task.config`][task-config] containing
consecutive dots in their name, you will need to change your job specification
to use the `env` map. See the following example:

```hcl
env {
  # Note the multiple consecutive dots
  image...version = "3.2"

  # Valid in both v0.8 and v0.9
  image.version = "3.2"
}

# v0.8 task config stanza:
task {
  driver = "docker"
  config {
    image = "redis:${image...version}"
  }
}

# v0.9 task config stanza:
task {
  driver = "docker"
  config {
    image = "redis:${env["image...version"]}"
  }
}
```

This only affects users who interpolate unusual variables with multiple
consecutive dots in their task `config` stanza. All other interpolation is
unchanged.

Since HCL2 uses dotted object notation for interpolation, users should transition
away from variable names with multiple consecutive dots.

### Downgrading clients

Due to the large refactor of the Nomad client in 0.9, downgrading to a previous
version of the client after upgrading it to Nomad 0.9 is not supported. To
downgrade safely, users should erase the Nomad client's data directory.

### `port_map` Environment Variable Changes

Before Nomad 0.9.0, ports mapped via a task driver's `port_map` stanza could be
interpolated via the `NOMAD_PORT_<label>` environment variables.

However, in Nomad 0.9.0 no parameters in a driver's `config` stanza, including
its `port_map`, are available for interpolation. This means
`{{ env "NOMAD_PORT_<label>" }}` in a `template` stanza or
`HTTP_PORT = "${NOMAD_PORT_http}"` in an `env` stanza will now interpolate the
_host_ ports, not the container's.

Nomad 0.10 introduced Task Group Networking which natively supports port mapping
without relying on task driver specific `port_map` fields. The
[`to`](/docs/job-specification/network#to) field on group network port stanzas
will be interpolated properly. Please see the
[`network`](/docs/job-specification/network/) stanza documentation for details.

## Nomad 0.8.0

### Raft Protocol Version Compatibility

When upgrading to Nomad 0.8.0 from a version lower than 0.7.0, users will need
to set the [`raft_protocol`](/docs/configuration/server#raft_protocol) option in
their `server` stanza to 1 in order to maintain backwards compatibility with the
old servers during the upgrade. After the servers have been migrated to version
0.8.0, `raft_protocol` can be moved up to 2 and the servers restarted to match
the default.

The Raft protocol must be stepped up in this way; only adjacent version numbers
are compatible (for example, version 1 cannot talk to version 3).
Here is a
table of the Raft Protocol versions supported by each Nomad version:

<table>
  <thead>
    <tr>
      <th>Version</th>
      <th>Supported Raft Protocols</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>0.6 and earlier</td>
      <td>0</td>
    </tr>
    <tr>
      <td>0.7</td>
      <td>1</td>
    </tr>
    <tr>
      <td>0.8 and later</td>
      <td>1, 2, 3</td>
    </tr>
  </tbody>
</table>

In order to enable all
[Autopilot](https://learn.hashicorp.com/tutorials/nomad/autopilot) features, all
servers in a Nomad cluster must be running with Raft protocol version 3 or
later.

#### Upgrading to Raft Protocol 3

This section provides details on upgrading to Raft Protocol 3 in Nomad 0.8 and
higher. Raft protocol version 3 requires Nomad running 0.8.0 or newer on all
servers in order to work. See [Raft Protocol Version
Compatibility](/docs/upgrade/upgrade-specific#raft-protocol-version-compatibility)
for more details. Also the format of `peers.json` used for outage recovery is
different when running with the latest Raft protocol. See [Manual Recovery Using
peers.json](https://learn.hashicorp.com/tutorials/nomad/outage-recovery#manual-recovery-using-peersjson)
for a description of the required format.

Please note that the Raft protocol is different from Nomad's internal protocol
as shown in commands like `nomad server members`. To see the version of the Raft
protocol in use on each server, use the `nomad operator raft list-peers`
command.

The easiest way to upgrade servers is to have each server leave the cluster,
upgrade its `raft_protocol` version in the `server` stanza, and then add it
back. Make sure the new server joins successfully and that the cluster is stable
before rolling the upgrade forward to the next server.
It's also possible to
stand up a new set of servers, and then slowly stand down each of the older
servers in a similar fashion.

When using Raft protocol version 3, servers are identified by their `node-id`
instead of their IP address when Nomad makes changes to its internal Raft quorum
configuration. This means that once a cluster has been upgraded with servers all
running Raft protocol version 3, it will no longer allow servers running any
older Raft protocol versions to be added. If running a single Nomad server,
restarting it in-place will result in that server not being able to elect itself
as a leader. To avoid this, either set the Raft protocol back to 2, or use
[Manual Recovery Using
peers.json](https://learn.hashicorp.com/tutorials/nomad/outage-recovery#manual-recovery-using-peersjson)
to map the server to its node ID in the Raft quorum configuration.

### Node Draining Improvements

Node draining via the [`node drain`][drain-cli] command or the [drain
API][drain-api] has been substantially changed in Nomad 0.8. In Nomad 0.7.1 and
earlier, draining a node would immediately stop all allocations on the node
being drained. Nomad 0.8 now supports a [`migrate`][migrate] stanza in job
specifications to control how many allocations may be migrated at once. Existing
jobs that do not set a `migrate` stanza use the default.

The `drain` command now blocks until the drain completes. To get the Nomad 0.7.1
and earlier drain behavior, use the command:
`nomad node drain -enable -force -detach <node-id>`

See the [`migrate` stanza documentation][migrate] and [Decommissioning Nodes
guide](https://learn.hashicorp.com/tutorials/nomad/node-drain) for details.
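
As a rough illustration of the `migrate` stanza mentioned above, the sketch
below sets it on a task group. The job, group, and task names and the image are
hypothetical placeholders, and the values shown are intended to mirror the
documented defaults; confirm them against the [`migrate`][migrate] documentation
for your Nomad version before relying on them.

```hcl
# Hypothetical job, group, and task names, for illustration only.
job "example" {
  datacenters = ["dc1"]

  group "cache" {
    count = 3

    # Controls how allocations in this group are moved off a draining node.
    migrate {
      max_parallel     = 1        # migrate at most one allocation at a time
      health_check     = "checks" # judge health by the allocation's checks
      min_healthy_time = "10s"    # time a replacement must stay healthy
      healthy_deadline = "5m"     # time allowed for it to become healthy
    }

    task "redis" {
      driver = "docker"

      config {
        image = "redis:3.2"
      }
    }
  }
}
```

With `count = 3` and `max_parallel = 1`, a drain would move one allocation at a
time and wait for its replacement to become healthy before moving the next.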

### Periods in Environment Variable Names No Longer Escaped

_Applications which expect periods in environment variable names to be replaced
with underscores must be updated._

In Nomad 0.7, periods (`.`) in environment variable names were replaced with an
underscore in both the [`env`](/docs/job-specification/env) and
[`template`](/docs/job-specification/template) stanzas.

In Nomad 0.8, periods are _not_ replaced and will be included in environment
variables verbatim.

For example, consider the following stanza:

```text
env {
  registry.consul.addr = "${NOMAD_IP_http}:8500"
}
```

In Nomad 0.7, this would be exposed to the task as
`registry_consul_addr=127.0.0.1:8500`. In Nomad 0.8, it will now appear exactly
as specified: `registry.consul.addr=127.0.0.1:8500`.

### Client APIs Unavailable on Older Nodes

Because Nomad 0.8 uses a new RPC mechanism to route node-specific APIs like
[`nomad alloc fs`](/docs/commands/alloc/fs) through servers to the node,
0.8 CLIs cannot use these commands against clients older than 0.8.

To access these commands on older clients, either continue to use a pre-0.8
version of the CLI, or upgrade all clients to 0.8.

### CLI Command Changes

Nomad 0.8 has changed the organization of CLI commands to be based on
subcommands. An example of this change is the change from `nomad alloc-status`
to `nomad alloc status`. All commands have been made to be backwards compatible,
but operators should update any usage of the old style commands to the new style
as the old style will be deprecated in future versions of Nomad.

### RPC Advertise Address

The behavior of the [advertised RPC address](/docs/configuration#rpc-1) has
changed to be only used to advertise the RPC address of servers to client nodes.
Server-to-server communication is done using the advertised Serf address.
Existing clusters should not be affected, but the advertised RPC address may
need to be updated to allow clients to connect over a NAT.

## Nomad 0.6.0

### Default `advertise` address changes

When no `advertise` address was specified and Nomad's `bind_addr` was loopback
or `0.0.0.0`, Nomad attempted to resolve the local hostname to use as an
advertise address.

Many hosts cannot properly resolve their hostname, so Nomad 0.6 defaults
`advertise` to the first private IP on the host (e.g. `10.1.2.3`).

If you manually configure `advertise` addresses, no changes are necessary.

### Nomad Clients

The change to the default advertised IP also affects clients that do not specify
which `network_interface` to use. If you have several routable IPs, it is advised
to configure the client's [network
interface](/docs/configuration/client#network_interface) such that tasks bind to
the correct address.

## Nomad 0.5.5

### Docker `load` changes

Nomad 0.5.5 has a backward-incompatible change in the `docker` driver's
configuration. Prior to 0.5.5, the `load` configuration option accepted a list
of images to load; in 0.5.5 it has been changed to a single string. No
functionality was changed: even if more than one item was specified prior to
0.5.5, only the first item was used.

To do a zero-downtime deploy with jobs that use the `load` option:

- Upgrade servers to version 0.5.5 or later.

- Deploy new client nodes on the same version as the servers.

- Resubmit jobs with the `load` option fixed and a constraint to only run on
  version 0.5.5 or later:

  ```hcl
  constraint {
    attribute = "${attr.nomad.version}"
    operator  = "version"
    value     = ">= 0.5.5"
  }
  ```

- Drain and shut down old client nodes.
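
As a sketch of what "the `load` option fixed" in the steps above might look
like, the fragment below shows the list form replaced by a single string, with
the version constraint attached to the same task. The task name, image name, and
archive file name are hypothetical placeholders, and how the archive reaches the
task directory is not shown.

```hcl
# Hypothetical task; the image and archive names are placeholders.
task "web" {
  driver = "docker"

  config {
    image = "example/web:1.0"

    # Before 0.5.5, `load` was a list and only the first item was used:
    # load = ["web-1.0.tar"]

    # From 0.5.5 on, `load` is a single string:
    load = "web-1.0.tar"
  }

  # Only place this task on clients running 0.5.5 or later.
  constraint {
    attribute = "${attr.nomad.version}"
    operator  = "version"
    value     = ">= 0.5.5"
  }
}
```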

### Validation changes

Due to internal job serialization and validation changes, you may run into
issues using 0.5.5 command line tools such as `nomad run` and `nomad validate`
with 0.5.4 or earlier agents.

It is recommended you upgrade agents before or alongside your command line
tools.

## Nomad 0.4.0

Nomad 0.4.0 has backward-incompatible changes in the logic for Consul
deregistration. When a Task which was started by Nomad v0.3.x is uncleanly shut
down, the Nomad 0.4 Client will no longer clean up any stale services. If an
in-place upgrade of the Nomad client to 0.4 prevents the Task from gracefully
shutting down and deregistering its Consul-registered services, the Nomad Client
will not clean up the remaining Consul services registered with the 0.3
Executor.

We recommend draining a node before upgrading to 0.4.0 and then re-enabling the
node once the upgrade is complete.

## Nomad 0.3.1

Nomad 0.3.1 removes artifact downloading from driver configurations and places it
as a first-class element of the task. As such, jobs will have to be rewritten in
the proper format and resubmitted to Nomad. Nomad clients will properly
re-attach to existing tasks, but job definitions must be updated before they can
be dispatched to clients running 0.3.1.

## Nomad 0.3.0

Nomad 0.3.0 has made several substantial changes to job files, including a new
`log` block and variable interpolation syntax (`${var}`), a modified `restart`
policy syntax, and minimum resources for tasks as well as validation. These
changes require a slight change to the default upgrade flow.

After upgrading the version of the servers, all previously submitted jobs must
be resubmitted with the updated job syntax using a Nomad 0.3.0 binary.

- All instances of `$var` must be converted to the new syntax of `${var}`.

- All tasks must provide their required resources for CPU, memory and disk as
  well as required network usage if ports are required by the task.

- Restart policies must be updated to indicate whether the task should restart
  on failure or fail, using `mode = "delay"` or `mode = "fail"` respectively.

- Service names that include periods will fail validation. To fix, remove any
  periods from the service name before running the job.

After updating the Servers and job files, Nomad Clients can be upgraded by first
draining the node so no tasks are running on it. This can be verified by running
`nomad node status <node-id>` and checking that there are no tasks in the
`running` state. Once that is done, kill the client, delete the `data_dir`, and
then launch Nomad 0.3.0.

[dangling-containers]: /docs/drivers/docker#dangling-containers
[drain-api]: /api-docs/nodes#drain-node
[drain-cli]: /docs/commands/node/drain
[dst]: /docs/job-specification/periodic#daylight-saving-time
[envoy_concurrency]: https://www.envoyproxy.io/docs/envoy/latest/operations/cli#cmdoption-concurrency
[gh-6787]: https://github.com/hashicorp/nomad/issues/6787
[gh-8457]: https://github.com/hashicorp/nomad/issues/8457
[gh-9148]: https://github.com/hashicorp/nomad/issues/9148
[hcl2]: https://github.com/hashicorp/hcl2
[limits]: /docs/configuration#limits
[lxc]: /docs/drivers/external/lxc
[migrate]: /docs/job-specification/migrate
[plugin-stanza]: /docs/configuration/plugin
[plugins]: /docs/drivers/external
[preemption-api]: /api-docs/operator#update-scheduler-configuration
[preemption]: /docs/internals/scheduling/preemption
[proxy_concurrency]: /docs/job-specification/sidecar_task#proxy_concurrency
[reserved]: /docs/configuration/client#reserved-parameters
[task-config]: /docs/job-specification/task#config
[tls-guide]: https://learn.hashicorp.com/tutorials/nomad/security-enable-tls
[tls-vault-guide]: https://learn.hashicorp.com/tutorials/nomad/vault-pki-nomad
[update]: /docs/job-specification/update
[validate]: /docs/commands/job/validate
[vault_grace]: /docs/job-specification/template
[node drain]: https://www.nomadproject.io/docs/upgrade#5-upgrade-clients
[`template.disable_file_sandbox`]: /docs/configuration/client#template-parameters
[pki]: https://www.vaultproject.io/docs/secrets/pki