     1  ---
     2  layout: docs
     3  page_title: Upgrade Guides
     4  sidebar_title: Specific Version Details
     5  description: |-
     6    Specific versions of Nomad may have additional information about the upgrade
     7    process beyond the standard flow.
     8  ---
     9  
    10  # Upgrade Guides
    11  
    12  The [upgrading page](/docs/upgrade) covers the details of doing
    13  a standard upgrade. However, specific versions of Nomad may have more
    14  details provided for their upgrades as a result of new features or changed
    15  behavior. This page is used to document those details separately from the
    16  standard upgrade flow.
    17  
    18  ## Nomad 0.11.2
    19  
    20  ### Scheduler Scoring Changes
    21  
    22  Prior to Nomad 0.11.2 the scheduler algorithm used a [node's reserved
    23  resources][reserved]
    24  incorrectly during scoring. The result of this bug was that scoring biased in
    25  favor of nodes with reserved resources vs nodes without reserved resources.
    26  
    27  Placements will be more correct but slightly different in v0.11.2 vs earlier
    28  versions of Nomad. Operators do *not* need to take any actions as the impact of
    29  the bug fix will only minimally affect scoring.
    30  
    31  Feasibility (whether a node is capable of running a job at all) is *not*
    32  affected.
    33  
    34  ### Periodic Jobs and Daylight Saving Time
    35  
    36  Nomad 0.11.2 fixed a long outstanding bug affecting periodic jobs that are
    37  scheduled to run during Daylight Saving Time transitions.
    38  
    39  Nomad 0.11.2 provides a more defined behavior: Nomad evaluates the cron
    40  expression with respect to the specified time zone during a transition. A 2:30am
    41  nightly job with the `America/New_York` time zone will not run on the day daylight
    42  saving time starts; similarly, a 1:30am nightly job will run twice on the day
    43  daylight saving time ends. See the [Daylight Saving Time][dst] documentation
    44  for details.
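
As a rough illustration of the first case above, a `periodic` stanza for a 2:30am nightly job might look like the following fragment. The `prohibit_overlap` setting is an illustrative choice rather than something this guide prescribes, and the rest of the job is omitted.

```hcl
# Fragment of a batch job specification (illustrative sketch only).
periodic {
  # 2:30am local time does not exist on the day daylight saving time starts
  # in this zone, so the job is skipped on that day.
  cron      = "30 2 * * *"
  time_zone = "America/New_York"

  # Illustrative choice: do not start a new run while a previous one is active.
  prohibit_overlap = true
}
```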
    45  
    46  ## Nomad 0.11.0
    47  
    48  ### client.template: `vault_grace` deprecation
    49  
    50  Nomad 0.11.0 updates
    51  [consul-template](https://github.com/hashicorp/consul-template) to v0.24.1.
    52  This library deprecates the [`vault_grace`][vault_grace] option for templating
    53  included in Nomad. The feature has been ignored since Vault 0.5 and as long as
    54  you are running a more recent version of Vault, you can safely remove
    55  `vault_grace` from your Nomad jobs.
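
A minimal sketch of the cleanup, assuming a job whose `template` stanza still carries the option. The file paths and the `15s` value are hypothetical.

```hcl
template {
  source      = "local/app.conf.tpl" # hypothetical template path
  destination = "local/app.conf"     # hypothetical destination path

  # Deprecated and ignored; this line can simply be deleted.
  # vault_grace = "15s"
}
```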
    56  
    57  ### Rkt Task Driver Removed
    58  
    59  The `rkt` task driver has been deprecated and removed from Nomad. While the
    60  code is available in an external repository,
    61  [https://github.com/hashicorp/nomad-driver-rkt](https://github.com/hashicorp/nomad-driver-rkt),
    62  it will not be maintained as `rkt` is [no longer being developed
    63  upstream](https://github.com/rkt/rkt). We encourage all `rkt` users to find a
    64  new task driver as soon as possible.
    65  
    66  ## Nomad 0.10.4
    67  
    68  ### Same-Node Scheduling Penalty Removed
    69  
    70  Nomad 0.10.4 includes a fix to the scheduler that removes the
    71  same-node penalty for allocations that have not previously failed. In
    72  earlier versions of Nomad, the node where an allocation was running
    73  was penalized from receiving updated versions of that allocation,
    74  resulting in a higher chance of the allocation being placed on a new
    75  node. This was changed so that the penalty only applies to nodes where
    76  the previous allocation has failed or been rescheduled, to reduce the
    77  risk of correlated failures on a host. Scheduling weighs a number of
    78  factors, but this change should reduce movement of allocations that
    79  are being updated from a healthy state. You can view the placement
    80  metrics for an allocation with `nomad alloc status -verbose`.
    81  
    82  ### Additional Environment Variable Filtering
    83  
    84  Nomad will by default prevent certain environment variables set in the client
    85  process from being passed along into launched tasks. The `CONSUL_HTTP_TOKEN`
    86  environment variable has been added to the default list. More information can
    87  be found in the `env.blacklist` [configuration](/docs/configuration/client#env-blacklist).
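
For operators who maintain their own list, an override might look like the sketch below. It assumes the option takes a comma-separated string and that a custom value replaces the defaults rather than extending them, so the variables shown are an illustrative subset, not the full default list.

```hcl
client {
  options {
    # Assumption: a custom value replaces the default list, so every variable
    # that should stay filtered must be repeated here (illustrative subset).
    "env.blacklist" = "CONSUL_HTTP_TOKEN,CONSUL_TOKEN,VAULT_TOKEN"
  }
}
```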
    88  
    89  ## Nomad 0.10.3
    90  
    91  ### mTLS Certificate Validation
    92  
    93  Nomad 0.10.3 includes a fix for a privilege escalation vulnerability in
    94  validating TLS certificates for RPC with mTLS. Nomad RPC endpoints validated
    95  that TLS client certificates had not expired and were signed by the same CA as
    96  the Nomad node, but did not correctly check the certificate's name for the role
    97  and region as described in the [Securing Nomad with TLS][tls-guide] guide. This
    98  allowed trusted operators with a client certificate signed by the CA to send RPC
    99  calls as a Nomad client or server node, bypassing access control and accessing
   100  any secrets available to a client.
   101  
   102  Nomad clusters configured for mTLS following the [Securing Nomad with TLS][tls-guide]
   103  guide or the [Vault PKI Secrets Engine Integration][tls-vault-guide] guide
   104  should already have certificates that will pass validation. Before upgrading to
   105  Nomad 0.10.3, operators using mTLS with `verify_server_hostname = true` should
   106  confirm that the common name or SAN of all Nomad client node certs is
   107  `client.<region>.nomad`, and that the common name or SAN of all Nomad server
   108  node certs is `server.<region>.nomad`.
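
For reference, a server agent configured this way might carry a `tls` stanza along the lines of the sketch below. The file paths are hypothetical, and the region is assumed to be `global`, so the server certificate would need `server.global.nomad` as its common name or SAN.

```hcl
tls {
  http = true
  rpc  = true

  # Hypothetical paths; the certificate is assumed to be issued for
  # server.global.nomad (client nodes would use client.global.nomad).
  ca_file   = "/etc/nomad.d/tls/nomad-ca.pem"
  cert_file = "/etc/nomad.d/tls/server.pem"
  key_file  = "/etc/nomad.d/tls/server-key.pem"

  verify_server_hostname = true
  verify_https_client    = true
}
```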
   109  
   110  ### Connection Limits Added
   111  
   112  Nomad 0.10.3 introduces the [limits][limits] agent configuration parameters for
   113  mitigating denial of service attacks from users who are not authenticated via
   114  mTLS. The default limits stanza is:
   115  
   116  ```hcl
   117  limits {
   118    https_handshake_timeout   = "5s"
   119    http_max_conns_per_client = 100
   120    rpc_handshake_timeout     = "5s"
   121    rpc_max_conns_per_client  = 100
   122  }
   123  ```
   124  
   125  If your Nomad agent's endpoints are protected from unauthenticated users via
   126  other mechanisms these limits may be safely disabled by setting them to `0`.
   127  
   128  However, the defaults were chosen to be safe for a wide variety of Nomad
   129  deployments and may protect against accidental abuses of the Nomad API that
   130  could cause unintended resource usage.
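
If the endpoints are already protected by another mechanism, disabling every limit could look like the following sketch. Writing the zero timeouts as the duration string `"0s"` is an assumption about the accepted format.

```hcl
limits {
  # Assumption: a zero duration string disables the handshake timeouts.
  https_handshake_timeout   = "0s"
  http_max_conns_per_client = 0
  rpc_handshake_timeout     = "0s"
  rpc_max_conns_per_client  = 0
}
```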
   131  
   132  ## Nomad 0.10.2
   133  
   134  ### Preemption Panic Fixed
   135  
   136  Nomad 0.9.7 and 0.10.2 fix a [server crashing bug][gh-6787] present in
   137  scheduler preemption since 0.9.0. Users unable to immediately upgrade Nomad can
   138  [disable preemption][preemption-api] to avoid the panic.
   139  
   140  ### Dangling Docker Container Cleanup
   141  
   142  Nomad 0.10.2 addresses an issue occurring in heavily loaded clients, where
   143  containers are started without being properly managed by Nomad. Nomad 0.10.2
   144  introduced a reaper that detects and kills such containers.
   145  
   146  Operators may opt to run the reaper in dry-run mode or disable it through the client configuration.
   147  
   148  For more information, see [Docker Dangling containers][dangling-containers].
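
As a sketch of what that client configuration might look like, the fragment below runs the reaper in dry-run mode. The `period` and `creation_grace` values are illustrative.

```hcl
plugin "docker" {
  config {
    gc {
      dangling_containers {
        enabled = true # set to false to disable the reaper entirely
        dry_run = true # only log containers that would have been killed

        # Illustrative values: how often to scan, and how long a new
        # container is left alone before it is considered dangling.
        period         = "5m"
        creation_grace = "5m"
      }
    }
  }
}
```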
   149  
   150  ## Nomad 0.10.0
   151  
   152  ### Deployments
   153  
   154  Nomad 0.10 enables rolling deployments for service jobs by default
   155  and adds a default update stanza when a service job is created or updated.
   156  This does not affect jobs with an update stanza.
   157  
   158  In pre-0.10 releases, when updating a service job without an update stanza,
   159  all existing allocations are stopped while new allocations start up,
   160  and this may cause a service degradation or an outage.
   161  You can regain this behavior and disable deployments by setting `max_parallel` to 0.
   162  
   163  For more information, see [`update` stanza][update].
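
A minimal sketch of opting out, placed in the job (or group) specification:

```hcl
update {
  # Disables deployments and restores the pre-0.10 behavior, in which all
  # existing allocations are stopped while the new ones start.
  max_parallel = 0
}
```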
   164  
   165  ## Nomad 0.9.5
   166  
   167  ### Template Rendering
   168  
   169  Nomad 0.9.5 includes security fixes for privilege escalation vulnerabilities in handling of job `template` stanzas:
   170  
   171  - The client host's environment variables are now cleaned before rendering the template. If a template includes the `env` function, the job should include an [`env`](/docs/job-specification/env) stanza to allow access to the variable in the template.
   172  - The `plugin` function is no longer permitted by default and will raise an error if used in a template. An operator can opt in to permitting this function with the new [`template.function_blacklist`](/docs/configuration/client#template-parameters) field in the client configuration.
   173  - The `file` function has been changed to restrict paths to fall inside the task directory by default. Paths that used the `NOMAD_TASK_DIR` environment variable to prefix file paths should work unchanged. Relative paths or symlinks that point outside the task directory will raise an error. An operator can opt-out of this protection with the new [`template.disable_file_sandbox`](/docs/configuration/client#template-parameters) field in the client configuration.
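
The two client-side opt-outs described above could be sketched as follows. This assumes the default `function_blacklist` contains only `plugin`, so that an empty list permits it again; both settings weaken the new protections and are shown for illustration only.

```hcl
client {
  template {
    # Assumption: the default is ["plugin"], so an empty list re-allows
    # the `plugin` function in templates.
    function_blacklist = []

    # Allows the `file` function to read paths outside the task directory.
    disable_file_sandbox = true
  }
}
```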
   174  
   175  ## Nomad 0.9.0
   176  
   177  ### Preemption
   178  
   179  Nomad 0.9 adds preemption support for system jobs. If a system job is submitted
   180  that has a higher priority than other running jobs on the node, and the node
   181  does not have capacity remaining, Nomad may preempt those lower priority
   182  allocations to place the system job. See [preemption][preemption] for more
   183  details.
   184  
   185  ### Task Driver Plugins
   186  
   187  All task drivers have become [plugins][plugins] in Nomad 0.9.0. There are two
   188  user visible differences between 0.8 and 0.9 drivers:
   189  
   190  - [LXC][lxc] is now community supported and distributed independently.
   191  - Task driver [`config`][task-config] stanzas are no longer validated by
   192    the [`nomad job validate`][validate] command. This is a regression that will
   193    be fixed in a future release.
   194  
   195  There is a new method for client driver configuration options, but existing
   196  `client.options` settings are supported in 0.9. See [plugin
   197  configuration][plugin-stanza] for details.
   198  
   199  #### LXC
   200  
   201  LXC is now an external plugin and must be installed separately. See [the LXC
   202  driver's documentation][lxc] for details.
   203  
   204  ### Structured Logging
   205  
   206  Nomad 0.9.0 switches to structured logging. Any log processing on the pre-0.9
   207  log output will need to be updated to match the structured output.
   208  
   209  Structured log lines have the format:
   210  
   211  ```
   212  # <Timestamp> [<Level>] <Component>: <Message>: <KeyN>=<ValueN> ...
   213  
   214  2019-01-29T05:52:09.221Z [INFO ] client.plugin: starting plugin manager: plugin-type=device
   215  ```
   216  
   217  Values containing whitespace will be quoted:
   218  
   219  ```
   220  ... starting plugin: task=redis args="[/opt/gopath/bin/nomad logmon]"
   221  ```
   222  
   223  ### HCL2 Transition
   224  
   225  Nomad 0.9.0 begins a transition to [HCL2][hcl2], the next version of the
   226  HashiCorp configuration language. While Nomad has begun integrating HCL2,
   227  users will need to continue to use HCL1 in Nomad 0.9.0 as the transition is
   228  incomplete.
   229  
   230  If you interpolate variables in your [`task.config`][task-config] containing
   231  consecutive dots in their name, you will need to change your job specification
   232  to use the `env` map. See the following example:
   233  
   234  ```hcl
   235  env {
   236    # Note the multiple consecutive dots
   237    image...version = "3.2"
   238  
   239    # Valid in both v0.8 and v0.9
   240    image.version = "3.2"
   241  }
   242  
   243  # v0.8 task config stanza:
   244  task {
   245    driver = "docker"
   246    config {
   247      image = "redis:${image...version}"
   248    }
   249  }
   250  
   251  # v0.9 task config stanza:
   252  task {
   253    driver = "docker"
   254    config {
   255      image = "redis:${env["image...version"]}"
   256    }
   257  }
   258  ```
   259  
   260  This only affects users who interpolate unusual variables with multiple
   261  consecutive dots in their task `config` stanza. All other interpolation is
   262  unchanged.
   263  
   264  Since HCL2 uses dotted object notation for interpolation, users should
   265  transition away from variable names with multiple consecutive dots.
   266  
   267  ### Downgrading clients
   268  
   269  Due to the large refactor of the Nomad client in 0.9, downgrading to a
   270  previous version of the client after upgrading it to Nomad 0.9 is not supported.
   271  To downgrade safely, users should erase the Nomad client's data directory.
   272  
   273  ## Nomad 0.8.0
   274  
   275  ### Raft Protocol Version Compatibility
   276  
   277  When upgrading to Nomad 0.8.0 from a version lower than 0.7.0, users will need
   278  to set the
   279  [`raft_protocol`](/docs/configuration/server#raft_protocol) option
   280  in their `server` stanza to 1 in order to maintain backwards compatibility with
   281  the old servers during the upgrade. After the servers have been migrated to
   282  version 0.8.0, `raft_protocol` can be moved up to 2 and the servers restarted
   283  to match the default.
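
A sketch of the temporary setting during such an upgrade, assuming an otherwise default `server` stanza:

```hcl
server {
  enabled = true

  # Keep the older protocol while pre-0.8 servers are still in the cluster.
  # Once every server runs 0.8.0, raise this to 2 and restart the servers.
  raft_protocol = 1
}
```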
   284  
   285  The Raft protocol must be stepped up in this way; only adjacent version numbers are
   286  compatible (for example, version 1 cannot talk to version 3). Here is a table of the
   287  Raft Protocol versions supported by each Nomad version:
   288  
   289  <table>
   290    <thead>
   291      <tr>
   292        <th>Version</th>
   293        <th>Supported Raft Protocols</th>
   294      </tr>
   295    </thead>
   296    <tbody>
   297      <tr>
   298        <td>0.6 and earlier</td>
   299        <td>0</td>
   300      </tr>
   301      <tr>
   302        <td>0.7</td>
   303        <td>1</td>
   304      </tr>
   305      <tr>
   306        <td>0.8 and later</td>
   307        <td>1, 2, 3</td>
   308      </tr>
   309    </tbody>
   310  </table>
   311  
   312  In order to enable all [Autopilot](https://learn.hashicorp.com/nomad/operating-nomad/autopilot) features, all servers
   313  in a Nomad cluster must be running with Raft protocol version 3 or later.
   314  
   315  #### Upgrading to Raft Protocol 3
   316  
   317  This section provides details on upgrading to Raft Protocol 3 in Nomad 0.8 and higher. Raft protocol version 3 requires all servers to run Nomad 0.8.0 or newer. See [Raft Protocol Version Compatibility](/docs/upgrade/upgrade-specific#raft-protocol-version-compatibility) for more details. Also, the format of `peers.json` used for outage recovery is different when running with the latest Raft protocol. See [Manual Recovery Using peers.json](https://learn.hashicorp.com/nomad/operating-nomad/outage#manual-recovery-using-peersjson) for a description of the required format.
   318  
   319  Please note that the Raft protocol is different from Nomad's internal protocol as shown in commands like `nomad server members`. To see the version of the Raft protocol in use on each server, use the `nomad operator raft list-peers` command.
   320  
   321  The easiest way to upgrade servers is to have each server leave the cluster, upgrade its `raft_protocol` version in the `server` stanza, and then add it back. Make sure the new server joins successfully and that the cluster is stable before rolling the upgrade forward to the next server. It's also possible to stand up a new set of servers, and then slowly stand down each of the older servers in a similar fashion.
   322  
   323  When using Raft protocol version 3, servers are identified by their `node-id` instead of their IP address when Nomad makes changes to its internal Raft quorum configuration. This means that once a cluster has been upgraded with servers all running Raft protocol version 3, it will no longer allow servers running any older Raft protocol versions to be added. If running a single Nomad server, restarting it in-place will result in that server not being able to elect itself as a leader. To avoid this, either set the Raft protocol back to 2, or use [Manual Recovery Using peers.json](https://learn.hashicorp.com/nomad/operating-nomad/outage#manual-recovery-using-peersjson) to map the server to its node ID in the Raft quorum configuration.
   324  
   325  ### Node Draining Improvements
   326  
   327  Node draining via the [`node drain`][drain-cli] command or the [drain
   328  API][drain-api] has been substantially changed in Nomad 0.8. In Nomad 0.7.1 and
   329  earlier draining a node would immediately stop all allocations on the node
   330  being drained. Nomad 0.8 now supports a [`migrate`][migrate] stanza in job
   331  specifications to control how many allocations may be migrated at once and the
   332  default will be used for existing jobs.
   333  
   334  The `drain` command now blocks until the drain completes. To get the Nomad
   335  0.7.1 and earlier drain behavior use the command: `nomad node drain -enable -force -detach <node-id>`.
   336  
   337  See the [`migrate` stanza documentation][migrate] and [Decommissioning Nodes
   338  guide](https://learn.hashicorp.com/nomad/operating-nomad/node-draining) for details.
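
For jobs that should control draining explicitly, a `migrate` stanza might look like the sketch below. The values are illustrative and are not guaranteed to match the defaults applied to existing jobs.

```hcl
migrate {
  # Illustrative values: migrate one allocation at a time and wait for the
  # replacement to pass its health checks before moving on.
  max_parallel     = 1
  health_check     = "checks"
  min_healthy_time = "10s"
  healthy_deadline = "5m"
}
```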
   339  
   340  ### Periods in Environment Variable Names No Longer Escaped
   341  
   342  _Applications which expect periods in environment variable names to be replaced
   343  with underscores must be updated._
   344  
   345  In Nomad 0.7 periods (`.`) in environment variable names were replaced with an
   346  underscore in both the [`env`](/docs/job-specification/env) and
   347  [`template`](/docs/job-specification/template) stanzas.
   348  
   349  In Nomad 0.8 periods are _not_ replaced and will be included in environment
   350  variables verbatim.
   351  
   352  For example the following stanza:
   353  
   354  ```text
   355  env {
   356    registry.consul.addr = "${NOMAD_IP_http}:8500"
   357  }
   358  ```
   359  
   360  In Nomad 0.7 this would be exposed to the task as
   361  `registry_consul_addr=127.0.0.1:8500`. In Nomad 0.8 it will now appear exactly
   362  as specified: `registry.consul.addr=127.0.0.1:8500`.
   363  
   364  ### Client APIs Unavailable on Older Nodes
   365  
   366  Because Nomad 0.8 uses a new RPC mechanism to route node-specific APIs like
   367  [`nomad alloc fs`](/docs/commands/alloc/fs) through servers to the node,
   368  0.8 CLIs cannot run these commands against clients older than 0.8.
   369  
   370  To access these commands on older clients either continue to use a pre-0.8
   371  version of the CLI, or upgrade all clients to 0.8.
   372  
   373  ### CLI Command Changes
   374  
   375  Nomad 0.8 has changed the organization of CLI commands to be based on
   376  subcommands. An example of this change is the change from `nomad alloc-status`
   377  to `nomad alloc status`. All commands have been made to be backwards compatible,
   378  but operators should update any usage of the old style commands to the new style
   379  as the old style will be deprecated in future versions of Nomad.
   380  
   381  ### RPC Advertise Address
   382  
   383  The behavior of the [advertised RPC
   384  address](/docs/configuration#rpc-1) has changed to be only used
   385  to advertise the RPC address of servers to client nodes. Server to server
   386  communication is done using the advertised Serf address. Existing clusters
   387  should not be affected, but the advertised RPC address may need to be updated to
   388  allow clients to connect over a NAT.
   389  
   390  ## Nomad 0.6.0
   391  
   392  ### Default `advertise` address changes
   393  
   394  When no `advertise` address was specified and Nomad's `bind_addr` was loopback
   395  or `0.0.0.0`, Nomad attempted to resolve the local hostname to use as an
   396  advertise address.
   397  
   398  Many hosts cannot properly resolve their hostname, so Nomad 0.6 defaults
   399  `advertise` to the first private IP on the host (e.g. `10.1.2.3`).
   400  
   401  If you manually configure `advertise` addresses no changes are necessary.
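
A manually configured `advertise` block, which is unaffected by this change, might look like the following sketch. The address reuses the example IP from above.

```hcl
advertise {
  # Explicit addresses take precedence over the new private-IP default.
  http = "10.1.2.3"
  rpc  = "10.1.2.3"
  serf = "10.1.2.3"
}
```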
   402  
   403  ### Nomad Clients
   404  
   405  The change to the default advertised IP also affects clients that do not specify
   406  which `network_interface` to use. If you have several routable IPs, it is advised
   407  to configure the client's [network
   408  interface](/docs/configuration/client#network_interface)
   409  such that tasks bind to the correct address.
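
A minimal sketch of pinning the interface, where `eth1` is a hypothetical interface name:

```hcl
client {
  enabled = true

  # Hypothetical interface name; choose the one that carries the address
  # tasks should bind to.
  network_interface = "eth1"
}
```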
   410  
   411  ## Nomad 0.5.5
   412  
   413  ### Docker `load` changes
   414  
   415  Nomad 0.5.5 has a backward incompatible change in the `docker` driver's
   416  configuration. Prior to 0.5.5 the `load` configuration option accepted a list
   417  of images to load; in 0.5.5 it has been changed to a single string. No
   418  functionality was changed. Even if more than one item was specified prior to
   419  0.5.5 only the first item was used.
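
The change amounts to the following edit in a task's `config` stanza. The task, image, and archive names are hypothetical.

```hcl
task "cache" {
  driver = "docker"

  config {
    image = "redis:3.2" # hypothetical image

    # Before 0.5.5 (list form, only the first item was used):
    #   load = ["redis.tar"]
    # 0.5.5 and later (single string):
    load = "redis.tar"
  }
}
```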
   420  
   421  To do a zero-downtime deploy with jobs that use the `load` option:
   422  
   423  - Upgrade servers to version 0.5.5 or later.
   424  
   425  - Deploy new client nodes on the same version as the servers.
   426  
   427  - Resubmit jobs with the `load` option fixed and a constraint to only run on
   428    version 0.5.5 or later:
   429  
   430  ```hcl
   431      constraint {
   432        attribute = "${attr.nomad.version}"
   433        operator  = "version"
   434        value     = ">= 0.5.5"
   435      }
   436  ```
   437  
   438  - Drain and shut down old client nodes.
   439  
   440  ### Validation changes
   441  
   442  Due to internal job serialization and validation changes you may run into
   443  issues using 0.5.5 command line tools such as `nomad run` and `nomad validate`
   444  with 0.5.4 or earlier agents.
   445  
   446  It is recommended you upgrade agents before or alongside your command line
   447  tools.
   448  
   449  ## Nomad 0.4.0
   450  
   451  Nomad 0.4.0 has backward incompatible changes in the logic for Consul
   452  deregistration. When a Task which was started by Nomad v0.3.x is uncleanly shut
   453  down, the Nomad 0.4 Client will no longer clean up any stale services. If an
   454  in-place upgrade of the Nomad client to 0.4 prevents the Task from gracefully
   455  shutting down and deregistering its Consul-registered services, the Nomad Client
   456  will not clean up the remaining Consul services registered with the 0.3
   457  Executor.
   458  
   459  We recommend draining a node before upgrading to 0.4.0 and then re-enabling the
   460  node once the upgrade is complete.
   461  
   462  ## Nomad 0.3.1
   463  
   464  Nomad 0.3.1 removes artifact downloading from driver configurations and makes it
   465  a first-class element of the task. As such, jobs will have to be rewritten in
   466  the proper format and resubmitted to Nomad. Nomad clients will properly
   467  re-attach to existing tasks but job definitions must be updated before they can
   468  be dispatched to clients running 0.3.1.
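
A rewritten task might look roughly like the sketch below, with the download declared in an `artifact` block on the task instead of inside the driver `config`. The task name, URL, and command are hypothetical, and the exact path the driver expects for the downloaded file is not shown here.

```hcl
task "binstore" {
  driver = "exec"

  # The download is now a first-class element of the task.
  artifact {
    source = "https://example.com/binstore" # hypothetical URL
  }

  config {
    command = "binstore" # hypothetical command
  }
}
```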
   469  
   470  ## Nomad 0.3.0
   471  
   472  Nomad 0.3.0 has made several substantial changes to job files, including a new
   473  `log` block and variable interpretation syntax (`${var}`), a modified `restart`
   474  policy syntax, and minimum resources for tasks as well as validation. These
   475  changes require a slight change to the default upgrade flow.
   476  
   477  After upgrading the version of the servers, all previously submitted jobs must
   478  be resubmitted with the updated job syntax using a Nomad 0.3.0 binary.
   479  
   480  - All instances of `$var` must be converted to the new syntax of `${var}`
   481  
   482  - All tasks must provide their required resources for CPU, memory and disk as
   483    well as required network usage if ports are required by the task.
   484  
   485  - Restart policies must be updated to indicate whether it is desired for the
   486    task to restart on failure or to fail using `mode = "delay"` or `mode = "fail"` respectively.
   487  
   488  - Service names that include periods will fail validation. To fix, remove any
   489    periods from the service name before running the job.
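
Taken together, the changes listed above could be sketched in a group like the following. All names and values are hypothetical and chosen only to show the `${var}` syntax, explicit resources, and a restart `mode`.

```hcl
group "cache" {
  constraint {
    # Was written as "$attr.kernel.name" before 0.3.0.
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }

  restart {
    attempts = 2
    interval = "1m"
    delay    = "15s"
    mode     = "delay" # or "fail" to stop restarting once attempts run out
  }

  task "redis" {
    driver = "docker"

    config {
      image = "redis:latest"
    }

    # CPU, memory, and disk are now required, plus network when ports are used.
    resources {
      cpu    = 500
      memory = 256
      disk   = 300

      network {
        mbits = 10
        port "db" {}
      }
    }
  }
}
```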
   490  
   491  After updating the Servers and job files, Nomad Clients can be upgraded by first
   492  draining the node so no tasks are running on it. This can be verified by running
   493  `nomad node status <node-id>` and verifying there are no tasks in the `running`
   494  state. Once that is done the client can be killed, the `data_dir` should be
   495  deleted and then Nomad 0.3.0 can be launched.
   496  
   497  [dangling-containers]: /docs/drivers/docker#dangling-containers
   498  [drain-api]: /api-docs/nodes#drain-node
   499  [drain-cli]: /docs/commands/node/drain
   500  [dst]: /docs/job-specification/periodic#daylight-saving-time
   501  [gh-6787]: https://github.com/hashicorp/nomad/issues/6787
   502  [hcl2]: https://github.com/hashicorp/hcl2
   503  [limits]: /docs/configuration#limits
   504  [lxc]: /docs/drivers/external/lxc
   505  [migrate]: /docs/job-specification/migrate
   506  [plugin-stanza]: /docs/configuration/plugin
   507  [plugins]: /docs/drivers/external
   508  [preemption-api]: /api-docs/operator#update-scheduler-configuration
   509  [preemption]: /docs/internals/scheduling/preemption
   510  [reserved]: /docs/configuration/client#reserved-parameters
   511  [task-config]: /docs/job-specification/task#config
   512  [tls-guide]: https://learn.hashicorp.com/nomad/transport-security/enable-tls
   513  [tls-vault-guide]: https://learn.hashicorp.com/nomad/vault-integration/vault-pki-nomad
   514  [update]: /docs/job-specification/update
   515  [validate]: /docs/commands/job/validate
   516  [vault_grace]: /docs/job-specification/template