---
layout: "guides"
page_title: "Upgrade Guides"
sidebar_current: "guides-upgrade-specific"
description: |-
  Specific versions of Nomad may have additional information about the upgrade
  process beyond the standard flow.
---

# Upgrade Guides

The [upgrading page](/guides/upgrade/index.html) covers the details of doing
a standard upgrade. However, specific versions of Nomad may have more
details provided for their upgrades as a result of new features or changed
behavior. This page documents those details separately from the standard
upgrade flow.

## Nomad 0.9.0

### Preemption

Nomad 0.9 adds preemption support for system jobs. If a system job is submitted
that has a higher priority than other running jobs on the node, and the node
does not have capacity remaining, Nomad may preempt those lower priority
allocations to place the system job. See [preemption][preemption] for more
details.

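As a rough illustration of the scenario above, the following job specification
sketches a system job whose priority is higher than Nomad's default of 50, so
it could preempt lower priority allocations on a node with no remaining
capacity. The job, group, and task names and the image are hypothetical
placeholders.

```hcl
# Hypothetical system job; names and image are placeholders.
job "node-monitor" {
  datacenters = ["dc1"]
  type        = "system"

  # Higher than the default priority of 50, so lower priority allocations
  # may be preempted if the node has no remaining capacity.
  priority = 80

  group "monitor" {
    task "agent" {
      driver = "docker"

      config {
        image = "example/monitor:1.0"
      }

      resources {
        cpu    = 200
        memory = 128
      }
    }
  }
}
```
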
### Task Driver Plugins

All task drivers have become [plugins][plugins] in Nomad 0.9.0. There are two
user-visible differences between 0.8 and 0.9 drivers:

 * [LXC][lxc] is now community supported and distributed independently.
 * Task driver [`config`][task-config] stanzas are no longer validated by
   the [`nomad job validate`][validate] command. This is a regression that will
   be fixed in a future release.

There is a new method for client driver configuration options, but existing
`client.options` settings are supported in 0.9. See [plugin
configuration][plugin-stanza] for details.

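A minimal sketch of the two configuration styles, using the `raw_exec` driver
as an example. The two snippets are alternatives, and the values are
illustrative; consult each driver's documentation for the options it accepts.

```hcl
# Pre-0.9 style: driver options under client.options, still supported in 0.9.
client {
  options {
    "driver.raw_exec.enable" = "1"
  }
}

# 0.9 style: driver options live in a plugin stanza.
plugin "raw_exec" {
  config {
    enabled = true
  }
}
```
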
#### LXC

LXC is now an external plugin and must be installed separately. See [the LXC
driver's documentation][lxc] for details.

### Structured Logging

Nomad 0.9.0 switches to structured logging. Any log processing on the pre-0.9
log output will need to be updated to match the structured output.

Structured log lines have the format:

```
# <Timestamp> [<Level>] <Component>: <Message>: <KeyN>=<ValueN> ...

2019-01-29T05:52:09.221Z [INFO ] client.plugin: starting plugin manager: plugin-type=device
```

Values containing whitespace will be quoted:

```
... starting plugin: task=redis args="[/opt/gopath/bin/nomad logmon]"
```

### HCL2 Transition

Nomad 0.9.0 begins a transition to [HCL2][hcl2], the next version of the
HashiCorp configuration language. While Nomad has begun integrating HCL2,
users will need to continue to use HCL1 in Nomad 0.9.0 as the transition is
incomplete.

If you interpolate variables in your [`task.config`][task-config] containing
consecutive dots in their name, you will need to change your job specification
to use the `env` map. See the following example:

```hcl
env {
  # Note the multiple consecutive dots
  image...version = "3.2"

  # Valid in both v0.8 and v0.9
  image.version = "3.2"
}

# v0.8 task config stanza:
task {
  driver = "docker"
  config {
    image = "redis:${image...version}"
  }
}

# v0.9 task config stanza:
task {
  driver = "docker"
  config {
    image = "redis:${env["image...version"]}"
  }
}
```

This only affects users who interpolate unusual variables with multiple
consecutive dots in their task `config` stanza. All other interpolation is
unchanged.

Since HCL2 uses dotted object notation for interpolation, users should
transition away from variable names with multiple consecutive dots.

## Nomad 0.8.0

### Raft Protocol Version Compatibility

When upgrading to Nomad 0.8.0 from a version lower than 0.7.0, users will need
to set the
[`raft_protocol`](/docs/configuration/server.html#raft_protocol) option
in their `server` stanza to 1 in order to maintain backwards compatibility with
the old servers during the upgrade. After the servers have been migrated to
version 0.8.0, `raft_protocol` can be moved up to 2 and the servers restarted
to match the default.

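For example, a server agent configuration pinned to Raft protocol 1 during the
upgrade might look like the sketch below. The `bootstrap_expect` value is an
assumption for a three-server cluster and is not part of the upgrade
requirement.

```hcl
server {
  enabled          = true
  bootstrap_expect = 3 # assumed cluster size

  # Stay compatible with the old servers while the upgrade is in progress.
  # Raise this to 2 and restart once every server runs 0.8.0.
  raft_protocol = 1
}
```
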
The Raft protocol must be stepped up in this way; only adjacent version numbers are
compatible (for example, version 1 cannot talk to version 3). Here is a table of the
Raft Protocol versions supported by each Nomad version:

<table class="table table-bordered table-striped">
  <tr>
    <th>Version</th>
    <th>Supported Raft Protocols</th>
  </tr>
  <tr>
    <td>0.6 and earlier</td>
    <td>0</td>
  </tr>
  <tr>
    <td>0.7</td>
    <td>1</td>
  </tr>
  <tr>
    <td>0.8</td>
    <td>1, 2, 3</td>
  </tr>
</table>

In order to enable all [Autopilot](/guides/operations/autopilot.html) features, all servers
in a Nomad cluster must be running with Raft protocol version 3 or later.

#### Upgrading to Raft Protocol 3

This section provides details on upgrading to Raft Protocol 3 in Nomad 0.8 and
higher. Raft protocol version 3 requires all servers to run Nomad 0.8.0 or
newer. See [Raft Protocol Version
Compatibility](/guides/upgrade/upgrade-specific.html#raft-protocol-version-compatibility)
for more details. Also, the format of `peers.json` used for outage recovery is
different when running with the latest Raft protocol. See [Manual Recovery
Using
peers.json](/guides/operations/outage.html#manual-recovery-using-peers-json)
for a description of the required format.

Please note that the Raft protocol is different from Nomad's internal protocol
as shown in commands like `nomad server members`. To see the version of the
Raft protocol in use on each server, use the `nomad operator raft list-peers`
command.

The easiest way to upgrade servers is to have each server leave the cluster,
upgrade its `raft_protocol` version in the `server` stanza, and then add it
back. Make sure the new server joins successfully and that the cluster is
stable before rolling the upgrade forward to the next server. It's also
possible to stand up a new set of servers, and then slowly stand down each of
the older servers in a similar fashion.

When using Raft protocol version 3, servers are identified by their `node-id`
instead of their IP address when Nomad makes changes to its internal Raft
quorum configuration. This means that once a cluster has been upgraded with
servers all running Raft protocol version 3, it will no longer allow servers
running any older Raft protocol versions to be added. If running a single
Nomad server, restarting it in-place will result in that server not being able
to elect itself as a leader. To avoid this, either set the Raft protocol back
to 2, or use [Manual Recovery Using
peers.json](/guides/operations/outage.html#manual-recovery-using-peers-json)
to map the server to its node ID in the Raft quorum configuration.

### Node Draining Improvements

Node draining via the [`node drain`][drain-cli] command or the [drain
API][drain-api] has been substantially changed in Nomad 0.8. In Nomad 0.7.1 and
earlier, draining a node would immediately stop all allocations on the node
being drained. Nomad 0.8 now supports a [`migrate`][migrate] stanza in job
specifications to control how many allocations may be migrated at once.
Existing jobs that do not specify a `migrate` stanza use the default.

The `drain` command now blocks until the drain completes. To get the Nomad
0.7.1 and earlier drain behavior, use the command
`nomad node drain -enable -force -detach <node-id>`.

See the [`migrate` stanza documentation][migrate] and [Decommissioning Nodes
guide](/guides/operations/node-draining.html) for details.

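As a sketch of what such a stanza can look like, the group below limits
migrations to one allocation at a time. The job and group names are
hypothetical, and the values are illustrative ones intended to mirror the
documented defaults; check the `migrate` stanza documentation before relying on
them.

```hcl
# Hypothetical job; only the migrate stanza is relevant here.
job "web" {
  group "frontend" {
    migrate {
      # Migrate at most one allocation of this group at a time.
      max_parallel = 1

      # Treat a replacement as healthy once its checks pass for this long.
      health_check     = "checks"
      min_healthy_time = "10s"
      healthy_deadline = "5m"
    }

    # ... tasks omitted ...
  }
}
```
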
### Periods in Environment Variable Names No Longer Escaped

*Applications which expect periods in environment variable names to be replaced
with underscores must be updated.*

In Nomad 0.7, periods (`.`) in environment variable names were replaced with an
underscore in both the [`env`](/docs/job-specification/env.html) and
[`template`](/docs/job-specification/template.html) stanzas.

In Nomad 0.8, periods are *not* replaced and will be included in environment
variables verbatim.

For example, consider the following stanza:

```text
env {
  registry.consul.addr = "${NOMAD_IP_http}:8500"
}
```

In Nomad 0.7 this would be exposed to the task as
`registry_consul_addr=127.0.0.1:8500`. In Nomad 0.8 it will now appear exactly
as specified: `registry.consul.addr=127.0.0.1:8500`.

### Client APIs Unavailable on Older Nodes

Because Nomad 0.8 uses a new RPC mechanism to route node-specific APIs like
[`nomad alloc fs`](/docs/commands/alloc/fs.html) through servers to the node,
0.8 CLIs cannot use these commands against clients older than 0.8.

To access these commands on older clients, either continue to use a pre-0.8
version of the CLI, or upgrade all clients to 0.8.

### CLI Command Changes

Nomad 0.8 has reorganized CLI commands around subcommands. For example,
`nomad alloc-status` is now `nomad alloc status`. All old-style commands remain
backwards compatible, but operators should update any usage of the old-style
commands to the new style, as the old style will be deprecated in future
versions of Nomad.

### RPC Advertise Address

The behavior of the [advertised RPC
address](/docs/configuration/index.html#rpc-1) has changed: it is now used only
to advertise the RPC address of servers to client nodes. Server-to-server
communication is done using the advertised Serf address. Existing clusters
should not be affected, but the advertised RPC address may need to be updated
to allow clients to connect over a NAT.

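A minimal sketch of a server configuration for the NAT case described above.
Both addresses are placeholders: `203.0.113.10` stands in for an address that
clients can reach through the NAT, and `10.0.1.5` for the server's private
address.

```hcl
advertise {
  # Address that client nodes use to reach this server's RPC endpoint,
  # for example a NAT-translated address (placeholder value).
  rpc = "203.0.113.10:4647"

  # Address that other servers use for gossip and server-to-server
  # communication (placeholder value).
  serf = "10.0.1.5:4648"
}
```
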
## Nomad 0.6.0

### Default `advertise` address changes

In earlier versions, when no `advertise` address was specified and Nomad's
`bind_addr` was loopback or `0.0.0.0`, Nomad attempted to resolve the local
hostname to use as an advertise address.

Many hosts cannot properly resolve their hostname, so Nomad 0.6 defaults
`advertise` to the first private IP on the host (e.g. `10.1.2.3`).

If you manually configure `advertise` addresses, no changes are necessary.

### Nomad Clients

The change to the default advertised IP also affects clients that do not specify
which `network_interface` to use. If you have several routable IPs, it is
advised to configure the client's [network
interface](/docs/configuration/client.html#network_interface)
such that tasks bind to the correct address.

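For example, a client could pin the interface explicitly, as in the sketch
below. The interface name `eth1` is an assumption; substitute the interface
that carries the address tasks should bind to.

```hcl
client {
  enabled = true

  # Assumed interface name; tasks bind to an address on this interface.
  network_interface = "eth1"
}
```
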
## Nomad 0.5.5

### Docker `load` changes

Nomad 0.5.5 has a backward incompatible change in the `docker` driver's
configuration. Prior to 0.5.5 the `load` configuration option accepted a list
of images to load; in 0.5.5 it has been changed to a single string. No
functionality was changed. Even if more than one item was specified prior to
0.5.5, only the first item was used.

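The change can be sketched as follows. The task names, image name, and archive
file name are hypothetical placeholders.

```hcl
# Before 0.5.5: `load` was a list, but only the first item was used.
task "cache-old" {
  driver = "docker"

  config {
    image = "redis:3.2"
    load  = ["redis.tar"]
  }
}

# 0.5.5 and later: `load` is a single string.
task "cache-new" {
  driver = "docker"

  config {
    image = "redis:3.2"
    load  = "redis.tar"
  }
}
```
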
To do a zero-downtime deploy with jobs that use the `load` option:

* Upgrade servers to version 0.5.5 or later.

* Deploy new client nodes on the same version as the servers.

* Resubmit jobs with the `load` option fixed and a constraint to only run on
  version 0.5.5 or later:

```hcl
    constraint {
      attribute = "${attr.nomad.version}"
      operator  = "version"
      value     = ">= 0.5.5"
    }
```

* Drain and shut down old client nodes.

### Validation changes

Due to internal job serialization and validation changes you may run into
issues using 0.5.5 command line tools such as `nomad run` and `nomad validate`
with 0.5.4 or earlier agents.

It is recommended you upgrade agents before or alongside your command line
tools.

## Nomad 0.4.0

Nomad 0.4.0 has backward incompatible changes in the logic for Consul
deregistration. When a Task which was started by Nomad v0.3.x is uncleanly shut
down, the Nomad 0.4 Client will no longer clean up any stale services. If an
in-place upgrade of the Nomad client to 0.4 prevents the Task from gracefully
shutting down and deregistering its Consul-registered services, the Nomad Client
will not clean up the remaining Consul services registered with the 0.3
Executor.

We recommend draining a node before upgrading to 0.4.0 and then re-enabling the
node once the upgrade is complete.

## Nomad 0.3.1

Nomad 0.3.1 removes artifact downloading from driver configurations and makes
it a first-class element of the task. As such, jobs will have to be rewritten
in the proper format and resubmitted to Nomad. Nomad clients will properly
re-attach to existing tasks, but job definitions must be updated before they
can be dispatched to clients running 0.3.1.

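The rewrite might look like the sketch below. It assumes the pre-0.3.1 driver
option was named `artifact_source`, which this guide does not confirm; the
task names, URL, and command are hypothetical placeholders.

```hcl
# 0.3.0 and earlier (assumed option name): artifact set in the driver config.
task "app-old" {
  driver = "exec"

  config {
    artifact_source = "https://example.com/app.tar.gz"
    command         = "app"
  }
}

# 0.3.1: the artifact is a first-class element of the task.
task "app-new" {
  driver = "exec"

  artifact {
    source = "https://example.com/app.tar.gz"
  }

  config {
    command = "app"
  }
}
```
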
## Nomad 0.3.0

Nomad 0.3.0 has made several substantial changes to job files, including a new
`log` block and variable interpolation syntax (`${var}`), a modified `restart`
policy syntax, and minimum resources for tasks as well as validation. These
changes require a slight change to the default upgrade flow.

After upgrading the version of the servers, all previously submitted jobs must
be resubmitted with the updated job syntax using a Nomad 0.3.0 binary.

* All instances of `$var` must be converted to the new syntax of `${var}`.

* All tasks must provide their required resources for CPU, memory and disk as
  well as required network usage if ports are required by the task.

* Restart policies must be updated to indicate whether it is desired for the
  task to restart on failure or to fail using `mode = "delay"` or `mode =
  "fail"` respectively.

* Service names that include periods will fail validation. To fix, remove any
  periods from the service name before running the job.

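The items above can be sketched in a single job file. All names, the image, and
the numeric values are illustrative assumptions, not recommended settings.

```hcl
# Hypothetical job written in the 0.3.0 syntax.
job "example" {
  datacenters = ["dc1"]

  constraint {
    # Previously written as "$attr.kernel.name".
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }

  group "web" {
    restart {
      attempts = 3
      interval = "5m"
      delay    = "15s"

      # "delay" keeps restarting the task; "fail" marks it as failed instead.
      mode = "delay"
    }

    task "server" {
      driver = "docker"

      config {
        image = "example/web:1.0"
      }

      # CPU, memory, and disk are required, plus network usage when the
      # task needs ports.
      resources {
        cpu    = 500
        memory = 256
        disk   = 300

        network {
          mbits = 10
          port "http" {}
        }
      }
    }
  }
}
```
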
After updating the servers and job files, Nomad clients can be upgraded by first
draining the node so no tasks are running on it. This can be verified by running
`nomad node status <node-id>` and checking that there are no tasks in the
`running` state. Once that is done, the client can be killed, the `data_dir`
deleted, and Nomad 0.3.0 launched.

[drain-api]: /api/nodes.html#drain-node
[drain-cli]: /docs/commands/node/drain.html
[hcl2]: https://github.com/hashicorp/hcl2
[lxc]: /docs/drivers/external/lxc.html
[migrate]: /docs/job-specification/migrate.html
[plugins]: /docs/drivers/external/index.html
[plugin-stanza]: /docs/configuration/plugin.html
[preemption]: /docs/internals/scheduling/preemption.html
[task-config]: /docs/job-specification/task.html#config
[validate]: /docs/commands/job/validate.html