---
layout: "guides"
page_title: "Upgrade Guides"
sidebar_current: "guides-upgrade-specific"
description: |-
  Specific versions of Nomad may have additional information about the upgrade
  process beyond the standard flow.
---

# Upgrade Guides

The [upgrading page](/guides/upgrade/index.html) covers the details of doing
a standard upgrade. However, specific versions of Nomad may have more
details provided for their upgrades as a result of new features or changed
behavior. This page is used to document those details separately from the
standard upgrade flow.

## Nomad 0.9.0

### Preemption

Nomad 0.9 adds preemption support for system jobs. If a system job is submitted
that has a higher priority than other running jobs on the node, and the node
does not have capacity remaining, Nomad may preempt those lower priority
allocations to place the system job. See [preemption][preemption] for more
details.

### Task Driver Plugins

All task drivers have become [plugins][plugins] in Nomad 0.9.0. There are two
user-visible differences between 0.8 and 0.9 drivers:

* [LXC][lxc] is now community supported and distributed independently.
* Task driver [`config`][task-config] stanzas are no longer validated by
  the [`nomad job validate`][validate] command. This is a regression that will
  be fixed in a future release.

There is a new method for client driver configuration options, but existing
`client.options` settings are supported in 0.9. See [plugin
configuration][plugin-stanza] for details.

#### LXC

LXC is now an external plugin and must be installed separately. See [the LXC
driver's documentation][lxc] for details.

### Structured Logging

Nomad 0.9.0 switches to structured logging.
Any log processing on the pre-0.9
log output will need to be updated to match the structured output.

Structured log lines have the format:

```
# <Timestamp> [<Level>] <Component>: <Message>: <KeyN>=<ValueN> ...

2019-01-29T05:52:09.221Z [INFO ] client.plugin: starting plugin manager: plugin-type=device
```

Values containing whitespace will be quoted:

```
... starting plugin: task=redis args="[/opt/gopath/bin/nomad logmon]"
```

### HCL2 Transition

Nomad 0.9.0 begins a transition to [HCL2][hcl2], the next version of the
HashiCorp configuration language. While Nomad has begun integrating HCL2,
users will need to continue to use HCL1 in Nomad 0.9.0 as the transition is
incomplete.

If you interpolate variables in your [`task.config`][task-config] containing
consecutive dots in their name, you will need to change your job specification
to use the `env` map. See the following example:

```hcl
env {
  # Note the multiple consecutive dots
  image...version = "3.2"

  # Valid in both v0.8 and v0.9
  image.version = "3.2"
}

# v0.8 task config stanza:
task {
  driver = "docker"
  config {
    image = "redis:${image...version}"
  }
}

# v0.9 task config stanza:
task {
  driver = "docker"
  config {
    image = "redis:${env["image...version"]}"
  }
}
```

This only affects users who interpolate unusual variables with multiple
consecutive dots in their task `config` stanza. All other interpolation is
unchanged.

Since HCL2 uses dotted object notation for interpolation, users should
transition away from variable names with multiple consecutive dots.

## Nomad 0.8.0

### Raft Protocol Version Compatibility

When upgrading to Nomad 0.8.0 from a version lower than 0.7.0, users will need
to set the
[`raft_protocol`](/docs/configuration/server.html#raft_protocol) option
in their `server` stanza to 1 in order to maintain backwards compatibility with
the old servers during the upgrade. After the servers have been migrated to
version 0.8.0, `raft_protocol` can be moved up to 2 and the servers restarted
to match the default.

The Raft protocol must be stepped up in this way; only adjacent version numbers are
compatible (for example, version 1 cannot talk to version 3). Here is a table of the
Raft Protocol versions supported by each Nomad version:

<table class="table table-bordered table-striped">
  <tr>
    <th>Version</th>
    <th>Supported Raft Protocols</th>
  </tr>
  <tr>
    <td>0.6 and earlier</td>
    <td>0</td>
  </tr>
  <tr>
    <td>0.7</td>
    <td>1</td>
  </tr>
  <tr>
    <td>0.8</td>
    <td>1, 2, 3</td>
  </tr>
</table>

In order to enable all [Autopilot](/guides/operations/autopilot.html) features, all servers
in a Nomad cluster must be running with Raft protocol version 3 or later.

#### Upgrading to Raft Protocol 3

This section provides details on upgrading to Raft Protocol 3 in Nomad 0.8 and higher. Raft protocol version 3 requires all servers to run Nomad 0.8.0 or newer. See [Raft Protocol Version Compatibility](/guides/upgrade/upgrade-specific.html#raft-protocol-version-compatibility) for more details. Also, the format of `peers.json` used for outage recovery is different when running with the latest Raft protocol. See [Manual Recovery Using peers.json](/guides/operations/outage.html#manual-recovery-using-peers-json) for a description of the required format.
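
As a minimal sketch of the end state, a server agent running Raft protocol 3 might carry a `server` stanza like the one below. Only `raft_protocol` comes from this guide; the `enabled = true` line is an illustrative assumption about the rest of the agent configuration.

```hcl
# Hypothetical fragment of a server agent configuration file.
server {
  enabled = true

  # Only adjacent Raft protocol versions are compatible, so step this value
  # up one version at a time and confirm the cluster is stable in between.
  raft_protocol = 3
}
```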

Please note that the Raft protocol is different from Nomad's internal protocol as shown in commands like `nomad server members`. To see the version of the Raft protocol in use on each server, use the `nomad operator raft list-peers` command.

The easiest way to upgrade servers is to have each server leave the cluster, upgrade its `raft_protocol` version in the `server` stanza, and then add it back. Make sure the new server joins successfully and that the cluster is stable before rolling the upgrade forward to the next server. It's also possible to stand up a new set of servers, and then slowly stand down each of the older servers in a similar fashion.

When using Raft protocol version 3, servers are identified by their `node-id` instead of their IP address when Nomad makes changes to its internal Raft quorum configuration. This means that once a cluster has been upgraded with servers all running Raft protocol version 3, it will no longer allow servers running any older Raft protocol versions to be added. If running a single Nomad server, restarting it in-place will result in that server not being able to elect itself as a leader. To avoid this, either set the Raft protocol back to 2, or use [Manual Recovery Using peers.json](/guides/operations/outage.html#manual-recovery-using-peers-json) to map the server to its node ID in the Raft quorum configuration.


### Node Draining Improvements

Node draining via the [`node drain`][drain-cli] command or the [drain
API][drain-api] has been substantially changed in Nomad 0.8. In Nomad 0.7.1 and
earlier, draining a node would immediately stop all allocations on the node
being drained. Nomad 0.8 now supports a [`migrate`][migrate] stanza in job
specifications to control how many allocations may be migrated at once, and the
default will be used for existing jobs.

The `drain` command now blocks until the drain completes.
To get the Nomad
0.7.1 and earlier drain behavior, use the command: `nomad node drain -enable
-force -detach <node-id>`

See the [`migrate` stanza documentation][migrate] and [Decommissioning Nodes
guide](/guides/operations/node-draining.html) for details.

### Periods in Environment Variable Names No Longer Escaped

*Applications which expect periods in environment variable names to be replaced
with underscores must be updated.*

In Nomad 0.7, periods (`.`) in environment variable names were replaced with an
underscore in both the [`env`](/docs/job-specification/env.html) and
[`template`](/docs/job-specification/template.html) stanzas.

In Nomad 0.8, periods are *not* replaced and will be included in environment
variables verbatim.

For example, the following stanza:

```text
env {
  registry.consul.addr = "${NOMAD_IP_http}:8500"
}
```

In Nomad 0.7 would be exposed to the task as
`registry_consul_addr=127.0.0.1:8500`. In Nomad 0.8 it will now appear exactly
as specified: `registry.consul.addr=127.0.0.1:8500`.

### Client APIs Unavailable on Older Nodes

Because Nomad 0.8 uses a new RPC mechanism to route node-specific APIs like
[`nomad alloc fs`](/docs/commands/alloc/fs.html) through servers to the node,
0.8 CLIs cannot use these commands against clients older than 0.8.

To access these commands on older clients, either continue to use a pre-0.8
version of the CLI, or upgrade all clients to 0.8.

### CLI Command Changes

Nomad 0.8 has changed the organization of CLI commands to be based on
subcommands. An example of this change is the change from `nomad alloc-status`
to `nomad alloc status`. All commands have been made to be backwards compatible,
but operators should update any usage of the old style commands to the new style
as the old style will be deprecated in future versions of Nomad.

### RPC Advertise Address

The behavior of the [advertised RPC
address](/docs/configuration/index.html#rpc-1) has changed to be only used
to advertise the RPC address of servers to client nodes. Server to server
communication is done using the advertised Serf address. Existing clusters
should not be affected, but the advertised RPC address may need to be updated to
allow clients to connect over a NAT.


## Nomad 0.6.0

### Default `advertise` address changes

When no `advertise` address was specified and Nomad's `bind_addr` was loopback
or `0.0.0.0`, Nomad attempted to resolve the local hostname to use as an
advertise address.

Many hosts cannot properly resolve their hostname, so Nomad 0.6 defaults
`advertise` to the first private IP on the host (e.g. `10.1.2.3`).

If you manually configure `advertise` addresses, no changes are necessary.

### Nomad Clients

The change to the default advertised IP also affects clients that do not specify
which `network_interface` to use. If you have several routable IPs, it is advised
to configure the client's [network
interface](/docs/configuration/client.html#network_interface)
such that tasks bind to the correct address.

## Nomad 0.5.5

### Docker `load` changes

Nomad 0.5.5 has a backward incompatible change in the `docker` driver's
configuration. Prior to 0.5.5 the `load` configuration option accepted a list
of images to load; in 0.5.5 it has been changed to a single string. No
functionality was changed. Even if more than one item was specified prior to
0.5.5, only the first item was used.

To do a zero-downtime deploy with jobs that use the `load` option:

* Upgrade servers to version 0.5.5 or later.

* Deploy new client nodes on the same version as the servers.

* Resubmit jobs with the `load` option fixed and a constraint to only run on
  version 0.5.5 or later:

```hcl
    constraint {
      attribute = "${attr.nomad.version}"
      operator  = "version"
      value     = ">= 0.5.5"
    }
```

* Drain and shut down old client nodes.

### Validation changes

Due to internal job serialization and validation changes, you may run into
issues using 0.5.5 command line tools such as `nomad run` and `nomad validate`
with 0.5.4 or earlier agents.

It is recommended you upgrade agents before or alongside your command line
tools.

## Nomad 0.4.0

Nomad 0.4.0 has backward incompatible changes in the logic for Consul
deregistration. When a Task which was started by Nomad v0.3.x is uncleanly shut
down, the Nomad 0.4 Client will no longer clean up any stale services. If an
in-place upgrade of the Nomad client to 0.4 prevents the Task from gracefully
shutting down and deregistering its Consul-registered services, the Nomad Client
will not clean up the remaining Consul services registered with the 0.3
Executor.

We recommend draining a node before upgrading to 0.4.0 and then re-enabling the
node once the upgrade is complete.


## Nomad 0.3.1

Nomad 0.3.1 removes artifact downloading from driver configurations and makes
artifacts a first-class element of the task. As such, jobs will have to be rewritten in
the proper format and resubmitted to Nomad. Nomad clients will properly
re-attach to existing tasks but job definitions must be updated before they can
be dispatched to clients running 0.3.1.

## Nomad 0.3.0

Nomad 0.3.0 has made several substantial changes to job files, including a new
`log` block and variable interpretation syntax (`${var}`), a modified `restart`
policy syntax, and minimum resources for tasks as well as validation.
These
changes require a slight change to the default upgrade flow.

After upgrading the version of the servers, all previously submitted jobs must
be resubmitted with the updated job syntax using a Nomad 0.3.0 binary.

* All instances of `$var` must be converted to the new syntax of `${var}`.

* All tasks must provide their required resources for CPU, memory and disk as
  well as required network usage if ports are required by the task.

* Restart policies must be updated to indicate whether it is desired for the
  task to restart on failure or to fail using `mode = "delay"` or `mode =
  "fail"` respectively.

* Service names that include periods will fail validation. To fix, remove any
  periods from the service name before running the job.

After updating the Servers and job files, Nomad Clients can be upgraded by first
draining the node so no tasks are running on it. This can be verified by running
`nomad node status <node-id>` and confirming that there are no tasks in the `running`
state. Once that is done, the client can be killed, the `data_dir` should be
deleted, and then Nomad 0.3.0 can be launched.

[drain-api]: /api/nodes.html#drain-node
[drain-cli]: /docs/commands/node/drain.html
[hcl2]: https://github.com/hashicorp/hcl2
[lxc]: /docs/drivers/external/lxc.html
[migrate]: /docs/job-specification/migrate.html
[plugins]: /docs/drivers/external/index.html
[plugin-stanza]: /docs/configuration/plugin.html
[preemption]: /docs/internals/scheduling/preemption.html
[task-config]: /docs/job-specification/task.html#config
[validate]: /docs/commands/job/validate.html