
     1  ---
     2  layout: "docs"
     3  page_title: "Upgrading Specific Versions"
     4  sidebar_current: "docs-upgrading-specific"
     5  description: |-
     6    Specific versions of Consul may have additional information about the upgrade process beyond the standard flow.
     7  ---
     8  
     9  # Upgrading Specific Versions
    10  
    11  The [upgrading page](/docs/upgrading.html) covers the details of doing
    12  a standard upgrade. However, specific versions of Consul may have more
    13  details provided for their upgrades as a result of new features or changed
    14  behavior. This page is used to document those details separately from the
    15  standard upgrade flow.
    16  
    17  ## Consul 1.4.0
    18  
    19  There are two major features in Consul 1.4.0 that may impact upgrades: a [new ACL system](#acl-upgrade) and [multi-datacenter support for Connect](#connect-multi-datacenter) in the Enterprise version.
    20  
    21  ### ACL Upgrade
    22  
    23  Consul 1.4.0 includes a [new ACL system](/docs/guides/acl.html) that is
    24  designed to have a smooth upgrade path but requires care to upgrade components
    25  in the right order.
    26  
    27  **Note:** As with most major version upgrades, you cannot downgrade once the
    28  upgrade to 1.4.0 is complete as it adds new state to the raft store. As always
    29  it is _strongly_ recommended that you test the upgrade first outside of
    30  production and ensure you take backup snapshots of all datacenters before
    31  upgrading.
    32  
    33  #### Primary Datacenter
    34  
    35  The "ACL datacenter" in 1.3.x and earlier is now referred to as the "Primary
    36  datacenter". All configuration is backwards compatible and shouldn't need to
    37  change prior to upgrade although it's strongly recommended to migrate ACL
    38  configuration to the new syntax soon after upgrade. This includes moving to
    39  `primary_datacenter` rather than `acl_datacenter` and `acl_*` to the new [ACL
    40  block](/docs/agent/options.html#acl).
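
As a rough illustration of the new syntax, a migrated agent configuration might look like the sketch below. The datacenter name `dc1`, the policy values, and the token placeholder are assumptions for illustration only; consult the [ACL block](/docs/agent/options.html#acl) documentation for the full set of fields and their defaults.

```json
{
  "primary_datacenter": "dc1",
  "acl": {
    "enabled": true,
    "default_policy": "deny",
    "down_policy": "extend-cache",
    "tokens": {
      "agent": "REPLACE-WITH-AGENT-TOKEN"
    }
  }
}
```

Here `primary_datacenter` takes the place of `acl_datacenter`, and legacy settings such as `acl_default_policy`, `acl_down_policy`, and `acl_agent_token` move into the `acl` block.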
    41  
Datacenters can be upgraded in any order, although secondaries will remain in
[Legacy ACL mode](#legacy-acl-mode) until the primary datacenter is fully
upgraded.
    45  
    46  Each datacenter should follow the [standard rolling upgrade
    47  procedure](/docs/upgrading.html#standard-upgrades).
    48  
    49  #### Legacy ACL Mode
    50  
When a 1.4.0 server first starts, it runs in "Legacy ACL mode". In this mode,
bootstrap requests and new ACL APIs will not be functional yet and will return
an error. The server advertises its ability to support 1.4.0 ACLs via gossip
and waits.
    55  
    56  In the primary datacenter, the servers all wait in legacy ACL mode until they
    57  see every server in the primary datacenter advertise 1.4.0 ACL support. Once
    58  this happens, the leader will complete the transition out of "legacy ACL mode"
    59  and write this into the state so future restarts don't need to go through the
    60  same transition.
    61  
In a secondary datacenter, the same process happens except that servers
_additionally_ wait for all servers in the primary datacenter, which makes it
safe to upgrade datacenters in any order.
    65  
    66  It should be noted that even if you are not upgrading, starting a brand new
    67  1.4.0 cluster will transition through legacy ACL mode so you may be unable to
    68  bootstrap ACLs until all the expected servers are up and healthy.
    69  
    70  #### Legacy Token Accessor Migration
    71  
    72  As soon as all servers in the primary datacenter have been upgraded to 1.4.0,
    73  the leader will begin the process of creating new accessor IDs for all existing
    74  ACL tokens.
    75  
    76  This process completes in the background and is rate limited to ensure it
    77  doesn't overload the leader. It completes upgrades in batches of 128 tokens and
    78  will not upgrade more than one batch per second so on a cluster with 10,000
    79  tokens, this may take several minutes.
    80  
    81  While this is happening both old and new ACLs will work correctly with the
    82  caveat that new ACL [Token APIs](/api/acl/tokens.html) may not return an
    83  accessor ID for legacy tokens that are not yet migrated.
    84  
    85  #### Migrating Existing ACLs
    86  
    87  New ACL policies have slightly different syntax designed to fix some
    88  shortcomings in old ACL syntax. During and after the upgrade process, any old
    89  ACL tokens will continue to work and grant exactly the same level of access.
    90  
    91  After upgrade, it is still possible to create "legacy" tokens using the existing
    92  API so existing integrations that create tokens (e.g. Vault) will continue to
    93  work. The "legacy" tokens generated though will not be able to take advantage of
new policy features. It's recommended that you complete migration of all tokens
as soon as possible after upgrade, as well as updating any integrations to work
with the new ACL [Token](/api/acl/tokens.html) and
[Policy](/api/acl/policies.html) APIs.
    98  
More complete details on how to upgrade "legacy" tokens are available [here](/docs/guides/acl-migrate-tokens.html).
   100  
   101  ### Connect Multi-datacenter
   102  
   103  This only applies to users upgrading from an older version of Consul Enterprise to Consul Enterprise 1.4.0 (all license types).
   104  
   105  In addition, this upgrade will only affect clusters where [Connect is enabled](/docs/connect/configuration.html) on your servers before the migration.
   106  
   107  Connect multi-datacenter uses the same primary/secondary approach as ACLs and will use the same [primary_datacenter](#primary-datacenter). When a secondary datacenter server restarts with 1.4.0 it will detect it is not the primary and begin an automatic bootstrap of multi-datacenter CA federation.
   108  
   109  Datacenters can be upgraded in either order; secondary datacenters will not switch into multi-datacenter mode until all servers in both the secondary and primary datacenter are detected to be running at least Consul 1.4.0. Secondary datacenters monitor this periodically (every few minutes) and will automatically upgrade Connect to use a federated Certificate Authority when they do.
   110  
In general, migrating a Consul cluster from OSS to Enterprise will update the CA to be federated automatically and without impact on Connect traffic. When upgrading from Consul Enterprise 1.3.x to Consul Enterprise 1.4.0, the CA upgrade is seamless. However, depending on the size of the cluster, _new_ connection attempts in the secondary datacenter might fail for a short window (typically seconds) while the update is propagated. This is because the 1.3.x Beta authorization endpoint validated the originating cluster in a way that was not fully forwards compatible with migrating between cluster trust domains. That issue is fixed in 1.4.0 as part of General Availability.
   112  
Once migrated (typically within a few seconds), Connect will use the primary datacenter's Certificate Authority as the root of trust for all other datacenters. CA migration or root key changes in the primary will now rotate automatically and without loss of connectivity throughout all datacenters and workloads.
   114  
   115  For more information see [Connect Multi-datacenter](/docs/enterprise/connect-multi-datacenter/index.html).
   116  
   117  ## Consul 1.3.0
   118  
This version added support for multiple tag filters in service discovery queries. However, it introduced a subtle bug where API calls to `/catalog/service/:name?tag=<tag>` would ignore the tag filter _only during the upgrade_. The bug only occurs when clients are still running 1.2.3 or earlier but servers have been upgraded. The `/health/service/:name?tag=<tag>` endpoint and DNS interface were _not_ affected.
   120  
   121  For this reason, we recommend you upgrade directly to 1.3.1 which includes only a fix for this issue.
   122  
   123  ## Consul 1.1.0
   124  
   125  #### Removal of Deprecated Features
   126  
   127  The following previously deprecated fields and config options have been removed:
   128  
   129   - `CheckID` has been removed from config file check definitions (use `id` instead).
   130   - `script` has been removed from config file check definitions (use `args` instead).
   131   - `enableTagOverride` is no longer valid in service definitions (use `enable_tag_override` instead).
   132   - The [deprecated set of metric names](/docs/upgrade-specific.html#metric-names-updated) (beginning with `consul.consul.`) has been removed
   133   along with the `enable_deprecated_names` option from the metrics configuration.
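
For illustration, a service and check definition written with the replacement field names might look like the following sketch. The service name, port, check ID, and script path are hypothetical placeholders, not values taken from this page.

```json
{
  "service": {
    "name": "web",
    "port": 80,
    "enable_tag_override": false
  },
  "check": {
    "id": "web-script-check",
    "name": "Web script check",
    "args": ["/usr/local/bin/check_web.sh"],
    "interval": "10s"
  }
}
```

Note that a check using `args` executes a script, so the agent also needs script checks enabled (see the Consul 0.9.0 notes on this page).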
   134  
   135  #### New defaults for Raft Snapshot Creation
   136  Consul 1.0.1 (and earlier versions of Consul) checked for raft snapshots every
   137  5 seconds, and created new snapshots for every 8192 writes. These defaults cause
   138  constant disk IO in large busy clusters. Consul 1.1.0 increases these to larger values,
   139  and makes them tunable via the [raft_snapshot_interval](/docs/agent/options.html#_raft_snapshot_interval) and
   140  [raft_snapshot_threshold](/docs/agent/options.html#_raft_snapshot_threshold) parameters. We recommend
   141  keeping the new defaults. However, operators can go back to the old defaults by changing their
   142  config if they prefer more frequent snapshots. See the documentation for [raft_snapshot_interval](/docs/agent/options.html#_raft_snapshot_interval)
   143  and [raft_snapshot_threshold](/docs/agent/options.html#_raft_snapshot_threshold) to understand the trade-offs
   144  when tuning these.
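
As a sketch, an operator who wants the previous behavior described above (a check every 5 seconds and a snapshot every 8192 writes) could set both parameters explicitly in the server configuration:

```json
{
  "raft_snapshot_interval": "5s",
  "raft_snapshot_threshold": 8192
}
```

Omitting both keys keeps the new, recommended defaults.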
   145  
   146  ## Consul 1.0.7
   147  
   148  When requesting a specific service (`/v1/health/:service` or
   149  `/v1/catalog/:service` endpoints), the `X-Consul-Index` returned is now the
   150  index at which that _specific service_ was last modified. In version 1.0.6 and
   151  earlier the `X-Consul-Index` returned was the index at which _any_ service was
   152  last modified. See [GH-3890](https://github.com/hashicorp/consul/issues/3890)
   153  for more details.
   154  
   155  During upgrades from 1.0.6 or lower to 1.0.7 or higher, watchers are likely to
   156  see `X-Consul-Index` for these endpoints decrease between blocking calls.
   157  
   158  Consul’s watch feature and `consul-template` should gracefully handle this case.
   159  Other tools relying on blocking service or health queries are also likely to
   160  work; some may require a restart. It is possible external tools could break and
   161  either stop working or continually re-request data without blocking if they
   162  have assumed indexes can never decrease or be reset and/or persist index
   163  values. Please test any blocking query integrations in a controlled environment
   164  before proceeding.
   165  
   166  ## Consul 1.0.1
   167  
   168  #### Carefully Check and Remove Stale Servers During Rolling Upgrades
   169  
Consul 1.0 (and earlier versions of Consul when running with [Raft protocol 3](/docs/agent/options.html#_raft_protocol)) had an issue where performing rolling updates of Consul servers could result in an outage from old servers remaining in the cluster. [Autopilot](/docs/guides/autopilot.html) would normally remove old servers when new ones come online, but it was also waiting to promote servers to voters in pairs to maintain an odd quorum size. The pairwise promotion feature was removed so that servers become voters as soon as they are stable, allowing Autopilot to remove old servers in a safer way.
   171  
   172  When upgrading from Consul 1.0, you may need to manually [force-leave](/docs/commands/force-leave.html) old servers as part of a rolling update to Consul 1.0.1.
   173  
   174  ## Consul 1.0
   175  
   176  Consul 1.0 has several important breaking changes that are documented here. Please be sure to read over all the details here before upgrading.
   177  
   178  #### Raft Protocol Now Defaults to 3
   179  
   180  The [`-raft-protocol`](/docs/agent/options.html#_raft_protocol) default has been changed from 2 to 3, enabling all [Autopilot](/docs/guides/autopilot.html) features by default.
   181  
   182  Raft protocol version 3 requires Consul running 0.8.0 or newer on all servers in order to work, so if you are upgrading with older servers in a cluster then you will need to set this back to 2 in order to upgrade. See [Raft Protocol Version Compatibility](/docs/upgrade-specific.html#raft-protocol-version-compatibility) for more details. Also the format of `peers.json` used for outage recovery is different when running with the latest Raft protocol. See [Manual Recovery Using peers.json](/docs/guides/outage.html#manual-recovery-using-peers-json) for a description of the required format.
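
If the cluster still contains servers older than 0.8.0, the protocol version can be pinned in each server's configuration file rather than on the command line. A minimal sketch:

```json
{
  "raft_protocol": 2
}
```

Once every server runs a version that supports Raft protocol 3, this setting can be removed so the new default applies.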
   183  
   184  Please note that the Raft protocol is different from Consul's internal protocol as described on the [Protocol Compatibility Promise](/docs/compatibility.html) page, and as is shown in commands like `consul members` and `consul version`. To see the version of the Raft protocol in use on each server, use the `consul operator raft list-peers` command.
   185  
   186  The easiest way to upgrade servers is to have each server leave the cluster, upgrade its Consul version, and then add it back. Make sure the new server joins successfully and that the cluster is stable before rolling the upgrade forward to the next server. It's also possible to stand up a new set of servers, and then slowly stand down each of the older servers in a similar fashion.
   187  
   188  When using Raft protocol version 3, servers are identified by their [`-node-id`](/docs/agent/options.html#_node_id) instead of their IP address when Consul makes changes to its internal Raft quorum configuration. This means that once a cluster has been upgraded with servers all running Raft protocol version 3, it will no longer allow servers running any older Raft protocol versions to be added. If running a single Consul server, restarting it in-place will result in that server not being able to elect itself as a leader. To avoid this, either set the Raft protocol back to 2, or use [Manual Recovery Using peers.json](/docs/guides/outage.html#manual-recovery-using-peers-json) to map the server to its node ID in the Raft quorum configuration.
   189  
   190  #### Config Files Require an Extension
   191  
   192  As part of supporting the [HCL](https://github.com/hashicorp/hcl#syntax) format for Consul's config files, an `.hcl` or `.json` extension is required for all config files loaded by Consul, even when using the [`-config-file`](/docs/agent/options.html#_config_file) argument to specify a file directly.
   193  
   194  #### Deprecated Options Have Been Removed
   195  
   196  All of Consul's previously deprecated command line flags and config options have been removed, so these will need to be mapped to their equivalents before upgrading. Here's the complete list of removed options and their equivalents:
   197  
   198  | Removed Option | Equivalent |
   199  | -------------- | ---------- |
   200  | `-dc` | [`-datacenter`](/docs/agent/options.html#_datacenter) |
   201  | `-retry-join-azure-tag-name` | [`-retry-join`](/docs/agent/options.html#microsoft-azure) |
   202  | `-retry-join-azure-tag-value` | [`-retry-join`](/docs/agent/options.html#microsoft-azure) |
   203  | `-retry-join-ec2-region` | [`-retry-join`](/docs/agent/options.html#amazon-ec2) |
   204  | `-retry-join-ec2-tag-key` | [`-retry-join`](/docs/agent/options.html#amazon-ec2) |
   205  | `-retry-join-ec2-tag-value` | [`-retry-join`](/docs/agent/options.html#amazon-ec2) |
   206  | `-retry-join-gce-credentials-file` | [`-retry-join`](/docs/agent/options.html#google-compute-engine) |
   207  | `-retry-join-gce-project-name` | [`-retry-join`](/docs/agent/options.html#google-compute-engine) |
   208  | `-retry-join-gce-tag-name` | [`-retry-join`](/docs/agent/options.html#google-compute-engine) |
   209  | `-retry-join-gce-zone-pattern` | [`-retry-join`](/docs/agent/options.html#google-compute-engine) |
   210  | `addresses.rpc` | None, the RPC server for CLI commands is no longer supported. |
   211  | `advertise_addrs` | [`ports`](/docs/agent/options.html#ports) with [`advertise_addr`](https://www.consul.io/docs/agent/options.html#advertise_addr) and/or [`advertise_addr_wan`](/docs/agent/options.html#advertise_addr_wan) |
   212  | `dogstatsd_addr` | [`telemetry.dogstatsd_addr`](/docs/agent/options.html#telemetry-dogstatsd_addr) |
   213  | `dogstatsd_tags` | [`telemetry.dogstatsd_tags`](/docs/agent/options.html#telemetry-dogstatsd_tags) |
   214  | `http_api_response_headers` | [`http_config.response_headers`](/docs/agent/options.html#response_headers) |
   215  | `ports.rpc` | None, the RPC server for CLI commands is no longer supported. |
   216  | `recursor` | [`recursors`](https://github.com/hashicorp/consul/blob/master/website/source/docs/agent/options.html.md#recursors) |
   217  | `retry_join_azure` | [`-retry-join`](/docs/agent/options.html#microsoft-azure) |
   218  | `retry_join_ec2` | [`-retry-join`](/docs/agent/options.html#amazon-ec2) |
   219  | `retry_join_gce` | [`-retry-join`](/docs/agent/options.html#google-compute-engine) |
   220  | `statsd_addr` | [`telemetry.statsd_address`](https://github.com/hashicorp/consul/blob/master/website/source/docs/agent/options.html.md#telemetry-statsd_address) |
   221  | `statsite_addr` | [`telemetry.statsite_address`](https://github.com/hashicorp/consul/blob/master/website/source/docs/agent/options.html.md#telemetry-statsite_address) |
   222  | `statsite_prefix` | [`telemetry.metrics_prefix`](/docs/agent/options.html#telemetry-metrics_prefix) |
   223  | `telemetry.statsite_prefix` | [`telemetry.metrics_prefix`](/docs/agent/options.html#telemetry-metrics_prefix) |
   224  | (service definitions) `serviceid` | [`service_id`](/docs/agent/services.html) |
   225  | (service definitions) `dockercontainerid` | [`docker_container_id`](/docs/agent/services.html) |
   226  | (service definitions) `tlsskipverify` | [`tls_skip_verify`](/docs/agent/services.html) |
   227  | (service definitions) `deregistercriticalserviceafter` | [`deregister_critical_service_after`](/docs/agent/services.html) |
   228  
   229  #### `statsite_prefix` Renamed to `metrics_prefix`
   230  
   231  Since the `statsite_prefix` configuration option applied to all telemetry providers, `statsite_prefix` was renamed to [`metrics_prefix`](/docs/agent/options.html#telemetry-metrics_prefix). Configuration files will need to be updated when upgrading to this version of Consul.
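
A minimal sketch of the updated setting is shown below. The prefix value `consul-prod` is a hypothetical example, not a recommended value.

```json
{
  "telemetry": {
    "metrics_prefix": "consul-prod"
  }
}
```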
   232  
   233  #### `advertise_addrs` Removed
   234  
   235  This configuration option was removed since it was redundant with `advertise_addr` and `advertise_addr_wan` in combination with `ports` and also wrongly stated that you could configure both host and port.
   236  
   237  #### Escaping Behavior Changed for go-discover Configs
   238  
   239  The format for [`-retry-join`](/docs/agent/options.html#retry-join) and [`-retry-join-wan`](/docs/agent/options.html#retry-join-wan) values that use [go-discover](https://github.com/hashicorp/go-discover) cloud auto joining has changed. Values in `key=val` sequences must no longer be URL encoded and can be provided as literals as long as they do not contain spaces, backslashes `\` or double quotes `"`. If values contain these characters then use double quotes as in `"some key"="some value"`. Special characters within a double quoted string can be escaped with a backslash `\`.
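
The sketch below shows how the new format might look in a JSON config file, assuming the AWS provider and hypothetical tag names. The first entry uses plain literals; the second quotes a value that contains a space, and the inner double quotes are escaped only because the whole string lives inside JSON.

```json
{
  "retry_join": [
    "provider=aws tag_key=consul-role tag_value=server",
    "provider=aws tag_key=consul-role tag_value=\"consul server\""
  ]
}
```

Only one of these entries would normally be needed; they are listed together here to contrast the two forms.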
   240  
   241  #### HTTP Verbs are Enforced in Many HTTP APIs
   242  
   243  Many endpoints in the HTTP API that previously took any HTTP verb now check for specific HTTP verbs and enforce them. This may break clients relying on the old behavior. Here's the complete list of updated endpoints and required HTTP verbs:
   244  
   245  | Endpoint | Required HTTP Verb |
   246  | -------- | ------------------ |
   247  | /v1/acl/info | GET |
   248  | /v1/acl/list | GET |
   249  | /v1/acl/replication | GET |
   250  | /v1/agent/check/deregister | PUT |
   251  | /v1/agent/check/fail | PUT |
   252  | /v1/agent/check/pass | PUT |
   253  | /v1/agent/check/register | PUT |
   254  | /v1/agent/check/warn | PUT |
   255  | /v1/agent/checks | GET |
   256  | /v1/agent/force-leave | PUT |
   257  | /v1/agent/join | PUT |
   258  | /v1/agent/members | GET |
   259  | /v1/agent/metrics | GET |
   260  | /v1/agent/self | GET |
   261  | /v1/agent/service/register | PUT |
   262  | /v1/agent/service/deregister | PUT |
   263  | /v1/agent/services | GET |
   264  | /v1/catalog/datacenters | GET |
   265  | /v1/catalog/deregister | PUT |
   266  | /v1/catalog/node | GET |
   267  | /v1/catalog/nodes | GET |
   268  | /v1/catalog/register | PUT |
   269  | /v1/catalog/service | GET |
   270  | /v1/catalog/services | GET |
   271  | /v1/coordinate/datacenters | GET |
   272  | /v1/coordinate/nodes | GET |
   273  | /v1/health/checks | GET |
   274  | /v1/health/node | GET |
   275  | /v1/health/service | GET |
   276  | /v1/health/state | GET |
   277  | /v1/internal/ui/node | GET |
   278  | /v1/internal/ui/nodes | GET |
   279  | /v1/internal/ui/services | GET |
   280  | /v1/session/info | GET |
   281  | /v1/session/list | GET |
   282  | /v1/session/node | GET |
   283  | /v1/status/leader | GET |
   284  | /v1/status/peers | GET |
   285  | /v1/operator/area/:uuid/members | GET |
   286  | /v1/operator/area/:uuid/join | PUT |
   287  
   288  #### Unauthorized KV Requests Return 403
   289  
   290  When ACLs are enabled, reading a key with an unauthorized token returns a 403. This previously returned a 404 response.
   291  
   292  #### Config Section of Agent Self Endpoint has Changed
   293  
The `/v1/agent/self` endpoint's `Config` section has often been in flux as it was directly returning one of Consul's internal data structures. This configuration structure has been moved under `DebugConfig`, and is documented as being for debugging use and subject to change. A small set of elements of `Config` has been maintained and documented. See the [Read Configuration](/api/agent.html#read-configuration) endpoint documentation for details.
   295  
   296  #### Deprecated `configtest` Command Removed
   297  
   298  The `configtest` command was deprecated and has been superseded by the `validate` command.
   299  
   300  #### Undocumented Flags in `validate` Command Removed
   301  
   302  The `validate` command supported the `-config-file` and `-config-dir` command line flags but did not document them. This support has been removed since the flags are not required.
   303  
   304  #### Metric Names Updated
   305  
   306  Metric names no longer start with `consul.consul`. To help with transitioning dashboards and other metric consumers, the field `enable_deprecated_names` has been added to the telemetry section of the config, which will enable metrics with the old naming scheme to be sent alongside the new ones. The following prefixes were affected:
   307  
   308  | Prefix |
   309  | ------ |
   310  | consul.consul.acl |
   311  | consul.consul.autopilot |
   312  | consul.consul.catalog |
   313  | consul.consul.fsm |
   314  | consul.consul.health |
   315  | consul.consul.http |
   316  | consul.consul.kvs |
   317  | consul.consul.leader |
   318  | consul.consul.prepared-query |
   319  | consul.consul.rpc |
   320  | consul.consul.session |
   321  | consul.consul.session_ttl |
   322  | consul.consul.txn |
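
To keep existing dashboards working during the transition, the old names can be emitted alongside the new ones with a telemetry setting along these lines:

```json
{
  "telemetry": {
    "enable_deprecated_names": true
  }
}
```

This is a transitional setting only: as noted in the Consul 1.1.0 section above, both the deprecated names and this option were later removed.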
   323  
   324  #### Checks Validated On Agent Startup
   325  
   326  Consul agents now validate health check definitions in their configuration and will fail at startup if any checks are invalid. In previous versions of Consul, invalid health checks would get skipped.
   327  
   328  ## Consul 0.9.0
   329  
   330  #### Script Checks Are Now Opt-In
   331  
   332  A new [`enable_script_checks`](/docs/agent/options.html#_enable_script_checks) configuration option was added, and defaults to `false`, meaning that in order to allow an agent to run health checks that execute scripts, this will need to be configured and set to `true`. This provides a safer out-of-the-box configuration for Consul where operators must opt-in to allow script-based health checks.
   333  
   334  If your cluster uses script health checks please be sure to set this to `true` as part of upgrading agents. If this is set to `true`, you should also enable [ACLs](/docs/guides/acl.html) to provide control over which users are allowed to register health checks that could potentially execute scripts on the agent machines.
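
A minimal sketch of the opt-in for an agent that must keep running script checks:

```json
{
  "enable_script_checks": true
}
```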
   335  
   336  #### Web UI Is No Longer Released Separately
   337  
   338  Consul releases will no longer include a `web_ui.zip` file with the compiled web assets. These have been built in to the Consul binary since the 0.7.x series and can be enabled with the [`-ui`](/docs/agent/options.html#_ui) configuration option. These built-in web assets have always been identical to the contents of the `web_ui.zip` file for each release. The [`-ui-dir`](/docs/agent/options.html#_ui_dir) option is still available for hosting customized versions of the web assets, but the vast majority of Consul users can just use the built in web assets.
   339  
   340  ## Consul 0.8.0
   341  
   342  #### Upgrade Current Cluster Leader Last
   343  
   344  We identified a potential issue with Consul 0.8 that requires the current cluster
   345  leader to be upgraded last when updating multiple servers. Please see
   346  [this issue](https://github.com/hashicorp/consul/issues/2889) for more details.
   347  
   348  #### Command-Line Interface RPC Deprecation
   349  
   350  The RPC client interface has been removed. All CLI commands that used RPC and the
   351  `-rpc-addr` flag to communicate with Consul have been converted to use the HTTP API
   352  and the appropriate flags for it, and the `rpc` field has been removed from the port
   353  and address binding configs. You will need to remove these fields from your config files
   354  and update any scripts that passed a custom `-rpc-addr` to the following commands:
   355  
   356  * `force-leave`
   357  * `info`
   358  * `join`
   359  * `keyring`
   360  * `leave`
   361  * `members`
   362  * `monitor`
   363  * `reload`
   364  
   365  #### Version 8 ACLs Are Now Opt-Out
   366  
   367  The [`acl_enforce_version_8`](/docs/agent/options.html#acl_enforce_version_8) configuration now defaults to `true` to enable [full version 8 ACL support](/docs/guides/acl.html#version_8_acls) by default. If you are upgrading an existing cluster with ACLs enabled, you will need to set this to `false` during the upgrade on **both Consul agents and Consul servers**. Version 8 ACLs were also changed so that [`acl_datacenter`](/docs/agent/options.html#acl_datacenter) must be set on agents in order to enable the agent-side enforcement of ACLs. This makes for a smoother experience in clusters where ACLs aren't enabled at all, but where the agents would have to wait to contact a Consul server before learning that.
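
For a cluster that already has ACLs enabled, the temporary override described above could be sketched as follows and applied to both agents and servers for the duration of the upgrade:

```json
{
  "acl_enforce_version_8": false
}
```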
   368  
   369  #### Remote Exec Is Now Opt-In
   370  
   371  The default for [`disable_remote_exec`](/docs/agent/options.html#disable_remote_exec) was
   372  changed to "true", so now operators need to opt-in to having agents support running
   373  commands remotely via [`consul exec`](/docs/commands/exec.html).
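
Operators who rely on `consul exec` can opt back in on the relevant agents with a setting along these lines:

```json
{
  "disable_remote_exec": false
}
```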
   374  
   375  #### Raft Protocol Version Compatibility
   376  
   377  When upgrading to Consul 0.8.0 from a version lower than 0.7.0, users will need to
   378  set the [`-raft-protocol`](/docs/agent/options.html#_raft_protocol) option to 1 in
   379  order to maintain backwards compatibility with the old servers during the upgrade.
   380  After the servers have been migrated to version 0.8.0, `-raft-protocol` can be moved
   381  up to 2 and the servers restarted to match the default.
   382  
   383  The Raft protocol must be stepped up in this way; only adjacent version numbers are
   384  compatible (for example, version 1 cannot talk to version 3). Here is a table of the
   385  Raft Protocol versions supported by each Consul version:
   386  
   387  <table class="table table-bordered table-striped">
   388    <tr>
   389      <th>Version</th>
   390      <th>Supported Raft Protocols</th>
   391    </tr>
   392    <tr>
   393      <td>0.6 and earlier</td>
   394      <td>0</td>
   395    </tr>
   396    <tr>
   397      <td>0.7</td>
   398      <td>1</td>
   399    </tr>
   400    <tr>
   401      <td>0.8</td>
   402      <td>1, 2, 3</td>
   403    </tr>
   404  </table>
   405  
   406  In order to enable all [Autopilot](/docs/guides/autopilot.html) features, all servers
   407  in a Consul cluster must be running with Raft protocol version 3 or later.
   408  
   409  ## Consul 0.7.1
   410  
   411  #### Child Process Reaping
   412  
   413  Child process reaping support has been removed, along with the `reap` configuration option. Reaping is also done via [dumb-init](https://github.com/Yelp/dumb-init) in the [Consul Docker image](https://github.com/hashicorp/docker-consul), so removing it from Consul itself simplifies the code and eases future maintenance for Consul. If you are running Consul as PID 1 in a container you will need to arrange for a wrapper process to reap child processes.
   414  
   415  #### DNS Resiliency Defaults
   416  
   417  The default for [`max_stale`](/docs/agent/options.html#max_stale) has been increased from 5 seconds to a near-indefinite threshold (10 years) to allow DNS queries to continue to be served in the event of a long outage with no leader. A new telemetry counter was added at `consul.dns.stale_queries` to track when agents serve DNS queries that are stale by more than 5 seconds.
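
If the earlier 5 second limit is preferred over the new default, it can presumably be restored explicitly in the DNS configuration. A sketch:

```json
{
  "dns_config": {
    "max_stale": "5s"
  }
}
```

With this setting, agents go back to limiting staleness at 5 seconds, at the cost of the outage resiliency that the new default provides.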
   418  
   419  ## Consul 0.7
   420  
   421  Consul version 0.7 is a very large release with many important changes. Changes
   422  to be aware of during an upgrade are categorized below.
   423  
   424  #### Performance Timing Defaults and Tuning
   425  
   426  Consul 0.7 now defaults the DNS configuration to allow for stale queries by defaulting
   427  [`allow_stale`](/docs/agent/options.html#allow_stale) to true for better utilization
   428  of available servers. If you want to retain the previous behavior, set the following
   429  configuration:
   430  
```javascript
{
  "dns_config": {
    "allow_stale": false
  }
}
```
   438  
Consul 0.7 also introduced support for tuning Raft performance using a new
   440  [performance configuration block](/docs/agent/options.html#performance). Also,
   441  the default Raft timing is set to a lower-performance mode suitable for
   442  [minimal Consul servers](/docs/guides/performance.html#minimum).
   443  
   444  To continue to use the high-performance settings that were the default prior to
   445  Consul 0.7 (recommended for production servers), add the following configuration
   446  to all Consul servers when upgrading:
   447  
```javascript
{
  "performance": {
    "raft_multiplier": 1
  }
}
```
   455  
   456  See the [Server Performance](/docs/guides/performance.html) guide for more details.
   457  
   458  #### Leave-Related Configuration Defaults
   459  
The defaults for [`leave_on_terminate`](/docs/agent/options.html#leave_on_terminate)
and [`skip_leave_on_interrupt`](/docs/agent/options.html#skip_leave_on_interrupt)
now depend on whether the agent is acting as a server or a client:
   463  
   464  * For servers, `leave_on_terminate` defaults to "false" and `skip_leave_on_interrupt`
   465  defaults to "true".
   466  
   467  * For clients, `leave_on_terminate` defaults to "true" and `skip_leave_on_interrupt`
   468  defaults to "false".
   469  
   470  These defaults are designed to be safer for servers so that you must explicitly
   471  configure them to leave the cluster. This also results in a better experience for
   472  clients, especially in cloud environments where they may be created and destroyed
   473  often and users prefer not to wait for the 72-hour reap time for cleanup.
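
The role-dependent defaults above can be sketched as a small lookup. This is an illustration rather than Consul's own code: the function names are hypothetical, and `config` stands for whatever leave-related keys an operator has set explicitly.

```python
def leave_defaults(is_server):
    """Return the Consul 0.7 defaults for the two leave-related options."""
    if is_server:
        # Servers stay in the cluster unless explicitly configured to leave.
        return {"leave_on_terminate": False, "skip_leave_on_interrupt": True}
    # Clients leave gracefully so they do not linger until they are reaped.
    return {"leave_on_terminate": True, "skip_leave_on_interrupt": False}


def effective_leave_settings(is_server, config):
    """Overlay explicitly configured values on top of the role defaults."""
    settings = leave_defaults(is_server)
    for key in settings:
        if key in config:
            settings[key] = config[key]
    return settings


# A server that should leave on SIGTERM must opt in explicitly.
print(effective_leave_settings(True, {"leave_on_terminate": True}))
```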
   474  
   475  #### Dropped Support for Protocol Version 1
   476  
   477  Consul version 0.7 dropped support for protocol version 1, which means it
   478  is no longer compatible with versions of Consul prior to 0.3. You will need
   479  to upgrade all agents to a newer version of Consul before upgrading to Consul
   480  0.7.
   481  
   482  #### Prepared Query Changes
   483  
   484  Consul version 0.7 adds a feature which allows prepared queries to store a
   485  [`Near` parameter](/api/query.html#near) in the query definition
   486  itself. This feature enables using the distance sorting features of prepared
   487  queries without explicitly providing the node to sort near in requests, but
   488  requires the agent servicing a request to send additional information about
   489  itself to the Consul servers when executing the prepared query. Agents prior
   490  to 0.7 do not send this information, which means they are unable to properly
   491  execute prepared queries configured with a `Near` parameter. Similarly, any
   492  server nodes prior to version 0.7 are unable to store the `Near` parameter,
   493  making them unable to properly serve requests for prepared queries using the
   494  feature. It is recommended that all agents be running version 0.7 prior to
   495  using this feature.
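
One way to check this recommendation before using `Near` is to compare every agent's version against 0.7. The sketch below assumes you have already collected a mapping of agent name to version string; the function names and sample data are hypothetical, and pre-release suffixes such as `-rc1` are not handled.

```python
def parse_version(text):
    """Turn a string such as "0.7.0" or "v0.6.4" into a 3-part tuple of ints."""
    parts = [int(part) for part in text.lstrip("v").split(".")]
    # Pad short versions so that "0.7" compares equal to "0.7.0".
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def agents_below(versions, minimum):
    """Return the sorted names of agents whose version is lower than `minimum`."""
    floor = parse_version(minimum)
    return sorted(name for name, v in versions.items() if parse_version(v) < floor)


# Hypothetical inventory of agents and their versions.
versions = {"web-1": "0.7.0", "web-2": "0.6.4", "db-1": "0.7"}
print(agents_below(versions, "0.7"))  # ['web-2']
```

An empty result means every listed agent meets the minimum version.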
   496  
   497  #### WAN Address Translation in HTTP Endpoints
   498  
   499  Consul version 0.7 added support for translating WAN addresses in certain
   500  [HTTP endpoints](/docs/agent/options.html#translate_wan_addrs). The servers
   501  and the agents need to be running version 0.7 or later in order to use this
   502  feature.
   503  
   504  These translated addresses could break HTTP endpoint consumers that are
   505  expecting local addresses, so a new [`X-Consul-Translate-Addresses`](/api/index.html#translate_header)
   506  header was added to allow clients to detect if translation is enabled for HTTP
   507  responses. A "lan" tag was added to `TaggedAddresses` for clients that need
   508  the local address regardless of translation.
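
A consumer that needs local addresses might handle both cases roughly as follows. This is a sketch under assumptions: `headers` is a plain mapping of response header names to values, `node` is a dict shaped like a catalog node entry with `Address` and `TaggedAddresses` keys, and both function names are hypothetical.

```python
TRANSLATE_HEADER = "X-Consul-Translate-Addresses"


def translation_enabled(headers):
    """Report whether the response carries the translation header.

    HTTP header names are case-insensitive, so compare in lower case.
    """
    return TRANSLATE_HEADER.lower() in {name.lower() for name in headers}


def local_address(node):
    """Prefer the "lan" tagged address, falling back to the plain address."""
    tagged = node.get("TaggedAddresses") or {}
    return tagged.get("lan", node["Address"])


# Illustrative response data; the addresses are made up.
headers = {"X-Consul-Translate-Addresses": "true"}
node = {
    "Address": "198.18.0.1",
    "TaggedAddresses": {"lan": "10.0.0.1", "wan": "198.18.0.1"},
}
if translation_enabled(headers):
    print(local_address(node))  # 10.0.0.1
```

The fallback to `Address` keeps the helper usable against responses that lack `TaggedAddresses`.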
   509  
   510  #### Outage Recovery and `peers.json` Changes
   511  
   512  The `peers.json` file is no longer present by default and is only used when
   513  performing recovery. This file will be deleted after Consul starts and ingests
   514  the file. Consul 0.7 also uses a new, automatically-created `raft/peers.info` file
   515  to avoid ingesting the `peers.json` file on the first start after upgrading (the
   516  `peers.json` file is simply deleted on the first start after upgrading).
   517  
   518  Please be sure to review the [Outage Recovery Guide](/docs/guides/outage.html)
   519  before upgrading for more details.
   520  
   521  ## Consul 0.6.4
   522  
   523  Consul 0.6.4 made some substantial changes to how ACLs work with prepared
   524  queries. Existing queries will execute with no changes, but there are important
   525  differences to understand about how prepared queries are managed before you
   526  upgrade. In particular, prepared queries with no `Name` defined will no longer
   527  require any ACL to manage them, and prepared queries with a `Name` defined are
   528  now governed by a new `query` ACL policy that will need to be configured
   529  after the upgrade.
   530  
   531  See the [ACL Guide](/docs/guides/acl.html#prepared_query_acls) for more details
   532  about the new behavior and how it compares to previous versions of Consul.
   533  
   534  ## Consul 0.6
   535  
   536  Consul version 0.6 is a very large release with many enhancements and
   537  optimizations. Changes to be aware of during an upgrade are categorized below.
   538  
   539  #### Data Store Changes
   540  
   541  Consul changed the format used to store data on the server nodes in version 0.5
   542  (see 0.5.1 notes below for details). Previously, Consul would automatically
   543  detect data directories using the old LMDB format, and convert them to the newer
   544  BoltDB format. This automatic upgrade has been removed for Consul 0.6, and
   545  instead a safeguard has been put in place which will prevent Consul from booting
   546  if the old directory format is detected.
   547  
   548  It is still possible to migrate from a 0.5.x version of Consul to 0.6+ using the
   549  [consul-migrate](https://github.com/hashicorp/consul-migrate) CLI utility. This
   550  is the same tool that was previously embedded into Consul. See the
   551  [releases](https://github.com/hashicorp/consul-migrate/releases) page for
   552  downloadable versions of the tool.
   553  
   554  Also, in this release Consul switched from LMDB to a fully in-memory database for
   555  the state store. Because LMDB is a disk-based backing store, it was able to store
   556  more data than could fit in RAM in some cases (though this is not a recommended
   557  configuration for Consul). If you have an extremely large data set that won't fit
   558  into RAM, you may encounter issues upgrading to Consul 0.6.0 and later. Consul
   559  should be provisioned with physical memory approximately 2X the data set size to
   560  allow for bursty allocations and subsequent garbage collection.
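
The sizing guidance above reduces to simple arithmetic. The sketch below only restates the approximate 2X rule; the function names are hypothetical, and measuring the data set size is left to the operator.

```python
def recommended_memory_bytes(data_set_bytes, headroom_factor=2.0):
    """Apply the approximate 2X rule to a data set size given in bytes."""
    return int(data_set_bytes * headroom_factor)


def fits_in_memory(data_set_bytes, physical_memory_bytes):
    """Check a server's physical memory against the recommended amount."""
    return physical_memory_bytes >= recommended_memory_bytes(data_set_bytes)


GIB = 1024 ** 3
# A 3 GiB data set calls for roughly 6 GiB of physical memory.
print(recommended_memory_bytes(3 * GIB) // GIB)  # 6
print(fits_in_memory(3 * GIB, 4 * GIB))          # False
```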
   561  
   562  #### ACL Enhancements
   563  
   564  Consul 0.6 introduces enhancements to the ACL system which may require special
   565  handling:
   566  
   567  * Service ACLs are enforced during service discovery (REST + DNS)
   568  
   569  Previously, service discovery was wide open, and any client could query
   570  information about any service without providing a token. Consul now requires
   571  read-level access at a minimum when ACLs are enabled to return service
   572  information over the REST or DNS interfaces. If clients depend on an open
   573  service discovery system, then the following should be added to all ACL tokens
   574  which require it:
   575  
   576      # Enable discovery of all services
   577      service "" {
   578          policy = "read"
   579      }
   580  
   581  When the DNS interface is queried, the agent's
   582  [`acl_token`](/docs/agent/options.html#acl_token) is used, so be sure
   583  that token has sufficient privileges to return the DNS records you
   584  expect to retrieve from it.
   585  
   586  * Event and keyring ACLs
   587  
   588  Similar to service discovery, the new event and keyring ACLs will block access
   589  to these operations if the `acl_default_policy` is set to `deny`. If clients depend
   590  on open access to these, then the following should be added to all ACL tokens which
   591  require them:
   592  
   593      event "" {
   594        policy = "write"
   595      }
   596  
   597      keyring = "write"
   598  
   599  Unfortunately, these are new ACLs for Consul 0.6, so they must be added after the
   600  upgrade is complete.
   601  
   602  #### Prepared Queries
   603  
   604  Prepared queries introduce a new Raft log entry type that isn't supported on older
   605  versions of Consul. It's important to not use the prepared query features of Consul
   606  until all servers in a cluster have been upgraded to version 0.6.0.
   607  
   608  #### Single Private IP Enforcement
   609  
   610  Consul will refuse to start if there are multiple private IPs available, so
   611  if this is the case you will need to configure Consul's advertise or bind addresses
   612  before upgrading.
   613  
   614  #### New Web UI File Layout
   615  
   616  The release .zip file for Consul's web UI no longer contains a `dist` sub-folder;
   617  everything has been moved up one level. If you have any automated scripts that
   618  expect the old layout you may need to update them.
   619  
   620  ## Consul 0.5.1
   621  
   622  Consul version 0.5.1 uses a different backend store for persisting the Raft
   623  log. Because of this change, a data migration is necessary to move the log
   624  entries out of LMDB and into the newer backend, BoltDB.
   625  
   626  Consul version 0.5.1+ makes this transition seamless and easy. As a user, there
   627  are no special steps you need to take. When Consul starts, it checks
   628  for presence of the legacy LMDB data files, and migrates them automatically
   629  if any are found. You will see a log emitted when Raft data is migrated, like
   630  this:
   631  
   632  ```
   633  ==> Successfully migrated raft data in 5.839642ms
   634  ```
   635  
   636  This automatic upgrade will only exist in Consul 0.5.1+ and it will
   637  be removed starting with Consul 0.6.0. It will still be possible to upgrade directly
   638  from pre-0.5.1 versions by using the consul-migrate utility, which is available on the
   639  [Consul Tools page](/downloads_tools.html).
   640  
   641  ## Consul 0.5
   642  
   643  Consul version 0.5 adds two features that complicate the upgrade process:
   644  
   645  * ACL system includes service discovery and registration
   646  * Internal use of tombstones to fix behavior of blocking queries
   647    in certain edge cases.
   648  
   649  Users of the ACL system need to be aware that deploying Consul 0.5 will
   650  cause service registration to be enforced. This means if an agent
   651  attempts to register a service without proper privileges it will be denied.
   652  If the `acl_default_policy` is "allow" then clients will continue to
   653  work without an updated policy. If the policy is "deny", then all clients
   654  will begin to have their registrations rejected, causing issues.
   655  
   656  To avoid this situation, all the ACL policies should be updated to
   657  add something like this:
   658  
   659      # Enable all services to be registered
   660      service "" {
   661          policy = "write"
   662      }
   663  
   664  This will set the service policy to `write` level for all services.
   665  The blank service name is the catch-all value. A more specific service
   666  can also be specified:
   667  
   668      # Enable only the API service to be registered
   669      service "api" {
   670          policy = "write"
   671      }
   672  
   673  The ACL policy can be updated while running 0.4, and enforcement will
   674  begin with the upgrade to 0.5. The policy updates will ensure the
   675  availability of the cluster.
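
To reason about which registrations would be rejected after the upgrade, the interaction between service rules and `acl_default_policy` can be modeled roughly as below. This is a simplified illustration, not Consul's implementation: it assumes a rule applies to every service whose name starts with the rule's name, that the longest matching rule wins, and that the blank name therefore matches everything. The function name `can_register` is hypothetical.

```python
def can_register(service, rules, default_policy):
    """Decide whether registering `service` would be permitted.

    `rules` maps a service-name prefix to "read", "write" or "deny".
    `default_policy` is the acl_default_policy value, "allow" or "deny".
    """
    matches = [prefix for prefix in rules if service.startswith(prefix)]
    if matches:
        # The longest matching prefix wins; "" matches every service.
        return rules[max(matches, key=len)] == "write"
    # No rule applies, so fall back to acl_default_policy.
    return default_policy == "allow"


# With a "deny" default, only services covered by a write rule can register.
print(can_register("api", {"api": "write"}, "deny"))  # True
print(can_register("web", {"api": "write"}, "deny"))  # False
print(can_register("web", {"": "write"}, "deny"))     # True
```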
   676  
   677  The second major change is the new internal command used for tombstones.
   678  The details of the change are not important; however, to function, the leader
   679  node will replicate a new command to its followers. Consul is designed
   680  defensively, and when a command that is not recognized is received, the
   681  server will panic. This is a purposeful design decision to avoid the possibility
   682  of data loss, inconsistencies, or security issues caused by future incompatibility.
   683  
   684  In practice, this means if a Consul 0.5 node is the leader, all of its
   685  followers must also be running 0.5. There are a number of ways to do this
   686  to ensure cluster availability:
   687  
   688  * Add new 0.5 nodes, then remove the old servers. This will add the new
   689    nodes as followers, and once the old servers are removed, one of the
   690    0.5 nodes will become leader.
   691  
   692  * Upgrade the followers first, then the leader last. Using `consul info`,
   693    you can determine which nodes are followers. Do an in-place upgrade
   694    on them first, and finally upgrade the leader last.
   695  
   696  * Upgrade them in any order, but ensure all are done within 15 minutes.
   697    Even if the leader is upgraded to 0.5 first, as long as all of the followers
   698    are running 0.5 within 15 minutes there will be no issues.
   699  
   700  Finally, even if none of the methods above is possible or the process
   701  fails for some reason, it is not fatal. The older version of the server
   702  will simply panic and stop. At that point, you can upgrade to the new version
   703  and restart the agent. There will be no data loss and the cluster will
   704  resume operations.
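
The followers-first option described above can be sketched as a small ordering helper. It assumes you have already recorded each server's role (for example from `consul info`) in a mapping; the function name and server names are hypothetical.

```python
def upgrade_order(roles):
    """Return server names with every follower first and the leader last.

    `roles` maps a server name to "leader" or "follower".
    """
    leaders = sorted(name for name, role in roles.items() if role == "leader")
    if len(leaders) != 1:
        raise ValueError("expected exactly one leader, found %d" % len(leaders))
    followers = sorted(name for name, role in roles.items() if role == "follower")
    return followers + leaders


# Hypothetical three-server cluster.
roles = {"consul-a": "follower", "consul-b": "leader", "consul-c": "follower"}
print(upgrade_order(roles))  # ['consul-a', 'consul-c', 'consul-b']
```

Leadership can change while servers restart, so it is worth re-checking the roles before upgrading the final server.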