---
layout: "docs"
page_title: "Multiple Datacenters - Basic Federation with the WAN Gossip Pool"
sidebar_current: "docs-guides-datacenters"
description: |-
  One of the key features of Consul is its support for multiple datacenters. The architecture of Consul is designed to promote low coupling of datacenters so that connectivity issues or failure of any datacenter does not impact the availability of Consul in other datacenters. This means each datacenter runs independently, each having a dedicated group of servers and a private LAN gossip pool.
---

# Multiple Datacenters
## Basic Federation with the WAN Gossip Pool

One of the key features of Consul is its support for multiple datacenters.
The [architecture](/docs/internals/architecture.html) of Consul is designed to
promote low coupling of datacenters so that connectivity issues or
failure of any datacenter does not impact the availability of Consul in other
datacenters. This means each datacenter runs independently, each having a dedicated
group of servers and a private LAN [gossip pool](/docs/internals/gossip.html).

In general, data is not replicated between different Consul datacenters. When a
request is made for a resource in another datacenter, the local Consul servers forward
an RPC request to the remote Consul servers for that resource and return the results.
If the remote datacenter is not available, then those resources will also not be
available, but that won't otherwise affect the local datacenter. There are some special
situations where a limited subset of data can be replicated, such as with Consul's built-in
[ACL replication](/docs/guides/acl.html#outages-and-acl-replication) capability, or
external tools like [consul-replicate](https://github.com/hashicorp/consul-replicate).

This guide covers the basic form of federating Consul clusters using a single
WAN gossip pool, interconnecting all Consul servers.
[Consul Enterprise](https://www.hashicorp.com/products/consul/) version 0.8.0 added support
for an advanced multiple datacenter capability. Please see the
[Advanced Federation Guide](/docs/guides/areas.html) for more details.

## Getting Started

To get started, follow the [bootstrapping guide](/docs/guides/bootstrapping.html) to
start each datacenter. After bootstrapping, we should have two datacenters, which
we will refer to as `dc1` and `dc2`. Note that datacenter names are opaque to Consul;
they are simply labels that help human operators reason about the Consul clusters.

To query the known WAN nodes, we use the [`members`](/docs/commands/members.html)
command with the `-wan` flag:

```text
$ consul members -wan
...
```

This provides a list of all known members in the WAN gossip pool, which should
contain only server nodes. Client nodes send requests to a datacenter-local server,
so they do not participate in WAN gossip. Client requests are forwarded by local
servers to a server in the target datacenter as necessary.

The next step is to ensure that every server node in every datacenter joins the WAN gossip pool:

```text
$ consul join -wan <server 1> <server 2> ...
...
```

The [`join`](/docs/commands/join.html) command is used with the `-wan` flag to indicate
that we are attempting to join a server in the WAN gossip pool. As with LAN gossip, you only
need to join a single existing member, and the gossip protocol will be used to exchange
information about all known members. For the initial setup, however, each server
will only know about itself and must be added to the cluster. Consul 0.8.0 added WAN join
flooding, so if one Consul server in a datacenter joins the WAN, it will automatically
join the other servers in its local datacenter that it knows about via the LAN.

Once the join is complete, the [`members`](/docs/commands/members.html) command can be
used to verify that all server nodes are gossiping over the WAN.

We can also verify that both datacenters are known using the
[HTTP Catalog API](/api/catalog.html#catalog_datacenters):

```text
$ curl http://localhost:8500/v1/catalog/datacenters
["dc1", "dc2"]
```

As a simple test, you can try to query the nodes in each datacenter:

```text
$ curl 'http://localhost:8500/v1/catalog/nodes?dc=dc1'
...
$ curl 'http://localhost:8500/v1/catalog/nodes?dc=dc2'
...
```
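
The same two catalog checks can be scripted. The following is a minimal Python sketch, assuming a local agent serving the HTTP API on `localhost:8500` as in the `curl` examples above. The helper names (`catalog_url`, `get_json`, `nodes_per_datacenter`) are hypothetical and not part of Consul; only the `/v1/catalog/datacenters` and `/v1/catalog/nodes?dc=` endpoints and the `Node` field of each returned entry come from the Catalog API.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a local agent answers on this address, as in the curl examples.
CONSUL_HTTP = "http://localhost:8500"


def catalog_url(path, dc=None, base=CONSUL_HTTP):
    """Build a catalog API URL, optionally scoped to one datacenter via ?dc=."""
    url = f"{base}/v1/catalog/{path}"
    if dc is not None:
        url += "?" + urllib.parse.urlencode({"dc": dc})
    return url


def get_json(url, timeout=5.0):
    """GET a URL and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def nodes_per_datacenter(fetch=get_json):
    """Return {datacenter: [node names]} for every datacenter the agent knows.

    `fetch` is injectable so the function can be exercised without a live agent.
    """
    result = {}
    for dc in fetch(catalog_url("datacenters")):
        nodes = fetch(catalog_url("nodes", dc=dc))
        result[dc] = [node["Node"] for node in nodes]
    return result
```

Calling `nodes_per_datacenter()` on a server in `dc1` exercises RPC forwarding for every remote datacenter in one pass. If a remote datacenter is unreachable, the request for its nodes is expected to fail rather than return an empty list, so a raised `urllib.error.HTTPError` or `URLError` is itself a useful signal.
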
To persist the `join` information, add the following to the Consul configuration on each server node in the cluster. For example, on the `dc1` server nodes:

```text
...
  "retry_join_wan":[
    "dc2-server-1",
    ...
    "dc2-server-N"
  ],
...
```

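
When there are several datacenters, the `retry_join_wan` list for each one can be generated rather than written by hand. This is a small sketch under stated assumptions: the inventory dictionary and its host names are hypothetical placeholders, and `retry_join_wan_config` is an illustrative helper, not a Consul tool. It mirrors the example above by listing only the servers of the other datacenters.

```python
import json


def retry_join_wan_config(local_dc, servers_by_dc):
    """Return a config fragment listing the servers of every other datacenter."""
    remote = [
        host
        for dc, hosts in sorted(servers_by_dc.items())
        if dc != local_dc
        for host in hosts
    ]
    return {"retry_join_wan": remote}


# Hypothetical inventory; these host names are placeholders, not real nodes.
servers = {
    "dc1": ["dc1-server-1", "dc1-server-2"],
    "dc2": ["dc2-server-1", "dc2-server-2"],
}

# Print the fragment to merge into the configuration of the dc1 servers.
print(json.dumps(retry_join_wan_config("dc1", servers), indent=2))
```

The printed fragment still has to be merged into each server's existing JSON configuration; the sketch does not write any files.
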
There are a few networking requirements that must be satisfied for this to
work. All server nodes must be able to talk to each other; otherwise,
the gossip protocol and RPC forwarding will not work. If service discovery
is to be used across datacenters, the network must be able to route traffic
between IP addresses across regions as well. Usually, this means that all datacenters
must be connected using a VPN or other tunneling mechanism. Consul does not handle
VPN or NAT traversal for you.

Note that for RPC forwarding to work, the bind address must be accessible from remote nodes.
Configuring `serf_wan`, `advertise_wan_addr` and `translate_wan_addrs` can lead to a
situation where `consul members -wan` lists remote nodes but RPC operations fail with one
of the following errors:

- `No path to datacenter`
- `rpc error getting client: failed to get conn: dial tcp <LOCAL_ADDR>:0-><REMOTE_ADDR>:<REMOTE_RPC_PORT>: i/o timeout`

The most likely cause of these errors is that `bind_addr` is set to a private address, preventing
the RPC server from accepting connections across the WAN. Setting `bind_addr` to a public
address (or one that can be routed across the WAN) will resolve this issue. Be aware that
exposing the RPC server on a public port should only be done **after** firewall rules have
been established.

The [`translate_wan_addrs`](/docs/agent/options.html#translate_wan_addrs) configuration
provides a basic address rewriting capability.
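
To narrow down the `i/o timeout` error described above, it can help to test plain TCP reachability of a remote server's RPC port from a server in the other datacenter. The sketch below is illustrative only: `rpc_port_reachable` is a hypothetical helper, and it assumes the default server RPC port of 8300, which may differ if `ports.server` has been overridden. A successful TCP connection shows only that the address is routable and the port is open; it does not prove that RPC forwarding works.

```python
import socket

# Assumption: 8300 is the default server RPC port; adjust if ports.server is overridden.
DEFAULT_RPC_PORT = 8300


def rpc_port_reachable(host, port=DEFAULT_RPC_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, running `rpc_port_reachable("dc2-server-1")` on a `dc1` server (the host name is the placeholder used earlier in this guide) and getting `False` while `consul members -wan` still lists that node points at the bind address or a firewall rather than at the gossip configuration.
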