---
layout: "intro"
page_title: "Consul Cluster"
sidebar_current: "gettingstarted-join"
description: >
  When a Consul agent is started, it begins as an isolated cluster of its own.
  To learn about other cluster members, the agent must join one or more other
  nodes using a provided join address. In this step, we will set up a two-node
  cluster and join the nodes together.
---

# Consul Cluster

We've started our first agent and registered and queried a service on that
agent. Additionally, we've configured Consul Connect to automatically authorize
and encrypt connections between services. This showed how easy it is to use
Consul but didn't show how this could be extended to a scalable,
production-grade service mesh infrastructure. In this step, we'll create our
first real cluster with multiple members.

When a Consul agent is started, it begins without knowledge of any other node:
it is an isolated cluster of one. To learn about other cluster members, the
agent must _join_ an existing cluster. To join an existing cluster, it only
needs to know about a _single_ existing member. After it joins, the agent will
gossip with this member and quickly discover the other members in the cluster.
A Consul agent can join any other agent, not just agents in server mode.

## Starting the Agents

To simulate a more realistic cluster, we will start a two-node cluster via
[Vagrant](https://www.vagrantup.com/). The Vagrantfile we will be using can
be found in the
[demo section of the Consul repo](https://github.com/hashicorp/consul/tree/master/demo/vagrant-cluster).

We first boot our two nodes:

```text
$ vagrant up
```

Once the systems are available, we can ssh into them to begin configuration
of our cluster. We start by logging in to the first node:

```text
$ vagrant ssh n1
```

In our previous examples, we used the [`-dev`
flag](/docs/agent/options.html#_dev) to quickly set up a development server.
However, this is not sufficient for use in a clustered environment. We will
omit the `-dev` flag from here on, and instead specify our clustering flags as
outlined below.

Each node in a cluster must have a unique name. By default, Consul uses the
hostname of the machine, but we'll manually override it using the [`-node`
command-line option](/docs/agent/options.html#_node).

We will also specify a [`bind` address](/docs/agent/options.html#_bind):
this is the address that Consul listens on, and it *must* be accessible by
all other nodes in the cluster. While a `bind` address is not strictly
necessary, it's always best to provide one. Consul will by default attempt to
listen on all IPv4 interfaces on a system, but will fail to start with an
error if multiple private IPs are found. Since production servers often
have multiple interfaces, specifying a `bind` address assures that you will
never bind Consul to the wrong interface.
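
If you'd rather not hardcode the address, `-bind` also accepts a
[go-sockaddr](https://github.com/hashicorp/go-sockaddr) template. As a rough
sketch only (assuming the cluster's private network sits on `eth1`, which may
differ on your machines):

```text
$ consul agent -bind '{{ GetInterfaceIP "eth1" }}' ...
```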

The first node will act as our sole server in this cluster, and we indicate
this with the [`server` switch](/docs/agent/options.html#_server).

The [`-bootstrap-expect` flag](/docs/agent/options.html#_bootstrap_expect)
hints to the Consul server how many server nodes we expect to join the
datacenter. The purpose of this flag is to delay the bootstrapping of the
replicated log until the expected number of servers has successfully joined.
You can read more about this in the [bootstrapping
guide](/docs/guides/bootstrapping.html).
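
Here we will run a single server, so we pass `-bootstrap-expect=1`. As an
illustrative sketch, each server in a three-server cluster would instead be
started with the same total expectation, along the lines of:

```text
$ consul agent -server -bootstrap-expect=3 -data-dir=/tmp/consul ...
```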

We've included the [`-enable-script-checks`](/docs/agent/options.html#_enable_script_checks)
flag set to `true` in order to enable health checks that can execute external scripts.
This will be used in examples later. For production use, you'd want to configure
[ACLs](/docs/guides/acl.html) in conjunction with this to control the ability to
register arbitrary scripts.
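
To give a concrete (purely illustrative) idea of what this enables, a
script-based check definition such as the hypothetical `ping` check below
could later be dropped into the configuration directory described next:

```text
{
  "check": {
    "name": "ping",
    "args": ["ping", "-c1", "google.com"],
    "interval": "30s"
  }
}
```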

Finally, we add the [`config-dir` flag](/docs/agent/options.html#_config_dir),
marking where service and check definitions can be found.
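
For example, a minimal service definition saved in that directory (the `web`
service here is just a placeholder) would be loaded when the agent starts:

```text
{
  "service": {
    "name": "web",
    "port": 80
  }
}
```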

All together, these settings yield a
[`consul agent`](/docs/commands/agent.html) command like this:

```text
vagrant@n1:~$ consul agent -server -bootstrap-expect=1 \
	-data-dir=/tmp/consul -node=agent-one -bind=172.20.20.10 \
	-enable-script-checks=true -config-dir=/etc/consul.d
...
```

Now, in another terminal, we will connect to the second node:

```text
$ vagrant ssh n2
```

This time, we set the [`bind` address](/docs/agent/options.html#_bind)
to match the IP of the second node as specified in the Vagrantfile
and the [`node` name](/docs/agent/options.html#_node) to be `agent-two`.
Since this node will not be a Consul server, we don't provide a
[`server` switch](/docs/agent/options.html#_server).

All together, these settings yield a
[`consul agent`](/docs/commands/agent.html) command like this:

```text
vagrant@n2:~$ consul agent -data-dir=/tmp/consul -node=agent-two \
	-bind=172.20.20.11 -enable-script-checks=true -config-dir=/etc/consul.d
...
```

At this point, you have two Consul agents running: one server and one client.
The two Consul agents still don't know anything about each other and are each
part of their own single-node clusters. You can verify this by running
[`consul members`](/docs/commands/members.html) against each agent and noting
that only one member is visible to each agent.
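
For instance, on the first node only `agent-one` is listed at this stage
(the build and protocol columns will reflect your Consul version):

```text
vagrant@n1:~$ consul members
Node       Address            Status  Type    Build  Protocol
agent-one  172.20.20.10:8301  alive   server  0.5.0  2
```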

## Joining a Cluster

Now, we'll tell the first agent to join the second agent by running
the following commands in a new terminal:

```text
$ vagrant ssh n1
...
vagrant@n1:~$ consul join 172.20.20.11
Successfully joined cluster by contacting 1 nodes.
```

You should see some log output in each of the agent logs. If you read
carefully, you'll see that they received join information. If you
run [`consul members`](/docs/commands/members.html) against each agent,
you'll see that both agents now know about each other:

```text
vagrant@n2:~$ consul members
Node       Address            Status  Type    Build  Protocol
agent-two  172.20.20.11:8301  alive   client  0.5.0  2
agent-one  172.20.20.10:8301  alive   server  0.5.0  2
```

-> **Remember:** To join a cluster, a Consul agent only needs to
learn about <em>one existing member</em>. After joining the cluster, the
agents gossip with each other to propagate full membership information.

## Auto-joining a Cluster on Start

Ideally, whenever a new node is brought up in your datacenter, it should
automatically join the Consul cluster without human intervention. Consul
facilitates auto-join by enabling the auto-discovery of instances in AWS,
Google Cloud or Azure with a given tag key/value. To use the integration, add
the [`retry_join_ec2`](/docs/agent/options.html#retry_join_ec2),
[`retry_join_gce`](/docs/agent/options.html#retry_join_gce) or the
[`retry_join_azure`](/docs/agent/options.html#retry_join_azure) nested object
to your Consul configuration file. This will allow a new node to join the
cluster without any hardcoded configuration. Alternatively, you can join a
cluster at startup using the [`-join` flag](/docs/agent/options.html#_join) or
[`start_join` setting](/docs/agent/options.html#start_join) with hardcoded
addresses of other known Consul agents.
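
For the hardcoded variant, a sketch of a configuration file dropped into the
second node's `-config-dir` (using this guide's addresses) might look like:

```text
{
  "start_join": ["172.20.20.10"]
}
```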

## Querying Nodes

Just like querying services, Consul has an API for querying the
nodes themselves. You can do this via the DNS or HTTP API.

For the DNS API, the structure of the names is `NAME.node.consul` or
`NAME.node.DATACENTER.consul`. If the datacenter is omitted, Consul
will only search the local datacenter.

For example, from "agent-one", we can query for the address of the
node "agent-two":

```text
vagrant@n1:~$ dig @127.0.0.1 -p 8600 agent-two.node.consul
...

;; QUESTION SECTION:
;agent-two.node.consul.	IN	A

;; ANSWER SECTION:
agent-two.node.consul.	0 IN	A	172.20.20.11
```
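
The same lookup can be done over the HTTP API; for example, the catalog
endpoint lists the nodes the cluster knows about (output abridged):

```text
vagrant@n1:~$ curl http://localhost:8500/v1/catalog/nodes
[
  {
    "Node": "agent-one",
    "Address": "172.20.20.10",
    ...
  },
  {
    "Node": "agent-two",
    "Address": "172.20.20.11",
    ...
  }
]
```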

The ability to look up nodes in addition to services is incredibly
useful for system administration tasks. For example, knowing the address
of the node to SSH into is as easy as making the node a part of the
Consul cluster and querying it.

## Leaving a Cluster

To leave the cluster, you can either gracefully quit an agent (using
`Ctrl-C`) or force kill one of the agents. Gracefully leaving allows
the node to transition into the _left_ state; otherwise, other nodes
will detect it as having _failed_. The difference is covered
in more detail [here](/intro/getting-started/agent.html#stopping).
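
For example, instead of `Ctrl-C`, a graceful leave can also be triggered from
another terminal on the node with the [`consul leave`](/docs/commands/leave.html)
command:

```text
vagrant@n2:~$ consul leave
Graceful leave complete
```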

## Next Steps

We now have a multi-node Consul cluster up and running. Let's make
our services more robust by giving them [health checks](checks.html)!