---
layout: "docs"
page_title: "Gossip Protocol"
sidebar_current: "docs-internals-gossip"
description: |-
  Consul uses a gossip protocol to manage membership and broadcast messages to the cluster. All of this is provided through the use of the Serf library. The gossip protocol used by Serf is based on SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol, with a few minor adaptations.
---

# Gossip Protocol

Consul uses a [gossip protocol](https://en.wikipedia.org/wiki/Gossip_protocol)
to manage membership and broadcast messages to the cluster. All of this is provided
by the [Serf library](https://www.serf.io/). The gossip protocol
used by Serf is based on
["SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol"](http://www.cs.cornell.edu/info/projects/spinglass/public_pdfs/swim.pdf),
with a few minor adaptations. For more details, see the
[Serf gossip protocol documentation](https://www.serf.io/docs/internals/gossip.html).

~> **Advanced Topic!** This page covers technical details of
the internals of Consul. You don't need to know these details to effectively
operate and use Consul. These details are documented here for those who wish
to learn about them without having to go spelunking through the source code.

## Gossip in Consul

Consul makes use of two different gossip pools, which we refer to as the
LAN pool and the WAN pool. Each datacenter Consul operates in has a LAN gossip pool
containing all members of the datacenter, both clients and servers. The LAN pool
serves three purposes. First, membership information allows clients to automatically
discover servers, reducing the amount of configuration needed. Second, distributed
failure detection allows the work of failure detection to be shared by the entire
cluster instead of concentrated on a few servers. Lastly, the gossip pool allows for
reliable and fast event broadcasts for events like leader election.
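
To illustrate why gossip-based event broadcast can be fast, here is a minimal Python sketch of infection-style dissemination. It is a toy simulation and not Serf's implementation: the function name `gossip_rounds`, the fanout of 3, and the synchronous-round model are all assumptions made for illustration.

```python
import random


def gossip_rounds(num_nodes, fanout=3, seed=0, max_rounds=1000):
    """Simulate an event spreading through a gossip pool.

    Returns (rounds, informed_count). Hypothetical model: in each round,
    every node that already knows the event forwards it to `fanout`
    randomly chosen peers.
    """
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    informed = {0}             # node 0 originates the event
    rounds = 0
    while len(informed) < num_nodes and rounds < max_rounds:
        newly_informed = set()
        for node in informed:
            peers = [p for p in range(num_nodes) if p != node]
            for peer in rng.sample(peers, min(fanout, len(peers))):
                newly_informed.add(peer)
        informed |= newly_informed
        rounds += 1
    return rounds, len(informed)


rounds, reached = gossip_rounds(100)
print(f"{reached} nodes informed after {rounds} rounds")
```

In this model the number of informed nodes can grow by up to a factor of `fanout + 1` per round, so the round count grows roughly logarithmically with pool size rather than linearly.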

The WAN pool is globally unique, as all servers should participate in the WAN pool
regardless of datacenter. Membership information provided by the WAN pool allows
servers to perform cross-datacenter requests. The integrated failure detection
allows Consul to gracefully handle the loss of connectivity to an entire datacenter
or to just a single server in a remote datacenter.
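
As a rough illustration of how WAN membership could drive cross-datacenter requests, the Python sketch below filters a member list by datacenter and liveness. The `WanMember` type, its fields, and the `servers_for` helper are hypothetical and do not reflect Consul's internal data structures.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WanMember:
    """Hypothetical view of one server as seen in the WAN pool."""
    name: str
    datacenter: str
    alive: bool  # as reported by gossip failure detection


def servers_for(members: List[WanMember], datacenter: str) -> List[WanMember]:
    """Return the servers in `datacenter` that gossip currently reports alive."""
    return [m for m in members if m.datacenter == datacenter and m.alive]


members = [
    WanMember("server-1", "dc1", alive=True),
    WanMember("server-2", "dc2", alive=True),
    WanMember("server-3", "dc2", alive=False),  # a single remote server failed
    WanMember("server-4", "dc3", alive=False),  # an entire datacenter is unreachable
]

# Only the healthy dc2 server remains a candidate for a forwarded request.
dc2_targets = servers_for(members, "dc2")
# No candidates remain for dc3, so a request to it can fail fast.
dc3_targets = servers_for(members, "dc3")
```

The point of the sketch is that both failure cases named above reduce to the same membership query: a failed server simply drops out of the candidate list, and an unreachable datacenter yields an empty one.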

All of these features are provided by [Serf](https://www.serf.io/), which Consul
uses as an embedded library. From a user's perspective this is not important,
since Consul masks the abstraction. As a developer, however, it can be useful
to understand how the library is used.

<a name="lifeguard"></a>
## Lifeguard Enhancements

SWIM assumes that the local node is healthy, in the sense
that soft real-time processing of packets is possible. However, when
the local node is experiencing CPU or network exhaustion, this assumption
can be violated. The result is that the `serfHealth` check status can
occasionally flap, triggering false monitoring alarms, adding noise to
telemetry, and causing the overall cluster to waste CPU and network
resources diagnosing a failure that may not truly exist.

Lifeguard addresses this issue with novel enhancements to SWIM.
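
One of the ideas described in the Lifeguard paper is local health awareness: a node that suspects it is itself degraded becomes more patient before accusing its peers. The Python sketch below is a simplified illustration of that idea under stated assumptions. The `LocalHealth` class, its method names, and the cap of 8 are hypothetical and are not Serf's API.

```python
class LocalHealth:
    """Saturating counter that tracks how unhealthy the local node appears."""

    def __init__(self, max_score=8):
        self.max_score = max_score  # illustrative cap, chosen for this sketch
        self.score = 0              # 0 means the local node looks healthy

    def record_probe(self, succeeded):
        """Lower the score after a successful probe, raise it after a failure."""
        delta = -1 if succeeded else 1
        self.score = max(0, min(self.max_score, self.score + delta))

    def scale_timeout(self, base_timeout):
        """Stretch the probe timeout in proportion to the current score."""
        return base_timeout * (self.score + 1)


health = LocalHealth()
for _ in range(3):
    health.record_probe(succeeded=False)  # probes time out while the node is starved

# With a score of 3, a 0.5 s base timeout is stretched to 2.0 s.
print(health.scale_timeout(0.5))
```

A node under CPU or network exhaustion therefore waits longer before declaring a peer suspect, which reduces the chance that its own slowness is misread as a remote failure.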

For more details about Lifeguard, please see the
[Making Gossip More Robust with Lifeguard](https://www.hashicorp.com/blog/making-gossip-more-robust-with-lifeguard/)
blog post, which provides a high-level overview of the HashiCorp Research paper
[Lifeguard : SWIM-ing with Situational Awareness](https://arxiv.org/abs/1707.00788). The
[Serf gossip protocol guide](https://www.serf.io/docs/internals/gossip.html#lifeguard)
also provides some lower-level details about the gossip protocol and Lifeguard.