github.com/outbrain/consul@v1.4.5/website/source/docs/guides/performance.html.md (about)

     1  ---
     2  layout: "docs"
     3  page_title: "Server Performance"
     4  sidebar_current: "docs-guides-performance"
     5  description: |-
     6    Consul requires different amounts of compute resources, depending on cluster size and expected workload. This guide provides guidance on choosing compute resources.
     7  ---
     8  
     9  # Server Performance
    10  
    11  Since Consul servers run a [consensus protocol](/docs/internals/consensus.html) to
    12  process all write operations and are contacted on nearly all read operations, server
    13  performance is critical for overall throughput and health of a Consul cluster. Servers
    14  are generally I/O bound for writes because the underlying Raft log store performs a sync
    15  to disk every time an entry is appended. Servers are generally CPU bound for reads since
    16  reads work from a fully in-memory data store that is optimized for concurrent access.
    17  
    18  <a name="minimum"></a>
    19  ## Minimum Server Requirements
    20  
    21  In Consul 0.7, the default server [performance parameters](/docs/agent/options.html#performance)
    22  were tuned to allow Consul to run reliably (but relatively slowly) on a server cluster of three
    23  [AWS t2.micro](https://aws.amazon.com/ec2/instance-types/) instances. These thresholds
    24  were determined empirically using a leader instance that was under sufficient read, write,
    25  and network load to cause it to permanently be at zero CPU credits, forcing it to the baseline
    26  performance mode for that instance type. Real-world workloads typically have more bursts of
    27  activity, so this is a conservative and pessimistic tuning strategy.
    28  
    29  This default was chosen based on feedback from users, many of whom wanted a low cost way
    30  to run small production or development clusters with low cost compute resources, at the
    31  expense of some performance in leader failure detection and leader election times.
    32  
    33  The default performance configuration is equivalent to this:
    34  
    35  ```javascript
    36  {
    37    "performance": {
    38      "raft_multiplier": 5
    39    }
    40  }
    41  ```
    42  
    43  <a name="production"></a>
    44  ## Production Server Requirements
    45  
    46  When running Consul 0.7 and later in production, it is recommended to configure the server
    47  [performance parameters](/docs/agent/options.html#performance) back to Consul's original
    48  high-performance settings. This lets Consul servers detect a failed leader and complete
    49  leader elections much more quickly than the default configuration, which extends key Raft
    50  timeouts by a factor of 5 and can therefore be quite slow during these events.
    51  
    52  The high performance configuration is simple and looks like this:
    53  
    54  ```javascript
    55  {
    56    "performance": {
    57      "raft_multiplier": 1
    58    }
    59  }
    60  ```
    61  
    62  This value must take into account the network latency between the servers and the read/write load on the servers.
    63  
    64  The value of `raft_multiplier` is a scaling factor and directly affects the following parameters:
    65  
    66  |Param|Value||
    67  |-----|----:|-:|
    68  |HeartbeatTimeout|1000ms|default|
    69  |ElectionTimeout|1000ms|default|
    70  |LeaderLeaseTimeout|500ms|default|
    71  
    72  So a scaling factor of `5` (i.e. `raft_multiplier: 5`) updates the following values:
    73  
    74  |Param|Value|Calculation|
    75  |-----|----:|-:|
    76  |HeartbeatTimeout|5000ms|5 x 1000ms|
    77  |ElectionTimeout|5000ms|5 x 1000ms|
    78  |LeaderLeaseTimeout|2500ms|5 x 500ms|
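
The scaling in the two tables above can be sketched as a small calculation. This is only an illustration: `scaledRaftTimeouts` is a hypothetical helper, not part of Consul, and the base values are simply the ones listed in the first table.

```javascript
// Base timeouts in milliseconds at raft_multiplier = 1 (from the table above).
const BASE_TIMEOUTS_MS = {
  HeartbeatTimeout: 1000,
  ElectionTimeout: 1000,
  LeaderLeaseTimeout: 500,
};

// Hypothetical helper (not a Consul API): multiplies each base timeout
// by the given raft_multiplier.
function scaledRaftTimeouts(raftMultiplier) {
  // Assumption for this sketch: only positive integers are accepted.
  if (!Number.isInteger(raftMultiplier) || raftMultiplier < 1) {
    throw new RangeError("raft_multiplier must be a positive integer");
  }
  const scaled = {};
  for (const [name, baseMs] of Object.entries(BASE_TIMEOUTS_MS)) {
    scaled[name] = baseMs * raftMultiplier;
  }
  return scaled;
}

// Prints the scaled timeouts for the default multiplier of 5.
console.log(scaledRaftTimeouts(5));
```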
    79  
    80  ~> **NOTE** Wide networks with more latency will perform better with larger values of `raft_multiplier`.
    81  
    82  The trade off is between leader stability and time to recover from an actual
    83  leader failure. A low multiplier minimizes failure detection and election time,
    84  but elections may be triggered frequently in high latency situations. This can
    85  cause constant leadership churn and associated unavailability. A high multiplier
    86  reduces the chance that spurious failures will cause leadership churn, but it
    87  does so at the expense of taking longer to detect real failures and thus taking
    88  longer to restore cluster availability.
    89  
    90  Leadership instability can also be caused by under-provisioned CPU resources and
    91  is more likely in environments where CPU cycles are shared with other workloads.
    92  In order for a server to remain the leader, it must send heartbeat messages
    93  to all other servers every few hundred milliseconds. If some number of
    94  these are missing or late because the leader does not have sufficient CPU to
    95  send them on time, the other servers will treat the leader as failed and hold
    96  a new election.
    97  
    98  It's best to benchmark with a realistic workload when choosing a production server for Consul.
    99  Here are some general recommendations:
   100  
   101  * Consul will make use of multiple cores, and at least 2 cores are recommended.
   102  
   103  * <a name="last-contact"></a>Spurious leader elections can be caused by networking issues between
   104  the servers or insufficient CPU resources. Users in cloud environments often bump their servers
   105  up to the next instance class with improved networking and CPU until leader elections stabilize,
   106  and in Consul 0.7 or later the [performance parameters](/docs/agent/options.html#performance)
   107  configuration now gives you tools to trade off performance instead of upsizing servers. You can
   108  use the [`consul.raft.leader.lastContact` telemetry](/docs/agent/telemetry.html#last-contact)
   109  to observe how the Raft timing is performing and guide the decision to de-tune Raft performance
   110  or add more powerful servers.
   111  
   112  * For DNS-heavy workloads, configuring all Consul agents in a cluster with the
   113  [`allow_stale`](/docs/agent/options.html#allow_stale) configuration option will allow reads to
   114  scale across all Consul servers, not just the leader. Consul 0.7 and later enables stale reads
   115  for DNS by default. See [Stale Reads](/docs/guides/dns-cache.html#stale) in the
   116  [DNS Caching](/docs/guides/dns-cache.html) guide for more details. It's also good to set
   117  reasonable, non-zero [DNS TTL values](/docs/guides/dns-cache.html#ttl) if your clients will
   118  respect them.
   119  
   120  * In other applications that perform high volumes of reads against Consul, consider using the
   121  [stale consistency mode](/api/index.html#consistency) to allow reads to scale across all the
   122  servers instead of being forwarded to the leader.
   123  
   124  * In Consul 0.9.3 and later, a new [`limits`](/docs/agent/options.html#limits) configuration is
   125  available on Consul clients to limit the RPC request rate they are allowed to make against the
   126  Consul servers. After hitting the limit, requests will start to return rate limit errors until
   127  time has passed and more requests are allowed. Configuring this across the cluster can help with
   128  enforcing a max desired application load level on the servers, and can help mitigate abusive
   129  applications.
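
For the `consul.raft.leader.lastContact` recommendation above, a check against collected telemetry might look like the following sketch. It makes several assumptions: the metrics payload is taken to have a `Samples` array of objects with `Name`, `Mean`, and `Max` fields, the `200` ms threshold is only an illustrative value, and `checkLastContact` is a hypothetical helper name.

```javascript
// Hypothetical helper: inspects an already-parsed metrics payload and reports
// whether the mean lastContact time stays under a chosen threshold.
// Assumption: the payload has a `Samples` array of { Name, Mean, Max } objects.
function checkLastContact(metrics, thresholdMs = 200) {
  const sample = (metrics.Samples || []).find(
    (s) => s.Name === "consul.raft.leader.lastContact"
  );
  if (!sample) {
    // The metric is only expected on the leader, so it may be absent.
    return { found: false, healthy: null };
  }
  return {
    found: true,
    meanMs: sample.Mean,
    maxMs: sample.Max,
    healthy: sample.Mean < thresholdMs,
  };
}

// Made-up sample data, for illustration only.
const exampleMetrics = {
  Samples: [{ Name: "consul.raft.leader.lastContact", Mean: 35.2, Max: 410.0 }],
};
console.log(checkLastContact(exampleMetrics));
```

A result that is repeatedly unhealthy would point toward either raising `raft_multiplier` or moving to more powerful servers, as described in the list above.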
   130  
   131  ## Memory Requirements
   132  
   133  Consul server agents operate on a working set of data comprised of key/value
   134  entries, the service catalog, prepared queries, access control lists, and
   135  sessions in memory. These data are persisted through Raft to disk in the form
   136  of a snapshot and log of changes since the previous snapshot for durability.
   137  
   138  When planning for memory requirements, you should typically allocate
   139  enough RAM for your server agents to contain between 2 and 4 times the working
   140  set size. You can determine the working set size by noting the value of
   141  `consul.runtime.alloc_bytes` in the [Telemetry data](/docs/agent/telemetry.html).
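
As a rough sketch of that rule of thumb, the following turns an observed `consul.runtime.alloc_bytes` value into a suggested RAM range. `recommendedRamRange` is a hypothetical helper and the 1.5 GiB working set is a made-up input.

```javascript
// Hypothetical helper: applies the 2x to 4x rule of thumb to a working set
// size given in bytes (for example, a consul.runtime.alloc_bytes reading).
function recommendedRamRange(allocBytes) {
  if (!Number.isFinite(allocBytes) || allocBytes < 0) {
    throw new RangeError("allocBytes must be a non-negative number");
  }
  return { minBytes: allocBytes * 2, maxBytes: allocBytes * 4 };
}

const GIB = 1024 ** 3;
// Made-up working set of 1.5 GiB.
const ramRange = recommendedRamRange(1.5 * GIB);
console.log(`${ramRange.minBytes / GIB} GiB to ${ramRange.maxBytes / GIB} GiB`);
```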
   142  
   143  > NOTE: Consul is not designed to serve as a general purpose database, and you
   144  > should keep this in mind when choosing what data are populated to the
   145  > key/value store.
   146  
   147  ## Read/Write Tuning
   148  
   149  Consul is write limited by disk I/O and read limited by CPU. Memory requirements will be dependent on the total size of KV pairs stored and should be sized according to that data (as should the hard drive storage). The limit on a key’s value size is `512KB`.
   150  
   153  For **write-heavy** workloads, the total RAM available for overhead should be approximately:
   154  
   155      RAM NEEDED = number of keys * average key size * 2-3x
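
The formula above can be worked through with concrete numbers. In this sketch, `estimateWriteHeavyRam` is a hypothetical helper, and the key count and average size are made-up inputs.

```javascript
// Hypothetical helper: RAM NEEDED = number of keys * average key size * 2-3x,
// returned as a low (2x) and high (3x) estimate in bytes.
function estimateWriteHeavyRam(numKeys, avgKeyBytes) {
  const baseBytes = numKeys * avgKeyBytes;
  return { lowBytes: baseBytes * 2, highBytes: baseBytes * 3 };
}

// Made-up example: one million keys averaging 1 KiB each.
const ramEstimate = estimateWriteHeavyRam(1000000, 1024);
console.log(ramEstimate.lowBytes, ramEstimate.highBytes);
```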
   156  
   157  Since writes must be synced to disk (persistent storage) on a quorum of servers before they are committed, deploying a disk with high write throughput (or an SSD) will enhance performance on the write side. ([Documentation](/docs/agent/options.html#\_data\_dir))
   158  
   159  For a **read-heavy** workload, configure all Consul server agents with the `allow_stale` DNS option, or query the API with the `stale` [consistency mode](/api/index.html#consistency-modes). By default, all queries made to the server are RPC forwarded to and serviced by the leader. With stale reads enabled, any server can respond to any query, thereby reducing load on the leader. A stale response is typically `100ms` or less behind what consistent mode would return, but stale reads drastically improve performance and reduce latency under high load.
   160  
   161  If the leader server is out of memory or the disk is full, the server eventually stops responding, loses its election and cannot move past its last commit time. However, by configuring `max_stale` and setting it to a large value, Consul will continue to respond to queries during such outage scenarios ([max_stale documentation](/docs/agent/options.html#max_stale)).
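
A minimal sketch of the DNS settings discussed above, assuming that `allow_stale` and `max_stale` are set under the `dns_config` block as described in the linked options documentation. The `max_stale` duration shown is only an illustrative "large value", not a recommendation.

```javascript
{
  "dns_config": {
    "allow_stale": true,
    "max_stale": "87600h"
  }
}
```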
   162  
   163  Note that `stale` is not appropriate for coordination where strong consistency is important (e.g. locking or application leader election). For critical cases, the optional `consistent` API query mode is required for true linearizability; the trade off is that this turns a read into a full quorum write, so it requires more resources and takes longer.
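
To make the choice between modes concrete, here is a sketch of how a client might select a consistency mode per request. `withConsistency` is a hypothetical helper, and the service name and key in the usage lines are made up; the sketch assumes the modes are passed as bare `stale` and `consistent` query parameters on the HTTP API.

```javascript
// Hypothetical helper: appends a consistency-mode flag to an HTTP API path.
// "default" adds nothing; "stale" and "consistent" are bare query parameters.
function withConsistency(path, mode = "default") {
  const allowed = ["default", "stale", "consistent"];
  if (!allowed.includes(mode)) {
    throw new Error(`unknown consistency mode: ${mode}`);
  }
  if (mode === "default") {
    return path;
  }
  const separator = path.includes("?") ? "&" : "?";
  return `${path}${separator}${mode}`;
}

// High-volume service discovery read: staleness is acceptable.
console.log(withConsistency("/v1/health/service/web", "stale"));
// Lock-related read: pay the extra cost for strong consistency.
console.log(withConsistency("/v1/kv/locks/job?raw", "consistent"));
```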
   164  
   165  **Read-heavy** clusters may take advantage of the [enhanced reading](/docs/enterprise/read-scale/index.html) feature (Enterprise) for better scalability. This feature allows additional servers to be introduced as non-voters. Being a non-voter, the server will still participate in data replication, but it will not block the leader from committing log entries.
   166  
   167  Consul’s agents use network sockets for communicating with the other nodes (gossip) and with the server agent. In addition, file descriptors are opened for watch handlers, health checks, and log files. For a **write-heavy** cluster, the `ulimit` size must be increased from the default value (`1024`) to prevent the leader from running out of file descriptors.
   168  
   169  To prevent any CPU spikes from a misconfigured client, RPC requests to the server should be [rate limited](/docs/agent/options.html#limits).
   170  
   171  ~> **NOTE** Rate limiting is configured on the client agent only.
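
A client-side rate limit might be configured along these lines. This is a sketch that assumes the `rpc_rate` and `rpc_max_burst` keys described in the linked `limits` documentation; the numbers are illustrative and should be chosen from your own load testing.

```javascript
{
  "limits": {
    "rpc_rate": 100,
    "rpc_max_burst": 1000
  }
}
```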
   172  
   173  In addition, two [performance indicators](/docs/agent/telemetry.html) &mdash; `consul.runtime.alloc_bytes` and `consul.runtime.heap_objects` &mdash; can help diagnose if the current sizing is not adequately meeting the load.
   174  
   175  ## Connect Certificate Signing CPU Limits
   176  
   177  If you enable [Connect](/docs/connect/index.html), the leader server will need
   178  to perform public key signing operations for every service instance in the
   179  cluster. Typically these operations are fast on modern hardware; however, when
   180  the CA is changed or its key is rotated, the leader will face an influx of
   181  requests for new certificates for every service instance running.
   182  
   183  While the client agents distribute these randomly over 30 seconds to avoid an
   184  immediate thundering herd, they don't have enough information to tune that
   185  period based on the number of certificates in use in the cluster, and picking a
   186  longer smearing period would make rotations artificially slow for small clusters.
   187  
   188  Smearing requests over 30s is sufficient to bring RPC load to a reasonable level
   189  in all but the very largest clusters, but the extra CPU load from cryptographic
   190  operations could impact the server's normal work. To limit that, Consul since
   191  1.4.1 exposes two ways to limit the impact certificate signing has on the leader:
   192  [`csr_max_per_second`](/docs/agent/options.html#ca_csr_max_per_second) and
   193  [`csr_max_concurrent`](/docs/agent/options.html#ca_csr_max_concurrent).
   194  
   195  By default we set a limit of 50 per second, which is reasonable on modest
   196  hardware but may be too low and impact rotation times if more than 1500 service
   197  instances are using Connect in the cluster. `csr_max_per_second` is likely best
   198  if you have fewer than four cores available since a whole core being used by
   199  signing is likely to impact the server stability if it's all or a large portion
   200  of the cores available. The downside is that you need to capacity plan: how many
   201  service instances will need Connect certificates? What CSR rate can your server
   202  tolerate without impacting stability? How fast do you want CA rotations to
   203  process?
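
The capacity planning questions above reduce to simple arithmetic. One way to read the 1500 figure is that 50 CSRs per second over the 30 second smearing window covers 50 x 30 = 1500 certificates; that reading is an inference from the numbers in this section rather than something stated outright. `minRotationSeconds` below is a hypothetical helper.

```javascript
// Hypothetical helper: lower bound on the time needed to re-sign every
// certificate when signing is capped at csrMaxPerSecond.
function minRotationSeconds(serviceInstances, csrMaxPerSecond) {
  if (csrMaxPerSecond <= 0) {
    // In Consul, 0 disables the rate limit, so no bound can be computed here.
    throw new RangeError("csrMaxPerSecond must be positive");
  }
  return serviceInstances / csrMaxPerSecond;
}

console.log(minRotationSeconds(1500, 50)); // 30: fits within the smearing window
console.log(minRotationSeconds(6000, 50)); // 120: the rate limit slows rotation
```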
   204  
   205  For larger production deployments, we generally recommend multiple CPU cores for
   206  servers to handle the normal workload. With four or more cores available, it's
   207  simpler to limit signing CPU impact with `csr_max_concurrent` rather than tune
   208  the rate limit. This effectively sets how many CPU cores can be monopolized by
   209  certificate signing work (although it doesn't pin that work to specific cores).
   210  In this case `csr_max_per_second` should be disabled (set to `0`).
   211  
   212  For example, if you have an 8 core server, setting `csr_max_concurrent` to `1`
   213  would allow you to process CSRs as fast as a single core can (which is likely
   214  sufficient for very large clusters), without consuming all available
   215  CPU cores and impacting normal server work or stability.
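
The 8 core example could be expressed roughly as follows. This sketch assumes both options are set under `connect.ca_config`, as the linked option anchors suggest; check the options documentation for your Consul version before relying on it.

```javascript
{
  "connect": {
    "enabled": true,
    "ca_config": {
      "csr_max_per_second": 0,
      "csr_max_concurrent": 1
    }
  }
}
```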