github.com/Ilhicas/nomad@v1.0.4-0.20210304152020-e86851182bc3/website/content/docs/configuration/server.mdx

---
layout: docs
page_title: server Stanza - Agent Configuration
sidebar_title: server
description: |-
  The "server" stanza configures the Nomad agent to operate in server mode to
  participate in scheduling decisions, register with service discovery, handle
  join failures, and more.
---

# `server` Stanza

<Placement groups={['server']} />

The `server` stanza configures the Nomad agent to operate in server mode to
participate in scheduling decisions, register with service discovery, handle
join failures, and more.

```hcl
server {
  enabled          = true
  bootstrap_expect = 3

  server_join {
    retry_join     = [ "1.1.1.1", "2.2.2.2" ]
    retry_max      = 3
    retry_interval = "15s"
  }
}
```

## `server` Parameters

- `authoritative_region` `(string: "")` - Specifies the authoritative region, which
  provides a single source of truth for global configurations such as ACL Policies and
  global ACL tokens. Non-authoritative regions will replicate from the authoritative
  region to act as a mirror. By default, the local region is assumed to be authoritative.

- `bootstrap_expect` `(int: required)` - Specifies the number of server nodes to
  wait for before bootstrapping. It is most common to use the odd-numbered
  integers `3` or `5` for this value, depending on the cluster size. A value of
  `1` does not provide any fault tolerance and is not recommended for production
  use cases.

- `data_dir` `(string: "[data_dir]/server")` - Specifies the directory to use
  for server-specific data, including the replicated log. By default, this is
  the top-level [data_dir](/docs/configuration#data_dir)
  suffixed with "server", like `"/opt/nomad/server"`. This must be an absolute
  path.

- `enabled` `(bool: false)` - Specifies if this agent should run in server mode.
  All other server options depend on this value being set.

- `enabled_schedulers` `(array<string>: [all])` - Specifies which sub-schedulers
  this server will handle. This can be used to restrict the evaluations that
  worker threads will dequeue for processing.

- `enable_event_broker` `(bool: true)` - Specifies if this server will generate
  events for its event stream.

- `encrypt` `(string: "")` - Specifies the secret key to use for encryption of
  Nomad server's gossip network traffic. This key must be 32 bytes that are
  [RFC4648] "URL and filename safe" base64-encoded. You can generate an
  appropriately-formatted key with the [`nomad operator keygen`] command. The
  provided key is automatically persisted to the data directory and loaded
  automatically whenever the agent is restarted. This means that to encrypt
  Nomad server's gossip protocol, this option only needs to be provided once
  on each agent's initial startup sequence. If it is provided after Nomad has
  been initialized with an encryption key, then the provided key is ignored
  and a warning will be displayed. See the [encryption
  documentation][encryption] for more details on this option and its impact on
  the cluster.

- `event_buffer_size` `(int: 100)` - Specifies the number of events generated
  by the server to be held in memory. Increasing this value enables new
  subscribers to have a larger look back window when initially subscribing.
  Decreasing will lower the amount of memory used for the event buffer.

- `node_gc_threshold` `(string: "24h")` - Specifies how long a node must be in a
  terminal state before it is garbage collected and purged from the system. This
  is specified using a label suffix like "30s" or "1h".

- `job_gc_interval` `(string: "5m")` - Specifies the interval between the job
  garbage collections. Only jobs that have been terminal for at least
  `job_gc_threshold` will be collected. Lowering the interval will perform more
  frequent but smaller collections. Raising the interval will perform collections
  less frequently but collect more jobs at a time. Reducing this interval is
  useful if there is a large throughput of tasks, leading to a large set of
  dead jobs. This is specified using a label suffix like "30s" or "3m". `job_gc_interval`
  was introduced in Nomad 0.10.0.

- `job_gc_threshold` `(string: "4h")` - Specifies the minimum time a job must be
  in the terminal state before it is eligible for garbage collection. This is
  specified using a label suffix like "30s" or "1h".

- `eval_gc_threshold` `(string: "1h")` - Specifies the minimum time an
  evaluation must be in the terminal state before it is eligible for garbage
  collection. This is specified using a label suffix like "30s" or "1h".

- `deployment_gc_threshold` `(string: "1h")` - Specifies the minimum time a
  deployment must be in the terminal state before it is eligible for garbage
  collection. This is specified using a label suffix like "30s" or "1h".

- `csi_volume_claim_gc_threshold` `(string: "1h")` - Specifies the minimum age of
  a CSI volume before it is eligible to have its claims garbage collected.
  This is specified using a label suffix like "30s" or "1h".

- `csi_plugin_gc_threshold` `(string: "1h")` - Specifies the minimum age of a
  CSI plugin before it is eligible for garbage collection if not in use.
  This is specified using a label suffix like "30s" or "1h".

- `default_scheduler_config` <code>([scheduler_configuration][update-scheduler-config]:
  nil)</code> - Specifies the initial default scheduler config when
  bootstrapping a cluster. The parameter is ignored once the cluster is
  bootstrapped or the value is updated through the [API
  endpoint][update-scheduler-config]. See [the example
  section](#configuring-scheduler-config) for more details.
  `default_scheduler_config` was introduced in Nomad 0.10.4.

- `heartbeat_grace` `(string: "10s")` - Specifies the additional time given as a
  grace period beyond the heartbeat TTL of nodes to account for network and
  processing delays as well as clock skew. This is specified using a label
  suffix like "30s" or "1h".

- `min_heartbeat_ttl` `(string: "10s")` - Specifies the minimum time between
  node heartbeats. This is used as a floor to prevent excessive updates. This is
  specified using a label suffix like "30s" or "1h". Lowering the minimum TTL
  shortens the failure detection time of nodes at the cost of more false
  positives and increased load on the leader.

- `max_heartbeats_per_second` `(float: 50.0)` - Specifies the maximum target
  rate of heartbeats being processed per second. This allows the TTL to be
  increased to meet the target rate. Increasing the maximum heartbeats per
  second shortens the failure detection time of nodes at the cost of more
  false positives and increased load on the leader.

- `non_voting_server` `(bool: false)` - (Enterprise-only) Specifies whether
  this server will act as a non-voting member of the cluster to help provide
  read scalability.

- `num_schedulers` `(int: [num-cores])` - Specifies the number of parallel
  scheduler threads to run. This can be as many as one per core, or `0` to
  disallow this server from making any scheduling decisions. This defaults to
  the number of CPU cores.

- `protocol_version` `(int: 1)` - Specifies the Nomad protocol version to use
  when communicating with other Nomad servers. This value is typically not
  required as the agent internally knows the latest version, but may be useful
  in some upgrade scenarios.

- `raft_protocol` `(int: 2)` - Specifies the Raft protocol version to use when
  communicating with other Nomad servers. This affects available Autopilot
  features and is typically not required as the agent internally knows the
  latest version, but may be useful in some upgrade scenarios.

- `raft_multiplier` `(int: 1)` - An integer multiplier used by Nomad servers to
  scale key Raft timing parameters. Omitting this value or setting it to 0 uses
  the default timing described below. Lower values are used to tighten timing and
  increase sensitivity while higher values relax timings and reduce sensitivity.
  Tuning this affects the time it takes Nomad to detect leader failures and to
  perform leader elections, at the expense of requiring more network and CPU
  resources for better performance. The maximum allowed value is 10.

  By default, Nomad will use the highest-performance timing, currently equivalent
  to setting this to a value of 1. Increasing the timings makes leader election
  less likely during periods of networking issues or resource starvation. Since
  leader elections pause Nomad's normal work, it may be beneficial for slow or
  unreliable networks to wait longer before electing a new leader. The tradeoff
  when raising this value is that during network partitions or other events
  (such as a server crash) where a leader is lost, Nomad will not elect a new
  leader for a longer period of time than the default. The
  [`nomad.nomad.leader.barrier` and `nomad.raft.leader.lastContact`
  metrics](/docs/telemetry/metrics) are good indicators of how often leader
  elections occur and of Raft latency.

- `redundancy_zone` `(string: "")` - (Enterprise-only) Specifies the redundancy
  zone that this server will be a part of for Autopilot management. For more
  information, see the [Autopilot Guide](https://learn.hashicorp.com/tutorials/nomad/autopilot).

- `rejoin_after_leave` `(bool: false)` - Specifies if Nomad will ignore a
  previous leave and attempt to rejoin the cluster when starting. By default,
  Nomad treats leave as a permanent intent and does not attempt to join the
  cluster again when starting. This flag allows the previous state to be used to
  rejoin the cluster.

- `server_join` <code>([server_join][server-join]: nil)</code> - Specifies
  how the Nomad server will connect to other Nomad servers. The `retry_join`
  fields may directly specify the server address or use go-discover syntax for
  auto-discovery. See the [server_join documentation][server-join] for more detail.

- `upgrade_version` `(string: "")` - A custom version of the format X.Y.Z to use
  in place of the Nomad version when custom upgrades are enabled in Autopilot.
  For more information, see the [Autopilot Guide](https://learn.hashicorp.com/tutorials/nomad/autopilot).

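To illustrate how the garbage collection parameters above combine, the
following sketch shortens the job GC interval and thresholds for a cluster with
a high throughput of short-lived jobs. The durations are assumptions chosen for
illustration, not recommended values.

```hcl
server {
  enabled = true

  # Run job garbage collection more often, in smaller batches (default "5m").
  job_gc_interval = "1m"

  # Jobs become eligible for collection after 1 hour in a terminal state
  # (default "4h").
  job_gc_threshold = "1h"

  # Illustrative values; evaluations and deployments both default to "1h".
  eval_gc_threshold       = "30m"
  deployment_gc_threshold = "30m"
}
```

Shorter thresholds reduce how much terminal state the servers retain, at the
cost of a shorter window in which finished work can still be inspected.
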
### Deprecated Parameters

- `retry_join` `(array<string>: [])` - Specifies a list of server addresses to
  retry joining if the first attempt fails. This is similar to
  [`start_join`](#start_join), but only invokes if the initial join attempt
  fails. The list of addresses will be tried in the order specified, until one
  succeeds. After one succeeds, no further addresses will be contacted. This is
  useful for cases where the address is known to become available eventually.
  Use `retry_join` with an array as a replacement for `start_join`; **do not use
  both options**. See the [server_join][server-join]
  section for more information on the format of the string. This field is
  deprecated in favor of the [server_join stanza][server-join].

- `retry_interval` `(string: "30s")` - Specifies the time to wait between retry
  join attempts. This field is deprecated in favor of the [server_join
  stanza][server-join].

- `retry_max` `(int: 0)` - Specifies the maximum number of join attempts to be
  made before exiting with a return code of 1. By default, this is set to 0
  which is interpreted as infinite retries. This field is deprecated in favor of
  the [server_join stanza][server-join].

- `start_join` `(array<string>: [])` - Specifies a list of server addresses to
  join on startup. If Nomad is unable to join with any of the specified
  addresses, agent startup will fail. See the [server address
  format](/docs/configuration/server_join#server-address-format)
  section for more information on the format of the string. This field is
  deprecated in favor of the [server_join stanza][server-join].

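Because these fields are deprecated in favor of `server_join`, a configuration
that sets them directly inside `server` can be moved into a nested
`server_join` stanza. The sketch below assumes the same addresses and retry
settings used in the example at the top of this page; consult the
[server_join documentation][server-join] for the authoritative list of
supported fields.

```hcl
# Deprecated form: join settings directly inside `server`.
# server {
#   enabled        = true
#   retry_join     = [ "1.1.1.1", "2.2.2.2" ]
#   retry_max      = 3
#   retry_interval = "15s"
# }

# Preferred form: the same settings nested under `server_join`.
server {
  enabled = true

  server_join {
    retry_join     = [ "1.1.1.1", "2.2.2.2" ]
    retry_max      = 3
    retry_interval = "15s"
  }
}
```
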
## `server` Examples

### Common Setup

This example shows a common Nomad agent `server` configuration stanza. The two
IP addresses could also be DNS names, and should point to the other Nomad
servers in the cluster.

```hcl
server {
  enabled          = true
  bootstrap_expect = 3

  server_join {
    retry_join     = [ "1.1.1.1", "2.2.2.2" ]
    retry_max      = 3
    retry_interval = "15s"
  }
}
```

### Configuring Data Directory

This example shows configuring a custom data directory for the server data.

```hcl
server {
  data_dir = "/opt/nomad/server"
}
```

### Automatic Bootstrapping

The Nomad servers can automatically bootstrap if Consul is configured. For a
more detailed explanation, please see the
[automatic Nomad bootstrapping documentation](https://learn.hashicorp.com/tutorials/nomad/clustering).

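As a rough sketch of what this can look like, the agent configuration below
pairs a `server` stanza with a `consul` stanza. The `consul` stanza and its
`address` and `server_auto_join` fields are documented separately and are not
described on this page; the address shown assumes a Consul agent running
locally on its default HTTP port.

```hcl
server {
  enabled          = true
  bootstrap_expect = 3
}

# Assumption: a local Consul agent listening on its default HTTP port.
consul {
  address          = "127.0.0.1:8500"
  server_auto_join = true
}
```

No `server_join` stanza appears here because, in this setup, the servers are
expected to find one another through Consul.
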
### Restricting Schedulers

This example shows restricting the schedulers that are enabled as well as the
maximum number of cores to utilize when participating in scheduling decisions:

```hcl
server {
  enabled            = true
  enabled_schedulers = ["batch", "service"]
  num_schedulers     = 7
}
```

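The `num_schedulers` parameter description also notes that a value of `0`
disallows a server from making any scheduling decisions. A minimal sketch of
such a server:

```hcl
server {
  enabled = true

  # 0 scheduler threads: this server makes no scheduling decisions.
  num_schedulers = 0
}
```
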
### Bootstrapping with a Custom Scheduler Config ((#configuring-scheduler-config))

While [bootstrapping a cluster], you can use the `default_scheduler_config` stanza
to prime the cluster with a [`SchedulerConfig`][update-scheduler-config]. The
scheduler configuration determines which scheduling algorithm is configured
(spread scheduling or binpacking) and which job types are eligible for preemption.

~> **Warning:** Once the cluster is bootstrapped, you must configure this using
the [update scheduler configuration][update-scheduler-config] API. This
option is only consulted during bootstrap.

The structure matches the [Update Scheduler Config][update-scheduler-config] API
endpoint, which you should consult for canonical documentation. However, the
attribute names must be adapted to HCL syntax by using snake case
representations rather than camel case.

This example shows configuring spread scheduling and enabling preemption for all
job-type schedulers.

```hcl
server {
  default_scheduler_config {
    scheduler_algorithm = "spread"

    preemption_config {
      batch_scheduler_enabled   = true
      system_scheduler_enabled  = true
      service_scheduler_enabled = true
    }
  }
}
```

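### Tuning for Slow or Unreliable Networks

This example sketches how the Raft and heartbeat parameters described above
might be relaxed for servers on a slow or unreliable network. The values are
illustrative assumptions rather than recommendations; note that
`raft_multiplier` may not exceed 10.

```hcl
server {
  enabled = true

  # Relax Raft timing (default 1, maximum 10) so that brief network
  # problems are less likely to trigger a leader election.
  raft_multiplier = 5

  # Allow nodes more time beyond their heartbeat TTL (default "10s")
  # to account for network and processing delays.
  heartbeat_grace = "30s"
}
```

The tradeoff is slower detection of genuine failures: a lost leader takes
longer to be replaced, and a failed node takes longer to be noticed.
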
[encryption]: https://learn.hashicorp.com/tutorials/nomad/security-gossip-encryption 'Nomad Encryption Overview'
[server-join]: /docs/configuration/server_join 'Server Join'
[update-scheduler-config]: /api-docs/operator#update-scheduler-configuration 'Scheduler Config'
[bootstrapping a cluster]: /docs/faq#bootstrapping
[RFC4648]: https://tools.ietf.org/html/rfc4648#section-5
[`nomad operator keygen`]: /docs/commands/operator/keygen