github.com/iqoqo/nomad@v0.11.3-0.20200911112621-d7021c74d101/website/pages/docs/configuration/server.mdx

---
layout: docs
page_title: server Stanza - Agent Configuration
sidebar_title: server
description: |-
  The "server" stanza configures the Nomad agent to operate in server mode to
  participate in scheduling decisions, register with service discovery, handle
  join failures, and more.
---

# `server` Stanza

<Placement groups={['server']} />

The `server` stanza configures the Nomad agent to operate in server mode to
participate in scheduling decisions, register with service discovery, handle
join failures, and more.

```hcl
server {
  enabled          = true
  bootstrap_expect = 3

  server_join {
    retry_join     = [ "1.1.1.1", "2.2.2.2" ]
    retry_max      = 3
    retry_interval = "15s"
  }
}
```

## `server` Parameters

- `authoritative_region` `(string: "")` - Specifies the authoritative region, which
  provides a single source of truth for global configurations such as ACL policies and
  global ACL tokens. Non-authoritative regions replicate from the authoritative
  region to act as a mirror. By default, the local region is assumed to be authoritative.

- `bootstrap_expect` `(int: required)` - Specifies the number of server nodes to
  wait for before bootstrapping. It is most common to use the odd-numbered
  integers `3` or `5` for this value, depending on the cluster size. A value of
  `1` does not provide any fault tolerance and is not recommended for production
  use cases.

- `data_dir` `(string: "[data_dir]/server")` - Specifies the directory to use
  for server-specific data, including the replicated log. By default, this is
  the top-level [data_dir](/docs/configuration#data_dir)
  suffixed with "server", like `"/opt/nomad/server"`. This must be an absolute
  path.

- `enabled` `(bool: false)` - Specifies if this agent should run in server mode.
  All other server options depend on this value being set.

- `enabled_schedulers` `(array<string>: [all])` - Specifies which sub-schedulers
  this server will handle. This can be used to restrict the evaluations that
  worker threads will dequeue for processing.

- `encrypt` `(string: "")` - Specifies the secret key to use for encryption of
  Nomad server's gossip network traffic. This key must be 16 bytes that are
  base64-encoded. The provided key is automatically persisted to the data
  directory and loaded automatically whenever the agent is restarted. This means
  that to encrypt Nomad server's gossip protocol, this option only needs to be
  provided once on each agent's initial startup sequence. If it is provided
  after Nomad has been initialized with an encryption key, then the provided key
  is ignored and a warning will be displayed. See the
  [encryption documentation][encryption] for more details on this option
  and its impact on the cluster.

- `node_gc_threshold` `(string: "24h")` - Specifies how long a node must be in a
  terminal state before it is garbage collected and purged from the system. This
  is specified using a label suffix like "30s" or "1h".

- `job_gc_interval` `(string: "5m")` - Specifies the interval between job
  garbage collections. Only jobs that have been terminal for at least
  `job_gc_threshold` will be collected. Lowering the interval will perform more
  frequent but smaller collections. Raising the interval will perform collections
  less frequently but collect more jobs at a time. Reducing this interval is
  useful if there is a large throughput of tasks, leading to a large set of
  dead jobs. This is specified using a label suffix like "30s" or "3m". `job_gc_interval`
  was introduced in Nomad 0.10.0.

- `job_gc_threshold` `(string: "4h")` - Specifies the minimum time a job must be
  in the terminal state before it is eligible for garbage collection. This is
  specified using a label suffix like "30s" or "1h".

- `eval_gc_threshold` `(string: "1h")` - Specifies the minimum time an
  evaluation must be in the terminal state before it is eligible for garbage
  collection. This is specified using a label suffix like "30s" or "1h".

- `deployment_gc_threshold` `(string: "1h")` - Specifies the minimum time a
  deployment must be in the terminal state before it is eligible for garbage
  collection. This is specified using a label suffix like "30s" or "1h".

- `csi_volume_claim_gc_threshold` `(string: "1h")` - Specifies the minimum age of
  a CSI volume before it is eligible to have its claims garbage collected.
  This is specified using a label suffix like "30s" or "1h".

- `csi_plugin_gc_threshold` `(string: "1h")` - Specifies the minimum age of a
  CSI plugin before it is eligible for garbage collection if not in use.
  This is specified using a label suffix like "30s" or "1h".

- `default_scheduler_config` <code>([scheduler_configuration][update-scheduler-config]:
  nil)</code> - Specifies the initial default scheduler config when
  bootstrapping the cluster. The parameter is ignored once the cluster is bootstrapped or
  the value is updated through the [API endpoint][update-scheduler-config]. See [the
  example section](#configuring-scheduler-config) for more details.
  `default_scheduler_config` was introduced in Nomad 0.11.4.

- `heartbeat_grace` `(string: "10s")` - Specifies the additional time given as a
  grace period beyond the heartbeat TTL of nodes to account for network and
  processing delays as well as clock skew. This is specified using a label
  suffix like "30s" or "1h".

- `min_heartbeat_ttl` `(string: "10s")` - Specifies the minimum time between
  node heartbeats. This is used as a floor to prevent excessive updates. This is
  specified using a label suffix like "30s" or "1h". Lowering the minimum TTL
  shortens the failure detection time of nodes at the cost of more false
  positives and increased load on the leader.

- `max_heartbeats_per_second` `(float: 50.0)` - Specifies the maximum target
  rate of heartbeats being processed per second. This allows the TTL to be
  increased to meet the target rate. Increasing the maximum heartbeats per
  second shortens the failure detection time of nodes at the cost of more
  false positives and increased load on the leader.

- `non_voting_server` `(bool: false)` - (Enterprise-only) Specifies whether
  this server will act as a non-voting member of the cluster to help provide
  read scalability.

- `num_schedulers` `(int: [num-cores])` - Specifies the number of parallel
  scheduler threads to run. This can be as many as one per core, or `0` to
  disallow this server from making any scheduling decisions. This defaults to
  the number of CPU cores.

- `protocol_version` `(int: 1)` - Specifies the Nomad protocol version to use
  when communicating with other Nomad servers. This value is typically not
  required as the agent internally knows the latest version, but may be useful
  in some upgrade scenarios.

- `raft_protocol` `(int: 2)` - Specifies the Raft protocol version to use when
  communicating with other Nomad servers. This affects available Autopilot
  features and is typically not required as the agent internally knows the
  latest version, but may be useful in some upgrade scenarios.

- `redundancy_zone` `(string: "")` - (Enterprise-only) Specifies the redundancy
  zone that this server will be a part of for Autopilot management. For more
  information, see the [Autopilot Guide](https://learn.hashicorp.com/nomad/operating-nomad/autopilot).

- `rejoin_after_leave` `(bool: false)` - Specifies if Nomad will ignore a
  previous leave and attempt to rejoin the cluster when starting. By default,
  Nomad treats leave as a permanent intent and does not attempt to join the
  cluster again when starting. This flag allows the previous state to be used to
  rejoin the cluster.

- `server_join` <code>([server_join][server-join]: nil)</code> - Specifies
  how the Nomad server will connect to other Nomad servers. The `retry_join`
  fields may directly specify the server address or use go-discover syntax for
  auto-discovery. See the [server_join documentation][server-join] for more detail.

- `upgrade_version` `(string: "")` - A custom version of the format X.Y.Z to use
  in place of the Nomad version when custom upgrades are enabled in Autopilot.
  For more information, see the [Autopilot Guide](https://learn.hashicorp.com/nomad/operating-nomad/autopilot).
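
As a rough sketch of how the garbage-collection and heartbeat parameters above
fit together, the following stanza shortens the job GC interval for a cluster
that churns through many short-lived jobs and allows extra heartbeat slack on a
slower network. The durations are illustrative assumptions, not recommended
values.

```hcl
server {
  enabled          = true
  bootstrap_expect = 3

  # Run job garbage collection more often than the default ("5m") so that
  # each pass handles a smaller set of dead jobs.
  job_gc_interval = "2m"

  # Retain terminal jobs, evaluations, and deployments for a shorter window
  # than the defaults ("4h", "1h", "1h"). Illustrative values only.
  job_gc_threshold        = "1h"
  eval_gc_threshold       = "30m"
  deployment_gc_threshold = "30m"

  # Grant more grace beyond the heartbeat TTL than the default ("10s") to
  # absorb network and processing delays.
  heartbeat_grace = "30s"
}
```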

### Deprecated Parameters

- `retry_join` `(array<string>: [])` - Specifies a list of server addresses to
  retry joining if the first attempt fails. This is similar to
  [`start_join`](#start_join), but only invokes if the initial join attempt
  fails. The list of addresses will be tried in the order specified, until one
  succeeds. After one succeeds, no further addresses will be contacted. This is
  useful for cases where the address is known to become available eventually.
  Use `retry_join` with an array as a replacement for `start_join`; **do not use
  both options**. See the [server_join][server-join]
  section for more information on the format of the string. This field is
  deprecated in favor of the [server_join stanza][server-join].

- `retry_interval` `(string: "30s")` - Specifies the time to wait between retry
  join attempts. This field is deprecated in favor of the [server_join
  stanza][server-join].

- `retry_max` `(int: 0)` - Specifies the maximum number of join attempts to be
  made before exiting with a return code of 1. By default, this is set to 0
  which is interpreted as infinite retries. This field is deprecated in favor of
  the [server_join stanza][server-join].

- `start_join` `(array<string>: [])` - Specifies a list of server addresses to
  join on startup. If Nomad is unable to join with any of the specified
  addresses, agent startup will fail. See the [server address
  format](/docs/configuration/server_join#server-address-format)
  section for more information on the format of the string. This field is
  deprecated in favor of the [server_join stanza][server-join].
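
To illustrate the migration away from the deprecated fields, the join settings
move from the top level of the `server` stanza into a nested `server_join`
stanza, roughly as sketched here. The addresses are placeholders chosen for
this example; the [server_join documentation][server-join] remains the
authoritative reference for the available fields.

```hcl
# Deprecated form: join settings placed directly in the server stanza.
# server {
#   enabled        = true
#   retry_join     = ["10.0.0.10", "10.0.0.11"]
#   retry_interval = "15s"
#   retry_max      = 3
# }

# Preferred form: the same settings nested under server_join.
server {
  enabled = true

  server_join {
    retry_join     = ["10.0.0.10", "10.0.0.11"] # placeholder addresses
    retry_interval = "15s"
    retry_max      = 3
  }
}
```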

## `server` Examples

### Common Setup

This example shows a common Nomad agent `server` configuration stanza. The two
IP addresses could also be DNS names, and should point to the other Nomad
servers in the cluster:

```hcl
server {
  enabled          = true
  bootstrap_expect = 3

  server_join {
    retry_join     = [ "1.1.1.1", "2.2.2.2" ]
    retry_max      = 3
    retry_interval = "15s"
  }
}
```
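
The common setup can be extended with gossip encryption through the `encrypt`
parameter described earlier. The following is a minimal sketch, assuming a key
generated ahead of time (for example with `nomad operator keygen`). The key
shown is a placeholder for illustration only and must not be reused.

```hcl
server {
  enabled          = true
  bootstrap_expect = 3

  # Placeholder 16-byte, base64-encoded key. Replace it with a freshly
  # generated key before use.
  encrypt = "cg8StVXbQJ0gPvMd9o7yrg=="

  server_join {
    retry_join = [ "1.1.1.1", "2.2.2.2" ]
  }
}
```

Because the key is persisted to the data directory on first startup, it only
needs to be supplied the first time each agent starts.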

### Configuring Data Directory

This example shows configuring a custom data directory for the server data.

```hcl
server {
  data_dir = "/opt/nomad/server"
}
```

### Automatic Bootstrapping

The Nomad servers can automatically bootstrap if Consul is configured. For a
more detailed explanation, please see the
[automatic Nomad bootstrapping documentation](https://learn.hashicorp.com/nomad/operating-nomad/clustering).
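
As a hedged sketch of what such an agent configuration might look like, the
example below pairs the `server` stanza with the agent's `consul` stanza. The
`consul` stanza and its `address` and `server_auto_join` fields are not
described on this page, and the address assumes a Consul agent running on the
same host, so treat this as an illustration rather than a complete setup.

```hcl
server {
  enabled          = true
  bootstrap_expect = 3
}

# Assumption: a local Consul agent is reachable at this address.
consul {
  address          = "127.0.0.1:8500"
  server_auto_join = true
}
```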

### Restricting Schedulers

This example shows restricting the schedulers that are enabled as well as the
maximum number of cores to utilize when participating in scheduling decisions:

```hcl
server {
  enabled            = true
  enabled_schedulers = ["batch", "service"]
  num_schedulers     = 7
}
```
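
Conversely, per the `num_schedulers` description, a value of `0` keeps a server
in the cluster while preventing it from making any scheduling decisions. A
minimal sketch of such a server:

```hcl
server {
  enabled = true

  # Run no scheduler threads on this server; the remaining servers
  # handle all scheduling decisions.
  num_schedulers = 0
}
```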

### Configuring Scheduler Config

This example shows enabling preemption for all schedulers.

```hcl
server {
  default_scheduler_config {
    scheduler_algorithm = "binpack"

    preemption_config {
      batch_scheduler_enabled   = true
      system_scheduler_enabled  = true
      service_scheduler_enabled = true
    }
  }
}
```

The structure matches the [Update Scheduler Config][update-scheduler-config] endpoint,
but adapted to HCL syntax (namely using snake case rather than camel case).

Nomad servers check their `default_scheduler_config` only during cluster
bootstrap. During upgrades, if a previously bootstrapped cluster already set
scheduler configuration via the [Update Scheduler Config][update-scheduler-config]
endpoint, that configuration is always preferred.

[encryption]: https://learn.hashicorp.com/nomad/transport-security/gossip-encryption 'Nomad Encryption Overview'
[server-join]: /docs/configuration/server_join 'Server Join'
[update-scheduler-config]: /api-docs/operator#update-scheduler-configuration 'Scheduler Config'