---
layout: docs
page_title: server Stanza - Agent Configuration
sidebar_title: server
description: |-
  The "server" stanza configures the Nomad agent to operate in server mode to
  participate in scheduling decisions, register with service discovery, handle
  join failures, and more.
---

# `server` Stanza

<Placement groups={['server']} />

The `server` stanza configures the Nomad agent to operate in server mode to
participate in scheduling decisions, register with service discovery, handle
join failures, and more.

```hcl
server {
  enabled          = true
  bootstrap_expect = 3
  server_join {
    retry_join     = [ "1.1.1.1", "2.2.2.2" ]
    retry_max      = 3
    retry_interval = "15s"
  }
}
```

## `server` Parameters

- `authoritative_region` `(string: "")` - Specifies the authoritative region, which
  provides a single source of truth for global configurations such as ACL Policies and
  global ACL tokens. Non-authoritative regions will replicate from the authoritative
  to act as a mirror. By default, the local region is assumed to be authoritative.

- `bootstrap_expect` `(int: required)` - Specifies the number of server nodes to
  wait for before bootstrapping. It is most common to use the odd-numbered
  integers `3` or `5` for this value, depending on the cluster size. A value of
  `1` does not provide any fault tolerance and is not recommended for production
  use cases.

- `data_dir` `(string: "[data_dir]/server")` - Specifies the directory to use
  for server-specific data, including the replicated log. By default, this is
  the top-level [data_dir](/docs/configuration#data_dir)
  suffixed with "server", like `"/opt/nomad/server"`. This must be an absolute
  path.

- `enabled` `(bool: false)` - Specifies if this agent should run in server mode.
  All other server options depend on this value being set.

- `enabled_schedulers` `(array<string>: [all])` - Specifies which sub-schedulers
  this server will handle. This can be used to restrict the evaluations that
  worker threads will dequeue for processing.

- `enable_event_broker` `(bool: true)` - Specifies if this server will generate
  events for its event stream.

- `encrypt` `(string: "")` - Specifies the secret key to use for encryption of
  Nomad server's gossip network traffic. This key must be 32 bytes that are
  [RFC4648] "URL and filename safe" base64-encoded. You can generate an
  appropriately-formatted key with the [`nomad operator keygen`] command. The
  provided key is automatically persisted to the data directory and loaded
  automatically whenever the agent is restarted. This means that to encrypt
  Nomad server's gossip protocol, this option only needs to be provided once
  on each agent's initial startup sequence. If it is provided after Nomad has
  been initialized with an encryption key, then the provided key is ignored
  and a warning will be displayed. See the [encryption
  documentation][encryption] for more details on this option and its impact on
  the cluster.

- `event_buffer_size` `(int: 100)` - Specifies the number of events generated
  by the server to be held in memory. Increasing this value gives new
  subscribers a larger look-back window when initially subscribing.
  Decreasing it lowers the amount of memory used for the event buffer.

- `node_gc_threshold` `(string: "24h")` - Specifies how long a node must be in a
  terminal state before it is garbage collected and purged from the system. This
  is specified using a label suffix like "30s" or "1h".

- `job_gc_interval` `(string: "5m")` - Specifies the interval between the job
  garbage collections. Only jobs that have been terminal for at least
  `job_gc_threshold` will be collected.
  Lowering the interval will perform more
  frequent but smaller collections. Raising the interval will perform collections
  less frequently but collect more jobs at a time. Reducing this interval is
  useful if there is a large throughput of tasks, leading to a large set of
  dead jobs. This is specified using a label suffix like "30s" or "3m". `job_gc_interval`
  was introduced in Nomad 0.10.0.

- `job_gc_threshold` `(string: "4h")` - Specifies the minimum time a job must be
  in the terminal state before it is eligible for garbage collection. This is
  specified using a label suffix like "30s" or "1h".

- `eval_gc_threshold` `(string: "1h")` - Specifies the minimum time an
  evaluation must be in the terminal state before it is eligible for garbage
  collection. This is specified using a label suffix like "30s" or "1h".

- `deployment_gc_threshold` `(string: "1h")` - Specifies the minimum time a
  deployment must be in the terminal state before it is eligible for garbage
  collection. This is specified using a label suffix like "30s" or "1h".

- `csi_volume_claim_gc_threshold` `(string: "1h")` - Specifies the minimum age of
  a CSI volume before it is eligible to have its claims garbage collected.
  This is specified using a label suffix like "30s" or "1h".

- `csi_plugin_gc_threshold` `(string: "1h")` - Specifies the minimum age of a
  CSI plugin before it is eligible for garbage collection if not in use.
  This is specified using a label suffix like "30s" or "1h".

- `default_scheduler_config` <code>([scheduler_configuration][update-scheduler-config]:
  nil)</code> - Specifies the initial default scheduler config when
  bootstrapping the cluster. The parameter is ignored once the cluster is bootstrapped or
  the value is updated through the [API endpoint][update-scheduler-config].
  See [the example section](#configuring-scheduler-config) for more details.
  `default_scheduler_config` was introduced in Nomad 0.10.4.

- `heartbeat_grace` `(string: "10s")` - Specifies the additional time given as a
  grace period beyond the heartbeat TTL of nodes to account for network and
  processing delays as well as clock skew. This is specified using a label
  suffix like "30s" or "1h".

- `min_heartbeat_ttl` `(string: "10s")` - Specifies the minimum time between
  node heartbeats. This is used as a floor to prevent excessive updates. This is
  specified using a label suffix like "30s" or "1h". Lowering the minimum TTL is
  a tradeoff: it lowers the failure detection time of nodes at the cost of
  more false positives and increased load on the leader.

- `max_heartbeats_per_second` `(float: 50.0)` - Specifies the maximum target
  rate of heartbeats being processed per second. This allows the TTL to be
  increased to meet the target rate. Increasing the maximum heartbeats per
  second is a tradeoff: it lowers the failure detection time of nodes at the
  cost of more false positives and increased load on the leader.

- `non_voting_server` `(bool: false)` - (Enterprise-only) Specifies whether
  this server will act as a non-voting member of the cluster to help provide
  read scalability.

- `num_schedulers` `(int: [num-cores])` - Specifies the number of parallel
  scheduler threads to run. This can be as many as one per core, or `0` to
  disallow this server from making any scheduling decisions. This defaults to
  the number of CPU cores.

- `protocol_version` `(int: 1)` - Specifies the Nomad protocol version to use
  when communicating with other Nomad servers. This value is typically not
  required as the agent internally knows the latest version, but may be useful
  in some upgrade scenarios.

- `raft_protocol` `(int: 2)` - Specifies the Raft protocol version to use when
  communicating with other Nomad servers. This affects available Autopilot
  features and is typically not required as the agent internally knows the
  latest version, but may be useful in some upgrade scenarios.

- `raft_multiplier` `(int: 1)` - An integer multiplier used by Nomad servers to
  scale key Raft timing parameters. Omitting this value or setting it to 0 uses
  default timing described below. Lower values are used to tighten timing and
  increase sensitivity while higher values relax timings and reduce sensitivity.
  Tuning this affects the time it takes Nomad to detect leader failures and to
  perform leader elections, at the expense of requiring more network and CPU
  resources for better performance. The maximum allowed value is 10.

  By default, Nomad will use the highest-performance timing, currently equivalent
  to setting this to a value of 1. Increasing the timings makes leader election
  less likely during periods of networking issues or resource starvation. Since
  leader elections pause Nomad's normal work, it may be beneficial for slow or
  unreliable networks to wait longer before electing a new leader. The tradeoff
  when raising this value is that during network partitions or other events
  (server crash) where a leader is lost, Nomad will not elect a new leader for
  a longer period of time than the default. The [`nomad.nomad.leader.barrier` and
  `nomad.raft.leader.lastContact` metrics](/docs/telemetry/metrics) are a good
  indicator of how often leader elections occur and raft latency.

- `redundancy_zone` `(string: "")` - (Enterprise-only) Specifies the redundancy
  zone that this server will be a part of for Autopilot management. For more
  information, see the [Autopilot Guide](https://learn.hashicorp.com/tutorials/nomad/autopilot).

- `rejoin_after_leave` `(bool: false)` - Specifies if Nomad will ignore a
  previous leave and attempt to rejoin the cluster when starting. By default,
  Nomad treats leave as a permanent intent and does not attempt to join the
  cluster again when starting. This flag allows the previous state to be used to
  rejoin the cluster.

- `server_join` <code>([server_join][server-join]: nil)</code> - Specifies
  how the Nomad server will connect to other Nomad servers. The `retry_join`
  fields may directly specify the server address or use go-discover syntax for
  auto-discovery. See the [server_join documentation][server-join] for more detail.

- `upgrade_version` `(string: "")` - A custom version of the format X.Y.Z to use
  in place of the Nomad version when custom upgrades are enabled in Autopilot.
  For more information, see the [Autopilot Guide](https://learn.hashicorp.com/tutorials/nomad/autopilot).

### Deprecated Parameters

- `retry_join` `(array<string>: [])` - Specifies a list of server addresses to
  retry joining if the first attempt fails. This is similar to
  [`start_join`](#start_join), but is only invoked if the initial join attempt
  fails. The list of addresses will be tried in the order specified, until one
  succeeds. After one succeeds, no further addresses will be contacted. This is
  useful for cases where we know the address will become available eventually.
  Use `retry_join` with an array as a replacement for `start_join`; **do not use
  both options**. See the [server_join][server-join]
  section for more information on the format of the string. This field is
  deprecated in favor of the [server_join stanza][server-join].

- `retry_interval` `(string: "30s")` - Specifies the time to wait between retry
  join attempts. This field is deprecated in favor of the [server_join
  stanza][server-join].

- `retry_max` `(int: 0)` - Specifies the maximum number of join attempts to be
  made before exiting with a return code of 1. By default, this is set to 0
  which is interpreted as infinite retries. This field is deprecated in favor of
  the [server_join stanza][server-join].

- `start_join` `(array<string>: [])` - Specifies a list of server addresses to
  join on startup. If Nomad is unable to join with any of the specified
  addresses, agent startup will fail. See the [server address
  format](/docs/configuration/server_join#server-address-format)
  section for more information on the format of the string. This field is
  deprecated in favor of the [server_join stanza][server-join].

## `server` Examples

### Common Setup

This example shows a common Nomad agent `server` configuration stanza. The two
IP addresses could also be DNS names, and should point to the other Nomad servers in
the cluster.

```hcl
server {
  enabled          = true
  bootstrap_expect = 3

  server_join {
    retry_join     = [ "1.1.1.1", "2.2.2.2" ]
    retry_max      = 3
    retry_interval = "15s"
  }
}
```

### Configuring Data Directory

This example shows configuring a custom data directory for the server data.

```hcl
server {
  data_dir = "/opt/nomad/server"
}
```

### Automatic Bootstrapping

The Nomad servers can automatically bootstrap if Consul is configured. For a
more detailed explanation, please see the
[automatic Nomad bootstrapping documentation](https://learn.hashicorp.com/tutorials/nomad/clustering).
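
As a rough sketch of the Consul-based approach, the agent configuration below
pairs a `server` stanza with a `consul` stanza. The `consul` stanza is not
documented on this page: the address is a placeholder that assumes a Consul
agent running locally, and `auto_advertise` and `server_auto_join` are written
out only for illustration, on the assumption that they match the stanza's
defaults. Consult the Consul stanza documentation before relying on these
names or values.

```hcl
# Sketch only: values are illustrative assumptions, not recommendations.
server {
  enabled          = true
  bootstrap_expect = 3
}

# Assumes a Consul agent is reachable locally on its default HTTP port.
consul {
  address          = "127.0.0.1:8500"
  auto_advertise   = true
  server_auto_join = true
}
```

With a configuration along these lines, no `server_join` addresses are listed;
the servers are expected to find each other through Consul instead.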

### Restricting Schedulers

This example shows restricting the schedulers that are enabled as well as the
maximum number of cores to utilize when participating in scheduling decisions:

```hcl
server {
  enabled            = true
  enabled_schedulers = ["batch", "service"]
  num_schedulers     = 7
}
```

### Bootstrapping with a Custom Scheduler Config ((#configuring-scheduler-config))

While [bootstrapping a cluster], you can use the `default_scheduler_config` stanza
to prime the cluster with a [`SchedulerConfig`][update-scheduler-config]. The
scheduler configuration determines which scheduling algorithm is configured
(spread scheduling or binpacking) and which job types are eligible for preemption.

~> **Warning:** Once the cluster is bootstrapped, you must configure this using
the [update scheduler configuration][update-scheduler-config] API. This
option is only consulted during bootstrap.

The structure matches the [Update Scheduler Config][update-scheduler-config] API
endpoint, which you should consult for canonical documentation. However, the
attribute names must be adapted to HCL syntax by using snake case
representations rather than camel case.

This example shows configuring spread scheduling and enabling preemption for all
job-type schedulers.

```hcl
server {
  default_scheduler_config {
    scheduler_algorithm = "spread"

    preemption_config {
      batch_scheduler_enabled   = true
      system_scheduler_enabled  = true
      service_scheduler_enabled = true
    }
  }
}
```

[encryption]: https://learn.hashicorp.com/tutorials/nomad/security-gossip-encryption 'Nomad Encryption Overview'
[server-join]: /docs/configuration/server_join 'Server Join'
[update-scheduler-config]: /api-docs/operator#update-scheduler-configuration 'Scheduler Config'
[bootstrapping a cluster]: /docs/faq#bootstrapping
[RFC4648]: https://tools.ietf.org/html/rfc4648#section-5
[`nomad operator keygen`]: /docs/commands/operator/keygen
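
### Tuning Garbage Collection

The garbage collection parameters described under `server` Parameters can be
combined in a single stanza. The following is a minimal sketch for a cluster
with a large throughput of short-lived jobs; the durations are illustrative
assumptions rather than tuned recommendations, and the defaults noted in the
comments are the ones listed in the parameter descriptions.

```hcl
server {
  enabled = true

  # Collect more often than the 5m default so that each pass handles a
  # smaller set of dead jobs.
  job_gc_interval = "1m"

  # Jobs become eligible once they have been terminal for at least this
  # long (default 4h).
  job_gc_threshold = "1h"

  # Evaluations and deployments default to 1h; shortened here purely for
  # illustration.
  eval_gc_threshold       = "30m"
  deployment_gc_threshold = "30m"
}
```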