
---
layout: docs
page_title: Requirements
description: |-
  Learn about Nomad client and server requirements such as memory and CPU
  recommendations, network topologies, and more.
---

# Requirements

## Resources (RAM, CPU, etc.)

**Nomad servers** may need to be run on large machine instances. We suggest
having between 4-8+ cores, 16-32 GB+ of memory, 40-80 GB+ of **fast** disk, and
significant network bandwidth. The core count and network recommendations
ensure high throughput, as Nomad relies heavily on network communication, with
the servers managing all the nodes in the region and performing scheduling.
The memory and disk requirements stem from the fact that Nomad stores all
state in memory and stores two snapshots of this data on disk, which causes
high I/O in busy clusters with many writes. Disk should therefore be at least
two times the memory available to the server when deploying a high-load
cluster. When running on AWS, prefer NVMe or Provisioned IOPS SSD storage for
the data directory.

These recommendations are guidelines, and operators should always monitor
Nomad's resource usage to determine whether the machines are under- or
over-sized.

**Nomad clients** support reserving resources on the node that should not be
used by Nomad. This should be used to target a specific resource utilization
per node and to reserve resources for applications running outside of Nomad's
supervision, such as Consul and the operating system itself.

Please see the [reservation configuration](/docs/configuration/client#reserved) for
more detail.

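For illustration, a client configuration that reserves resources for the
operating system and other unmanaged processes might look like the following
sketch (the values are placeholders, not recommendations):

```hcl
client {
  enabled = true

  # Resources set aside for processes outside Nomad's supervision,
  # such as Consul and the operating system itself.
  reserved {
    cpu            = 500  # MHz
    memory         = 512  # MB
    disk           = 1024 # MB
    reserved_ports = "22" # ports Nomad may not allocate to tasks
  }
}
```
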
## Network Topology

**Nomad servers** are expected to have sub-10 millisecond network latencies
between each other to ensure liveness and high-throughput scheduling. Nomad
servers can be spread across multiple datacenters if they have low-latency
connections between them to achieve high availability.

For example, on AWS every region comprises multiple zones which have very low
latency links between them, so every zone can be modeled as a Nomad datacenter
and every zone can have a single Nomad server which could be connected to form
a quorum and a region.

Nomad servers use Raft for state replication, and since Raft is highly
consistent it needs a quorum of servers to function. We therefore recommend
running an odd number of Nomad servers in a region, usually 3-5. A cluster of
three servers can withstand the failure of one server, and a cluster of five
servers can withstand two failures. Adding more servers to the quorum adds
more time to replicate state and hence decreases throughput, so we don't
recommend having more than seven servers in a region.

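As a rule of thumb, a cluster of N servers needs a quorum of (N / 2) + 1
servers (rounding the division down) to commit writes, which gives the
following failure tolerances for common deployment sizes:

| Servers | Quorum size | Failure tolerance |
| ------- | ----------- | ----------------- |
| 3       | 2           | 1                 |
| 5       | 3           | 2                 |
| 7       | 4           | 3                 |
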
**Nomad clients** do not have the same latency requirements as servers since
they are not participating in Raft. Thus clients can have 100+ millisecond
latency to their servers. This allows having a set of Nomad servers that
service clients spread geographically over a continent, or even the world in
the case of a single "global" region with many datacenters.

## Ports Used

Nomad requires 3 different ports to work properly on servers and 2 on clients,
some on TCP, UDP, or both protocols. Below we document the requirements for
each port. If you use a firewall of any type, you must ensure that it is
configured to allow the following traffic (a sample firewall sketch follows
the list).

- HTTP API (Default 4646). This is used by clients and servers to serve the
  HTTP API. TCP only.

- RPC (Default 4647). This is used for internal RPC communication between
  client agents and servers, and for inter-server traffic. TCP only.

- Serf WAN (Default 4648). This is used by servers to gossip both over the LAN
  and WAN to other servers. It isn't required that Nomad clients can reach
  this address. TCP and UDP.

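For illustration only, the following `firewalld` commands open these ports on
a server node; adapt them to your firewall and topology (clients do not need
port 4648):

```shell-session
$ # Open Nomad's default ports on a server node (firewalld example).
$ sudo firewall-cmd --permanent --add-port=4646/tcp   # HTTP API
$ sudo firewall-cmd --permanent --add-port=4647/tcp   # RPC
$ sudo firewall-cmd --permanent --add-port=4648/tcp   # Serf gossip
$ sudo firewall-cmd --permanent --add-port=4648/udp   # Serf gossip
$ sudo firewall-cmd --reload
```
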
When tasks ask for dynamic ports, they are allocated out of the port range
between 20,000 and 32,000. This is well under the ephemeral port range
suggested by the [IANA](https://en.wikipedia.org/wiki/Ephemeral_port). If your
operating system's default ephemeral port range overlaps with Nomad's dynamic
port range, you should tune the OS to avoid this overlap.

On Linux this can be checked and set as follows:

```shell-session
$ cat /proc/sys/net/ipv4/ip_local_port_range
32768   60999
$ echo "49152 65535" > /proc/sys/net/ipv4/ip_local_port_range
```

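Writes to `/proc` do not survive a reboot. To persist the setting, a sysctl
drop-in can be used; this is a sketch, and the file name is only an example:

```shell-session
$ # Persist the ephemeral port range across reboots.
$ echo "net.ipv4.ip_local_port_range = 49152 65535" | sudo tee /etc/sysctl.d/90-ephemeral-ports.conf
$ sudo sysctl --system
```
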
## Bridge Networking and `iptables`

Nomad's task group networks and Consul Connect integration use bridge
networking and iptables to send traffic between containers. The Linux kernel
bridge module has three "tunables" that control whether traffic crossing the
bridge is processed by iptables. Some operating systems (Red Hat, CentOS, and
Fedora in particular) configure these tunables to optimize for VM workloads
where iptables rules might not be correctly configured for guest traffic.

These tunables can be set to allow iptables processing for the bridge network
as follows:

```shell-session
$ echo 1 > /proc/sys/net/bridge/bridge-nf-call-arptables
$ echo 1 > /proc/sys/net/bridge/bridge-nf-call-ip6tables
$ echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables
```

To preserve these settings on startup of a client node, add a file including
the following to `/etc/sysctl.d/`, or remove the file your Linux distribution
puts in that directory.

```text
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
```

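These tunables only exist while the kernel's `br_netfilter` module is loaded.
On distributions that do not load it at boot, you may need to load it before
the settings above can apply; this is a sketch, and the file name is an
example:

```shell-session
$ # Load the br_netfilter module now and on every boot.
$ sudo modprobe br_netfilter
$ echo br_netfilter | sudo tee /etc/modules-load.d/br_netfilter.conf
```
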
## Hardening Nomad

As noted in the [Security Model][] guide, Nomad is not **secure-by-default**.

### User Permissions

Nomad servers and Nomad clients have different requirements for permissions.

Nomad servers should be run with the lowest possible permissions. They need
access to their own data directory and the ability to bind to their ports. You
should create a `nomad` user with the minimal set of required privileges. If
you are installing Nomad from the official Linux packages, the systemd unit
file runs Nomad as `root`. For your server nodes you should change this to a
minimally privileged `nomad` user. See the [production deployment guide][] for
details.

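A minimal sketch of that change, assuming the official package layout of
`/etc/nomad.d` for configuration and `/opt/nomad` for data; `systemctl edit`
opens a drop-in file where you can set `User=nomad` and `Group=nomad` under
`[Service]`:

```shell-session
$ # Create a minimally privileged system user for the server agent.
$ sudo useradd --system --home /etc/nomad.d --shell /bin/false nomad
$ sudo chown -R nomad:nomad /opt/nomad
$ # Add User=nomad and Group=nomad under [Service] in the drop-in editor.
$ sudo systemctl edit nomad
```
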
Nomad clients must be run as `root` due to the OS isolation mechanisms that
require root privileges (see also [Linux Capabilities][] below). The Nomad
client's data directory should be owned by `root` with filesystem permissions
set to `0700`.

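For example, assuming a client data directory of `/opt/nomad` (adjust to your
`data_dir` setting):

```shell-session
$ # Restrict the client data directory to root only.
$ sudo chown root:root /opt/nomad
$ sudo chmod 0700 /opt/nomad
```
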
### Linux Capabilities

On Linux, Nomad clients require privileged capabilities for isolating tasks.
Nomad clients require `CAP_SYS_ADMIN` for creating the tmpfs used for secrets,
bind-mounting task directories, mounting volumes, and running some task driver
plugins. Nomad clients require `CAP_NET_ADMIN` for a variety of tasks to set
up networking. You should run Nomad clients as `root`, but running as `root`
does not grant these required capabilities if Nomad is running in a user
namespace. Running Nomad clients inside a user namespace is unsupported. See
the [`capabilities(7)`][] man page for details on Linux capabilities.

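To inspect the capability sets of a running client agent, you can read its
`/proc` status (a quick sanity check, assuming the agent process is named
`nomad`):

```shell-session
$ # Show the capability sets (CapInh, CapPrm, CapEff, CapBnd, CapAmb)
$ # of the running Nomad agent.
$ grep Cap /proc/$(pgrep -x nomad)/status
```
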
In order to run a task, Nomad clients perform privileged operations normally
reserved to the `root` user:

* Mounting tmpfs file systems for the task `/secrets` directory.
* Creating the network bridge for `bridge` networking.
* Allowing inbound and outbound network traffic to the workload (typically via
  `iptables`).
* Starting tasks as a specific `user`.
* Setting the owner of `template` outputs.

On Linux this set of requirements expands to:

* Configuring resource isolation via cgroups.
* Configuring namespace isolation: `mount`, `user`, `pid`, `ipc`, and `network`
  namespaces.

Nomad task drivers that support bind-mounting volumes also need to run as
`root` to do so. This includes the built-in `exec` and `java` task drivers.
The built-in task drivers run in the same process as the Nomad client, so this
requires that the Nomad client agent is also running as `root`.

### Rootless Nomad Clients

Although it's possible to run a Nomad client agent as a non-root user or as
`root` in a user namespace, to perform the privileged operations described
above you also need to grant the client agent the `CAP_SYS_ADMIN` and
`CAP_NET_ADMIN` capabilities. Note that these capabilities are nearly
functionally equivalent to running as `root`, and that a process running with
`CAP_SYS_ADMIN` can almost always escalate itself to "true" (unnamespaced)
`root`.

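One way to grant these capabilities is with file capabilities on the Nomad
binary. This is only an illustration (the install path is an assumption) and
carries the same risks described above:

```shell-session
$ # Grant CAP_SYS_ADMIN and CAP_NET_ADMIN (effective + permitted) to the binary.
$ sudo setcap cap_sys_admin,cap_net_admin+ep /usr/local/bin/nomad
$ # Verify the file capabilities.
$ getcap /usr/local/bin/nomad
```
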
Some task drivers delegate many of their privileged operations to an external
process such as `dockerd` or `podman`. If you don't need `bridge` networking
and are using these task drivers or custom task drivers, you may be able to
run Nomad client agents as a non-root user with the following additional
configuration (two quick host checks are sketched after the list):

* Delegated cgroups: safely setting cgroups as an unprivileged user requires
  cgroups v2.
* User namespaces: on some distros this may require setting sysctls like
  `kernel.unprivileged_userns_clone=1`.
* The task driver engine (ex. `dockerd`, `podman`, `containerd`, etc.) must be
  configured for rootless operation. This requires cgroups v2, user
  namespaces, and typically either a patched kernel or kernel module (ex.
  `overlay.ko`) allowing an unprivileged [overlay filesystem][] or a FUSE
  overlay filesystem.

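The sketch below checks two of these prerequisites; note that the
`kernel.unprivileged_userns_clone` sysctl exists only on some distributions:

```shell-session
$ # Prints "cgroup2fs" when the host is running cgroups v2.
$ stat -fc %T /sys/fs/cgroup/
$ # Present on some Debian-derived kernels only; absent elsewhere.
$ sysctl kernel.unprivileged_userns_clone
```
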
This is not a supported or well-tested configuration. See [GH-13669][] for
further discussion and to provide feedback on your experiences trying to run
rootless Nomad clients.

[Security Model]: /docs/concepts/security
[production deployment guide]: https://developer.hashicorp.com/nomad/tutorials/enterprise/production-deployment-guide-vm-with-consul#configure-systemd
[linux capabilities]: #linux-capabilities
[`capabilities(7)`]: https://man7.org/linux/man-pages/man7/capabilities.7.html
[overlay filesystem]: https://www.kernel.org/doc/html/latest/filesystems/overlayfs.html
[GH-13669]: https://github.com/hashicorp/nomad/issues/13669