---
layout: "docs"
page_title: "Architecture"
sidebar_current: "docs-internals-architecture"
description: |-
  Learn about the internal architecture of Nomad.
---

# Architecture

Nomad is a complex system that has many different pieces. To help both users and developers of Nomad
build a mental model of how it works, this page documents the system architecture.

~> **Advanced Topic!** This page covers technical details
of Nomad. You do not need to understand these details to
effectively use Nomad. The details are documented here for
those who wish to learn about them without having to go
spelunking through the source code.

# Glossary

Before describing the architecture, we provide a glossary of terms to help
clarify what is being discussed:

* **Job** - A Job is a specification provided by users that declares a workload for
  Nomad. A Job is a form of _desired state_; the user is expressing that the job should
  be running, but not where it should be run. The responsibility of Nomad is to make sure
  the _actual state_ matches the user's desired state. A Job is composed of one or more
  task groups (see the example job specification after this glossary).

* **Task Group** - A Task Group is a set of tasks that must be run together. For example, a
  web server may require that a log shipping co-process is always running as well. A task
  group is the unit of scheduling, meaning the entire group must run on the same client node and
  cannot be split.

* **Driver** - A Driver represents the basic means of executing your **Tasks**.
  Example Drivers include Docker, Qemu, Java, and static binaries.

* **Task** - A Task is the smallest unit of work in Nomad. Tasks are executed by drivers,
  which allow Nomad to be flexible in the types of tasks it supports. Tasks
  specify their driver, configuration for the driver, constraints, and resources required.

* **Client** - A Client of Nomad is a machine that tasks can be run on. All clients run the
  Nomad agent. The agent is responsible for registering with the servers, watching for any
  work to be assigned, and executing tasks. The Nomad agent is a long-lived process which
  interfaces with the servers.

* **Allocation** - An Allocation is a mapping between a task group in a job and a client
  node. A single job may have hundreds or thousands of task groups, meaning an equivalent
  number of allocations must exist to map the work to client machines. Allocations are created
  by the Nomad servers as part of scheduling decisions made during an evaluation.

* **Evaluation** - Evaluations are the mechanism by which Nomad makes scheduling decisions.
  When either the _desired state_ (jobs) or _actual state_ (clients) changes, Nomad creates
  a new evaluation to determine if any actions must be taken. An evaluation may result
  in changes to allocations if necessary.

* **Server** - Nomad servers are the brains of the cluster. There is a cluster of servers
  per region and they manage all jobs and clients, run evaluations, and create task allocations.
  The servers replicate data between each other and perform leader election to ensure high
  availability. Servers federate across regions to make Nomad globally aware.

* **Regions and Datacenters** - Nomad models infrastructure as regions and
  datacenters. Regions may contain multiple datacenters. Servers are assigned to
  a specific region, managing state and making scheduling decisions within that
  region. Multiple regions can be federated together. For example, you may
  have a `US` region with the `us-east-1` and `us-west-1` datacenters,
  connected to the `EU` region with the `eu-fr-1` and `eu-uk-1` datacenters.
  Requests that are made between regions are forwarded to the appropriate servers.
  Data is _not_ replicated between regions.

* **Bin Packing** - Bin Packing is the process of filling bins with items in a way that
  maximizes the utilization of bins. This extends to Nomad, where the clients are "bins"
  and the items are task groups. Nomad optimizes resources by efficiently bin packing
  tasks onto client machines.

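To make these terms concrete, here is a minimal sketch of a job specification.
The job, group, and task names, the datacenter, and the resource figures are
illustrative assumptions, not prescriptions:

```hcl
# A job declares desired state: here, one task group containing one task.
job "example" {
  # The datacenters within the region where this job may run.
  datacenters = ["us-east-1"]

  # A task group is the unit of scheduling; every task in it is
  # placed together on a single client node.
  group "cache" {
    count = 1

    # A task names its driver and the driver-specific configuration.
    task "redis" {
      driver = "docker"

      config {
        image = "redis:3.2"
      }

      # Resources the task requires; the servers use these figures
      # when bin packing allocations onto clients.
      resources {
        cpu    = 500 # MHz
        memory = 256 # MB
      }
    }
  }
}
```

Submitting this job triggers an evaluation; the scheduler responds by creating
an allocation that maps the `cache` task group onto a specific client node.
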
# High-Level Overview

Looking at only a single region, at a high level Nomad looks like this:

[![Regional Architecture](/assets/images/nomad-architecture-region.png)](/assets/images/nomad-architecture-region.png)

Within each region, we have both clients and servers. Servers are responsible for
accepting jobs from users, managing clients, and [computing task placements](/docs/internals/scheduling.html).
Each region may have clients from multiple datacenters, allowing a small number of servers
to handle very large clusters.

In some cases, for either availability or scalability, you may need to run multiple
regions. Nomad supports federating multiple regions together into a single cluster.
At a high level, this setup looks like this:

[![Global Architecture](/assets/images/nomad-architecture-global.png)](/assets/images/nomad-architecture-global.png)

Regions are fully independent from each other, and do not share jobs, clients, or
state. They are loosely coupled using a gossip protocol, which allows users to
submit jobs to any region or query the state of any region transparently. Requests
are forwarded to the appropriate server to be processed and the results returned.
Data is _not_ replicated between regions.

The servers in each region are all part of a single consensus group. This means
that they work together to elect a single leader which has extra duties. The leader
is responsible for processing all queries and transactions. Nomad is optimistically
concurrent, meaning all servers participate in making scheduling decisions in parallel.
The leader provides the additional coordination necessary to do this safely and
to ensure clients are not oversubscribed.

Each region is expected to have either three or five servers. This strikes a balance
between availability in the case of failure and performance, as consensus gets
progressively slower as more servers are added. However, there is no limit to the number
of clients per region.

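As an illustration, a server agent in such a region might be configured as
follows. This is a sketch rather than a complete configuration; the region
name, datacenter, and server count are assumptions for the example:

```hcl
# Agent configuration for one of a region's servers (sketch only).
region     = "us"
datacenter = "us-east-1"

server {
  enabled = true

  # Wait for three servers to join before electing a leader,
  # matching the recommended three- or five-server deployment.
  bootstrap_expect = 3
}
```
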
Clients are configured to communicate with their regional servers, using remote
procedure calls (RPC) to register themselves, send heartbeats for liveness,
wait for new allocations, and update the status of allocations. A client registers
with the servers to provide the resources available, attributes, and installed drivers.
Servers use this information for scheduling decisions and create allocations to assign
work to clients.

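A client agent's configuration mirrors this: it enables client mode and points
at the regional servers. Again a sketch; the addresses are placeholders:

```hcl
# Agent configuration for a client node (sketch only).
datacenter = "us-east-1"

client {
  enabled = true

  # Regional servers to register with; 4647 is the default RPC port.
  servers = ["10.0.0.10:4647", "10.0.0.11:4647"]
}
```
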
Users make use of the Nomad CLI or API to submit jobs to the servers. A job represents
a desired state and provides the set of tasks that should be run. The servers are
responsible for scheduling the tasks, which is done by finding an optimal placement for
each task such that resource utilization is maximized while satisfying all constraints
specified by the job. Resource utilization is maximized by bin packing, in which
the scheduler tries to make use of all the resources of a machine without
exhausting any dimension. Job constraints can be used to ensure an application is
running in an appropriate environment. Constraints can be technical requirements based
on hardware features such as architecture and availability of GPUs, or software features
like operating system and kernel version, or they can be business constraints like
ensuring PCI-compliant workloads run on appropriate servers.

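For example, constraints like the following restrict where a workload may be
placed. The attribute values here are illustrative:

```hcl
# Only place this workload on 64-bit Linux clients (sketch only).
constraint {
  attribute = "${attr.kernel.name}"
  value     = "linux"
}

constraint {
  attribute = "${attr.cpu.arch}"
  value     = "amd64"
}
```

A `constraint` block can appear at the job, group, or task level, applying to
everything beneath that level.
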
# Getting in Depth

This has been a brief high-level overview of the architecture of Nomad. There
are more details available for each of the sub-systems. The [consensus protocol](/docs/internals/consensus.html),
[gossip protocol](/docs/internals/gossip.html), and [scheduler design](/docs/internals/scheduling.html)
are all documented in more detail.

For other details, either consult the code, ask in IRC, or reach out to the mailing list.