github.com/iqoqo/nomad@v0.11.3-0.20200911112621-d7021c74d101/website/pages/intro/getting-started/running.mdx

---
layout: intro
page_title: Running Nomad
sidebar_title: Running Nomad
description: 'Learn about the Nomad agent, and the lifecycle of running and stopping.'
---

# Running Nomad

Nomad relies on a long-running agent on every machine in the cluster.
The agent can run in either server or client mode. Each region must
have at least one server, though a cluster of 3 or 5 servers is recommended.
A single-server deployment is _**highly**_ discouraged as data loss is inevitable
in a failure scenario.
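
The recommendation of 3 or 5 servers follows from majority quorum: a cluster of N servers needs `floor(N/2) + 1` of them available to make progress. The arithmetic can be sketched in a few lines of shell. The function names `quorum` and `tolerated_failures` are illustrative only and are not part of Nomad.

```shell
#!/bin/sh
# Majority quorum for a cluster of N servers: floor(N/2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of servers that can fail while a quorum still remains.
tolerated_failures() {
  echo $(( $1 - $(quorum "$1") ))
}

for n in 1 3 5; do
  echo "servers=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
# servers=1 quorum=1 tolerated_failures=0
# servers=3 quorum=2 tolerated_failures=1
# servers=5 quorum=3 tolerated_failures=2
```

A single server tolerates no failures at all, which is why that layout is discouraged.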

All other agents run in client mode. A Nomad client is a very lightweight
process that registers the host machine, performs heartbeating, and runs the tasks
that are assigned to it by the servers. The agent must be run on every node that
is part of the cluster so that the servers can assign work to those machines.

## Starting the Agent

For simplicity, we will run a single Nomad agent in development mode. This mode
is used to quickly start an agent that is acting as a client and server to test
job configurations or prototype interactions. It should _**not**_ be used in
production as it does not persist state.

```shell-session
$ sudo nomad agent -dev

==> Starting Nomad agent...
==> Nomad agent configuration:

                Client: true
             Log Level: DEBUG
                Region: global (DC: dc1)
                Server: true

==> Nomad agent started! Log data will stream in below:

    [INFO] serf: EventMemberJoin: nomad.global 127.0.0.1
    [INFO] nomad: starting 4 scheduling worker(s) for [service batch _core]
    [INFO] client: using alloc directory /tmp/NomadClient599911093
    [INFO] raft: Node at 127.0.0.1:4647 [Follower] entering Follower state
    [INFO] nomad: adding server nomad.global (Addr: 127.0.0.1:4647) (DC: dc1)
    [WARN] fingerprint.network: Ethtool not found, checking /sys/net speed file
    [WARN] raft: Heartbeat timeout reached, starting election
    [INFO] raft: Node at 127.0.0.1:4647 [Candidate] entering Candidate state
    [DEBUG] raft: Votes needed: 1
    [DEBUG] raft: Vote granted. Tally: 1
    [INFO] raft: Election won. Tally: 1
    [INFO] raft: Node at 127.0.0.1:4647 [Leader] entering Leader state
    [INFO] raft: Disabling EnableSingleNode (bootstrap)
    [DEBUG] raft: Node 127.0.0.1:4647 updated peer set (2): [127.0.0.1:4647]
    [INFO] nomad: cluster leadership acquired
    [DEBUG] client: applied fingerprints [arch cpu host memory storage network]
    [DEBUG] client: available drivers [docker exec java]
    [DEBUG] client: node registration complete
    [DEBUG] client: updated allocations at index 1 (0 allocs)
    [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 0)
    [DEBUG] client: state updated to ready
```

As you can see, the Nomad agent has started and has output some log
data. From the log data, you can see that our agent is running in both
client and server mode, and has claimed leadership of the cluster.
Additionally, the local client has been registered and marked as ready.
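
When the agent is started from a script, the two milestones in the log (leadership acquired and client ready) can be checked mechanically. The following is a minimal sketch that assumes the agent's output has been captured as text. The function name `dev_agent_ready` is hypothetical, and the matched strings are copied from the sample log in this guide, so they may differ between Nomad versions.

```shell
#!/bin/sh
# Succeed only if the log on stdin contains both milestones from the
# sample output: the server acquiring leadership and the client
# being marked ready.
dev_agent_ready() {
  awk '
    /nomad: cluster leadership acquired/ { leader = 1 }
    /client: state updated to ready/     { ready = 1 }
    END { exit !(leader && ready) }
  '
}

# Example run against two lines taken from the sample log.
dev_agent_ready <<'EOF' && echo "dev agent is up"
    [INFO] nomad: cluster leadership acquired
    [DEBUG] client: state updated to ready
EOF
# prints "dev agent is up"
```

If the agent's output were saved to a file (for example a hypothetical `agent.log`), the same check would be `dev_agent_ready < agent.log`.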

-> **Note:** Typically any agent running in client mode must be run with root-level
privileges. Nomad makes use of operating system primitives for resource isolation
that require elevated permissions. The agent will function as non-root, but
certain task drivers will not be available.

## Cluster Nodes

If you run [`nomad node status`](/docs/commands/node/status) in another
terminal, you can see the registered nodes of the Nomad cluster:

```shell-session
$ nomad node status
ID        DC   Name   Class   Drain  Eligibility  Status
171a583b  dc1  nomad  <none>  false  eligible     ready
```

The output shows our Node ID, which is a randomly generated UUID,
its datacenter, node name, node class, drain mode, scheduling eligibility,
and current status. We can see that our node is in the ready state, and
task draining is currently off.

The agent is also running in server mode, which means it is part of
the [gossip protocol](/docs/internals/gossip) used to connect all
the server instances together. We can view the members of the gossip
ring using the [`server members`](/docs/commands/server/members) command:

```shell-session
$ nomad server members
Name          Address    Port  Status  Leader  Protocol  Build  Datacenter  Region
nomad.global  127.0.0.1  4648  alive   true    2         0.7.0  dc1         global
```

The output shows our own agent, the address it is running on, its
health state, some version information, and the datacenter and region.
Additional metadata can be viewed by providing the `-detailed` flag.

## Stopping the Agent ((#stopping))

You can use `Ctrl-C` (the interrupt signal) to halt the agent.
By default, all signals will cause the agent to forcefully shut down.
The agent [can be configured](/docs/configuration#leave_on_terminate) to
gracefully leave on either the interrupt or terminate signals.
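
As a rough illustration, graceful leave is controlled by the `leave_on_interrupt` and `leave_on_terminate` agent options. The sketch below writes a small configuration fragment that enables both. The file name and location are assumptions made for the example, and whether these settings are appropriate for a server depends on the considerations discussed later in this section.

```shell
#!/bin/sh
# Hypothetical location for the configuration fragment.
CONFIG_FILE="${TMPDIR:-/tmp}/graceful-leave.hcl"

# Write an agent configuration fragment that enables graceful leave.
cat > "$CONFIG_FILE" <<'EOF'
# Leave the cluster gracefully on the interrupt signal (Ctrl-C).
leave_on_interrupt = true

# Leave the cluster gracefully on the terminate signal.
leave_on_terminate = true
EOF

echo "wrote $CONFIG_FILE"
```

The resulting file could then be passed to the agent with the `-config` flag alongside the rest of its configuration.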

After interrupting the agent, you should see it leave the cluster
and shut down:

```
^C==> Caught signal: interrupt
    [DEBUG] http: Shutting down http server
    [INFO] agent: requesting shutdown
    [INFO] client: shutting down
    [INFO] nomad: shutting down server
    [WARN] serf: Shutdown without a Leave
    [INFO] agent: shutdown complete
```

By gracefully leaving, Nomad clients update their status to prevent
further tasks from being scheduled and to start migrating any tasks that are
already assigned. Nomad servers notify their peers they intend to leave.
When a server leaves, replication to that server stops. If a server fails,
replication continues to be attempted until the node recovers. Nomad will
automatically try to reconnect to _failed_ nodes, allowing it to recover from
certain network conditions, while _left_ nodes are no longer contacted.

If an agent is operating as a server, [`leave_on_terminate`](/docs/configuration#leave_on_terminate) should only
be set if the server will never rejoin the cluster. The default value of `false` for `leave_on_terminate` and `leave_on_interrupt`
works well for most scenarios. If Nomad servers are part of an autoscaling group where new servers are brought up to replace
failed servers, using graceful leave avoids causing a potential availability outage affecting the [consensus protocol](/docs/internals/consensus).
As of Nomad 0.8, Nomad includes Autopilot, which automatically removes failed or dead servers. This allows the operator to skip setting `leave_on_terminate`.

If a server does forcefully exit and will not be returning to service, the
[`server force-leave` command](/docs/commands/server/force-leave) should
be used to force the server from a _failed_ to a _left_ state.
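
One way to find candidates for `server force-leave` is to filter the `server members` output for servers in the _failed_ state. The sketch below assumes the column layout shown earlier in this guide, where `Status` is the fourth column. The function name `print_force_leave_commands` and the second server in the sample input are hypothetical. It only prints commands for review and does not run them.

```shell
#!/bin/sh
# Read `nomad server members` output on stdin and print one force-leave
# command per server whose Status column (4th field) is "failed".
# Nothing is executed; the printed commands are meant to be reviewed first.
print_force_leave_commands() {
  awk 'NR > 1 && $4 == "failed" { print "nomad server force-leave " $1 }'
}

# Sample input in the same column layout; the second server is made up.
print_force_leave_commands <<'EOF'
Name            Address    Port  Status  Leader  Protocol  Build  Datacenter  Region
nomad.global    127.0.0.1  4648  alive   true    2         0.7.0  dc1         global
nomad-2.global  10.0.0.12  4648  failed  false   2         0.7.0  dc1         global
EOF
# prints "nomad server force-leave nomad-2.global"
```

Against a running cluster, the same function could be fed with `nomad server members | print_force_leave_commands`. A server should only be forced out once it is certain that it will not return.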

## Next Steps

If you shut down the development Nomad agent as instructed above, make sure it is running again, then let's try to [run a job](/intro/getting-started/jobs)!