github.com/kardianos/nomad@v0.1.3-0.20151022182107-b13df73ee850/website/source/intro/getting-started/running.html.md

---
layout: "intro"
page_title: "Running Nomad"
sidebar_current: "getting-started-running"
description: |-
  Learn about the Nomad agent, and the lifecycle of running and stopping.
---

# Running Nomad

Nomad relies on a long-running agent on every machine in the cluster.
The agent can run in either server or client mode. Each region must
have at least one server, though a cluster of 3 or 5 servers is recommended.
A single-server deployment is _**highly**_ discouraged as data loss is inevitable
in a failure scenario.

All other agents run in client mode. A client is a very lightweight
process that registers the host machine, performs heartbeating, and runs any tasks
that are assigned to it by the servers. The agent must be run on every node that
is part of the cluster so that the servers can assign work to those machines.

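The recommendation of 3 or 5 servers follows from majority-quorum arithmetic: a consensus group can only make progress while more than half of its servers are reachable. The following is a minimal Python sketch of that arithmetic. The function names are illustrative and are not part of Nomad.

```python
def quorum(servers: int) -> int:
    """Smallest majority of a cluster with `servers` voting members."""
    return servers // 2 + 1


def failure_tolerance(servers: int) -> int:
    """How many servers can fail while a majority is still available."""
    return servers - quorum(servers)


for n in (1, 2, 3, 4, 5):
    print(f"servers={n} quorum={quorum(n)} tolerated_failures={failure_tolerance(n)}")
```

A single server tolerates zero failures, and an even server count tolerates no more failures than the odd count just below it, which is why odd sizes such as 3 or 5 are the usual choice.
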
## Starting the Agent

For simplicity, we will run a single Nomad agent in development mode. This mode
quickly starts an agent that acts as both a client and a server, which is useful
for testing job configurations or prototyping interactions. It should _**not**_ be
used in production as it does not persist state.

```
vagrant@nomad:~$ sudo nomad agent -dev

==> Starting Nomad agent...
==> Nomad agent configuration:

                 Atlas: <disabled>
                Client: true
             Log Level: DEBUG
                Region: global (DC: dc1)
                Server: true

==> Nomad agent started! Log data will stream in below:

    [INFO] serf: EventMemberJoin: nomad.global 127.0.0.1
    [INFO] nomad: starting 4 scheduling worker(s) for [service batch _core]
    [INFO] client: using alloc directory /tmp/NomadClient599911093
    [INFO] raft: Node at 127.0.0.1:4647 [Follower] entering Follower state
    [INFO] nomad: adding server nomad.global (Addr: 127.0.0.1:4647) (DC: dc1)
    [WARN] fingerprint.network: Ethtool not found, checking /sys/net speed file
    [WARN] raft: Heartbeat timeout reached, starting election
    [INFO] raft: Node at 127.0.0.1:4647 [Candidate] entering Candidate state
    [DEBUG] raft: Votes needed: 1
    [DEBUG] raft: Vote granted. Tally: 1
    [INFO] raft: Election won. Tally: 1
    [INFO] raft: Node at 127.0.0.1:4647 [Leader] entering Leader state
    [INFO] raft: Disabling EnableSingleNode (bootstrap)
    [DEBUG] raft: Node 127.0.0.1:4647 updated peer set (2): [127.0.0.1:4647]
    [INFO] nomad: cluster leadership acquired
    [DEBUG] client: applied fingerprints [arch cpu host memory storage network]
    [DEBUG] client: available drivers [docker exec java]
    [DEBUG] client: node registration complete
    [DEBUG] client: updated allocations at index 1 (0 allocs)
    [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 0)
    [DEBUG] client: state updated to ready
```

As you can see, the Nomad agent has started and has output some log
data. From the log data, you can see that our agent is running in both
client and server mode, and has claimed leadership of the cluster.
Additionally, the local client has been registered and marked as ready.

-> **Note:** Typically, any agent running in client mode must be run with
root-level privileges. Nomad makes use of operating system primitives for
resource isolation, which require elevated permissions. The agent will function
as non-root, but certain task drivers will not be available.

## Cluster Nodes

If you run [`nomad node-status`](/docs/commands/node-status.html) in another terminal, you
can see the registered nodes of the Nomad cluster:

```text
$ vagrant ssh
...

$ nomad node-status
ID                                    DC   Name   Class   Drain  Status
72d3af97-144f-1e5f-94e5-df1516fe4add  dc1  nomad  <none>  false  ready
```

The output shows our node ID, which is a randomly generated UUID,
its datacenter, node name, node class, drain mode, and current status.
We can see that our node is in the ready state, and task draining is
currently off.

The agent is also running in server mode, which means it participates in
the [gossip protocol](/docs/internals/gossip.html) used to connect all
the server instances together. We can view the members of the gossip
ring using the [`server-members`](/docs/commands/server-members.html) command:

```text
$ nomad server-members
Name          Addr       Port  Status  Proto  Build     DC   Region
nomad.global  127.0.0.1  4648  alive   2      0.1.0dev  dc1  global
```

The output shows our own agent, the address it is running on, its
health state, some version information, and the datacenter and region.
Additional metadata can be viewed by providing the `-detailed` flag.

## <a name="stopping"></a>Stopping the Agent

You can use `Ctrl-C` (the interrupt signal) to halt the agent.
By default, all signals will cause the agent to shut down forcefully.
The agent [can be configured](/docs/agent/config.html) to gracefully
leave on either the interrupt or terminate signals.

After interrupting the agent, you should see it shut down. Because a graceful
leave is not configured by default, the log includes a `Shutdown without a Leave`
warning:

```
^C==> Caught signal: interrupt
    [DEBUG] http: Shutting down http server
    [INFO] agent: requesting shutdown
    [INFO] client: shutting down
    [INFO] nomad: shutting down server
    [WARN] serf: Shutdown without a Leave
    [INFO] agent: shutdown complete
```

By gracefully leaving, Nomad clients update their status to prevent
further tasks from being scheduled and to start migrating any tasks that are
already assigned. Nomad servers notify their peers they intend to leave.
When a server leaves, replication to that server stops. If a server fails,
replication continues to be attempted until the node recovers. Nomad will
automatically try to reconnect to _failed_ nodes, allowing it to recover from
certain network conditions, while _left_ nodes are no longer contacted.

If an agent is operating as a server, a graceful leave is important to avoid
causing a potential availability outage affecting the
[consensus protocol](/docs/internals/consensus.html). If a server does
forcefully exit and will not be returning to service, the
[`server-force-leave` command](/docs/commands/server-force-leave.html) should
be used to force the server from a _failed_ to a _left_ state.

## Next Steps

The development Nomad agent is up and running. Let's try to [run a job](jobs.html)!