---
layout: api
page_title: Operator - HTTP API
sidebar_current: api-operator
description: |-
  The /operator endpoints provide cluster-level tools for Nomad operators, such
  as interacting with the Raft subsystem.
---
# /v1/operator

The `/operator` endpoint provides cluster-level tools for Nomad operators, such
as interacting with the Raft subsystem.

~> Use this interface with extreme caution, as improper use could lead to a
Nomad outage and even loss of data.

See the [Outage Recovery](/guides/operations/outage.html) guide for some examples of how
these capabilities are used. For a CLI to perform these operations manually,
please see the documentation for the
[`nomad operator`](/docs/commands/operator.html) command.

## Read Raft Configuration

This endpoint queries the current Raft peer configuration of the cluster.

| Method | Path                              | Produces                   |
| ------ | --------------------------------- | -------------------------- |
| `GET`  | `/v1/operator/raft/configuration` | `application/json`         |

The table below shows this endpoint's support for
[blocking queries](/api/index.html#blocking-queries) and
[required ACLs](/api/index.html#acls).

| Blocking Queries | ACL Required |
| ---------------- | ------------ |
| `NO`             | `management` |

### Parameters

- `stale` - Specifies if the cluster should respond without an active leader.
  This is specified as a query string parameter.

### Sample Request

```text
$ curl \
    https://localhost:4646/v1/operator/raft/configuration
```

### Sample Response

```json
{
  "Index": 1,
  "Servers": [
    {
      "Address": "127.0.0.1:4647",
      "ID": "127.0.0.1:4647",
      "Leader": true,
      "Node": "bacon-mac.global",
      "RaftProtocol": 2,
      "Voter": true
    }
  ]
}
```

#### Field Reference

- `Index` `(int)` - The Raft commit index corresponding to this
  configuration. The latest configuration may not yet be committed if changes
  are in flight.

- `Servers` `(array: Server)` - The returned `Servers` array has information
  about the servers in the Raft peer configuration.

  - `ID` `(string)` - The ID of the server. This is the same as the `Address`
    but may be upgraded to a GUID in a future version of Nomad.

  - `Node` `(string)` - The node name of the server, as known to Nomad, or
    `"(unknown)"` if the node is stale and not known.

  - `Address` `(string)` - The `ip:port` for the server.

  - `Leader` `(bool)` - `true` if the server is the leader in the Raft
    configuration, `false` otherwise.

  - `Voter` `(bool)` - `true` if the server has a vote in the Raft
    configuration, `false` otherwise. Future versions of Nomad may add support
    for non-voting servers.
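
As a rough illustration of how these fields fit together, the following Python
sketch summarizes a decoded response. The helper name `summarize_raft` is
hypothetical and not part of any Nomad client library; the input is the sample
response shown above.

```python
import json

# Sample response body copied from this page.
SAMPLE = """
{
  "Index": 1,
  "Servers": [
    {
      "Address": "127.0.0.1:4647",
      "ID": "127.0.0.1:4647",
      "Leader": true,
      "Node": "bacon-mac.global",
      "RaftProtocol": 2,
      "Voter": true
    }
  ]
}
"""


def summarize_raft(config):
    """Return the index, the leader's node name (or None), and voter addresses."""
    servers = config.get("Servers", [])
    leader = next((s["Node"] for s in servers if s.get("Leader")), None)
    voters = [s["Address"] for s in servers if s.get("Voter")]
    return {"index": config.get("Index"), "leader": leader, "voters": voters}


summary = summarize_raft(json.loads(SAMPLE))
print(summary)
```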

## Remove Raft Peer

This endpoint removes the Nomad server with the given address or ID from the
Raft configuration. The return code signifies success or failure.

| Method   | Path                       | Produces                   |
| -------- | -------------------------- | -------------------------- |
| `DELETE` | `/v1/operator/raft/peer`   | `application/json`         |

The table below shows this endpoint's support for
[blocking queries](/api/index.html#blocking-queries) and
[required ACLs](/api/index.html#acls).

| Blocking Queries | ACL Required |
| ---------------- | ------------ |
| `NO`             | `management` |

### Parameters

- `address` `(string: <optional>)` - Specifies the server to remove as
  `ip:port`. This cannot be provided along with the `id` parameter.

- `id` `(string: <optional>)` - Specifies the server to remove as
  `id`. This cannot be provided along with the `address` parameter.

### Sample Request

```text
$ curl \
    --request DELETE \
    "https://localhost:4646/v1/operator/raft/peer?address=1.2.3.4:4647"
```
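
The mutual exclusion between `address` and `id` can be checked on the client
before a request is sent. The sketch below is a minimal illustration using only
the Python standard library; `build_remove_peer_request` is a hypothetical
helper, and it assumes that omitting both parameters is also an error. It
builds the request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_remove_peer_request(base_url, address=None, peer_id=None):
    """Build (but do not send) the DELETE request for /v1/operator/raft/peer."""
    # The endpoint accepts `address` or `id`, never both. Requiring exactly
    # one of them is an assumption of this sketch.
    if (address is None) == (peer_id is None):
        raise ValueError("provide exactly one of address or peer_id")
    params = {"address": address} if address is not None else {"id": peer_id}
    url = f"{base_url}/v1/operator/raft/peer?{urlencode(params)}"
    return Request(url, method="DELETE")


req = build_remove_peer_request("https://localhost:4646", address="1.2.3.4:4647")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call.
With ACLs enabled, a management token would also need to be supplied, for
example in the `X-Nomad-Token` header.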

## Read Autopilot Configuration

This endpoint retrieves the cluster's latest Autopilot configuration.

| Method | Path                                   | Produces           |
| ------ | -------------------------------------- | ------------------ |
| `GET`  | `/v1/operator/autopilot/configuration` | `application/json` |

The table below shows this endpoint's support for
[blocking queries](/api/index.html#blocking-queries) and
[required ACLs](/api/index.html#acls).

| Blocking Queries | ACL Required    |
| ---------------- | --------------- |
| `NO`             | `operator:read` |

### Sample Request

```text
$ curl \
    https://localhost:4646/v1/operator/autopilot/configuration
```

### Sample Response

```json
{
  "CleanupDeadServers": true,
  "LastContactThreshold": "200ms",
  "MaxTrailingLogs": 250,
  "ServerStabilizationTime": "10s",
  "EnableRedundancyZones": false,
  "DisableUpgradeMigration": false,
  "EnableCustomUpgrades": false,
  "CreateIndex": 4,
  "ModifyIndex": 4
}
```

For more information about the Autopilot configuration options, see the
[agent configuration section](/docs/configuration/autopilot.html).

## Update Autopilot Configuration

This endpoint updates the Autopilot configuration of the cluster.

| Method | Path                                   | Produces           |
| ------ | -------------------------------------- | ------------------ |
| `PUT`  | `/v1/operator/autopilot/configuration` | `application/json` |

The table below shows this endpoint's support for
[blocking queries](/api/index.html#blocking-queries) and
[required ACLs](/api/index.html#acls).

| Blocking Queries | ACL Required     |
| ---------------- | ---------------- |
| `NO`             | `operator:write` |

### Parameters

- `cas` `(int: 0)` - Specifies to use a Check-And-Set operation. The update will
  only happen if the given index matches the `ModifyIndex` of the configuration
  at the time of writing.

### Sample Payload

```json
{
  "CleanupDeadServers": true,
  "LastContactThreshold": "200ms",
  "MaxTrailingLogs": 250,
  "ServerStabilizationTime": "10s",
  "EnableRedundancyZones": false,
  "DisableUpgradeMigration": false,
  "EnableCustomUpgrades": false,
  "CreateIndex": 4,
  "ModifyIndex": 4
}
```

- `CleanupDeadServers` `(bool: true)` - Specifies automatic removal of dead
  server nodes periodically and whenever a new server is added to the cluster.

- `LastContactThreshold` `(string: "200ms")` - Specifies the maximum amount of
  time a server can go without contact from the leader before being considered
  unhealthy. Must be a duration value such as `10s`.

- `MaxTrailingLogs` `(int: 250)` - Specifies the maximum number of log entries
  that a server can trail the leader by before being considered unhealthy.

- `ServerStabilizationTime` `(string: "10s")` - Specifies the minimum amount of
  time a server must be stable in the 'healthy' state before being added to the
  cluster. Only takes effect if all servers are running Raft protocol version 3
  or higher. Must be a duration value such as `30s`.

- `EnableRedundancyZones` `(bool: false)` - (Enterprise-only) Specifies whether
  to enable redundancy zones.

- `DisableUpgradeMigration` `(bool: false)` - (Enterprise-only) Disables
  Autopilot's upgrade migration strategy, which waits until enough
  newer-versioned servers have been added to the cluster before promoting any of
  them to voters.

- `EnableCustomUpgrades` `(bool: false)` - (Enterprise-only) Specifies whether to
  enable using custom upgrade versions when performing migrations.
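
A Check-And-Set update is typically a read-modify-write cycle: read the current
configuration, change the fields of interest, and send the result back with
`cas` set to the `ModifyIndex` that was read. The following Python sketch
illustrates that flow under stated assumptions. `build_autopilot_update` is a
hypothetical helper, `current` stands in for a decoded response from the read
endpoint, and the request is built but not sent.

```python
import json
from urllib.request import Request


def build_autopilot_update(base_url, current, changes):
    """Build (but do not send) a check-and-set PUT for the Autopilot configuration.

    The ModifyIndex from `current` becomes the `cas` value, so the update
    only happens if the configuration has not been modified since it was read.
    """
    payload = {**current, **changes}
    url = (
        f"{base_url}/v1/operator/autopilot/configuration"
        f"?cas={current['ModifyIndex']}"
    )
    body = json.dumps(payload).encode("utf-8")
    return Request(url, data=body, method="PUT")


# Values taken from the sample payload above.
current = {
    "CleanupDeadServers": True,
    "LastContactThreshold": "200ms",
    "MaxTrailingLogs": 250,
    "ServerStabilizationTime": "10s",
    "EnableRedundancyZones": False,
    "DisableUpgradeMigration": False,
    "EnableCustomUpgrades": False,
    "CreateIndex": 4,
    "ModifyIndex": 4,
}

req = build_autopilot_update(
    "https://localhost:4646", current, {"MaxTrailingLogs": 500}
)
print(req.get_method(), req.full_url)
```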

## Read Health

This endpoint queries the health of the cluster's servers as determined by
Autopilot.

| Method | Path                            | Produces           |
| ------ | ------------------------------- | ------------------ |
| `GET`  | `/v1/operator/autopilot/health` | `application/json` |

The table below shows this endpoint's support for
[blocking queries](/api/index.html#blocking-queries) and
[required ACLs](/api/index.html#acls).

| Blocking Queries | ACL Required    |
| ---------------- | --------------- |
| `NO`             | `operator:read` |

### Sample Request

```text
$ curl \
    https://localhost:4646/v1/operator/autopilot/health
```

### Sample Response

```json
{
  "Healthy": true,
  "FailureTolerance": 0,
  "Servers": [
    {
      "ID": "e349749b-3303-3ddf-959c-b5885a0e1f6e",
      "Name": "node1",
      "Address": "127.0.0.1:8300",
      "SerfStatus": "alive",
      "Version": "0.8.0",
      "Leader": true,
      "LastContact": "0s",
      "LastTerm": 2,
      "LastIndex": 46,
      "Healthy": true,
      "Voter": true,
      "StableSince": "2017-03-06T22:07:51Z"
    },
    {
      "ID": "e36ee410-cc3c-0a0c-c724-63817ab30303",
      "Name": "node2",
      "Address": "127.0.0.1:8205",
      "SerfStatus": "alive",
      "Version": "0.8.0",
      "Leader": false,
      "LastContact": "27.291304ms",
      "LastTerm": 2,
      "LastIndex": 46,
      "Healthy": true,
      "Voter": false,
      "StableSince": "2017-03-06T22:18:26Z"
    }
  ]
}
```

- `Healthy` is whether all the servers are currently healthy.

- `FailureTolerance` is the number of redundant healthy servers that could
  fail without causing an outage (this would be 2 in a healthy cluster of 5
  servers).

- `Servers` holds detailed health information on each server:

  - `ID` is the Raft ID of the server.

  - `Name` is the node name of the server.

  - `Address` is the address of the server.

  - `SerfStatus` is the SerfHealth check status for the server.

  - `Version` is the Nomad version of the server.

  - `Leader` is whether this server is currently the leader.

  - `LastContact` is the time elapsed since this server's last contact with the leader.

  - `LastTerm` is the server's last known Raft leader term.

  - `LastIndex` is the index of the server's last committed Raft log entry.

  - `Healthy` is whether the server is healthy according to the current Autopilot configuration.

  - `Voter` is whether the server is a voting member of the Raft cluster.

  - `StableSince` is the time since which this server has been in its current `Healthy` state.

The HTTP status code indicates the health of the cluster. If `Healthy` is true, a
status of 200 is returned. If `Healthy` is false, a status of 429 is returned.
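
A monitoring script can combine the status code with the response body. The
Python sketch below is illustrative only: both helper names are hypothetical,
and the quorum arithmetic in `expected_failure_tolerance` is an assumption
chosen to agree with the two examples on this page (2 for a healthy cluster of
5 voters, 0 for the sample response with a single voter). It is not taken from
Nomad's source.

```python
def describe_health(status_code, body):
    """Combine the HTTP status code and decoded body into a short summary."""
    unhealthy = [
        s["Name"] for s in body.get("Servers", []) if not s.get("Healthy")
    ]
    return {
        # Per the text above: 200 means healthy, 429 means unhealthy.
        "healthy": status_code == 200 and bool(body.get("Healthy")),
        "failure_tolerance": body.get("FailureTolerance", 0),
        "unhealthy_servers": unhealthy,
    }


def expected_failure_tolerance(total_voters, healthy_voters):
    """Assumed arithmetic: healthy voters in excess of a simple majority."""
    quorum = total_voters // 2 + 1
    return max(healthy_voters - quorum, 0)


# Trimmed version of the sample response above.
sample = {
    "Healthy": True,
    "FailureTolerance": 0,
    "Servers": [
        {"Name": "node1", "Healthy": True, "Voter": True},
        {"Name": "node2", "Healthy": True, "Voter": False},
    ],
}

report = describe_health(200, sample)
print(report)
print(expected_failure_tolerance(5, 5))
```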

## Read Scheduler Configuration

This endpoint retrieves the latest Scheduler configuration. This API was introduced in
Nomad 0.9 and currently supports enabling/disabling preemption. More options may be added in
the future.

| Method | Path                                   | Produces           |
| ------ | -------------------------------------- | ------------------ |
| `GET`  | `/v1/operator/scheduler/configuration` | `application/json` |

The table below shows this endpoint's support for
[blocking queries](/api/index.html#blocking-queries) and
[required ACLs](/api/index.html#acls).

| Blocking Queries | ACL Required    |
| ---------------- | --------------- |
| `NO`             | `operator:read` |

### Sample Request

```text
$ curl \
    https://localhost:4646/v1/operator/scheduler/configuration
```

### Sample Response

```json
{
  "Index": 5,
  "KnownLeader": true,
  "LastContact": 0,
  "SchedulerConfig": {
    "CreateIndex": 5,
    "ModifyIndex": 5,
    "PreemptionConfig": {
      "SystemSchedulerEnabled": true,
      "BatchSchedulerEnabled": false,
      "ServiceSchedulerEnabled": false
    }
  }
}
```

#### Field Reference

- `Index` `(int)` - The `Index` value is the Raft commit index corresponding to this
  configuration.

- `SchedulerConfig` `(SchedulerConfig)` - The returned `SchedulerConfig` object contains the
  configuration settings described below.

  - `PreemptionConfig` `(PreemptionConfig)` - Options to enable preemption for various schedulers.

    - `SystemSchedulerEnabled` `(bool: true)` - Specifies whether preemption for system jobs
      is enabled. Note that this defaults to true.

    - `BatchSchedulerEnabled` `(bool: false)` (Enterprise Only) - Specifies whether preemption
      for batch jobs is enabled. Note that this defaults to false and must be explicitly
      enabled.

    - `ServiceSchedulerEnabled` `(bool: false)` (Enterprise Only) - Specifies whether
      preemption for service jobs is enabled. Note that this defaults to false and must be
      explicitly enabled.

  - `CreateIndex` - The Raft index at which the config was created.

  - `ModifyIndex` - The Raft index at which the config was last modified.

## Update Scheduler Configuration

This endpoint updates the scheduler configuration of the cluster.

| Method        | Path                                   | Produces           |
| ------------- | -------------------------------------- | ------------------ |
| `PUT`, `POST` | `/v1/operator/scheduler/configuration` | `application/json` |

The table below shows this endpoint's support for
[blocking queries](/api/index.html#blocking-queries) and
[required ACLs](/api/index.html#acls).

| Blocking Queries | ACL Required     |
| ---------------- | ---------------- |
| `NO`             | `operator:write` |

### Parameters

- `cas` `(int: 0)` - Specifies to use a Check-And-Set operation. The update will
  only happen if the given index matches the `ModifyIndex` of the configuration
  at the time of writing.

### Sample Payload

```json
{
  "PreemptionConfig": {
    "SystemSchedulerEnabled": true,
    "BatchSchedulerEnabled": false,
    "ServiceSchedulerEnabled": true
  }
}
```

- `PreemptionConfig` `(PreemptionConfig)` - Options to enable preemption for various schedulers.

  - `SystemSchedulerEnabled` `(bool: true)` - Specifies whether preemption for system jobs
    is enabled. Note that if this is set to true, then system jobs can preempt any other jobs.

  - `BatchSchedulerEnabled` `(bool: false)` (Enterprise Only) - Specifies whether preemption
    for batch jobs is enabled. Note that if this is set to true, then batch jobs can preempt
    any other jobs.

  - `ServiceSchedulerEnabled` `(bool: false)` (Enterprise Only) - Specifies whether
    preemption for service jobs is enabled. Note that if this is set to true, then service
    jobs can preempt any other jobs.
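
The read response nests the preemption flags under `SchedulerConfig`, while the
update payload carries `PreemptionConfig` at the top level. The Python sketch
below shows one way to turn the former into the latter while reusing the
`ModifyIndex` as the `cas` value. `build_scheduler_update` is a hypothetical
helper, `read_response` stands in for a decoded response from the read
endpoint, and the request is built but not sent.

```python
import json
from urllib.request import Request


def build_scheduler_update(base_url, read_response, **flags):
    """Build (but do not send) a check-and-set PUT that changes preemption flags."""
    current = read_response["SchedulerConfig"]
    # Start from the flags that were read and override only the ones passed in.
    preemption = {**current["PreemptionConfig"], **flags}
    url = (
        f"{base_url}/v1/operator/scheduler/configuration"
        f"?cas={current['ModifyIndex']}"
    )
    body = json.dumps({"PreemptionConfig": preemption}).encode("utf-8")
    return Request(url, data=body, method="PUT")


# Shape of the read endpoint's sample response.
read_response = {
    "Index": 5,
    "SchedulerConfig": {
        "CreateIndex": 5,
        "ModifyIndex": 5,
        "PreemptionConfig": {
            "SystemSchedulerEnabled": True,
            "BatchSchedulerEnabled": False,
            "ServiceSchedulerEnabled": False,
        },
    },
}

req = build_scheduler_update(
    "https://localhost:4646", read_response, ServiceSchedulerEnabled=True
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```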