github.com/anth0d/nomad@v0.0.0-20221214183521-ae3a0a2cad06/website/content/tools/autoscaling/plugins/apm/nomad.mdx

---
layout: docs
page_title: 'Autoscaling Plugins: Nomad API'
description: The "nomad-apm" APM plugin queries the Nomad API for metrics.
---

# Nomad APM Plugin

The Nomad APM plugin allows querying the Nomad API for metric data. This provides
an immediate starting point without additional applications but comes at the price
of efficiency. When using this APM, monitor Nomad carefully to ensure it is not
put under excessive load.

~> The Nomad APM plugin should only be used when scaling based on CPU and
memory usage. For more advanced scenarios, such as scaling a cluster to
zero clients, you should use a different APM plugin.

## Agent Configuration Options

```hcl
apm "nomad-apm" {
  driver = "nomad-apm"
}
```

When using a Nomad cluster with ACLs enabled, the following ACL policy provides
the appropriate permissions for obtaining task group metrics:

```hcl
namespace "default" {
  policy       = "read"
  capabilities = ["read-job"]
}
```

To obtain cluster-level metrics, the following ACL policy is required:

```hcl
node {
  policy = "read"
}

namespace "default" {
  policy       = "read"
  capabilities = ["read-job"]
}
```

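As a minimal sketch of how the pieces above fit together, the Autoscaler agent
configuration below pairs the plugin block with a `nomad` block that carries an
ACL token. The address and token values are placeholders, and the sketch assumes
the token was created with one of the policies shown above attached.

```hcl
# Connection details the Autoscaler agent uses to reach the Nomad API.
# Both values are placeholders for illustration only.
nomad {
  address = "http://127.0.0.1:4646"
  token   = "REPLACE_WITH_ACL_TOKEN_SECRET_ID"
}

# Enables the Nomad APM plugin so that checks can use source = "nomad-apm".
apm "nomad-apm" {
  driver = "nomad-apm"
}
```
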
## Policy Configuration Options - Task Groups

The Nomad APM allows querying Nomad to understand the current resource usage of
a task group.

```hcl
check {
  source = "nomad-apm"
  query  = "avg_cpu"
  # ...
}
```

Querying Nomad task group metrics is done using the `<operation>_<metric>`
syntax, where valid operations are:

- `avg` - returns the average of the metric value across allocations in the task
  group.

- `min` - returns the lowest metric value among the allocations in the task group.

- `max` - returns the highest metric value among the allocations in the task
  group.

- `sum` - returns the sum of all the metric values for the allocations in the
  task group.

The metric value can be:

- `cpu` - CPU usage as reported by the `nomad.client.allocs.cpu.total_percent`
  metric.

- `memory` - Memory usage as reported by the `nomad.client.allocs.memory.usage`
  metric.

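To illustrate the `<operation>_<metric>` syntax in context, the sketch below
places a check inside a task group `scaling` block. The check name, the count
bounds, the intervals, and the `target-value` strategy with its target are
assumptions chosen for illustration and should be tuned for the workload.

```hcl
# Hypothetical scaling block for a task group; all values are illustrative.
scaling {
  enabled = true
  min     = 1
  max     = 10

  policy {
    cooldown            = "1m"
    evaluation_interval = "30s"

    check "avg_cpu" {
      source = "nomad-apm"

      # <operation>_<metric>: average CPU usage across the group's allocations.
      # Other combinations follow the same pattern, for example "min_cpu",
      # "max_memory", or "sum_memory".
      query = "avg_cpu"

      strategy "target-value" {
        target = 70
      }
    }
  }
}
```
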
## Policy Configuration Options - Client Nodes

The Nomad APM allows querying Nomad to understand the currently allocated
resources as a percentage of the total available.

~> **Note:** When using the Nomad APM plugin for cluster scaling, your policy `target` and
all Nomad clients intended to be targeted by the policy must have a
`node_class` defined. Nodes without `node_class` are evaluated using the
default class value `autoscaler-default-pool`.

```hcl
policy {
  # ...
  check {
    source = "nomad-apm"
    query  = "percentage-allocated_cpu"
    # ...
  }

  target "..." {
    # ...
    node_class = "autoscale"
    # ...
  }
}
```

Querying Nomad client node metrics is done using the `<operation>_<metric>`
syntax, where valid operations are:

- `percentage-allocated` - returns the allocated percentage of the desired
  resource.

The metric value can be:

- `cpu` - allocated CPU as reported by calculating total allocatable against the
  total allocated by the scheduler.

- `cpu-allocated` - the percentage of CPU used out of the total CPU allocated
  for the allocation.

- `memory` - allocated memory as reported by calculating total allocatable against
  the total allocated by the scheduler.

- `memory-allocated` - the percentage of memory used out of the total memory
  allocated for the allocation.

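As a rough sketch of a cluster policy that combines both allocation queries,
the example below checks allocated CPU and allocated memory for one node class.
The policy name, check names, bounds, intervals, the `target-value` strategy,
and the `aws-asg` target with its `aws_asg_name` value are assumptions made for
illustration; substitute the target plugin and settings your cluster uses.

```hcl
# Hypothetical cluster scaling policy; names and values are illustrative.
scaling "cluster_policy" {
  enabled = true
  min     = 1
  max     = 5

  policy {
    cooldown            = "2m"
    evaluation_interval = "1m"

    check "cpu_allocated_percentage" {
      source = "nomad-apm"
      query  = "percentage-allocated_cpu"

      strategy "target-value" {
        target = 70
      }
    }

    check "memory_allocated_percentage" {
      source = "nomad-apm"
      query  = "percentage-allocated_memory"

      strategy "target-value" {
        target = 70
      }
    }

    # The target and the matching Nomad clients must share this node_class.
    target "aws-asg" {
      aws_asg_name = "example-nomad-clients"
      node_class   = "autoscale"
    }
  }
}
```
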
[nomad_telemetry_stanza]: /docs/configuration/telemetry#inlinecode-publish_allocation_metrics