---
layout: docs
page_title: APM
sidebar_title: APM
description: APM plugins provide metric data points describing the resource's current state.
---

# APM Plugins

APMs are used to store metrics about an application's performance and current
state. The APM (Application Performance Management) plugin is responsible for
querying the APM and returning a value which will be used to determine if
scaling should occur.

## Prometheus APM Plugin

Use [Prometheus][prometheus_io] metrics to scale your Nomad job task groups or
cluster.

### Agent Configuration Options

```hcl
apm "prometheus" {
  driver = "prometheus"

  config = {
    address = "http://prometheus.my.endpoint.io:9090"
  }
}
```

- `address` `(string: "http://127.0.0.1:9090")` - The address of the Prometheus
  endpoint used to perform queries.

### Policy Configuration Options

```hcl
check {
  source = "prometheus"
  query  = "avg((haproxy_server_current_sessions{backend=\"http_back\"}) and (haproxy_server_up{backend=\"http_back\"} == 1))"
  ...
}
```

## Datadog APM Plugin

The [Datadog][datadog_homepage] APM allows using [time series][datadog_timeseries]
data to make scaling decisions.

### Agent Configuration Options

```hcl
apm "datadog" {
  driver = "datadog"

  config = {
    dd_api_key = "<api key>"
    dd_app_key = "<app key>"
  }
}
```

- `dd_api_key` `(string: "")` - The Datadog API key to use for authentication.
- `dd_app_key` `(string: "")` - The Datadog application key to use for authentication.

The Datadog plugin can also read its configuration options via environment
variables. The accepted keys are `DD_API_KEY` and `DD_APP_KEY`. The agent
configuration parameters take precedence over the environment variables.

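
Supplying the Datadog keys through environment variables can be sketched as an
agent configuration with no `config` block. This is a minimal illustration, not
a verified configuration: it assumes `DD_API_KEY` and `DD_APP_KEY` are already
exported in the environment of the Nomad Autoscaler agent process, and that
omitting `config` is accepted for this plugin in the same way as in the Nomad
APM example later on this page.

```hcl
# Assumption: DD_API_KEY and DD_APP_KEY are set in the environment of the
# agent process, so neither key is written into the configuration file.
apm "datadog" {
  driver = "datadog"
}
```

If either key is also set under `config`, the configuration value is used
instead of the environment variable.
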

### Policy Configuration Options

```hcl
check {
  source = "datadog"
  query  = "avg:proxy.backend.response.time{proxy-service:web-app}"
  ...
}
```

## Nomad APM Plugin

The Nomad APM plugin allows querying the Nomad API for metric data. This provides
an immediate starting point without additional applications but comes at the price
of efficiency. When using this APM, monitor Nomad carefully to ensure it is not
put under excessive load.

### Agent Configuration Options

```hcl
apm "nomad-apm" {
  driver = "nomad-apm"
}
```

When using a Nomad cluster with ACLs enabled, the following ACL policy will provide
the appropriate permissions for obtaining task group metrics:

```hcl
namespace "default" {
  policy       = "read"
  capabilities = ["read-job"]
}
```

In order to obtain cluster level metrics, the following ACL policy will be required:

```hcl
node {
  policy = "read"
}

namespace "default" {
  policy       = "read"
  capabilities = ["read-job"]
}
```

### Policy Configuration Options - Task Groups

The Nomad APM allows querying Nomad to understand the current resource usage of
a task group.

```hcl
check {
  source = "nomad-apm"
  query  = "avg_cpu"
  ...
}
```

Querying Nomad task group metrics is done using the `operation_metric` syntax,
where valid operations are:

- `avg` - returns the average of the metric value across allocations in the task
  group.

- `min` - returns the lowest metric value among the allocations in the task group.

- `max` - returns the highest metric value among the allocations in the task
  group.

- `sum` - returns the sum of all the metric values for the allocations in the
  task group.


The metric value can be:

- `cpu` - CPU usage as reported by the `nomad.client.allocs.cpu.total_percent`
  metric.

- `memory` - Memory usage as reported by the `nomad.client.allocs.memory.usage`
  metric.

### Policy Configuration Options - Client Nodes

The Nomad APM allows querying Nomad to understand the currently allocated
resources as a percentage of the total available.

```hcl
check {
  source = "nomad-apm"
  query  = "percentage-allocated_cpu"
  ...
}
```

Querying Nomad client node metrics is done using the `operation_metric` syntax,
where valid operations are:

- `percentage-allocated` - returns the allocated percentage of the desired
  resource.

The metric value can be:

- `cpu` - allocated CPU, calculated by comparing the total allocated by the
  scheduler against the total allocatable.

- `memory` - allocated memory, calculated by comparing the total allocated by
  the scheduler against the total allocatable.

[prometheus_io]: https://prometheus.io/
[prometheus_scaler_function]: https://prometheus.io/docs/prometheus/latest/querying/functions/#scalar
[nomad_telemetry_stanza]: /docs/configuration/telemetry#inlinecode-publish_allocation_metrics
[datadog_homepage]: https://www.datadoghq.com/
[datadog_timeseries]: https://docs.datadoghq.com/api/v1/metrics/#query-timeseries-points