---
layout: docs
page_title: Telemetry
sidebar_title: Telemetry
description: >
  Overview of runtime metrics available in the Nomad Autoscaler.
---

# Nomad Autoscaler Telemetry

The Nomad Autoscaler agent collects various runtime metrics about the performance
of different libraries and subsystems. These metrics are aggregated on a
ten-second interval and are retained for one minute. To configure the telemetry
output, please see the [agent configuration][agent_telemetry_config].

This data can be accessed via the `/v1/metrics` HTTP endpoint, by sending a
signal to the Nomad Autoscaler process, or via a number of integrations.

To dump this data by signal, send `USR1` on Unix or `BREAK` on Windows to the
Nomad Autoscaler process. Once the Nomad Autoscaler receives the signal, it
dumps the current telemetry information to the agent's `stderr`.

This telemetry information can be used for debugging or otherwise
getting a better view of what the Nomad Autoscaler is doing.
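
The two access paths described above can be sketched in Python. This is only a minimal sketch under stated assumptions: the agent address `http://127.0.0.1:8080` is an assumed value to replace with your agent's HTTP address, the endpoint's response body is assumed to be JSON, and the helper names `metrics_url`, `fetch_metrics`, and `request_dump` are hypothetical. The signal helper covers the Unix `USR1` case only.

```python
import json
import os
import signal
from urllib.request import urlopen

# Assumption: address of the Nomad Autoscaler agent's HTTP API; adjust as needed.
AGENT_ADDR = "http://127.0.0.1:8080"


def metrics_url(agent_addr: str = AGENT_ADDR) -> str:
    """Build the URL of the /v1/metrics endpoint for the given agent address."""
    return agent_addr.rstrip("/") + "/v1/metrics"


def fetch_metrics(agent_addr: str = AGENT_ADDR, timeout: float = 5.0):
    """Fetch the current metrics; the body is assumed to be JSON."""
    with urlopen(metrics_url(agent_addr), timeout=timeout) as resp:
        return json.load(resp)


def request_dump(pid: int) -> None:
    """Unix only: send USR1 so the agent dumps telemetry to its stderr."""
    if not hasattr(signal, "SIGUSR1"):
        raise NotImplementedError("USR1 is not available on this platform")
    os.kill(pid, signal.SIGUSR1)
```

Calling `fetch_metrics()` needs a running agent, and `request_dump(pid)` needs the agent's process ID. The dump itself appears on the agent's `stderr`, not in the calling process.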

Below is sample output of a telemetry dump:

```text
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.sys_bytes': 74793216.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.malloc_count': 219856.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.free_count': 183613.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.total_gc_pause_ns': 348822.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.total_gc_runs': 5.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.num_goroutines': 12.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.policy.total_num': 0.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.alloc_bytes': 4316568.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.heap_objects': 36243.000
[2020-08-25 10:01:20 +0100 BST][S] 'nomad-autoscaler.runtime.gc_pause_ns': Count: 5 Min: 38083.000 Mean: 69764.400 Max: 122291.000 Stddev: 31487.808 Sum: 348822.000 LastUpdated: 2020-08-25 10:01:26.574809 +0100 BST m=+1.241576679
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.alloc_bytes': 4370504.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.malloc_count': 220853.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.free_count': 183613.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.policy.total_num': 0.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.num_goroutines': 12.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.total_gc_pause_ns': 348822.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.total_gc_runs': 5.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.sys_bytes': 74793216.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.heap_objects': 37240.000
```

## Runtime Metrics

The runtime metrics help you understand the Nomad Autoscaler agent's memory
usage and load.

<table>
  <thead>
    <tr>
      <th>Metric</th>
      <th>Description</th>
      <th>Type</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.num_goroutines</code>
      </td>
      <td>Number of running goroutines</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.alloc_bytes</code>
      </td>
      <td>The number of allocated heap bytes</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.sys_bytes</code>
      </td>
      <td>The total bytes of memory obtained from the OS</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.malloc_count</code>
      </td>
      <td>Cumulative count of heap objects allocated</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.free_count</code>
      </td>
      <td>Cumulative count of heap objects freed</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.heap_objects</code>
      </td>
      <td>Number of allocated heap objects</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.total_gc_pause_ns</code>
      </td>
      <td>Cumulative nanoseconds in GC stop-the-world pauses</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.total_gc_runs</code>
      </td>
      <td>Number of completed GC cycles</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.gc_pause_ns</code>
      </td>
      <td>Number of nanoseconds to complete the last GC cycle</td>
      <td>Timer</td>
    </tr>
  </tbody>
</table>

## Policy Metrics

Policy metrics provide insight into the performance of the Nomad Autoscaler's
policy handling.

<table>
  <thead>
    <tr>
      <th>Metric</th>
      <th>Description</th>
      <th>Type</th>
      <th>Labels</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>
        <code>nomad-autoscaler.policy.total_num</code>
      </td>
      <td>The number of policies currently held within the autoscaler</td>
      <td>Gauge</td>
      <td></td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.policy.source.error_count</code>
      </td>
      <td>Tracks the number of errors generated by the policy sources</td>
      <td>Counter</td>
      <td>policy_source</td>
    </tr>
  </tbody>
</table>

## Scaling Metrics

Scaling metrics provide insight into the performance of scaling actions as well
as overall success and failure counters.

<table>
  <thead>
    <tr>
      <th>Metric</th>
      <th>Description</th>
      <th>Type</th>
      <th>Labels</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>
        <code>nomad-autoscaler.scale.evaluate_ms</code>
      </td>
      <td>The time taken to evaluate the checks within a single policy</td>
      <td>Timer</td>
      <td>policy_id, target_name</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.scale.invoke_ms</code>
      </td>
      <td>The time taken to invoke scaling based on the scaling evaluations</td>
      <td>Timer</td>
      <td>policy_id, target_name</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.scale.invoke.success_count</code>
      </td>
      <td>Tracks the number of successful scaling actions triggered</td>
      <td>Counter</td>
      <td></td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.scale.invoke.error_count</code>
      </td>
      <td>Tracks the number of unsuccessful scaling actions triggered</td>
      <td>Counter</td>
      <td></td>
    </tr>
  </tbody>
</table>

## Plugin Metrics

Plugin metrics provide insight into the performance of Nomad Autoscaler plugins
and help identify potential bottlenecks or latency issues.

<table>
  <thead>
    <tr>
      <th>Metric</th>
      <th>Description</th>
      <th>Type</th>
      <th>Labels</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>
        <code>nomad-autoscaler.plugin.manager.access_ms</code>
      </td>
      <td>The time taken to dispense a plugin</td>
      <td>Timer</td>
      <td></td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.target.status.invoke_ms</code>
      </td>
      <td>The time taken to perform the target plugin status call</td>
      <td>Timer</td>
      <td>policy_id, plugin_name</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.target.scale.invoke_ms</code>
      </td>
      <td>The time taken to perform the target plugin scale call</td>
      <td>Timer</td>
      <td>policy_id, plugin_name</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.apm.query.invoke_ms</code>
      </td>
      <td>The time taken to perform the APM plugin query call</td>
      <td>Timer</td>
      <td>policy_id, plugin_name</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.strategy.run.invoke_ms</code>
      </td>
      <td>The time taken to perform the strategy plugin run call</td>
      <td>Timer</td>
      <td>policy_id, plugin_name</td>
    </tr>
  </tbody>
</table>

[agent_telemetry_config]: /docs/autoscaling/agent#telemetry-block
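
As a rough illustration of working with the gauge (`[G]`) lines from the sample telemetry dump near the top of this page, the sketch below groups gauge values by interval and compares `malloc_count - free_count` with `heap_objects`. The line format is inferred from that sample only and is not a documented contract. The function name `parse_gauges` is hypothetical, and the extra `pathfinder` segment in the sample's gauge names is assumed to be a host name, so values are keyed by the final name segment.

```python
import re
from collections import defaultdict
from typing import Dict

# Format inferred from the sample dump: [timestamp][G] 'metric.name': value
GAUGE_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[G\] '(?P<name>[^']+)': (?P<value>-?\d+(?:\.\d+)?)$"
)


def parse_gauges(dump: str) -> Dict[str, Dict[str, float]]:
    """Group gauge values by interval timestamp, keyed by the last name segment."""
    intervals: Dict[str, Dict[str, float]] = defaultdict(dict)
    for line in dump.splitlines():
        match = GAUGE_LINE.match(line.strip())
        if match is None:
            continue  # skip sample ([S]) lines and anything unrecognised
        short_name = match["name"].rsplit(".", 1)[-1]
        intervals[match["ts"]][short_name] = float(match["value"])
    return dict(intervals)


# Three gauge lines copied from the first interval of the sample dump.
SAMPLE = """\
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.malloc_count': 219856.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.free_count': 183613.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.heap_objects': 36243.000
"""

for ts, gauges in parse_gauges(SAMPLE).items():
    # Objects allocated minus objects freed should match the live heap objects.
    live_objects = gauges["malloc_count"] - gauges["free_count"]
    print(ts, live_objects, gauges["heap_objects"])
```

In both intervals of the sample dump, `malloc_count` minus `free_count` equals `heap_objects`, which is a quick sanity check when reading these gauges together.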