---
layout: docs
page_title: Telemetry
description: >
  Overview of runtime metrics available in the Nomad Autoscaler.
---

# Nomad Autoscaler Telemetry

The Nomad Autoscaler agent collects runtime metrics about the performance
of its libraries and subsystems. These metrics are aggregated on a ten-second
interval and are retained for one minute. To configure the telemetry output,
see the [agent configuration][agent_telemetry_config].

This data can be accessed via the `/v1/metrics` HTTP endpoint, by sending a
signal to the Nomad Autoscaler process, or via a number of integrations.

To dump this data with a signal, send `USR1` on Unix or `BREAK` on Windows to
the Nomad Autoscaler process. Once the Nomad Autoscaler receives the signal, it
dumps the current telemetry information to the agent's `stderr`.

This telemetry information can be used for debugging or otherwise
getting a better view of what the Nomad Autoscaler is doing.
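
As an illustration of the HTTP access path, the following Python sketch fetches
`/v1/metrics` and lists the reported gauges. It rests on assumptions that are not
stated on this page: that the agent's HTTP listener is reachable at
`http://127.0.0.1:8080`, and that the response is a JSON object with a `Gauges`
list whose entries carry `Name` and `Value` keys. The helper names
`fetch_metrics` and `gauges_by_name` are hypothetical.

```python
import json
import urllib.request

# Assumption: the agent's HTTP listener is reachable at this address.
AUTOSCALER_ADDR = "http://127.0.0.1:8080"


def fetch_metrics(addr=AUTOSCALER_ADDR, timeout=5.0):
    """Fetch and decode the JSON body returned by the /v1/metrics endpoint."""
    with urllib.request.urlopen(f"{addr}/v1/metrics", timeout=timeout) as resp:
        return json.load(resp)


def gauges_by_name(payload):
    """Map gauge names to their values.

    Assumption: the payload has a "Gauges" list whose entries carry
    "Name" and "Value" keys. A missing or null list yields an empty dict.
    """
    return {g["Name"]: g["Value"] for g in payload.get("Gauges") or []}


if __name__ == "__main__":
    # Requires a running agent at AUTOSCALER_ADDR.
    for name, value in sorted(gauges_by_name(fetch_metrics()).items()):
        print(f"{name}: {value}")
```

If the agent listens elsewhere, pass its address to `fetch_metrics`. The
decoding step is kept separate from the fetch so it can be exercised against a
saved response.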

Below is sample output of a telemetry dump:

```text
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.sys_bytes': 74793216.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.malloc_count': 219856.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.free_count': 183613.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.total_gc_pause_ns': 348822.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.total_gc_runs': 5.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.num_goroutines': 12.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.policy.total_num': 0.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.alloc_bytes': 4316568.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.heap_objects': 36243.000
[2020-08-25 10:01:20 +0100 BST][S] 'nomad-autoscaler.runtime.gc_pause_ns': Count: 5 Min: 38083.000 Mean: 69764.400 Max: 122291.000 Stddev: 31487.808 Sum: 348822.000 LastUpdated: 2020-08-25 10:01:26.574809 +0100 BST m=+1.241576679
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.alloc_bytes': 4370504.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.malloc_count': 220853.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.free_count': 183613.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.policy.total_num': 0.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.num_goroutines': 12.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.total_gc_pause_ns': 348822.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.total_gc_runs': 5.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.sys_bytes': 74793216.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.heap_objects': 37240.000
```

## Runtime Metrics

The runtime metrics help you understand the memory usage and load of the
Nomad Autoscaler agent.

<table>
  <thead>
    <tr>
      <th>Metric</th>
      <th>Description</th>
      <th>Type</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.num_goroutines</code>
      </td>
      <td>Number of running goroutines</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.alloc_bytes</code>
      </td>
      <td>The number of allocated heap bytes</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.sys_bytes</code>
      </td>
      <td>The total bytes of memory obtained from the OS</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.malloc_count</code>
      </td>
      <td>Cumulative count of heap objects allocated</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.free_count</code>
      </td>
      <td>Cumulative count of heap objects freed</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.heap_objects</code>
      </td>
      <td>Number of allocated heap objects</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.total_gc_pause_ns</code>
      </td>
      <td>Cumulative nanoseconds in GC stop-the-world pauses</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.total_gc_runs</code>
      </td>
      <td>Number of completed GC cycles</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.gc_pause_ns</code>
      </td>
      <td>Number of nanoseconds to complete the last GC cycle</td>
      <td>Timer</td>
    </tr>
  </tbody>
</table>

## Policy Metrics

Policy metrics provide insight into the performance of the Nomad Autoscaler's
policy handling.

<table>
  <thead>
    <tr>
      <th>Metric</th>
      <th>Description</th>
      <th>Type</th>
      <th>Labels</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>
        <code>nomad-autoscaler.policy.total_num</code>
      </td>
      <td>The number of policies currently held within the autoscaler</td>
      <td>Gauge</td>
      <td></td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.policy.source.error_count</code>
      </td>
      <td>Tracks the number of errors generated by the policy sources</td>
      <td>Counter</td>
      <td>policy_source</td>
    </tr>
  </tbody>
</table>

## Scaling Metrics

Scaling metrics provide insight into the performance of scaling actions as well
as overall success and failure counters.

<table>
  <thead>
    <tr>
      <th>Metric</th>
      <th>Description</th>
      <th>Type</th>
      <th>Labels</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>
        <code>nomad-autoscaler.scale.evaluate_ms</code>
      </td>
      <td>The time taken to evaluate the checks within a single policy</td>
      <td>Timer</td>
      <td>policy_id, target_name</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.scale.invoke_ms</code>
      </td>
      <td>The time taken to invoke scaling based on the scaling evaluations</td>
      <td>Timer</td>
      <td>policy_id, target_name</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.scale.invoke.success_count</code>
      </td>
      <td>Tracks the number of successful scaling actions triggered</td>
      <td>Counter</td>
      <td></td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.scale.invoke.error_count</code>
      </td>
      <td>Tracks the number of unsuccessful scaling actions triggered</td>
      <td>Counter</td>
      <td></td>
    </tr>
  </tbody>
</table>

## Plugin Metrics

Plugin metrics provide insight into the performance of Nomad Autoscaler plugins
and help identify potential bottlenecks or latency issues.

<table>
  <thead>
    <tr>
      <th>Metric</th>
      <th>Description</th>
      <th>Type</th>
      <th>Labels</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>
        <code>nomad-autoscaler.plugin.manager.access_ms</code>
      </td>
      <td>The time taken to dispense a plugin</td>
      <td>Timer</td>
      <td></td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.target.status.invoke_ms</code>
      </td>
      <td>The time taken to perform the target plugin status call</td>
      <td>Timer</td>
      <td>policy_id, plugin_name</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.target.scale.invoke_ms</code>
      </td>
      <td>The time taken to perform the target plugin scale call</td>
      <td>Timer</td>
      <td>policy_id, plugin_name</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.apm.query.invoke_ms</code>
      </td>
      <td>The time taken to perform the APM plugin query call</td>
      <td>Timer</td>
      <td>policy_id, plugin_name</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.strategy.run.invoke_ms</code>
      </td>
      <td>The time taken to perform the strategy plugin run call</td>
      <td>Timer</td>
      <td>policy_id, plugin_name</td>
    </tr>
  </tbody>
</table>

[agent_telemetry_config]: /tools/autoscaling/agent#telemetry-block
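
For ad hoc debugging, the lines of a signal-triggered telemetry dump can be
turned into structured records. The Python sketch below is based only on the
line layout visible in the sample output on this page: a bracketed timestamp, a
bracketed one-letter type marker such as `G` or `S`, a quoted metric name, and
the remainder of the line. The helper names `parse_line` and `latest_gauges`
are hypothetical, and markers other than `G` and `S` are accepted but not
interpreted.

```python
import re

# Matches lines such as:
# [2020-08-25 10:01:20 +0100 BST][G] 'some.metric.name': 12.000
LINE_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\[(?P<kind>[A-Z])\] '(?P<name>[^']+)': (?P<rest>.*)$"
)


def parse_line(line):
    """Parse one dump line into a dict, or return None if it does not match.

    "value" is a float when the remainder of the line is a single number
    (as for gauge lines) and None otherwise (as for sample lines, whose
    summary text stays available under "rest").
    """
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    try:
        record["value"] = float(record["rest"])
    except ValueError:
        record["value"] = None
    return record


def latest_gauges(dump):
    """Return the last value seen for each gauge ("G") line in a dump."""
    gauges = {}
    for line in dump.splitlines():
        record = parse_line(line)
        if record is not None and record["kind"] == "G":
            gauges[record["name"]] = record["value"]
    return gauges


if __name__ == "__main__":
    sample = (
        "[2020-08-25 10:01:20 +0100 BST][G] "
        "'nomad-autoscaler.pathfinder.runtime.alloc_bytes': 4316568.000\n"
        "[2020-08-25 10:01:30 +0100 BST][G] "
        "'nomad-autoscaler.pathfinder.runtime.alloc_bytes': 4370504.000\n"
    )
    print(latest_gauges(sample))
```

Because later lines overwrite earlier ones, `latest_gauges` reports the most
recent aggregation interval for each gauge when a dump contains several.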