---
layout: docs
page_title: Telemetry
sidebar_title: Telemetry
description: >
  Overview of runtime metrics available in the Nomad Autoscaler.
---

# Nomad Autoscaler Telemetry

The Nomad Autoscaler agent collects various runtime metrics about the performance
of different libraries and subsystems. These metrics are aggregated on a
ten-second interval and are retained for one minute. To configure telemetry
output, see the [agent configuration][agent_telemetry_config].

This data can be accessed through the `/v1/metrics` HTTP endpoint, by sending a
signal to the Nomad Autoscaler process, or through a number of integrations.

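As a rough illustration of the HTTP route, the following Python sketch fetches
`/v1/metrics` and indexes the gauges by name. The agent address
`http://127.0.0.1:8080` and the JSON layout (a top-level `Gauges` list whose
entries carry `Name` and `Value`) are assumptions rather than details stated on
this page, so check them against the response from your own agent. The helper
names are illustrative only.

```python
import json
from urllib.request import urlopen

# Assumption: address of the Nomad Autoscaler agent's HTTP listener.
AGENT_ADDR = "http://127.0.0.1:8080"


def gauges_by_name(payload):
    """Map gauge name -> value from a decoded /v1/metrics response.

    Assumes a top-level "Gauges" list of {"Name": ..., "Value": ...} objects.
    """
    return {g["Name"]: g["Value"] for g in payload.get("Gauges", [])}


def fetch_metrics(addr=AGENT_ADDR, timeout=5.0):
    """GET /v1/metrics from the agent and decode the JSON body."""
    with urlopen(f"{addr}/v1/metrics", timeout=timeout) as resp:
        return json.load(resp)
```

Against a running agent, `gauges_by_name(fetch_metrics())` would return a
dictionary keyed by metric name.
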
To view this data by signal, send `USR1` on Unix or `BREAK` on Windows to the
Nomad Autoscaler process. Once the Nomad Autoscaler receives the signal, it
dumps the current telemetry information to the agent's `stderr`.

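On Unix, the signal can be sent from a script as well as from a shell. A minimal
Python sketch, assuming the process ID of the agent is already known;
`request_telemetry_dump` is a hypothetical helper name, and the Windows `BREAK`
case is deliberately left out:

```python
import os
import signal


def request_telemetry_dump(pid):
    """Send SIGUSR1 to the process with the given PID (Unix only).

    The dump is written to the agent's stderr; nothing is returned here.
    """
    if not hasattr(signal, "SIGUSR1"):
        raise OSError("SIGUSR1 is not available on this platform")
    os.kill(pid, signal.SIGUSR1)
```
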
This telemetry information can be used for debugging or otherwise
getting a better view of what the Nomad Autoscaler is doing.

Below is sample output of a telemetry dump:

```text
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.sys_bytes': 74793216.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.malloc_count': 219856.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.free_count': 183613.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.total_gc_pause_ns': 348822.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.total_gc_runs': 5.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.num_goroutines': 12.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.policy.total_num': 0.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.alloc_bytes': 4316568.000
[2020-08-25 10:01:20 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.heap_objects': 36243.000
[2020-08-25 10:01:20 +0100 BST][S] 'nomad-autoscaler.runtime.gc_pause_ns': Count: 5 Min: 38083.000 Mean: 69764.400 Max: 122291.000 Stddev: 31487.808 Sum: 348822.000 LastUpdated: 2020-08-25 10:01:26.574809 +0100 BST m=+1.241576679
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.alloc_bytes': 4370504.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.malloc_count': 220853.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.free_count': 183613.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.policy.total_num': 0.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.num_goroutines': 12.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.total_gc_pause_ns': 348822.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.total_gc_runs': 5.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.sys_bytes': 74793216.000
[2020-08-25 10:01:30 +0100 BST][G] 'nomad-autoscaler.pathfinder.runtime.heap_objects': 37240.000
```

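When a dump has been captured from `stderr`, it can help to turn each line into
a structured record. The Python sketch below is written against the sample
output only: it treats `[G]` lines as a single numeric value and any other kind
(such as `[S]`) as a set of `Key: value` fields. Reading `G` as gauge and `S` as
sample, and the extra `pathfinder` segment in the gauge names as the host name
of the machine that produced the dump, are inferences from the sample, so the
parser keeps metric names verbatim. `parse_dump_line` is a hypothetical helper
name.

```python
import re

# [timestamp][KIND] 'metric.name': rest-of-line
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\[(?P<kind>[A-Z])\] '(?P<name>[^']+)': (?P<rest>.*)$"
)
# Numeric summary fields seen on the [S] line of the sample dump.
FIELD_RE = re.compile(r"(Count|Min|Mean|Max|Stddev|Sum): (-?\d+(?:\.\d+)?)")


def parse_dump_line(line):
    """Parse one telemetry dump line into a dict, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    record = {"timestamp": m["ts"], "kind": m["kind"], "name": m["name"]}
    if m["kind"] == "G":
        # Gauge lines carry a single float after the colon.
        record["value"] = float(m["rest"])
    else:
        # Other kinds carry several "Key: value" pairs; LastUpdated is ignored.
        record["fields"] = {k: float(v) for k, v in FIELD_RE.findall(m["rest"])}
    return record
```
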
## Runtime Metrics

Runtime metrics help you understand the memory usage and load pressure of the
Nomad Autoscaler agent.

<table>
  <thead>
    <tr>
      <th>Metric</th>
      <th>Description</th>
      <th>Type</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.num_goroutines</code>
      </td>
      <td>Number of running goroutines</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.alloc_bytes</code>
      </td>
      <td>The number of allocated heap bytes</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.sys_bytes</code>
      </td>
      <td>The total bytes of memory obtained from the OS</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.malloc_count</code>
      </td>
      <td>Cumulative count of heap objects allocated</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.free_count</code>
      </td>
      <td>Cumulative count of heap objects freed</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.heap_objects</code>
      </td>
      <td>Number of allocated heap objects</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.total_gc_pause_ns</code>
      </td>
      <td>Cumulative nanoseconds in GC stop-the-world pauses</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.total_gc_runs</code>
      </td>
      <td>Number of completed GC cycles</td>
      <td>Gauge</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.runtime.gc_pause_ns</code>
      </td>
      <td>Number of nanoseconds to complete the last GC cycle</td>
      <td>Timer</td>
    </tr>
  </tbody>
</table>

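Several of these gauges are related arithmetically, which gives a quick sanity
check on a dump. In the sample output earlier on this page,
`malloc_count - free_count` equals `heap_objects` (219856 - 183613 = 36243), and
`total_gc_pause_ns / total_gc_runs` equals the reported mean of `gc_pause_ns`
(348822 / 5 = 69764.4). A small Python sketch of those two derivations follows;
the function names are illustrative, and the relationships are read off the
sample rather than stated by the table.

```python
def live_heap_objects(malloc_count, free_count):
    """Heap objects allocated and not yet freed."""
    return malloc_count - free_count


def mean_gc_pause_ns(total_gc_pause_ns, total_gc_runs):
    """Average stop-the-world pause per completed GC cycle, in nanoseconds."""
    if total_gc_runs == 0:
        return 0.0  # no GC cycle has completed yet
    return total_gc_pause_ns / total_gc_runs


# Values taken from the 10:01:20 interval of the sample dump.
print(live_heap_objects(219856, 183613))  # 36243
print(mean_gc_pause_ns(348822, 5))        # 69764.4
```
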
## Policy Metrics

Policy metrics provide insights into the performance of the Nomad Autoscaler's
policy handling.

<table>
  <thead>
    <tr>
      <th>Metric</th>
      <th>Description</th>
      <th>Type</th>
      <th>Labels</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>
        <code>nomad-autoscaler.policy.total_num</code>
      </td>
      <td>The number of policies currently held within the autoscaler</td>
      <td>Gauge</td>
      <td></td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.policy.source.error_count</code>
      </td>
      <td>Tracks the number of errors generated by the policy sources</td>
      <td>Counter</td>
      <td>policy_source</td>
    </tr>
  </tbody>
</table>

## Scaling Metrics

Scaling metrics provide insight into the performance of scaling actions as well
as overall success and failure counters.

<table>
  <thead>
    <tr>
      <th>Metric</th>
      <th>Description</th>
      <th>Type</th>
      <th>Labels</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>
        <code>nomad-autoscaler.scale.evaluate_ms</code>
      </td>
      <td>The time taken to evaluate the checks within a single policy</td>
      <td>Timer</td>
      <td>policy_id, target_name</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.scale.invoke_ms</code>
      </td>
      <td>The time taken to invoke scaling based on the scaling evaluations</td>
      <td>Timer</td>
      <td>policy_id, target_name</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.scale.invoke.success_count</code>
      </td>
      <td>Tracks the number of successful scaling actions triggered</td>
      <td>Counter</td>
      <td></td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.scale.invoke.error_count</code>
      </td>
      <td>Tracks the number of unsuccessful scaling actions triggered</td>
      <td>Counter</td>
      <td></td>
    </tr>
  </tbody>
</table>

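The two `scale.invoke` counters are most useful in combination. As a minimal
sketch, assuming both counter values were read over the same time window, the
share of failed scaling actions can be computed as shown here.
`scale_error_ratio` is a hypothetical helper, not part of the Nomad Autoscaler.

```python
def scale_error_ratio(success_count, error_count):
    """Fraction of triggered scaling actions that failed, in the range 0.0 to 1.0."""
    total = success_count + error_count
    if total == 0:
        return 0.0  # no scaling actions were triggered in the window
    return error_count / total
```

For example, 9 successes and 1 error over a window give a ratio of 0.1, which
could be compared against an alerting threshold of your choosing.
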
## Plugin Metrics

Plugin metrics provide insight into the performance of Nomad Autoscaler plugins
and help identify potential bottlenecks or latency issues.

<table>
  <thead>
    <tr>
      <th>Metric</th>
      <th>Description</th>
      <th>Type</th>
      <th>Labels</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>
        <code>nomad-autoscaler.plugin.manager.access_ms</code>
      </td>
      <td>The time taken to dispense a plugin</td>
      <td>Timer</td>
      <td></td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.target.status.invoke_ms</code>
      </td>
      <td>The time taken to perform the target plugin status call</td>
      <td>Timer</td>
      <td>policy_id, plugin_name</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.target.scale.invoke_ms</code>
      </td>
      <td>The time taken to perform the target plugin scale call</td>
      <td>Timer</td>
      <td>policy_id, plugin_name</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.apm.query.invoke_ms</code>
      </td>
      <td>The time taken to perform the APM plugin query call</td>
      <td>Timer</td>
      <td>policy_id, plugin_name</td>
    </tr>
    <tr>
      <td>
        <code>nomad-autoscaler.strategy.run.invoke_ms</code>
      </td>
      <td>The time taken to perform the strategy plugin run call</td>
      <td>Timer</td>
      <td>policy_id, plugin_name</td>
    </tr>
  </tbody>
</table>

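To look for bottlenecks, one approach is to rank the plugin timers by their mean
duration. The Python sketch below assumes each timer sample has already been
collected into a dictionary with `Name`, `Mean`, and `Labels` keys; that shape,
like the `slowest_timers` helper itself, is hypothetical and only meant to show
the idea. The values are taken to be milliseconds, going by the `_ms` suffix.

```python
def slowest_timers(samples, limit=3):
    """Return up to `limit` timer samples with the highest mean, slowest first.

    Each sample is assumed to be a dict with "Name", "Mean" and "Labels" keys.
    """
    return sorted(samples, key=lambda s: s["Mean"], reverse=True)[:limit]


# Made-up values, for illustration only.
samples = [
    {"Name": "nomad-autoscaler.apm.query.invoke_ms", "Mean": 42.0,
     "Labels": {"plugin_name": "prometheus"}},
    {"Name": "nomad-autoscaler.strategy.run.invoke_ms", "Mean": 0.3,
     "Labels": {"plugin_name": "target-value"}},
    {"Name": "nomad-autoscaler.target.scale.invoke_ms", "Mean": 118.5,
     "Labels": {"plugin_name": "nomad-target"}},
]
for s in slowest_timers(samples, limit=2):
    print(s["Labels"]["plugin_name"], s["Name"], s["Mean"])
```
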
[agent_telemetry_config]: /docs/autoscaling/agent#telemetry-block