English | [中文](README.zh_CN.md)

# Metrics

Metrics can be understood simply as a series of numerical measurements.
Different applications require different measurements.
For example, for a web server it might be request times; for a database, it might be the number of active connections or active queries.

Metrics play a crucial role in understanding why your application behaves the way it does.
Suppose you are running a web application and find it running slowly.
To understand what is happening to your application, you need some information.
For example, the application may slow down when the number of requests is high.
If you have request count metrics, you can determine the cause and increase the number of servers to handle the load.

## Metric types

Metrics can be categorized as unidimensional or multidimensional based on their data dimensions.

### Unidimensional metrics

A unidimensional metric consists of three parts: a metric name, a metric value, and a metric aggregation policy.
The metric name uniquely identifies a unidimensional monitoring metric.
The metric aggregation policy describes how to aggregate metric values, for example by taking their sum, average, maximum, or minimum.
For example, if you want to monitor the average CPU load, you can define and report a unidimensional monitoring metric with the metric name "cpu.avg.load":

```golang
import (
    "trpc.group/trpc-go/trpc-go/log"
    "trpc.group/trpc-go/trpc-go/metrics"
)

// Report one sample of the CPU load under the averaging policy.
if err := metrics.ReportSingleDimensionMetrics("cpu.avg.load", 70.0, metrics.PolicyAVG); err != nil {
    log.Infof("reporting cpu.avg.load metric failed: %v", err)
}
```
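
To make the idea of an aggregation policy concrete, the following is a minimal standard-library sketch of how a backend might fold several reported values into one number. The `Policy` type, its constants, and the `aggregate` function are hypothetical names used only for illustration; they are not the metrics package's own types or implementation.

```go
package main

import "fmt"

// Policy is a hypothetical stand-in for an aggregation policy.
// It is not the metrics package's Policy type.
type Policy int

const (
	Sum Policy = iota
	Avg
	Max
	Min
)

// aggregate folds the reported values into one number according to p.
// It returns 0 for an empty slice.
func aggregate(p Policy, values []float64) float64 {
	if len(values) == 0 {
		return 0
	}
	result := values[0]
	for _, v := range values[1:] {
		switch p {
		case Sum, Avg:
			result += v
		case Max:
			if v > result {
				result = v
			}
		case Min:
			if v < result {
				result = v
			}
		}
	}
	if p == Avg {
		result /= float64(len(values))
	}
	return result
}

func main() {
	loads := []float64{60, 70, 80}
	fmt.Println(aggregate(Sum, loads)) // 210
	fmt.Println(aggregate(Avg, loads)) // 70
	fmt.Println(aggregate(Max, loads)) // 80
	fmt.Println(aggregate(Min, loads)) // 60
}
```

The same three samples yield four different results, which is why the policy has to travel with the metric name and value.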

#### Common metrics

Depending on the aggregation policy, the value range of the metric value, and the operations allowed on the metric value, the metrics package provides several common types of unidimensional metrics: counter, gauge, timer, and histogram.
Prefer these built-in metrics, and define other types of unidimensional metrics only if the built-in ones do not meet your needs.

##### Counter

A Counter counts the cumulative amount of a certain type of metric; it keeps accumulating the value from system startup.
It supports +1, -1, +n, and -n operations.
For example, if you want to monitor the number of requests for a particular microservice, you can define a Counter with the metric name "request.num":

```go
import "trpc.group/trpc-go/trpc-go/metrics"

// Obtain the counter, then add 30 to it.
_ = metrics.Counter("request.num")
metrics.IncrCounter("request.num", 30)
```

##### Gauge

A Gauge records the instantaneous value of a certain type of metric.
For example, if you want to monitor the average CPU load, you can define and report a Gauge with the metric name "cpu.avg.load":

```go
import "trpc.group/trpc-go/trpc-go/metrics"

// Obtain the gauge, then set its current value.
_ = metrics.Gauge("cpu.avg.load")
metrics.SetGauge("cpu.avg.load", 0.75)
```
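
The difference between a Counter and a Gauge can be illustrated with a small standard-library sketch: a counter accumulates every delta it receives, while a gauge keeps only the most recent reading. The `counter` and `gauge` types below are hypothetical local types, not the package's implementations, and they are not safe for concurrent use.

```go
package main

import "fmt"

// counter is a hypothetical cumulative metric: every delta is added to the total.
type counter struct{ total float64 }

// IncrBy adds delta (which may be negative) to the running total.
func (c *counter) IncrBy(delta float64) { c.total += delta }

// gauge is a hypothetical instantaneous metric: only the latest value is kept.
type gauge struct{ value float64 }

// Set overwrites the previous reading.
func (g *gauge) Set(v float64) { g.value = v }

func main() {
	var requests counter
	requests.IncrBy(30)
	requests.IncrBy(12)
	fmt.Println(requests.total) // 42

	var load gauge
	load.Set(0.75)
	load.Set(0.5)
	fmt.Println(load.value) // 0.5
}
```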

##### Timer

Timer is a special type of Gauge that measures the time consumed by an operation from its start time and end time.
For example, if you want to monitor the time spent on an operation, you can define and report a timer with the name "operation.time.cost":

```go
import (
    "time"

    "trpc.group/trpc-go/trpc-go/metrics"
)

_ = metrics.Timer("operation.time.cost")
// The operation took 2s.
timeCost := 2 * time.Second
metrics.RecordTimer("operation.time.cost", timeCost)
```
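
In practice the duration is usually measured rather than hard-coded. The sketch below uses only the standard library to take a start time, run an operation, and compute the elapsed time; `measure` is a hypothetical helper name, and the resulting `time.Duration` is the kind of value one would pass to a timer-recording call such as the `RecordTimer` call above.

```go
package main

import (
	"fmt"
	"time"
)

// measure runs op and returns how long it took.
// It is a hypothetical helper for illustration only.
func measure(op func()) time.Duration {
	start := time.Now()
	op()
	return time.Since(start)
}

func main() {
	cost := measure(func() { time.Sleep(10 * time.Millisecond) })
	// The operation slept for at least 10ms, so the measured cost is no smaller.
	fmt.Println(cost >= 10*time.Millisecond) // true
}
```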

##### Histogram

A Histogram records the distribution of a certain type of metric, such as the maximum, minimum, mean, standard deviation, and various quantiles, e.g. the range within which 90% or 95% of the data falls.
Histograms are created with pre-divided buckets, and each collected sample is placed in the corresponding bucket when the Histogram is reported.
For example, if you want to monitor the distribution of request sizes, you can create buckets and put the collected samples into a histogram with the metric name "request.size":

```golang
import "trpc.group/trpc-go/trpc-go/metrics"

buckets := metrics.NewValueBounds(1, 2, 5, 10)
metrics.AddSample("request.size", buckets, 3)
metrics.AddSample("request.size", buckets, 7)
```
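
To show what "placing a sample in the corresponding bucket" can look like, here is a minimal standard-library sketch. It assumes that each bound is the inclusive lower edge of a bucket and that the last bucket is open-ended; this convention is an assumption of the sketch and may differ from what the metrics package does internally. `bucketIndex` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// bucketIndex returns the index of the bucket that holds v, where bucket i
// covers [bounds[i], bounds[i+1]) and the last bucket has no upper edge.
// It returns -1 when v is below the first bound. bounds must be sorted ascending.
func bucketIndex(bounds []float64, v float64) int {
	// sort.Search finds the first bound strictly greater than v;
	// the bucket holding v is the one just before it.
	return sort.Search(len(bounds), func(i int) bool { return bounds[i] > v }) - 1
}

func main() {
	bounds := []float64{1, 2, 5, 10}
	counts := make([]int, len(bounds))
	for _, sample := range []float64{3, 7, 8, 12} {
		if i := bucketIndex(bounds, sample); i >= 0 {
			counts[i]++
		}
	}
	fmt.Println(counts) // [0 1 2 1]
}
```

Because the bounds are sorted, the lookup is a binary search, so adding a sample stays cheap even with many buckets.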

### Multidimensional metrics

Multidimensional metrics usually need to be combined with a backend monitoring platform that calculates and displays the data along different dimensions.
A multidimensional metric consists of a metric name, metric dimension information, and multiple unidimensional metrics.
For example, if you want to monitor the requests received by a service along different dimensions such as application name and service name, you can create the following multidimensional metric:

```go
import (
    "time"

    "trpc.group/trpc-go/trpc-go/log"
    "trpc.group/trpc-go/trpc-go/metrics"
)

if err := metrics.ReportMultiDimensionMetricsX("request",
    []*metrics.Dimension{
        {
            Name:  "app",
            Value: "trpc-go",
        },
        {
            Name:  "server",
            Value: "example",
        },
        {
            Name:  "service",
            Value: "hello",
        },
    },
    []*metrics.Metrics{
        metrics.NewMetrics("request-count", 1, metrics.PolicySUM),
        metrics.NewMetrics("request-cost", float64(time.Second), metrics.PolicyAVG),
        metrics.NewMetrics("request-size", 30, metrics.PolicyHistogram),
    }); err != nil {
    log.Infof("reporting request multi dimension metrics failed: %v", err)
}
```
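
As a rough illustration of what a backend does with dimension information, the sketch below groups reported values by their dimension set, so that each distinct combination of labels forms its own series. The local `Dimension` type and the `seriesKey` helper are hypothetical and use only the standard library; they do not reflect how any particular monitoring platform stores data.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Dimension is a hypothetical local name/value label pair,
// not the metrics package's Dimension type.
type Dimension struct {
	Name, Value string
}

// seriesKey builds a stable key from a dimension set,
// independent of the order in which the dimensions were given.
func seriesKey(dims []Dimension) string {
	parts := make([]string, 0, len(dims))
	for _, d := range dims {
		parts = append(parts, d.Name+"="+d.Value)
	}
	sort.Strings(parts)
	return strings.Join(parts, ",")
}

func main() {
	// sums holds one running request count per dimension combination.
	sums := map[string]float64{}
	report := func(dims []Dimension, count float64) { sums[seriesKey(dims)] += count }

	report([]Dimension{{"app", "trpc-go"}, {"service", "hello"}}, 1)
	report([]Dimension{{"service", "hello"}, {"app", "trpc-go"}}, 1)
	report([]Dimension{{"app", "trpc-go"}, {"service", "bye"}}, 1)

	fmt.Println(sums["app=trpc-go,service=hello"]) // 2
	fmt.Println(sums["app=trpc-go,service=bye"])   // 1
}
```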

## Reporting to external monitoring systems

Metrics need to be reported to a monitoring system, which may be a company-internal platform or an open-source system such as Prometheus.
The metrics package provides a generic `Sink` interface for this purpose:

```golang
// Sink defines the interface an external monitor system should provide.
type Sink interface {
    // Name returns the name of the monitor system.
    Name() string
    // Report reports a record to monitor system.
    Report(rec Record, opts ...Option) error
}
```

To integrate with a different monitoring system, you only need to implement the `Sink` interface and register the implementation with the metrics package.
For example, reporting metrics to the console usually takes the following three steps.

1. Create a `ConsoleSink` struct that implements the `Sink` interface.
   The metrics package already has a built-in implementation of `ConsoleSink`, which can be created directly via `metrics.NewConsoleSink()`.

2. Register the `ConsoleSink` with the metrics package.

3. Create various metrics and report them.

The following code snippet demonstrates the above three steps:

```golang
import "trpc.group/trpc-go/trpc-go/metrics"

// 1. Create a `ConsoleSink` struct that implements the `Sink` interface.
s := metrics.NewConsoleSink()

// 2. Register the `ConsoleSink` with the metrics package.
metrics.RegisterMetricsSink(s)

// 3. Create various metrics and report them.
_ = metrics.Counter("request.num")
metrics.IncrCounter("request.num", 30)
```
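
The register-then-report pattern behind the steps above can be sketched with the standard library alone. Everything in the block below (`record`, `sink`, `memorySink`, `registry`) is a simplified, hypothetical stand-in chosen for illustration; the real `Sink`, `Record`, and `Option` types live in the metrics package and carry more information than shown here.

```go
package main

import "fmt"

// record is a simplified stand-in for a reported metric record.
type record struct {
	name  string
	value float64
}

// sink mirrors the shape of a monitoring backend: a name plus a report method.
type sink interface {
	Name() string
	Report(rec record) error
}

// memorySink keeps every record it receives, which makes it easy to inspect.
type memorySink struct{ records []record }

func (m *memorySink) Name() string { return "memory" }

func (m *memorySink) Report(rec record) error {
	m.records = append(m.records, rec)
	return nil
}

// registry maps sink names to registered sinks.
type registry map[string]sink

// register adds s under its own name, replacing any sink with the same name.
func (r registry) register(s sink) { r[s.Name()] = s }

// report sends rec to every registered sink and returns the first error seen.
func (r registry) report(rec record) error {
	var first error
	for _, s := range r {
		if err := s.Report(rec); err != nil && first == nil {
			first = fmt.Errorf("sink %s: %w", s.Name(), err)
		}
	}
	return first
}

func main() {
	reg := registry{}
	mem := &memorySink{}
	reg.register(mem)

	if err := reg.report(record{name: "request.num", value: 30}); err != nil {
		fmt.Println("report failed:", err)
	}
	fmt.Println(len(mem.records), mem.records[0].name) // 1 request.num
}
```

Keeping the reporting call independent of any concrete backend is what lets the same metric code feed the console, an in-house platform, or Prometheus by swapping the registered sink.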