
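# Monitoring Configuration

Gaea proxy stores its statistics in Prometheus and uses Grafana to visualize them.

## Monitoring Overview

### Gaea proxy monitoring

[proxy Grafana dashboard template](template/gaea_proxy.json)

The proxy monitoring overview shows the overall running state of Gaea proxy. It mainly includes the following panels:

- Cluster QPS
- Business traffic
- Business request latency
- SQL error count
- CPU load
- Memory load
- Traffic load
- Session count
- Business session count
- Goroutine count
- GC pause time
- Heap object count
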
### Per-namespace (tenant) monitoring

[namespace Grafana dashboard template](template/gaea_namespace.json)

Before importing the template, replace `gaea_test_namespace` in it with the namespace you actually use.

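As a minimal sketch of that substitution, assuming the template can be treated as plain text and that a literal replacement of `gaea_test_namespace` is sufficient, a helper like the one below produces JSON ready to import. The function name and the sample JSON fragment are hypothetical and not part of Gaea.

```python
import json


def fill_namespace(template_text, namespace, placeholder="gaea_test_namespace"):
    """Return the dashboard JSON text with the placeholder namespace replaced."""
    if placeholder not in template_text:
        raise ValueError(f"placeholder {placeholder!r} not found in template")
    rendered = template_text.replace(placeholder, namespace)
    json.loads(rendered)  # fail early if the result is no longer valid JSON
    return rendered


# Hypothetical fragment standing in for template/gaea_namespace.json.
sample = '{"title": "gaea_test_namespace", "expr": "x{namespace=\\"gaea_test_namespace\\"}"}'
print(fill_namespace(sample, "my_namespace"))
```
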
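The tenant monitoring dashboard shows statistics for a single namespace. It mainly includes the following panels:

- QPS
- Traffic
- SQL latency
- SQL error count
- Fingerprints of slow SQL
- Fingerprints of failed SQL
- Connection count
- Idle connection count
- Connection wait queue
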
## Prometheus Configuration

```
- job_name: 'gaea_proxy'
  metrics_path: '/api/metric/metrics'
  static_configs:
  - targets: ["admin_addr1"]
  - targets: ["admin_addr2"]
  - targets: ["admin_addr3"]
  basic_auth:
    username: admin_user
    password: admin_password
```
Change `admin_addr`, `admin_user`, and `admin_password` so that they match the following settings in gaea.ini.
```
; admin address
admin_addr=0.0.0.0:13307
; basic auth
admin_user=admin
admin_password=admin
```
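To check by hand that the scrape settings line up, a small sketch like the following derives the URL and Basic auth header a scraper would send from the gaea.ini values. It assumes gaea.ini has no section header (as in the excerpt above), so a dummy section is prepended for `configparser`, and it assumes plain HTTP, which is the Prometheus default when no `scheme` is set. The function name and the `host` argument are hypothetical. Since `0.0.0.0` is a wildcard listen address, the host to connect to is supplied separately.

```python
import base64
import configparser


def scrape_request(ini_text, host):
    """Build the metrics URL and auth header from gaea.ini-style settings."""
    parser = configparser.ConfigParser(interpolation=None)
    # Assumption: the file has no [section] header, so add a dummy one.
    parser.read_string("[gaea]\n" + ini_text)
    cfg = parser["gaea"]
    # Keep only the port from admin_addr; the listen IP may be a wildcard.
    port = cfg["admin_addr"].rsplit(":", 1)[1]
    credentials = f"{cfg['admin_user']}:{cfg['admin_password']}"
    token = base64.b64encode(credentials.encode()).decode()
    url = f"http://{host}:{port}/api/metric/metrics"
    return url, {"Authorization": f"Basic {token}"}


ini = """
; admin address
admin_addr=0.0.0.0:13307
; basic auth
admin_user=admin
admin_password=admin
"""
url, headers = scrape_request(ini, "127.0.0.1")
print(url)  # http://127.0.0.1:13307/api/metric/metrics
```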
Add Prometheus recording rules:
```
groups:
  - name: gaea_proxy_rule
    rules:
    - record: gaea_proxy_sql_timings_count_rate_each_namespace
      expr: sum(avg(rate(gaea_proxy_sql_timings_count[20s])) without (slave)) by (namespace)
    - record: gaea_proxy_sql_timings_count_rate_total
      expr: sum(sum(avg(rate(gaea_proxy_sql_timings_count[20s])) without (slave)) by (namespace))
    - record: gaea_proxy_flow_counts_rate_namespace_flowdirection
      expr: sum(avg(rate(gaea_proxy_flow_counts[20s])) without (slave)) by (namespace, flowdirection)
    - record: gaea_proxy_flow_counts_rate_namespace
      expr: sum(avg(rate(gaea_proxy_flow_counts[20s])) without (slave)) by (namespace)
    - record: gaea_proxy_flow_counts_rate_total
      expr: sum(sum(avg(rate(gaea_proxy_flow_counts[20s])) without (slave)) by (namespace))
    - record: gaea_proxy_sql_timings_rate_namespace_operation
      expr: sum(delta(gaea_proxy_sql_timings_sum[20s])) by (namespace,operation) / sum(delta(gaea_proxy_sql_timings_count[20s])) by (namespace,operation)
    - record: gaea_proxy_sql_timings_rate_namespace
      expr: sum(delta(gaea_proxy_sql_timings_sum[20s])) by (namespace) / sum(delta(gaea_proxy_sql_timings_count[20s])) by (namespace)
    - record: gaea_proxy_sql_error_counts_rate_namespace
      expr: sum(avg(rate(gaea_proxy_sql_error_counts[20s])) without (instance)) by (namespace)
```
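The two `gaea_proxy_sql_timings_rate_*` rules divide the change in `gaea_proxy_sql_timings_sum` by the change in `gaea_proxy_sql_timings_count` over the same 20-second window, which gives the mean time per SQL statement in that window. The sketch below reproduces that arithmetic on made-up sample values. It ignores the extrapolation Prometheus applies in `delta()` as well as counter resets, and the function name and numbers are hypothetical.

```python
def mean_latency(sum_start, sum_end, count_start, count_end):
    """Mean time per statement over a window: delta(sum) / delta(count)."""
    statements = count_end - count_start
    if statements == 0:
        return None  # no statements in the window; PromQL would yield NaN
    return (sum_end - sum_start) / statements


# Hypothetical samples taken at the start and end of one window.
print(mean_latency(1200.0, 1500.0, 400, 460))  # 300.0 / 60 = 5.0
```

The result is in whatever unit the `_sum` series is recorded in, which this page does not state.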