# Monitoring Configuration

Gaea proxy stores its statistics in Prometheus and uses Grafana to visualize them.

## Monitoring Overview

### Gaea proxy monitoring

[proxy Grafana configuration template](template/gaea_proxy.json)

The proxy overview shows the overall running state of Gaea proxy. It contains the following panels:

- Cluster QPS
- Business traffic
- Business request latency
- SQL error count
- CPU load
- Memory load
- Traffic load
- Session count
- Business session count
- Goroutine count
- GC pause time
- Heap object count


### Per-tenant metrics monitoring

[namespace Grafana configuration template](template/gaea_namespace.json)

Before importing the template, replace `gaea_test_namespace` in it with the namespace you actually use.

The tenant dashboard shows the statistics of a single namespace. It contains the following panels:

- QPS
- Traffic
- SQL latency
- SQL error count
- Slow SQL fingerprints
- Error SQL fingerprints
- Connection count
- Idle connection count
- Connection wait queue


## Prometheus configuration

```
- job_name: 'gaea_proxy'
  metrics_path: '/api/metric/metrics'
  static_configs:
  - targets: ["admin_addr1"]
  - targets: ["admin_addr2"]
  - targets: ["admin_addr3"]
  basic_auth:
    username: admin_user
    password: admin_password
```
Change `admin_addr`, `admin_user`, and `admin_password` so that they match the following entries in gaea.ini.
```
; admin address
admin_addr=0.0.0.0:13307
; basic auth
admin_user=admin
admin_password=admin
```
Add the following Prometheus recording rules:
```
groups:
- name: gaea_proxy_rule
  rules:
  - record: gaea_proxy_sql_timings_count_rate_each_namespace
    expr: sum(avg(rate(gaea_proxy_sql_timings_count[20s])) without (slave)) by (namespace)
  - record: gaea_proxy_sql_timings_count_rate_total
    expr: sum(sum(avg(rate(gaea_proxy_sql_timings_count[20s])) without (slave)) by (namespace))
  - record: gaea_proxy_flow_counts_rate_namespace_flowdirection
    expr: sum(avg(rate(gaea_proxy_flow_counts[20s])) without (slave)) by (namespace, flowdirection)
  - record: gaea_proxy_flow_counts_rate_namespace
    expr: sum(avg(rate(gaea_proxy_flow_counts[20s])) without (slave)) by (namespace)
  - record: gaea_proxy_flow_counts_rate_total
    expr: sum(sum(avg(rate(gaea_proxy_flow_counts[20s])) without (slave)) by (namespace))
  - record: gaea_proxy_sql_timings_rate_namespace_operation
    expr: sum(delta(gaea_proxy_sql_timings_sum[20s])) by (namespace,operation) / sum(delta(gaea_proxy_sql_timings_count[20s])) by (namespace,operation)
  - record: gaea_proxy_sql_timings_rate_namespace
    expr: sum(delta(gaea_proxy_sql_timings_sum[20s])) by (namespace) / sum(delta(gaea_proxy_sql_timings_count[20s])) by (namespace)
  - record: gaea_proxy_sql_error_counts_rate_namespace
    expr: sum(avg(rate(gaea_proxy_sql_error_counts[20s])) without (instance)) by (namespace)
```
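
The two `gaea_proxy_sql_timings_rate_*` rules divide the change in `gaea_proxy_sql_timings_sum` by the change in `gaea_proxy_sql_timings_count` over the same window, which gives the mean time per SQL statement in that window. The Python sketch below reproduces only that arithmetic for two scrapes. It is an illustration, not part of Gaea: `TimingSample`, `average_sql_time`, and the sample values are hypothetical, the time unit is whatever the metric is exported in, and Prometheus's extrapolation and counter-reset handling are ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TimingSample:
    """One scrape of a timing metric (hypothetical helper type)."""
    total_time: float   # cumulative value of gaea_proxy_sql_timings_sum
    total_count: float  # cumulative value of gaea_proxy_sql_timings_count


def average_sql_time(older: TimingSample, newer: TimingSample) -> Optional[float]:
    """Mean time per statement between two scrapes: delta(sum) / delta(count)."""
    delta_time = newer.total_time - older.total_time
    delta_count = newer.total_count - older.total_count
    if delta_count <= 0:
        # No statements in the window (PromQL would yield NaN for 0/0),
        # or the counter went backwards, which this sketch does not handle.
        return None
    return delta_time / delta_count


# Example with made-up values: 60 statements took 300 time units in total.
older = TimingSample(total_time=1200.0, total_count=100.0)
newer = TimingSample(total_time=1500.0, total_count=160.0)
print(average_sql_time(older, newer))  # 5.0
```

Because the ratio is taken per label set, an idle namespace produces no usable value for that window, which typically shows up as a gap in the corresponding Grafana panel.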