# Metrics

Argo CD exposes different sets of Prometheus metrics per server.

## Application Controller Metrics

Metrics about applications. Scraped at the `argocd-metrics:8082/metrics` endpoint.

| Metric | Type | Description |
| --- | :---: | --- |
| `argocd_app_info` | gauge | Information about Applications. It contains labels such as `sync_status` and `health_status` that reflect the application state in Argo CD. |
| `argocd_app_condition` | gauge | Reports Application conditions. It contains the conditions currently present in the application status. |
| `argocd_app_k8s_request_total` | counter | Number of Kubernetes requests executed during application reconciliation. |
| `argocd_app_labels` | gauge | Argo Application labels converted to Prometheus labels. Disabled by default. See the section below about how to enable it. |
| `argocd_app_orphaned_resources_count` | gauge | Number of orphaned resources per application. |
| `argocd_app_reconcile` | histogram | Application reconciliation performance in seconds. |
| `argocd_app_sync_total` | counter | Counter for application sync history. |
| `argocd_app_sync_duration_seconds_total` | counter | Application sync performance in seconds total. |
| `argocd_cluster_api_resource_objects` | gauge | Number of k8s resource objects in the cache. |
| `argocd_cluster_api_resources` | gauge | Number of monitored Kubernetes API resources. |
| `argocd_cluster_cache_age_seconds` | gauge | Cluster cache age in seconds. |
| `argocd_cluster_connection_status` | gauge | The k8s cluster current connection status. |
| `argocd_cluster_events_total` | counter | Number of processed k8s resource events. |
| `argocd_cluster_info` | gauge | Information about cluster. |
| `argocd_redis_request_duration` | histogram | Redis requests duration. |
| `argocd_redis_request_total` | counter | Number of Redis requests executed during application reconciliation. |
| `argocd_resource_events_processing` | histogram | Time to process resource events in batch in seconds. |
| `argocd_resource_events_processed_in_batch` | gauge | Number of resource events processed in batch. |
| `argocd_kubectl_exec_pending` | gauge | Number of pending kubectl executions. |
| `argocd_kubectl_exec_total` | counter | Number of kubectl executions. |
| `argocd_kubectl_client_cert_rotation_age_seconds` | gauge | Age of kubectl client certificate rotation. |
| `argocd_kubectl_request_duration_seconds` | histogram | Latency of kubectl requests. |
| `argocd_kubectl_dns_resolution_duration_seconds` | histogram | Latency of kubectl resolver. |
| `argocd_kubectl_request_size_bytes` | histogram | Size of kubectl requests. |
| `argocd_kubectl_response_size_bytes` | histogram | Size of kubectl responses. |
| `argocd_kubectl_rate_limiter_duration_seconds` | histogram | Latency of kubectl rate limiter. |
| `argocd_kubectl_requests_total` | counter | Result of kubectl requests. |
| `argocd_kubectl_exec_plugin_call_total` | counter | Number of kubectl exec plugin calls. |
| `argocd_kubectl_request_retries_total` | counter | Number of kubectl request retries. |
| `argocd_kubectl_transport_cache_entries` | gauge | Number of kubectl transport cache entries. |
| `argocd_kubectl_transport_create_calls_total` | counter | Number of kubectl transport create calls. |

### Labels

| Label Name | Example Value | Description |
| --- | --- | --- |
| call_status | no_error | Status of the kubectl exec plugin call. Possible values are: no_error, plugin_execution_error, plugin_not_found_error, client_internal_error. |
| code | 200 | HTTP status code returned by the request or exit code of a command. kubectl metrics produced by client-go use `code` for HTTP responses, while metrics produced by Argo CD use `response_code`. |
| command | apply | kubectl command executed. Possible values are: apply, auth, create, replace. |
| dest_server | https://example.com | Destination server for an Application. |
| failed | false | Indicates if the Redis request failed. Possible values are: true, false. |
| group | apps | Group name of a Kubernetes resource being monitored. |
| host | example.com | Hostname of the Kubernetes API to which the request was made. |
| hostname | argocd-application-controller-0 | Hostname of the Argo CD component that initiated the request to Redis. |
| initiator | argocd-server | Name of the Argo CD component that initiated the request to Redis. Possible values are: argocd-application-controller, argocd-repo-server, argocd-server. |
| kind | Deployment | Kind name of a Kubernetes resource being monitored. |
| method | GET | HTTP method used for the request. Possible values are: GET, DELETE, PATCH, POST, PUT. |
| name | my-app | Name of an Application. |
| namespace | default | Namespace of an Application (namespace where the Application CR is located, not the destination namespace). |
| phase | Succeeded | Phase of a sync operation. Possible values are: Error, Failed, Running, Succeeded, Terminating. |
| project | my-project | AppProject of an Application. |
| resource_kind | Pod | Kind of Kubernetes resource being synced. |
| resource_namespace | default | Namespace of Kubernetes resource being synced. |
| response_code | 404 | HTTP response code from the server. |
| result | hit | Result of an attempt to get a transport from the kubectl (client-go) transport cache. Possible values are: hit, miss, unreachable. |
| server | https://example.com | Server where the operation is performed. |
| verb | List | Kubernetes API verb used in the request. Possible values are: Get, Watch, List, Create, Delete, Patch, Update. |

### Metrics Cache Expiration

If your Argo CD instance creates and deletes many applications and projects,
the metrics page keeps the history of those applications and projects in its cache.
If deleted resources cause problems because of high metric cardinality,
you can schedule a periodic metrics reset to clean the
history with an application controller flag. Example:
`--metrics-cache-expiration="24h0m0s"`.

### Exposing Application labels as Prometheus metrics

In some use cases, it is desirable to expose the labels of Argo CD Applications as Prometheus metrics.
Some examples are:

- Having the team name as a label to allow routing alerts to specific receivers
- Creating dashboards broken down by business units

As the Application labels are specific to each company, this feature is disabled by default. To enable it, add the
`--metrics-application-labels` flag to the Argo CD application controller.

The example below will expose the Argo CD Application labels `team-name` and `business-unit` to Prometheus:

    containers:
    - command:
      - argocd-application-controller
      - --metrics-application-labels
      - team-name
      - --metrics-application-labels
      - business-unit

In this case, the metric would look like:

```
# TYPE argocd_app_labels gauge
argocd_app_labels{label_business_unit="bu-id-1",label_team_name="my-team",name="my-app-1",namespace="argocd",project="important-project"} 1
argocd_app_labels{label_business_unit="bu-id-1",label_team_name="my-team",name="my-app-2",namespace="argocd",project="important-project"} 1
argocd_app_labels{label_business_unit="bu-id-2",label_team_name="another-team",name="my-app-3",namespace="argocd",project="important-project"} 1
```

### Exposing Application conditions as Prometheus metrics

In some use cases, it is desirable to expose the conditions of Argo CD Applications as Prometheus metrics.
Some examples are:

- Hunting orphaned resources across all deployed applications
- Knowing which resources are excluded from Argo CD

As the Application conditions are specific to each company, this feature is disabled by default. To enable it, add the
`--metrics-application-conditions` flag to the Argo CD application controller.

The example below will expose the Argo CD Application conditions `OrphanedResourceWarning` and `ExcludedResourceWarning` to Prometheus:

```yaml
containers:
- command:
  - argocd-application-controller
  - --metrics-application-conditions
  - OrphanedResourceWarning
  - --metrics-application-conditions
  - ExcludedResourceWarning
```

## Application Set Controller metrics

The Application Set controller exposes the following metrics for application sets.

| Metric | Type | Description |
| --- | :---: | --- |
| `argocd_appset_info` | gauge | Information about Application Sets. It contains labels for the name and namespace of an application set as well as `Resource_update_status` that reflects the `ResourcesUpToDate` property. |
| `argocd_appset_reconcile` | histogram | Application reconciliation performance in seconds. It contains labels for the name and namespace of an applicationset. |
| `argocd_appset_labels` | gauge | Applicationset labels translated to Prometheus labels. Disabled by default. |
| `argocd_appset_owned_applications` | gauge | Number of applications owned by the applicationset. It contains labels for the name and namespace of an applicationset. |
| `argocd_kubectl_client_cert_rotation_age_seconds` | gauge | Age of kubectl client certificate rotation. |
| `argocd_kubectl_request_duration_seconds` | histogram | Latency of kubectl requests. |
| `argocd_kubectl_dns_resolution_duration_seconds` | histogram | Latency of kubectl resolver. |
| `argocd_kubectl_request_size_bytes` | histogram | Size of kubectl requests. |
| `argocd_kubectl_response_size_bytes` | histogram | Size of kubectl responses. |
| `argocd_kubectl_rate_limiter_duration_seconds` | histogram | Latency of kubectl rate limiter. |
| `argocd_kubectl_requests_total` | counter | Result of kubectl requests. |
| `argocd_kubectl_exec_plugin_call_total` | counter | Number of kubectl exec plugin calls. |
| `argocd_kubectl_request_retries_total` | counter | Number of kubectl request retries. |
| `argocd_kubectl_transport_cache_entries` | gauge | Number of kubectl transport cache entries. |
| `argocd_kubectl_transport_create_calls_total` | counter | Number of kubectl transport create calls. |

Like the corresponding application controller metric (`argocd_app_labels`), the metric `argocd_appset_labels` is disabled by default. You can enable it by providing the `--metrics-applicationset-labels` argument to the applicationset controller.

Once enabled, it works exactly the same as the application controller metric (`label_` is prepended to the normalized label name).
Available labels include `name`, `namespace`, and all labels enabled by the command line options, along with their values (exactly like the application controller metrics described in the previous section).

### Application Set GitHub API metrics

All of the following `argocd_github_api_*` metrics can be enabled by setting `applicationsetcontroller.enable.github.api.metrics: true` in the `argocd-cmd-params-cm` ConfigMap. Note that they are disabled by default.

| Metric | Type | Description |
| --- | :---: | --- |
| `argocd_github_api_requests_total` | counter | Number of GitHub API calls. It contains labels for the name and namespace of an applicationset. |
| `argocd_github_api_request_duration_seconds` | histogram | GitHub API request duration. It contains labels for the name and namespace of an applicationset. |
| `argocd_github_api_rate_limit_remaining` | gauge | The number of requests remaining in the current rate limit window. It contains labels for the name and namespace of an applicationset, and for the rate limit resource. |
| `argocd_github_api_rate_limit_limit` | gauge | The maximum number of requests that you can make per hour. It contains labels for the name and namespace of an applicationset, and for the rate limit resource. |
| `argocd_github_api_rate_limit_reset_seconds` | gauge | The time left till the current rate limit window resets, in seconds. It contains labels for the name and namespace of an applicationset, and for the rate limit resource. |
| `argocd_github_api_rate_limit_used` | gauge | The number of requests used in the current rate limit window. It contains labels for the name and namespace of an applicationset, and for the rate limit resource. |

### Labels

| Label Name | Example Value | Description |
| --- | --- | --- |
| call_status | no_error | Status of the kubectl exec plugin call. Possible values are: no_error, plugin_execution_error, plugin_not_found_error, client_internal_error. |
| code | 200 | HTTP status code returned by the request or exit code of a command. |
| host | example.com | Hostname of the Kubernetes API to which the request was made. |
| method | GET | HTTP method used for the request. Possible values are: GET, DELETE, PATCH, POST, PUT. |
| name | my-app | Name of an ApplicationSet. |
| namespace | default | Namespace of an ApplicationSet (namespace where the ApplicationSet CR is located, not the destination namespace). |
| result | hit | Result of an attempt to get a transport from the kubectl (client-go) transport cache. Possible values are: hit, miss, unreachable. |
| verb | List | Kubernetes API verb used in the request. Possible values are: Get, Watch, List, Create, Delete, Patch, Update. |

### Exposing Cluster labels as Prometheus metrics

As the Cluster labels are specific to each company, this feature is disabled by default. To enable it, add the
`--metrics-cluster-labels` flag to the Argo CD application controller.

The example below will expose the Argo CD Cluster labels `team-name` and `environment` to Prometheus:

    containers:
    - command:
      - argocd-application-controller
      - --metrics-cluster-labels
      - team-name
      - --metrics-cluster-labels
      - environment

In this case, the metric would look like:

```
# TYPE argocd_cluster_labels gauge
argocd_cluster_labels{label_environment="dev",label_team_name="team1",name="cluster1",server="server1"} 1
argocd_cluster_labels{label_environment="staging",label_team_name="team2",name="cluster2",server="server2"} 1
argocd_cluster_labels{label_environment="production",label_team_name="team3",name="cluster3",server="server3"} 1
```

## API Server Metrics

Metrics about API Server API request and response activity (request totals, response codes, etc.).
Scraped at the `argocd-server-metrics:8083/metrics` endpoint.
For gRPC metrics to show up, the environment variable `ARGOCD_ENABLE_GRPC_TIME_HISTOGRAM` must be set to `true`.

| Metric | Type | Description |
| --- | :---: | --- |
| `argocd_login_request_total` | counter | Number of login requests. |
| `argocd_redis_request_duration` | histogram | Redis requests duration. |
| `argocd_redis_request_total` | counter | Number of Redis requests. |
| `grpc_server_handled_total` | counter | Total number of RPCs completed on the server, regardless of success or failure. |
| `grpc_server_msg_sent_total` | counter | Total number of gRPC stream messages sent by the server. |
| `argocd_proxy_extension_request_total` | counter | Number of requests sent to the configured proxy extensions. |
| `argocd_proxy_extension_request_duration_seconds` | histogram | Request duration in seconds between the Argo CD API server and the proxy extension backend. |
| `argocd_kubectl_client_cert_rotation_age_seconds` | gauge | Age of kubectl client certificate rotation. |
| `argocd_kubectl_request_duration_seconds` | histogram | Latency of kubectl requests. |
| `argocd_kubectl_dns_resolution_duration_seconds` | histogram | Latency of kubectl resolver. |
| `argocd_kubectl_request_size_bytes` | histogram | Size of kubectl requests. |
| `argocd_kubectl_response_size_bytes` | histogram | Size of kubectl responses. |
| `argocd_kubectl_rate_limiter_duration_seconds` | histogram | Latency of kubectl rate limiter. |
| `argocd_kubectl_requests_total` | counter | Result of kubectl requests. |
| `argocd_kubectl_exec_plugin_call_total` | counter | Number of kubectl exec plugin calls. |
| `argocd_kubectl_request_retries_total` | counter | Number of kubectl request retries. |
| `argocd_kubectl_transport_cache_entries` | gauge | Number of kubectl transport cache entries. |
| `argocd_kubectl_transport_create_calls_total` | counter | Number of kubectl transport create calls. |

### Labels

| Label Name | Example Value | Description |
| --- | --- | --- |
| call_status | no_error | Status of the kubectl exec plugin call. Possible values are: no_error, plugin_execution_error, plugin_not_found_error, client_internal_error. |
| code | 200 | HTTP status code returned by the request or exit code of a command. kubectl metrics produced by client-go use `code` for HTTP responses, while metrics produced by Argo CD proxy extensions use `status`. |
| extension | metrics | Name of the proxy extension being called. |
| failed | false | Indicates if the Redis request failed. Possible values are: true, false. |
| host | example.com | Hostname of the Kubernetes API to which the request was made. |
| initiator | argocd-server | Name of the Argo CD component that initiated the request to Redis. Possible values are: argocd-application-controller, argocd-repo-server, argocd-server. |
| method | GET | HTTP method used for the request. Possible values are: GET, DELETE, PATCH, POST, PUT. |
| result | hit | Result of an attempt to get a transport from the kubectl (client-go) transport cache. Possible values are: hit, miss, unreachable. |
| status | 200 | HTTP response code from the extension. |
| verb | List | Kubernetes API verb used in the request. Possible values are: Get, Watch, List, Create, Delete, Patch, Update. |
| version | v2.13.3 | Argo CD version. |

## Repo Server Metrics

Metrics about the Repo Server. The gRPC metrics are not exposed by default. They can be enabled by setting the
`ARGOCD_ENABLE_GRPC_TIME_HISTOGRAM=true` environment variable.
Scraped at the `argocd-repo-server:8084/metrics` endpoint.

| Metric | Type | Description |
| --- | :---: | --- |
| `argocd_git_request_duration_seconds` | histogram | Git requests duration seconds. |
| `argocd_git_request_total` | counter | Number of git requests performed by repo server. |
| `argocd_git_fetch_fail_total` | counter | Number of git fetch request failures by repo server. |
| `argocd_redis_request_duration_seconds` | histogram | Redis requests duration seconds. |
| `argocd_redis_request_total` | counter | Number of Redis requests. |
| `argocd_repo_pending_request_total` | gauge | Number of pending requests requiring repository lock. |

## Commit Server Metrics

Metrics about the Commit Server.
Scraped at the `argocd-commit-server:8087/metrics` endpoint.

| Metric | Type | Description |
| --- | :---: | --- |
| `argocd_commitserver_commit_pending_request_total` | gauge | Number of pending commit requests. |
| `argocd_commitserver_git_request_duration_seconds` | histogram | Git requests duration seconds. |
| `argocd_commitserver_git_request_total` | counter | Number of git requests performed by commit server. |
| `argocd_commitserver_commit_request_duration_seconds` | histogram | Commit requests duration seconds. |
| `argocd_commitserver_userinfo_request_duration_seconds` | histogram | Userinfo requests duration seconds. |
| `argocd_commitserver_commit_request_total` | counter | Number of commit requests performed by commit server. |

## Prometheus Operator

If using Prometheus Operator, the following ServiceMonitor example manifests can be used.
Add the namespace where Argo CD is installed and change `metadata.labels.release` to the label value selected by your Prometheus.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: argocd-metrics
  labels:
    release: prometheus-operator
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: argocd-metrics
  endpoints:
  - port: metrics
```

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: argocd-server-metrics
  labels:
    release: prometheus-operator
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: argocd-server-metrics
  endpoints:
  - port: metrics
```

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: argocd-repo-server-metrics
  labels:
    release: prometheus-operator
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: argocd-repo-server
  endpoints:
  - port: metrics
```

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: argocd-applicationset-controller-metrics
  labels:
    release: prometheus-operator
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: argocd-applicationset-controller
  endpoints:
  - port: metrics
```

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: argocd-dex-server
  labels:
    release: prometheus-operator
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: argocd-dex-server
  endpoints:
  - port: metrics
```

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: argocd-redis-haproxy-metrics
  labels:
    release: prometheus-operator
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: argocd-redis-ha-haproxy
  endpoints:
  - port: http-exporter-port
```

For the notifications controller, you additionally need to add the following:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: argocd-notifications-controller
  labels:
    release: prometheus-operator
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: argocd-notifications-controller-metrics
  endpoints:
  - port: metrics
```

## Dashboards

You can find an example Grafana dashboard [here](https://github.com/argoproj/argo-cd/blob/master/examples/dashboard.json) or check the demo instance
[dashboard](https://grafana.apps.argoproj.io).
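
Alongside dashboards, the same metrics can drive alerts. The manifest below is a minimal sketch of a Prometheus Operator `PrometheusRule` built on the `sync_status` and `health_status` labels of `argocd_app_info`. It is not part of the Argo CD manifests. The resource name, alert names, `for` durations, severity values, and the `release: prometheus-operator` label are all assumptions for illustration, and the label values `OutOfSync` and `Degraded` should be checked against what your controller actually exports.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: argocd-application-alerts # hypothetical name
  labels:
    release: prometheus-operator # assumption: the label your Prometheus selects rules by
spec:
  groups:
  - name: argocd-applications # hypothetical group name
    rules:
    # Fires when an Application has reported OutOfSync for 15 minutes (assumed threshold).
    - alert: ArgoCDApplicationOutOfSync
      expr: argocd_app_info{sync_status="OutOfSync"} == 1
      for: 15m
      labels:
        severity: warning
      annotations:
        summary: "Application {{ $labels.name }} (project {{ $labels.project }}) is out of sync"
    # Fires when an Application has reported Degraded health for 10 minutes (assumed threshold).
    - alert: ArgoCDApplicationDegraded
      expr: argocd_app_info{health_status="Degraded"} == 1
      for: 10m
      labels:
        severity: critical
      annotations:
        summary: "Application {{ $labels.name }} (project {{ $labels.project }}) is degraded"
```

If Application labels are exposed through `--metrics-application-labels`, a rule like this could be joined with `argocd_app_labels` to route alerts by team, at the cost of a more complex expression.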