# Resource Health

## Overview
Argo CD provides built-in health assessment for several standard Kubernetes types, which is then
surfaced to the overall Application health status as a whole. The following checks are made for
specific types of Kubernetes resources:

### Deployment, ReplicaSet, StatefulSet, DaemonSet
* Observed generation is equal to desired generation.
* Number of **updated** replicas equals the number of desired replicas.

### Service
* If the Service is of type `LoadBalancer`, the `status.loadBalancer.ingress` list is non-empty,
with at least one value for `hostname` or `IP`.

### Ingress
* The `status.loadBalancer.ingress` list is non-empty, with at least one value for `hostname` or `IP`.

### Job
* If job `.spec.suspend` is set to 'true', then the job and app health will be marked as suspended.

### PersistentVolumeClaim
* The `status.phase` is `Bound`

### Argo CD App

The health assessment of the `argoproj.io/Application` CRD was removed in Argo CD 1.8 (see [#3781](https://github.com/argoproj/argo-cd/issues/3781) for more information).
You might need to restore it if you are using the app-of-apps pattern and orchestrating synchronization using sync waves.
Add the following resource customization in the
`argocd-cm` ConfigMap:

```yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
  labels:
    app.kubernetes.io/name: argocd-cm
    app.kubernetes.io/part-of: argocd
data:
  resource.customizations: |
    argoproj.io/Application:
      health.lua: |
        hs = {}
        hs.status = "Progressing"
        hs.message = ""
        if obj.status ~= nil then
          if obj.status.health ~= nil then
            hs.status = obj.status.health.status
            if obj.status.health.message ~= nil then
              hs.message = obj.status.health.message
            end
          end
        end
        return hs
```

## Custom Health Checks

Argo CD supports custom health checks written in [Lua](https://www.lua.org/). This is useful if you:

* Are affected by known issues where your `Ingress` or `StatefulSet` resources are stuck in `Progressing` state because of a bug in your resource controller.
* Have a custom resource for which Argo CD does not have a built-in health check.

There are two ways to configure a custom health check. The next two sections describe those ways.

### Way 1. Define a Custom Health Check in `argocd-cm` ConfigMap

Custom health checks can be defined in the
```yaml
  resource.customizations: |
    <group/kind>:
      health.lua: |
```
field of `argocd-cm`. If you are using argocd-operator, this is overridden by [the argocd-operator resourceCustomizations](https://argocd-operator.readthedocs.io/en/latest/reference/argocd/#resource-customizations).

The following example demonstrates a health check for `cert-manager.io/Certificate`.

```yaml
data:
  resource.customizations: |
    cert-manager.io/Certificate:
      health.lua: |
        hs = {}
        if obj.status ~= nil then
          if obj.status.conditions ~= nil then
            for i, condition in ipairs(obj.status.conditions) do
              if condition.type == "Ready" and condition.status == "False" then
                hs.status = "Degraded"
                hs.message = condition.message
                return hs
              end
              if condition.type == "Ready" and condition.status == "True" then
                hs.status = "Healthy"
                hs.message = condition.message
                return hs
              end
            end
          end
        end

        hs.status = "Progressing"
        hs.message = "Waiting for certificate"
        return hs
```

To avoid duplicating a custom health check across multiple resources, you can also specify a wildcard in the resource kind, and anywhere in the resource group, like this:

```yaml
  resource.customizations: |
    ec2.aws.crossplane.io/*:
      health.lua: |
        ...
```

```yaml
  resource.customizations: |
    "*.aws.crossplane.io/*":
      health.lua: |
        ...
```

!!!important
    Please note that quotes are required around the key in the resource customization health section if the wildcard starts with `*`.

`obj` is a global variable that contains the resource. The script must return an object with a `status` field and an optional `message` field.
The custom health check might return one of the following health statuses:

  * `Healthy` - the resource is healthy
  * `Progressing` - the resource is not healthy yet but still making progress and might be healthy soon
  * `Degraded` - the resource is degraded
  * `Suspended` - the resource is suspended and waiting for some external event to resume (e.g. suspended CronJob or paused Deployment)

By default, health typically returns the `Progressing` status.

NOTE: As a security measure, access to the standard Lua libraries is disabled by default.
Admins can control access by
setting `resource.customizations.useOpenLibs.<group_kind>`. In the following example, standard libraries are enabled for the health check of `cert-manager.io/Certificate`.

```yaml
data:
  resource.customizations: |
    cert-manager.io/Certificate:
      health.lua.useOpenLibs: true
      health.lua: |
        -- Lua standard libraries are enabled for this script
```

### Way 2. Contribute a Custom Health Check

A health check can be bundled into Argo CD. Custom health check scripts are located in the `resource_customizations` directory of [https://github.com/argoproj/argo-cd](https://github.com/argoproj/argo-cd). This must have the following directory structure:

```
argo-cd
|-- resource_customizations
|    |-- your.crd.group.io               # CRD group
|    |    |-- MyKind                     # Resource kind
|    |    |    |-- health.lua            # Health check
|    |    |    |-- health_test.yaml      # Test inputs and expected results
|    |    |    +-- testdata              # Directory with test resource YAML definitions
```

Each health check must have tests defined in a `health_test.yaml` file. The `health_test.yaml` is a YAML file with the following structure:

```yaml
tests:
- healthStatus:
    status: ExpectedStatus
    message: Expected message
  inputPath: testdata/test-resource-definition.yaml
```

To test the implemented custom health checks, run `go test -v ./util/lua/`.

The [PR#1139](https://github.com/argoproj/argo-cd/pull/1139) is an example of a custom health check for the Cert Manager CRDs.

Please note that bundled health checks with wildcards are not supported.

## Health Checks

An Argo CD App's health is inferred from the health of its immediate child resources (the resources represented in
source control).

But the health of a resource is not inherited from child resources - it is calculated using only information about the
resource itself.
A resource's status field may or may not contain information about the health of a child resource, and
the resource's health check may or may not take that information into account.

The lack of inheritance is by design. A resource's health can't be inferred from its children because the health of a
child resource may not be relevant to the health of the parent resource. For example, a Deployment's health is not
necessarily affected by the health of its Pods.

```
App (healthy)
└── Deployment (healthy)
    └── ReplicaSet (healthy)
        └── Pod (healthy)
    └── ReplicaSet (unhealthy)
        └── Pod (unhealthy)
```

If you want the health of a child resource to affect the health of its parent, you need to configure the parent's health
check to take the child's health into account. Since only the parent resource's state is available to the health check,
the parent resource's controller needs to make the child resource's health available in the parent resource's status
field.

```
App (healthy)
└── CustomResource (healthy) <- This resource's health check needs to be fixed to mark the App as unhealthy
    └── CustomChildResource (unhealthy)
```
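
As a rough illustration of the last point, the fragment below sketches what a parent health check could look like once the parent's controller reports child health in the parent's own status. Everything specific here is an assumption made for the example: the `example.io/CustomResource` group/kind is hypothetical, and so is the `ChildrenReady` condition type that the controller is assumed to publish. Only the general shape (reading `obj.status.conditions` and returning `hs`) follows the `cert-manager.io/Certificate` example earlier on this page.

```yaml
# Hypothetical example: the group/kind and the "ChildrenReady" condition
# are placeholders, not names defined by Argo CD or any real controller.
data:
  resource.customizations: |
    example.io/CustomResource:
      health.lua: |
        -- Assumes the parent's controller copies the state of its children
        -- into a "ChildrenReady" condition on the parent's status.
        hs = {}
        if obj.status ~= nil and obj.status.conditions ~= nil then
          for i, condition in ipairs(obj.status.conditions) do
            if condition.type == "ChildrenReady" then
              if condition.status == "True" then
                hs.status = "Healthy"
              else
                hs.status = "Degraded"
              end
              hs.message = condition.message
              return hs
            end
          end
        end

        -- No child information has been reported yet.
        hs.status = "Progressing"
        hs.message = "Waiting for child resource status"
        return hs
```

With a check along these lines, an unhealthy `CustomChildResource` would mark the `CustomResource` as `Degraded`, which in turn is surfaced to the App's health. If the controller exposes child health in a different status field, the lookup would need to be adapted accordingly.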