# Monitoring with Prometheus

[Prometheus][prometheus] is an open-source systems monitoring and alerting toolkit. Below is an overview of the helpers that Operator SDK provides to set up metrics in the generated operator.

## Metrics in Operator SDK

The `func ExposeMetricsPort(ctx context.Context, port int32) (*v1.Service, error)` function exposes general metrics about the running program. These metrics are inherited from controller-runtime. This helper function creates a [Service][service] object with the metrics port exposed, which can then be accessed by Prometheus. The Service object is [garbage collected][gc] when the leader pod's root owner is deleted.

By default, the metrics are served on `0.0.0.0:8383/metrics`. To change the port the metrics are exposed on, change the `var metricsPort int32 = 8383` variable in the `cmd/manager/main.go` file of the generated operator.

### Usage:

```go
import(
    "fmt"

    "github.com/operator-framework/operator-sdk/pkg/metrics"
    "sigs.k8s.io/controller-runtime/pkg/manager"
)

func main() {

    ...

    // Change the below variables to serve metrics on different host or port.
    var metricsHost = "0.0.0.0"
    var metricsPort int32 = 8383

    // Pass metrics address to controller-runtime manager
    mgr, err := manager.New(cfg, manager.Options{
        Namespace:          namespace,
        MetricsBindAddress: fmt.Sprintf("%s:%d", metricsHost, metricsPort),
    })

    ...

    // Create Service object to expose the metrics port.
    _, err = metrics.ExposeMetricsPort(ctx, metricsPort)
    if err != nil {
        // handle error
        log.Info(err.Error())
    }

    ...
}
```

*Note:* The above example is already present in `cmd/manager/main.go` in all operators generated with Operator SDK from v0.5.0 onwards.

### Garbage collection

The metrics Service is [garbage collected][gc] when the resource used to deploy the operator is deleted (e.g. `Deployment`). This resource is determined when the metrics Service is created; at that time, an owner reference to the resource is added to the Service.

In Kubernetes clusters where [OwnerReferencesPermissionEnforcement][ownerref-permission] is enabled (on by default in all OpenShift clusters), the role requires a `<RESOURCE-KIND>/finalizers` rule to be added. When the operator is created with the Operator SDK, this is done automatically, under the assumption that a `Deployment` object is used to create the operator pods. If the operator is deployed by another method, replace the `- deployments/finalizers` entry in the `deploy/role.yaml` file with the entry for that resource kind. Example rule from the `deploy/role.yaml` file for deploying the operator with a `StatefulSet`:

```yaml
...
- apiGroups:
  - apps
  resourceNames:
  - <STATEFULSET-NAME>
  resources:
  - statefulsets/finalizers
  verbs:
  - update
...
```

[prometheus]: https://prometheus.io/
[service]: https://kubernetes.io/docs/concepts/services-networking/service/
[gc]: https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/#owners-and-dependents
[ownerref-permission]: https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#ownerreferencespermissionenforcement