# Monitoring with Prometheus

[Prometheus][prometheus] is an open-source systems monitoring and alerting toolkit. This document gives an overview of the helpers that the Operator SDK provides for setting up metrics in a generated operator.

## Metrics in Operator SDK

The `func ExposeMetricsPort(ctx context.Context, port int32) (*v1.Service, error)` function exposes general metrics about the running program. These metrics are inherited from controller-runtime. The helper creates a [Service][service] object with the metrics port exposed, which Prometheus can then access. The Service object is [garbage collected][gc] when the leader pod's root owner is deleted.

By default, the metrics are served on `0.0.0.0:8383/metrics`. To change the port the metrics are exposed on, edit the `var metricsPort int32 = 8383` variable in the `cmd/manager/main.go` file of the generated operator.

### Usage:

```go
import (
    "fmt"

    "github.com/operator-framework/operator-sdk/pkg/metrics"
    "sigs.k8s.io/controller-runtime/pkg/manager"
)

func main() {

    ...

    // Change the variables below to serve metrics on a different host or port.
    var metricsHost = "0.0.0.0"
    var metricsPort int32 = 8383

    // Pass the metrics address to the controller-runtime manager.
    mgr, err := manager.New(cfg, manager.Options{
        Namespace:          namespace,
        MetricsBindAddress: fmt.Sprintf("%s:%d", metricsHost, metricsPort),
    })

    ...

    // Create a Service object to expose the metrics port.
    _, err = metrics.ExposeMetricsPort(ctx, metricsPort)
    if err != nil {
        // Handle the error; here it is only logged.
        log.Info(err.Error())
    }

    ...
}
```

*Note:* The above example is already present in `cmd/manager/main.go` of every operator generated with Operator SDK v0.5.0 or later.

### Garbage collection

The metrics Service is [garbage collected][gc] when the resource used to deploy the operator (e.g. a `Deployment`) is deleted. This resource is determined when the metrics Service is created; at that point, an owner reference to the resource is added to the Service.

In Kubernetes clusters where [OwnerReferencesPermissionEnforcement][ownerref-permission] is enabled (on by default in all OpenShift clusters), the operator's role requires a `<RESOURCE-KIND>/finalizers` rule. Operators created with the Operator SDK get this rule automatically, under the assumption that a `Deployment` object is used to create the operator pods. If the operator is deployed another way, replace the `- deployments/finalizers` entry in the `deploy/role.yaml` file accordingly. Example rule from a `deploy/role.yaml` file for an operator deployed with a `StatefulSet`:

```yaml
...
- apiGroups:
  - apps
  resourceNames:
  - <STATEFULSET-NAME>
  resources:
  - statefulsets/finalizers
  verbs:
  - update
...
```
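The `resources` entry follows the pattern `<plural resource name>/finalizers`. As a rough illustration of that pattern, the sketch below derives the entry from a workload kind. The `finalizersResource` helper is hypothetical, and its pluralization (lowercase the kind and append `s`) is a simplifying assumption that holds for kinds such as `Deployment`, `StatefulSet`, and `DaemonSet` but not for every Kubernetes kind.

```go
package main

import (
	"fmt"
	"strings"
)

// finalizersResource returns the RBAC resource entry for the finalizers
// subresource of the given workload kind, e.g. "StatefulSet" becomes
// "statefulsets/finalizers". The pluralization is naive: it lowercases
// the kind and appends "s".
func finalizersResource(kind string) string {
	return strings.ToLower(kind) + "s/finalizers"
}

func main() {
	for _, kind := range []string{"Deployment", "StatefulSet", "DaemonSet"} {
		fmt.Printf("%s -> %s\n", kind, finalizersResource(kind))
	}
}
```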

[prometheus]: https://prometheus.io/
[service]: https://kubernetes.io/docs/concepts/services-networking/service/
[gc]: https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/#owners-and-dependents
[ownerref-permission]: https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#ownerreferencespermissionenforcement