# Monitoring

This folder contains the manifest files for monitoring Prow resources.

## Deploy

Deployment is integrated into our CI system, except for `secret` objects.
Cluster admins need to create the `secret`s manually.

```
# Replace the sensitive information in the files before executing:
$ kubectl create -f grafana_secret.yaml
$ kubectl create -f alertmanager-prow_secret.yaml
```

The grafana `Ingress` in [grafana_expose.yaml](grafana_expose.yaml) has
GCE-specific annotations. It can be modified or removed if [other ways](https://cloud.google.com/kubernetes-engine/docs/how-to/exposing-apps)
of exposing a service are preferred.

A successful deployment creates a monitoring stack for Prow in the `prow-monitoring` namespace: _prometheus_, _alertmanager_, and _grafana_.

_Add more dashboards_:

Suppose an app runs as a pod that exposes Prometheus metrics on port `n`, and we want to include it in our prow-monitoring stack.
The first step is to create a Kubernetes `Service` that exposes port `n`, if one does not exist yet.

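As a rough illustration of that first step, a `Service` for such an app might look like the sketch below. The name `my-app`, the `default` namespace, and port `9090` (standing in for `n`) are placeholders, not values taken from this repository. The port is given a name so that a `servicemonitor` endpoint can refer to it.

```yaml
# Hypothetical Service; every name and number here is a placeholder.
apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: default
  labels:
    app: my-app        # a ServiceMonitor selects Services by their labels
spec:
  selector:
    app: my-app        # must match the labels on the app's pods
  ports:
    - name: metrics    # named port that a ServiceMonitor endpoint can reference
      port: 9090       # stands in for port `n`
      targetPort: 9090
```
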
### Add the service as a target in Prometheus

Add a new `servicemonitors.monitoring.coreos.com` resource that selects the target service to [prow_servicemonitors.yaml](./prow_servicemonitors.yaml). For example, the
`servicemonitor` for `ghproxy`:

```
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: ghproxy
  name: ghproxy
  namespace: prow-monitoring
spec:
  endpoints:
    - interval: 30s
      port: metrics
      scheme: http
  namespaceSelector:
    matchNames:
      - default
  selector:
    matchLabels:
      app: ghproxy
```

The service should then appear in the Prometheus web UI under `Status` → `Targets`.

_Note_ that the `servicemonitor` must have a label with the key `app`; the value can be an arbitrary string.

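The label requirement most likely comes from how the `Prometheus` resource selects its `servicemonitor` objects. A minimal sketch, assuming the prometheus-operator `Prometheus` object uses an `Exists` match on the `app` key; the resource name and the selector are assumptions for illustration, and the actual definition in this repository may differ.

```yaml
# Sketch of a Prometheus custom resource fragment; not copied from this repository.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prow                  # hypothetical name
  namespace: prow-monitoring
spec:
  serviceMonitorSelector:
    matchExpressions:
      - key: app              # any ServiceMonitor carrying an `app` label is selected
        operator: Exists      # the label value is not inspected
```

With a selector of this shape, a `servicemonitor` without an `app` label would be ignored, so its target would never show up under `Status` → `Targets`.
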
### Add a new grafana dashboard

We use [jsonnet](https://jsonnet.org) to generate the JSON files for grafana dashboards and [jsonnet-bundler](https://github.com/jsonnet-bundler/jsonnet-bundler) to manage the jsonnet libraries.
To develop a new dashboard:

* Create a new file `<dashboard_name>.jsonnet` in folder [./mixins/grafana_dashboards](./mixins/grafana_dashboards).

* Use the configMap for the new dashboard in [grafana_deployment.yaml](grafana_deployment.yaml).

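Using a configMap in a deployment usually means mounting it as a volume. The fragment below is only a sketch of that pattern: the configMap name `grafana-dashboard-my-dashboard`, the container name, and the mount path are hypothetical and should be replaced with whatever [grafana_deployment.yaml](grafana_deployment.yaml) already uses for the existing dashboards.

```yaml
# Fragment of a Deployment pod template; names and paths are placeholders.
spec:
  template:
    spec:
      containers:
        - name: grafana
          volumeMounts:
            - name: grafana-dashboard-my-dashboard
              mountPath: /grafana-dashboard-definitions/0/my-dashboard
      volumes:
        - name: grafana-dashboard-my-dashboard
          configMap:
            name: grafana-dashboard-my-dashboard   # configMap holding the generated dashboard JSON
```
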
## Access the components' web pages

* For `grafana`, visit [monitoring.prow.k8s.io](https://monitoring.prow.k8s.io). Anonymous users have read-only access.
Use `adm` and the [password](https://github.com/kubernetes/test-infra/blob/master/config/prow/cluster/monitoring/grafana_deployment.yaml#L39-L45) to become admin.

If the Prow instance does not publicly expose `grafana`, cluster admins can still access it via [port-forwarding](https://kubernetes.io/docs/tasks/access-application-cluster/port-forward-access-application-cluster/). Run

```
kubectl -n prow-monitoring port-forward service/grafana 8080:80
```
then visit [localhost:8080](http://127.0.0.1:8080).

* For `prometheus` and `alertmanager`, no public domain is configured because of security
concerns (neither offers authorization out of the box).
Cluster admins can use [k8s port-forward](https://kubernetes.io/docs/tasks/access-application-cluster/port-forward-access-application-cluster/) to
access their web UIs.

    ```
    # Run each command in its own terminal; port-forward keeps running in the foreground.
    $ kubectl -n prow-monitoring port-forward "$( kubectl -n prow-monitoring get pods --selector app=prometheus -o jsonpath='{.items[0].metadata.name}' )" 9090
    $ kubectl -n prow-monitoring port-forward "$( kubectl -n prow-monitoring get pods --selector app=alertmanager -o jsonpath='{.items[0].metadata.name}' )" 9093
    ```

    Then, visit [127.0.0.1:9090](http://127.0.0.1:9090) for the `prometheus` pod and [127.0.0.1:9093](http://127.0.0.1:9093) for the `alertmanager` pod.

    Because these two components have no public domain, some links in the UI do not work, e.g. the links in the Slack alerts.