---
title: 'Using prometheus'
weight: 30
description: >
  Expose metrics using prometheus
---

The Prometheus gadget collects and exposes metrics in Prometheus format. It's available both for
Kubernetes (`ig-k8s`) and for Linux hosts (`ig`).

```bash
$ kubectl gadget prometheus --config @<path>
$ ig prometheus --config @<path> --metrics-listen-address $IP:$PORT --metrics-path /metrics
```

## Configuration File

The configuration file defines the metrics to be exposed and their settings. The structure of this
file is:

```yaml
metrics_name: metrics_name
metrics:
  - name: metric_name
    type: counter or gauge or histogram
    category: trace # category of the gadget to collect the metric. trace, snapshot, etc.
    gadget: exec # gadget used to collect the metric. exec, open, etc.
    selector:
      # defines which events to take into consideration when updating the metrics.
      # See more information below.
    labels:
      # defines the granularity of the labels to capture. See below.
```

### Filtering (aka Selectors)

It's possible to configure Inspektor Gadget to only update metrics for events that match specific
column values. This is useful to keep the cardinality of the labels low.

```yaml
selector:
  - "columnName:value" # matches if the content of the column is equal to value
  - "columnName:!value" # matches if the content of the column is not equal to value
  - "columnName:>=value" # matches if the content of the column is greater than or equal to value
  - "columnName:>value" # matches if the content of the column is greater than value
  - "columnName:<=value" # matches if the content of the column is lower than or equal to value
  - "columnName:<value" # matches if the content of the column is lower than value
  - "columnName:~value" # matches if the content of the column matches the regular expression 'value'.
                        # see https://github.com/google/re2/wiki/Syntax for more information on the syntax.
```

Some examples are:

Only metrics for the default namespace:

```yaml
selector:
  - "k8s.namespace:default"
```

Only events with retval != 0:

```yaml
selector:
  - "retval:!0"
```

Only events executed by pid 1 and by non-root users:

```yaml
selector:
  - "pid:1"
  - "uid:>=1"
```

### Counters

This is the most intuitive metric: "A _counter_ is a cumulative metric that represents a
single [monotonically increasing counter](https://en.wikipedia.org/wiki/Monotonic_function) whose
value can only increase or be reset to zero on restart. For example, you can use a counter to
represent the number of requests served, tasks completed, or errors." from
[https://prometheus.io/docs/concepts/metric_types/#counter](https://prometheus.io/docs/concepts/metric_types/#counter).

The following are examples of counters we can support with the existing gadgets. The first one
counts the number of executed processes by namespace, pod and container.

```yaml
metrics_name: my_metrics
metrics:
  - name: executed_processes
    type: counter
    category: trace
    gadget: exec
    labels:
      - k8s.namespace
      - k8s.pod
      - k8s.container
```

By default, a counter is increased by one each time there is an event. However, it's also possible
to increase a counter by the value of a field on the event.

Executed processes by pod and container in the default namespace:

```yaml
metrics_name: metrics_name
metrics:
  - name: executed_processes
    type: counter
    category: trace
    gadget: exec
    labels:
      - k8s.pod
      - k8s.container
    selector:
      - "k8s.namespace:default"
```

Or only count events for a given command:

`cat` executions by namespace, pod and container:

```yaml
metrics_name: metrics_name
metrics:
  - name: executed_cats # ohno!
    type: counter
    category: trace
    gadget: exec
    labels:
      - k8s.namespace
      - k8s.pod
      - k8s.container
    selector:
      - "comm:cat"
```

DNS requests aggregated by namespace and pod:

```yaml
metrics_name: metrics_name
metrics:
  - name: dns_requests
    type: counter
    category: trace
    gadget: dns
    labels:
      - k8s.namespace
      - k8s.pod
    selector:
      - "qr:Q" # Only count query events
```

### Gauges

"A _gauge_ is a metric that represents a single numerical value that can arbitrarily go up and down"
from
[https://prometheus.io/docs/concepts/metric_types/#gauge](https://prometheus.io/docs/concepts/metric_types/#gauge).

Right now, only snapshotters are supported.

Examples of gauges are:

Number of processes by namespace, pod and container:

```yaml
metrics_name: metrics_name
metrics:
  - name: number_of_processes
    type: gauge
    category: snapshot
    gadget: process
    labels:
      - k8s.namespace
      - k8s.pod
      - k8s.container
```

Number of sockets in `CLOSE_WAIT` state:

```yaml
metrics_name: metrics_name
metrics:
  - name: number_of_sockets_close_wait
    type: gauge
    category: snapshot
    gadget: socket
    labels:
      - k8s.namespace
      - k8s.pod
      - k8s.container
    selector:
      - "status:CLOSE_WAIT"
```

### Histograms

"A _histogram_ samples observations (usually things like request durations or response sizes) and counts them in
configurable buckets. It also provides a sum of all observed values."
from [https://prometheus.io/docs/concepts/metric_types/#histogram](https://prometheus.io/docs/concepts/metric_types/#histogram).
We support the same bucket configuration as described in
[https://github.com/cloudflare/ebpf_exporter#histograms](https://github.com/cloudflare/ebpf_exporter#histograms).

Right now, only trace gadgets are supported.

An example of a histogram is:

Latency of DNS requests for all pods:

```yaml
metrics_name: metrics_name
metrics:
  - name: dns_requests_latency
    type: histogram
    category: trace
    gadget: dns
    field: latency
    bucket:
      min: 0
      max: 10
      multiplier: 100000 # 0.1ms
      type: exp2
      unit: ns
    selector:
      - "qr:R" # Latency is only calculated for response events
```

### Guide

Let's see how we can use this gadget in different environments.

#### On Kubernetes

In this guide we'll use the Prometheus Service Discovery: it automatically detects the endpoints to
scrape metrics from.
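
To make the discovery step a bit more concrete: with the `pod` role, Prometheus usually decides
which pods to scrape based on pod annotations. The fragment below is only a sketch, assuming the
common `prometheus.io/*` annotation convention that the relabel rules in this guide's scrape
configuration match on. The pod name, port and path are hypothetical values chosen for illustration,
not necessarily what Inspektor Gadget sets on its own pods.

```yaml
# Hypothetical pod metadata; the name and all values are illustrative only.
metadata:
  name: example-pod                  # hypothetical name
  annotations:
    prometheus.io/scrape: "true"     # matched by the `action: keep` rule
    prometheus.io/scheme: "http"     # optional; copied into __scheme__
    prometheus.io/path: "/metrics"   # optional; copied into __metrics_path__
    prometheus.io/port: "2223"       # assumed port; used to rewrite __address__
```

Under that convention, pods whose `prometheus.io/scrape` annotation is not `"true"` are dropped by
the `keep` rule and are never scraped.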

If you already have a Prometheus instance running in your cluster, be sure you provide it with the
following configuration:

```yaml
scrape_configs:
  - job_name: 'kubernetes-pods'

    scrape_interval: 1s
    scrape_timeout: 1s

    kubernetes_sd_configs:
    - role: pod

    relabel_configs:
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
      action: replace
      target_label: __scheme__
      regex: (https?)
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
      action: replace
      target_label: __metrics_path__
      regex: (.+)
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      target_label: __address__
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
```

Otherwise, you can just apply the config provided with this guide:

```bash
$ kubectl apply -f docs/examples/prometheus.yaml
namespace/monitoring created
serviceaccount/prometheus created
clusterrole.rbac.authorization.k8s.io/discoverer created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-discoverer created
configmap/prometheus-server-conf created
deployment.apps/prometheus created
```

Create a port-forward session to Prometheus:

```bash
$ kubectl port-forward --namespace monitoring deployment/prometheus 9090:9090 &
```

Let's create a metric that reports processes executed:

```yaml
# myconfig.yaml
metrics_name: guide
metrics:
  - name: executed_processes
    type: counter
    category: trace
    gadget: exec
    labels:
      - k8s.namespace
      - k8s.pod
      - k8s.container
```

Start the gadget:

```bash
$ kubectl gadget prometheus --config @myconfig.yaml
INFO[0000] Running. Press Ctrl + C to finish
INFO[0000] minikube             | Publishing metrics...
```

<!-- markdown-link-check-disable-next-line -->
Now, the `executed_processes_total` counter is available in Prometheus at http://localhost:9090/graph?g0.expr=executed_processes_total&g0.tab=0&g0.stacked=0&g0.show_exemplars=0&g0.range_input=1m:

You can see that the counters are already going up for some containers.

Let's create a pod that executes some more processes:

```bash
$ kubectl run mypod1 -it --image busybox --restart Never -- sh -c 'for i in $(seq 0 1 1000); do cat /dev/null ; ping -c 1 localhost > /dev/null; done'
```

If we check the counter again, we can see that our pod has executed a lot of processes.

Now, update the configuration file to only take into consideration executions of the `cat` binary:

```yaml
# myconfig.yaml
metrics_name: guide
metrics:
  - name: executed_processes
    type: counter
    category: trace
    gadget: exec
    labels:
      - k8s.namespace
      - k8s.pod
      - k8s.container
    selector:
      - "comm:cat"
```

Restart the gadget:

```bash
$ kubectl gadget prometheus --config @myconfig.yaml
INFO[0000] Running. Press Ctrl + C to finish
INFO[0000] minikube             | Publishing metrics...
```

Create a new pod that executes processes:

```bash
$ kubectl run mypod2 -it --image busybox --restart Never -- sh -c 'for i in $(seq 0 1 1000); do cat /dev/null ; ping -c 1 localhost > /dev/null; done'
```

The counter only takes into consideration the `cat` commands now.

#### With `ig`

It's also possible to use the prometheus gadget without Kubernetes. In this case, we have to
configure Prometheus to point to the endpoint exposed by `ig`, which is `localhost:2223` by default:

```yaml
# prometheus.yaml
scrape_configs:
  - job_name: ig
    scrape_interval: 1s
    static_configs:
      - targets:
        - localhost:2223
```

Start Prometheus with the above configuration (please refer to [docker
installation](https://prometheus.io/docs/prometheus/latest/installation/#using-docker) in case you
want to run Prometheus in a container).

```bash
$ prometheus --config.file prometheus.yaml
```

Then, start the prometheus gadget with the same configuration as in the Kubernetes section above:

```bash
$ sudo ig prometheus --config @myconfig.yaml
INFO[0000] Running. Press Ctrl + C to finish
INFO[0000] Publishing metrics...
```

<!-- markdown-link-check-disable-next-line -->
You can open http://localhost:9090/targets and check that the `ig` endpoint is reporting metrics.

Let's execute some commands inside a container:

```bash
$ docker run --rm -ti --name=mycontainer busybox sh -c 'for i in $(seq 0 1 1000); do cat /dev/null ; ping -c 1 localhost > /dev/null; done'
```

<!-- markdown-link-check-disable-next-line -->
We can see how the counter for `mycontainer` is increased in http://localhost:9090/graph?g0.expr=executed_processes_total&g0.tab=0&g0.stacked=0&g0.show_exemplars=0&g0.range_input=1m.

#### Grafana

It's possible to visualize the metrics in Grafana. As an example, we will plot a histogram for DNS request latency. We
can use the [docker compose file](../../tools/monitoring/docker-compose.yml) to prepare the environment:

```bash
$ pushd tools/monitoring
$ docker compose up -d
$ popd
```

<!-- markdown-link-check-disable-next-line -->
At this point, Grafana is available at http://localhost:3000 and Prometheus at http://localhost:9090.
We can start `ig` with the following configuration:

```bash
$ sudo ig prometheus --config @tools/monitoring/config/histogram.yaml
INFO[0000] Running. Press Ctrl + C to finish
```

Now, generate some DNS requests:

```bash
$ docker run --rm -ti busybox sh -c 'for i in $(seq 0 1 1000); do cat /dev/null ; nslookup -querytype=a microsoft.com. > /dev/null; done'
```

<!-- markdown-link-check-disable -->
We should now be able to see the visualized histogram
at: http://localhost:3000/d/e1981f70-308c-4784-b986-9b5f1a895444/inspektor-gadget?orgId=1&viewPanel=1
<!-- markdown-link-check-enable -->

### Limitations

- The `kubectl gadget` instance has to keep running in order to update the metrics.
- It's not possible to configure the metrics endpoint in `ig-k8s`.