# Observability

[TOC]

This guide describes how to obtain Prometheus monitoring data from gVisor
sandboxes running with `runsc`.

**NOTE**: These metrics are mostly information about gVisor internals, and do
not provide introspection capabilities into the workload being sandboxed. If you
would like to monitor the sandboxed workload (e.g. for threat detection), refer
to **[Runtime Monitoring](runtime_monitoring.md)**.

`runsc` implements a
[Prometheus-compliant](https://prometheus.io/docs/instrumenting/exposition_formats/)
HTTP metric server using the `runsc metric-server` subcommand. This server is
meant to run **unsandboxed** as a sidecar process of your container runtime
(e.g. Docker).

## One-off metric export

You can export metric information from running sandboxes using the `runsc
export-metrics` subcommand. This does not require special configuration or
setting up a Prometheus server.

```
$ docker run -d --runtime=runsc --name=foobar debian sleep 1h
c7ce77796e0ece4c0881fb26261608552ea4a67b2fe5934658b8b4433e5190ed
$ sudo /path/to/runsc --root=/var/run/docker/runtime-runc/moby export-metrics c7ce77796e0ece4c0881fb26261608552ea4a67b2fe5934658b8b4433e5190ed
# Command-line export for sandbox c7ce77796e0ece4c0881fb26261608552ea4a67b2fe5934658b8b4433e5190ed
# Writing data from snapshot containing 175 data points taken at 2023-01-25 15:46:50.469403696 -0800 PST.


# HELP runsc_fs_opens Number of file opens.
# TYPE runsc_fs_opens counter
runsc_fs_opens{sandbox="c7ce77796e0ece4c0881fb26261608552ea4a67b2fe5934658b8b4433e5190ed"} 62 1674690410469

# HELP runsc_fs_read_wait Time waiting on file reads, in nanoseconds.
# TYPE runsc_fs_read_wait counter
runsc_fs_read_wait{sandbox="c7ce77796e0ece4c0881fb26261608552ea4a67b2fe5934658b8b4433e5190ed"} 0 1674690410469

# HELP runsc_fs_reads Number of file reads.
# TYPE runsc_fs_reads counter
runsc_fs_reads{sandbox="c7ce77796e0ece4c0881fb26261608552ea4a67b2fe5934658b8b4433e5190ed"} 54 1674690410469

# [...]
```

## Starting the metric server

Use the `runsc metric-server` subcommand:

```shell
$ sudo runsc \
    --root=/var/run/docker/runtime-runc/moby \
    --metric-server=localhost:1337 \
    metric-server
```

`--root` needs to be set to the OCI runtime root directory that your runtime
implementation uses. For Docker, this is typically
`/var/run/docker/runtime-runc/moby`; otherwise, if you already have gVisor set
up, you can use `ps aux | grep runsc` on the host to find the `--root` that a
running sandbox is using. This directory is typically only accessible by the
user Docker runs as (usually `root`), hence `sudo`. The metric server uses the
`--root` directory to scan for sandboxes running on the system.

The `--metric-server` flag is the network address or UDS path to bind to. In
this example, the server binds to TCP port `1337` on the loopback interface
(`lo`) only. To listen on all interfaces instead, you could use an address with
no host part, such as `--metric-server=:1337`.

If something goes wrong, you may also want to add `--debug
--debug-log=/dev/stderr` to understand the metric server's behavior.

You can query the metric server with `curl`:

```
$ curl http://localhost:1337/metrics
# Data for runsc metric server exporting data for sandboxes in root directory /var/run/docker/runtime-runc/moby
# [...]

# HELP process_start_time_seconds Unix timestamp at which the process started. Used by Prometheus for counter resets.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1674598082.698509 1674598109532

# End of metric data.
```

## Starting sandboxes with metrics enabled

Sandbox metrics are disabled by default. To enable them, add the flag
`--metric-server={ADDRESS}:{PORT}` to the runtime configuration. With Docker,
this can be set in `/etc/docker/daemon.json` like so:

```json
{
    "runtimes": {
        "runsc": {
            "path": "/path/to/runsc",
            "runtimeArgs": [
                "--metric-server=localhost:1337"
            ]
        }
    }
}
```

**NOTE**: The `--metric-server` flag value must be an exact string match between
the runtime configuration and the `runsc metric-server` command.

Once you've done this, you can start a container and see that it shows up in the
list of Prometheus metrics.

```
$ docker run -d --runtime=runsc --name=foobar debian sleep 1h
32beefcafe

$ curl http://localhost:1337/metrics
# Data for runsc metric server exporting data for sandboxes in root directory /var/run/docker/runtime-runc/moby
# Writing data from 3 snapshots: [...]


# HELP process_start_time_seconds Unix timestamp at which the process started. Used by Prometheus for counter resets.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1674599158.286067 1674599159819

# HELP runsc_fs_opens Number of file opens.
# TYPE runsc_fs_opens counter
runsc_fs_opens{iteration="42asdf",sandbox="32beefcafe"} 12 1674599159819

# HELP runsc_fs_read_wait Time waiting on file reads, in nanoseconds.
# TYPE runsc_fs_read_wait counter
runsc_fs_read_wait{iteration="42asdf",sandbox="32beefcafe"} 0 1674599159819

# [...]

# End of metric data.
```

Each per-container metric is labeled with at least:

- `sandbox`: The container ID, in this case `32beefcafe`
- `iteration`: A randomly-generated string (in this case `42asdf`) that stays
  constant for the lifetime of the sandbox. This helps distinguish between
  successive instances of the same sandbox with the same ID.

If you'd like to run some containers with metrics turned off and some on within
the same system, use two runtime entries in `/etc/docker/daemon.json` with only
one of them having the `--metric-server` flag set.

## Exporting data to Prometheus

The metric server exposes a
[standard `/metrics` HTTP endpoint](https://prometheus.io/docs/instrumenting/exposition_formats/)
on the address given by the `--metric-server` flag passed to `runsc
metric-server`. Simply point Prometheus at this address.

If desired, you can change the
[exporter name](https://prometheus.io/docs/instrumenting/writing_exporters/)
(prefix applied to all metric names) using the `--exporter-prefix` flag. It
defaults to `runsc_`.

The sandbox metrics exported may be filtered by using the optional `GET`
parameter `runsc-sandbox-metrics-filter`, e.g.
`/metrics?runsc-sandbox-metrics-filter=fs_.*`. Metric names must fully match
this regular expression. Note that this filtering is performed before prepending
`--exporter-prefix` to metric names.

The metric server also supports listening on a
[Unix Domain Socket](https://en.wikipedia.org/wiki/Unix_domain_socket). This can
be convenient to avoid reserving port numbers on the machine's network
interface, or for tighter control over who can read the data. Clients should
talk HTTP over this UDS.
While Prometheus doesn't natively support reading
metrics from a UDS, this feature can be used in conjunction with a tool such as
[`socat` to re-expose this as a regular TCP port](https://serverfault.com/questions/517906/how-to-expose-a-unix-domain-socket-directly-over-tcp)
within another context (e.g. a tightly-managed network namespace that Prometheus
runs in).

```
$ sudo runsc --root=/var/run/docker/runtime-runc/moby --metric-server=/run/docker/runsc-metrics.sock metric-server &

$ sudo curl --unix-socket /run/docker/runsc-metrics.sock http://runsc-metrics/metrics
# Data for runsc metric server exporting data for sandboxes in root directory /var/run/docker/runtime-runc/moby
# [...]
# End of metric data.

# Set up socat to forward requests from *:1337 to /run/docker/runsc-metrics.sock in its own network namespace:
$ sudo unshare --net socat TCP-LISTEN:1337,reuseaddr,fork UNIX-CONNECT:/run/docker/runsc-metrics.sock &

# Set up basic networking for socat's network namespace:
$ sudo nsenter --net="/proc/$(pidof socat)/ns/net" sh -c 'ip link set lo up && ip route add default dev lo'

# Grab metric data from this namespace:
$ sudo nsenter --net="/proc/$(pidof socat)/ns/net" curl http://localhost:1337/metrics
# Data for runsc metric server exporting data for sandboxes in root directory /var/run/docker/runtime-runc/moby
# [...]
# End of metric data.
```

## Running the metric server in a sandbox

If you would like to run the metric server in a gVisor sandbox, you may do so,
provided that you give it access to the OCI runtime root directory, forward the
network port it binds to for external access, and enable host UDS support.

**WARNING**: Doing this does not provide you the full security of gVisor, as it
still grants the metric server full control over all running gVisor sandboxes on
the system.
This step is only a defense-in-depth measure.

To do this, add a runtime with the `--host-uds=all` flag to
`/etc/docker/daemon.json`. The metric server needs the ability to open existing
UDSs (in order to communicate with running sandboxes), and to create new UDSs
(in order to create and listen on `/run/docker/runsc-metrics.sock`).

```json
{
    "runtimes": {
        "runsc": {
            "path": "/path/to/runsc",
            "runtimeArgs": [
                "--metric-server=/run/docker/runsc-metrics.sock"
            ]
        },
        "runsc-metric-server": {
            "path": "/path/to/runsc",
            "runtimeArgs": [
                "--metric-server=/run/docker/runsc-metrics.sock",
                "--host-uds=all"
            ]
        }
    }
}
```

Then start the metric server with this runtime, passing through the directories
containing the control files `runsc` uses to detect and communicate with running
sandboxes:

```shell
$ docker run -d --runtime=runsc-metric-server --name=runsc-metric-server \
    --volume="$(which runsc):/runsc:ro" \
    --volume=/var/run/docker/runtime-runc/moby:/var/run/docker/runtime-runc/moby \
    --volume=/run/docker:/run/docker \
    --volume=/var/run:/var/run \
    alpine \
        /runsc \
            --root=/var/run/docker/runtime-runc/moby \
            --metric-server=/run/docker/runsc-metrics.sock \
            --debug --debug-log=/dev/stderr \
            metric-server
```

Yes, this means the metric server will report data about its own sandbox:

```
$ metric_server_id="$(docker inspect --format='{{.ID}}' runsc-metric-server)"
$ sudo curl --unix-socket /run/docker/runsc-metrics.sock http://runsc-metrics/metrics | grep "$metric_server_id"
#   - Snapshot with 175 data points taken at 2023-01-25 15:45:33.70256855 -0800 -0800: map[iteration:2407456650315156914 sandbox:737ce142058561d764ad870d028130a29944821dd918c7979351b249d5d30481]
runsc_fs_opens{iteration="2407456650315156914",sandbox="737ce142058561d764ad870d028130a29944821dd918c7979351b249d5d30481"} 54 1674690333702
runsc_fs_read_wait{iteration="2407456650315156914",sandbox="737ce142058561d764ad870d028130a29944821dd918c7979351b249d5d30481"} 0 1674690333702
runsc_fs_reads{iteration="2407456650315156914",sandbox="737ce142058561d764ad870d028130a29944821dd918c7979351b249d5d30481"} 52 1674690333702
# [...]
```

## Labeling pods on Kubernetes

When using Kubernetes, users typically deal with pod names and container names.
On Kubelet machines, the underlying container names passed to the runtime are
non-human-friendly hexadecimal strings.

In order to provide more user-friendly labels, the metric server will pick up
the `io.kubernetes.cri.sandbox-name` and `io.kubernetes.cri.sandbox-namespace`
annotations provided by `containerd`, and automatically add these as labels
(`pod_name` and `namespace_name` respectively) for each per-sandbox metric.

## Metrics exported

The metric server exports a lot of gVisor-internal metrics, and generates its
own metrics as well. All metrics have documentation and type annotations in the
`/metrics` output, and this section aims to document some useful ones.

### Process-wide metrics

* `process_start_time_seconds`: Unix timestamp representing the time at which
  the metric server started. This specific metric name is used by Prometheus,
  and as such its name is not affected by the `--exporter-prefix` flag. This
  metric is process-wide and has no labels.
* `num_sandboxes_total`: A process-wide metric representing the total number
  of sandboxes that the metric server knows about.
* `num_sandboxes_running`: A process-wide metric representing the number of
  running sandboxes that the metric server knows about.
* `num_sandboxes_broken_metrics`: A process-wide metric representing the
  number of sandboxes from which the metric server could not get metric data.
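
As a rough illustration of how the process-wide values might be consumed, the
following Python sketch parses exposition-format text and keeps only the samples
that carry no labels. The sample text, the `process_wide_metrics` helper, and
the `runsc_`-prefixed names used for the sandbox counters are assumptions made
for this example; check the real `/metrics` output of your metric server for the
exact names it exposes.

```python
import re

# Hypothetical excerpt of /metrics output. The "runsc_" prefix on the
# sandbox counters and all values are assumptions for illustration only.
SAMPLE = """\
# HELP process_start_time_seconds Unix timestamp at which the process started.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1674598082.698509 1674598109532
runsc_num_sandboxes_total 3 1674598109532
runsc_num_sandboxes_running 2 1674598109532
runsc_num_sandboxes_broken_metrics 1 1674598109532
runsc_fs_opens{sandbox="32beefcafe"} 12 1674598109532
"""

# Matches "name value [timestamp]" sample lines that have no label set.
UNLABELED = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)\s+(\S+)(?:\s+\d+)?$")


def process_wide_metrics(text):
    """Return {name: value} for every sample line without labels."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # Skip blank lines, HELP/TYPE lines, and comments.
        match = UNLABELED.match(line)
        if match:
            result[match.group(1)] = float(match.group(2))
    return result


metrics = process_wide_metrics(SAMPLE)
not_running = (
    metrics["runsc_num_sandboxes_total"] - metrics["runsc_num_sandboxes_running"]
)
print(f"sandboxes not running: {not_running:.0f}")
print(f"broken metrics: {metrics['runsc_num_sandboxes_broken_metrics']:.0f}")
```

In a real setup the text would come from the `/metrics` endpoint rather than an
inline string, and a Prometheus client library would usually be a sturdier
choice than a hand-written regular expression.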

### Per-sandbox metrics

* `sandbox_presence`: A per-sandbox metric that is set to `1` for each sandbox
  that the metric server knows about. This can be used to join with other
  per-sandbox or per-pod metrics for which metric existence is not guaranteed.
* `sandbox_running`: A per-sandbox metric that is set to `1` for each sandbox
  that the metric server knows about and that is actively running. This can be
  used in conjunction with `sandbox_presence` to determine the set of
  sandboxes that aren't running; useful if you want to alert about sandboxes
  that are down.
* `sandbox_metadata`: A per-sandbox metric that carries a superset of the
  typical per-sandbox labels found on other per-sandbox metrics. These extra
  labels contain useful metadata about the sandbox, such as the version
  number, [platform](platforms.md), and [network type](networking.md) being
  used.
* `sandbox_capabilities`: A per-sandbox, per-capability metric that carries
  the union of all capabilities present on at least one container of the
  sandbox. Can optionally be filtered to only a subset of capabilities using
  the `runsc-capability-filter` GET parameter on `/metrics` requests (regular
  expression). Useful for auditing and aggregating the capabilities you rely
  on across multiple sandboxes.
* `sandbox_creation_time_seconds`: A per-sandbox Unix timestamp representing
  the time at which this sandbox was created.
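
The pairing of `sandbox_presence` and `sandbox_running` described above can be
sketched as a set difference. This Python example is only an illustration: the
sample text, the sandbox IDs, the `sandboxes_with` helper, and the assumption
that the metrics appear with the default `runsc_` prefix are all hypothetical,
and the label parser is simplified (it does not handle escaped quotes in label
values).

```python
import re

# Hypothetical /metrics excerpt. Metric names assume the default "runsc_"
# exporter prefix; the IDs and values are made up for illustration.
SAMPLE = """\
runsc_sandbox_presence{iteration="42asdf",sandbox="32beefcafe"} 1 1674599159819
runsc_sandbox_presence{iteration="77qwer",sandbox="deadbeef00"} 1 1674599159819
runsc_sandbox_running{iteration="42asdf",sandbox="32beefcafe"} 1 1674599159819
"""

# Simplified patterns: "name{labels} value ..." and key="value" label pairs.
SAMPLE_LINE = re.compile(r"^(\w+)\{(.*)\}\s+(\S+)")
LABEL = re.compile(r'(\w+)="([^"]*)"')


def sandboxes_with(text, metric_name):
    """Return the (sandbox, iteration) pairs for which metric_name equals 1."""
    found = set()
    for line in text.splitlines():
        match = SAMPLE_LINE.match(line.strip())
        if not match or match.group(1) != metric_name:
            continue
        if float(match.group(3)) != 1:
            continue
        labels = dict(LABEL.findall(match.group(2)))
        found.add((labels.get("sandbox"), labels.get("iteration")))
    return found


present = sandboxes_with(SAMPLE, "runsc_sandbox_presence")
running = sandboxes_with(SAMPLE, "runsc_sandbox_running")

# Sandboxes the metric server knows about that are not actively running.
down = present - running
print(sorted(down))
```

Keying on both `sandbox` and `iteration` keeps successive instances of a sandbox
with the same ID apart, which matches the purpose of the `iteration` label.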