---
title: Log queries
weight: 10
---
# Log queries

All LogQL queries contain a **log stream selector**.

Optionally, the log stream selector can be followed by a **log pipeline**. A log pipeline is a set of stage expressions that are chained together and applied to the selected log streams. Each expression can filter out, parse, or mutate log lines and their respective labels.

The following example shows a full log query in action:

```logql
{container="query-frontend",namespace="loki-dev"} |= "metrics.go" | logfmt | duration > 10s and throughput_mb < 500
```

The query is composed of:

- a log stream selector `{container="query-frontend",namespace="loki-dev"}` which targets the `query-frontend` container in the `loki-dev` namespace.
- a log pipeline `|= "metrics.go" | logfmt | duration > 10s and throughput_mb < 500` which keeps only the log lines that contain the string `metrics.go`, then parses each log line to extract more labels and filters with them.

> To avoid escaping special characters you can use the `` ` `` (backtick) instead of `"` when quoting strings.
> For example `` `\w+` `` is the same as `"\\w+"`.
> This is especially useful when writing a regular expression which contains multiple backslashes that require escaping.

## Log stream selector

The stream selector determines which log streams to include in a query's results.
A log stream is a unique source of log content, such as a file.
A more granular log stream selector then reduces the number of searched streams to a manageable volume.
This means that the labels passed to the log stream selector will affect the relative performance of the query's execution.

The log stream selector is specified by one or more comma-separated key-value pairs. Each key is a log label and each value is that label's value.
Curly braces (`{` and `}`) delimit the stream selector.

Consider this stream selector:

```logql
{app="mysql",name="mysql-backup"}
```

All log streams that have both a label of `app` whose value is `mysql`
and a label of `name` whose value is `mysql-backup` will be included in
the query results.
A stream may contain other pairs of labels and values,
but only the specified pairs within the stream selector are used to determine
which streams will be included within the query results.

The same rules that apply for [Prometheus Label Selectors](https://prometheus.io/docs/prometheus/latest/querying/basics/#instant-vector-selectors) apply for Grafana Loki log stream selectors.

The `=` operator after the label name is a **label matching operator**.
The following label matching operators are supported:

- `=`: exactly equal
- `!=`: not equal
- `=~`: regex matches
- `!~`: regex does not match

Regex log stream examples:

- `{name =~ "mysql.+"}`
- `{name !~ "mysql.+"}`
- `` {name !~ `mysql-\d+`} ``

**Note:** The `=~` regex operator is fully anchored, meaning the regex must match against the *entire* string, including newlines. The regex `.` character does not match newlines by default. If you want the regex dot character to match newlines you can use the single-line flag, like so: `(?s)search_term.+` matches `search_term\n`.

## Log pipeline

A log pipeline can be appended to a log stream selector to further process and filter log streams. It is composed of a set of expressions. Each expression is executed in left-to-right sequence for each log line. If an expression filters out a log line, the pipeline will stop processing the current log line and start processing the next log line.

Some expressions can mutate the log content and respective labels,
which will then be available for further filtering and processing in subsequent expressions.
An example that mutates is the expression

```
| line_format "{{.status_code}}"
```

Log pipeline expressions fall into one of three categories:

- Filtering expressions: [line filter expressions](#line-filter-expression) and [label filter expressions](#label-filter-expression)
- [Parsing expressions](#parser-expression)
- Formatting expressions: [line format expressions](#line-format-expression) and [label format expressions](#labels-format-expression)

### Line filter expression

The line filter expression does a distributed `grep`
over the aggregated logs from the matching log streams.
It searches the contents of the log line,
discarding those lines that do not match the case-sensitive expression.

Each line filter expression has a **filter operator**
followed by text or a regular expression.
These filter operators are supported:

- `|=`: Log line contains string
- `!=`: Log line does not contain string
- `|~`: Log line contains a match to the regular expression
- `!~`: Log line does not contain a match to the regular expression

Line filter expression examples:

- Keep log lines that have the substring "error":

    ```
    |= "error"
    ```

    A complete query using this example:

    ```
    {job="mysql"} |= "error"
    ```

- Discard log lines that have the substring "kafka.server:type=ReplicaManager":

    ```
    != "kafka.server:type=ReplicaManager"
    ```

    A complete query using this example:

    ```
    {instance=~"kafka-[23]",name="kafka"} != "kafka.server:type=ReplicaManager"
    ```

- Keep log lines that contain a substring that starts with `tsdb-ops` and ends with `io:2003`.
    A complete query with a regular expression:

    ```
    {name="kafka"} |~ "tsdb-ops.*io:2003"
    ```

- Keep log lines that contain a substring that starts with `error=`,
and is followed by 1 or more word characters. A complete query with a regular expression:

    ```
    {name="cassandra"} |~ `error=\w+`
    ```

Filter operators can be chained.
Filters are applied sequentially.
Query results will have satisfied every filter.
This complete query example will give results that include the string `error`,
and do not include the string `timeout`.

```logql
{job="mysql"} |= "error" != "timeout"
```

When using `|~` and `!~`, Go (as in [Golang](https://golang.org/)) [RE2 syntax](https://github.com/google/re2/wiki/Syntax) regex may be used.
The matching is case-sensitive by default.
Switch to case-insensitive matching by prefixing the regular expression
with `(?i)`.

While line filter expressions could be placed anywhere within a log pipeline,
it is almost always better to have them at the beginning.
Placing them at the beginning improves the performance of the query,
as it only does further processing when a line matches.
For example,
while the results will be the same,
the query specified with

```
{job="mysql"} |= "error" | json | line_format "{{.err}}"
```

will always run faster than

```
{job="mysql"} | json | line_format "{{.message}}" |= "error"
```

Line filter expressions are the fastest way to filter logs once the
log stream selectors have been applied.

Line filter expressions support matching IP addresses. See [Matching IP addresses](../ip/) for details.

### Label filter expression

Label filter expressions allow filtering log lines using their original and extracted labels. A label filter expression can contain multiple predicates.

A predicate contains a **label identifier**, an **operation** and a **value** to compare the label with.

For example, with `cluster="namespace"` the `cluster` is the label identifier, the operation is `=` and the value is "namespace". The label identifier is always on the left side of the operation.

We support multiple **value** types which are automatically inferred from the query input.

- **String** is double quoted or backticked such as `"200"` or `` `us-central1` ``.
- **[Duration](https://golang.org/pkg/time/#ParseDuration)** is a sequence of decimal numbers, each with optional fraction and a unit suffix, such as "300ms", "1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
- **Number** is a floating-point number (64 bits), such as `250` or `89.923`.
- **Bytes** is a sequence of decimal numbers, each with optional fraction and a unit suffix, such as "42MB", "1.5Kib" or "20b". Valid bytes units are "b", "kib", "kb", "mib", "mb", "gib", "gb", "tib", "tb", "pib", "pb", "eib", "eb".

The String type works exactly like the Prometheus label matchers used in the [log stream selector](#log-stream-selector). This means you can use the same operations (`=`, `!=`, `=~`, `!~`).

> The string type is the only one that can filter out a log line with a label `__error__`.

Using Duration, Number and Bytes will convert the label value prior to comparison and support the following comparators:

- `==` or `=` for equality.
- `!=` for inequality.
- `>` and `>=` for greater than and greater than or equal.
- `<` and `<=` for less than and less than or equal.

For instance, `logfmt | duration > 1m and bytes_consumed > 20MB`

If the conversion of the label value fails, the log line is not filtered and an `__error__` label is added. To filter those errors see the [pipeline errors](../#pipeline-errors) section.
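
As a rough illustration of the conversion behaviour described above, the following sketch appends a string predicate on `__error__` so that lines whose label values could not be converted are dropped from the results. The stream selector `{container="frontend"}` and the label names `duration` and `bytes_consumed` are assumptions chosen for the example, not names taken from a real deployment.

```logql
{container="frontend"} | logfmt | duration > 1m and bytes_consumed > 20MB | __error__=""
```

Because `__error__` is matched as a string, the empty-string comparison keeps only lines that carry no pipeline error.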

You can chain multiple predicates using `and` and `or` which respectively express the `and` and `or` binary operations. `and` can be equivalently expressed by a comma, a space or another pipe. Label filters can be placed anywhere in a log pipeline.

This means that all the following expressions are equivalent:

```logql
| duration >= 20ms or size == 20kb and method!~"2.."
| duration >= 20ms or size == 20kb | method!~"2.."
| duration >= 20ms or size == 20kb , method!~"2.."
| duration >= 20ms or size == 20kb method!~"2.."
```

The precedence for evaluation of multiple predicates is left to right. You can wrap predicates with parentheses to force a different precedence.

These examples are equivalent:

```logql
| duration >= 20ms or method="GET" and size <= 20KB
| ((duration >= 20ms or method="GET") and size <= 20KB)
```

To evaluate the logical `and` first, use parentheses, as in this example:

```logql
| duration >= 20ms or (method="GET" and size <= 20KB)
```

> Label filter expressions are the only expressions allowed after the unwrap expression. This is mainly to allow filtering errors from the metric extraction.

Label filter expressions support matching IP addresses. See [Matching IP addresses](../ip/) for details.

### Parser expression

Parser expressions can parse and extract labels from the log content. Those extracted labels can then be used for filtering using [label filter expressions](#label-filter-expression) or for [metric aggregations](../metric_queries).

Extracted label keys are automatically sanitized by all parsers to follow the Prometheus metric name convention. (They can only contain ASCII letters and digits, as well as underscores and colons. They cannot start with a digit.)

For instance, the pipeline `| json` will produce the following mapping:
```json
{ "a.b": {"c": "d"}, "e": "f" }
```
->
```
{a_b_c="d", e="f"}
```

In case of errors, for instance if the line is not in the expected format, the log line won't be filtered but instead will get a new `__error__` label added.

If an extracted label key name already exists in the original log stream, the extracted label key will be suffixed with the `_extracted` keyword to make the distinction between the two labels. You can forcefully override the original label using a [label formatter expression](#labels-format-expression). However, if an extracted key appears twice, only the latest label value will be kept.

Loki supports [JSON](#json), [logfmt](#logfmt), [pattern](#pattern), [regexp](#regular-expression) and [unpack](#unpack) parsers.

It's easier to use the predefined parsers `json` and `logfmt` when you can. If you can't, the `pattern` and `regexp` parsers can be used for log lines with an unusual structure. The `pattern` parser is easier and faster to write; it also outperforms the `regexp` parser.
Multiple parsers can be used by a single log pipeline. This is useful for parsing complex logs. There are examples in [Multiple parsers](#multiple-parsers).

#### JSON

The **json** parser operates in two modes:

1. **without** parameters:

   Adding `| json` to your pipeline will extract all json properties as labels if the log line is a valid json document.
   Nested properties are flattened into label keys using the `_` separator.

   Note: **Arrays are skipped**.

   For example, the json parser will extract from the following document:

   ```json
   {
       "protocol": "HTTP/2.0",
       "servers": ["129.0.1.1","10.2.1.3"],
       "request": {
           "time": "6.032",
           "method": "GET",
           "host": "foo.grafana.net",
           "size": "55",
           "headers": {
             "Accept": "*/*",
             "User-Agent": "curl/7.68.0"
           }
       },
       "response": {
           "status": 401,
           "size": "228",
           "latency_seconds": "6.031"
       }
   }
   ```

   The following list of labels:

   ```kv
   "protocol" => "HTTP/2.0"
   "request_time" => "6.032"
   "request_method" => "GET"
   "request_host" => "foo.grafana.net"
   "request_size" => "55"
   "response_status" => "401"
   "response_size" => "228"
   "response_latency_seconds" => "6.031"
   ```

2. **with** parameters:

   Using `| json label="expression", another="expression"` in your pipeline will extract only the
   specified json fields to labels. You can specify one or more expressions in this way, the same
   as [`label_format`](#labels-format-expression); all expressions must be quoted.

   Currently, we only support field access (`my.field`, `my["field"]`) and array access (`list[0]`), and any combination
   of these in any level of nesting (`my.list[0]["field"]`).

   For example, `| json first_server="servers[0]", ua="request.headers[\"User-Agent\"]"` will extract from the following document:

   ```json
   {
       "protocol": "HTTP/2.0",
       "servers": ["129.0.1.1","10.2.1.3"],
       "request": {
           "time": "6.032",
           "method": "GET",
           "host": "foo.grafana.net",
           "size": "55",
           "headers": {
             "Accept": "*/*",
             "User-Agent": "curl/7.68.0"
           }
       },
       "response": {
           "status": 401,
           "size": "228",
           "latency_seconds": "6.031"
       }
   }
   ```

   The following list of labels:

   ```kv
   "first_server" => "129.0.1.1"
   "ua" => "curl/7.68.0"
   ```

   If an array or an object is returned by an expression, it will be assigned to the label in json format.

   For example, `| json server_list="servers", headers="request.headers"` will extract:

   ```kv
   "server_list" => `["129.0.1.1","10.2.1.3"]`
   "headers" => `{"Accept": "*/*", "User-Agent": "curl/7.68.0"}`
   ```

   If the label to be extracted has the same name as the original JSON field, the expression can be written as just `| json <label>`.

   For example, to extract the `servers` field as a label, the expression can be written as follows:

   `| json servers` will extract:

   ```kv
   "servers" => `["129.0.1.1","10.2.1.3"]`
   ```

   Note that `| json servers` is the same as `| json servers="servers"`.

#### logfmt

The **logfmt** parser can be added using `| logfmt` and will extract all keys and values from the [logfmt](https://brandur.org/logfmt) formatted log line.

For example the following log line:

```logfmt
at=info method=GET path=/ host=grafana.net fwd="124.133.124.161" service=8ms status=200
```

will get those labels extracted:

```kv
"at" => "info"
"method" => "GET"
"path" => "/"
"host" => "grafana.net"
"fwd" => "124.133.124.161"
"service" => "8ms"
"status" => "200"
```

#### Pattern

The pattern parser allows the explicit extraction of fields from log lines by defining a pattern expression (`| pattern "<pattern-expression>"`). The expression matches the structure of a log line.

Consider this NGINX log line.

```log
0.191.12.2 - - [10/Jun/2021:09:14:29 +0000] "GET /api/plugins/versioncheck HTTP/1.1" 200 2 "-" "Go-http-client/2.0" "13.76.247.102, 34.120.177.193" "TLSv1.2" "US" ""
```

This log line can be parsed with the expression

`<ip> - - <_> "<method> <uri> <_>" <status> <size> <_> "<agent>" <_>`

to extract these fields:

```kv
"ip" => "0.191.12.2"
"method" => "GET"
"uri" => "/api/plugins/versioncheck"
"status" => "200"
"size" => "2"
"agent" => "Go-http-client/2.0"
```

A pattern expression is composed of captures and literals.

A capture is a field name delimited by the `<` and `>` characters. `<example>` defines the field name `example`.
An unnamed capture appears as `<_>`. The unnamed capture skips matched content.

Captures are matched from the line beginning or the previous set of literals, to the line end or the next set of literals.
If a capture is not matched, the pattern parser will stop.

Literals can be any sequence of UTF-8 characters, including whitespace characters.

By default, a pattern expression is anchored at the start of the log line. If the expression starts with literals, then the log line must also start with the same set of literals.
Use `<_>` at the beginning of the expression if you don't want to anchor the expression at the start.

Consider the log line

```log
level=debug ts=2021-06-10T09:24:13.472094048Z caller=logging.go:66 traceID=0568b66ad2d9294c msg="POST /loki/api/v1/push (204) 16.652862ms"
```

To match `msg="`, use the expression:

```pattern
<_> msg="<method> <path> (<status>) <latency>"
```

A pattern expression is invalid if

- It does not contain any named capture.
- It contains two consecutive captures not separated by whitespace characters.

#### Regular expression

Unlike the logfmt and json parsers, which can implicitly extract all values without any parameters, the regexp parser takes a single parameter `| regexp "<re>"` which is the regular expression using the [Golang](https://golang.org/) [RE2 syntax](https://github.com/google/re2/wiki/Syntax).

The regular expression must contain at least one named sub-match (e.g. `(?P<name>re)`). Each sub-match will extract a different label.

For example the parser `| regexp "(?P<method>\\w+) (?P<path>[\\w|/]+) \\((?P<status>\\d+?)\\) (?P<duration>.*)"` will extract from the following line:

```log
POST /api/prom/api/v1/query_range (200) 1.5s
```

those labels:

```kv
"method" => "POST"
"path" => "/api/prom/api/v1/query_range"
"status" => "200"
"duration" => "1.5s"
```

#### unpack

The `unpack` parser parses a JSON log line, unpacking all embedded labels from Promtail's [`pack` stage]({{< relref "../clients/promtail/stages/pack.md" >}}).
**A special property `_entry` will also be used to replace the original log line**.

For example, using `| unpack` with the log line:

```json
{
  "container": "myapp",
  "pod": "pod-3223f",
  "_entry": "original log message"
}
```

extracts the `container` and `pod` labels; it sets `original log message` as the new log line.

You can combine the `unpack` and `json` parsers (or any other parsers) if the original embedded log line is of a specific format.

### Line format expression

The line format expression can rewrite the log line content by using the [text/template](https://golang.org/pkg/text/template/) format.
It takes a single string parameter `| line_format "{{.label_name}}"`, which is the template format. All labels are injected as variables into the template and are available to use with the `{{.label_name}}` notation.

For example the following expression:

```logql
{container="frontend"} | logfmt | line_format "{{.query}} {{.duration}}"
```

will extract and rewrite the log line to only contain the query and the duration of a request.

You can use a double quoted string for the template or backticks `` `{{.label_name}}` `` to avoid the need to escape special characters.

`line_format` also supports `math` functions. Example:

If we have the following labels `ip=1.1.1.1`, `status=200` and `duration=3000` (ms), we can divide the duration by `1000` to get the value in seconds.

```logql
{container="frontend"} | logfmt | line_format "{{.ip}} {{.status}} {{div .duration 1000}}"
```

The above query will give us the `line` as `1.1.1.1 200 3`

See [template functions](../template_functions/) to learn about available functions in the template format.

### Labels format expression

The `| label_format` expression can rename, modify or add labels. It takes as a parameter a comma-separated list of equality operations, enabling multiple operations at once.

When both sides are label identifiers, for example `dst=src`, the operation will rename the `src` label into `dst`.

The right side can alternatively be a template string (double quoted or backtick), for example `dst="{{.status}} {{.query}}"`, in which case the `dst` label value is replaced by the result of the [text/template](https://golang.org/pkg/text/template/) evaluation. This is the same template engine as the `| line_format` expression, which means labels are available as variables and you can use the same list of functions.

In both cases, if the destination label doesn't exist, then a new one is created.

The renaming form `dst=src` will _drop_ the `src` label after remapping it to the `dst` label. However, the _template_ form will preserve the referenced labels, such that `dst="{{.src}}"` results in both `dst` and `src` having the same value.

> A single label name can only appear once per expression. This means `| label_format foo=bar,foo="new"` is not allowed but you can use two expressions for the desired effect: `| label_format foo=bar | label_format foo="new"`
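
To tie the two forms together, here is a minimal sketch of a complete query that uses both in one expression. It assumes a hypothetical `frontend` container whose logfmt lines carry `duration`, `status` and `query` keys; the destination label names `latency` and `summary` are likewise made up for illustration.

```logql
{container="frontend"} | logfmt | label_format latency=duration, summary="{{.status}} {{.query}}"
```

Under those assumptions, `duration` is renamed to `latency` (so `duration` is dropped), while `summary` is created from the template and `status` and `query` are preserved.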