
     1  ---
     2  title: Log queries
     3  weight: 10
     4  ---
     5  # Log queries
     6  
     7  All LogQL queries contain a **log stream selector**.
     8  
     9  ![parts of a query](../query_components.png)
    10  
    11  
    12  Optionally, the log stream selector can be followed by a **log pipeline**. A log pipeline is a set of stage expressions that are chained together and applied to the selected log streams. Each expression can filter out, parse, or mutate log lines and their respective labels.
    13  
    14  The following example shows a full log query in action:
    15  
    16  ```logql
    17  {container="query-frontend",namespace="loki-dev"} |= "metrics.go" | logfmt | duration > 10s and throughput_mb < 500
    18  ```
    19  
    20  The query is composed of:
    21  
    22  - a log stream selector `{container="query-frontend",namespace="loki-dev"}` which targets the `query-frontend` container in the `loki-dev` namespace.
    23  - a log pipeline `|= "metrics.go" | logfmt | duration > 10s and throughput_mb < 500` which will keep only log lines containing the string `metrics.go`, then parse each log line to extract more labels and filter with them.
    24  
    25  > To avoid escaping special characters you can use the `` ` `` (backtick) instead of `"` when quoting strings.
    26  > For example `` `\w+` `` is the same as `"\\w+"`.
    27  > This is especially useful when writing a regular expression which contains multiple backslashes that require escaping.
    28  
    29  ## Log stream selector
    30  
    31  The stream selector determines which log streams to include in a query's results.
    32  A log stream is a unique source of log content, such as a file.
    33  A more granular log stream selector reduces the number of searched streams to a manageable volume.
    34  This means that the labels passed to the log stream selector will affect the relative performance of the query's execution.
    35  
    36  The log stream selector is specified by one or more comma-separated key-value pairs. Each key is a log label and each value is that label's value.
    37  Curly braces (`{` and `}`) delimit the stream selector.
    38  
    39  Consider this stream selector:
    40  
    41  ```logql
    42  {app="mysql",name="mysql-backup"}
    43  ```
    44  
    45  All log streams that have both a label of `app` whose value is `mysql`
    46  and a label of `name` whose value is `mysql-backup` will be included in
    47  the query results.
    48  A stream may contain other pairs of labels and values,
    49  but only the specified pairs within the stream selector are used to determine
    50  which streams will be included within the query results.
    51  
    52  The same rules that apply for [Prometheus Label Selectors](https://prometheus.io/docs/prometheus/latest/querying/basics/#instant-vector-selectors) apply for Grafana Loki log stream selectors.
    53  
    54  The `=` operator after the label name is a **label matching operator**.
    55  The following label matching operators are supported:
    56  
    57  - `=`: exactly equal
    58  - `!=`: not equal
    59  - `=~`: regex matches
    60  - `!~`: regex does not match
    61  
    62  Regex log stream examples:
    63  
    64  - `{name =~ "mysql.+"}`
    65  - `{name !~ "mysql.+"}`
    66  - `` {name !~ `mysql-\d+`} ``
    67  
    68  **Note:** The `=~` regex operator is fully anchored, meaning the regex must match against the *entire* string, including newlines. The regex `.` character does not match newlines by default. If you want the regex dot character to match newlines you can use the single-line flag, like so: `(?s)search_term.+` matches `search_term\n`.
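These matching rules can be tried outside Loki. The sketch below uses Python's `re` module purely for illustration; its syntax matches RE2 for these constructs, and `re.fullmatch` stands in for the fully anchored label matching:

```python
import re

# Illustrative sketch only: Python's re module shares RE2's syntax for
# these constructs, and fullmatch approximates the fully anchored =~
# operator used by label matchers.
def label_matches(pattern: str, value: str) -> bool:
    return re.fullmatch(pattern, value) is not None
```

For example, `label_matches("mysql.+", "mysql-backup")` holds, while `label_matches("mysql", "mysql-backup")` does not, because the whole label value must match.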
    69  
    70  ## Log pipeline
    71  
    72  A log pipeline can be appended to a log stream selector to further process and filter log streams. It is composed of a set of expressions. Each expression is executed in left to right sequence for each log line. If an expression filters out a log line, the pipeline will stop processing the current log line and start processing the next log line.
    73  
    74  Some expressions can mutate the log content and respective labels,
    75  which will be then be available for further filtering and processing in subsequent expressions.
    76  An example that mutates is the expression
    77  
    78  ```
    79  | line_format "{{.status_code}}"
    80  ```
    81  
    82  
    83  Log pipeline expressions fall into one of three categories:
    84  
    85  - Filtering expressions: [line filter expressions](#line-filter-expression)
    86  and
    87  [label filter expressions](#label-filter-expression)
    88  - [Parsing expressions](#parser-expression)
    89  - Formatting expressions: [line format expressions](#line-format-expression)
    90  and
    91  [label format expressions](#labels-format-expression)
    92  
    93  ### Line filter expression
    94  
    95  The line filter expression does a distributed `grep`
    96  over the aggregated logs from the matching log streams.
    97  It searches the contents of the log line,
    98  discarding those lines that do not match the case-sensitive expression.
    99  
   100  Each line filter expression has a **filter operator**
   101  followed by text or a regular expression.
   102  These filter operators are supported:
   103  
   104  - `|=`: Log line contains string
   105  - `!=`: Log line does not contain string
   106  - `|~`: Log line contains a match to the regular expression
   107  - `!~`: Log line does not contain a match to the regular expression
   108  
   109  Line filter expression examples:
   110  
   111  - Keep log lines that have the substring "error":
   112  
   113      ```
   114      |= "error"
   115      ```
   116  
   117      A complete query using this example:
   118  
   119      ```
   120      {job="mysql"} |= "error"
   121      ```
   122  
   123  - Discard log lines that have the substring "kafka.server:type=ReplicaManager":
   124  
   125      ```
   126      != "kafka.server:type=ReplicaManager"
   127      ```
   128  
   129      A complete query using this example:
   130  
   131      ```
   132      {instance=~"kafka-[23]",name="kafka"} != "kafka.server:type=ReplicaManager"
   133      ```
   134  
   135  - Keep log lines that contain a substring that starts with `tsdb-ops` and ends with `io:2003`. A complete query with a regular expression:
   136  
   137      ```
   138      {name="kafka"} |~ "tsdb-ops.*io:2003"
   139      ```
   140  
   141  - Keep log lines that contain a substring that starts with `error=`,
   142  and is followed by 1 or more word characters. A complete query with a regular expression:
   143  
   144      ```
   145      {name="cassandra"} |~  `error=\w+`
   146      ```
   147  
   148  Filter operators can be chained.
   149  Filters are applied sequentially.
   150  Query results will have satisfied every filter.
   151  This complete query example will give results that include the string `error`,
   152  and do not include the string `timeout`.
   153  
   154  ```logql
   155  {job="mysql"} |= "error" != "timeout"
   156  ```
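The sequential semantics of chained filters can be sketched in Python (an illustrative simplification, not Loki's implementation): each filter is applied in turn, and a line survives only if it satisfies all of them.

```python
# Illustrative sketch of chained line filters: each (operator, needle)
# pair is applied in sequence; a line appears in the result only if it
# satisfies every filter.
def apply_filters(lines, filters):
    for op, needle in filters:
        if op == "|=":
            lines = [line for line in lines if needle in line]
        elif op == "!=":
            lines = [line for line in lines if needle not in line]
    return lines
```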
   157  
   158  When using `|~` and `!~`, Go (as in [Golang](https://golang.org/)) [RE2 syntax](https://github.com/google/re2/wiki/Syntax) regex may be used.
   159  The matching is case-sensitive by default.
   160  Switch to case-insensitive matching by prefixing the regular expression
   161  with `(?i)`.
   162  
   163  While line filter expressions could be placed anywhere within a log pipeline,
   164  it is almost always better to have them at the beginning.
   165  Placing them at the beginning improves the performance of the query,
   166  as it only does further processing when a line matches.
   167  For example,
   168  while the results will be the same,
   169  the query specified with
   170  
   171  ```
   172  {job="mysql"} |= "error" | json | line_format "{{.err}}"
   173  ```
   174  
   175  will always run faster than
   176  
   177  ```
   178  {job="mysql"} | json | line_format "{{.message}}" |= "error"
   179  ```
   180  
   181  Line filter expressions are the fastest way to filter logs once the
   182  log stream selectors have been applied.
   183  
   184  Line filter expressions support matching IP addresses. See [Matching IP addresses](../ip/) for details.
   185  
   186  ### Label filter expression
   187  
   188  Label filter expressions allow filtering log lines using their original and extracted labels. They can contain multiple predicates.
   189  
   190  A predicate contains a **label identifier**, an **operation** and a **value** to compare the label with.
   191  
   192  For example, with `cluster="namespace"`, `cluster` is the label identifier, the operation is `=` and the value is `"namespace"`. The label identifier is always on the left side of the operation.
   193  
   194  We support multiple **value** types which are automatically inferred from the query input.
   195  
   196  - **String** is double quoted or backticked such as `"200"` or \``us-central1`\`.
   197  - **[Duration](https://golang.org/pkg/time/#ParseDuration)** is a sequence of decimal numbers, each with optional fraction and a unit suffix, such as "300ms", "1.5h" or "2h45m". Valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".
   198  - **Number** is a 64-bit floating-point number, such as `250` or `89.923`.
   199  - **Bytes** is a sequence of decimal numbers, each with optional fraction and a unit suffix, such as "42MB", "1.5Kib" or "20b". Valid bytes units are "b", "kib", "kb", "mib", "mb", "gib", "gb", "tib", "tb", "pib", "pb", "eib", "eb".
   200  
   201  The string type works exactly like Prometheus label matchers used in the [log stream selector](#log-stream-selector). This means you can use the same operations (`=`,`!=`,`=~`,`!~`).
   202  
   203  > The string type is the only one that can filter out a log line with a label `__error__`.
   204  
   205  Using Duration, Number and Bytes will convert the label value prior to comparison and support the following comparators:
   206  
   207  - `==` or `=` for equality.
   208  - `!=` for inequality.
   209  - `>` and `>=` for greater than and greater than or equal.
   210  - `<` and `<=` for lesser than and lesser than or equal.
   211  
   212  For instance, `logfmt | duration > 1m and bytes_consumed > 20MB`
   213  
   214  If the conversion of the label value fails, the log line is not filtered and an `__error__` label is added. To filter those errors, see the [pipeline errors](../#pipeline-errors) section.
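The convert-then-compare behaviour for durations can be sketched as follows (an assumed simplification, not Loki's code; only a subset of Go duration units is handled, and the `LabelFilterErr` value is used to illustrate the `__error__` label):

```python
import re

# Simplified sketch of a duration label filter: the label value is
# converted before comparison, and a failed conversion keeps the line
# and attaches an __error__ label instead of dropping it.
UNITS = {"ns": 1e-9, "us": 1e-6, "µs": 1e-6, "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0}

def parse_duration(value: str) -> float:
    """Parse a Go-style duration like "1.5h" or "2h45m" into seconds."""
    parts = re.findall(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)", value)
    if not parts or "".join(n + u for n, u in parts) != value:
        raise ValueError(f"invalid duration: {value}")
    return sum(float(n) * UNITS[u] for n, u in parts)

def duration_filter(labels: dict, key: str, threshold: str):
    """Keep the line if labels[key] > threshold; on conversion failure,
    keep the line and add an __error__ label."""
    try:
        keep = parse_duration(labels[key]) > parse_duration(threshold)
        return keep, labels
    except (KeyError, ValueError):
        return True, {**labels, "__error__": "LabelFilterErr"}
```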
   215  
   216  You can chain multiple predicates using `and` and `or` which respectively express the `and` and `or` binary operations. `and` can be equivalently expressed by a comma, a space or another pipe. Label filters can be placed anywhere in a log pipeline.
   217  
   218  This means that all the following expressions are equivalent:
   219  
   220  ```logql
   221  | duration >= 20ms or size == 20kb and method!~"2.."
   222  | duration >= 20ms or size == 20kb | method!~"2.."
   223  | duration >= 20ms or size == 20kb , method!~"2.."
   224  | duration >= 20ms or size == 20kb  method!~"2.."
   225  
   226  ```
   227  
   228  The precedence for evaluation of multiple predicates is left to right. You can wrap predicates with parentheses to force a different precedence.
   229  
   230  These examples are equivalent:
   231  
   232  ```logql
   233  | duration >= 20ms or method="GET" and size <= 20KB
   234  | ((duration >= 20ms or method="GET") and size <= 20KB)
   235  ```
   236  
   237  To evaluate the logical `and` first, use parentheses, as in this example:
   238  
   239  ```logql
   240  | duration >= 20ms or (method="GET" and size <= 20KB)
   241  ```
   242  
   243  > Label filter expressions are the only expression allowed after the unwrap expression. This is mainly to allow filtering errors from the metric extraction.
   244  
   245  Label filter expressions support matching IP addresses. See [Matching IP addresses](../ip/) for details.
   246  
   247  ### Parser expression
   248  
   249  Parser expressions can parse and extract labels from the log content. Those extracted labels can then be used for filtering using [label filter expressions](#label-filter-expression) or for [metric aggregations](../metric_queries).
   250  
   251  Extracted label keys are automatically sanitized by all parsers to follow the Prometheus metric name convention. (They can only contain ASCII letters and digits, as well as underscores and colons. They cannot start with a digit.)
   252  
   253  For instance, the pipeline `| json` will produce the following mapping:
   254  ```json
   255  { "a.b": {c: "d"}, e: "f" }
   256  ```
   257  ->
   258  ```
   259  {a_b_c="d", e="f"}
   260  ```
   261  
   262  In case of errors, for instance if the line is not in the expected format, the log line won't be filtered but instead will get a new `__error__` label added.
   263  
   264  If an extracted label key name already exists in the original log stream, the extracted label key will be suffixed with the `_extracted` keyword to make the distinction between the two labels. You can forcefully override the original label using a [label formatter expression](#labels-format-expression). However if an extracted key appears twice, only the latest label value will be kept.
   265  
   266  Loki supports [JSON](#json), [logfmt](#logfmt), [pattern](#pattern), [regexp](#regular-expression) and [unpack](#unpack) parsers.
   267  
   268  It's easier to use the predefined parsers `json` and `logfmt` when you can. If you can't, the `pattern` and `regexp` parsers can be used for log lines with an unusual structure. The `pattern` parser is easier and faster to write; it also outperforms the `regexp` parser.
   269  Multiple parsers can be used by a single log pipeline. This is useful for parsing complex logs. There are examples in [Multiple parsers](#multiple-parsers).
   270  
   271  #### JSON
   272  
   273  The **json** parser operates in two modes:
   274  
   275  1. **without** parameters:
   276  
   277     Adding `| json` to your pipeline will extract all json properties as labels if the log line is a valid json document.
   278     Nested properties are flattened into label keys using the `_` separator.
   279  
   280     Note: **Arrays are skipped**.
   281  
   282     For example, the json parser will extract labels from the following document:
   283  
   284     ```json
   285     {
   286         "protocol": "HTTP/2.0",
   287         "servers": ["129.0.1.1","10.2.1.3"],
   288         "request": {
   289             "time": "6.032",
   290             "method": "GET",
   291             "host": "foo.grafana.net",
   292             "size": "55",
   293             "headers": {
   294               "Accept": "*/*",
   295               "User-Agent": "curl/7.68.0"
   296             }
   297         },
   298         "response": {
   299             "status": 401,
   300             "size": "228",
   301             "latency_seconds": "6.031"
   302         }
   303     }
   304     ```
   305  
   306     The following list of labels will be extracted:
   307  
   308     ```kv
   309     "protocol" => "HTTP/2.0"
   310     "request_time" => "6.032"
   311     "request_method" => "GET"
   312     "request_host" => "foo.grafana.net"
   313     "request_size" => "55"
   314     "response_status" => "401"
   315     "response_size" => "228"
   316     "response_latency_seconds" => "6.031"
   317     ```
   318  
   319  2. **with** parameters:
   320  
   321     Using `| json label="expression", another="expression"` in your pipeline will extract only the
   322     specified json fields to labels. You can specify one or more expressions in this way, the same
   323     as [`label_format`](#labels-format-expression); all expressions must be quoted.
   324  
   325     Currently, we only support field access (`my.field`, `my["field"]`) and array access (`list[0]`), and any combination
   326     of these in any level of nesting (`my.list[0]["field"]`).
   327  
   328     For example, `| json first_server="servers[0]", ua="request.headers[\"User-Agent\"]"` will extract from the following document:
   329  
   330      ```json
   331      {
   332          "protocol": "HTTP/2.0",
   333          "servers": ["129.0.1.1","10.2.1.3"],
   334          "request": {
   335              "time": "6.032",
   336              "method": "GET",
   337              "host": "foo.grafana.net",
   338              "size": "55",
   339              "headers": {
   340                "Accept": "*/*",
   341                "User-Agent": "curl/7.68.0"
   342              }
   343          },
   344          "response": {
   345              "status": 401,
   346              "size": "228",
   347              "latency_seconds": "6.031"
   348          }
   349      }
   350      ```
   351  
   352     The following list of labels will be extracted:
   353  
   354      ```kv
   355      "first_server" => "129.0.1.1"
   356      "ua" => "curl/7.68.0"
   357      ```
   358  
   359     If an array or an object is returned by an expression, it will be assigned to the label in JSON format.
   360  
   361     For example, `| json server_list="servers", headers="request.headers"` will extract:
   362  
   363     ```kv
   364     "server_list" => `["129.0.1.1","10.2.1.3"]`
   365     "headers" => `{"Accept": "*/*", "User-Agent": "curl/7.68.0"}`
   366     ```
   367  
   368     If the label to be extracted has the same name as the original JSON field, the expression can be shortened to `| json <label>`.
   369  
   370     For example, to extract the `servers` field as a label, the expression can be written as follows:
   371  
   372     `| json servers` will extract:
   373  
   374     ```kv
   375     "servers" => `["129.0.1.1","10.2.1.3"]`
   376     ```
   377  
   378     Note that `| json servers` is the same as `| json servers="servers"`.
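The flattening performed by `| json` without parameters can be sketched like this (an assumed simplification, not Loki's implementation): nested keys are joined with `_`, arrays are skipped, and keys are sanitized to valid label names.

```python
import json
import re

# Illustrative sketch of "| json" without parameters: flatten nested
# objects with "_", skip arrays, and sanitize keys to Prometheus-style
# label names (leading-digit handling is omitted for brevity).
def json_to_labels(line: str) -> dict:
    def sanitize(key: str) -> str:
        return re.sub(r"[^a-zA-Z0-9_:]", "_", key)

    labels = {}

    def walk(prefix, value):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(prefix + [sanitize(key)], child)
        elif isinstance(value, list):
            return  # arrays are skipped
        else:
            labels["_".join(prefix)] = str(value)

    walk([], json.loads(line))
    return labels
```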
   379  
   380  #### logfmt
   381  
   382  The **logfmt** parser can be added using `| logfmt` and will extract all keys and values from the [logfmt](https://brandur.org/logfmt) formatted log line.
   383  
   384  For example the following log line:
   385  
   386  ```logfmt
   387  at=info method=GET path=/ host=grafana.net fwd="124.133.124.161" service=8ms status=200
   388  ```
   389  
   390  will get those labels extracted:
   391  
   392  ```kv
   393  "at" => "info"
   394  "method" => "GET"
   395  "path" => "/"
   396  "host" => "grafana.net"
   397  "fwd" => "124.133.124.161"
   398  "service" => "8ms"
   399  "status" => "200"
   400  ```
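The extraction above can be sketched with a minimal logfmt splitter (an assumed simplification, not Loki's implementation): the line is split into `key=value` pairs, where values may be bare tokens or double-quoted strings.

```python
import re

# Illustrative sketch of logfmt extraction: capture key=value pairs,
# where a value is either a double-quoted string or a bare token.
def logfmt_to_labels(line: str) -> dict:
    pairs = re.findall(r'(\w+)=(?:"([^"]*)"|(\S+))', line)
    return {key: quoted or bare for key, quoted, bare in pairs}
```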
   401  
   402  #### Pattern
   403  
   404  The pattern parser allows the explicit extraction of fields from log lines by defining a pattern expression (`| pattern "<pattern-expression>"`). The expression matches the structure of a log line.
   405  
   406  Consider this NGINX log line.
   407  
   408  ```log
   409  0.191.12.2 - - [10/Jun/2021:09:14:29 +0000] "GET /api/plugins/versioncheck HTTP/1.1" 200 2 "-" "Go-http-client/2.0" "13.76.247.102, 34.120.177.193" "TLSv1.2" "US" ""
   410  ```
   411  
   412  This log line can be parsed with the expression
   413  
   414  `<ip> - - <_> "<method> <uri> <_>" <status> <size> <_> "<agent>" <_>`
   415  
   416  to extract these fields:
   417  
   418  ```kv
   419  "ip" => "0.191.12.2"
   420  "method" => "GET"
   421  "uri" => "/api/plugins/versioncheck"
   422  "status" => "200"
   423  "size" => "2"
   424  "agent" => "Go-http-client/2.0"
   425  ```
   426  
   427  A pattern expression is composed of captures and literals.
   428  
   429  A capture is a field name delimited by the `<` and `>` characters. `<example>` defines the field name `example`.
   430  An unnamed capture appears as `<_>`. The unnamed capture skips matched content.
   431  
   432  Captures are matched from the line beginning or the previous set of literals, to the line end or the next set of literals.
   433  If a capture is not matched, the pattern parser will stop.
   434  
   435  Literals can be any sequence of UTF-8 characters, including whitespace characters.
   436  
   437  By default, a pattern expression is anchored at the start of the log line. If the expression starts with literals, then the log line must also start with the same set of literals. Use `<_>` at the beginning of the expression if you don't want to anchor the expression at the start.
   438  
   439  Consider the log line
   440  
   441  ```log
   442  level=debug ts=2021-06-10T09:24:13.472094048Z caller=logging.go:66 traceID=0568b66ad2d9294c msg="POST /loki/api/v1/push (204) 16.652862ms"
   443  ```
   444  
   445  To match `msg="`, use the expression:
   446  
   447  ```pattern
   448  <_> msg="<method> <path> (<status>) <latency>"
   449  ```
   450  
   451  A pattern expression is invalid if
   452  
   453  - It does not contain any named capture.
   454  - It contains two consecutive captures not separated by whitespace characters.
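To make the capture and literal mechanics concrete, here is an illustrative sketch (not Loki's implementation) that compiles a pattern expression into a lazy regular expression and applies it to the NGINX line above:

```python
import re

# Illustrative sketch of the pattern parser: <name> captures become
# non-greedy groups, <_> skips content, and the remaining text is
# matched literally. The expression is anchored at the line start.
def pattern_to_regex(expr: str) -> str:
    regex, pos = "^", 0
    for m in re.finditer(r"<(\w+)>", expr):
        regex += re.escape(expr[pos:m.start()])
        name = m.group(1)
        regex += "(?:.*?)" if name == "_" else f"(?P<{name}>.*?)"
        pos = m.end()
    return regex + re.escape(expr[pos:])

def pattern_extract(expr: str, line: str) -> dict:
    m = re.match(pattern_to_regex(expr), line)
    return m.groupdict() if m else {}
```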
   455  
   456  #### Regular expression
   457  
   458  Unlike the logfmt and json parsers, which implicitly extract all values and take no parameters, the regexp parser takes a single parameter `| regexp "<re>"` which is the regular expression using the [Golang](https://golang.org/) [RE2 syntax](https://github.com/google/re2/wiki/Syntax).
   459  
   460  The regular expression must contain at least one named sub-match (e.g. `(?P<name>re)`); each sub-match will extract a different label.
   461  
   462  For example the parser `| regexp "(?P<method>\\w+) (?P<path>[\\w|/]+) \\((?P<status>\\d+?)\\) (?P<duration>.*)"` will extract from the following line:
   463  
   464  ```log
   465  POST /api/prom/api/v1/query_range (200) 1.5s
   466  ```
   467  
   468  those labels:
   469  
   470  ```kv
   471  "method" => "POST"
   472  "path" => "/api/prom/api/v1/query_range"
   473  "status" => "200"
   474  "duration" => "1.5s"
   475  ```
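Since Python's `re` module uses the same `(?P<name>...)` named-group syntax as RE2, the extraction can be reproduced outside Loki for illustration:

```python
import re

# The same expression as above, applied with Python's re module, which
# shares RE2's named-group syntax for this case.
line = "POST /api/prom/api/v1/query_range (200) 1.5s"
m = re.match(r"(?P<method>\w+) (?P<path>[\w|/]+) \((?P<status>\d+?)\) (?P<duration>.*)", line)
labels = m.groupdict()
```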
   476  
   477  #### unpack
   478  
   479  The `unpack` parser parses a JSON log line, unpacking all embedded labels from Promtail's [`pack` stage]({{< relref "../clients/promtail/stages/pack.md" >}}).
   480  **A special property `_entry` will also be used to replace the original log line**.
   481  
   482  For example, using `| unpack` with the log line:
   483  
   484  ```json
   485  {
   486    "container": "myapp",
   487    "pod": "pod-3223f",
   488    "_entry": "original log message"
   489  }
   490  ```
   491  
   492  extracts the `container` and `pod` labels; it sets `original log message` as the new log line.
   493  
   494  You can combine the `unpack` and `json` parsers (or any other parsers) if the original embedded log line is of a specific format.
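The unpacking step can be sketched as follows (an assumed simplification, not Loki's implementation):

```python
import json

# Illustrative sketch of "| unpack": embedded string properties become
# labels, and the special _entry property replaces the log line.
def unpack(line: str):
    doc = json.loads(line)
    entry = doc.pop("_entry", line)
    labels = {key: value for key, value in doc.items() if isinstance(value, str)}
    return entry, labels
```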
   495  
   496  ### Line format expression
   497  
   498  The line format expression can rewrite the log line content by using the [text/template](https://golang.org/pkg/text/template/) format.
   499  It takes a single string parameter `| line_format "{{.label_name}}"`, which is the template format. All labels are injected as variables into the template and are available to use with the `{{.label_name}}` notation.
   500  
   501  For example the following expression:
   502  
   503  ```logql
   504  {container="frontend"} | logfmt | line_format "{{.query}} {{.duration}}"
   505  ```
   506  
   507  Will extract and rewrite the log line to contain only the query and the duration of a request.
   508  
   509  You can use a double quoted string for the template or backticks `` `{{.label_name}}` `` to avoid the need to escape special characters.
   510  
   511  `line_format` also supports `math` functions. Example:
   512  
   513  If we have the following labels `ip=1.1.1.1`, `status=200` and `duration=3000` (ms), we can divide the duration by `1000` to get the value in seconds.
   514  
   515  ```logql
   516  {container="frontend"} | logfmt | line_format "{{.ip}} {{.status}} {{div .duration 1000}}"
   517  ```
   518  
   519  The above query will give us the `line` as `1.1.1.1 200 3`.
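The arithmetic can be checked with a rough simulation in Python (this stands in for Go's text/template engine, which is what `line_format` actually uses; `div` performs integer division):

```python
# Rough simulation of the line_format template above (Go's text/template
# is the real engine): labels are injected as variables, and dividing
# the duration by 1000 yields whole seconds.
labels = {"ip": "1.1.1.1", "status": "200", "duration": 3000}
line = f'{labels["ip"]} {labels["status"]} {labels["duration"] // 1000}'
```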
   520  
   521  See [template functions](../template_functions/) to learn about available functions in the template format.
   522  
   523  ### Labels format expression
   524  
   525  The `| label_format` expression can rename, modify or add labels. It takes as a parameter a comma-separated list of equality operations, enabling multiple operations at once.
   526  
   527  When both sides are label identifiers, for example `dst=src`, the operation will rename the `src` label into `dst`.
   528  
   529  The right side can alternatively be a template string (double quoted or backtick), for example `dst="{{.status}} {{.query}}"`, in which case the `dst` label value is replaced by the result of the [text/template](https://golang.org/pkg/text/template/) evaluation. This is the same template engine as the `| line_format` expression, which means labels are available as variables and you can use the same list of functions.
   530  
   531  In both cases, if the destination label doesn't exist, then a new one is created.
   532  
   533  The renaming form `dst=src` will _drop_ the `src` label after remapping it to the `dst` label. However, the _template_ form will preserve the referenced labels, such that `dst="{{.src}}"` results in both `dst` and `src` having the same value.
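The difference between the two forms can be sketched with plain dictionaries (an illustrative simplification, not Loki's implementation):

```python
# Illustrative sketch: the rename form moves the label and drops src,
# while the template form copies the value and keeps src.
def label_format_rename(labels: dict, dst: str, src: str) -> dict:
    out = dict(labels)
    out[dst] = out.pop(src)  # dst=src drops src after remapping
    return out

def label_format_template(labels: dict, dst: str, src: str) -> dict:
    # mimics dst="{{.src}}": src is only referenced, not consumed
    return {**labels, dst: labels[src]}
```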
   534  
   535  > A single label name can only appear once per expression. This means `| label_format foo=bar,foo="new"` is not allowed but you can use two expressions for the desired effect: `| label_format foo=bar | label_format foo="new"`