---
title: Troubleshooting
---
# Troubleshooting Promtail

This document describes known failure modes of Promtail on edge cases and the
adopted trade-offs.

## Dry running

Promtail can be configured to print log stream entries instead of sending them to Loki.
This can be used in combination with [piping data](#pipe-data-to-promtail) to debug or troubleshoot Promtail log parsing.

In dry run mode, Promtail still supports reading from a [positions](../configuration#position_config) file, but it does not update that file. This lets you easily retry the same set of lines.

To start Promtail in dry run mode, use the `--dry-run` flag as shown in the example below:

```bash
cat my.log | promtail --stdin --dry-run --client.url http://127.0.0.1:3100/loki/api/v1/push
```

## Inspecting pipeline stages

Promtail can output all changes to log entries as each pipeline stage is executed.
Each log entry contains four fields:
- line
- timestamp
- labels
- extracted fields

Enable the inspection output using the `--inspect` command-line option. The `--inspect` option can be used in combination with `--stdin` and `--dry-run`.

```bash
cat my.log | promtail --stdin --dry-run --inspect --client.url http://127.0.0.1:3100/loki/api/v1/push
```

![screenshot](../inspect.png)

The output uses color to highlight changes. Additions are in green, modifications in yellow, and removals in red.

If no changes are applied during a stage, that is usually an indication of a misconfiguration or undesired behavior.

The `--inspect` flag should not be used in production, as the calculation of changes between pipeline stages negatively
impacts Promtail's performance.

## Pipe data to Promtail

Promtail supports piping data for sending logs to Loki (via the flag `--stdin`). This is a very useful way to troubleshoot your configuration.
Once you have Promtail installed, you can, for instance, use the following command to send logs to a local Loki instance:

```bash
cat my.log | promtail --stdin  --client.url http://127.0.0.1:3100/loki/api/v1/push
```

You can also add labels from the command line using:

```bash
cat my.log | promtail --stdin  --client.url http://127.0.0.1:3100/loki/api/v1/push --client.external-labels=k1=v1,k2=v2
```

This will add labels `k1` and `k2` with respective values `v1` and `v2`.

In pipe mode, Promtail also supports file configuration using `--config.file`. Note, however, that the positions config is not used and
only **the first scrape config is used**.

[`static_configs:`](../configuration) can be used to provide static labels, although the targets property is ignored.

If you don't provide any [`scrape_config:`](../configuration#scrape_config), a default one is used, which automatically adds the following default labels: `{job="stdin",hostname="<detected_hostname>"}`.

For example, you could use the config below to parse the `level` from all your piped logs and add it as a label:

```yaml
clients:
  - url: http://localhost:3100/loki/api/v1/push

scrape_configs:
- job_name: system
  pipeline_stages:
  # Single-quoted YAML strings keep backslashes literal, so `\w` is written
  # with a single backslash here.
  - regex:
      expression: '(level|lvl|severity)=(?P<level>\w+)'
  # Promote the extracted `level` value to a label.
  - labels:
      level:
  static_configs:
  - labels:
      job: my-stdin-logs
```

```bash
cat my.log | promtail --config.file promtail.yaml
```

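To get a rough idea of which value the `level` label would receive before running Promtail, the regex can be approximated in the shell. The sketch below is only a stand-in under stated assumptions, not Promtail's implementation: `extract_level` is a hypothetical helper, it relies on `grep -E` (POSIX extended regular expressions) rather than the RE2 syntax that Promtail uses, and `[A-Za-z0-9_]` stands in for `\w`. The sample log lines are made up for illustration.

```bash
# Hypothetical helper: print the value that the `level` capture group would
# hold for a single log line, or nothing when the line does not match.
extract_level() {
  printf '%s\n' "$1" \
    | grep -oE '(level|lvl|severity)=[A-Za-z0-9_]+' \
    | head -n 1 \
    | cut -d= -f2
}

# Try it on a few sample lines (made up for illustration).
extract_level 'ts=2022-08-17T13:04:09Z level=warn msg="disk almost full"'
extract_level 'lvl=debug component=tailer'
extract_level 'a line without any level key'
```

Lines for which the helper prints nothing would not get a `level` label from the pipeline above.
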
## A tailed file is truncated while Promtail is not running

Given the following order of events:

1. Promtail is tailing `/app.log`
1. Promtail's current position for `/app.log` is `100` (byte offset)
1. Promtail is stopped
1. `/app.log` is truncated and new logs are appended to it
1. Promtail is restarted

When Promtail is restarted, it reads the previous position (`100`) from the
positions file. Two scenarios are then possible:

- The size of `/app.log` is less than the position saved before truncation
- The size of `/app.log` is greater than or equal to the position saved before truncation

If the `/app.log` file size is less than the previous position, then the file is
detected as truncated and logs will be tailed starting from position `0`.
Otherwise, if the `/app.log` file size is greater than or equal to the previous
position, Promtail can't detect it was truncated while not running and will
continue tailing the file from position `100`.

Generally speaking, Promtail uses only the path to the file as the key in the
positions file. Whenever Promtail is started, for each file path referenced in
the positions file, Promtail will read the file from the beginning if the file
size is less than the offset stored in the positions file; otherwise it will
continue from the offset, regardless of whether the file has been truncated or
rolled multiple times while Promtail was not running.

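The restart decision described above can be summarized in a few lines of shell. This is a minimal sketch of the rule as stated in this section, not Promtail's code; `resume_offset` and its arguments are hypothetical names.

```bash
# Hypothetical helper: given the current file size and the offset stored in
# the positions file (both in bytes), print the offset tailing resumes from.
resume_offset() {
  local file_size=$1 saved_offset=$2
  if (( file_size < saved_offset )); then
    # The file is smaller than the saved offset: treated as truncated.
    echo 0
  else
    # Truncation (if any) cannot be detected: continue from the saved offset.
    echo "$saved_offset"
  fi
}

resume_offset 40 100    # file shrank below the saved offset
resume_offset 250 100   # file is at least as large as the saved offset
```

In the second call, if `/app.log` had been truncated and then grown past 100 bytes while Promtail was stopped, the first 100 bytes of the new content would be skipped.
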
## Loki is unavailable

For each tailed file, Promtail reads a line, processes it through the
configured `pipeline_stages`, and pushes the log entry to Loki. Log entries are
batched together before getting pushed to Loki, based on the max batch duration
`client.batch-wait` and size `client.batch-size-bytes`, whichever comes first.

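The "whichever comes first" rule can be pictured as a simple check. The sketch below is a simplified model rather than Promtail's batching code; `should_flush` is a hypothetical helper, and the limits passed in the usage lines (1 MiB and 1000 ms) are example values chosen for illustration.

```bash
# Hypothetical helper: succeed (exit status 0) when a batch should be sent.
# Arguments: current batch size in bytes, batch age in ms,
#            size limit in bytes, wait limit in ms.
should_flush() {
  local batch_bytes=$1 age_ms=$2 max_bytes=$3 max_wait_ms=$4
  if (( batch_bytes >= max_bytes || age_ms >= max_wait_ms )); then
    return 0
  fi
  return 1
}

# Example limits (assumed for illustration): 1 MiB and 1000 ms.
should_flush 2048 1000 1048576 1000 && echo "flush: wait limit reached"
should_flush 1048576 10 1048576 1000 && echo "flush: size limit reached"
should_flush 2048 10 1048576 1000 || echo "keep batching"
```
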
In case of any error while sending a batch of log entries, Promtail adopts a
"retry then discard" strategy:

- Promtail retries sending the batch to the ingester up to `max_retries` times
- If all retries fail, Promtail discards the batch of log entries (_which will
  be lost_) and proceeds with the next one

You can configure `max_retries` and the delay between two retries via the
`backoff_config` in the Promtail config file:

```yaml
clients:
  - url: INGESTER-URL
    backoff_config:
      min_period: 100ms
      max_period: 10s
      max_retries: 10
```

The following table shows an example of the total delay applied by the backoff algorithm
with `min_period: 100ms` and `max_period: 10s`:

| Retry | Min delay | Max delay | Total min delay | Total max delay |
|-------|-----------|-----------|-----------------|-----------------|
| 1     | 100ms     | 200ms     | 100ms           | 200ms           |
| 2     | 200ms     | 400ms     | 300ms           | 600ms           |
| 3     | 400ms     | 800ms     | 700ms           | 1.4s            |
| 4     | 800ms     | 1.6s      | 1.5s            | 3s              |
| 5     | 1.6s      | 3.2s      | 3.1s            | 6.2s            |
| 6     | 3.2s      | 6.4s      | 6.3s            | 12.6s           |
| 7     | 6.4s      | 10s       | 12.7s           | 22.6s           |
| 8     | 6.4s      | 10s       | 19.1s           | 32.6s           |
| 9     | 6.4s      | 10s       | 25.5s           | 42.6s           |
| 10    | 6.4s      | 10s       | 31.9s           | 52.6s           |
| 11    | 6.4s      | 10s       | 38.3s           | 62.6s           |
| 12    | 6.4s      | 10s       | 44.7s           | 72.6s           |
| 13    | 6.4s      | 10s       | 51.1s           | 82.6s           |
| 14    | 6.4s      | 10s       | 57.5s           | 92.6s           |
| 15    | 6.4s      | 10s       | 63.9s           | 102.6s          |
| 16    | 6.4s      | 10s       | 70.3s           | 112.6s          |
| 17    | 6.4s      | 10s       | 76.7s           | 122.6s          |
| 18    | 6.4s      | 10s       | 83.1s           | 132.6s          |
| 19    | 6.4s      | 10s       | 89.5s           | 142.6s          |
| 20    | 6.4s      | 10s       | 95.9s           | 152.6s          |

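The arithmetic behind the table can be reproduced with a short loop. The sketch below only mirrors the pattern visible in the table (the lower bound doubles while the doubled value still fits under `max_period`, and the upper bound is twice the lower bound, capped at `max_period`); it is not Promtail's backoff implementation, and `backoff_totals` is a hypothetical helper. All values are in milliseconds.

```bash
# Hypothetical helper: print the cumulative minimum and maximum delay (ms)
# after a given number of retries, following the pattern of the table above.
# Arguments: min_period in ms, max_period in ms, number of retries.
backoff_totals() {
  local lo=$1 max_period=$2 retries=$3
  local hi total_min=0 total_max=0 retry
  for (( retry = 1; retry <= retries; retry++ )); do
    # Upper bound for this retry: twice the lower bound, capped at max_period.
    hi=$(( lo * 2 ))
    if (( hi > max_period )); then hi=$max_period; fi
    total_min=$(( total_min + lo ))
    total_max=$(( total_max + hi ))
    # The lower bound doubles only while the doubled value fits under max_period.
    if (( lo * 2 <= max_period )); then lo=$(( lo * 2 )); fi
  done
  echo "$total_min $total_max"
}

backoff_totals 100 10000 10   # prints "31900 52600"
```

With `max_retries: 10`, this corresponds to a cumulative backoff delay of roughly 31.9s to 52.6s before the batch is discarded (row 10 of the table), not counting the time spent on the requests themselves.
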
## Log entries pushed after a Promtail crash / panic / abrupt termination

When Promtail shuts down gracefully, it saves the last read offsets in the
positions file, so that on a subsequent restart it will continue tailing logs
without duplicates or losses.

In the event of a crash or abrupt termination, Promtail can't save the last
read offsets in the positions file. When restarted, Promtail will read the
positions file saved at the last sync period and will continue tailing the files
from there. This means that if new log entries have been read and pushed to the
ingester between the last sync period and the crash, these log entries will be
sent again to the ingester on Promtail restart.

If Loki is not configured to [accept out-of-order writes](../../../configuration/#accept-out-of-order-writes), Loki will reject all log lines received in
what it perceives as out-of-order. If Promtail happens to crash, it may re-send
log lines that were sent prior to the crash. The default behavior of Promtail is
to assign a timestamp to logs at the time it reads the entry from the tailed
file, so re-sent lines get a new timestamp and are not recognized as copies of
the lines sent before the crash. This would result in duplicate log lines being
sent to Loki; to avoid this issue, if your tailed file has a timestamp embedded
in the log lines, a `timestamp` stage should be added to your pipeline.