---
title: Troubleshooting
weight: 80
aliases:
    - /docs/loki/latest/getting-started/troubleshooting/
---
# Troubleshooting Grafana Loki

## "Loki: Bad Gateway. 502"

This error can appear in Grafana when Grafana Loki is added as a
datasource, indicating that Grafana is unable to connect to Loki. There may
be one of many root causes:

- If Loki is deployed with Docker, and Grafana and Loki are not running on the
  same node, check your firewall to make sure the nodes can connect.
- If Loki is deployed with Kubernetes:
    - If Grafana and Loki are in the same namespace, set the Loki URL as
      `http://$LOKI_SERVICE_NAME:$LOKI_PORT`
    - Otherwise, set the Loki URL as
      `http://$LOKI_SERVICE_NAME.$LOKI_NAMESPACE:$LOKI_PORT`

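As an illustration, a Grafana datasource provisioning file for the Kubernetes case could look like the sketch below. The datasource name, service name, and namespace are placeholders; `3100` is Loki's default HTTP port.

```yaml
# Grafana datasource provisioning sketch -- names, namespace, and port are placeholders.
apiVersion: 1

datasources:
  - name: Loki
    type: loki
    access: proxy
    # Same namespace:      http://$LOKI_SERVICE_NAME:$LOKI_PORT
    # Different namespace: http://$LOKI_SERVICE_NAME.$LOKI_NAMESPACE:$LOKI_PORT
    url: http://loki.loki-namespace:3100
```
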
    23  ## "Data source connected, but no labels received. Verify that Loki and Promtail is configured properly."
    24  
    25  This error can appear in Grafana when Loki is added as a datasource, indicating
    26  that although Grafana has connected to Loki, Loki hasn't received any logs from
    27  Promtail yet. There may be one of many root causes:
    28  
- Promtail is running and collecting logs but is unable to connect to Loki to
  send the logs. Check Promtail's output.
- Promtail started sending logs to Loki before Loki was ready. This can
  happen in test environments where Promtail has already read all logs and sent
  them off. Here is what you can do:
    - Start Promtail after Loki, e.g., 60 seconds later.
    - To force Promtail to re-send log messages, delete the positions file
      (default location `/tmp/positions.yaml`).
- Promtail is ignoring targets and isn't reading any logs because of a
  configuration issue.
    - This can be detected by turning on debug logging in Promtail and looking
      for `dropping target, no labels` or `ignoring target` messages.
- Promtail cannot find the location of your log files. Check that the
  `scrape_configs` contains valid path settings for finding the logs on your
  worker nodes.
- Your pods are running with different labels than the ones Promtail is
  configured to read. Check `scrape_configs` to validate.

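For reference, a minimal static `scrape_configs` entry is sketched below; the `job` label value and the `__path__` glob are placeholders and must match the labels you query for and the log locations on your worker nodes.

```yaml
# Minimal Promtail scrape_configs sketch -- label values and path are placeholders.
scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs              # label that queries in Grafana will select on
          __path__: /var/log/*.log  # glob of the log files Promtail should tail
```
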
## Loki timeout errors

Loki 504 errors, `context canceled` errors, and errors processing requests
can have many possible causes.

- Review the Loki configuration timeouts (a configuration sketch follows this list):

    - `querier.query_timeout`
    - `server.http_server_read_timeout`
    - `server.http_server_write_timeout`
    - `server.http_server_idle_timeout`

- Check your Loki deployment.
If you have a reverse proxy in front of Loki, that is, between Loki and Grafana, then check any configured timeouts, such as an NGINX proxy read timeout.

- Other causes. To determine whether the issue is related to Loki itself or to another system such as Grafana or a client-side error,
attempt to run a [LogCLI](../../tools/logcli/) query in as direct a manner as you can. For example, if running on virtual machines, run the query on the local machine. If running in a Kubernetes cluster, then port forward the Loki HTTP port, and attempt to run the query there. If you do not get a timeout, then consider these causes:

    - Adjust the [Grafana dataproxy timeout](https://grafana.com/docs/grafana/latest/administration/configuration/#dataproxy). Configure Grafana with a large enough dataproxy timeout.
    - Check timeouts for reverse proxies or load balancers between your client and Grafana. Queries to Grafana are made from your local browser with Grafana serving as a proxy (a dataproxy). Therefore, connections from your client to Grafana must have their timeout configured as well.

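For orientation, the settings above live in the `server` and `querier` blocks of the Loki configuration file. A minimal sketch, with illustrative values rather than recommendations:

```yaml
# Loki configuration sketch -- these durations are examples, not recommended values.
server:
  http_server_read_timeout: 300s
  http_server_write_timeout: 300s
  http_server_idle_timeout: 300s

querier:
  query_timeout: 300s
```
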
## Troubleshooting targets

Promtail exposes two web pages that can be used to understand how its service
discovery works.

The service discovery page (`/service-discovery`) shows all
discovered targets with their labels before and after relabeling as well as
the reason why the target has been dropped.

The targets page (`/targets`) displays only targets that are being actively
scraped and their respective labels, files, and positions.

On Kubernetes, you can access those two pages by port-forwarding the Promtail
port (`9080` or `3101` if using Helm) locally:

```bash
$ kubectl port-forward loki-promtail-jrfg7 9080
# Then, in a web browser, visit http://localhost:9080/service-discovery
```

## Debug output

Both Loki and Promtail support setting the log level with a command-line flag:

```bash
loki -log.level=debug
```

```bash
promtail -log.level=debug
```

## Failed to create target, `ioutil.ReadDir: readdirent: not a directory`

The Promtail configuration contains a `__path__` entry to a directory that
Promtail cannot find.

## Connecting to a Promtail pod to troubleshoot

First check the [Troubleshooting targets](#troubleshooting-targets) section above.
If that doesn't help answer your questions, you can connect to the Promtail pod
to investigate further.

If you are running Promtail as a DaemonSet in your cluster, you will have a
Promtail pod on each node, so figure out which Promtail pod you need to debug first:

```shell
$ kubectl get pods --all-namespaces -o wide
NAME                                   READY   STATUS    RESTARTS   AGE   IP             NODE        NOMINATED NODE
...
nginx-7b6fb56fb8-cw2cm                 1/1     Running   0          41d   10.56.4.12     node-ckgc   <none>
...
promtail-bth9q                         1/1     Running   0          3h    10.56.4.217    node-ckgc   <none>
```

That output is truncated to highlight just the two pods we are interested in;
the `-o wide` flag shows the NODE on which each pod is running.

You'll want to match the node of the pod you are interested in, in this example
NGINX, to the Promtail pod running on the same node.

To debug, you can connect to the Promtail pod:

```shell
kubectl exec -it promtail-bth9q -- /bin/sh
```

Once connected, verify that the config in `/etc/promtail/promtail.yml` has the
contents you expect.

Also check `/var/log/positions.yaml` (`/run/promtail/positions.yaml` when
deployed by Helm, or whatever value is specified for `positions.file`) and make
sure Promtail is tailing the logs you would expect.

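As a rough guide, a healthy positions file maps each tailed log file to the byte offset Promtail has read up to. The entry below is hypothetical; the exact path and offset will differ in your cluster:

```yaml
# Hypothetical positions file contents -- path and offset are illustrative only.
positions:
  /var/log/pods/default_nginx-7b6fb56fb8-cw2cm_0a1b2c3d/nginx/0.log: "15029"
```
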
You can check the Promtail log by looking in `/var/log/containers` at the
Promtail container log.

## Enable tracing for Loki

Loki can be traced using [Jaeger](https://www.jaegertracing.io/) by setting
the environment variable `JAEGER_AGENT_HOST`, in the environment of the Loki
process, to the hostname of the Jaeger agent that should receive the traces.

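For example, on Kubernetes this could be done by adding an environment variable to the Loki container spec; the agent address below is a placeholder:

```yaml
# Fragment of a Loki container spec -- the Jaeger agent address is a placeholder.
env:
  - name: JAEGER_AGENT_HOST
    value: "jaeger-agent.tracing.svc.cluster.local"
```
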
If you deploy with Helm, use the following command:

```bash
$ helm upgrade --install loki loki/loki --set "loki.tracing.jaegerAgentHost=YOUR_JAEGER_AGENT_HOST"
```