
     1  .. only:: not (epub or latex or html)
     2  
     3      WARNING: You are looking at unreleased Cilium documentation.
     4      Please use the official rendered version released here:
     5      https://docs.cilium.io
     6  
     7  .. _admin_guide:
     8  
     9  ###############
    10  Troubleshooting
    11  ###############
    12  
    13  This document describes how to troubleshoot Cilium in different deployment
    14  modes. It focuses on a full deployment of Cilium within a datacenter or public
    15  cloud. If you are just looking for a simple way to experiment, we highly
    16  recommend trying out the :ref:`getting_started` guide instead.
    17  
This guide assumes that you have read the :ref:`network_root` and
:ref:`security_root` sections, which explain all the components and concepts.
    20  
We use GitHub issues to maintain a list of `Cilium Frequently Asked Questions
(FAQ)`_. Check there to see if your question has already been addressed.
    24  
    25  Component & Cluster Health
    26  ==========================
    27  
    28  Kubernetes
    29  ----------
    30  
    31  An initial overview of Cilium can be retrieved by listing all pods to verify
    32  whether all pods have the status ``Running``:
    33  
    34  .. code-block:: shell-session
    35  
    36     $ kubectl -n kube-system get pods -l k8s-app=cilium
    37     NAME           READY     STATUS    RESTARTS   AGE
    38     cilium-2hq5z   1/1       Running   0          4d
    39     cilium-6kbtz   1/1       Running   0          4d
    40     cilium-klj4b   1/1       Running   0          4d
    41     cilium-zmjj9   1/1       Running   0          4d
    42  
If Cilium encounters a problem that it cannot recover from, it will
automatically report the failure state via ``cilium-dbg status``, which is
regularly queried by the Kubernetes liveness probe to automatically restart
Cilium pods. If a Cilium pod is in the state ``CrashLoopBackOff``, this
indicates a permanent failure scenario.
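
To investigate a crash-looping pod, standard ``kubectl`` commands are a good
starting point. The following is a minimal sketch; the pod name is
illustrative:

.. code-block:: shell-session

   # Check restart counts and which node each Cilium pod runs on
   $ kubectl -n kube-system get pods -l k8s-app=cilium -o wide

   # Inspect recent events and the last termination state of a failing pod
   $ kubectl -n kube-system describe pod cilium-2hq5z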
    48  
    49  Detailed Status
    50  ~~~~~~~~~~~~~~~
    51  
    52  If a particular Cilium pod is not in running state, the status and health of
    53  the agent on that node can be retrieved by running ``cilium-dbg status`` in the
    54  context of that pod:
    55  
    56  .. code-block:: shell-session
    57  
    58     $ kubectl -n kube-system exec cilium-2hq5z -- cilium-dbg status
    59     KVStore:                Ok   etcd: 1/1 connected: http://demo-etcd-lab--a.etcd.tgraf.test1.lab.corp.isovalent.link:2379 - 3.2.5 (Leader)
    60     ContainerRuntime:       Ok   docker daemon: OK
    61     Kubernetes:             Ok   OK
    62     Kubernetes APIs:        ["cilium/v2::CiliumNetworkPolicy", "networking.k8s.io/v1::NetworkPolicy", "core/v1::Service", "core/v1::Endpoint", "core/v1::Node", "CustomResourceDefinition"]
    63     Cilium:                 Ok   OK
    64     NodeMonitor:            Disabled
    65     Cilium health daemon:   Ok
    66     Controller Status:      14/14 healthy
    67     Proxy Status:           OK, ip 10.2.0.172, port-range 10000-20000
    68     Cluster health:   4/4 reachable   (2018-06-16T09:49:58Z)
    69  
Alternatively, the ``k8s-cilium-exec.sh`` script can be used to run
``cilium-dbg status`` on all nodes, providing detailed status and health
information for every node in the cluster. First, download the script and make
it executable:
    73  
    74  .. code-block:: shell-session
    75  
    76     curl -sLO https://raw.githubusercontent.com/cilium/cilium/main/contrib/k8s/k8s-cilium-exec.sh
    77     chmod +x ./k8s-cilium-exec.sh
    78  
    79  ... and run ``cilium-dbg status`` on all nodes:
    80  
    81  .. code-block:: shell-session
    82  
    83     $ ./k8s-cilium-exec.sh cilium-dbg status
    84     KVStore:                Ok   Etcd: http://127.0.0.1:2379 - (Leader) 3.1.10
    85     ContainerRuntime:       Ok
    86     Kubernetes:             Ok   OK
    87     Kubernetes APIs:        ["networking.k8s.io/v1beta1::Ingress", "core/v1::Node", "CustomResourceDefinition", "cilium/v2::CiliumNetworkPolicy", "networking.k8s.io/v1::NetworkPolicy", "core/v1::Service", "core/v1::Endpoint"]
    88     Cilium:                 Ok   OK
    89     NodeMonitor:            Listening for events on 2 CPUs with 64x4096 of shared memory
    90     Cilium health daemon:   Ok
    91     Controller Status:      7/7 healthy
    92     Proxy Status:           OK, ip 10.15.28.238, 0 redirects, port-range 10000-20000
    93     Cluster health:   1/1 reachable   (2018-02-27T00:24:34Z)
    94  
    95  Detailed information about the status of Cilium can be inspected with the
    96  ``cilium-dbg status --verbose`` command. Verbose output includes detailed IPAM state
    97  (allocated addresses), Cilium controller status, and details of the Proxy
    98  status.
    99  
   100  .. _ts_agent_logs:
   101  
   102  Logs
   103  ~~~~
   104  
To retrieve the logs of a Cilium pod, run the following command (replace
``cilium-1234`` with a pod name returned by
``kubectl -n kube-system get pods -l k8s-app=cilium``):
   107  
   108  .. code-block:: shell-session
   109  
   110     kubectl -n kube-system logs --timestamps cilium-1234
   111  
If the Cilium pod was already restarted by the liveness probe after
encountering an issue, it can be useful to retrieve the logs of the pod before
the last restart:
   115  
   116  .. code-block:: shell-session
   117  
   118     kubectl -n kube-system logs --timestamps -p cilium-1234
   119  
   120  Generic
   121  -------
   122  
When logged in to a host running Cilium, the ``cilium-dbg`` CLI can be invoked
directly, for example:
   125  
   126  .. code-block:: shell-session
   127  
   128     $ cilium-dbg status
   129     KVStore:                Ok   etcd: 1/1 connected: https://192.168.60.11:2379 - 3.2.7 (Leader)
   130     ContainerRuntime:       Ok
   131     Kubernetes:             Ok   OK
   132     Kubernetes APIs:        ["core/v1::Endpoint", "networking.k8s.io/v1beta1::Ingress", "core/v1::Node", "CustomResourceDefinition", "cilium/v2::CiliumNetworkPolicy", "networking.k8s.io/v1::NetworkPolicy", "core/v1::Service"]
   133     Cilium:                 Ok   OK
   134     NodeMonitor:            Listening for events on 2 CPUs with 64x4096 of shared memory
   135     Cilium health daemon:   Ok
   136     IPv4 address pool:      261/65535 allocated
   137     IPv6 address pool:      4/4294967295 allocated
   138     Controller Status:      20/20 healthy
   139     Proxy Status:           OK, ip 10.0.28.238, port-range 10000-20000
   140     Hubble:                 Ok      Current/Max Flows: 2542/4096 (62.06%), Flows/s: 164.21      Metrics: Disabled
   141     Cluster health:         2/2 reachable   (2018-04-11T15:41:01Z)
   142  
   143  .. _hubble_troubleshooting:
   144  
   145  Observing Flows with Hubble
   146  ===========================
   147  
   148  Hubble is a built-in observability tool which allows you to inspect recent flow
   149  events on all endpoints managed by Cilium.
   150  
   151  Ensure Hubble is running correctly
   152  ----------------------------------
   153  
   154  To ensure the Hubble client can connect to the Hubble server running inside
   155  Cilium, you may use the ``hubble status`` command from within a Cilium pod:
   156  
   157  .. code-block:: shell-session
   158  
   159     $ hubble status
   160     Healthcheck (via unix:///var/run/cilium/hubble.sock): Ok
   161     Current/Max Flows: 4095/4095 (100.00%)
   162     Flows/s: 164.21
   163  
   164  ``cilium-agent`` must be running with the ``--enable-hubble`` option (default) in order
   165  for the Hubble server to be enabled. When deploying Cilium with Helm, make sure
   166  to set the ``hubble.enabled=true`` value.
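
For example, with Helm this can look like the following. This is a sketch; it
assumes the chart repository was added under the name ``cilium`` and that
Cilium is installed in the ``kube-system`` namespace:

.. code-block:: shell-session

   $ helm upgrade cilium cilium/cilium \
        --namespace kube-system \
        --reuse-values \
        --set hubble.enabled=true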
   167  
   168  To check if Hubble is enabled in your deployment, you may look for the
   169  following output in ``cilium-dbg status``:
   170  
   171  .. code-block:: shell-session
   172  
   $ cilium-dbg status
   174     ...
   175     Hubble:   Ok   Current/Max Flows: 4095/4095 (100.00%), Flows/s: 164.21   Metrics: Disabled
   176     ...
   177  
   178  .. note::
   179     Pods need to be managed by Cilium in order to be observable by Hubble.
   180     See how to :ref:`ensure a pod is managed by Cilium<ensure_managed_pod>`
   181     for more details.
   182  
   183  Observing flows of a specific pod
   184  ---------------------------------
   185  
In order to observe the traffic of a specific pod, you will first have to
:ref:`retrieve the name of the Cilium instance managing it<retrieve_cilium_pod>`.
The Hubble CLI is part of the Cilium container image and can be accessed via
``kubectl exec``. For example, the following query will show all events related
to flows which either originated or terminated in the ``default/tiefighter`` pod
in the last three minutes:
   192  
   193  .. code-block:: shell-session
   194  
   195     $ kubectl exec -n kube-system cilium-77lk6 -- hubble observe --since 3m --pod default/tiefighter
   196     May  4 12:47:08.811: default/tiefighter:53875 -> kube-system/coredns-74ff55c5b-66f4n:53 to-endpoint FORWARDED (UDP)
   197     May  4 12:47:08.811: default/tiefighter:53875 -> kube-system/coredns-74ff55c5b-66f4n:53 to-endpoint FORWARDED (UDP)
   198     May  4 12:47:08.811: default/tiefighter:53875 <- kube-system/coredns-74ff55c5b-66f4n:53 to-endpoint FORWARDED (UDP)
   199     May  4 12:47:08.811: default/tiefighter:53875 <- kube-system/coredns-74ff55c5b-66f4n:53 to-endpoint FORWARDED (UDP)
   200     May  4 12:47:08.811: default/tiefighter:50214 <> default/deathstar-c74d84667-cx5kp:80 to-overlay FORWARDED (TCP Flags: SYN)
   201     May  4 12:47:08.812: default/tiefighter:50214 <- default/deathstar-c74d84667-cx5kp:80 to-endpoint FORWARDED (TCP Flags: SYN, ACK)
   202     May  4 12:47:08.812: default/tiefighter:50214 <> default/deathstar-c74d84667-cx5kp:80 to-overlay FORWARDED (TCP Flags: ACK)
   203     May  4 12:47:08.812: default/tiefighter:50214 <> default/deathstar-c74d84667-cx5kp:80 to-overlay FORWARDED (TCP Flags: ACK, PSH)
   204     May  4 12:47:08.812: default/tiefighter:50214 <- default/deathstar-c74d84667-cx5kp:80 to-endpoint FORWARDED (TCP Flags: ACK, PSH)
   205     May  4 12:47:08.812: default/tiefighter:50214 <> default/deathstar-c74d84667-cx5kp:80 to-overlay FORWARDED (TCP Flags: ACK, FIN)
   206     May  4 12:47:08.812: default/tiefighter:50214 <- default/deathstar-c74d84667-cx5kp:80 to-endpoint FORWARDED (TCP Flags: ACK, FIN)
   207     May  4 12:47:08.812: default/tiefighter:50214 <> default/deathstar-c74d84667-cx5kp:80 to-overlay FORWARDED (TCP Flags: ACK)
   208  
   209  You may also use ``-o json`` to obtain more detailed information about each
   210  flow event.
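
For example, to focus on dropped traffic only, the pod filter can be combined
with a verdict filter and JSON output. This is a sketch; confirm the supported
flags with ``hubble observe --help`` for your Hubble CLI version:

.. code-block:: shell-session

   $ kubectl exec -n kube-system cilium-77lk6 -- \
        hubble observe --since 3m --pod default/tiefighter --verdict DROPPED -o json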
   211  
   212  .. note::
   213     **Hubble Relay**  allows you to query multiple Hubble instances
   214     simultaneously without having to first manually target a specific node.  See
   215     `Observing flows with Hubble Relay`_ for more information.
   216  
   217  Observing flows with Hubble Relay
   218  =================================
   219  
Hubble Relay is a service which allows you to query multiple Hubble instances
simultaneously and aggregate the results. See :ref:`hubble_setup` to enable
Hubble Relay if it is not yet enabled and to install the Hubble CLI on your
local machine.
   224  
   225  You may access the Hubble Relay service by port-forwarding it locally:
   226  
   227  .. code-block:: shell-session
   228  
   229     kubectl -n kube-system port-forward service/hubble-relay --address 0.0.0.0 --address :: 4245:80
   230  
This will forward the Hubble Relay service port (``80``) to your local machine
on port ``4245`` on all of its IP addresses.
   233  
   234  You can verify that Hubble Relay can be reached by using the Hubble CLI and
   235  running the following command from your local machine:
   236  
   237  .. code-block:: shell-session
   238  
   239     hubble status
   240  
   241  This command should return an output similar to the following:
   242  
   243  ::
   244  
   245     Healthcheck (via localhost:4245): Ok
   246     Current/Max Flows: 16380/16380 (100.00%)
   247     Flows/s: 46.19
   248     Connected Nodes: 4/4
   249  
   250  You may see details about nodes that Hubble Relay is connected to by running
   251  the following command:
   252  
   253  .. code-block:: shell-session
   254  
   255     hubble list nodes
   256  
As Hubble Relay shares the same API as individual Hubble instances, you may
follow the `Observing flows with Hubble`_ section, keeping in mind that the
limitations on what can be seen from individual Hubble instances no longer
apply.
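
For example, with the port-forward from above still running, the following
queries the whole cluster through Hubble Relay. This is a sketch; ``--server``
points the Hubble CLI at the forwarded Relay endpoint and the pod name is
illustrative:

.. code-block:: shell-session

   $ hubble observe --server localhost:4245 --since 3m --pod default/tiefighter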
   261  
   262  Connectivity Problems
   263  =====================
   264  
   265  Cilium connectivity tests
   266  ------------------------------------
   267  
The Cilium connectivity test deploys a series of services, deployments, and
CiliumNetworkPolicy resources which use various connectivity paths to connect
to each other. Connectivity paths are exercised both with and without service
load-balancing and with various network policy combinations.
   272  
.. note::
   The connectivity tests will only work in a namespace with no other pods or
   network policies applied. If a Cilium Clusterwide Network Policy is
   enabled, it may also break this connectivity check.
   277  
To run the connectivity tests, create an isolated test namespace called
``cilium-test`` and deploy the tests into it:
   280  
   281  .. parsed-literal::
   282  
   283     kubectl create ns cilium-test
   284     kubectl apply --namespace=cilium-test -f \ |SCM_WEB|\/examples/kubernetes/connectivity-check/connectivity-check.yaml
   285  
The tests cover various functionality of the system. Below we call out each
test type. If a test passes, it suggests that the referenced subsystem is
functional.
   288  
   289  +----------------------------+-----------------------------+-------------------------------+-----------------------------+----------------------------------------+
   290  | Pod-to-pod (intra-host)    | Pod-to-pod (inter-host)     | Pod-to-service (intra-host)   | Pod-to-service (inter-host) | Pod-to-external resource               |
   291  +============================+=============================+===============================+=============================+========================================+
   292  | eBPF routing is functional | Data plane, routing, network| eBPF service map lookup       | VXLAN overlay port if used  | Egress, CiliumNetworkPolicy, masquerade|
   293  +----------------------------+-----------------------------+-------------------------------+-----------------------------+----------------------------------------+
   294  
The pod name indicates the connectivity variant, and the readiness and
liveness gates indicate success or failure of the test:
   298  
   299  .. code-block:: shell-session
   300  
   301     $ kubectl get pods -n cilium-test
   302     NAME                                                    READY   STATUS    RESTARTS   AGE
   303     echo-a-6788c799fd-42qxx                                 1/1     Running   0          69s
   304     echo-b-59757679d4-pjtdl                                 1/1     Running   0          69s
   305     echo-b-host-f86bd784d-wnh4v                             1/1     Running   0          68s
   306     host-to-b-multi-node-clusterip-585db65b4d-x74nz         1/1     Running   0          68s
   307     host-to-b-multi-node-headless-77c64bc7d8-kgf8p          1/1     Running   0          67s
   308     pod-to-a-allowed-cnp-87b5895c8-bfw4x                    1/1     Running   0          68s
   309     pod-to-a-b76ddb6b4-2v4kb                                1/1     Running   0          68s
   310     pod-to-a-denied-cnp-677d9f567b-kkjp4                    1/1     Running   0          68s
   311     pod-to-b-intra-node-nodeport-8484fb6d89-bwj8q           1/1     Running   0          68s
   312     pod-to-b-multi-node-clusterip-f7655dbc8-h5bwk           1/1     Running   0          68s
   313     pod-to-b-multi-node-headless-5fd98b9648-5bjj8           1/1     Running   0          68s
   314     pod-to-b-multi-node-nodeport-74bd8d7bd5-kmfmm           1/1     Running   0          68s
   315     pod-to-external-1111-7489c7c46d-jhtkr                   1/1     Running   0          68s
   316     pod-to-external-fqdn-allow-google-cnp-b7b6bcdcb-97p75   1/1     Running   0          68s
   317  
Information about test failures can be determined by describing a failed test
pod:
   320  
   321  .. code-block:: shell-session
   322  
   323     $ kubectl describe pod pod-to-b-intra-node-hostport
   324       Warning  Unhealthy  6s (x6 over 56s)   kubelet, agent1    Readiness probe failed: curl: (7) Failed to connect to echo-b-host-headless port 40000: Connection refused
   325       Warning  Unhealthy  2s (x3 over 52s)   kubelet, agent1    Liveness probe failed: curl: (7) Failed to connect to echo-b-host-headless port 40000: Connection refused
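
Once you are done, the connectivity check workloads can be removed by deleting
the test namespace created above:

.. code-block:: shell-session

   $ kubectl delete ns cilium-test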
   326  
   327  .. _cluster_connectivity_health:
   328  
   329  Checking cluster connectivity health
   330  ------------------------------------
   331  
Cilium can help rule out network-fabric-related problems when troubleshooting
connectivity issues by providing reliable health and latency probes between
all cluster nodes and a simulated workload running on each node.
   335  
By default, when Cilium is run, it launches instances of ``cilium-health`` in
the background to determine the overall connectivity status of the cluster.
This tool periodically runs bidirectional traffic across multiple paths
through the cluster and through each node using different protocols to
determine the health status of each path and protocol. At any point in time,
``cilium-health`` may be queried for the connectivity status of the last
probe.
   342  
   343  .. code-block:: shell-session
   344  
   345     $ kubectl -n kube-system exec -ti cilium-2hq5z -- cilium-health status
   346     Probe time:   2018-06-16T09:51:58Z
   347     Nodes:
   348       ip-172-0-52-116.us-west-2.compute.internal (localhost):
   349         Host connectivity to 172.0.52.116:
   350           ICMP to stack: OK, RTT=315.254µs
   351           HTTP to agent: OK, RTT=368.579µs
   352         Endpoint connectivity to 10.2.0.183:
   353           ICMP to stack: OK, RTT=190.658µs
   354           HTTP to agent: OK, RTT=536.665µs
   355       ip-172-0-117-198.us-west-2.compute.internal:
   356         Host connectivity to 172.0.117.198:
   357           ICMP to stack: OK, RTT=1.009679ms
   358           HTTP to agent: OK, RTT=1.808628ms
   359         Endpoint connectivity to 10.2.1.234:
   360           ICMP to stack: OK, RTT=1.016365ms
   361           HTTP to agent: OK, RTT=2.29877ms
   362  
   363  For each node, the connectivity will be displayed for each protocol and path,
   364  both to the node itself and to an endpoint on that node. The latency specified
   365  is a snapshot at the last time a probe was run, which is typically once per
   366  minute. The ICMP connectivity row represents Layer 3 connectivity to the
   367  networking stack, while the HTTP connectivity row represents connection to an
   368  instance of the ``cilium-health`` agent running on the host or as an endpoint.
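
To trigger a fresh probe instead of reading the cached result, ``cilium-health
status`` can be invoked with the ``--probe`` flag. This is an assumption: the
flag may not be available in all versions, so check
``cilium-health status --help`` first:

.. code-block:: shell-session

   $ kubectl -n kube-system exec -ti cilium-2hq5z -- cilium-health status --probe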
   369  
   370  .. _monitor:
   371  
   372  Monitoring Datapath State
   373  -------------------------
   374  
Sometimes you may experience broken connectivity, which may be due to a number
of different causes. A common cause is unwanted packet drops at the networking
level. The ``cilium-dbg monitor`` tool allows you to quickly inspect whether
and where packet drops happen. The following is an example output (use
``kubectl exec`` as in previous examples if running with Kubernetes):
   381  
   382  .. code-block:: shell-session
   383  
   384     $ kubectl -n kube-system exec -ti cilium-2hq5z -- cilium-dbg monitor --type drop
   385     Listening for events on 2 CPUs with 64x4096 of shared memory
   386     Press Ctrl-C to quit
   387     xx drop (Policy denied) to endpoint 25729, identity 261->264: fd02::c0a8:210b:0:bf00 -> fd02::c0a8:210b:0:6481 EchoRequest
   388     xx drop (Policy denied) to endpoint 25729, identity 261->264: fd02::c0a8:210b:0:bf00 -> fd02::c0a8:210b:0:6481 EchoRequest
   389     xx drop (Policy denied) to endpoint 25729, identity 261->264: 10.11.13.37 -> 10.11.101.61 EchoRequest
   390     xx drop (Policy denied) to endpoint 25729, identity 261->264: 10.11.13.37 -> 10.11.101.61 EchoRequest
   391     xx drop (Invalid destination mac) to endpoint 0, identity 0->0: fe80::5c25:ddff:fe8e:78d8 -> ff02::2 RouterSolicitation
   392  
   393  The above indicates that a packet to endpoint ID ``25729`` has been dropped due
   394  to violation of the Layer 3 policy.
   395  
   396  Handling drop (CT: Map insertion failed)
   397  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   398  
   399  If connectivity fails and ``cilium-dbg monitor --type drop`` shows ``xx drop (CT:
   400  Map insertion failed)``, then it is likely that the connection tracking table
   401  is filling up and the automatic adjustment of the garbage collector interval is
   402  insufficient.
   403  
   404  Setting ``--conntrack-gc-interval`` to an interval lower than the current value
   405  may help. This controls the time interval between two garbage collection runs.
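
As a sketch, and assuming your installation exposes agent options through the
``cilium-config`` ConfigMap (as Helm-based installations typically do), the
interval could be lowered as follows and the agents restarted. The value shown
is only illustrative:

.. code-block:: shell-session

   $ kubectl -n kube-system patch configmap cilium-config --type merge \
        -p '{"data":{"conntrack-gc-interval":"2m"}}'
   $ kubectl -n kube-system rollout restart daemonset/cilium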
   406  
   407  By default ``--conntrack-gc-interval`` is set to 0 which translates to
   408  using a dynamic interval. In that case, the interval is updated after each
   409  garbage collection run depending on how many entries were garbage collected.
   410  If very few or no entries were garbage collected, the interval will increase;
   411  if many entries were garbage collected, it will decrease. The current interval
   412  value is reported in the Cilium agent logs.
   413  
Alternatively, the values of ``bpf-ct-global-any-max`` and
``bpf-ct-global-tcp-max`` can be increased. Both approaches come with a
trade-off: lowering ``conntrack-gc-interval`` costs additional CPU, while
raising ``bpf-ct-global-any-max`` and ``bpf-ct-global-tcp-max`` increases the
amount of memory consumed. You can track conntrack garbage collection related
metrics such as ``datapath_conntrack_gc_runs_total`` and
``datapath_conntrack_gc_entries`` to get visibility into garbage collection
runs. Refer to :ref:`metrics` for more details.
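
To quickly inspect these metrics from within a Cilium pod, they can be grepped
out of the agent's metrics listing. This is a sketch; the pod name is
illustrative and ``cilium-dbg metrics list`` is assumed to be available in
your version:

.. code-block:: shell-session

   $ kubectl -n kube-system exec cilium-2hq5z -- \
        cilium-dbg metrics list | grep conntrack_gc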
   422  
   423  Enabling datapath debug messages
   424  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   425  
   426  By default, datapath debug messages are disabled, and therefore not shown in
   427  ``cilium-dbg monitor -v`` output. To enable them, add ``"datapath"`` to
   428  the ``debug-verbose`` option.
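
As a sketch, and assuming the ``debug-verbose`` option is exposed through the
``cilium-config`` ConfigMap of your installation, datapath debug messages
could be enabled as follows before inspecting them with
``cilium-dbg monitor -v``:

.. code-block:: shell-session

   $ kubectl -n kube-system patch configmap cilium-config --type merge \
        -p '{"data":{"debug-verbose":"datapath"}}'
   $ kubectl -n kube-system rollout restart daemonset/cilium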
   429  
   430  Policy Troubleshooting
   431  ======================
   432  
   433  .. _ensure_managed_pod:
   434  
   435  Ensure pod is managed by Cilium
   436  -------------------------------
   437  
   438  A potential cause for policy enforcement not functioning as expected is that
   439  the networking of the pod selected by the policy is not being managed by
   440  Cilium. The following situations result in unmanaged pods:
   441  
* The pod is running in host networking and will use the host's IP address
  directly. Such pods have full network connectivity, but Cilium will not
  provide security policy enforcement for them by default. To enforce policy
  against these pods, either set ``hostNetwork`` to false or use
  :ref:`HostPolicies`.
   447  
   448  * The pod was started before Cilium was deployed. Cilium only manages pods
   449    that have been deployed after Cilium itself was started. Cilium will not
   450    provide security policy enforcement for such pods. These pods should be
   451    restarted in order to ensure that Cilium can provide security policy
   452    enforcement.
   453  
If pod networking is not managed by Cilium, ingress and egress policy rules
selecting the respective pods will not be applied. See the section
:ref:`network_policy` for more details.
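
A quick way to check whether a specific pod is managed by Cilium is to look
for a ``CiliumEndpoint`` object with the same name in the pod's namespace.
This is a sketch; the pod name is illustrative:

.. code-block:: shell-session

   $ kubectl -n default get ciliumendpoints tiefighter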
   457  
   458  For a quick assessment of whether any pods are not managed by Cilium, the
   459  `Cilium CLI <https://github.com/cilium/cilium-cli>`_ will print the number
   460  of managed pods. If this prints that all of the pods are managed by Cilium,
   461  then there is no problem:
   462  
   463  .. code-block:: shell-session
   464  
   465     $ cilium status
   466         /¯¯\
   467      /¯¯\__/¯¯\    Cilium:         OK
   468      \__/¯¯\__/    Operator:       OK
   469      /¯¯\__/¯¯\    Hubble:         OK
   470      \__/¯¯\__/    ClusterMesh:    disabled
   471         \__/
   472  
   473     Deployment        cilium-operator    Desired: 2, Ready: 2/2, Available: 2/2
   474     Deployment        hubble-relay       Desired: 1, Ready: 1/1, Available: 1/1
   475     Deployment        hubble-ui          Desired: 1, Ready: 1/1, Available: 1/1
   476     DaemonSet         cilium             Desired: 2, Ready: 2/2, Available: 2/2
   477     Containers:       cilium-operator    Running: 2
   478                       hubble-relay       Running: 1
   479                       hubble-ui          Running: 1
   480                       cilium             Running: 2
   481     Cluster Pods:     5/5 managed by Cilium
   482     ...
   483  
   484  You can run the following script to list the pods which are *not* managed by
   485  Cilium:
   486  
   487  .. code-block:: shell-session
   488  
   489     $ curl -sLO https://raw.githubusercontent.com/cilium/cilium/main/contrib/k8s/k8s-unmanaged.sh
   490     $ chmod +x k8s-unmanaged.sh
   491     $ ./k8s-unmanaged.sh
   492     kube-system/cilium-hqpk7
   493     kube-system/kube-addon-manager-minikube
   494     kube-system/kube-dns-54cccfbdf8-zmv2c
   495     kube-system/kubernetes-dashboard-77d8b98585-g52k5
   496     kube-system/storage-provisioner
   497  
   498  Understand the rendering of your policy
   499  ---------------------------------------
   500  
There are always multiple ways to approach a problem. Cilium can render the
aggregate policy it has been provided, leaving you to simply compare it with
what you expect the policy to actually be, rather than searching (and
potentially overlooking) every policy. At the expense of reading a very large
dump of an endpoint, this is often a faster path to discovering errant policy
requests in the Kubernetes API.
   507  
   508  Start by finding the endpoint you are debugging from the following list. There
   509  are several cross references for you to use in this list, including the IP
   510  address and pod labels:
   511  
   512  .. code-block:: shell-session
   513  
   514      kubectl -n kube-system exec -ti cilium-q8wvt -- cilium-dbg endpoint list
   515  
   516  When you find the correct endpoint, the first column of every row is the
   517  endpoint ID. Use that to dump the full endpoint information:
   518  
   519  .. code-block:: shell-session
   520  
   521      kubectl -n kube-system exec -ti cilium-q8wvt -- cilium-dbg endpoint get 59084
   522  
   523  .. image:: images/troubleshooting_policy.png
   524      :align: center
   525  
Importing this dump into a JSON-friendly editor can help you browse and
navigate the information. At the top level of the dump, there are two nodes of
note:
   528  
   529  * ``spec``: The desired state of the endpoint
   530  * ``status``: The current state of the endpoint
   531  
   532  This is the standard Kubernetes control loop pattern. Cilium is the controller
   533  here, and it is iteratively working to bring the ``status`` in line with the
   534  ``spec``.
   535  
   536  Opening the ``status``, we can drill down through ``policy.realized.l4``. Do
   537  your ``ingress`` and ``egress`` rules match what you expect? If not, the
   538  reference to the errant rules can be found in the ``derived-from-rules`` node.
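
As a sketch, and assuming ``jq`` is available on your workstation, the
relevant part of the dump can be extracted directly. The exact JSON structure,
including the top-level array, may differ between Cilium versions:

.. code-block:: shell-session

   $ kubectl -n kube-system exec cilium-q8wvt -- cilium-dbg endpoint get 59084 \
        | jq '.[0].status.policy.realized.l4'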
   539  
   540  Policymap pressure and overflow
   541  -------------------------------
   542  
   543  The most important step in debugging policymap pressure is finding out which
   544  node(s) are impacted.
   545  
   546  The ``cilium_bpf_map_pressure{map_name="cilium_policy_*"}`` metric monitors the
   547  endpoint's BPF policymap pressure. This metric exposes the maximum BPF map
   548  pressure on the node, meaning the policymap experiencing the most pressure on a
   549  particular node.
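
If the metric is not yet scraped by a monitoring system, the same information
can be pulled ad hoc from the agent's metrics listing on a suspect node. This
is a sketch; the pod name is illustrative, ``cilium-dbg metrics list`` is
assumed to be available, and the gauge may only appear once a map actually
reports pressure:

.. code-block:: shell-session

   $ kubectl -n kube-system exec cilium-2hq5z -- \
        cilium-dbg metrics list | grep bpf_map_pressure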
   550  
   551  Once the node is known, the troubleshooting steps are as follows:
   552  
   553  1. Find the Cilium pod on the node experiencing the problematic policymap
   554     pressure and obtain a shell via ``kubectl exec``.
   555  2. Use ``cilium policy selectors`` to get an overview of which selectors are
   556     selecting many identities. The output of this command as of Cilium v1.15
   557     additionally displays the namespace and name of the policy resource of each
   558     selector.
   559  3. The type of selector tells you what sort of policy rule could be having an
   560     impact. The three existing types of selectors are explained below, each with
   561     specific steps depending on the selector. See the steps below corresponding
   562     to the type of selector.
   563  4. Consider bumping the policymap size as a last resort. However, keep in mind
   564     the following implications:
   565  
   566     * Increased memory consumption for each policymap.
   * Generally, the more identities there are in the cluster, the more work
     Cilium performs.
   569     * At a broader level, if the policy posture is such that all or nearly all
   570       identities are selected, this suggests that the posture is too permissive.
   571  
   572  +---------------+------------------------------------------------------------------------------------------------------------+
   573  | Selector type | Form in ``cilium policy selectors`` output                                                                 |
   574  +===============+============================================================================================================+
   575  | CIDR          | ``&LabelSelector{MatchLabels:map[string]string{cidr.1.1.1.1/32: ,}``                                       |
   576  +---------------+------------------------------------------------------------------------------------------------------------+
   577  | FQDN          | ``MatchName: , MatchPattern: *``                                                                           |
   578  +---------------+------------------------------------------------------------------------------------------------------------+
   579  | Label         | ``&LabelSelector{MatchLabels:map[string]string{any.name: curl,k8s.io.kubernetes.pod.namespace: default,}`` |
   580  +---------------+------------------------------------------------------------------------------------------------------------+
   581  
   582  An example output of ``cilium policy selectors``:
   583  
   584  .. code-block:: shell-session
   585  
   586      root@kind-worker:/home/cilium# cilium policy selectors
   587      SELECTOR                                                                                                                                                            LABELS                          USERS   IDENTITIES
   588      &LabelSelector{MatchLabels:map[string]string{k8s.io.kubernetes.pod.namespace: kube-system,k8s.k8s-app: kube-dns,},MatchExpressions:[]LabelSelectorRequirement{},}   default/tofqdn-dns-visibility   1       16500
   589      &LabelSelector{MatchLabels:map[string]string{reserved.none: ,},MatchExpressions:[]LabelSelectorRequirement{},}                                                      default/tofqdn-dns-visibility   1
   590      MatchName: , MatchPattern: *                                                                                                                                        default/tofqdn-dns-visibility   1       16777231
   591                                                                                                                                                                                                                  16777232
   592                                                                                                                                                                                                                  16777233
   593                                                                                                                                                                                                                  16860295
   594                                                                                                                                                                                                                  16860322
   595                                                                                                                                                                                                                  16860323
   596                                                                                                                                                                                                                  16860324
   597                                                                                                                                                                                                                  16860325
   598                                                                                                                                                                                                                  16860326
   599                                                                                                                                                                                                                  16860327
   600                                                                                                                                                                                                                  16860328
   601      &LabelSelector{MatchLabels:map[string]string{any.name: netperf,k8s.io.kubernetes.pod.namespace: default,},MatchExpressions:[]LabelSelectorRequirement{},}           default/tofqdn-dns-visibility   1
   602      &LabelSelector{MatchLabels:map[string]string{cidr.1.1.1.1/32: ,},MatchExpressions:[]LabelSelectorRequirement{},}                                                    default/tofqdn-dns-visibility   1       16860329
   603      &LabelSelector{MatchLabels:map[string]string{cidr.1.1.1.2/32: ,},MatchExpressions:[]LabelSelectorRequirement{},}                                                    default/tofqdn-dns-visibility   1       16860330
   604      &LabelSelector{MatchLabels:map[string]string{cidr.1.1.1.3/32: ,},MatchExpressions:[]LabelSelectorRequirement{},}                                                    default/tofqdn-dns-visibility   1       16860331
   605  
From the output above, we see that all three selector types are in use. The
important step here is to determine which selector is selecting the most
identities, because the policy containing that selector is the likely cause of
the policymap pressure.
   610  
   611  Label
   612  ~~~~~
   613  
   614  See section on :ref:`identity-relevant labels <identity-relevant-labels>`.
   615  
   616  Another aspect to consider is the permissiveness of the policies and whether it
   617  could be reduced.
   618  
   619  CIDR
   620  ~~~~
   621  
One way to reduce the number of identities selected by a CIDR selector is to
broaden the range of the CIDR, if possible. For example, the policy in the
output above contains a ``/32`` rule for each CIDR, rather than a wider range
such as a ``/30``. Updating the policy to use the wider range creates a single
identity that represents all IPs within the ``/30``, and therefore the
selector only needs to select one identity.
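
As an illustrative sketch only (the policy name, labels, and IPs are
hypothetical), an egress rule using a single ``/30`` instead of several
``/32`` entries could look like this:

.. code-block:: shell-session

   $ kubectl apply -f - <<EOF
   apiVersion: cilium.io/v2
   kind: CiliumNetworkPolicy
   metadata:
     name: allow-egress-to-range
     namespace: default
   spec:
     endpointSelector:
       matchLabels:
         app: netperf
     egress:
     - toCIDR:
       - 1.1.1.0/30
   EOF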
   628  
   629  FQDN
   630  ~~~~
   631  
   632  See section on :ref:`isolating the source of toFQDNs issues regarding
   633  identities and policy <isolating-source-toFQDNs-issues-identities-policy>`.
   634  
   635  etcd (kvstore)
   636  ==============
   637  
   638  Introduction
   639  ------------
   640  
Cilium can be operated in CRD mode and in kvstore/etcd mode. When Cilium is
running in kvstore/etcd mode, the kvstore becomes a vital component of the
overall cluster health, as it is required to be available for several
operations.
   645  
   646  Operations for which the kvstore is strictly required when running in etcd
   647  mode:
   648  
   649  Scheduling of new workloads:
  As part of scheduling workloads/endpoints, agents will perform security
  identity allocation, which requires interaction with the kvstore. Even if a
  workload can be scheduled by re-using a known security identity,
  propagation of the endpoint details to other nodes still depends on the
  kvstore, and thus packet drops due to policy enforcement may be observed
  because other nodes in the cluster are not yet aware of the new workload.
   656  
   657  Multi cluster:
   658    All state propagation between clusters depends on the kvstore.
   659  
   660  Node discovery:
  New nodes must register themselves in the kvstore.
   662  
   663  Agent bootstrap:
  The Cilium agent will eventually fail if it cannot connect to the kvstore at
  bootstrap time; however, the agent will still perform all possible
  operations while waiting for the kvstore to appear.
   667  
   668  Operations which *do not* require kvstore availability:
   669  
   670  All datapath operations:
   671    All datapath forwarding, policy enforcement and visibility functions for
   672    existing workloads/endpoints do not depend on the kvstore. Packets will
   673    continue to be forwarded and network policy rules will continue to be
   674    enforced.
   675  
  However, if the agent needs to restart as part of the
  :ref:`etcd_recovery_behavior`, there can be:

  * delays in processing of flow events and metrics
  * short unavailability of Layer 7 proxies
   681  
   682  NetworkPolicy updates:
   683    Network policy updates will continue to be processed and applied.
   684  
   685  Services updates:
   686    All updates to services will be processed and applied.
   687  
   688  Understanding etcd status
   689  -------------------------
   690  
   691  The etcd status is reported when running ``cilium-dbg status``. The following line
   692  represents the status of etcd::
   693  
   694     KVStore:  Ok  etcd: 1/1 connected, lease-ID=29c6732d5d580cb5, lock lease-ID=29c6732d5d580cb7, has-quorum=true: https://192.168.60.11:2379 - 3.4.9 (Leader)
   695  
   696  OK:
   697    The overall status. Either ``OK`` or ``Failure``.
   698  
   699  1/1 connected:
   700    Number of total etcd endpoints and how many of them are reachable.
   701  
   702  lease-ID:
   703    UUID of the lease used for all keys owned by this agent.
   704  
   705  lock lease-ID:
   706    UUID of the lease used for locks acquired by this agent.
   707  
   708  has-quorum:
   709    Status of etcd quorum. Either ``true`` or set to an error.
   710  
   711  consecutive-errors:
   712    Number of consecutive quorum errors. Only printed if errors are present.
   713  
   714  https://192.168.60.11:2379 - 3.4.9 (Leader):
   715    List of all etcd endpoints stating the etcd version and whether the
   716    particular endpoint is currently the elected leader. If an etcd endpoint
   717    cannot be reached, the error is shown.
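
To quickly review the kvstore status as seen by every agent in the cluster,
the relevant line can be extracted from ``cilium-dbg status`` on each Cilium
pod. The following is a minimal sketch using standard ``kubectl`` commands:

.. code-block:: shell-session

   $ for pod in $(kubectl -n kube-system get pods -l k8s-app=cilium -o name); do
   >   echo "== ${pod} =="
   >   kubectl -n kube-system exec "${pod}" -- cilium-dbg status | grep KVStore
   > done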
   718  
   719  .. _etcd_recovery_behavior:
   720  
   721  Recovery behavior
   722  -----------------
   723  
   724  In the event of an etcd endpoint becoming unhealthy, etcd should automatically
   725  resolve this by electing a new leader and by failing over to a healthy etcd
   726  endpoint. As long as quorum is preserved, the etcd cluster will remain
   727  functional.
   728  
In addition, Cilium performs a background check at a regular interval to
determine etcd health and potentially take action. The interval depends on the
overall cluster size: the larger the cluster, the longer the `interval
<https://pkg.go.dev/github.com/cilium/cilium/pkg/kvstore?tab=doc#ExtraOptions.StatusCheckInterval>`_:
   733  
 * If no etcd endpoints can be reached, Cilium will report failure in
   ``cilium-dbg status``. This will cause the Kubernetes liveness and
   readiness probes to fail and Cilium will be restarted.
   737  
   738   * A lock is acquired and released to test a write operation which requires
   739     quorum. If this operation fails, loss of quorum is reported. If quorum fails
   740     for three or more intervals in a row, Cilium is declared unhealthy.
   741  
   742   * The Cilium operator will constantly write to a heartbeat key
   743     (``cilium/.heartbeat``). All Cilium agents will watch for updates to this
   744     heartbeat key. This validates the ability for an agent to receive key
   745     updates from etcd. If the heartbeat key is not updated in time, the quorum
   746     check is declared to have failed and Cilium is declared unhealthy after 3 or
   747     more consecutive failures.
   748  
   749  Example of a status with a quorum failure which has not yet reached the
   750  threshold::
   751  
   752      KVStore: Ok   etcd: 1/1 connected, lease-ID=29c6732d5d580cb5, lock lease-ID=29c6732d5d580cb7, has-quorum=2m2.778966915s since last heartbeat update has been received, consecutive-errors=1: https://192.168.60.11:2379 - 3.4.9 (Leader)
   753  
   754  Example of a status with the number of quorum failures exceeding the threshold::
   755  
   756      KVStore: Failure   Err: quorum check failed 8 times in a row: 4m28.446600949s since last heartbeat update has been received
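
To manually verify that the heartbeat key is being updated, it can be read
from within a Cilium pod. This is a sketch; it assumes the agent has a working
kvstore configuration and that ``cilium-dbg kvstore get`` is available in your
version:

.. code-block:: shell-session

   $ kubectl -n kube-system exec cilium-2hq5z -- cilium-dbg kvstore get cilium/.heartbeat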
   757  
   758  .. _troubleshooting_clustermesh:
   759  
   760  .. include:: ./troubleshooting_clustermesh.rst
   761  
   762  .. _troubleshooting_servicemesh:
   763  
   764  .. include:: troubleshooting_servicemesh.rst
   765  
   766  Symptom Library
   767  ===============
   768  
   769  Node to node traffic is being dropped
   770  -------------------------------------
   771  
   772  Symptom
   773  ~~~~~~~
   774  
   775  Endpoint to endpoint communication on a single node succeeds but communication
   776  fails between endpoints across multiple nodes.
   777  
   778  Troubleshooting steps:
   779  ~~~~~~~~~~~~~~~~~~~~~~
   780  
   781  #. Run ``cilium-health status`` on the node of the source and destination
   782     endpoint. It should describe the connectivity from that node to other
   783     nodes in the cluster, and to a simulated endpoint on each other node.
   784     Identify points in the cluster that cannot talk to each other. If the
   785     command does not describe the status of the other node, there may be an
   786     issue with the KV-Store.
   787  
   788  #. Run ``cilium-dbg monitor`` on the node of the source and destination endpoint.
   789     Look for packet drops.
   790  
   791     When running in :ref:`arch_overlay` mode:
   792  
   793  #. Run ``cilium-dbg bpf tunnel list`` and verify that each Cilium node is aware of
   794     the other nodes in the cluster.  If not, check the logfile for errors.
   795  
   796  #. If nodes are being populated correctly, run ``tcpdump -n -i cilium_vxlan`` on
   797     each node to verify whether cross node traffic is being forwarded correctly
   798     between nodes.
   799  
   800     If packets are being dropped,
   801  
   * verify that the node IPs listed in ``cilium-dbg bpf tunnel list`` can
     reach each other.
   * verify that the firewall on each node allows UDP port 8472 (see the
     capture example after this list).
   805  
   806     When running in :ref:`arch_direct_routing` mode:
   807  
   808  #. Run ``ip route`` or check your cloud provider router and verify that you have
   809     routes installed to route the endpoint prefix between all nodes.
   810  
#. Verify that the firewall on each node permits routing of the endpoint IPs.
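
When running in overlay mode, a raw capture of the VXLAN traffic between the
nodes can also help confirm whether encapsulated traffic leaves one node and
arrives at the other. This is a sketch; it assumes the default VXLAN port
``8472`` and that ``tcpdump`` is installed on the nodes:

.. code-block:: shell-session

   $ tcpdump -n -i any udp port 8472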
   812  
   813  
   814  Useful Scripts
   815  ==============
   816  
   817  .. _retrieve_cilium_pod:
   818  
   819  Retrieve Cilium pod managing a particular pod
   820  ---------------------------------------------
   821  
   822  Identifies the Cilium pod that is managing a particular pod in a namespace:
   823  
   824  .. code-block:: shell-session
   825  
   826      k8s-get-cilium-pod.sh <pod> <namespace>
   827  
   828  **Example:**
   829  
   830  .. code-block:: shell-session
   831  
   832      $ curl -sLO https://raw.githubusercontent.com/cilium/cilium/main/contrib/k8s/k8s-get-cilium-pod.sh
   833      $ chmod +x k8s-get-cilium-pod.sh
   834      $ ./k8s-get-cilium-pod.sh luke-pod default
   835      cilium-zmjj9
   836      cilium-node-init-v7r9p
   837      cilium-operator-f576f7977-s5gpq
   838  
   839  Execute a command in all Kubernetes Cilium pods
   840  -----------------------------------------------
   841  
Run a command within all Cilium pods of a cluster:
   843  
   844  .. code-block:: shell-session
   845  
   846      k8s-cilium-exec.sh <command>
   847  
   848  **Example:**
   849  
   850  .. code-block:: shell-session
   851  
   852      $ curl -sLO https://raw.githubusercontent.com/cilium/cilium/main/contrib/k8s/k8s-cilium-exec.sh
   853      $ chmod +x k8s-cilium-exec.sh
   854      $ ./k8s-cilium-exec.sh uptime
   855       10:15:16 up 6 days,  7:37,  0 users,  load average: 0.00, 0.02, 0.00
   856       10:15:16 up 6 days,  7:32,  0 users,  load average: 0.00, 0.03, 0.04
   857       10:15:16 up 6 days,  7:30,  0 users,  load average: 0.75, 0.27, 0.15
   858       10:15:16 up 6 days,  7:28,  0 users,  load average: 0.14, 0.04, 0.01
   859  
   860  List unmanaged Kubernetes pods
   861  ------------------------------
   862  
   863  Lists all Kubernetes pods in the cluster for which Cilium does *not* provide
   864  networking. This includes pods running in host-networking mode and pods that
   865  were started before Cilium was deployed.
   866  
   867  .. code-block:: shell-session
   868  
   869     k8s-unmanaged.sh
   870  
   871  **Example:**
   872  
   873  .. code-block:: shell-session
   874  
   875     $ curl -sLO https://raw.githubusercontent.com/cilium/cilium/main/contrib/k8s/k8s-unmanaged.sh
   876     $ chmod +x k8s-unmanaged.sh
   877     $ ./k8s-unmanaged.sh
   878     kube-system/cilium-hqpk7
   879     kube-system/kube-addon-manager-minikube
   880     kube-system/kube-dns-54cccfbdf8-zmv2c
   881     kube-system/kubernetes-dashboard-77d8b98585-g52k5
   882     kube-system/storage-provisioner
   883  
   884  Reporting a problem
   885  ===================
   886  
   887  Before you report a problem, make sure to retrieve the necessary information
   888  from your cluster before the failure state is lost.
   889  
   890  .. _sysdump:
   891  
   892  Automatic log & state collection
   893  --------------------------------
   894  
   895  .. include:: ../installation/cli-download.rst
   896  
Then, execute the ``cilium sysdump`` command to collect troubleshooting
information from your Kubernetes cluster:
   899  
   900  .. code-block:: shell-session
   901  
   902     cilium sysdump
   903  
Note that by default ``cilium sysdump`` will attempt to collect as many logs
as possible, for all the nodes in the cluster. If your cluster size is above
20 nodes, consider setting the following options to limit the size of the
sysdump. This is not required, but useful for those who have a constraint on
bandwidth or upload size.
   909  
* Set the ``--node-list`` option to pick only a few nodes in case the cluster
  has many of them.
* Set the ``--logs-since-time`` option to go back in time to when the issues
  started.
* Set the ``--logs-limit-bytes`` option to limit the size of the log files
  (note: passed on to ``kubectl logs``; does not apply to the entire
  collection archive).
   915  
Ideally, prefer a sysdump that has the full history of selected nodes (using
``--node-list``) over a brief history of all the nodes. The second recommended
option is ``--logs-since-time``, if you are able to narrow down when the
issues started. Lastly, if the Cilium agent and Operator logs are too large,
consider ``--logs-limit-bytes``.
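
As a sketch only (the node names are illustrative, and the exact argument
formats should be confirmed with ``cilium sysdump --help``), a size-limited
collection could look like this:

.. code-block:: shell-session

   $ cilium sysdump \
        --node-list node-a,node-b \
        --logs-since-time 2024-05-01T00:00:00Z \
        --logs-limit-bytes 104857600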
   921  
   922  Use ``--help`` to see more options:
   923  
   924  .. code-block:: shell-session
   925  
   926     cilium sysdump --help
   927  
   928  Single Node Bugtool
   929  ~~~~~~~~~~~~~~~~~~~
   930  
If you are not running Kubernetes, it is also possible to run the bug
collection tool manually, scoped to a single node.
   933  
The ``cilium-bugtool`` command captures potentially useful information about
your environment for debugging. The tool is meant to be used for debugging a
single Cilium agent node. In the Kubernetes case, if you have multiple Cilium
pods, the tool can retrieve debugging information from all of them. The tool
works by archiving a collection of command output and files from several
places. By default, it writes to the ``/tmp`` directory.
   940  
   941  Note that the command needs to be run from inside the Cilium pod/container.
   942  
   943  .. code-block:: shell-session
   944  
   945     cilium-bugtool
   946  
When running it with no options as shown above, it will try to copy various
files and execute some commands. If ``kubectl`` is detected, it will search
for Cilium pods. The default label is ``k8s-app=cilium``, but this and the
namespace can be changed via ``k8s-namespace`` and ``k8s-label`` respectively.
   951  
If you want to capture the archive from a Kubernetes pod, then the process is
a bit different:
   954  
   955  .. code-block:: shell-session
   956  
   957     $ # First we need to get the Cilium pod
   958     $ kubectl get pods --namespace kube-system
   959     NAME                          READY     STATUS    RESTARTS   AGE
   960     cilium-kg8lv                  1/1       Running   0          13m
   961     kube-addon-manager-minikube   1/1       Running   0          1h
   962     kube-dns-6fc954457d-sf2nk     3/3       Running   0          1h
   963     kubernetes-dashboard-6xvc7    1/1       Running   0          1h
   964  
   965     $ # Run the bugtool from this pod
   966     $ kubectl -n kube-system exec cilium-kg8lv -- cilium-bugtool
   967     [...]
   968  
   969     $ # Copy the archive from the pod
   970     $ kubectl cp kube-system/cilium-kg8lv:/tmp/cilium-bugtool-20180411-155146.166+0000-UTC-266836983.tar /tmp/cilium-bugtool-20180411-155146.166+0000-UTC-266836983.tar
   971     [...]
   972  
   973  .. note::
   974  
   975     Please check the archive for sensitive information and strip it
   976     away before sharing it with us.
   977  
   978  Below is an approximate list of the kind of information in the archive.
   979  
   980  * Cilium status
   981  * Cilium version
   982  * Kernel configuration
   983  * Resolve configuration
   984  * Cilium endpoint state
   985  * Cilium logs
   986  * Docker logs
   987  * ``dmesg``
   988  * ``ethtool``
   989  * ``ip a``
   990  * ``ip link``
   991  * ``ip r``
   992  * ``iptables-save``
   993  * ``kubectl -n kube-system get pods``
   994  * ``kubectl get pods,svc for all namespaces``
   995  * ``uname``
   996  * ``uptime``
   997  * ``cilium-dbg bpf * list``
   998  * ``cilium-dbg endpoint get for each endpoint``
   999  * ``cilium-dbg endpoint list``
  1000  * ``hostname``
  1001  * ``cilium-dbg policy get``
  1002  * ``cilium-dbg service list``
  1003  
  1004  
  1005  Debugging information
  1006  ~~~~~~~~~~~~~~~~~~~~~
  1007  
  1008  If you are not running Kubernetes, you can use the ``cilium-dbg debuginfo`` command
  1009  to retrieve useful debugging information. If you are running Kubernetes, this
  1010  command is automatically run as part of the system dump.
  1011  
``cilium-dbg debuginfo`` can print useful output from the Cilium API. The
output is in Markdown format, so it can be used when reporting a bug on the
`issue tracker`_. Running without arguments will print to standard output, but
you can also redirect to a file like this:
  1016  
  1017  .. code-block:: shell-session
  1018  
  1019     cilium-dbg debuginfo -f debuginfo.md
  1020  
  1021  .. note::
  1022  
  1023     Please check the debuginfo file for sensitive information and strip it
  1024     away before sharing it with us.
  1025  
  1026  
  1027  Slack assistance
  1028  ----------------
  1029  
  1030  The `Cilium Slack`_ community is a helpful first point of assistance to get
  1031  help troubleshooting a problem or to discuss options on how to address a
  1032  problem. The community is open to anyone.
  1033  
  1034  Report an issue via GitHub
  1035  --------------------------
  1036  
If you believe you have found an issue in Cilium, please report a
`GitHub issue`_ and make sure to attach a system dump as described above to
ensure that developers have the best chance to reproduce the issue.
  1040  
  1045  
  1046  .. _Cilium Frequently Asked Questions (FAQ): https://github.com/cilium/cilium/issues?utf8=%E2%9C%93&q=label%3Akind%2Fquestion%20
  1047  
  1048  .. _issue tracker: https://github.com/cilium/cilium/issues
  1049  .. _GitHub issue: `issue tracker`_