# Envoy container stuck in unready/draining state

It's possible for Envoy containers to become stuck in an unready/draining state.
This is an unintended side effect of the shutdown-manager sidecar container being restarted by the kubelet.
For more details on exactly how this happens, see [this issue][1].

If you observe Envoy containers in this state, you should `kubectl delete` the affected Pods so that new Pods are created to replace them.

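For example, assuming the default `projectcontour` namespace and the `app: envoy` label from Contour's example deployment YAML (adjust both for your install), you can locate and delete the stuck Pods like this:

```shell
# Stuck Pods typically show not-all-ready containers, e.g. READY 1/2.
kubectl -n projectcontour get pods -l app=envoy

# Delete a stuck Pod; the daemonset controller will create a replacement.
kubectl -n projectcontour delete pod <stuck-envoy-pod-name>
```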
To make this issue less likely to occur, you should:
- ensure you have [resource requests][2] on all your containers
- ensure you do **not** have a liveness probe on the shutdown-manager sidecar container in the envoy daemonset (this was removed from the example YAML in Contour 1.24.0).

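As a sketch, resource requests on both containers of the envoy daemonset might look like the following (the values shown are illustrative placeholders, not tuned recommendations):

```yaml
# Fragment of the envoy daemonset Pod spec.
containers:
- name: shutdown-manager
  resources:
    requests:
      cpu: 25m       # illustrative value
      memory: 50Mi   # illustrative value
- name: envoy
  resources:
    requests:
      cpu: 200m      # illustrative value
      memory: 200Mi  # illustrative value
```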
If the above are not sufficient to prevent the issue, you may also add a liveness probe to the envoy container itself, like the following:

```yaml
livenessProbe:
  httpGet:
    path: /ready
    port: 8002
  initialDelaySeconds: 15
  periodSeconds: 5
  failureThreshold: 6
```

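To see what the probe observes, you can query the same endpoint yourself; assuming the default port 8002 shown above, a healthy Envoy should return HTTP 200, while a draining one should return 503 (which is what eventually trips the probe):

```shell
# Forward the health listener port from a running envoy Pod...
kubectl -n projectcontour port-forward <envoy-pod-name> 8002:8002 &

# ...then query the readiness endpoint used by the probe.
curl -si http://localhost:8002/ready
```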
This will cause the kubelet to restart the envoy container if it does get stuck in this state, restoring normal operation and load balancing of traffic.
Note that in this case a graceful drain of connections may not occur, depending on the exact sequence of operations that preceded the envoy container failing the liveness probe.

[1]: https://github.com/projectcontour/contour/issues/4851
[2]: /docs/{{< param latest_version >}}/deploy-options/#setting-resource-requests-and-limits