# Istio CNI Node Agent

The Istio CNI Node Agent is responsible for several things:

- Installing an Istio CNI plugin binary on each node's filesystem, updating that node's CNI config (e.g. in `/etc/cni/net.d`), and watching the config and binary paths to reinstall if things are modified.
- In sidecar mode, the CNI plugin can configure sidecar networking for pods when they are scheduled by the container runtime, using iptables. Having the CNI plugin handle the netns setup replaces the current Istio approach of injecting a `NET_ADMIN`-privileged init container, `istio-init`, into pods alongside the `istio-proxy` sidecar. This removes the need for a privileged `NET_ADMIN` container in Istio users' application pods.
- In ambient mode, the CNI plugin does not configure any networking, but is only responsible for synchronously pushing new pod events back up to an ambient watch server which runs as part of the Istio CNI node agent. The ambient server will find the pod netns and configure networking inside that pod via iptables. The ambient server will additionally watch enabled namespaces, and enroll already-started-but-newly-enrolled pods in a similar fashion.

## Privileges required

Regardless of mode, the Istio CNI Node Agent requires privileged node permissions, and will require allow-listing in constrained environments that block privileged workloads by default. If using sidecar repair mode or ambient mode, the node agent additionally needs permissions to enter pod network namespaces and perform networking configuration in them.
If either sidecar repair or ambient mode is enabled, on startup the container will drop all Linux capabilities (via `drop:ALL`), and re-add the ones sidecar repair/ambient explicitly require to function, namely:

- CAP_SYS_ADMIN
- CAP_NET_ADMIN
- CAP_NET_RAW

## Ambient mode details

Fundamentally, this component is responsible for the following:

- Sets up redirection with newly-started (or newly-added, previously-started) application pods such that traffic from application pods is forwarded to the local node's ztunnel pod.
- Configures required iptables, sockets, and packet routing miscellanea within the `ztunnel` and application pod network namespaces to make that happen.

This component accomplishes that in the following ways:

1. By installing a separate, very basic "CNI plugin" binary onto the node to forward low-level pod lifecycle events (CmdAdd/CmdDel/etc) from whatever node-level CNI subsystem is in use to this node agent for processing via socket.
1. By running as a node-level daemonset that:

    - listens for these UDS events from the CNI plugin (which fire when new pods are spawned in an ambient-enabled namespace), and adds those pods to the ambient mesh.
    - watches Kubernetes resources for existing pods, so that pods that have already been started can be moved in or out of the ambient mesh.
    - sends UDS events to ztunnel via a socket whenever a pod is enabled for ambient mesh (whether from CNI plugin or node watcher), instructing ztunnel to create the "tube" socket.

The ambient CNI agent is the only place where ambient network config and pod redirection machinery happens.
In ambient mode, the CNI plugin is effectively just a shim to catch pod creation events and notify the CNI agent early enough to set up network redirection before the pod is fully started.
This is necessary because the CNI plugin is effectively the first thing to see a scheduled pod - before the K8S control plane will see things like the pod IP or networking info, the CNI will - but the CNI plugin alone is not sufficient to handle all pod events (already-started pod updates, rebuilding current state on CNI restart) that the node agent cares about.

## Reference

### Design details

Broadly, `istio-cni` accomplishes ambient redirection by instructing ztunnel to set up sockets within the application pod network namespace, where:

- one end of the socket is in the application pod
- and the other end is in ztunnel's pod

and setting up iptables rules to funnel traffic through that socket "tube" to ztunnel and back.

This effectively behaves like ztunnel is an in-pod sidecar, without actually requiring the injection of ztunnel as a sidecar into the pod manifest, or mutating the application pod in any way.

Additionally, it does not require any network rules/routing/config in the host network namespace, which greatly increases ambient mode compatibility with 3rd-party CNIs. In virtually all cases, this "in-pod" ambient CNI is exactly as compatible with 3rd-party CNIs as sidecars are/were.

### Notable Env Vars

| Env Var              | Default                                   | Purpose |
|----------------------|-------------------------------------------|---------|
| HOST_PROBE_SNAT_IP   | "169.254.7.127"                           | Applied to SNAT host probe packets, so they can be identified/skipped podside. Any link-local address in the 169.254.0.0/16 block can be used |
| HOST_PROBE_SNAT_IPV6 | "fd16:9254:7127:1337:ffff:ffff:ffff:ffff" | IPv6 link local ranges are designed to be collision-resistant by default, and so this probably never needs to be overridden |

## Sidecar Mode Implementation Details

Istio CNI injection is currently based on the same Pod annotations used in init-container/inject mode.

### Selection API

- plugin config "exclude namespaces" applies first
- ambient is enabled if:
    - namespace label "istio.io/dataplane-mode" == "ambient", and/or pod label "istio.io/dataplane-mode" == "ambient"
    - "sidecar.istio.io/status" annotation is not present on the pod (created by injection of sidecar)
    - pod label "istio.io/dataplane-mode" is not "none"
- sidecar interception is enabled if:
    - "istio-init" container is not present in the pod.
    - istio-proxy container exists and
        - does not have DISABLE_ENVOY environment variable (which triggers proxyless mode)
        - has its first 2 args set to "proxy" and "sidecar" - or has fewer than 2 args, or a first arg that is not "proxy".
    - "sidecar.istio.io/inject" is not false
    - "sidecar.istio.io/status" exists

### Redirect API

The annotation-based control is currently only supported in 'sidecar' mode. See plugin/redirect.go for details.

- redirectMode allows TPROXY mode to be set, which requires that envoy has extra permissions. Default is redirect.
- includeIPCidr, excludeIPCidr
- includeInboundPorts, excludeInboundPorts
- includeOutboundPorts, excludeOutboundPorts
- excludeInterfaces
- kubevirtInterfaces
- ISTIO_META_DNS_CAPTURE env variable on the proxy - enables dns redirect
- INVALID_DROP env var on proxy - changes behavior from reset to drop in iptables
- auto excluded inbound ports: 15020, 15021, 15090

The code automatically detects the proxyUID and proxyGID from RunAsUser/RunAsGroup and excludes them from interception, defaulting to 1337.

### Overview

- [istio-cni Helm chart](../manifests/charts/istio-cni/templates)
    - `install-cni` daemonset - main function is to install and help the node CNI, but it is also a proper server and interacts with K8S, watching Pods for recovery.
    - `istio-cni-config` configmap with CNI plugin config to add to CNI plugin chained config
    - creates service-account `istio-cni` with `ClusterRoleBinding` to allow gets on pods' info and delete/modifications for recovery.

- `install-cni` container
    - copies `istio-cni` and `istio-iptables` to `/opt/cni/bin`
    - creates kubeconfig for the service account the pod runs under
    - periodically copies the K8S JWT token for istio-cni to the host, so the plugin can connect to K8S.
    - injects the CNI plugin config into the CNI config file
        - CNI installer will try to look for the config file under the mounted CNI net dir based on file name extensions (`.conf`, `.conflist`)
        - the file name can be explicitly set by `CNI_CONF_NAME` env var
        - the program inserts `CNI_NETWORK_CONFIG` into the `plugins` list in `/etc/cni/net.d/${CNI_CONF_NAME}`
    - the actual code is in pkg/install - including a readiness probe, monitoring.
    - it also sets up a UDS socket for istio-cni to send logs to this container.
    - based on config, it may run the 'repair' controller that detects pods where istio setup fails, or that were created in corner cases, and restarts them.
    - if ambient is enabled, also runs an ambient controller, watching Pod, Namespace

- `istio-cni`
    - CNI plugin executable copied to `/opt/cni/bin`
    - currently implemented for k8s only
    - on pod add, determines whether pod should have netns setup to redirect to Istio proxy. See [cmdAdd](#cmdadd-sidecar-workflow) for detailed logic.
        - it connects to K8S using the kubeconfig and JWT token copied from install-cni to get Pod and Namespace. Since this is a short-running command, each invocation creates a new connection.
        - If so, calls `istio-iptables` with params to set up pod netns
        - If ambient, sets up the ambient logic.

- `istio-iptables`
    - sets up iptables to redirect a list of ports to the port envoy will listen on
    - shared code with istio-init container
    - it will generate an iptables-save config, based on annotations/labels and other settings, and apply it.

### CmdAdd Sidecar Workflow

`CmdAdd` is triggered when there is a new pod created. This runs on the node, in a chain of CNI plugins - Istio is
run after the main CNI sets up the pod IP and networking.

1. Check k8s pod namespace against exclusion list (plugin config)
    - Config must exclude namespace that Istio control-plane is installed in (TODO: this may change, exclude at pod level is sufficient and we may want Istiod and other istio components to use ambient too)
    - If excluded, ignore the pod and return prevResult
1. Set up redirect rules for the pods:
    - Get the port list from the pod's definition, as well as annotations.
    - Set up iptables with required port list: `nsenter --net=<k8s pod netns> /opt/cni/bin/istio-iptables ...`. The following conditions will prevent the redirect rules from being set up in the pods:
        - Pod has annotation `sidecar.istio.io/inject` set to `false` or has no key `sidecar.istio.io/status` in annotations
        - Pod has `istio-init` initContainer - this indicates a pod running its own injection setup.
1. Return prevResult

## Troubleshooting

### Collecting Logs

#### Using `istioctl`/helm

- Set: `values.global.logging.level="cni:debug,ambient:debug"`
- Inspect the pod logs of an `istio-cni` Daemonset pod on a specific node.

#### From a specific node syslog

The CNI plugins are executed by threads in the `kubelet` process. The CNI plugin logs end up in the syslog
under the `kubelet` process. On systems with `journalctl` the following is an example command line
to view the last 1000 `kubelet` logs via the `less` utility to allow for `vi`-style searching:

```console
$ journalctl -t kubelet -n 1000 | less
```

#### GKE via Stackdriver Log Viewer

Each GKE cluster will have many categories of logs collected by Stackdriver. Logs can be monitored via
the project's [log viewer](https://cloud.google.com/logging/docs/view/overview) and/or the `gcloud logging read`
capability.

The following example grabs the last 10 `kubelet` logs containing the string "cmdAdd" in the log message.

```console
$ gcloud logging read "resource.type=k8s_node AND jsonPayload.SYSLOG_IDENTIFIER=kubelet AND jsonPayload.MESSAGE:cmdAdd" --limit 10 --format json
```

## Other Reference

The framework for this implementation of the CNI plugin is based on the
[containernetworking sample plugin](https://github.com/containernetworking/plugins/tree/main/plugins/sample)

The details for the deployment & installation of this plugin were pretty much lifted directly from the
[Calico CNI plugin](https://github.com/projectcalico/cni-plugin).

Specifically:

- The CNI installation script is containerized and deployed as a daemonset in k8s.
  The relevant Calico k8s manifests were used as the model for the istio-cni plugin's manifest:
    - [daemonset and configmap](https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/hosted/calico.yaml) - search for the `calico-node` Daemonset and its `install-cni` container deployment
    - [RBAC](https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/rbac.yaml) - this creates the service account the CNI plugin is configured to use to access the kube-api-server
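
### Appendix: Selection API sketch

The rules listed under "Selection API" can be easier to follow as code. The following is a minimal Python sketch of one reading of those rules, not the plugin's actual implementation (which is written in Go). The function names, the argument shapes (plain dicts and lists), and the exact-match comparison against `"false"` are assumptions made for illustration; only the label, annotation, container, and env var names come from this README.

```python
from typing import Any, Mapping, Sequence

MODE_LABEL = "istio.io/dataplane-mode"
STATUS_ANNOTATION = "sidecar.istio.io/status"
INJECT_ANNOTATION = "sidecar.istio.io/inject"


def ambient_enabled(
    namespace: str,
    namespace_labels: Mapping[str, str],
    pod_labels: Mapping[str, str],
    pod_annotations: Mapping[str, str],
    excluded_namespaces: Sequence[str] = (),
) -> bool:
    """Hypothetical helper: True if the pod would be treated as ambient."""
    # The plugin config "exclude namespaces" list applies first.
    if namespace in excluded_namespaces:
        return False
    # An injected sidecar (status annotation present) rules out ambient.
    if STATUS_ANNOTATION in pod_annotations:
        return False
    # A pod can opt out explicitly with the "none" label value.
    if pod_labels.get(MODE_LABEL) == "none":
        return False
    # Either the namespace or the pod must carry the "ambient" label value.
    return "ambient" in (namespace_labels.get(MODE_LABEL), pod_labels.get(MODE_LABEL))


def sidecar_interception_enabled(
    namespace: str,
    pod_annotations: Mapping[str, str],
    init_container_names: Sequence[str],
    containers: Mapping[str, Mapping[str, Any]],
    excluded_namespaces: Sequence[str] = (),
) -> bool:
    """Hypothetical helper: True if sidecar redirection would be set up.

    `containers` maps a container name to a dict with optional "env"
    (name -> value) and "args" (list of strings) keys; this shape is an
    assumption of the sketch, not a Kubernetes or Istio type.
    """
    if namespace in excluded_namespaces:
        return False
    # A pod with istio-init is running its own injection setup.
    if "istio-init" in init_container_names:
        return False
    proxy = containers.get("istio-proxy")
    if proxy is None:
        return False
    # DISABLE_ENVOY triggers proxyless mode.
    if "DISABLE_ENVOY" in proxy.get("env", {}):
        return False
    # Skip only when the proxy is explicitly started as something other
    # than a sidecar ("proxy <other>"); fewer than 2 args is accepted.
    args = proxy.get("args", [])
    if len(args) >= 2 and args[0] == "proxy" and args[1] != "sidecar":
        return False
    if pod_annotations.get(INJECT_ANNOTATION) == "false":
        return False
    return STATUS_ANNOTATION in pod_annotations
```

For example, a pod in a namespace labeled `istio.io/dataplane-mode=ambient` with no sidecar status annotation is selected for ambient, while the same pod carrying the label value `none` is not.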