- [IPVS](#ipvs)
  - [What is IPVS](#what-is-ipvs)
  - [IPVS vs. IPTABLES](#ipvs-vs-iptables)
    - [When IPVS falls back to IPTABLES](#when-ipvs-falls-back-to-iptables)
  - [Run kube-proxy in IPVS mode](#run-kube-proxy-in-ipvs-mode)
    - [Prerequisite](#prerequisite)
    - [Local UP Cluster](#local-up-cluster)
    - [GCE Cluster](#gce-cluster)
    - [Cluster Created by Kubeadm](#cluster-created-by-kubeadm)
  - [Debug](#debug)
    - [Check IPVS proxy rules](#check-ipvs-proxy-rules)
    - [Why kube-proxy can't start IPVS mode](#why-kube-proxy-cant-start-ipvs-mode)

# IPVS

This document intends to show users:

- what IPVS is
- the differences between IPVS and IPTABLES
- how to run kube-proxy in IPVS mode, and how to debug it

## What is IPVS

**IPVS (IP Virtual Server)** implements transport-layer load balancing, usually called Layer 4 LAN switching, as part of the Linux kernel.

IPVS runs on a host and acts as a load balancer in front of a cluster of real servers. IPVS can direct requests for TCP- and UDP-based services to the real servers, and make the services of the real servers appear as virtual services on a single IP address.

## IPVS vs. IPTABLES

IPVS mode was introduced in Kubernetes v1.8, went beta in v1.9, and went GA in v1.11. IPTABLES mode was added in v1.1 and has been the default operating mode since v1.2. Both IPVS and IPTABLES are based on `netfilter`. The differences between IPVS mode and IPTABLES mode are as follows:

1. IPVS provides better scalability and performance for large clusters.

2. IPVS supports more sophisticated load balancing algorithms than IPTABLES (least load, least connections, locality, weighted, etc.).

3. IPVS supports server health checking, connection retries, etc.

### When IPVS falls back to IPTABLES

The IPVS proxier employs IPTABLES for packet filtering, SNAT, and masquerading.
Specifically, the IPVS proxier uses ipset to store the source or destination addresses of traffic that must be dropped or masqueraded, so that the number of IPTABLES rules stays constant no matter how many services exist.

Here is the table of ipset sets that the IPVS proxier uses.

| set name | members | usage |
| :----------------------------- | ---------------------------------------- | ---------------------------------------- |
| KUBE-CLUSTER-IP | All service IP + port | Mark-Masq for cases where `masquerade-all=true` or `clusterCIDR` is specified |
| KUBE-LOOP-BACK | All service IP + port + IP | masquerade to solve the hairpin traffic problem |
| KUBE-EXTERNAL-IP | service external IP + port | masquerade for packets to external IPs |
| KUBE-LOAD-BALANCER | load balancer ingress IP + port | masquerade for packets to LoadBalancer-type services |
| KUBE-LOAD-BALANCER-LOCAL | LB ingress IP + port with `externalTrafficPolicy=local` | accept packets to load balancers with `externalTrafficPolicy=local` |
| KUBE-LOAD-BALANCER-FW | load balancer ingress IP + port with `loadBalancerSourceRanges` | packet filter for load balancers with `loadBalancerSourceRanges` specified |
| KUBE-LOAD-BALANCER-SOURCE-CIDR | load balancer ingress IP + port + source CIDR | packet filter for load balancers with `loadBalancerSourceRanges` specified |
| KUBE-NODE-PORT-TCP | NodePort-type service TCP port | masquerade for packets to nodePort (TCP) |
| KUBE-NODE-PORT-LOCAL-TCP | NodePort-type service TCP port with `externalTrafficPolicy=local` | accept packets to NodePort services with `externalTrafficPolicy=local` |
| KUBE-NODE-PORT-UDP | NodePort-type service UDP port | masquerade for packets to nodePort (UDP) |
| KUBE-NODE-PORT-LOCAL-UDP | NodePort-type service UDP port with `externalTrafficPolicy=local` | accept packets to NodePort services with `externalTrafficPolicy=local` |

The IPVS proxier falls back on IPTABLES in the following scenarios.
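On a live node, the sets in the table above can be inspected with `ipset list <set name>`. Below is a minimal sketch of working with such a listing offline; the set name is real, but the member entries in the here-document are illustrative sample data, not taken from a real cluster, and `count_members` is a hypothetical helper introduced here for illustration:

```shell
# On a node running kube-proxy in IPVS mode, a real listing would come from:
#   ipset list KUBE-CLUSTER-IP
#
# Count the member entries that follow the "Members:" header in an
# `ipset list` style listing. The entries below are made-up sample data.
count_members() {
  awk 'found { n++ } /^Members:/ { found = 1 } END { print n + 0 }'
}

count_members <<'EOF'
Name: KUBE-CLUSTER-IP
Type: hash:ip,port
Members:
10.0.0.1,tcp:443
10.0.0.10,udp:53
10.0.0.10,tcp:53
EOF
# prints: 3
```

Because the proxier matches whole sets rather than per-service rules, the member count grows with the number of services while the IPTABLES rule count stays flat.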
**1. kube-proxy starts with --masquerade-all=true**

If kube-proxy starts with `--masquerade-all=true`, the IPVS proxier masquerades all traffic accessing a service's Cluster IP, which is the same behavior as the IPTABLES proxier. Suppose kube-proxy has the flag `--masquerade-all=true` specified; then the IPTABLES rules installed by the IPVS proxier should look like what is shown below.

```shell
# iptables -t nat -nL

Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-POSTROUTING  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */

Chain KUBE-MARK-MASQ (2 references)
target     prot opt source               destination
MARK       all  --  0.0.0.0/0            0.0.0.0/0            MARK or 0x4000

Chain KUBE-POSTROUTING (1 references)
target     prot opt source               destination
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOOP-BACK dst,dst,src

Chain KUBE-SERVICES (2 references)
target     prot opt source               destination
KUBE-MARK-MASQ  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-CLUSTER-IP dst,dst
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-CLUSTER-IP dst,dst
```

**2. Specify cluster CIDR in kube-proxy startup**

If kube-proxy starts with `--cluster-cidr=<cidr>`, the IPVS proxier masquerades off-cluster traffic accessing a service's Cluster IP, which is the same behavior as the IPTABLES proxier. Suppose kube-proxy is provided with the cluster CIDR `10.244.16.0/24`; then the IPTABLES rules installed by the IPVS proxier should look like what is shown below.
```shell
# iptables -t nat -nL

Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-POSTROUTING  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */

Chain KUBE-MARK-MASQ (3 references)
target     prot opt source               destination
MARK       all  --  0.0.0.0/0            0.0.0.0/0            MARK or 0x4000

Chain KUBE-POSTROUTING (1 references)
target     prot opt source               destination
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOOP-BACK dst,dst,src

Chain KUBE-SERVICES (2 references)
target     prot opt source               destination
KUBE-MARK-MASQ  all  --  !10.244.16.0/24      0.0.0.0/0            match-set KUBE-CLUSTER-IP dst,dst
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-CLUSTER-IP dst,dst
```

**3. Load Balancer type service**

For a LoadBalancer-type service, the IPVS proxier installs IPTABLES rules that match the ipset `KUBE-LOAD-BALANCER`. In particular, when the service's `loadBalancerSourceRanges` is specified or `externalTrafficPolicy=local` is set, the IPVS proxier creates the ipset sets `KUBE-LOAD-BALANCER-LOCAL`/`KUBE-LOAD-BALANCER-FW`/`KUBE-LOAD-BALANCER-SOURCE-CIDR` and installs IPTABLES rules accordingly, which should look like what is shown below.
```shell
# iptables -t nat -nL

Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-POSTROUTING  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */

Chain KUBE-FIREWALL (1 references)
target     prot opt source               destination
RETURN     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOAD-BALANCER-SOURCE-CIDR dst,dst,src
KUBE-MARK-DROP  all  --  0.0.0.0/0            0.0.0.0/0

Chain KUBE-LOAD-BALANCER (1 references)
target     prot opt source               destination
KUBE-FIREWALL  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOAD-BALANCER-FW dst,dst
RETURN     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOAD-BALANCER-LOCAL dst,dst
KUBE-MARK-MASQ  all  --  0.0.0.0/0            0.0.0.0/0

Chain KUBE-MARK-DROP (1 references)
target     prot opt source               destination
MARK       all  --  0.0.0.0/0            0.0.0.0/0            MARK or 0x8000

Chain KUBE-MARK-MASQ (2 references)
target     prot opt source               destination
MARK       all  --  0.0.0.0/0            0.0.0.0/0            MARK or 0x4000

Chain KUBE-POSTROUTING (1 references)
target     prot opt source               destination
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOOP-BACK dst,dst,src

Chain KUBE-SERVICES (2 references)
target     prot opt source               destination
KUBE-LOAD-BALANCER  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOAD-BALANCER dst,dst
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOAD-BALANCER dst,dst
```

**4. NodePort type service**

For a NodePort-type service, the IPVS proxier installs IPTABLES rules that match the ipset sets `KUBE-NODE-PORT-TCP`/`KUBE-NODE-PORT-UDP`. When `externalTrafficPolicy=local` is specified, the IPVS proxier creates the ipset sets `KUBE-NODE-PORT-LOCAL-TCP`/`KUBE-NODE-PORT-LOCAL-UDP` and installs IPTABLES rules accordingly, which should look like what is shown below.

Suppose the service has a TCP-type nodePort.

```shell
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-POSTROUTING  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */

Chain KUBE-MARK-MASQ (2 references)
target     prot opt source               destination
MARK       all  --  0.0.0.0/0            0.0.0.0/0            MARK or 0x4000

Chain KUBE-NODE-PORT (1 references)
target     prot opt source               destination
RETURN     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-NODE-PORT-LOCAL-TCP dst
KUBE-MARK-MASQ  all  --  0.0.0.0/0            0.0.0.0/0

Chain KUBE-POSTROUTING (1 references)
target     prot opt source               destination
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOOP-BACK dst,dst,src

Chain KUBE-SERVICES (2 references)
target     prot opt source               destination
KUBE-NODE-PORT  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-NODE-PORT-TCP dst
```

**5. Service with externalIPs specified**

For a service with `externalIPs` specified, the IPVS proxier installs IPTABLES rules that match the ipset `KUBE-EXTERNAL-IP`. Suppose we have a service with `externalIPs` specified; the IPTABLES rules should look like what is shown below.
```shell
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
KUBE-SERVICES  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service portals */

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
KUBE-POSTROUTING  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes postrouting rules */

Chain KUBE-MARK-MASQ (2 references)
target     prot opt source               destination
MARK       all  --  0.0.0.0/0            0.0.0.0/0            MARK or 0x4000

Chain KUBE-POSTROUTING (1 references)
target     prot opt source               destination
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            /* kubernetes service traffic requiring SNAT */ mark match 0x4000/0x4000
MASQUERADE  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-LOOP-BACK dst,dst,src

Chain KUBE-SERVICES (2 references)
target     prot opt source               destination
KUBE-MARK-MASQ  all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-EXTERNAL-IP dst,dst
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-EXTERNAL-IP dst,dst PHYSDEV match ! --physdev-is-in ADDRTYPE match src-type !LOCAL
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            match-set KUBE-EXTERNAL-IP dst,dst ADDRTYPE match dst-type LOCAL
```

## Run kube-proxy in IPVS mode

Currently, the local-up scripts, GCE scripts, and kubeadm support switching to IPVS proxy mode via exported environment variables or flags.

### Prerequisite

Ensure that the kernel modules IPVS requires (**Note**: use `nf_conntrack` instead of `nf_conntrack_ipv4` for Linux kernel 4.19 and later)
```shell
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack_ipv4
```
1. have been compiled into the node kernel. Use

   `grep -e ipvs -e nf_conntrack_ipv4 /lib/modules/$(uname -r)/modules.builtin`

   and expect results like the following if they are compiled into the kernel.
   ```
   kernel/net/ipv4/netfilter/nf_conntrack_ipv4.ko
   kernel/net/netfilter/ipvs/ip_vs.ko
   kernel/net/netfilter/ipvs/ip_vs_rr.ko
   kernel/net/netfilter/ipvs/ip_vs_wrr.ko
   kernel/net/netfilter/ipvs/ip_vs_lc.ko
   kernel/net/netfilter/ipvs/ip_vs_wlc.ko
   kernel/net/netfilter/ipvs/ip_vs_fo.ko
   kernel/net/netfilter/ipvs/ip_vs_ovf.ko
   kernel/net/netfilter/ipvs/ip_vs_lblc.ko
   kernel/net/netfilter/ipvs/ip_vs_lblcr.ko
   kernel/net/netfilter/ipvs/ip_vs_dh.ko
   kernel/net/netfilter/ipvs/ip_vs_sh.ko
   kernel/net/netfilter/ipvs/ip_vs_sed.ko
   kernel/net/netfilter/ipvs/ip_vs_nq.ko
   kernel/net/netfilter/ipvs/ip_vs_ftp.ko
   ```

   OR

2. have been loaded.
   ```shell
   # load module <module_name>
   modprobe -- ip_vs
   modprobe -- ip_vs_rr
   modprobe -- ip_vs_wrr
   modprobe -- ip_vs_sh
   modprobe -- nf_conntrack_ipv4

   # to check loaded modules, use
   lsmod | grep -e ip_vs -e nf_conntrack_ipv4
   # or
   cut -f1 -d " " /proc/modules | grep -e ip_vs -e nf_conntrack_ipv4
   ```

Packages such as `ipset` should also be installed on the node before using IPVS mode.

Kube-proxy falls back to IPTABLES mode if these requirements are not met.

### Local UP Cluster

Kube-proxy runs in IPTABLES mode by default in a [local-up cluster](https://github.com/kubernetes/community/blob/master/contributors/devel/running-locally.md).

To use IPVS mode, users should export the env `KUBE_PROXY_MODE=ipvs` before [starting the cluster](https://github.com/kubernetes/community/blob/master/contributors/devel/running-locally.md#starting-the-cluster):
```shell
# before running `hack/local-up-cluster.sh`
export KUBE_PROXY_MODE=ipvs
```

### GCE Cluster

Similar to a local-up cluster, kube-proxy in [clusters running on GCE](https://kubernetes.io/docs/getting-started-guides/gce/) runs in IPTABLES mode by default.
Users need to export the env `KUBE_PROXY_MODE=ipvs` before [starting a cluster](https://kubernetes.io/docs/getting-started-guides/gce/#starting-a-cluster):
```shell
# before running one of the commands chosen to start a cluster:
# curl -sS https://get.k8s.io | bash
# wget -q -O - https://get.k8s.io | bash
# cluster/kube-up.sh
export KUBE_PROXY_MODE=ipvs
```

### Cluster Created by Kubeadm

If you are using kubeadm with a [configuration file](https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/#config-file), you have to add `mode: ipvs` in a `KubeProxyConfiguration` document (separated by `---` from the other documents in the file passed to `kubeadm init`):

```yaml
...
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
...
```

before running

`kubeadm init --config <path_to_configuration_file>`

to specify the IPVS mode before deploying the cluster.

**Notes**
If IPVS mode is successfully enabled, you should see IPVS proxy rules (use `ipvsadm`) like
```shell
# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.0.0.1:443 rr persistent 10800
  -> 192.168.0.1:6443             Masq    1      1          0
```
or similar logs in the kube-proxy log (for example, `/tmp/kube-proxy.log` for a local-up cluster) while the cluster is running:
```
Using ipvs Proxier.
```

If there are no IPVS proxy rules, or the following logs appear, kube-proxy failed to use IPVS mode:
```
Can't use ipvs proxier, trying iptables proxier
Using iptables Proxier.
```
See the following section for more details on debugging.

## Debug

### Check IPVS proxy rules

Users can use the `ipvsadm` tool to check whether kube-proxy is maintaining IPVS rules correctly.
For example, suppose we have the following services in the cluster:

```
# kubectl get svc --all-namespaces
NAMESPACE     NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
default       kubernetes   ClusterIP   10.0.0.1     <none>        443/TCP         1d
kube-system   kube-dns     ClusterIP   10.0.0.10    <none>        53/UDP,53/TCP   1d
```

We may get IPVS proxy rules like:

```shell
# ipvsadm -ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.0.0.1:443 rr persistent 10800
  -> 192.168.0.1:6443             Masq    1      1          0
TCP  10.0.0.10:53 rr
  -> 172.17.0.2:53                Masq    1      0          0
UDP  10.0.0.10:53 rr
  -> 172.17.0.2:53                Masq    1      0          0
```

### Why kube-proxy can't start IPVS mode

Use the following checklist to help you diagnose the problem:

**1. Specify proxy-mode=ipvs**

Check whether the kube-proxy mode has been set to `ipvs`.

**2. Install required kernel modules and packages**

Check whether the kernel modules IPVS requires have been compiled into the kernel and the required packages have been installed (see [Prerequisite](#prerequisite)).
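The module check above can be scripted so it is easy to rerun on a problem node. This is a minimal sketch under the assumptions of the Prerequisite section (module names, the `modules.builtin` path); `check_module` is a hypothetical helper introduced here for illustration, and it only reports status without changing anything:

```shell
# Report whether a kernel module is loaded, built into the kernel, or missing.
check_module() {
  mod="$1"
  if cut -f1 -d ' ' /proc/modules 2>/dev/null | grep -qw "$mod"; then
    echo "$mod: loaded"
  elif grep -qw "$mod" "/lib/modules/$(uname -r)/modules.builtin" 2>/dev/null; then
    echo "$mod: built into kernel"
  else
    echo "$mod: MISSING (try: modprobe $mod)"
  fi
}

# Module names per the Prerequisite section; nf_conntrack is for kernels
# 4.19 and later (use nf_conntrack_ipv4 on older kernels).
for m in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack; do
  check_module "$m"
done

# ipset must also be installed on the node.
command -v ipset >/dev/null || echo "ipset: NOT installed"
```

Any line reporting `MISSING` or `NOT installed` points at the prerequisite that is causing kube-proxy to fall back to IPTABLES mode.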