github.com/cilium/cilium@v1.16.2/Documentation/network/egress-gateway/egress-gateway.rst (about)

     1  .. only:: not (epub or latex or html)
     2  
     3      WARNING: You are looking at unreleased Cilium documentation.
     4      Please use the official rendered version released here:
     5      https://docs.cilium.io
     6  
     7  .. _egress-gateway:
     8  
     9  **************
    10  Egress Gateway
    11  **************
    12  
    13  The egress gateway feature routes all IPv4 connections originating from pods and
    14  destined to specific cluster-external CIDRs through particular nodes, from now
    15  on called "gateway nodes".
    16  
    17  When the egress gateway feature is enabled and egress gateway policies are in
    18  place, matching packets that leave the cluster are masqueraded with selected,
    19  predictable IPs associated with the gateway nodes. As an example, this feature
    20  can be used in combination with legacy firewalls to allow traffic to legacy
infrastructure only from specific pods within a given namespace. The pods
typically have ever-changing IP addresses, and even if masquerading were used
to mitigate this, the IP addresses of nodes can also change frequently over
time.
    25  
    26  This document explains how to enable the egress gateway feature and how to
    27  configure egress gateway policies to route and SNAT the egress traffic for a
    28  specific workload.
    29  
    30  .. note::
    31  
    32      This guide assumes that Cilium has been correctly installed in your
    33      Kubernetes cluster. Please see :ref:`k8s_quick_install` for more
    34      information. If unsure, run ``cilium status`` and validate that Cilium is up
    35      and running.
    36  
    37  .. admonition:: Video
    38    :class: attention
    39  
    40    For more insights on Cilium's Egress Gateway, check out `eCHO episode 76: Cilium Egress Gateway <https://www.youtube.com/watch?v=zEQdgNGa7bg>`__.
    41  
    42  Preliminary Considerations
    43  ==========================
    44  
    45  Cilium must make use of network-facing interfaces and IP addresses present on
    46  the designated gateway nodes. These interfaces and IP addresses must be
    47  provisioned and configured by the operator based on their networking
environment. The process is highly dependent on said networking environment. For
    49  example, in AWS/EKS, and depending on the requirements, this may mean creating
    50  one or more Elastic Network Interfaces with one or more IP addresses and
    51  attaching them to instances that serve as gateway nodes so that AWS can
    52  adequately route traffic flowing from and to the instances. Other cloud
    53  providers have similar networking requirements and constructs.
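
As a concrete illustration, provisioning a secondary interface on AWS might
look roughly like the following. The subnet, security group, ENI, and instance
IDs are placeholders; adapt them to your environment:

.. code-block:: shell-session

    $ # Create a secondary ENI with a dedicated private IP (illustrative IDs)
    $ aws ec2 create-network-interface \
          --subnet-id subnet-0123456789abcdef0 \
          --groups sg-0123456789abcdef0 \
          --private-ip-address 10.0.1.100
    $ # Attach it to the instance that will serve as a gateway node
    $ aws ec2 attach-network-interface \
          --network-interface-id eni-0123456789abcdef0 \
          --instance-id i-0123456789abcdef0 \
          --device-index 1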
    54  
Additionally, enabling the egress gateway feature requires that both BPF
masquerading and the kube-proxy replacement be enabled, which may not be
possible in all environments (due to, e.g., incompatible kernel versions).
    58  
    59  Delay for enforcement of egress policies on new pods
    60  ----------------------------------------------------
    61  
    62  When new pods are started, there is a delay before egress gateway policies are
    63  applied for those pods. That means traffic from those pods may leave the
    64  cluster with a source IP address (pod IP or node IP) that doesn't match the
    65  egress gateway IP. That egressing traffic will also not be redirected through
    66  the gateway node.
    67  
    68  .. _egress-gateway-incompatible-features:
    69  
    70  Incompatibility with other features
    71  -----------------------------------
    72  
    73  Because egress gateway isn't compatible with identity allocation mode ``kvstore``,
    74  you must use Kubernetes as Cilium's identity store (``identityAllocationMode``
    75  set to ``crd``). This is the default setting for new installations.
    76  
    77  Egress gateway is not compatible with the Cluster Mesh feature. The gateway selected
    78  by an egress gateway policy must be in the same cluster as the selected pods.
    79  
    80  Egress gateway is not compatible with the CiliumEndpointSlice feature
    81  (see :gh-issue:`24833` for details).
    82  
    83  Egress gateway is not supported for IPv6 traffic.
    84  
    85  Enable egress gateway
    86  =====================
    87  
The egress gateway feature and all of its requirements can be enabled as follows:
    89  
    90  .. tabs::
    91      .. group-tab:: Helm
    92  
    93          .. parsed-literal::
    94  
    95              $ helm upgrade cilium |CHART_RELEASE| \\
    96                 --namespace kube-system \\
    97                 --reuse-values \\
    98                 --set egressGateway.enabled=true \\
    99                 --set bpf.masquerade=true \\
   100                 --set kubeProxyReplacement=true
   101  
   102      .. group-tab:: ConfigMap
   103  
   104          .. code-block:: yaml
   105  
   106              enable-bpf-masquerade: true
   107              enable-ipv4-egress-gateway: true
   108              kube-proxy-replacement: true
   109  
Roll out both the agent pods and the operator pods for the changes to take effect:
   111  
   112  .. code-block:: shell-session
   113  
   114      $ kubectl rollout restart ds cilium -n kube-system
   115      $ kubectl rollout restart deploy cilium-operator -n kube-system
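
To confirm that the new settings took effect, you can inspect the agent
configuration after the restart (a sketch; option names may vary between
Cilium versions):

.. code-block:: shell-session

    $ kubectl -n kube-system exec ds/cilium -- cilium-dbg config --all | grep -iE 'egress|masquerade'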
   116  
   117  Writing egress gateway policies
   118  ===============================
   119  
   120  The API provided by Cilium to drive the egress gateway feature is the
   121  ``CiliumEgressGatewayPolicy`` resource.
   122  
   123  Metadata
   124  --------
   125  
   126  ``CiliumEgressGatewayPolicy`` is a cluster-scoped custom resource definition, so a
   127  ``.metadata.namespace`` field should not be specified.
   128  
   129  .. code-block:: yaml
   130  
   131      apiVersion: cilium.io/v2
   132      kind: CiliumEgressGatewayPolicy
   133      metadata:
   134        name: example-policy
   135  
To target pods belonging to a given namespace, use label selectors or match
expressions instead (as described below).
   138  
   139  Selecting source pods
   140  ---------------------
   141  
   142  The ``selectors`` field of a ``CiliumEgressGatewayPolicy`` resource is used to
   143  select source pods via a label selector. This can be done using ``matchLabels``:
   144  
   145  .. code-block:: yaml
   146  
   147      selectors:
   148      - podSelector:
   149          matchLabels:
   150            labelKey: labelVal
   151  
   152  It can also be done using ``matchExpressions``:
   153  
   154  .. code-block:: yaml
   155  
   156      selectors:
   157      - podSelector:
   158          matchExpressions:
   159          - {key: testKey, operator: In, values: [testVal]}
   160          - {key: testKey2, operator: NotIn, values: [testVal2]}
   161  
Moreover, multiple ``podSelector`` entries can be specified:
   163  
   164  .. code-block:: yaml
   165  
   166      selectors:
   167      - podSelector:
   168        [..]
   169      - podSelector:
   170        [..]
   171  
   172  To select pods belonging to a given namespace, the special
   173  ``io.kubernetes.pod.namespace`` label should be used.
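
For example, to select pods carrying a given label in the ``prod`` namespace
(the namespace and label names here are hypothetical):

.. code-block:: yaml

    selectors:
    - podSelector:
        matchLabels:
          io.kubernetes.pod.namespace: prod
          app: backend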
   174  
   175  .. note::
   176      Only security identities will be taken into account.
   177      See :ref:`identity-relevant-labels` for more information.
   178  
   179  Selecting the destination
   180  -------------------------
   181  
   182  One or more IPv4 destination CIDRs can be specified with ``destinationCIDRs``:
   183  
   184  .. code-block:: yaml
   185  
   186      destinationCIDRs:
   187      - "a.b.c.d/32"
   188      - "e.f.g.0/24"
   189  
   190  .. note::
   191  
   192      Any IP belonging to these ranges which is also an internal cluster IP (e.g.
   193      pods, nodes, Kubernetes API server) will be excluded from the egress gateway
   194      SNAT logic.
   195  
   196  It's possible to specify exceptions to the ``destinationCIDRs`` list with
   197  ``excludedCIDRs``:
   198  
   199  .. code-block:: yaml
   200  
   201      destinationCIDRs:
   202      - "a.b.0.0/16"
   203      excludedCIDRs:
   204      - "a.b.c.0/24"
   205  
In this case, traffic destined to the ``a.b.0.0/16`` CIDR, except for the
``a.b.c.0/24`` destination, will go through the egress gateway and leave the
cluster with the designated egress IP.
   209  
   210  Selecting and configuring the gateway node
   211  ------------------------------------------
   212  
   213  The node that should act as gateway node for a given policy can be configured
   214  with the ``egressGateway`` field. The node is matched based on its labels, with
   215  the ``nodeSelector`` field:
   216  
   217  .. code-block:: yaml
   218  
   219    egressGateway:
   220      nodeSelector:
   221        matchLabels:
   222          testLabel: testVal
   223  
   224  .. note::
   225  
    If multiple nodes match the given set of labels, the first node in
    lexical ordering of node names is selected.
   228  
   229  .. note::
   230  
   231      If there is no match for the given set of labels, Cilium drops the
   232      traffic that matches the destination CIDR(s).
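
If you suspect traffic is being dropped for this reason, you can watch for
drop events on the gateway node's agent (a sketch; the exact output format
varies across Cilium versions):

.. code-block:: shell-session

    $ kubectl -n kube-system exec -it ds/cilium -- cilium-dbg monitor --type drop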
   233  
   234  The IP address that should be used to SNAT traffic must also be configured.
There are three different ways this can be achieved:
   236  
   237  1. By specifying the interface:
   238  
   239     .. code-block:: yaml
   240  
   241       egressGateway:
   242         nodeSelector:
   243           matchLabels:
   244             testLabel: testVal
   245         interface: ethX
   246  
   247     In this case the first IPv4 address assigned to the ``ethX`` interface will be used.
   248  
   249  2. By explicitly specifying the egress IP:
   250  
   251     .. code-block:: yaml
   252  
   253       egressGateway:
   254         nodeSelector:
   255           matchLabels:
   256             testLabel: testVal
   257         egressIP: a.b.c.d
   258  
   259     .. warning::
   260  
   261       The egress IP must be assigned to a network device on the node.
   262  
3. By omitting both the ``egressIP`` and ``interface`` properties, in which case
   the agent will use the first IPv4 address assigned to the interface of the
   default route.
   265  
   266     .. code-block:: yaml
   267  
   268       egressGateway:
   269         nodeSelector:
   270           matchLabels:
   271             testLabel: testVal
   272  
   273  Regardless of which way the egress IP is configured, the user must ensure that
   274  Cilium is running on the device that has the egress IP assigned to it, by
   275  setting the ``--devices`` agent option accordingly.
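
For example, if the egress IP lives on ``eth1`` (an assumed interface name),
the corresponding Helm values might look like this:

.. code-block:: yaml

    # values.yaml fragment (illustrative; device names depend on your nodes)
    devices: "{eth0,eth1}"
    egressGateway:
      enabled: true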
   276  
   277  .. warning::
   278  
   279     The ``egressIP`` and ``interface`` properties cannot both be specified in the ``egressGateway`` spec. Egress Gateway Policies that contain both of these properties will be ignored by Cilium.
   280  
   281  .. note::
   282  
   283     When Cilium is unable to select the Egress IP for an Egress Gateway policy (for example because the specified ``egressIP`` is not configured for any
   284     network interface on the gateway node), then the gateway node will drop traffic that matches the policy with the reason ``No Egress IP configured``.
   285  
   286  .. note::
   287  
   288     After Cilium has selected the Egress IP for an Egress Gateway policy (or failed to do so), it does not automatically respond to a change in the
   289     gateway node's network configuration (for example if an IP address is added or deleted). You can force a fresh selection by re-applying the
   290     Egress Gateway policy.
   291  
   292  Example policy
   293  --------------
   294  
   295  Below is an example of a ``CiliumEgressGatewayPolicy`` resource that conforms to
   296  the specification above:
   297  
   298  .. code-block:: yaml
   299  
   300    apiVersion: cilium.io/v2
   301    kind: CiliumEgressGatewayPolicy
   302    metadata:
   303      name: egress-sample
   304    spec:
   305      # Specify which pods should be subject to the current policy.
   306      # Multiple pod selectors can be specified.
   307      selectors:
   308      - podSelector:
   309          matchLabels:
   310            org: empire
   311            class: mediabot
   312            # The following label selects default namespace
   313            io.kubernetes.pod.namespace: default
   314  
   315      # Specify which destination CIDR(s) this policy applies to.
   316      # Multiple CIDRs can be specified.
   317      destinationCIDRs:
   318      - "0.0.0.0/0"
   319  
   320      # Configure the gateway node.
   321      egressGateway:
   322        # Specify which node should act as gateway for this policy.
   323        nodeSelector:
   324          matchLabels:
   325            node.kubernetes.io/name: a-specific-node
   326  
   327        # Specify the IP address used to SNAT traffic matched by the policy.
   328        # It must exist as an IP associated with a network interface on the instance.
   329        egressIP: 10.168.60.100
   330  
   331        # Alternatively it's possible to specify the interface to be used for egress traffic.
   332        # In this case the first IPv4 assigned to that interface will be used as egress IP.
   333        # interface: enp0s8
   334  
   335  Creating the ``CiliumEgressGatewayPolicy`` resource above would cause all
   336  traffic originating from pods with the ``org: empire`` and ``class: mediabot``
   337  labels in the ``default`` namespace and destined to ``0.0.0.0/0`` (i.e. all
   338  traffic leaving the cluster) to be routed through the gateway node with the
   339  ``node.kubernetes.io/name: a-specific-node`` label, which will then SNAT said
   340  traffic with the ``10.168.60.100`` egress IP.
   341  
   342  Selection of the egress network interface
   343  =========================================
   344  
   345  For gateway nodes with multiple network interfaces, Cilium selects the egress
   346  network interface based on the node's routing setup
   347  (``ip route get <externalIP> from <egressIP>``).
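
You can preview the kernel's choice by running the same query on the gateway
node. The addresses below are illustrative:

.. code-block:: shell-session

    $ # Which device would route traffic to 1.2.3.4 when sourced from the egress IP?
    $ ip route get 1.2.3.4 from 10.168.60.100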
   348  
   349  .. warning::
   350  
   351     Redirecting to the correct egress network interface can fail under certain
   352     conditions when using a pre-5.10 kernel. In this case Cilium falls back to
   353     the current (== default) network interface.
   354  
   355     For environments that strictly require traffic to leave through the
   356     correct egress interface (for example EKS in ENI mode), it is recommended to use
   357     a 5.10 kernel or newer.
   358  
   359  Testing the egress gateway feature
   360  ==================================
   361  
In this section we show the steps necessary to test the feature.
   363  First we deploy a pod that connects to a cluster-external service. Then we apply
   364  a ``CiliumEgressGatewayPolicy`` and observe that the pod's connection gets
   365  redirected through the Gateway node.
   366  We assume a 2-node cluster with IPs ``192.168.60.11`` (node1) and
   367  ``192.168.60.12`` (node2). The client pod gets deployed to node1, and the CEGP
   368  selects node2 as Gateway node.
   369  
   370  Create an external service (optional)
   371  -------------------------------------
   372  
   373  If you don't have an external service to experiment with, you can use Nginx, as
   374  the server access logs will show from which IP address the request is coming.
   375  
   376  Create an nginx service on a Linux node that is external to the existing Kubernetes
   377  cluster, and use it as the destination of the egress traffic:
   378  
   379  .. code-block:: shell-session
   380  
   381      $ # Install and start nginx
   382      $ sudo apt install nginx
   383      $ sudo systemctl start nginx
   384  
   385  In this example, the IP associated with the host running the Nginx instance will
   386  be ``192.168.60.13``.
   387  
   388  Deploy client pods
   389  ------------------
   390  
   391  Deploy a client pod that will be used to connect to the Nginx instance:
   392  
   393  .. parsed-literal::
   394  
   395      $ kubectl create -f \ |SCM_WEB|\/examples/kubernetes-dns/dns-sw-app.yaml
   396      $ kubectl get pods
   397      NAME                             READY   STATUS    RESTARTS   AGE
   398      pod/mediabot                     1/1     Running   0          14s
   399  
   400      $ kubectl exec mediabot -- curl http://192.168.60.13:80
   401  
   402  Verify from the Nginx access log (or other external services) that the request
   403  is coming from one of the nodes in the Kubernetes cluster. In this example the
   404  access logs should contain something like:
   405  
   406  .. code-block:: shell-session
   407  
   408      $ tail /var/log/nginx/access.log
   409      [...]
   410      192.168.60.11 - - [04/Apr/2021:22:06:57 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.52.1"
   411  
Since the client pod is running on the node ``192.168.60.11``, it is expected
that, without any Cilium egress gateway policy in place, traffic will leave the
cluster with the IP of that node.
   415  
   416  Apply egress gateway policy
   417  ---------------------------
   418  
   419  Download the ``egress-sample`` Egress Gateway Policy yaml:
   420  
   421  .. parsed-literal::
   422  
   423      $ wget \ |SCM_WEB|\/examples/kubernetes-egress-gateway/egress-gateway-policy.yaml
   424  
Modify the ``destinationCIDRs`` field to include the IP of the host on which
your designated external service is running.
   427  
Specifying an IP address in the ``egressIP`` field is optional.
To make things easier in this example, you can comment out that line.
The agent will then use the first IPv4 address assigned to the interface of the
default route.
   432  
   433  To let the policy select the node designated to be the Egress Gateway, apply the
   434  label ``egress-node: true`` to it:
   435  
   436  .. code-block:: shell-session
   437  
   438      $ kubectl label nodes <egress-gateway-node> egress-node=true
   439  
Note that the Egress Gateway node should be a different node from the one where
the ``mediabot`` pod is running.
   442  
Apply the ``egress-sample`` egress gateway policy, which will cause all traffic
from the ``mediabot`` pod to leave the cluster with the IP of the Egress Gateway
node:
   445  
   446  .. code-block:: shell-session
   447  
   448      $ kubectl apply -f egress-gateway-policy.yaml
   449  
   450  Verify the setup
   451  ----------------
   452  
   453  We can now verify with the client pod that the policy is working correctly:
   454  
   455  .. code-block:: shell-session
   456  
   457      $ kubectl exec mediabot -- curl http://192.168.60.13:80
   458      <HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
   459      [...]
   460  
   461  The access log from Nginx should show that the request is coming from the
   462  selected Egress IP rather than the one of the node where the pod is running:
   463  
   464  .. code-block:: shell-session
   465  
   466      $ tail /var/log/nginx/access.log
   467      [...]
   468      192.168.60.100 - - [04/Apr/2021:22:06:57 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.52.1"
   469  
   470  Troubleshooting
   471  ---------------
   472  
To troubleshoot a policy that is not behaving as expected, you can view the
egress configuration in a Cilium agent (the configuration is propagated to all
agents, so it shouldn't matter which one you pick):
   476  
   477  .. code-block:: shell-session
   478  
   479      $ kubectl -n kube-system exec ds/cilium -- cilium-dbg bpf egress list
   480      Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), wait-for-node-init (init), clean-cilium-state (init)
   481      Source IP    Destination CIDR    Egress IP   Gateway IP
   482      192.168.2.23 192.168.60.13/32    0.0.0.0     192.168.60.12
   483  
   484  The Source IP address matches the IP address of each pod that matches the
   485  policy's ``podSelector``. The Gateway IP address matches the (internal) IP address
of the egress node that matches the policy's ``nodeSelector``. The Egress IP is
``0.0.0.0`` on all agents except for the one running on the egress gateway node,
where you should see the Egress IP address used for this traffic (which will be
the ``egressIP`` from the policy, if specified).
   490  
   491  If the egress list shown does not contain entries as expected to match your
   492  policy, check that the pod(s) and egress node are labeled correctly to match
   493  the policy selectors.
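
For example, to inspect the labels (the label values used by the sample policy
are assumptions):

.. code-block:: shell-session

    $ kubectl get pods --show-labels
    $ kubectl get nodes --show-labels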
   494  
   495  Troubleshooting SNAT Connection Limits
   496  --------------------------------------
   497  
For more advanced troubleshooting, see the egress gateway troubleshooting topic
on :ref:`SNAT connection limits<snat_connection_limits>`.
   499