github.com/cilium/cilium@v1.16.2/Documentation/security/threat-model.rst

     1  .. only:: not (epub or latex or html)
     2  
     3      WARNING: You are looking at unreleased Cilium documentation.
     4      Please use the official rendered version released here:
     5      https://docs.cilium.io
     6  
     7  Threat Model
     8  ============
     9  
    10  This section presents a threat model for Cilium. This threat model
    11  allows interested parties to understand:
    12  
    13  -  security-specific implications of Cilium's architecture
    14  -  controls that are in place to secure data flowing through Cilium's various components
    15  -  recommended controls for running Cilium in a production environment
    16  
    17  Scope and Prerequisites
    18  -----------------------
    19  
    20  This threat model considers the possible attacks that could affect an
    21  up-to-date version of Cilium running in a production environment; it
    22  will be refreshed when there are significant changes to Cilium's
    23  architecture or security posture.
    24  
    25  This model does not consider supply-chain attacks, such as attacks where
    26  a malicious contributor is able to intentionally inject vulnerable code
    27  into Cilium. For users who are concerned about supply-chain attacks,
    28  Cilium's `security audit`_ assessed Cilium's supply chain controls
    29  against `the SLSA framework`_.
    30  
    31  In order to understand the following threat model, readers will need
    32  familiarity with basic Kubernetes concepts, as well as a high-level
    33  understanding of Cilium's :ref:`architecture and components<component_overview>`.
    34  
    35  .. _security audit: https://github.com/cilium/cilium.io/blob/main/Security-Reports/CiliumSecurityAudit2022.pdf
    36  .. _the SLSA framework:  https://slsa.dev/
    37  
    38  Methodology
    39  -----------
    40  
    41  This threat model considers eight different types of threat
    42  actors, placed at different parts of a typical deployment stack. We
    43  primarily use Kubernetes as an example, but the threat model remains
    44  accurate when Cilium is deployed with other orchestration systems or
    45  runs outside of Kubernetes. The attackers have different levels
    46  of initial privileges, giving us a broad overview of the security
    47  guarantees that Cilium can provide depending on the nature of the threat
    48  and the extent of a previous compromise.
    49  
    50  For each threat actor, this guide uses `the STRIDE methodology`_ to
    51  assess likely attacks. Where one attack type in the STRIDE set can lead to others
    52  (for example, tampering leading to denial of service), we have described the
    53  attack path under the most impactful attack type. For the potential attacks
    54  that we identify, we recommend controls that can be used to reduce the
    55  risk of the identified attacks compromising a cluster. Applying the
    56  recommended controls is strongly advised in order to run Cilium securely
    57  in production.
    58  
    59  .. _the STRIDE methodology: https://en.wikipedia.org/wiki/STRIDE_(security)
    60  
    61  Reference Architecture
    62  ----------------------
    63  
    64  For ease of understanding, consider a single Kubernetes
    65  cluster running Cilium, as illustrated below:
    66  
    67  .. image:: images/cilium_threat_model_reference_architecture.png
    68  
    69  The Threat Surface
    70  ~~~~~~~~~~~~~~~~~~
    71  
    72  In the above scenario, the aim of Cilium's security controls is to
    73  ensure that all the components of the Cilium platform are operating
    74  correctly, to the extent possible given the abilities of the threat
    75  actor that Cilium is faced with. The key components that need to be
    76  protected are:
    77  
    78  -  the Cilium agent running on a node, whether as a Kubernetes pod, as a host process, or as an entire virtual machine
    79  -  Cilium state (either stored via CRDs or via an external key-value store like etcd)
    80  -  eBPF programs loaded by Cilium into the kernel
    81  -  network packets managed by Cilium
    82  -  observability data collected by Cilium and stored by Hubble
    83  
    84  The Threat Model
    85  ----------------
    86  
    87  For each type of attacker, we consider the plausible types of attacks
    88  available to them, how Cilium can be used to protect against these
    89  attacks, as well as the security controls that Cilium provides. For
    90  attacks which might arise as a consequence of the high level of
    91  privileges required by Cilium, we also suggest mitigations that users
    92  should apply to secure their environments.
    93  
    94  .. _kubernetes-workload-attacker:
    95  
    96  Kubernetes Workload Attacker
    97  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    98  
    99  For the first scenario, consider an attacker who has been able to
   100  gain access to a Kubernetes pod, and is now able to run arbitrary code
   101  inside a container. This could occur, for example, if a vulnerable
   102  service is exposed externally to a network. In this case, let us also
   103  assume that the compromised pod does not have any elevated privileges
   104  (in Kubernetes or on the host) or direct access to host files.
   105  
   106  .. image:: images/cilium_threat_model_workload.png
   107  
   108  In this scenario, there is no potential for compromise of the Cilium
   109  stack; in fact, Cilium provides several features that would allow users
   110  to limit the scope of such an attack:
   111  
   112  .. rst-class:: wrapped-table
   113  
   114  +-----------------+---------------------+--------------------------------+
   115  | Threat surface  | Identified STRIDE   | Cilium security benefits       |
   116  |                 | threats             |                                |
   117  +=================+=====================+================================+
   118  | Cilium agent    | Potential denial of | Cilium can enforce             |
   119  |                 | service if the      | `bandwidth limitations`_       |
   120  |                 | compromised         | on pods to limit the network   |
   121  |                 | Kubernetes workload | resource utilization.          |
   122  |                 | does not have       |                                |
   123  |                 | defined resource    |                                |
   124  |                 | limits.             |                                |
   125  |                 |                     |                                |
   126  +-----------------+---------------------+--------------------------------+
   127  | Cilium          | None                |                                |
   128  | configuration   |                     |                                |
   129  +-----------------+---------------------+--------------------------------+
   130  | Cilium eBPF     | None                |                                |
   131  | programs        |                     |                                |
   132  +-----------------+---------------------+--------------------------------+
   133  | Network data    | None                | - Cilium's network policy can  |
   134  |                 |                     |   be used to provide           |
   135  |                 |                     |   least-privilege isolation    |
   136  |                 |                     |   between Kubernetes           |
   137  |                 |                     |   workloads, and between       |
   138  |                 |                     |   Kubernetes workloads and     |
   139  |                 |                     |   "external" endpoints running |
   140  |                 |                     |   outside the Kubernetes       |
   141  |                 |                     |   cluster, or running on the   |
   142  |                 |                     |   Kubernetes worker nodes.     |
   143  |                 |                     |   Users should ideally define  |
   144  |                 |                     |   specific allow rules that    |
   145  |                 |                     |   only permit expected         |
   146  |                 |                     |   communication between        |
   147  |                 |                     |   services.                    |
   148  |                 |                     | - Cilium's network             |
   149  |                 |                     |   connectivity will prevent an |
   150  |                 |                     |   attacker from observing the  |
   151  |                 |                     |   traffic intended for other   |
   152  |                 |                     |   workloads, or sending        |
   153  |                 |                     |   traffic that "spoofs" the    |
   154  |                 |                     |   identity of another pod,     |
   155  |                 |                     |   even if transparent          |
   156  |                 |                     |   encryption is not in use.    |
   157  |                 |                     |   Pods cannot send traffic     |
   158  |                 |                     |   that "spoofs" other pods due |
   159  |                 |                     |   to limits on the use of      |
   160  |                 |                     |   source IPs and limits on     |
   161  |                 |                     |   sending tunneled traffic.    |
   162  +-----------------+---------------------+--------------------------------+
   163  | Observability   | None                | Cilium's Hubble flow-event     |
   164  | data            |                     | observability can be used to   |
   165  |                 |                     | provide reliable audit of      |
   166  |                 |                     | the attacker's L3/L4 and L7    |
   167  |                 |                     | network connectivity.          |
   168  +-----------------+---------------------+--------------------------------+
   169  
   170  .. _bandwidth limitations: https://docs.cilium.io/en/stable/network/kubernetes/bandwidth-manager/
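
As an illustration of the specific allow rules mentioned in the table above,
the following is a minimal sketch of a ``CiliumNetworkPolicy``. The policy
name, namespace, labels, and port are hypothetical and would need to be
adapted to the services actually running in a cluster.

.. code-block:: yaml

    # Hypothetical example: only pods labeled app=frontend may reach
    # pods labeled app=backend, and only on TCP port 8080.
    apiVersion: cilium.io/v2
    kind: CiliumNetworkPolicy
    metadata:
      name: allow-frontend-to-backend   # hypothetical name
      namespace: demo                   # hypothetical namespace
    spec:
      endpointSelector:
        matchLabels:
          app: backend
      ingress:
      - fromEndpoints:
        - matchLabels:
            app: frontend
        toPorts:
        - ports:
          - port: "8080"
            protocol: TCP

Once a policy selects an endpoint for ingress, ingress traffic that is not
explicitly allowed is dropped, so a compromised pod that does not carry the
``app: frontend`` label would not be able to reach the backend.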
   171  
   172  Recommended Controls
   173  ^^^^^^^^^^^^^^^^^^^^
   174  
   175  -  Kubernetes workloads should have `defined resource limits`_.
   176     This will help in ensuring that Cilium is not starved of resources due to a misbehaving deployment in a cluster.
   177  -  Cilium can be given prioritized access to system resources either via
   178     Kubernetes, cgroups, or other controls.
   179  -  Runtime security solutions such as `Tetragon`_ should be deployed to 
   180     ensure that container compromises can be detected as they occur.
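
The first control can be sketched as a pod specification along the following
lines. The pod name, image, and limit values are hypothetical. The
``kubernetes.io/egress-bandwidth`` annotation is only expected to take effect
when Cilium's Bandwidth Manager is enabled.

.. code-block:: yaml

    apiVersion: v1
    kind: Pod
    metadata:
      name: example-app                          # hypothetical name
      annotations:
        # Caps egress bandwidth; assumes the Bandwidth Manager is enabled.
        kubernetes.io/egress-bandwidth: "10M"
    spec:
      containers:
      - name: app
        image: registry.example.com/app:1.0.0    # hypothetical image
        resources:
          requests:
            cpu: "250m"
            memory: "128Mi"
          limits:
            cpu: "500m"
            memory: "256Mi"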
   181  
   182  .. _defined resource limits: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
   183  .. _Tetragon: https://github.com/cilium/tetragon
   184  
   185  .. _limited-privilege-host-attacker:
   186  
   187  Limited-privilege Host Attacker
   188  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   189  
   190  In this scenario, the attacker is someone with the ability to run
   191  arbitrary code with direct access to the host PID or network namespace
   192  (or both), but without "root" privileges that would allow them to
   193  disable Cilium components or undermine the eBPF and other kernel state
   194  Cilium relies on.
   195  
   196  This level of access could exist for a variety of reasons, including:
   197  
   198  -  Pods or other containers running in the host PID or network
   199     namespace, but not with "root" privileges. This includes
   200     ``hostNetwork: true`` and ``hostPID: true`` containers.
   201  -  Non-"root" SSH or other console access to a node.
   202  -  A containerized workload that has "escaped" the container namespace
   203     but as a non-privileged user.
   204  
   205  .. image:: images/cilium_threat_model_non_privileged.png
   206  
   207  In this case, an attacker would be able to bypass some of Cilium's
   208  network controls, as described below:
   209  
   210  .. rst-class:: wrapped-table
   211  
   212  +-----------------+-------------------------+----------------------------+
   213  | **Threat        | **Identified STRIDE     | **Cilium security          |
   214  | surface**       | threats**               | benefits**                 |
   215  +=================+=========================+============================+
   216  | Cilium agent    | - If the non-privileged |                            |
   217  |                 |   attacker is able to   |                            |
   218  |                 |   access the container  |                            |
   219  |                 |   runtime and Cilium is |                            |
   220  |                 |   running as a          |                            |
   221  |                 |   container, the        |                            |
   222  |                 |   attacker will be able |                            |
   223  |                 |   to tamper with the    |                            |
   224  |                 |   Cilium agent running  |                            |
   225  |                 |   on the node.          |                            |
   226  |                 | - Denial of service is  |                            |
   227  |                 |   also possible via     |                            |
   228  |                 |   spawning workloads    |                            |
   229  |                 |   directly on the host. |                            |
   230  +-----------------+-------------------------+----------------------------+
   231  | Cilium          | Same as for the Cilium  |                            |
   232  | configuration   | agent.                  |                            |
   233  |                 |                         |                            |
   234  |                 |                         |                            |
   235  |                 |                         |                            |
   236  |                 |                         |                            |
   237  |                 |                         |                            |
   238  |                 |                         |                            |
   239  |                 |                         |                            |
   240  +-----------------+-------------------------+----------------------------+
   241  | Cilium eBPF     | Same as for the Cilium  |                            |
   242  | programs        | agent.                  |                            |
   243  |                 |                         |                            |
   244  |                 |                         |                            |
   245  |                 |                         |                            |
   246  |                 |                         |                            |
   247  |                 |                         |                            |
   248  |                 |                         |                            |
   249  |                 |                         |                            |
   250  +-----------------+-------------------------+----------------------------+
   251  | Network data    | Elevation of            | Cilium's network           |
   252  |                 | privilege: traffic      | connectivity will prevent  |
   253  |                 | sent by the attacker    | an attacker from observing |
   254  |                 | will no longer be       | the traffic intended for   |
   255  |                 | subject to Kubernetes   | other workloads, or        |
   256  |                 | or                      | sending traffic that       |
   257  |                 | container-networked     | spoofs the identity of     |
   258  |                 | Cilium network          | another pod, even if       |
   259  |                 | policies.               | transparent encryption is  |
   260  |                 | :ref:`Host-networked    | not in use.                |
   261  |                 | Cilium                  |                            |
   262  |                 | policies                |                            |
   263  |                 | <host_firewall>`        |                            |
   264  |                 | will continue to        |                            |
   265  |                 | apply. Other traffic    |                            |
   266  |                 | within the cluster      |                            |
   267  |                 | remains unaffected.     |                            |
   268  +-----------------+-------------------------+----------------------------+
   269  | Observability   | None                    | Cilium's Hubble flow-event |
   270  | data            |                         | observability can be used  |
   271  |                 |                         | to provide reliable audit  |
   272  |                 |                         | of the attacker's L3/L4    |
   273  |                 |                         | and L7 network             |
   274  |                 |                         | connectivity. Traffic sent |
   275  |                 |                         | by the attacker will be    |
   276  |                 |                         | attributed to the worker   |
   277  |                 |                         | node, and not to a         |
   278  |                 |                         | specific Kubernetes        |
   279  |                 |                         | workload.                  |
   280  +-----------------+-------------------------+----------------------------+
   281  
   282  Recommended Controls
   283  ^^^^^^^^^^^^^^^^^^^^
   284  
   285  In addition to the recommended controls against the :ref:`kubernetes-workload-attacker`:
   286  
   287  -  Container images should be regularly patched to reduce the chance of
   288     compromise.
   289  -  Minimal container images should be used where possible.
   290  -  Host-level privileges should be avoided where possible.
   291  -  Ensure that the container users do not have access to the underlying
   292     container runtime.
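
A minimal sketch of a pod that avoids host-level privileges is shown below.
The pod name and image are hypothetical, and the appropriate settings depend
on what the workload actually needs.

.. code-block:: yaml

    apiVersion: v1
    kind: Pod
    metadata:
      name: example-app                          # hypothetical name
    spec:
      # Stay out of the host network and PID namespaces.
      hostNetwork: false
      hostPID: false
      containers:
      - name: app
        image: registry.example.com/app:1.0.0    # hypothetical image
        securityContext:
          runAsNonRoot: true
          privileged: false
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["ALL"]

Such a pod also should not mount the container runtime socket from the host,
since that would give it access to the underlying container runtime.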
   293  
   294  .. _root-equivalent-host-attacker:
   295  
   296  Root-equivalent Host Attacker
   297  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   298  
   299  A root-equivalent host attacker has full privileges on the local
   300  host. This access could exist for several reasons, including:
   301  
   302  -  Root SSH or other console access to the Kubernetes worker node.
   303  -  A containerized workload that has escaped the container namespace as
   304     a privileged user.
   305  -  Pods running with ``privileged: true`` or other significant
   306     capabilities like ``CAP_SYS_ADMIN`` or ``CAP_BPF``.
   307  
   308  .. image:: images/cilium_threat_model_root.png
   309  
   310  .. rst-class:: wrapped-table
   311  
   312  +-------------------+--------------------------------------------------+
   313  | **Threat          | **Identified STRIDE threats**                    |
   314  | surface**         |                                                  |
   315  +===================+==================================================+
   316  | Cilium agent      | In this situation, all potential attacks covered |
   317  |                   | by STRIDE are possible. Of note:                 |
   318  |                   |                                                  |
   319  |                   | -  The attacker would be able to disable eBPF on |
   320  |                   |    the node, disabling Cilium's network and      |
   321  |                   |    runtime visibility and enforcement. All       |
   322  |                   |    further operations by the attacker will be    |
   323  |                   |    unlimited and unaudited.                      |
   324  |                   | -  The attacker would be able to observe network |
   325  |                   |    connectivity across all workloads on the      |
   326  |                   |    host.                                         |
   327  |                   | -  The attacker can spoof traffic from the node  |
   328  |                   |    such that it appears to come from pods        |
   329  |                   |    with any identity.                            |
   330  |                   | -  If the physical network allows ARP poisoning, |
   331  |                   |    or if any other attack allows a               |
   332  |                   |    compromised node to "attract" traffic         |
   333  |                   |    destined to other nodes, the attacker can     |
   334  |                   |    potentially intercept all traffic in the      |
   335  |                   |    cluster, even if this traffic is encrypted    |
   336  |                   |    using IPsec, since we use a cluster-wide      |
   337  |                   |    pre-shared key.                               |
   338  |                   | -  The attacker can also use Cilium's            |
   339  |                   |    credentials to :ref:`attack the Kubernetes    |
   340  |                   |    API server <kubernetes-api-server-attacker>`, |
   341  |                   |    as well as Cilium's :ref:`etcd key-value      |
   342  |                   |    store <kv-store-attacker>` (if in use).       |
   343  |                   | -  If the compromised node is running the        |
   344  |                   |    ``cilium-operator`` pod, the attacker         |
   345  |                   |    would be able to carry out denial of          |
   346  |                   |    service attacks against other nodes using     |
   347  |                   |    the ``cilium-operator`` service account       |
   348  |                   |    credentials found on the node.                |
   349  +-------------------+                                                  |
   350  | Cilium            |                                                  |
   351  | configuration     |                                                  |
   352  +-------------------+                                                  |
   353  | Cilium eBPF       |                                                  |
   354  | programs          |                                                  |
   355  +-------------------+                                                  |
   356  | Network data      |                                                  |
   357  +-------------------+                                                  |
   358  | Observability     |                                                  |
   359  | data              |                                                  |
   360  +-------------------+--------------------------------------------------+
   361  
   362  This attack scenario emphasizes the importance of securing Kubernetes
   363  nodes, minimizing the permissions available to container workloads, and
   364  monitoring for suspicious activity on the node, container, and API
   365  server levels.
   366  
   367  Recommended Controls
   368  ^^^^^^^^^^^^^^^^^^^^
   369  
   370  In addition to the controls against a :ref:`limited-privilege-host-attacker`:
   371  
   372  -  Workloads with privileged access should be reviewed; privileged access should
   373     only be provided to deployments if essential.
   374  -  Network policies should be configured to limit connectivity to workloads with
   375     privileged access.
   376  -  Kubernetes audit logging should be enabled, with audit logs being sent to a
   377     centralized external location for automated review.
   378  -  Detections should be configured to alert on suspicious activity.
   379  -  ``cilium-operator`` pods should not be scheduled on nodes that run regular
   380     workloads, and should instead be configured to run on control plane nodes.
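
The last control could be expressed through Helm values similar to the sketch
below. It assumes that control plane nodes carry the conventional
``node-role.kubernetes.io/control-plane`` label and taint; clusters that label
or taint their control plane nodes differently would need other values.

.. code-block:: yaml

    # Hypothetical Helm values fragment for the Cilium chart.
    operator:
      # Schedule cilium-operator only on control plane nodes.
      nodeSelector:
        node-role.kubernetes.io/control-plane: ""
      # Tolerate the control plane taint so the pods can be scheduled there.
      tolerations:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule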
   381  
   382  .. _mitm-attacker:
   383  
   384  Man-in-the-middle Attacker
   385  ~~~~~~~~~~~~~~~~~~~~~~~~~~
   386  
   387  In this scenario, our attacker has access to the underlying network
   388  between Kubernetes worker nodes, but not the Kubernetes worker nodes
   389  themselves. This attacker may inspect, modify, or inject malicious
   390  network traffic.
   391  
   392  .. image:: images/cilium_threat_model_mitm.png
   393  
   394  The threat matrix for such an attacker is as follows:
   395  
   396  .. rst-class:: wrapped-table
   397  
   398  +------------------+---------------------------------------------------+
   399  | **Threat         | **Identified STRIDE threats**                     |
   400  | surface**        |                                                   |
   401  +==================+===================================================+
   402  | Cilium agent     | None                                              |
   403  +------------------+---------------------------------------------------+
   404  | Cilium           | None                                              |
   405  | configuration    |                                                   |
   406  +------------------+---------------------------------------------------+
   407  | Cilium eBPF      | None                                              |
   408  | programs         |                                                   |
   409  +------------------+---------------------------------------------------+
   410  | Network data     | - Without transparent encryption, an attacker     |
   411  |                  |   could inspect traffic between workloads in both |
   412  |                  |   overlay and native routing modes.               |
   413  |                  | - An attacker with knowledge of pod network       |
   414  |                  |   configuration (including pod IP addresses and   |
   415  |                  |   ports) could inject traffic into a cluster by   |
   416  |                  |   forging packets.                                |
   417  |                  | - Denial of service could occur depending on the  |
   418  |                  |   behavior of the attacker.                       |
   419  +------------------+---------------------------------------------------+
   420  | Observability    | - TLS is required for all connectivity between    |
   421  | data             |   Cilium components, as well as for exporting     |
   422  |                  |   data to other destinations, removing the        |
   423  |                  |   scope for spoofing or tampering.                |
   424  |                  | - Without transparent encryption, the attacker    |
   425  |                  |   could re-create the observability data as       |
   426  |                  |   available on the network level.                 |
   427  |                  | - Information leakage could occur via an attacker |
   428  |                  |   scraping Hubble Prometheus metrics. These       |
   429  |                  |   metrics are disabled by default, and            |
   430  |                  |   can contain sensitive information on network    |
   431  |                  |   flows.                                          |
   432  |                  | - Denial of service could occur depending on the  |
   433  |                  |   behavior of the attacker.                       |
   434  +------------------+---------------------------------------------------+
   435  
   436  Recommended Controls
   437  ^^^^^^^^^^^^^^^^^^^^
   438  
   439  - :ref:`gsg_encryption` should be configured to ensure the confidentiality of
   440    communication between workloads.
   441  - TLS should be configured for communication between the Prometheus
   442    metrics endpoints and the Prometheus server.
   443  - Network policies should be configured such that only the Prometheus
   444    server is allowed to scrape metrics, in particular :ref:`Hubble metrics <metrics>`.
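
As a minimal sketch of the first control, transparent encryption might be
enabled through Helm values such as the following. WireGuard is chosen here
only as an example; IPsec is the other supported option and additionally
requires a pre-shared key to be provisioned. See :ref:`gsg_encryption` for
the authoritative set of options.

.. code-block:: yaml

    # Hypothetical Helm values fragment for the Cilium chart.
    encryption:
      enabled: true
      type: wireguard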
   445  
   446  .. _network-attacker:
   447  
   448  Network Attacker
   449  ~~~~~~~~~~~~~~~~
   450  
   451  In our threat model, a generic network attacker has access to the same
   452  underlying IP network as Kubernetes worker nodes, but is not inline
   453  between the nodes. The assumption is that this attacker is still able to
   454  send IP layer traffic that reaches a Kubernetes worker node. This is a
   455  weaker variant of the man-in-the-middle attack described above, as the
   456  attacker can only inject traffic to worker nodes, but not see the
   457  replies.
   458  
   459  .. image:: images/cilium_threat_model_network_attacker.png
   460  
   461  For such an attacker, the threat matrix is as follows:
   462  
   463  .. rst-class:: wrapped-table
   464  
   465  +------------------+---------------------------------------------------+
   466  | **Threat         | **Identified STRIDE threats**                     |
   467  | surface**        |                                                   |
   468  +==================+===================================================+
   469  | Cilium agent     | None                                              |
   470  +------------------+---------------------------------------------------+
   471  | Cilium           | None                                              |
   472  | configuration    |                                                   |
   473  +------------------+---------------------------------------------------+
   474  | Cilium eBPF      | None                                              |
   475  | programs         |                                                   |
   476  +------------------+---------------------------------------------------+
   477  | Network data     | - An attacker with knowledge of pod network       |
   478  |                  |   configuration (including pod IP addresses and   |
   479  |                  |   ports) could inject traffic into a cluster by   |
   480  |                  |   forging packets.                                |
   481  |                  | - Denial of service could occur depending on the  |
   482  |                  |   behavior of the attacker.                       |
   483  +------------------+---------------------------------------------------+
   484  | Observability    | - Denial of service could occur depending on the  |
   485  | data             |   behavior of the attacker.                       |
   486  |                  | - Information leakage could occur via an attacker |
   487  |                  |   scraping Cilium or Hubble Prometheus metrics,   |
   488  |                  |   depending on the specific metrics enabled.      |
   489  +------------------+---------------------------------------------------+
   490  
   491  Recommended Controls
   492  ^^^^^^^^^^^^^^^^^^^^
   493  
   494  - :ref:`gsg_encryption` should be configured to ensure the confidentiality of
   495    communication between workloads.
   496  
   497  .. _kubernetes-api-server-attacker:
   498  
   499  Kubernetes API Server Attacker
   500  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   501  
   502  This type of attack could be carried out by any user or code with
   503  network access to the Kubernetes API server and credentials that allow
   504  Kubernetes API requests. Such permissions would allow the user to read
   505  or manipulate the API server state (for example by changing CRDs).
   506  
   507  This section is intended to cover any attack that might be exposed via
   508  Kubernetes API server access, regardless of whether the access is full or
   509  limited. 
   510  
   511  .. image:: images/cilium_threat_model_api_server_attacker.png
   512  
   513  For such an attacker, our threat matrix is as follows:
   514  
   515  .. rst-class:: wrapped-table
   516  
   517  +------------------+---------------------------------------------------+
   518  | **Threat         | **Identified STRIDE threats**                     |
   519  | surface**        |                                                   |
   520  +==================+===================================================+
   521  | Cilium agent     | - A Kubernetes API user with ``kubectl exec``     |
   522  |                  |   access to the pod running Cilium effectively    |
   523  |                  |   becomes a :ref:`root-equivalent host            |
   524  |                  |   attacker <root-equivalent-host-attacker>`,      |
   525  |                  |   since Cilium runs as a privileged pod.          |
   526  |                  | - An attacker with permissions to configure       |
   527  |                  |   workload settings effectively becomes a         |
   528  |                  |   :ref:`kubernetes-workload-attacker`.            |
   529  +------------------+---------------------------------------------------+
   530  | Cilium           | The ability to modify the ``Cilium*``             |
   531  | configuration    | CustomResourceDefinitions, as well as any         |
   532  |                  | CustomResource from Cilium, in the cluster could  |
   533  |                  | have the following effects:                       |
   534  |                  |                                                   |
   535  |                  | -  The ability to create or modify CiliumIdentity |
   536  |                  |    and CiliumEndpoint or CiliumEndpointSlice      |
   537  |                  |    resources would allow an attacker to tamper    |
   538  |                  |    with the identities of pods.                   |
   539  |                  | -  The ability to delete Kubernetes or Cilium     |
   540  |                  |    NetworkPolicies would remove policy            |
   541  |                  |    enforcement.                                   |
   542  |                  | -  Creating a large number of CiliumIdentity      |
   543  |                  |    resources could result in denial of service.   |
   544  |                  | -  Workloads external to the cluster could be     |
   545  |                  |    added to the network.                          |
   546  |                  | -  Traffic routing settings between workloads     |
   547  |                  |    could be modified.                             |
   548  |                  |                                                   |
   549  |                  | The cumulative effect of such actions could       |
   550  |                  | result in the escalation of a single-node         |
   551  |                  | compromise into a multi-node compromise.          |
   552  +------------------+---------------------------------------------------+
   553  | Cilium eBPF      | An attacker with ``kubectl exec`` access to the   |
   554  | programs         | Cilium agent pod will be able to modify eBPF      |
   555  |                  | programs.                                         |
   556  +------------------+---------------------------------------------------+
   557  | Network data     | Privileged Kubernetes API server access (``exec`` |
   558  |                  | access to Cilium pods or access to view           |
   559  |                  | Kubernetes secrets) could allow an attacker to    |
   560  |                  | access the pre-shared key used for IPsec. When    |
   561  |                  | used by a :ref:`man-in-the-middle                 |
   562  |                  | attacker <mitm-attacker>`, this                   |
   563  |                  | could undermine the confidentiality and integrity |
   564  |                  | of workload communication.                        |
   565  |                  | |br| |br|                                         |
   566  |                  | Depending on the attacker's level of access, the  |
   567  |                  | ability to spoof identities or tamper with policy |
   568  |                  | enforcement could also allow them to view network |
   569  |                  | data.                                             |
   570  +------------------+---------------------------------------------------+
   571  | Observability    | Users with permissions to configure workload      |
   572  | data             | settings could cause denial of service.           |
   573  +------------------+---------------------------------------------------+
   574  
   575  Recommended Controls
   576  ^^^^^^^^^^^^^^^^^^^^
   577  
   578  - `Kubernetes RBAC`_ should be configured to only grant necessary permissions
   579    to users and service accounts. Access to resources in the ``kube-system``
   580    and ``cilium`` namespaces in particular should be highly limited.
   581  - Kubernetes audit logs should be used to automatically review requests
   582    made to the API server, and detections should be configured to
   583    alert on suspicious activity.
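
As a rough illustration of the first control, the following sketch flags RBAC
rules that grant the kinds of access highlighted in the threat matrix above:
``create`` on ``pods/exec`` and read access to ``secrets``. It works on plain
dictionaries shaped like the ``rules`` list of a Role or ClusterRole. The
function name ``risky_grants`` and the ``SENSITIVE`` table are assumptions made
for this example and are not part of Cilium or Kubernetes. The check also
ignores ``apiGroups``, ``resourceNames``, and patterns such as ``*/exec``, so
treat it as a starting point rather than a complete audit.

.. code-block:: python

    # Illustrative only: a minimal check over RBAC rules expressed as dicts,
    # mirroring the "rules" list of a Role or ClusterRole manifest.

    # Resources and verbs treated as sensitive in this sketch (an assumption).
    SENSITIVE = {
        "pods/exec": {"create"},
        "secrets": {"get", "list", "watch"},
    }


    def risky_grants(rules):
        """Return sorted (resource, verb) pairs that grant sensitive access."""
        found = set()
        for rule in rules:
            resources = set(rule.get("resources", []))
            verbs = set(rule.get("verbs", []))
            for resource, sensitive_verbs in SENSITIVE.items():
                # "*" in resources or verbs matches everything in RBAC.
                if resource not in resources and "*" not in resources:
                    continue
                granted = sensitive_verbs if "*" in verbs else verbs & sensitive_verbs
                found.update((resource, verb) for verb in granted)
        return sorted(found)


    # Hypothetical rules, as they might appear in a Role manifest.
    example_rules = [
        {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list"]},
        {"apiGroups": [""], "resources": ["pods/exec"], "verbs": ["create"]},
        {"apiGroups": [""], "resources": ["secrets"], "verbs": ["*"]},
    ]
    print(risky_grants(example_rules))

A check like this could run against the roles bound in the ``kube-system`` and
``cilium`` namespaces, where such grants matter most.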
   584  
   585  .. _Kubernetes RBAC: https://kubernetes.io/docs/reference/access-authn-authz/rbac/
   586  
   587  .. _kv-store-attacker:
   588  
   589  Cilium Key-value Store Attacker
   590  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   591  
   592  Cilium can use :ref:`an external key-value store <k8s_install_etcd>`
   593  such as etcd to store state. In this scenario, we consider a user with
   594  network access to the Cilium etcd endpoints and credentials to access
   595  those etcd endpoints. The credentials to the etcd endpoints are stored
   596  as Kubernetes secrets; any attacker would first have to compromise these
   597  secrets before gaining access to the key-value store.
   598  
   599  .. image:: images/cilium_threat_model_etcd_attacker.png
   600  
   601  .. rst-class:: wrapped-table
   602  
   603  +------------------+---------------------------------------------------+
   604  | **Threat         | **Identified STRIDE threats**                     |
   605  | surface**        |                                                   |
   606  +==================+===================================================+
   607  | Cilium agent     | None                                              |
   608  +------------------+---------------------------------------------------+
   609  | Cilium           | The ability to create or modify Identities or     |
   610  | configuration    | Endpoints in etcd would allow an attacker to      |
   611  |                  | "give" any pod any identity. The ability to spoof |
   612  |                  | identities in this manner might be used to        |
   613  |                  | escalate a single-node compromise to a multi-node |
   614  |                  | compromise, for example by spoofing identities to |
   615  |                  | undermine ingress segmentation rules that would   |
   616  |                  | be applied on remote nodes.                       |
   617  +------------------+---------------------------------------------------+
   618  | Cilium eBPF      | None                                              |
   619  | programs         |                                                   |
   620  +------------------+---------------------------------------------------+
   621  | Network data     | An attacker would be able to modify the routing   |
   622  |                  | of traffic within a cluster, and as a consequence |
   623  |                  | gain the privileges of a :ref:`mitm-attacker`.    |
   624  |                  |                                                   |
   625  +------------------+---------------------------------------------------+
   626  | Observability    | None                                              |
   627  | data             |                                                   |
   628  +------------------+---------------------------------------------------+
   629  
   630  Recommended Controls
   631  ^^^^^^^^^^^^^^^^^^^^
   632  
   633  -  The ``etcd`` instance deployed to store Cilium configuration should be independent
   634     of the instance that is typically deployed as part of configuring a Kubernetes
   635     cluster. This separation reduces the risk of a Cilium ``etcd`` compromise
   636     leading to further cluster-wide impact.
   637  -  Kubernetes RBAC controls should be applied to restrict access to Kubernetes
   638     secrets.
   639  -  Kubernetes audit logs should be used to detect access to secret data and
   640     alert if such access is suspicious.
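
The audit log control above can be sketched as a small filter over audit
events. This example assumes the API server writes ``audit.k8s.io/v1`` events
as JSON lines and uses the ``verb``, ``user.username``, and ``objectRef``
fields of that format. The function name ``secret_reads``, the sample events,
and the choice of the ``kube-system`` namespace are hypothetical. Deciding
whether a given access is suspicious is left to the alerting system.

.. code-block:: python

    import json


    def secret_reads(lines, namespace):
        """Yield (username, verb, secret name) for reads of Secrets in a namespace."""
        for line in lines:
            line = line.strip()
            if not line:
                continue
            event = json.loads(line)
            ref = event.get("objectRef") or {}
            if ref.get("resource") != "secrets" or ref.get("namespace") != namespace:
                continue
            if event.get("verb") in {"get", "list", "watch"}:
                user = (event.get("user") or {}).get("username", "<unknown>")
                # "name" is absent for list/watch requests over a whole namespace.
                yield user, event["verb"], ref.get("name", "*")


    # Hypothetical audit events; real events carry many more fields.
    sample = [
        json.dumps({"verb": "get", "user": {"username": "alice"},
                    "objectRef": {"resource": "secrets", "namespace": "kube-system",
                                  "name": "example-etcd-secret"}}),
        json.dumps({"verb": "get", "user": {"username": "bob"},
                    "objectRef": {"resource": "pods", "namespace": "kube-system",
                                  "name": "example-pod"}}),
    ]
    for user, verb, name in secret_reads(sample, "kube-system"):
        print(f"{user} performed {verb} on secret {name}")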
   641  
   642  Hubble Data Attacker
   643  ~~~~~~~~~~~~~~~~~~~~
   644  
   645  This is an attacker with network reachability to Kubernetes worker
   646  nodes, or other systems that store or expose Hubble data, with the goal
   647  of gaining access to potentially sensitive Hubble flow or process data.
   648  
   649  .. image:: images/cilium_threat_model_hubble_attacker.png
   650  
   651  .. rst-class:: wrapped-table
   652  
   653  +------------------+---------------------------------------------------+
   654  | **Threat         | **Identified STRIDE threats**                     |
   655  | surface**        |                                                   |
   656  +==================+===================================================+
   657  | Cilium pods      | None                                              |
   658  +------------------+---------------------------------------------------+
   659  | Cilium           | None                                              |
   660  | configuration    |                                                   |
   661  +------------------+---------------------------------------------------+
   662  | Cilium eBPF      | None                                              |
   663  | programs         |                                                   |
   664  +------------------+---------------------------------------------------+
   665  | Network data     | None                                              |
   666  +------------------+---------------------------------------------------+
   667  | Observability    | None, assuming correct configuration of the       |
   668  | data             | following:                                        |
   669  |                  |                                                   |
   670  |                  | -  Network policy to limit access to              |
   671  |                  |    ``hubble-relay`` or ``hubble-ui`` services     |
   672  |                  | -  Limited access to ``cilium``,                  |
   673  |                  |    ``hubble-relay``, or ``hubble-ui`` pods        |
   674  |                  | -  TLS for external data export                   |
   675  |                  | -  Security controls at the destination of any    |
   676  |                  |    exported data                                  |
   677  +------------------+---------------------------------------------------+
   678  
   679  Recommended Controls
   680  ^^^^^^^^^^^^^^^^^^^^
   681  
   682  -  Network policies should limit access to the ``hubble-relay`` and
   683     ``hubble-ui`` services
   684  -  Kubernetes RBAC should be used to limit access to any ``cilium-*``
   685     or ``hubble-*`` pods
   686  -  TLS should be configured for access to the Hubble Relay API and Hubble UI
   687  -  TLS should be correctly configured for any data export
   688  -  The destination data stores for exported data should be secured, for
   689     example by applying encryption at rest and cloud-provider-specific RBAC
   690     controls
   691  
   692  Overall Recommendations
   693  -----------------------
   694  
   695  The following controls are recommended when configuring a production
   696  Kubernetes cluster with Cilium:
   697  
   698  #. Ensure that Kubernetes roles are scoped correctly to the requirements of your
   699     users, and that service account permissions for pods are tightly scoped to
   700     the needs of the workloads. In particular, access to sensitive namespaces,
   701     ``exec`` actions, and Kubernetes secrets should all be highly controlled.
   702  #. Use resource limits for workloads where possible to reduce the chance of
   703     denial of service attacks.
   704  #. Ensure that workload privileges and capabilities are only granted when
   705     essential to the functionality of the workload, and ensure that specific
   706     controls to limit and monitor the behavior of the workload are in place.
   707  #. Use :ref:`network policies <network_policy>` to ensure that network traffic in Kubernetes is segregated.
   708  #. Use :ref:`gsg_encryption` in Cilium to ensure that communication between
   709     workloads is secured.
   710  #. Enable Kubernetes audit logging, forward the audit logs to a centralized
   711     monitoring platform, and define alerting for suspicious activity.
   712  #. Enable TLS for access to any externally facing services, such as Hubble Relay
   713     and Hubble UI.
   714  #. Use `Tetragon`_ as a runtime security solution to rapidly detect unexpected
   715     behavior within your Kubernetes cluster.
   716  
   717  If you have questions, suggestions, or would like to help improve Cilium's security
   718  posture, reach out to security@cilium.io.
   719  
   720  .. |br| raw:: html
   721  
   722        <br>