github.com/aporeto-inc/trireme-lib@v10.358.0+incompatible/docs/policy_design.md

# Policy design in Trireme

Trireme includes a powerful policy language that defines authorization policies between containers or processes
(often referred to as Processing Units). This document explains the basic concepts behind Trireme
policies and how to get started defining your own.

As a user of the Trireme library, you need to implement a `Policy Resolver` interface that fully
defines the policies that apply to your traffic.

The example included with Trireme can be used as a starting point for implementing your own `Policy Resolver`.

# Trireme Cluster

When using Trireme, two different perimeters are defined:
* Trireme endpoints: the set of all Processing Units (typically containers) that implement authorization.
The related policies are captured through the `Trireme Internal Policies`.
* Outside world: everything else, which is not explicitly authorized. Traffic to those endpoints is
managed through the `External policies`.

The complex and granular Trireme policies can only be applied if both the source and the destination are part of the Trireme endpoints.
In any other case, a standard set of ACLs is applied in egress and ingress.

## Trireme CIDR

Trireme is typically installed inside a private cluster, that is, a large set of servers under the same
administrative control. Each node in the cluster gets one Trireme agent installed.
We recommend that the endpoints used inside the private Trireme cluster use a well-defined network CIDR,
although this is not mandatory. The endpoint addresses are the Processing Unit (typically Docker) IPs that
are used on your cluster and policed through the Trireme agent.
The server on which the Trireme agent runs is typically not an endpoint,
but the containers or processes that run on that server are endpoints.

This can be, for example, `10.0.0.0/8` and `172.17.0.0/16`. This set is referred to as the `Trireme CIDRs`
and can be composed of a large number of independent CIDRs.

The `Trireme CIDRs` are given as a parameter to the Trireme agent at startup. The agent uses those CIDRs to
decide whether a socket endpoint is inside your Trireme cluster, and therefore whether
the Trireme metadata needs to be added to the socket.
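
The membership check described above can be sketched with the standard `net` package. This is only an illustration of the idea, not the agent's actual code: the helper names `parseCIDRs` and `inTriremeCIDRs` are hypothetical and do not come from the Trireme library.

```go
package main

import (
	"fmt"
	"net"
)

// parseCIDRs converts CIDR strings into parsed networks.
// It fails on the first invalid entry. (Hypothetical helper.)
func parseCIDRs(cidrs []string) ([]*net.IPNet, error) {
	nets := make([]*net.IPNet, 0, len(cidrs))
	for _, c := range cidrs {
		_, n, err := net.ParseCIDR(c)
		if err != nil {
			return nil, fmt.Errorf("invalid CIDR %q: %w", c, err)
		}
		nets = append(nets, n)
	}
	return nets, nil
}

// inTriremeCIDRs reports whether ip falls inside any of the given networks.
// An unparsable address is treated as outside the cluster.
func inTriremeCIDRs(nets []*net.IPNet, ip string) bool {
	parsed := net.ParseIP(ip)
	if parsed == nil {
		return false
	}
	for _, n := range nets {
		if n.Contains(parsed) {
			return true
		}
	}
	return false
}

func main() {
	// The two example CIDRs from the text above.
	nets, err := parseCIDRs([]string{"10.0.0.0/8", "172.17.0.0/16"})
	if err != nil {
		panic(err)
	}
	fmt.Println(inTriremeCIDRs(nets, "172.17.0.5")) // inside 172.17.0.0/16
	fmt.Println(inTriremeCIDRs(nets, "8.8.8.8"))    // outside both CIDRs
}
```

Because the `Trireme CIDRs` can be a set of independent networks, the check returns true as soon as any one of them contains the address.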

Trireme also supports an auto-discovery mechanism that automatically detects these endpoints. The
auto-discovery assumes that all endpoints are Trireme-enabled and initiates an authorization process
with every endpoint. If the authorization fails, Trireme falls back to a list of ACL rules based
on IP addresses.

## Excluding IPs from the `Trireme CIDRs`

In some use cases you want to define a set of CIDRs for Trireme with the
exception of a few well-defined subnets and/or IPs. To achieve this,
Trireme supports an Exclusion API that can exclude specific endpoints from the
general `Trireme CIDRs` dynamically at runtime.

Any set of IPs in the `Trireme CIDRs` that is not going to be policed through the agent needs
to be removed through this Exclusion API. This API is defined in `supervisor/interfaces.go`:

```go
// An Excluder can add/remove specific IPs that are not part of Trireme.
type Excluder interface {

	// AddExcludedIP adds an exception for the given destination IP, allowing all traffic to it.
	AddExcludedIP(ip string) error

	// RemoveExcludedIP removes the exception for the given destination IP.
	RemoveExcludedIP(ip string) error
}
```
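
To make the contract of the interface concrete, here is a minimal in-memory sketch that satisfies it. The `memoryExcluder` type, its `isExcluded` helper, and the choice to return an error when removing an IP that was never excluded are all assumptions made for illustration; the real supervisor programs the datapath rather than keeping a map.

```go
package main

import (
	"fmt"
	"net"
	"sync"
)

// Excluder is copied from the interface shown above.
type Excluder interface {
	AddExcludedIP(ip string) error
	RemoveExcludedIP(ip string) error
}

// memoryExcluder is a hypothetical implementation that only records
// the excluded IPs in a map guarded by a mutex.
type memoryExcluder struct {
	mu       sync.Mutex
	excluded map[string]struct{}
}

func newMemoryExcluder() *memoryExcluder {
	return &memoryExcluder{excluded: make(map[string]struct{})}
}

// AddExcludedIP validates the address and records it as excluded.
func (m *memoryExcluder) AddExcludedIP(ip string) error {
	if net.ParseIP(ip) == nil {
		return fmt.Errorf("invalid IP %q", ip)
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	m.excluded[ip] = struct{}{}
	return nil
}

// RemoveExcludedIP drops the exception. Returning an error for an
// unknown IP is a choice of this sketch.
func (m *memoryExcluder) RemoveExcludedIP(ip string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, ok := m.excluded[ip]; !ok {
		return fmt.Errorf("IP %q is not excluded", ip)
	}
	delete(m.excluded, ip)
	return nil
}

// isExcluded reports whether ip currently has an exception.
func (m *memoryExcluder) isExcluded(ip string) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	_, ok := m.excluded[ip]
	return ok
}

// Compile-time check that memoryExcluder satisfies Excluder.
var _ Excluder = (*memoryExcluder)(nil)

func main() {
	ex := newMemoryExcluder()
	if err := ex.AddExcludedIP("10.0.0.53"); err != nil {
		panic(err)
	}
	fmt.Println(ex.isExcluded("10.0.0.53"))
	if err := ex.RemoveExcludedIP("10.0.0.53"); err != nil {
		panic(err)
	}
	fmt.Println(ex.isExcluded("10.0.0.53"))
}
```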

# Whitelist model for Trireme

Trireme uses a whitelist model. That is, everything that is not explicitly allowed is denied.

# General logic for policy application

For traffic reaching the Processing Unit, the following logic is applied:

```
- If traffic source is part of Trireme CIDRs or it has authorization information:
    - If traffic is matched through one of the Trireme rules:
        - If action is ALLOW: Allow traffic.
        - If action is DROP: Drop traffic.
    - Drop unmatched traffic
- If traffic source matches one of the Network ACLs:
    - If action is ALLOW: Allow traffic.
    - If action is DROP: Drop traffic.
- Drop unmatched traffic
```
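
The ingress decision tree above can be written as a small pure function. This is a sketch for reasoning about the order of checks, not the datapath implementation: the `IngressInput` and `Match` types and the `decideIngress` function are hypothetical, and the rule and ACL lookups are assumed to have been done beforehand.

```go
package main

import "fmt"

// Action is the outcome of a policy decision.
type Action int

const (
	Drop Action = iota
	Allow
)

func (a Action) String() string {
	if a == Allow {
		return "ALLOW"
	}
	return "DROP"
}

// Match is the result of a rule or ACL lookup: whether an entry
// matched, and which action that entry carries.
type Match struct {
	Matched bool
	Action  Action
}

// IngressInput gathers the facts the ingress logic needs.
// All field names are assumptions made for this sketch.
type IngressInput struct {
	SourceInTriremeCIDRs bool
	HasAuthorizationInfo bool
	TriremeRule          Match
	NetworkACL           Match
}

// decideIngress follows the listed steps in order.
func decideIngress(in IngressInput) Action {
	if in.SourceInTriremeCIDRs || in.HasAuthorizationInfo {
		if in.TriremeRule.Matched {
			return in.TriremeRule.Action
		}
		// Trireme traffic that matches no rule is dropped
		// without consulting the Network ACLs.
		return Drop
	}
	if in.NetworkACL.Matched {
		return in.NetworkACL.Action
	}
	// Whitelist model: anything unmatched is dropped.
	return Drop
}

func main() {
	fmt.Println(decideIngress(IngressInput{
		SourceInTriremeCIDRs: true,
		TriremeRule:          Match{Matched: true, Action: Allow},
	}))
	fmt.Println(decideIngress(IngressInput{
		NetworkACL: Match{Matched: true, Action: Drop},
	}))
}
```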

For traffic exiting the Processing Unit, the following logic is applied:

```
- If traffic destination is part of Trireme CIDRs:
    - Allow traffic (Add Trireme information to the TCP session)
- If traffic destination matches one of the App. ACLs:
    - If action is ALLOW: Allow traffic.
    - If action is DROP: Drop traffic.
- Drop unmatched traffic
```
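
The egress steps can be sketched the same way. The types and the `decideEgress` function below are hypothetical; the sketch only mirrors the listed order of checks and records whether Trireme information should be added to the TCP session.

```go
package main

import "fmt"

// Action is the outcome of a policy decision.
type Action int

const (
	Drop Action = iota
	Allow
)

// EgressInput gathers the facts the egress logic needs.
// All field names are assumptions made for this sketch.
type EgressInput struct {
	DestinationInTriremeCIDRs bool
	AppACLMatched             bool
	AppACLAction              Action
}

// EgressDecision is the verdict plus a flag telling the caller
// whether Trireme information must be added to the TCP session.
type EgressDecision struct {
	Action         Action
	AddTriremeInfo bool
}

// decideEgress follows the listed steps in order.
func decideEgress(in EgressInput) EgressDecision {
	if in.DestinationInTriremeCIDRs {
		// Traffic towards the cluster is allowed to leave; the
		// receiving side applies the Trireme rules.
		return EgressDecision{Action: Allow, AddTriremeInfo: true}
	}
	if in.AppACLMatched {
		return EgressDecision{Action: in.AppACLAction}
	}
	// Whitelist model: anything unmatched is dropped.
	return EgressDecision{Action: Drop}
}

func main() {
	fmt.Printf("%+v\n", decideEgress(EgressInput{DestinationInTriremeCIDRs: true}))
	fmt.Printf("%+v\n", decideEgress(EgressInput{AppACLMatched: true, AppACLAction: Allow}))
	fmt.Printf("%+v\n", decideEgress(EgressInput{}))
}
```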

# Policies for Trireme traffic

Traffic flowing inside a cluster between two endpoints that are both policed by Trireme is subject to the Trireme policies.

Those policies rely heavily on identity metadata that is sent as part of the Trireme traffic
and encapsulated/decapsulated by the endpoint agents. This metadata consists of labels in the form `Key:Value`,
defined by the Policy Resolver. Each Processing Unit has a set of those labels associated with it.
Each Processing Unit also has a set of Trireme policies that define which remote Trireme Processing
Units are allowed to connect to the local Processing Unit.

A Trireme policy is defined as a logical `OR` of rules, each of which is a logical `AND` of clauses:
* The action of a Trireme policy is applied if at least one of the rules is matched successfully (logical `OR`).
* For a rule to be matched successfully, every clause inside the rule needs to be matched successfully (logical `AND`).

Each clause is built from a `Key`, a set of `Values`, and an `Operator`.
Each clause evaluates to a binary TRUE or FALSE.
The following operators are supported:

* `Equal` returns true if the PU has a label for the `Key` whose `value` equals one of the `values` defined in the policy.

Example:
The clause
```
KEY: App
VALUE: {'nginx', 'centos', 'mysql'}
OPERATOR: Equal
```
will return TRUE for the following PU metadata:
```
Image:centos
App:centos
owner:admin
```

will return FALSE for the following PU metadata:
```
Image:server
owner:root
```

* `NotEqual` returns true if the PU has no label for the `Key` whose `value` equals one of the `values` defined in the policy. As the examples show, a PU that has no label with that key at all also matches.

Example:
The clause
```
KEY: App
VALUE: {'nginx', 'centos', 'mysql'}
OPERATOR: NotEqual
```
will return FALSE for the following PU metadata:
```
Image:centos
App:centos
owner:admin
```

will return TRUE for the following PU metadata:
```
Image:server
owner:root
```

will return TRUE for the following PU metadata:
```
Image:server
owner:root
App:redis
```

* `KeyExists` returns true if the PU has a label with the specified key.

Example:
The clause
```
KEY: App
VALUE: *
OPERATOR: KeyExists
```
will return TRUE for the following PU metadata:
```
Image:centos
App:abcd
owner:admin
```

will return FALSE for the following PU metadata:
```
Image:server
owner:root
```

* `KeyNotExists` returns true if the PU doesn't have a label with the specified key.

Example:
The clause
```
KEY: App
VALUE: *
OPERATOR: KeyNotExists
```
will return FALSE for the following PU metadata:
```
Image:centos
App:centos
owner:admin
```

will return TRUE for the following PU metadata:
```
Image:server
owner:root
```
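
The four operators and the `OR`-of-`AND` structure can be put together in a short sketch. The `Clause`, `Rule` and `Policy` types below are hypothetical and are not the library's own types. The sketch assumes one value per label key, treats a missing key as a match for `NotEqual` (as the examples above do), and treats an empty rule as non-matching to stay within the whitelist model.

```go
package main

import "fmt"

// Operator selects how a clause compares a PU label with its values.
type Operator int

const (
	Equal Operator = iota
	NotEqual
	KeyExists
	KeyNotExists
)

// Clause is a Key, a set of Values and an Operator.
type Clause struct {
	Key    string
	Values []string
	Op     Operator
}

// Rule is a logical AND of clauses; Policy is a logical OR of rules.
type Rule []Clause
type Policy []Rule

func containsValue(values []string, v string) bool {
	for _, candidate := range values {
		if candidate == v {
			return true
		}
	}
	return false
}

// Matches evaluates one clause against the labels of a PU.
func (c Clause) Matches(labels map[string]string) bool {
	value, exists := labels[c.Key]
	switch c.Op {
	case Equal:
		return exists && containsValue(c.Values, value)
	case NotEqual:
		return !exists || !containsValue(c.Values, value)
	case KeyExists:
		return exists
	case KeyNotExists:
		return !exists
	default:
		return false
	}
}

// Matches is true only if every clause of the rule matches.
func (r Rule) Matches(labels map[string]string) bool {
	if len(r) == 0 {
		return false // assumption: an empty rule matches nothing
	}
	for _, c := range r {
		if !c.Matches(labels) {
			return false
		}
	}
	return true
}

// Matches is true if at least one rule of the policy matches.
func (p Policy) Matches(labels map[string]string) bool {
	for _, r := range p {
		if r.Matches(labels) {
			return true
		}
	}
	return false
}

func main() {
	apps := []string{"nginx", "centos", "mysql"}
	pu := map[string]string{"Image": "centos", "App": "centos", "owner": "admin"}
	policy := Policy{
		{{Key: "App", Values: apps, Op: Equal}, {Key: "owner", Op: KeyExists}},
		{{Key: "Image", Op: KeyNotExists}},
	}
	fmt.Println(policy.Matches(pu))
}
```

In `main`, the first rule requires both an `App` label with one of the listed values and the presence of an `owner` label; the second rule alone would match any PU without an `Image` label.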

# Special tags for Port matching

Trireme dynamically adds an extra label per TCP connection that represents the TCP destination port.
That extra label has the following format:
```
@port:xx
```
This label can then be used for matching in any of the previously defined rules, like any other label.
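
As a rough illustration, the port label can be thought of as one more entry added to the label set of the connection before the rules are evaluated. The `withPortLabel` helper and the map representation of labels are assumptions of this sketch, not the library's internals.

```go
package main

import (
	"fmt"
	"strconv"
)

// portKey is the label key described above.
const portKey = "@port"

// withPortLabel returns a copy of the PU labels with the destination
// port of the connection added under the "@port" key. (Hypothetical helper.)
func withPortLabel(labels map[string]string, dstPort uint16) map[string]string {
	out := make(map[string]string, len(labels)+1)
	for k, v := range labels {
		out[k] = v
	}
	out[portKey] = strconv.Itoa(int(dstPort))
	return out
}

func main() {
	labels := withPortLabel(map[string]string{"App": "nginx"}, 443)
	fmt.Println(labels["@port"]) // prints "443"
}
```

A clause with `KEY: @port`, `VALUE: {'443'}` and the `Equal` operator would then match this connection like any other label-based clause.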

# Policies for External traffic

If the source or destination endpoint is not part of the Trireme CIDRs, the policies for external traffic are used.
Those policies are defined as regular network ACLs with network and port matches.

For each Processing Unit, the following two policies are defined:
* Application policy: the traffic that is allowed to originate from the Processing Unit.
* Net policy: the traffic that is allowed to reach the Processing Unit from the network.

Both policies take the form of a set of (Network/Port-range/Protocol type) entries:
* Network is the CIDR of the network traffic we want to allow (Example: `192.169.0.0/16`)
* Port-range can be a single port or any range of ports (Example: `100-200`)
* Protocol type is the L4 protocol type (must be one of `TCP`/`UDP`/`ICMP`)
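
A (Network/Port-range/Protocol type) entry can be modelled and matched as in the sketch below. The `aclRule` type and its helpers are hypothetical and are not the library's types. Skipping the port check for `ICMP`, which has no ports, is a choice of this sketch.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

// aclRule is one (Network/Port-range/Protocol type) entry.
type aclRule struct {
	network  *net.IPNet
	portMin  uint16
	portMax  uint16
	protocol string // "TCP", "UDP" or "ICMP"
}

// parsePortRange accepts a single port ("80") or a range ("100-200").
func parsePortRange(s string) (uint16, uint16, error) {
	parts := strings.SplitN(s, "-", 2)
	lo, err := strconv.ParseUint(strings.TrimSpace(parts[0]), 10, 16)
	if err != nil {
		return 0, 0, fmt.Errorf("invalid port %q: %w", parts[0], err)
	}
	hi := lo
	if len(parts) == 2 {
		hi, err = strconv.ParseUint(strings.TrimSpace(parts[1]), 10, 16)
		if err != nil {
			return 0, 0, fmt.Errorf("invalid port %q: %w", parts[1], err)
		}
	}
	if lo > hi {
		return 0, 0, fmt.Errorf("invalid port range %q", s)
	}
	return uint16(lo), uint16(hi), nil
}

// newACLRule validates the three fields and builds a rule.
func newACLRule(cidr, ports, protocol string) (aclRule, error) {
	_, network, err := net.ParseCIDR(cidr)
	if err != nil {
		return aclRule{}, fmt.Errorf("invalid network %q: %w", cidr, err)
	}
	lo, hi, err := parsePortRange(ports)
	if err != nil {
		return aclRule{}, err
	}
	proto := strings.ToUpper(protocol)
	switch proto {
	case "TCP", "UDP", "ICMP":
	default:
		return aclRule{}, fmt.Errorf("unsupported protocol %q", protocol)
	}
	return aclRule{network: network, portMin: lo, portMax: hi, protocol: proto}, nil
}

// matches reports whether the remote IP, port and protocol fit the rule.
func (r aclRule) matches(ip string, port uint16, protocol string) bool {
	parsed := net.ParseIP(ip)
	if parsed == nil || !r.network.Contains(parsed) {
		return false
	}
	if strings.ToUpper(protocol) != r.protocol {
		return false
	}
	if r.protocol == "ICMP" {
		return true // no ports to compare (choice of this sketch)
	}
	return port >= r.portMin && port <= r.portMax
}

func main() {
	// Values taken from the examples in the list above.
	rule, err := newACLRule("192.169.0.0/16", "100-200", "TCP")
	if err != nil {
		panic(err)
	}
	fmt.Println(rule.matches("192.169.4.2", 150, "tcp"))
	fmt.Println(rule.matches("192.169.4.2", 443, "tcp"))
}
```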