# Cluster API Provider AWS Feature Set

## Introduction

We wish to build a [feature model](https://en.wikipedia.org/wiki/Feature_model) that lets us identify the common functionality that will need to be shared between AWS implementations.

Give each feature a unique number and reference it by that number. Where possible, provide a justification or requirement for the feature, which will help with prioritisation for a minimum viable product.

You can also write constraints:

* Feature B is an optional sub-feature of A
* Feature C is a mandatory sub-feature of B
* Feature E is an alternative to features C and D
* Feature F is mutually exclusive with feature E
* Feature G requires feature B

A minimum viable product is a configuration of features which, when solved against all constraints, yields the minimum list of features that need to be developed.

Different MVPs may be possible (e.g. EKS vs. non-EKS), but they may rely on shared components, which will become the critical path.
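
To make the solving step concrete, here is a minimal sketch in Go (purely illustrative; the `Constraint` type and the encoding of each constraint form are assumptions of this example, not part of any provider API) that checks a candidate configuration against constraints of the kinds listed above:

```go
// Purely illustrative constraint checker for the feature model below.
package main

import "fmt"

// Kind enumerates the constraint forms used in this document.
type Kind int

const (
	MandatorySub Kind = iota // selecting the parent (Target) forces the child (Feature)
	Requires                 // selecting Feature forces Target
	Excludes                 // Feature and Target are mutually exclusive
)

type Constraint struct {
	Kind            Kind
	Feature, Target int
}

// Valid reports whether a candidate configuration (a set of selected
// feature IDs) satisfies every constraint.
func Valid(selected map[int]bool, cs []Constraint) bool {
	for _, c := range cs {
		switch c.Kind {
		case MandatorySub:
			if selected[c.Target] && !selected[c.Feature] {
				return false
			}
		case Requires:
			if selected[c.Feature] && !selected[c.Target] {
				return false
			}
		case Excludes:
			if selected[c.Feature] && selected[c.Target] {
				return false
			}
		}
	}
	return true
}

func main() {
	cs := []Constraint{
		{MandatorySub, 2, 1}, // 2 (create VPC) is a mandatory sub-feature of 1 (VPC selection)
		{Excludes, 46, 47},   // colocated vs. external etcd are alternatives
	}
	fmt.Println(Valid(map[int]bool{1: true, 2: true, 46: true}, cs)) // true
	fmt.Println(Valid(map[int]bool{1: true, 47: true}, cs))          // false: 1 selected without 2
}
```

An MVP search would then enumerate candidate configurations and keep the smallest one that `Valid` accepts.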

## Feature Set

### 0: AWS Cluster Provider

### 1: VPC Selection

#### 2: The provider will need the ability to create a new VPC

* Constraint: Mandatory sub-feature of [1](#1-vpc-selection)
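
For illustration only, a minimal sketch of what feature 2 might look like with aws-sdk-go; the CIDR block and error handling are placeholder choices for this example, not the provider's actual implementation:

```go
// Illustrative VPC creation; a real implementation would also need
// tagging, idempotency, and error handling suited to a reconciliation loop.
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	sess := session.Must(session.NewSession())
	svc := ec2.New(sess)

	// The CIDR block is a placeholder; a provider would take it from
	// the cluster's configuration.
	out, err := svc.CreateVpc(&ec2.CreateVpcInput{
		CidrBlock: aws.String("10.0.0.0/16"),
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("created VPC", aws.StringValue(out.Vpc.VpcId))
}
```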

#### 3: The provider provisions in an existing VPC, selecting the default VPC if none is specified

* Constraint: Optional alternative to [2](#2-the-provider-will-need-the-ability-to-create-a-new-vpc)
* Requirement: Some customers may wish to reuse VPCs in which they have existing infrastructure.

### 45: Etcd location

#### 46: The provider deploys etcd as part of the control plane

* Constraint: Mandatory sub-feature of [45](#45-etcd-location)
* Requirement: For simple clusters, a colocated etcd is the easiest way to operate a cluster

#### 47: The provider deploys etcd externally to the control plane

* Constraint: Alternative to [46](#46-the-provider-deploys-etcd-as-part-of-the-control-plane)
* Requirement: For larger clusters, placing etcd outside the control plane allows the control plane and datastore to scale independently

#### 48: The provider can connect to a pre-existing etcd cluster

* Constraint: Optional sub-feature of [45](#45-etcd-location)
* Requirement: An existing etcd store could be used to replace a cluster during an upgrade or a complete cluster restore.

### 5: Control plane placement

#### 6: The provider deploys the control plane in public subnets

* Constraint: Mandatory sub-feature of [5](#5-control-plane-placement)
* Requirement: For simple clusters without bastion hosts, allowing users break-glass SSH access to control plane nodes

#### 7: The provider deploys the control plane in private subnets

* Constraint: Alternative to [6](#6-the-provider-deploys-the-control-plane-in-public-subnets)
* Requirement: [AWS Well-Architected SEC 5](https://d1.awsstatic.com/whitepapers/architecture/AWS_Well-Architected_Framework.pdf); security requirements may require access via bastion hosts

#### 8: The provider deploys control plane components to a single AZ

* Constraint: Mandatory sub-feature of [5](#5-control-plane-placement)
* Requirement: Architectural requirement for a particular customer workload

#### 9: The provider deploys control plane components across multiple AZs

* Constraint: Alternative to [8](#8-the-provider-deploys-control-plane-components-to-a-single-az)
* Requirement: Robustness of control plane components

### 10: Worker node placement

#### 11: Provider deploys worker nodes to public subnets

* Constraint: Mandatory sub-feature of [10](#10-worker-node-placement)
* Requirement: For simple clusters without bastion hosts, allowing users break-glass SSH access to worker nodes

#### 12: Provider deploys worker nodes to private subnets

* Constraint: Alternative to [11](#11-provider-deploys-worker-nodes-to-public-subnets)
* Requirement: AWS Well-Architected SEC 5; security requirements may require access via bastion hosts / VPN / Direct Connect

#### 13: Provider deploys worker nodes to a single AZ

* Constraint: Mandatory sub-feature of [10](#10-worker-node-placement)
* Requirement: Architectural requirement for a particular customer workload

#### 14: Provider deploys worker nodes across multiple AZs

* Constraint: Alternative to [13](#13-provider-deploys-worker-nodes-to-a-single-az)
* Requirement: Robustness of the cluster

#### 15: Deploy worker nodes to a placement group

* Constraint: Optional sub-feature of [10](#10-worker-node-placement)
* Requirement: HPC-type workloads that require fast interconnect between nodes

#### 16: The provider deploys worker nodes to shared instances

* Constraint: Mandatory sub-feature of [10](#10-worker-node-placement)
* Requirement: Default behaviour / cost / availability of instances

#### 17: The provider deploys worker nodes to dedicated EC2 instances

* Constraint: Optional alternative to [16](#16-the-provider-deploys-worker-nodes-to-shared-instances)
* Requirement: Licensing requirements for a particular workload (e.g. Oracle) may require a dedicated instance

### 18: Worker node scaling methodology

#### 19: Worker nodes are deployed individually or in batches not using auto-scaling groups

* Constraint: Mandatory sub-feature of [18](#18-worker-node-scaling-methodology)

#### 20: Worker nodes are deployed via Auto-Scaling Groups using MachineSets

* Constraint: Alternative to [19](#19-worker-nodes-are-deployed-individually-or-in-batches-not-using-auto-scaling-groups)
* Note: The implementation here would be significantly different to [19](#19-worker-nodes-are-deployed-individually-or-in-batches-not-using-auto-scaling-groups).

### 21: API Server Access

#### 22: The API server is publicly accessible

* Constraint: Mandatory sub-feature of [21](#21-api-server-access)
* Requirement: Standard way of accessing Kubernetes

#### 23: The API server is not publicly accessible

* Constraint: Alternative to [22](#22-the-api-server-is-publicly-accessible)
* Requirement: Security requirements (e.g. someone’s interpretation of UK OFFICIAL) may prohibit making the API server endpoint publicly accessible

#### 31: The API server is connected to a VPC via PrivateLink

* Constraint: Sub-feature of [23](#23-the-api-server-is-not-publicly-accessible) & [25](#25-the-control-plane-is-eks)
* Requirement: Compliance requirements for API traffic not to transit the public internet, e.g. UK OFFICIAL-SENSITIVE workloads. AWS recommends, for FedRAMP(?) and UK OFFICIAL, using VPC or PrivateLink endpoints to connect publicly accessible regional services to VPCs, preventing traffic from exiting the internal AWS network. The actual EKS endpoint, for example, may still present itself with a public load balancer endpoint even if it is connected to the VPC by PrivateLink.

#### 43: The API server is accessible via a load balancer

* Constraint: Sub-feature of [22](#22-the-api-server-is-publicly-accessible) & [23](#23-the-api-server-is-not-publicly-accessible)
* Requirement: For HA access, or to preserve public/private subnet distinctions, the API server is accessed via an AWS load balancer.

#### 44: The API server is accessed directly via the IP addresses of cluster nodes hosting the API server

* Constraint: Alternative to [43](#43-the-api-server-is-accessible-via-a-load-balancer)
* Requirement: The IP address of each node hosting an API server is registered in DNS

### 24: Type of control plane

#### 25: The control plane is EKS

* Constraint: Mandatory sub-feature of [24](#24-type-of-control-plane)
* Requirement: Leverage AWS to do the heavy lifting of control plane deployment & operations. Also meets compliance requirements: [UK OFFICIAL IL2 & OFFICIAL-SENSITIVE/HIGH IL3](https://www.digitalmarketplace.service.gov.uk/g-cloud/services/760447139328659)

#### 26: The control plane is managed within the provider

* Constraint: Alternative to [25](#25-the-control-plane-is-eks)
* Requirement: Customer requires functionality not provided by EKS (e.g. admission controllers, a non-public API endpoint)

### 33: CRI

#### 34: The provider deploys a [credential helper](https://github.com/awslabs/amazon-ecr-credential-helper) for ECR

* Constraint: Mandatory sub-feature of [33](#33-cri)
* Requirement: [AWS Well-Architected SEC 3](https://d1.awsstatic.com/whitepapers/architecture/AWS_Well-Architected_Framework.pdf)
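
As an illustration of how the helper is wired up: with the `docker-credential-ecr-login` binary from that project on the node's `PATH`, the Docker configuration (`~/.docker/config.json`) routes ECR registries to it. The account ID and region below are placeholders:

```json
{
  "credHelpers": {
    "<aws_account_id>.dkr.ecr.<region>.amazonaws.com": "ecr-login"
  }
}
```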

### 35: Container Hosts

#### 36: The provider deploys to Amazon Linux 2

* Constraint: Mandatory sub-feature of [35](#35-container-hosts)
* Requirement: Parity with AWS recommendations

#### 37: The provider deploys to CentOS / Ubuntu

* Constraint: Alternative to [36](#36-the-provider-deploys-to-amazon-linux-2)
* Requirement: Greater familiarity in the community (particularly Ubuntu); possible organisational requirements

#### 38: The provider deploys from arbitrary AMIs

* Constraint: Alternative to [36](#36-the-provider-deploys-to-amazon-linux-2). Sub-feature of [37](#37-the-provider-deploys-to-centos--ubuntu)
* Requirement: Compliance requirements may demand an AMI that passes the [CIS Distribution Independent Linux Benchmark](https://www.cisecurity.org/benchmark/distribution_independent_linux/), or EC2 instances with encrypted EBS root volumes for data loss prevention, requiring AMIs in the customer’s account.

#### 39: The provider allows kubelet configuration to be customised, e.g. “--allow-privileged”

* Constraint: Sub-feature of [35](#35-container-hosts)
* Requirement: [NIST 800-190, control 4.4.3](https://nvlpubs.nist.gov/nistpubs/specialpublications/nist.sp.800-190.pdf) recommends disabling “allow-privileged”
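
A hedged sketch of what this customisation could look like if node bootstrapping goes through kubeadm (an assumption of this example; the document does not mandate a bootstrap mechanism):

```yaml
# Sketch only: assumes kubeadm-based bootstrapping. Passes a kubelet
# flag at node registration time.
apiVersion: kubeadm.k8s.io/v1beta1
kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    allow-privileged: "false"   # NIST 800-190 control 4.4.3 recommendation
```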

#### 40: Arbitrary customisation of bootstrap script

* Constraint: Sub-feature of [35](#35-container-hosts)
* Requirement: Organisational or security requirements. For example, NIST 800-190 & AWS Well-Architected controls recommend installing additional file integrity tools such as OSSEC, Tripwire, etc., and some organisations may even mandate antivirus. We cannot encode all of this as additional options, so some mix of [38](#38-the-provider-deploys-from-arbitrary-amis) plus the ability to customise the bootstrap script would satisfy this without bringing too much variability into scope. A sketch follows below.
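
Purely as a sketch of the shape this might take, a user-supplied cloud-init fragment merged into the instance user data could install such tooling; the package and service names below are hypothetical:

```yaml
#cloud-config
# Hypothetical user-supplied fragment; package/service names are placeholders.
packages:
  - ossec-hids
runcmd:
  - [systemctl, enable, --now, ossec]
```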

### 41: API Server configuration

#### 42: The provider allows customisation of the API Server

* Constraint: Sub-feature of [41](#41-api-server-configuration)
* Requirement: Example: Istio’s automatic sidecar injection would need an admission controller to be enabled (note: EKS doesn’t allow customisation of webhooks at present, but may in the future).

## TODO

* HA / non-HA installs?

## Out of Scope

Anything that can be applied with kubectl after the cluster has come up is not a cluster-api responsibility, including:

* Monitoring / Logging
* Many of the CNI options (at least Calico & the AWS VPC CNI)
* IAM identity for pods (e.g. kube2iam, kiam, etc.)
* ALB ingress

These should be addressed with documentation; we do not want the cluster-api provider to become a package manager for Kubernetes manifests. In addition, @roberthbailey stated on 2018/08/14 that Google is working on a declarative cluster add-on manager to be presented to sig-cluster-lifecycle for discussion later.