# Terraform infrastructure

This folder contains Terraform resources for provisioning a Nomad cluster on
EC2 instances on AWS to use as the target of end-to-end tests.

Terraform provisions the AWS infrastructure assuming that EC2 AMIs have
already been built via Packer. It deploys a specific build of Nomad to the
cluster along with configuration files for Nomad, Consul, and Vault.

## Setup

You'll need Terraform 0.13+, as well as AWS credentials to create the Nomad
cluster. This Terraform stack assumes that an appropriate instance role has
been configured elsewhere and that you have the ability to `AssumeRole` into
the AWS account.

Optionally, edit the `terraform.tfvars` file to change the number of Linux or
Windows clients. The Terraform variables file `terraform.full.tfvars` is for
the nightly E2E test run and deploys a larger, more diverse set of test
targets.

```hcl
region                           = "us-east-1"
instance_type                    = "t2.medium"
server_count                     = "3"
client_count_ubuntu_bionic_amd64 = "4"
client_count_windows_2016_amd64  = "1"
profile                          = "dev-cluster"
```

Run `terraform apply` to deploy the infrastructure:

```sh
cd e2e/terraform/
terraform apply
```

> Note: You will likely see "Connection refused" or "Permission denied" errors
> in the logs as the provisioning script run by Terraform hits an instance
> where the ssh service isn't yet ready. That's ok and expected; they'll get
> retried. In particular, Windows instances can take a few minutes before ssh
> is ready.

## Nomad Version

You'll need to pass one of the following variables, either in your
`terraform.tfvars` file or as a command line argument (ex. `terraform apply
-var 'nomad_version=0.10.2+ent'`):

* `nomad_local_binary`: provision this specific local binary of Nomad. This is
  a path to a Nomad binary on your own host. Ex. `nomad_local_binary =
  "/home/me/nomad"`. This setting overrides `nomad_sha` and `nomad_version`.
* `nomad_sha`: provision this specific SHA from S3. This is a Nomad binary
  identified by its full commit SHA, stored in a shared S3 bucket that Nomad
  team developers can access. That commit SHA can be from any branch that's
  pushed to remote. Ex. `nomad_sha =
  "0b6b475e7da77fed25727ea9f01f155a58481b6c"`. This setting overrides
  `nomad_version`.
* `nomad_version`: provision this version from
  [releases.hashicorp.com](https://releases.hashicorp.com/nomad).
  Ex. `nomad_version = "0.10.2+ent"`.

If you want to deploy the Enterprise build of a specific SHA, include
`-var 'nomad_enterprise=true'`.

If you want to bootstrap Nomad ACLs, include `-var 'nomad_acls=true'`.

> Note: If you bootstrap ACLs, you will see "No cluster leader" in the output
> several times while the ACL bootstrap script polls the cluster to start and
> elect a leader.
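For example, assuming you've built a Nomad binary on your own machine, a
single apply that deploys that binary and bootstraps ACLs might look like the
following sketch (the binary path is illustrative):

```sh
# Deploy a locally built binary and bootstrap ACLs.
# nomad_local_binary overrides nomad_sha and nomad_version.
terraform apply \
  -var 'nomad_local_binary=/home/me/nomad' \
  -var 'nomad_acls=true'
```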
## Profiles

The `profile` field selects a set of configuration files for Nomad, Consul,
and Vault by uploading the files found in `./config/<profile>`. The standard
profiles are as follows:

* `full-cluster`: This profile is used for nightly E2E testing. It assumes at
  least 3 servers and includes a unique config for each Nomad client.
* `dev-cluster`: This profile is used for developer testing of a more limited
  set of clients. It assumes at least 3 servers but uses one config for all
  the Linux Nomad clients and one config for all the Windows Nomad clients.

You may create additional profiles for testing more complex interactions
between features. You can build your own custom profile by writing config
files to the `./config/<custom name>` directory.

For each profile, application (Nomad, Consul, Vault), and agent type
(`server`, `client_linux`, or `client_windows`), the agent gets the following
configuration files, ignoring any that are missing:

* `./config/<profile>/<application>/*`: base configurations shared between all
  servers and clients.
* `./config/<profile>/<application>/<type>/*`: base configurations shared
  between all agents of this type.
* `./config/<profile>/<application>/<type>/indexed/*<index>.<ext>`: a
  configuration for that particular agent, where the index value is the index
  of that agent within the total count.

For example, with the `full-cluster` profile, the 2nd Nomad server would get
the following configuration files:

* `./config/full-cluster/nomad/base.hcl`
* `./config/full-cluster/nomad/server/indexed/server-1.hcl`

The directory `./config/full-cluster/nomad/server` has no configuration files,
so it's safely skipped.

## Outputs

After deploying the infrastructure, you can get connection information about
the cluster:

- `$(terraform output environment)` will set your current shell's
  `NOMAD_ADDR` and `CONSUL_HTTP_ADDR` to point to one of the cluster's
  server nodes, and set the `NOMAD_E2E` variable.
- `terraform output servers` will output the list of server node IPs.
- `terraform output linux_clients` will output the list of Linux
  client node IPs.
- `terraform output windows_clients` will output the list of Windows
  client node IPs.
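As a quick sanity check after deploying, you can load these outputs into your
shell and query the cluster directly (a minimal sketch; it assumes you have
the `nomad` and `consul` CLIs installed locally):

```sh
$(terraform output environment)   # sets NOMAD_ADDR, CONSUL_HTTP_ADDR, NOMAD_E2E
nomad server members              # should show the cluster's server nodes
consul members                    # should show both servers and clients
```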
## SSH

You can use the Terraform outputs above to access nodes via ssh:

```sh
ssh -i keys/nomad-e2e-*.pem ubuntu@${EC2_IP_ADDR}
```

The Windows client runs OpenSSH for convenience, but has a different user and
will drop you into a PowerShell shell instead of bash:

```sh
ssh -i keys/nomad-e2e-*.pem Administrator@${EC2_IP_ADDR}
```

## Teardown

The Terraform state file stores everything needed to tear the cluster down:

```sh
cd e2e/terraform/
terraform destroy
```

## FAQ

#### E2E Provisioning Goals

1. The provisioning process should be able to run a nightly build against a
   variety of OS targets.
2. The provisioning process should be able to support update-in-place
   tests. (See [#7063](https://github.com/hashicorp/nomad/issues/7063))
3. A developer should be able to quickly stand up a small E2E cluster and
   provision it with a version of Nomad they've built on their laptop. The
   developer should be able to send updated builds to that cluster with a
   short iteration time, rather than having to rebuild the cluster.

#### Why not just drop all the provisioning into the AMI?

While that's the "correct" production approach for cloud infrastructure, it
creates a few pain points for testing:

* Creating a Linux AMI takes >10min, and creating a Windows AMI can take
  15-20min. This interferes with goal (3) above.
* We won't be able to do in-place upgrade testing without having an in-place
  provisioning process anyways. This interferes with goal (2) above.

#### Why not just drop all the provisioning into the user data?

* Userdata scripts are executed on boot, which prevents using them for
  in-place upgrade testing.
* Userdata scripts are not very observable, and it's painful to determine
  whether they've failed or simply haven't finished yet before trying to run
  tests.