
# Terraform infrastructure

This folder contains Terraform resources for provisioning a Nomad cluster on
EC2 instances on AWS to use as the target of end-to-end tests.

Terraform provisions the AWS infrastructure assuming that EC2 AMIs have
already been built via Packer. It deploys a specific build of Nomad to the
cluster along with configuration files for Nomad, Consul, and Vault.

## Setup

You'll need Terraform 0.13+, as well as AWS credentials to create the Nomad
cluster. This Terraform stack assumes that an appropriate instance role has
been configured elsewhere and that you have the ability to `AssumeRole` into
the AWS account.

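As a quick sanity check before applying, something like the following confirms
your credentials and Terraform version (the profile name here is illustrative,
not something this repo provides):

```sh
export AWS_PROFILE=nomad-e2e    # hypothetical CLI profile that assumes the role
aws sts get-caller-identity     # confirm which account/role you're acting as
terraform version               # should report Terraform 0.13 or newer
```
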
Optionally, edit the `terraform.tfvars` file to change the number of Linux
clients or Windows clients. The Terraform variables file
`terraform.full.tfvars` is for the nightly E2E test run and deploys a larger,
more diverse set of test targets.

```hcl
region                           = "us-east-1"
instance_type                    = "t2.medium"
server_count                     = "3"
client_count_ubuntu_bionic_amd64 = "4"
client_count_windows_2016_amd64  = "1"
profile                          = "dev-cluster"
```

Run `terraform apply` to deploy the infrastructure:

```sh
cd e2e/terraform/
terraform apply
```

> Note: You will likely see "Connection refused" or "Permission denied" errors
> in the logs as the provisioning script run by Terraform hits an instance
> where the ssh service isn't yet ready. That's ok and expected; they'll get
> retried. In particular, Windows instances can take a few minutes before ssh
> is ready.

## Nomad Version

You'll need to pass one of the following variables, either in your
`terraform.tfvars` file or as a command line argument
(ex. `terraform apply -var 'nomad_version=0.10.2+ent'`).

* `nomad_local_binary`: provision this specific local binary of Nomad. This is
  a path to a Nomad binary on your own host. Ex. `nomad_local_binary =
  "/home/me/nomad"`. This setting overrides `nomad_sha` and `nomad_version`.
* `nomad_sha`: provision this specific SHA from S3. This is a Nomad binary
  identified by its full commit SHA that's stored in a shared S3 bucket that
  Nomad team developers can access. That commit SHA can be from any branch
  that's pushed to remote. Ex. `nomad_sha =
  "0b6b475e7da77fed25727ea9f01f155a58481b6c"`. This setting overrides
  `nomad_version`.
* `nomad_version`: provision this version from
  [releases.hashicorp.com](https://releases.hashicorp.com/nomad). Ex. `nomad_version
  = "0.10.2+ent"`.

If you want to deploy the Enterprise build of a specific SHA, include
`-var 'nomad_enterprise=true'`.

If you want to bootstrap Nomad ACLs, include `-var 'nomad_acls=true'`.

> Note: If you bootstrap ACLs you will see "No cluster leader" in the output
> several times while the ACL bootstrap script polls the cluster to start and
> elect a leader.

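For example, a single `terraform apply` can combine these variables. Here's a
sketch that deploys the Enterprise build of a specific SHA with ACLs
bootstrapped, using the example SHA from above:

```sh
terraform apply \
  -var 'nomad_sha=0b6b475e7da77fed25727ea9f01f155a58481b6c' \
  -var 'nomad_enterprise=true' \
  -var 'nomad_acls=true'
```
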
## Profiles

The `profile` field selects from a set of configuration files for Nomad,
Consul, and Vault by uploading the files found in `./config/<profile>`. The
standard profiles are as follows:

* `full-cluster`: This profile is used for nightly E2E testing. It assumes at
  least 3 servers and includes a unique config for each Nomad client.
* `dev-cluster`: This profile is used for developer testing of a more limited
  set of clients. It assumes at least 3 servers but uses one config for all
  the Linux Nomad clients and one config for all the Windows Nomad clients.

You can also build your own custom profile for testing more complex
interactions between features by writing config files to the
`./config/<custom name>` directory.

For each profile, application (Nomad, Consul, Vault), and agent type
(`server`, `client_linux`, or `client_windows`), the agent gets the following
configuration files, ignoring any that are missing:

* `./config/<profile>/<application>/*`: base configurations shared between all
  servers and clients.
* `./config/<profile>/<application>/<type>/*`: base configurations shared
  between all agents of this type.
* `./config/<profile>/<application>/<type>/indexed/*<index>.<ext>`: a
  configuration for that particular agent, where the index value is the index
  of that agent within the total count.

For example, with the `full-cluster` profile, the 2nd Nomad server (index 1)
would get the following configuration files:

* `./config/full-cluster/nomad/base.hcl`
* `./config/full-cluster/nomad/server/indexed/server-1.hcl`

The directory `./config/full-cluster/nomad/server` has no configuration files,
so that's safely skipped.

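To make the layering concrete, a hypothetical custom profile for a
three-server cluster might contain files like these (the profile and file
names are illustrative, not files that exist in this repo):

```sh
$ find config/my-profile -type f
config/my-profile/nomad/base.hcl
config/my-profile/nomad/client_linux/clients.hcl
config/my-profile/nomad/server/indexed/server-0.hcl
config/my-profile/nomad/server/indexed/server-1.hcl
config/my-profile/nomad/server/indexed/server-2.hcl
```

With this layout, every Nomad agent gets `base.hcl`, Linux clients
additionally get `clients.hcl`, and each server gets its own indexed file.
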
## Outputs

After deploying the infrastructure, you can get connection information
about the cluster:

- `$(terraform output environment)` will set your current shell's
  `NOMAD_ADDR` and `CONSUL_HTTP_ADDR` to point to one of the cluster's
  server nodes, and set the `NOMAD_E2E` variable (see the example below).
- `terraform output servers` will output the list of server node IPs.
- `terraform output linux_clients` will output the list of Linux
  client node IPs.
- `terraform output windows_clients` will output the list of Windows
  client node IPs.

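For instance, after `terraform apply` you might verify connectivity with the
standard Nomad and Consul CLIs (a sketch, assuming both binaries are on your
PATH):

```sh
$(terraform output environment)   # sets NOMAD_ADDR, CONSUL_HTTP_ADDR, NOMAD_E2E
nomad server members              # should list the cluster's servers
consul members                    # should list server and client nodes
```
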
## SSH

You can use the Terraform outputs above to access nodes via ssh:

```sh
ssh -i keys/nomad-e2e-*.pem ubuntu@${EC2_IP_ADDR}
```

The Windows client runs OpenSSH for convenience, but has a different
user and will drop you into a PowerShell session instead of bash:

```sh
ssh -i keys/nomad-e2e-*.pem Administrator@${EC2_IP_ADDR}
```

## Teardown

The Terraform state file stores all the information needed to tear down the
infrastructure:

```sh
cd e2e/terraform/
terraform destroy
```

## FAQ

#### E2E Provisioning Goals

1. The provisioning process should be able to run a nightly build against a
   variety of OS targets.
2. The provisioning process should be able to support update-in-place
   tests. (See [#7063](https://github.com/hashicorp/nomad/issues/7063))
3. A developer should be able to quickly stand up a small E2E cluster and
   provision it with a version of Nomad they've built on their laptop. The
   developer should be able to send updated builds to that cluster with a short
   iteration time, rather than having to rebuild the cluster (see the sketch
   after this list).

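In practice, that iteration loop for goal (3) can be as simple as re-applying
with a freshly built local binary (a sketch; the build command and binary path
are illustrative):

```sh
# rebuild Nomad locally, however you normally do that
make dev

# re-apply so provisioning pushes the new local binary to the existing cluster
terraform apply -var 'nomad_local_binary=/home/me/nomad'
```
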
#### Why not just drop all the provisioning into the AMI?

While that's the "correct" production approach for cloud infrastructure, it
creates a few pain points for testing:

* Creating a Linux AMI takes >10min, and creating a Windows AMI can take
  15-20min. This interferes with goal (3) above.
* We won't be able to do in-place upgrade testing without having an in-place
  provisioning process anyway. This interferes with goal (2) above.

#### Why not just drop all the provisioning into the user data?

* User data is executed on boot, which prevents using it for in-place upgrade
  testing.
* User data scripts are not very observable, and it's painful to determine
  whether they've failed or simply haven't finished yet before trying to run
  tests.