---
layout: "guides"
page_title: "Vault PKI Secrets Engine Integration"
sidebar_current: "guides-security-vault-pki"
description: |-
  Securing Nomad's cluster communication with TLS is important for both
  security and easing operations. Nomad can use mutual TLS (mTLS) to
  authenticate all HTTP and RPC communication. This guide will leverage
  Vault's PKI secrets engine to accomplish this task.
---

# Vault PKI Secrets Engine Integration

You can use [Consul Template][consul-template] in your Nomad cluster to
integrate with Vault's [PKI Secrets Engine][pki-engine] to generate and renew
dynamic X.509 certificates. By using this method, you enable each node to have a
unique certificate with a relatively short TTL. This feature, along with
automatic certificate rotation, allows you to safely and securely scale your
cluster while using mutual TLS (mTLS).

## Reference Material

- [Vault PKI Secrets Engine][pki-engine]
- [Consul Template][consul-template-github]
- [Build Your Own Certificate Authority (CA)][vault-ca-learn]

## Estimated Time to Complete

25 minutes

## Challenge

Secure your existing Nomad cluster with mTLS. Configure a root and intermediate
CA in Vault and ensure (with the help of Consul Template) that you are
periodically renewing your X.509 certificates on all nodes to maintain a healthy
cluster state.

## Solution

Enable TLS in your Nomad cluster configuration. Additionally, configure Consul
Template on all nodes along with the appropriate templates to communicate with
Vault and ensure all nodes are dynamically generating/renewing their X.509
certificates.

## Prerequisites

To perform the tasks described in this guide, you need to have a Nomad
environment with Consul and Vault installed.
You can use this [repo][repo] to
easily provision a sandbox environment. This guide will assume a cluster with
one server node and three client nodes.

~> **Please Note:** This guide is for demo purposes and is only using a single
Nomad server with a Vault server configured alongside it. In a production
cluster, 3 or 5 Nomad server nodes are recommended along with a separate Vault
cluster. Please see [Vault Reference Architecture][vault-ra] to learn how to
securely deploy a Vault cluster.

## Steps

### Step 1: Initialize Vault Server

Run the following command to initialize Vault server and receive an
[unseal][seal] key and initial root [token][token] (if you are running the
environment provided in this guide, the Vault server is co-located with the
Nomad server). Be sure to note the unseal key and initial root token as you will
need these two pieces of information.

```shell
$ vault operator init -key-shares=1 -key-threshold=1
```

The `vault operator init` command above creates a single Vault unseal key for
convenience. For a production environment, it is recommended that you create at
least five unseal key shares and securely distribute them to independent
operators. The `vault operator init` command defaults to five key shares and a
key threshold of three. If you provisioned more than one server, the others will
become standby nodes but should still be unsealed.

### Step 2: Unseal Vault

Run the following command and then provide your unseal key to Vault.

```shell
$ vault operator unseal
```

The output of unsealing Vault will look similar to the following:

```shell
Key                    Value
---                    -----
Seal Type              shamir
Initialized            true
Sealed                 false
Total Shares           1
Threshold              1
Version                1.0.3
Cluster Name           vault-cluster-d1b6513f
Cluster ID             87d6d13f-4b92-60ce-1f70-41a66412b0f1
HA Enabled             true
HA Cluster             n/a
HA Mode                standby
Active Node Address    <none>
```

### Step 3: Log in to Vault

Use the [login][login] command to authenticate yourself against Vault using the
initial root token you received earlier. You will need to authenticate to run
the necessary commands to write policies, create roles, and configure your root
and intermediate CAs.

```shell
$ vault login <your initial root token>
```

If your login is successful, you will see output similar to what is shown below:

```shell
Success! You are now authenticated. The token information displayed below
is already stored in the token helper. You do NOT need to run "vault login"
again. Future Vault requests will automatically use this token.
...
```

### Step 4: Generate the Root CA

Enable the [PKI secrets engine][pki-engine] at the `pki` path:

```shell
$ vault secrets enable pki
```

Tune the PKI secrets engine to issue certificates with a maximum time-to-live
(TTL) of 87600 hours:

```shell
$ vault secrets tune -max-lease-ttl=87600h pki
```

* Please note: we are using a common and recommended pattern which is to have
  one mount act as the root CA and to use this CA only to sign intermediate CA
  CSRs from other PKI secrets engines (which we will create in the next few
  steps). For tighter security, you can store your CA outside of Vault and use
  the PKI engine only as an intermediate CA.

Generate the root certificate and save the certificate as `CA_cert.crt`:

```shell
$ vault write -field=certificate pki/root/generate/internal \
    common_name="global.nomad" ttl=87600h > CA_cert.crt
```

### Step 5: Generate the Intermediate CA and CSR

Enable the PKI secrets engine at the `pki_int` path:

```shell
$ vault secrets enable -path=pki_int pki
```

Tune the PKI secrets engine at the `pki_int` path to issue certificates with a
maximum time-to-live (TTL) of 43800 hours:

```shell
$ vault secrets tune -max-lease-ttl=43800h pki_int
```

Generate a CSR from your intermediate CA and save it as `pki_intermediate.csr`:

```shell
$ vault write -format=json pki_int/intermediate/generate/internal \
    common_name="global.nomad Intermediate Authority" \
    ttl="43800h" | jq -r '.data.csr' > pki_intermediate.csr
```

### Step 6: Sign the CSR and Configure Intermediate CA Certificate

Sign the intermediate CA CSR with the root certificate and save the generated
certificate as `intermediate.cert.pem`:

```shell
$ vault write -format=json pki/root/sign-intermediate \
    csr=@pki_intermediate.csr format=pem_bundle \
    ttl="43800h" | jq -r '.data.certificate' > intermediate.cert.pem
```

Once the CSR is signed and the root CA returns a certificate, it can be imported
back into Vault:

```shell
$ vault write pki_int/intermediate/set-signed certificate=@intermediate.cert.pem
```

### Step 7: Create a Role

A role is a logical name that maps to a policy used to generate credentials. In
our example, it will allow you to use [configuration
parameters][config-parameters] that specify certificate common names, designate
alternate names, and enable subdomains along with a few other key settings.

Create a role named `nomad-cluster` that specifies the allowed domains, enables
you to create certificates for subdomains, and generates certificates with a TTL
of 86400 seconds (24 hours).

```
$ vault write pki_int/roles/nomad-cluster allowed_domains=global.nomad \
    allow_subdomains=true max_ttl=86400s require_cn=false generate_lease=true
```

You should see the following output if the command you issued was successful:

```
Success! Data written to: pki_int/roles/nomad-cluster
```

### Step 8: Create a Policy to Access the Role Endpoint

Recall from [Step 1](#step-1-initialize-vault-server) that we generated a root
token that we used to log in to Vault. Although we could use that token in our
next steps to generate our TLS certs, the recommended security approach is to
create a new token based on a specific policy with limited privileges.

Create a policy file named `tls-policy.hcl` and provide it the following
contents:

```
path "pki_int/issue/nomad-cluster" {
  capabilities = ["update"]
}
```

Note that we are specifying the `update` [capability][capability] on the
path `pki_int/issue/nomad-cluster`. All other privileges will be denied. You can
read more about Vault policies [here][policies].

Write the policy we just created into Vault:

```
$ vault policy write tls-policy tls-policy.hcl
Success! Uploaded policy: tls-policy
```

### Step 9: Generate a Token based on `tls-policy`

Create a token based on `tls-policy` with the following command:

```
$ vault token create -policy="tls-policy" -ttl=24h
```

If the command is successful, you will see output similar to the following:

```shell
Key                  Value
---                  -----
token                s.xafiYzh7MCMotHLu2d35hepR
token_accessor       9vj7q5nnF53JAcTyxvccpAZK
token_duration       24h
token_renewable      true
token_policies       ["default" "tls-policy"]
identity_policies    []
policies             ["default" "tls-policy"]
```

Make a note of this token as you will need it in the upcoming steps.

### Step 10: Create and Populate the Templates Directory

We need to create templates that Consul Template can use to render the actual
certificates and keys on the nodes in our cluster. In this guide, we will place
these templates in `/opt/nomad/templates`.

Create a directory called `templates` in `/opt/nomad`:

```shell
$ sudo mkdir /opt/nomad/templates
```

Below are the templates that the Consul Template configuration will use. We will
provide different templates to the nodes depending on whether they are server
nodes or client nodes. All of the nodes will get the CLI templates (since we
want to use the CLI on any of the nodes).

**For Nomad Servers**:

*agent.crt.tpl*:

```
{{ with secret "pki_int/issue/nomad-cluster" "common_name=server.global.nomad" "ttl=24h" "alt_names=localhost" "ip_sans=127.0.0.1"}}
{{ .Data.certificate }}
{{ end }}
```

*agent.key.tpl*:

```
{{ with secret "pki_int/issue/nomad-cluster" "common_name=server.global.nomad" "ttl=24h" "alt_names=localhost" "ip_sans=127.0.0.1"}}
{{ .Data.private_key }}
{{ end }}
```

*ca.crt.tpl*:

```
{{ with secret "pki_int/issue/nomad-cluster" "common_name=server.global.nomad" "ttl=24h"}}
{{ .Data.issuing_ca }}
{{ end }}
```

**For Nomad Clients**:

Replace the word `server` in the `common_name` option in each template with the
word `client`.

*agent.crt.tpl*:

```
{{ with secret "pki_int/issue/nomad-cluster" "common_name=client.global.nomad" "ttl=24h" "alt_names=localhost" "ip_sans=127.0.0.1"}}
{{ .Data.certificate }}
{{ end }}
```

*agent.key.tpl*:

```
{{ with secret "pki_int/issue/nomad-cluster" "common_name=client.global.nomad" "ttl=24h" "alt_names=localhost" "ip_sans=127.0.0.1"}}
{{ .Data.private_key }}
{{ end }}
```

*ca.crt.tpl*:

```
{{ with secret "pki_int/issue/nomad-cluster" "common_name=client.global.nomad" "ttl=24h"}}
{{ .Data.issuing_ca }}
{{ end }}
```

**For Nomad CLI (provide this on all nodes)**:

*cli.crt.tpl*:

```
{{ with secret "pki_int/issue/nomad-cluster" "ttl=24h" }}
{{ .Data.certificate }}
{{ end }}
```

*cli.key.tpl*:

```
{{ with secret "pki_int/issue/nomad-cluster" "ttl=24h" }}
{{ .Data.private_key }}
{{ end }}
```

### Step 11: Configure Consul Template on All Nodes

If you are using the AWS environment provided in this guide, you already have
[Consul Template][consul-template-github] installed on all nodes.
If you are
using your own environment, please make sure Consul Template is installed. You
can download it [here][ct-download].

Provide the token you created in [Step
9](#step-9-generate-a-token-based-on-tls-policy) to the Consul Template
configuration file located at `/etc/consul-template.d/consul-template.hcl`. You
will also need to specify the [template stanza][ct-template-stanza] so you can
render each of the following on your nodes at the specified location from the
templates you created in the previous step:

* Node certificate
* Node private key
* CA public certificate

We will also specify the template stanza to create certs and keys from the
templates we previously created for the Nomad CLI (which defaults to HTTP but
will need to use HTTPS once TLS is enabled in our cluster).

Your `consul-template.hcl` configuration file should look similar to the
following (you will need to provide this to each node in the cluster):

```
# This denotes the start of the configuration section for Vault. All values
# contained in this section pertain to Vault.
vault {
  # This is the address of the Vault leader. The protocol (http(s)) portion
  # of the address is required.
  address = "http://active.vault.service.consul:8200"

  # This value can also be specified via the environment variable VAULT_TOKEN.
  token = "s.xafiYzh7MCMotHLu2d35hepR"

  # This should also be less than or around 1/3 of your TTL for a predictable
  # behaviour. See https://github.com/hashicorp/vault/issues/3414
  grace = "1s"

  # This tells Consul Template that the provided token is actually a wrapped
  # token that should be unwrapped using Vault's cubbyhole response wrapping
  # before being used. Please see Vault's cubbyhole response wrapping
  # documentation for more information.
  unwrap_token = false

  # This option tells Consul Template to automatically renew the Vault token
  # given. If you are unfamiliar with Vault's architecture, Vault requires
  # tokens be renewed at some regular interval or they will be revoked. Consul
  # Template will automatically renew the token at half the lease duration of
  # the token. The default value is true, but this option can be disabled if
  # you want to renew the Vault token using an out-of-band process.
  renew_token = true
}

# This block defines the configuration for connecting to a syslog server for
# logging.
syslog {
  enabled = true

  # This is the name of the syslog facility to log to.
  facility = "LOCAL5"
}

# This block defines the configuration for a template. Unlike other blocks,
# this block may be specified multiple times to configure multiple templates.
template {
  # This is the source file on disk to use as the input template. This is often
  # called the "Consul Template template".
  source = "/opt/nomad/templates/agent.crt.tpl"

  # This is the destination path on disk where the source template will render.
  # If the parent directories do not exist, Consul Template will attempt to
  # create them, unless create_dest_dirs is false.
  destination = "/opt/nomad/agent-certs/agent.crt"

  # This is the permission to render the file. If this option is left
  # unspecified, Consul Template will attempt to match the permissions of the
  # file that already exists at the destination path. If no file exists at that
  # path, the permissions are 0644.
  perms = 0700

  # This is the optional command to run when the template is rendered. The
  # command will only run if the resulting template changes.
  command = "systemctl reload nomad"
}

template {
  source = "/opt/nomad/templates/agent.key.tpl"
  destination = "/opt/nomad/agent-certs/agent.key"
  perms = 0700
  command = "systemctl reload nomad"
}

template {
  source = "/opt/nomad/templates/ca.crt.tpl"
  destination = "/opt/nomad/agent-certs/ca.crt"
  command = "systemctl reload nomad"
}

# The following template stanzas are for the CLI certs

template {
  source = "/opt/nomad/templates/cli.crt.tpl"
  destination = "/opt/nomad/cli-certs/cli.crt"
}

template {
  source = "/opt/nomad/templates/cli.key.tpl"
  destination = "/opt/nomad/cli-certs/cli.key"
}
```

!> Note: we have hard-coded the token we created into the Consul Template
configuration file. Although we can avoid this by assigning it to the
environment variable `VAULT_TOKEN`, this method can still pose a security
concern. The recommended approach is to securely introduce this token to Consul
Template. To learn how to accomplish this, see [Secure
Introduction][secure-introduction].

* Please also note we have applied file permissions `0700` to the `agent.crt`
  and `agent.key` since only the root user should be able to read those files.
  Any other user using the Nomad CLI will be able to read the CLI certs and key
  that we have created for them along with the intermediate CA cert.


### Step 12: Start the Consul Template Service

Start the Consul Template service on each node:

```shell
$ sudo systemctl start consul-template
```

You can quickly confirm the appropriate certs and private keys were generated in
the `destination` directory you specified in your Consul Template configuration
by listing them out:

```
$ ls /opt/nomad/agent-certs/ /opt/nomad/cli-certs/
/opt/nomad/agent-certs/:
agent.crt  agent.key  ca.crt

/opt/nomad/cli-certs/:
cli.crt  cli.key
```

### Step 13: Configure Nomad to Use TLS

Add the following [tls stanza][nomad-tls-stanza] to the configuration of all
Nomad agents (servers and clients) in the cluster (configuration file located at
`/etc/nomad.d/nomad.hcl` in this example):

```hcl
tls {
  http = true
  rpc  = true

  ca_file   = "/opt/nomad/agent-certs/ca.crt"
  cert_file = "/opt/nomad/agent-certs/agent.crt"
  key_file  = "/opt/nomad/agent-certs/agent.key"

  verify_server_hostname = true
  verify_https_client    = true
}
```

Additionally, ensure the [`rpc_upgrade_mode`][rpc-upgrade-mode] option is set to
`true` on your server nodes (this is to ensure the Nomad servers will accept
both TLS and non-TLS connections during the upgrade):

```hcl
rpc_upgrade_mode = true
```

Reload Nomad's configuration on all nodes:

```shell
$ systemctl reload nomad
```

Once Nomad has been reloaded on all nodes, go back to your server nodes and
change the `rpc_upgrade_mode` option to false (or remove the line since the
option defaults to false) so that your Nomad servers will only accept TLS
connections:

```hcl
rpc_upgrade_mode = false
```

You will need to reload Nomad on your servers after changing this setting. You
can read more about RPC Upgrade Mode [here][rpc-upgrade].
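
Because the agent certificates are issued with a 24-hour TTL, it can be useful
to confirm that a rendered certificate is not close to expiry before relying on
it. The following is a minimal sketch, assuming `openssl` is installed on the
node. The helper name `cert_expires_within` is hypothetical and is not part of
Nomad, Vault, or Consul Template.

```shell
# Hypothetical helper (not part of any HashiCorp tool): report whether a PEM
# certificate expires within the given number of seconds.
# Prints "ok", "expiring", or "missing".
cert_expires_within() {
  local cert="$1" window="$2"

  # Distinguish an unreadable file from a certificate that is about to expire.
  if [ ! -r "$cert" ]; then
    echo "missing"
    return 2
  fi

  # `openssl x509 -checkend N` exits 0 when the certificate is still valid
  # N seconds from now, and non-zero otherwise.
  if openssl x509 -checkend "$window" -noout -in "$cert" >/dev/null 2>&1; then
    echo "ok"
  else
    echo "expiring"
    return 1
  fi
}

# Example (path taken from this guide's Consul Template configuration):
#   cert_expires_within /opt/nomad/agent-certs/agent.crt 3600
```

A result of `expiring` for a one-hour window would suggest that Consul Template
has not renewed the certificate recently and its service logs are worth
checking.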

If you run `nomad status`, you will now receive the following error:

```
Error querying jobs: Get http://172.31.52.215:4646/v1/jobs: net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x15\x03\x01\x00\x02\x02"
```

This is because the Nomad CLI defaults to communicating via HTTP instead of
HTTPS. We can configure the local Nomad client to connect using TLS and specify
our custom key and certificates by setting the following environment variables:

```shell
export NOMAD_ADDR=https://localhost:4646
export NOMAD_CACERT="/opt/nomad/agent-certs/ca.crt"
export NOMAD_CLIENT_CERT="/opt/nomad/cli-certs/cli.crt"
export NOMAD_CLIENT_KEY="/opt/nomad/cli-certs/cli.key"
```

After these environment variables are correctly configured, the CLI will respond
as expected:

```shell
$ nomad status
No running jobs
```

## Encrypt Server Gossip

At this point all of Nomad's RPC and HTTP communication is secured with mTLS.
However, Nomad servers also communicate with a gossip protocol, Serf, that does
not use TLS:

* HTTP - Used to communicate between CLI and Nomad agents. Secured by mTLS.
* RPC - Used to communicate between Nomad agents. Secured by mTLS.
* Serf - Used to communicate between Nomad servers. Secured by a shared key.

The Nomad servers' gossip protocol uses a shared key instead of TLS for
encryption. This encryption key must be added to every server's configuration
using the [`encrypt`](/docs/configuration/server.html#encrypt) parameter or with
the [`-encrypt` command line option](/docs/commands/agent.html).

The Nomad CLI includes an `operator keygen` command for generating a new secure
gossip encryption key:

```shell
$ nomad operator keygen
cg8StVXbQJ0gPvMd9o7yrg==
```

Alternatively, you can use any method that base64 encodes 16 random bytes:

```shell
$ openssl rand -base64 16
raZjciP8vikXng2S5X0m9w==
$ dd if=/dev/urandom bs=16 count=1 status=none | base64
LsuYyj93KVfT3pAJPMMCgA==
```

Put the same generated key into every server's configuration file or command
line arguments:

```hcl
server {
  enabled = true

  # Self-elect, should be 3 or 5 for production
  bootstrap_expect = 1

  # Encrypt gossip communication
  encrypt = "cg8StVXbQJ0gPvMd9o7yrg=="
}
```

Unlike with TLS, reloading Nomad will not be enough to initiate encryption of
gossip traffic. At this point, you may restart each Nomad server with `systemctl
restart nomad`.

[capability]: https://www.vaultproject.io/docs/concepts/policies.html#capabilities
[config-parameters]: https://www.vaultproject.io/api/secret/pki/index.html#parameters-8
[consul-template]: https://www.consul.io/docs/guides/consul-template.html
[consul-template-github]: https://github.com/hashicorp/consul-template
[ct-download]: https://releases.hashicorp.com/consul-template/
[ct-template-stanza]: https://github.com/hashicorp/consul-template#configuration-file-format
[login]: https://www.vaultproject.io/docs/commands/login.html
[nomad-tls-stanza]: https://www.nomadproject.io/docs/configuration/tls.html
[policies]: https://www.vaultproject.io/docs/concepts/policies.html#policies
[pki-engine]: https://www.vaultproject.io/docs/secrets/pki/index.html
[repo]: https://github.com/hashicorp/nomad/tree/master/terraform
[rpc-upgrade-mode]: /docs/configuration/tls.html#rpc_upgrade_mode
[rpc-upgrade]: /guides/security/securing-nomad.html#rpc-upgrade-mode-for-nomad-servers
[seal]: https://www.vaultproject.io/docs/concepts/seal.html
[secure-introduction]: https://learn.hashicorp.com/vault/identity-access-management/iam-secure-intro
[token]: https://www.vaultproject.io/docs/concepts/tokens.html
[vault-ca-learn]: https://learn.hashicorp.com/vault/secrets-management/sm-pki-engine
[vault-ra]: https://learn.hashicorp.com/vault/operations/ops-reference-architecture
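
The gossip key described above is 16 random bytes, base64 encoded, so a key can
be sanity-checked before it is copied into every server's configuration. The
following is a minimal sketch, assuming a `base64` utility that accepts
`--decode`. The function name `gossip_key_ok` is hypothetical and is not a Nomad
command.

```shell
# Hypothetical helper (not a Nomad command): succeed only if the given
# base64 string decodes to exactly 16 bytes.
gossip_key_ok() {
  local key="$1" bytes

  # Decode the key and count the resulting bytes. The arithmetic expansion
  # strips any padding whitespace that `wc -c` may print.
  bytes=$(( $(printf '%s' "$key" | base64 --decode 2>/dev/null | wc -c) ))

  [ "$bytes" -eq 16 ]
}

# Example, using the sample key shown in this guide:
if gossip_key_ok "cg8StVXbQJ0gPvMd9o7yrg=="; then
  echo "key length ok"
fi
```

This only checks the decoded length. It does not say anything about how the key
was generated, so keys should still come from `nomad operator keygen` or another
source of random bytes.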