# Management of Locate Service JWKs

The Locate Service (LS) loads multiple JSON Web Keys (JWK) for signing and
verifying request tokens. To safely configure and deploy private signer and
public verifier keys in AppEngine, we rely on Google Secret Manager (SM).

To bootstrap keys and secrets in a project where none currently exist, an
operator should use the `management/create_jwt_keys_and_secrets.sh` script. The
script takes two arguments: the GCP project ID and a "keyID". It is recommended
that the keyID be a date in the format _YYYYMMDD_. For example:

```sh
cd management
./create_jwt_keys_and_secrets.sh mlab-sandbox $(date +%Y%m%d)
```

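A small guard can catch a malformed keyID before any keys or secrets are created. The sketch below is an assumption, not part of the repository: the function name `is_valid_key_id` is hypothetical, and it only checks for the eight-digit shape that `date +%Y%m%d` produces, not for a valid calendar date.

```sh
# Hypothetical helper (not part of the locate repo): succeeds only if the
# argument is exactly eight digits, the shape produced by `date +%Y%m%d`.
is_valid_key_id() {
  case "$1" in
    [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Check today's date-based keyID before handing it to the script.
key_id=$(date +%Y%m%d)
if is_valid_key_id "$key_id"; then
  echo "keyID ${key_id} looks well-formed"
else
  echo "keyID ${key_id} is not eight digits" >&2
fi
```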
This script will generate two JWT key pairs, one for the LS and the other for
platform monitoring. The keys will be located in the directory where you ran the
script and will look like:

```sh
ls -1 jwk*
jwk_sig_EdDSA_locate_<keyID>
jwk_sig_EdDSA_locate_<keyID>.pub
jwk_sig_EdDSA_monitoring_<keyID>
jwk_sig_EdDSA_monitoring_<keyID>.pub
```

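Since later steps depend on all four files, it can help to confirm they exist before moving on. This is a minimal sketch under stated assumptions: the function name `check_key_files` is hypothetical, and the file names are taken from the listing above.

```sh
# Hypothetical helper (not part of the locate repo).
# Usage: check_key_files <dir> <keyID>
# Reports each expected key file that is missing from <dir> on stderr and
# returns non-zero if at least one is missing.
check_key_files() {
  dir=$1
  key_id=$2
  missing=0
  for name in locate monitoring; do
    for suffix in "" ".pub"; do
      f="${dir}/jwk_sig_EdDSA_${name}_${key_id}${suffix}"
      if [ ! -f "$f" ]; then
        echo "missing: $f" >&2
        missing=1
      fi
    done
  done
  return "$missing"
}

# Example, run from the directory where the script was executed:
#   check_key_files . 20211020 && echo "all four key files are present"
```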
The script will automatically load the LS private signer key and monitoring
public verifier key into SM. These keys will be read from SM by the LS at
runtime. However, it is the operator's responsibility to deploy the monitoring
private signer key and LS public verifier key. Instructions on how to deploy
these keys can be found in later sections of this document.

## Key rotation

Periodically, the private signer keys should be rotated. To ensure continuity of
verification, LS and clients (in particular the ndt-server) MUST possess both
the current _and_ next verifier keys.

To create new key pairs for both the LS and monitoring, run the same script
(`./management/create_jwt_keys_and_secrets.sh`) as before. If an SM secret
for the LS private signer key already exists, you will be prompted to add a new
version of the secret. The same goes for the monitoring verifier key. The new
LS private signer key version will _not_ be active or used by the LS without
further action, described below. However, the new monitoring public verifier
key will be enabled in SM right away, although LS instances only load it when
they restart. There is no harm in activating new verifier keys right away, as
long as the older ones remain enabled too.

### Locate Service

The LS reads the private signer key directly from SM at runtime. When the
script creates a new version of a secret, it will disable the new private signer
key version. This reduces the chance that the LS could unintentionally load
the new private signer key when it starts up. Before enabling the key, you
*must* ensure that the associated LS public verifier key has been distributed to
all services that need to verify tokens signed by the LS. Currently, this list
includes:

* ndt
* ndt-canary
* access-envelope

To deploy the LS public verifier key, upload the public key to Google Cloud
Storage (GCS). The public verifier key will be located in the directory where
you ran the script, and will have a name like
"jwk_sig_EdDSA_locate_\<keyID\>.pub". Upload it to the `locate/` folder of the
"k8s-support-\<project\>" bucket, replacing "\<project\>" with the GCP project
in which you are working. For example:

```sh
gsutil cp jwk_sig_EdDSA_locate_20211020.pub gs://k8s-support-mlab-sandbox/locate/
```

The Google Cloud Build (GCB) deployment scripts in the k8s-support repo will
read the LS public verifier key from that bucket. Once the key has been
uploaded, you will need to add an *additional* `-token.verify-key` flag to all
of the services that use it:

* [ndt.jsonnet](https://github.com/m-lab/k8s-support/blob/master/k8s/daemonsets/experiments/ndt.jsonnet)
* [ndt-canary.jsonnet](https://github.com/m-lab/k8s-support/blob/master/k8s/daemonsets/experiments/ndt-canary.jsonnet)
* [wehe.jsonnet](https://github.com/m-lab/k8s-support/blob/master/k8s/daemonsets/experiments/wehe.jsonnet)

**NOTE**: do not remove or overwrite the existing `-token.verify-key` flags. The
flag can be specified multiple times, and we are adding a new one. The value of
the new flag should be the filename for the LS public verifier key you uploaded
to GCS.

Once the k8s-support GCB build has completed after a push to the repo (or a
tag, for production), verify that the Kubernetes Secret named
`locate-verify-keys` in the platform cluster contains the file you uploaded to
GCS:

```sh
kubectl --context <project> describe secret locate-verify-keys
```

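The check can be scripted by searching the `describe` output for the uploaded filename. The sketch below rests on an assumption about the output layout, namely that each entry in the Secret appears on its own line as `<filename>:` followed by a size; the function name `secret_lists_file` is hypothetical.

```sh
# Hypothetical helper (not part of the locate or k8s-support repos).
# Reads `kubectl describe secret` output on stdin and succeeds if some line
# starts with "<filename>:". The "<filename>:  <N> bytes" layout is an
# assumption about kubectl's output, so confirm it against your cluster.
secret_lists_file() {
  awk -v f="$1" '$1 == (f ":") { found = 1 } END { exit !found }'
}

# Example:
#   kubectl --context <project> describe secret locate-verify-keys \
#     | secret_lists_file jwk_sig_EdDSA_locate_20211020.pub \
#     && echo "verifier key is present in the Secret"
```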
Modifying the DaemonSet pod specs (with the additional `-token.verify-key` flag)
will cause rolling updates of the DaemonSets, deploying the new LS public
verifier key. Once you are 100% sure that all services using the key have rolled
out the change, and you have verified that all services have loaded the
additional public key, then, and only then, can you enable the new signer key in
Secret Manager. You can enable it with commands like:

```sh
gcloud secrets versions list locate-service-signer-key --project <project>
# Note the version number of the new, disabled version
gcloud secrets versions enable <version> --secret locate-service-signer-key --project <project>
```

You are still not done. The LS is configured to use the _oldest_ enabled version
of a secret. To promote your new private signer key, you will also need to
disable all previous versions of the secret, such that your new private signer
key version is the _oldest_ enabled version. In practice, this will generally
mean disabling only the previously enabled version. Use
`gcloud secrets versions disable`, with the same arguments as the `enable`
command above, to disable every enabled version older than yours.

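Picking the versions to disable can be sketched as a small filter over the version list. The function name `older_enabled_versions` is hypothetical, and the `--format` expression in the usage comment is an assumption about how to get one "version state" pair per line from `gcloud`; check the output in your project before acting on it.

```sh
# Hypothetical helper (not part of the locate repo).
# Reads "<version> <state>" lines on stdin and prints every enabled version
# whose number is lower than the new version given as $1.
older_enabled_versions() {
  awk -v new="$1" '{
    n = $1
    sub(/.*\//, "", n)   # tolerate full resource names; keep the trailing number
    if (tolower($2) == "enabled" && n + 0 < new + 0) print n
  }'
}

# Example, assuming the new signer key is version 3:
#   gcloud secrets versions list locate-service-signer-key --project <project> \
#       --format='value(name.basename(),state)' \
#     | older_enabled_versions 3 \
#     | while read -r v; do
#         gcloud secrets versions disable "$v" \
#           --secret locate-service-signer-key --project <project>
#       done
```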
LS instances will start using the new private signer key when they are
restarted. You can use the `cbctl` command to trigger a Cloud Build of the LS,
which will redeploy the service, causing all instances to be restarted. For
example:

```sh
# With Go 1.17 or later, binaries are installed with `go install`, not `go get`.
go install github.com/m-lab/gcp-config/cmd/cbctl@latest
cbctl -repo locate -trigger_name push-m-lab-locate-trigger -project mlab-staging
```

You can verify which private signer key the LS loaded on initialization by
looking through LS logs. You can find relevant entries with something like:

```sh
gcloud app logs read --service locate --project mlab-sandbox | grep 'Loading JWT'
```

To test that things are working as intended with the new signer key and deployed
verifier key, you can use the ndt7-client-go client. Install the client:

```sh
go install github.com/m-lab/ndt7-client-go/cmd/ndt7-client@latest
```

Run an ndt7 test with something like:

```sh
ndt7-client -locate.url https://mlab-sandbox.appspot.com/v2/nearest/ndt/ndt7
```

Assuming you have verified that all instances of the LS loaded the new, proper
signer key, then a successful test verifies a successful key rotation.

### Monitoring

The script-exporter pod in the prometheus-federation cluster uses the monitoring
private signer key for monitoring purposes. It signs a token using the private
key, and the LS verifies it using the corresponding public key. To deploy the
new private key, update the MONITORING_SIGNER_KEY environment variable in the
Travis-CI settings for the prometheus-support repo:

[https://travis-ci.com/github/m-lab/prometheus-support/settings](https://travis-ci.com/github/m-lab/prometheus-support/settings)

You will need to delete the existing MONITORING_SIGNER_KEY environment variable
and recreate it using the contents of the file
"jwk_sig_EdDSA_monitoring_\<keyID\>" that the script generated.

The next time the prometheus-support Travis-CI build runs for a project, the new
key from that environment variable will start to be used.

**WARNING**: The monitoring public verifier key should have been uploaded
automatically to SM when you ran the
`./management/create_jwt_keys_and_secrets.sh` script. However, LS instances will
not load the new verifier key until they are restarted. DO NOT deploy the new
monitoring private signer key until you are sure that all LS instances in
AppEngine have restarted and picked up the new public verifier key that
corresponds to the private key; otherwise, script-exporter monitoring requests
will fail. You can trigger a redeployment of the LS using `cbctl` as described
earlier in this document.
   175  fail. You can trigger a redeployment of the LS using `cbctl` as described
   176  earlier in this document.