# Message Encryption

This section describes the message encryption and signing features of the runner. Message payloads are described in the docs/interface.md file. Encryption and signing are only supported within Kubernetes deployments, because standalone runners cannot be secured and cannot hold shared secrets without the isolation provided by Kubernetes.

Encrypted payloads use a hybrid cryptosystem, [please click for a detailed description](https://en.wikipedia.org/wiki/Hybrid_cryptosystem).

Message signing uses Ed25519 signing as defined by RFC8032, more information can be found at [https://ed25519.cr.yp.to/](https://ed25519.cr.yp.to/).

Ed25519 certificate SHA256 fingerprints, not intended to be cryptographically secure, will be used by clients to assert identity, confirmed by successful verification. Verification still relies on a full public key.

<!--ts-->

Table of Contents
=================

* [Message Encryption](#message-encryption)
* [Table of Contents](#table-of-contents)
* [Introduction](#introduction)
* [Encryption](#encryption)
  * [Key creation by the cluster owner](#key-creation-by-the-cluster-owner)
* [Mount secrets into runner deployment](#mount-secrets-into-runner-deployment)
  * [Message format](#message-format)
* [Signing](#signing)
  * [Signing deployment](#signing-deployment)
    * [First time creation](#first-time-creation)
    * [Manual insertion](#manual-insertion)
    * [Automatted insertion](#automatted-insertion)
* [Python StudioML configuration](#python-studioml-configuration)
<!--te-->

# Introduction

This document describes encryption of Request messages sent by StudioML clients to the runner.

Encryption of messages has two tiers. The first tier is a public-key scheme in which the runner holds a private key, and the matching public key is given to experimenters using the Python or other client software.

Users of the system need to obtain the public key, and only the public key, from the compute cluster owner. The public key can then be made accessible to the client for securing the messages exchanged with the runner compute instances.

The compute cluster owner is responsible for generating the public-private key pair and managing the integrity of the private key. They are also responsible for distributing the public key to any experimenters, or users of the system.

In the second tier, the client generates a per-message secret, encrypts it using the public key, and prepends it to a payload that contains the request message encrypted using the secret.

# Encryption

## Key creation by the cluster owner

The owner of the compute cluster is responsible for generating the key pair used for message encryption. The following commands show the creation of the key pair.

```
# Store the passphrase in a file for later loading into Kubernetes
echo -n "PassPhrase" > secret_phrase
# Generate a 4096 bit RSA key pair protected by the passphrase
ssh-keygen -t rsa -b 4096 -f studioml_message -C "Message Encryption Key" -N "PassPhrase"
# Export the public key in PEM format
ssh-keygen -f studioml_message.pub -e -m PEM > studioml_message.pub.pem
# Copy the private key and rewrite the copy in PEM format
cp studioml_message studioml_message.pem
ssh-keygen -f studioml_message.pem -e -m PEM -p -P "PassPhrase" -N "PassPhrase"
```

The private key file and the passphrase should be considered valuable secrets for your organization that MUST be protected and cared for appropriately.
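
Because only the public key should ever be handed to experimenters, a quick check before distributing a file can help catch mistakes. The following is a minimal sketch and not part of the runner or StudioML: the helper name `is_public_pem_only` is hypothetical, and it inspects only the PEM armor lines rather than parsing the key material.

```python
import re

# Matches PEM armor lines such as "-----BEGIN RSA PUBLIC KEY-----".
_PEM_BEGIN = re.compile(r"^-----BEGIN ([A-Z0-9 ]+)-----\s*$", re.MULTILINE)


def is_public_pem_only(pem_text: str) -> bool:
    """Return True when the text holds at least one PEM block and every
    block is labelled as a public key (hypothetical helper)."""
    labels = _PEM_BEGIN.findall(pem_text)
    return bool(labels) and all(label.endswith("PUBLIC KEY") for label in labels)


if __name__ == "__main__":
    # Assumes the file name used in the key creation commands above.
    with open("studioml_message.pub.pem", encoding="ascii") as pem_file:
        print(is_public_pem_only(pem_file.read()))
```

A file that contains a `PRIVATE KEY` block anywhere, or no PEM block at all, is rejected by this check.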

Once the key pair has been created it can be loaded into the Kubernetes runner cluster using the following commands:

```
kubectl create secret generic studioml-runner-key-secret --from-file=ssh-privatekey=studioml_message.pem --from-file=ssh-publickey=studioml_message.pub.pem
kubectl create secret generic studioml-runner-passphrase-secret --from-file=ssh-passphrase=secret_phrase
```

The passphrase is kept in a separate secret so that RBAC can be used to isolate the two pieces of knowledge, should your secrets management procedures call for this.

The public PEM key MUST be the only file delivered to client side users of StudioML, in PEM key file format, for example:

```
-----BEGIN RSA PUBLIC KEY-----
MIICCgKCAgEAtZurOEVuT9bhjiUWX7U8EFxL8oMGWSLXf4M6QBsJ5TljtSqyIxvI
kXiQDLIpJXY8KRmiR9RghGopvB5NfAMLZtfwozuju2NtnSn0UPI+6O4ED6TfDP5F
eta/6tUKAuvxVwF5Yvr7en1qnbv4L86vqeukrn/gIPTb7LlsFjt6uHlxA6xTAun/
HfRKlBiWR5rIi/fwuUMmTGpAcCa8s5Gqfla28FfsknGOipy4Vw4Mt7f93ke1dHN+
dY/J2TpCm/GNJuFaHc4EgHE8uw+jU6uBgpZAJSIzK5dxYniEjZS93CWxs2HN8dmV
wEqleT02agWW4cfa13X3Lz1YoQkCjYtSqB8Y2KjT1q7sSll0HExWV58kFPk9FmIy
JniMLcLFzAxGDM5UgtmsdSYmqN49vlqOejxfYxy6GrKXrkRGCDuQKyb2m/WQLXGU
8cGqwuVpN/JNWjiG4+NaxWRzfE2Yk4gbhcYqXRocNMlidG0Sx/xrFTFln86lmGJ1
RCse6jv3beENf5lfrz4ddAzAssjTivmlZgJCTK2oROT3WPI/G6CaBQadt13XkQLW
hAZDbnsZMhOVH3/UiQJ6DwgV0yK5FND4jkbHM3GWGNLRIrnL9F0I8c1p9X2oCx6T
plgCug3iz5cE9+G2455Y1vaVMBEKSm1REhsdTYzPBV/yXPpPR4lUCmkCAwEAAQ==
-----END RSA PUBLIC KEY-----
```

A single key pair is used to encrypt all requests on the cluster at this time. A future feature is envisioned to allow multiple key pairs.

When the runner is run the secrets are mounted into the container that Kubernetes is managing. This is done using the deployment yaml. When performing deployments, review the yaml for the runner pod and its runner container to ensure that the secrets are available and that they are mounted.
If these secrets are not loaded into the cluster the runner pod should remain in a pending state.

# Mount secrets into runner deployment

Secrets used by the runner will be mounted into the runner pod using the Kubernetes deployment pod resource definition. An example of this is provided within the sample AWS CPU runner that can be found in the [../examples/aws/cpu/deployment.yaml](../examples/aws/cpu/deployment.yaml) file.

Two mounts will be created, the first for the key files and the second for the passphrase. These two are split to allow for RBAC to be employed in the cluster should you want it. The motivation is that you might want to divide ownership of the private key and the passphrase between two parties, and avoid revealing one of these to the other.

If you wish to use encrypted traffic exclusively be sure to remove the ```CLEAR_TEXT_MESSAGES: "true"``` entry from your ConfigMap entries in the yaml.

In any event, the yaml needed to mount these secrets appears as follows:

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: studioml-go-runner-deployment
  labels:
    app: studioml-go-runner
spec:
  ...
  template:
    ...
    spec:
      ...
      containers:
      - name: studioml-go-runner
        ...
        volumeMounts:
        - name: message-encryption
          mountPath: "/runner/certs/message/encryption"
          readOnly: true
        - name: encryption-passphrase
          mountPath: "/runner/certs/message/passphrase"
          readOnly: true
        - name: queue-signing
          mountPath: "/runner/certs/queues/signing"
          readOnly: true
        ...
      volumes:
      ...
      - name: message-encryption
        secret:
          optional: false
          secretName: studioml-runner-key-secret
          items:
          - key: ssh-privatekey
            path: ssh-privatekey
          - key: ssh-publickey
            path: ssh-publickey
      - name: encryption-passphrase
        secret:
          optional: false
          secretName: studioml-runner-passphrase-secret
          items:
          - key: ssh-passphrase
            path: ssh-passphrase
      - name: queue-signing
        secret:
          optional: false
          secretName: studioml-signing
```

## Message format

The encrypted\_data block contains two comma separated Base64 strings. The first string contains a symmetric key that is encrypted using RSA-OAEP with a key length of 4096 bits, and the SHA256 hashing algorithm. The second string contains the JSON string for the Request message, which is first encrypted using NaCl SecretBox encryption and then encoded as Base64.

The encryption works in two steps. First, a SecretBox symmetric key is generated for every message by the source generating the message, and the data within the message is encrypted with this symmetric key. Second, the symmetric key is encrypted using the asymmetric public key and placed at the front of the message. This has the following effects:

* The sender can decrypt the payload if they retain their original symmetric key.
* The sender cannot decrypt the symmetric key once it has been placed, encrypted, into the payload.
* The legitimate runner, if able to access the RSA PEM private key, can decrypt the symmetric key, and only then can it decrypt the Request in the payload.
* Eavesdropping software cannot decrypt the asymmetrically encrypted SecretBox key and so cannot decrypt the rest of the payload.

# Signing

Message signing is a way of protecting the runner from processing spoofed requests.
To prevent spoofing, the runner can be configured to read public key information from Kubernetes secrets and use it to validate the messages it receives. The configuration information for the runner signing keys is detailed in the next section.

Signing is only supported in Kubernetes deployments.

The portion of the message that is signed is the Base64 representation of the entire payload field. The payload field includes the Base64 string of the key, a comma, and the Base64 string of the encoded payload proper.

The signature transmitted in the StudioML message signature field is the Base64 encoding of the binary 64 byte signature.

Message signing uses Ed25519 signing as defined by RFC8032, more information can be found at [https://ed25519.cr.yp.to/](https://ed25519.cr.yp.to/).

Ed25519 certificate SHA256 fingerprints, not intended to be cryptographically secure, will be used by clients to assert identity, confirmed by successful verification. Verification of messages sent to the runner relies on a public key supplied by the experimenter. The following example shows how an experimenter would go about creating a private-public key pair suitable for signing:

```
ssh-keygen -t ed25519 -f studioml_signing -P ""
ssh-keygen -l -E sha256 -f studioml_signing.pub
256 SHA256:BB+StMfwvv/8Dutb0i1QpdBL171Fg/Fd3ODebi+NX74 kmutch@awsdev (ED25519)
```

The fingerprint can be extracted from the last line of the above output and sent to the cluster administrator.

Having generated a key pair, the PUBLIC key file should be transmitted to the administrators of any runner compute clusters that will be used. Along with sending the key, the experimenter should decide, in conjunction with their community, the queue name prefixes they will be assigned to use exclusively.
The queue name prefixes should be passed to the administrators with the public key file.

Queue name prefixes should be a minimum of four characters so that they include the queue technology being used and the underscore, for example 'rmq_' or 'sqs_', which uses the public key on all queues with that prefix.

If you send the request via email you might compose something like the following:

```
Hi,

I would like to add/replace a signing verification key for any queues on the 54.123.10.5 Rabbit MQ Server for our cluster with the prefix of 'rmq_cpu_andrei_'.

The public key I wish to use is:

ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFITo06Pk8sqCMoMHPaQiQ7BY3pjf7OE8BDcsnYozmIG kmutch@awsdev

Our fingerprint is:

SHA256:BB+StMfwvv/8Dutb0i1QpdBL171Fg/Fd3ODebi+NX74

Thanks,
Andrei
```

The above should provide enough information for the administrator to apply your key to the system and reply by email confirming that the key has been added.

Once a message signing public key has been assigned, any messages on the related queues MUST have a valid signature attached, otherwise they will be rejected.

## Signing deployment

Before adding any message signing keys the cluster administrator must check that the request originated from a pre-nominated sender.

Signing keys can be injected into the compute cluster using Kubernetes secrets. The runners in a cluster will use a secret in the same namespace called 'studioml-signing' for extracting signing keys. New keys are added as data items within the secrets resource using the kubectl apply command. Changes or additions to signing keys are propagated via the mounted resource within the runner pods, see [Mounted Secrets are updated automatically](https://kubernetes.io/docs/concepts/configuration/secret/#mounted-secrets-are-updated-automatically).
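
As one extra sanity check on a key request, an administrator could recompute the fingerprint from the public key line received and compare it with the fingerprint quoted in the same request. The sketch below is illustrative and not part of the runner; the function name `openssh_sha256_fingerprint` is hypothetical. It relies on the OpenSSH convention that the SHA256 fingerprint is the unpadded Base64 encoding of the SHA-256 digest of the decoded key blob.

```python
import base64
import hashlib


def openssh_sha256_fingerprint(public_key_line: str) -> str:
    """Compute the SHA256 fingerprint of a one-line OpenSSH public key.

    The line is expected to look like "ssh-ed25519 <base64 blob> [comment]".
    """
    fields = public_key_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<key type> <base64 blob> [comment]'")
    blob = base64.b64decode(fields[1], validate=True)
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the digest as Base64 with the trailing '=' padding removed.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


if __name__ == "__main__":
    emailed_key = (
        "ssh-ed25519 "
        "AAAAC3NzaC1lZDI1NTE5AAAAIFITo06Pk8sqCMoMHPaQiQ7BY3pjf7OE8BDcsnYozmIG "
        "kmutch@awsdev"
    )
    print(openssh_sha256_fingerprint(emailed_key))
```

A matching fingerprint only shows that the key and the fingerprint in the request agree with each other. It does not replace checking that the request came from a pre-nominated sender.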

Using the email example above, a secret data item can be added to the studioml-signing secret as the following example workflow shows:

```
$ export KUBECONFIG=~/.kube/my_cluster.config
$ kubectl get secrets
NAME                                TYPE                                  DATA   AGE
default-token-qps8p                 kubernetes.io/service-account-token   3      11s
docker-registry-config              Opaque                                1      11s
release-github-token                Opaque                                1      11s
studioml-runner-key-secret          Opaque                                2      11s
studioml-runner-passphrase-secret   Opaque                                1      11s
studioml-signing                    Opaque                                1      11s
```
```
$ kubectl get secrets studioml-signing -o=yaml
apiVersion: v1
data:
  info: RHVtbXkgU2VjcmV0IHNvIHJlc291cmNlIHJlbWFpbnMgcHJlc2VudA==
kind: Secret
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"info":"RHVtbXkgU2VjcmV0IHNvIHJlc291cmNlIHJlbWFpbnMgcHJlc2VudA=="},"kind":"Secret","metadata":{"annotations":{},"name":"studioml-signing","namespace":"default"},"type":"Opaque"}
  creationTimestamp: "2020-05-15T22:05:26Z"
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:data:
        .: {}
        f:info: {}
      f:metadata:
        f:annotations:
          .: {}
          f:kubectl.kubernetes.io/last-applied-configuration: {}
      f:type: {}
    manager: kubectl
    operation: Update
    time: "2020-05-15T22:05:26Z"
  name: studioml-signing
  resourceVersion: "790034"
  selfLink: /api/v1/namespaces/ci-go-runner-kmutch/secrets/studioml-signing
  uid: bc13f78d-199b-4afb-8b3a-31b6ea486c8e
type: Opaque
```

The next command takes the public key that was emailed to you and converts it into Base64 format, ready to be inserted into the Kubernetes secret.

```
$ item=`cat << EOF | base64 -w 0
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFITo06Pk8sqCMoMHPaQiQ7BY3pjf7OE8BDcsnYozmIG kmutch@awsdev
EOF
`
```

### First time creation

The first time the queue secrets are used you must create the Kubernetes resource as the following example shows. Note that when a secret is loaded directly from a file, the data in the input file is not Base64 encoded before being read by kubectl.

```
tmp_name=`mktemp`
echo -n "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFITo06Pk8sqCMoMHPaQiQ7BY3pjf7OE8BDcsnYozmIG kmutch@awsdev" > $tmp_name
kubectl create secret generic studioml-signing --from-file=rmq_cpu_andrei_=$tmp_name
rm $tmp_name
```

### Manual insertion

If you do not have the jq tool installed you will have to edit the secret manually using the following command:

```
$ kubectl edit secrets studioml-signing
```

Now manually insert a yaml line after the info: item so that things appear as follows:

```
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  info: RHVtbXkgU2VjcmV0IHNvIHJlc291cmNlIHJlbWFpbnMgcHJlc2VudA==
  rmq_cpu_andrei_: c3NoLWVkMjU1MTkgQUFBQUMzTnphQzFsWkRJMU5URTVBQUFBSUZJVG8wNlBrOHNxQ01vTUhQYVFpUTdCWTNwamY3T0U4QkRjc25Zb3ptSUcga211dGNoQGF3c2Rldgo=
kind: Secret
metadata:
  annotations:
... [redacted] ...
```

Now use the ':wq' command to exit the editor and have the secret updated inside the cluster.
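
When editing the secret by hand, the value for the new data item has to be the Base64 encoding of the public key line. Where the `base64` command is not available, the value could be produced with a few lines of Python. This is a sketch only: the helper name `secret_data_value` is hypothetical, and unlike the here-document form shown earlier, which encodes a trailing newline, it strips surrounding whitespace in the same way the `echo -n` form does.

```python
import base64


def secret_data_value(public_key_line: str) -> str:
    """Base64-encode a public key line for use as a Secret data value.

    Surrounding whitespace, including any trailing newline, is stripped
    before encoding (hypothetical helper).
    """
    stripped = public_key_line.strip()
    return base64.b64encode(stripped.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    key_line = (
        "ssh-ed25519 "
        "AAAAC3NzaC1lZDI1NTE5AAAAIFITo06Pk8sqCMoMHPaQiQ7BY3pjf7OE8BDcsnYozmIG "
        "kmutch@awsdev"
    )
    # Prints a "<queue prefix>: <value>" line in the form used under data:
    print("rmq_cpu_andrei_: " + secret_data_value(key_line))
```

Decoding the printed value with any Base64 tool should give back the original key line, which is a simple way to confirm the entry before saving the edit.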

### Automatted insertion

Using the jq command, the new data item can be inserted into the secret using the following:

```
kubectl get secret studioml-signing -o json | jq --arg item "${item}" '.data["rmq_cpu_andrei_"]=$item' | kubectl apply -f -
```

# Python StudioML configuration

In order to use experiment payload encryption with the Python-based StudioML client,
the StudioML section of the experiment configuration must specify
a path to the public key file in PEM format. If a path is not specified,
the experiment payload will be submitted unencrypted, in plain text form.

If a StudioML configuration is provided as part of the enclosing completion service configuration, in .hocon format, it would include the following (example):

```
{
    ...
    "studio_ml_config": {
        ...
        "public_key_path": "/home/user/keys/my-key.pub.pem",
        ...
    }
    ...
}
```

Another possibility is:

```
{
    ...
    "studio_ml_config": {
        ...
        "public_key_path": ${PUBLIC_KEY_PATH},
        ...
    }
    ...
}
```

For the base StudioML configuration, in .yaml format, specifying the public key for encryption would look like:

```
public_key_path: /home/user/keys/my-key.pub.pem
```

If you wish to use message signing to prove that queue messages you send to the cluster are from a genuine sender, then an additional option can be specified, for example:

```
{
    ...
    "studio_ml_config": {
        ...
        "public_key_path": "/home/user/keys/my-key.pub.pem",
        "signing_key_path": "/home/user/keys/studioml_signing",
        ...
    }
    ...
}
```

Copyright © 2019-2020 Cognizant Digital Business, Evolutionary AI. All rights reserved. Issued under the Apache 2.0 license.