# IBM Cloud OpenShift

[OpenShift](https://www.openshift.com/products/openshift-ibm-cloud/) is a popular enterprise Kubernetes distribution.
Pachyderm can run on IBM Cloud OpenShift with a few small tweaks to the [OpenShift deployment process](openshift.md), the most important being the StatefulSet deployment.

See [known issues](#known-issues) below for current issues with OpenShift deployments.

## Prerequisites

Pachyderm needs a few things to install and run successfully in an IBM Cloud OpenShift environment.

##### Binaries for CLI

- [oc](https://cloud.ibm.com/docs/openshift?topic=openshift-openshift-cli#cli_oc)
- [pachctl](#install-pachctl)

1. Because this is a StatefulSet-based deployment, it uses either dynamic Persistent Volume provisioning or pre-provisioned PVs through a defined storage class.
1. An object store, used by Pachyderm's `pachd` for storing all your data.
   The object store you use will probably depend on where you run OpenShift: S3 for [AWS](https://pachyderm.readthedocs.io/en/latest/deployment/amazon_web_services.html), GCS for [Google Cloud Platform](https://pachyderm.readthedocs.io/en/latest/deployment/google_cloud_platform.html), Azure Blob Storage for [Azure](https://pachyderm.readthedocs.io/en/latest/deployment/azure.html), or a storage provider like Minio, EMC's ECS, or Swift providing S3-compatible access to enterprise storage for on-premises deployments.
1. Access to particular TCP/IP ports for communication.

### Stateful Set

You'll need a storage class defined in your IBM Cloud OpenShift cluster, and you'll need to specify the number of replica pods in the configuration.
A custom deploy can also create storage.
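Before deploying, it can help to confirm which storage classes your cluster actually offers. A minimal sketch using the `oc` CLI against a live cluster; `ibmc-file-silver` is just an example class name (the one used later in this guide), and yours may differ:

```shell
# List the storage classes defined in the cluster; pick one to pass to
# pachctl deploy as <storage-class-name> later.
oc get storageclass

# Inspect a candidate class (example name; substitute your own):
oc describe storageclass ibmc-file-silver
```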
We're currently developing good rules of thumb for scaling this storage as your Pachyderm deployment grows,
but it looks like 10G of disk space is sufficient for most purposes.

### Object store

Size your object store generously, since this is where Pachyderm versions and stores all your data.
You'll need four items to configure object storage; in this example we use Minio.

1. The access endpoint.
   For example, Minio's endpoints are usually something like `minio-server:9000`.
   Don't begin it with the protocol; it's an endpoint, not a URL.
1. The bucket name you're dedicating to Pachyderm. Pachyderm will need exclusive access to this bucket.
1. The access key ID for the object store. This is like a user name for logging into the object store.
1. The secret key for the object store. This is like the above user's password.

### TCP/IP ports

For more details on how Kubernetes networking and service definitions work, see the [Kubernetes services documentation](https://kubernetes.io/docs/concepts/services-networking/service/).

#### Incoming ports (port)

These are the ports internal to the containers.
You'll find these on both the pachd and dash containers.
OpenShift runs containers and pods as unprivileged users, which don't have access to port numbers below 1024.
Pachyderm's default manifests use ports below 1024, so you'll have to modify the manifests to use other port numbers.
It's usually as easy as adding a "1" in front of the port numbers we use.

#### Pod ports (targetPort)

This is the port exposed by the pod to Kubernetes, which is forwarded to the `port`.
You should leave the `targetPort` set at `0` so it will match the `port` definition.

#### External ports (nodePorts)

This is the port accessible from outside of Kubernetes.
You probably don't need to change `nodePort` values unless your network security requirements or architecture require another method of access.
See the [Kubernetes services documentation](https://kubernetes.io/docs/concepts/services-networking/service/) for details.

## The OCPify script

A bash script that automates many of the substitutions below is available [at this gist](https://gist.github.com/gabrielgrant/86c1a5b590ae3f4b3fd32d7e9d622dc8).
You can use it to modify a manifest created with the `--dry-run` flag to `pachctl deploy custom`, as detailed below, and then use this guide to ensure the modifications it makes are relevant to your OpenShift environment.
It requires certain prerequisites, such as [jq](https://github.com/stedolan/jq) and [sponge, found in moreutils](https://joeyh.name/code/moreutils/).

This script may be useful as a basis for automating redeploys of Pachyderm as needed.

### Best practices: Infrastructure as code

We highly encourage you to apply the best practices used in developing software to managing the deployment process.

1. Create scripts that automate as much of your processes as possible and keep them under version control.
1. Keep copies of all artifacts, such as manifests, produced by those scripts and keep those under version control.
1. Document your practices in the code and outside it.

## Preparing to deploy Pachyderm

Things you'll need:

1. Your storage class.
1. Your object store information.
1. Your project in OpenShift.
1. A text editor for editing your deployment manifest.

## Deploying Pachyderm

### 1. Determine your role security policy

Pachyderm is deployed by default with cluster roles.
Many institutional OpenShift security policies require namespace-local roles rather than cluster roles.
If your security policies require namespace-local roles, use the [`pachctl deploy` command below with the `--local-roles` flag](#namespace-local-roles).

### 2. Run the deploy command with `--dry-run`

Once you have your PV, object store, and project, you can create a manifest for editing using the `--dry-run` argument to `pachctl deploy`.
That step is detailed in the deployment instructions for each type of deployment, above.

Below are examples, with cluster roles and with namespace-local roles, that use AWS Elastic Block Store as a persistent disk with a custom deploy.
We'll show how to remove this PV in case you want to use a PV you create separately.

#### Cluster roles

```
pachctl deploy custom --persistent-disk aws --etcd-storage-class <storage-class-name> \
    --object-store <object-storage-name> etcd-volume <volume-size-in-gb> \
    <s3-bucket-name> <s3-access-key-id> <s3-access-secret-key> <s3-access-endpoint-url> \
    --dynamic-etcd-nodes <no-of-replica-pods> --dry-run > manifest-statefulset.json
```

#### Namespace-local roles

```
pachctl deploy custom --persistent-disk aws --etcd-storage-class <storage-class-name> \
    --object-store <object-storage-name> etcd-volume <volume-size-in-gb> \
    <s3-bucket-name> <s3-access-key-id> <s3-access-secret-key> <s3-access-endpoint-url> \
    --dynamic-etcd-nodes <no-of-replica-pods> --local-roles --dry-run > manifest-statefulset.json
```

### 4. Modify pachd Service ports

In the deployment manifest, which we called `manifest-statefulset.json` above, find the stanza for the `pachd` Service. An example is shown below.
```
{
  "kind": "Service",
  "apiVersion": "v1",
  "metadata": {
    "name": "pachd",
    "namespace": "pachyderm",
    "creationTimestamp": null,
    "labels": {
      "app": "pachd",
      "suite": "pachyderm"
    },
    "annotations": {
      "prometheus.io/port": "9091",
      "prometheus.io/scrape": "true"
    }
  },
  "spec": {
    "ports": [
      {
        "name": "api-grpc-port",
        "port": 650,
        "targetPort": 0,
        "nodePort": 30650
      },
      {
        "name": "trace-port",
        "port": 651,
        "targetPort": 0,
        "nodePort": 30651
      },
      {
        "name": "api-http-port",
        "port": 652,
        "targetPort": 0,
        "nodePort": 30652
      },
      {
        "name": "saml-port",
        "port": 654,
        "targetPort": 0,
        "nodePort": 30654
      },
      {
        "name": "api-git-port",
        "port": 999,
        "targetPort": 0,
        "nodePort": 30999
      },
      {
        "name": "s3gateway-port",
        "port": 600,
        "targetPort": 0,
        "nodePort": 30600
      }
    ],
    "selector": {
      "app": "pachd"
    },
    "type": "NodePort"
  },
  "status": {
    "loadBalancer": {}
  }
}
```

While the nodePort declarations are fine, the port declarations are too low for OpenShift. Good example values are shown below.

```
"spec": {
  "ports": [
    {
      "name": "api-grpc-port",
      "port": 1650,
      "targetPort": 0,
      "nodePort": 30650
    },
    {
      "name": "trace-port",
      "port": 1651,
      "targetPort": 0,
      "nodePort": 30651
    },
    {
      "name": "api-http-port",
      "port": 1652,
      "targetPort": 0,
      "nodePort": 30652
    },
    {
      "name": "saml-port",
      "port": 1654,
      "targetPort": 0,
      "nodePort": 30654
    },
    {
      "name": "api-git-port",
      "port": 1999,
      "targetPort": 0,
      "nodePort": 30999
    },
    {
      "name": "s3gateway-port",
      "port": 1600,
      "targetPort": 0,
      "nodePort": 30600
    }
  ],
```

### 5. Modify pachd Deployment ports and add environment variables

In this case you're editing two parts of the `pachd` Deployment JSON.
Here, we'll omit the example of the unmodified version.
Instead, we'll show you the modified version.

#### 5.1 pachd Deployment ports

The `pachd` Deployment also has a set of port numbers in the spec for the `pachd` container.
Those must be modified to match the port numbers you set above for each port.

```
{
  "kind": "Deployment",
  "apiVersion": "apps/v1beta1",
  "metadata": {
    "name": "pachd",
    "namespace": "pachyderm",
    "creationTimestamp": null,
    "labels": {
      "app": "pachd",
      "suite": "pachyderm"
    }
  },
  "spec": {
    "replicas": 1,
    "selector": {
      "matchLabels": {
        "app": "pachd",
        "suite": "pachyderm"
      }
    },
    "template": {
      "metadata": {
        "name": "pachd",
        "namespace": "pachyderm",
        "creationTimestamp": null,
        "labels": {
          "app": "pachd",
          "suite": "pachyderm"
        },
        "annotations": {
          "iam.amazonaws.com/role": ""
        }
      },
      "spec": {
        "volumes": [
          {
            "name": "pach-disk"
          },
          {
            "name": "pachyderm-storage-secret",
            "secret": {
              "secretName": "pachyderm-storage-secret"
            }
          }
        ],
        "containers": [
          {
            "name": "pachd",
            "image": "pachyderm/pachd:{{ config.pach_latest_version }}",
            "ports": [
              {
                "name": "api-grpc-port",
                "containerPort": 1650,
                "protocol": "TCP"
              },
              {
                "name": "trace-port",
                "containerPort": 1651
              },
              {
                "name": "api-http-port",
                "containerPort": 1652,
                "protocol": "TCP"
              },
              {
                "name": "peer-port",
                "containerPort": 1653,
                "protocol": "TCP"
              },
              {
                "name": "api-git-port",
                "containerPort": 1999,
                "protocol": "TCP"
              },
              {
                "name": "saml-port",
                "containerPort": 1654,
                "protocol": "TCP"
              }
            ],
```

#### 5.2 Add environment variables

You need to configure the following environment variables for OpenShift:

1. `WORKER_USES_ROOT`: Controls whether pipeline workers run as the root user. You'll need to set it to `false`.
1. `PORT`: The gRPC port used by `pachd` for communication with `pachctl` and the API. Set it to the same value you set for `api-grpc-port` above.
1. `HTTP_PORT`: The port for the API proxy. Set it to the `api-http-port` value above.
1. `PEER_PORT`: Used to coordinate `pachd` instances. Same as `peer-port` above.
1. `PPS_WORKER_GRPC_PORT`: Used to talk to pipelines. Should be set to a value above 1024. The example value of `1680` below is recommended.
1. `PACH_ROOT`: The Pachyderm root directory. In an OpenShift
   deployment, you need to set this value to a directory to which non-root users
   have write access, which might depend on your container image. By
   default, `PACH_ROOT` is set to `/pach`, which requires root privileges.
   Because in OpenShift you do not have root access, you need to modify
   the default setting.

The added values below are shown inserted above the `PACH_ROOT` value, which is typically the first value in this array.
The rest of the stanza is omitted for clarity.

```
"env": [
  {
    "name": "WORKER_USES_ROOT",
    "value": "false"
  },
  {
    "name": "PORT",
    "value": "1650"
  },
  {
    "name": "HTTP_PORT",
    "value": "1652"
  },
  {
    "name": "PEER_PORT",
    "value": "1653"
  },
  {
    "name": "PPS_WORKER_GRPC_PORT",
    "value": "1680"
  },
  {
    "name": "PACH_ROOT",
    "value": "<path-to-non-root-dir>"
  },
```

### 6. StatefulSet configuration

The StatefulSet uses a storage class to create a PVC based on the volume claim template in the IBM Cloud OpenShift cluster.
For reference, this example is based on the `ibmc-file-silver` storage class.

```
{
  "apiVersion": "apps/v1beta1",
  "kind": "StatefulSet",
  "metadata": {
    "labels": {
      "app": "etcd",
      "suite": "pachyderm"
    },
    "name": "etcd",
    "namespace": "pachyderm"
  },
  "spec": {
    "replicas": 2,
    "selector": {
      "matchLabels": {
        "app": "etcd",
        "suite": "pachyderm"
      }
    },
    "serviceName": "etcd-headless",
    "template": {
      "metadata": {
        "labels": {
          "app": "etcd",
          "suite": "pachyderm"
        },
        "name": "etcd",
        "namespace": "pachyderm"
      },
      "spec": {
        "containers": [
          {
            "args": [
              "\"/usr/local/bin/etcd\" \"--listen-client-urls=http://0.0.0.0:2379\" \"--advertise-client-urls=http://0.0.0.0:2379\" \"--data-dir=/var/data/etcd\" \"--auto-compaction-retention=1\" \"--max-txn-ops=10000\" \"--max-request-bytes=52428800\" \"--quota-backend-bytes=8589934592\" \"--listen-peer-urls=http://0.0.0.0:2380\" \"--initial-cluster-token=pach-cluster\" \"--initial-advertise-peer-urls=http://${ETCD_NAME}.etcd-headless.${NAMESPACE}.svc.cluster.local:2380\" \"--initial-cluster=etcd-0=http://etcd-0.etcd-headless.${NAMESPACE}.svc.cluster.local:2380,etcd-1=http://etcd-1.etcd-headless.${NAMESPACE}.svc.cluster.local:2380\""
            ],
            "command": [
              "/bin/sh",
              "-c"
            ],
            "env": [
              {
                "name": "ETCD_NAME",
                "valueFrom": {
                  "fieldRef": {
                    "apiVersion": "v1",
                    "fieldPath": "metadata.name"
                  }
                }
              },
              {
                "name": "NAMESPACE",
                "valueFrom": {
                  "fieldRef": {
                    "apiVersion": "v1",
                    "fieldPath": "metadata.namespace"
                  }
                }
              }
            ],
            "image": "quay.io/coreos/etcd:v3.3.5",
            "imagePullPolicy": "IfNotPresent",
            "name": "etcd",
            "ports": [
              {
                "containerPort": 2379,
                "name": "client-port"
              },
              {
                "containerPort": 2380,
                "name": "peer-port"
              }
            ],
            "resources": {
              "requests": {
                "cpu": "1",
                "memory": "2G"
              }
            },
            "volumeMounts": [
              {
                "mountPath": "/var/data/etcd",
                "name": "etcd-storage"
              }
            ]
          }
        ],
        "imagePullSecrets": null
      }
    },
    "volumeClaimTemplates": [
      {
        "metadata": {
          "annotations": {
            "volume.beta.kubernetes.io/storage-class": "ibmc-file-silver"
          },
          "labels": {
            "app": "etcd",
            "suite": "pachyderm"
          },
          "name": "etcd-storage",
          "namespace": "pachyderm"
        },
        "spec": {
          "accessModes": [
            "ReadWriteOnce"
          ],
          "resources": {
            "requests": {
              "storage": "10Gi"
            }
          }
        }
      }
    ]
  }
}
```

### 7. Deploy the Pachyderm manifest you modified

```shell
oc create -f manifest-statefulset.json
```

You can see the cluster status by using `oc get pods`, as in upstream OpenShift:

```shell
oc get pods
NAME                      READY     STATUS    RESTARTS   AGE
dash-78c4b487dc-sm56p     2/2       Running   0          1m
etcd-0                    1/1       Running   0          3m
etcd-1                    1/1       Running   0          3m
pachd-5655cffbf7-57w4p    1/1       Running   0          3m
```

## Known issues

Problems related to OpenShift deployment are tracked in [issues with the "openshift" label](https://github.com/pachyderm/pachyderm/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+label%3Aopenshift).
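Once the pods are running, you can verify that `pachctl` can reach the cluster through the `api-grpc-port` nodePort (30650 in the examples above). A minimal sketch against a live cluster, where `<node-ip>` is a placeholder for a reachable worker-node address:

```shell
# Point the active pachctl context at the pachd nodePort
# (30650 per the Service manifest above).
pachctl config update context "$(pachctl config get active-context)" \
    --pachd-address="<node-ip>:30650"

# If pachctl can reach pachd, this prints both client and server versions.
pachctl version
```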