github.com/pachyderm/pachyderm@v1.13.4/doc/docs/1.11.x/deploy-manage/deploy/openshift.md

     1  # OpenShift
     2  
     3  [OpenShift](https://www.openshift.com/) is a popular enterprise Kubernetes distribution.
     4  Pachyderm can run on OpenShift with a few small tweaks to the deployment process, which are outlined below.
     5  Please see [known issues](#known-issues) below for current issues with OpenShift deployments.
     6  
     7  ## Prerequisites
     8  
     9  Pachyderm needs a few things to install and run successfully in any Kubernetes environment:
    10  
    11  1. A persistent volume, used by Pachyderm's `etcd` for storage of system metadata.
    12     The kind of PV you provision will be dependent on your infrastructure. 
    13     For example, many on-premises deployments use Network File System (NFS) access to some kind of enterprise storage.
    14  1. An object store, used by Pachyderm's `pachd` for storing all your data. 
    15     The object store you use will probably depend on where you're going to run OpenShift: S3 for [AWS](https://pachyderm.readthedocs.io/en/latest/deployment/amazon_web_services.html), GCS for [Google Cloud Platform](https://pachyderm.readthedocs.io/en/latest/deployment/google_cloud_platform.html), Azure Blob Storage for [Azure](https://pachyderm.readthedocs.io/en/latest/deployment/azure.html), or a storage provider like Minio, EMC's ECS, or Swift providing S3-compatible access to enterprise storage for on-premises deployment.
    16  1. Access to particular TCP/IP ports for communication.
    17  
    18  ### Persistent volume
    19  
    20  You'll need to create a persistent volume with enough space for the metadata associated with the data you plan to store in Pachyderm.
    21  The `pachctl deploy` command for AWS, GCP and Azure creates persistent storage for you when you follow the instructions below.
    22  A custom deploy can also create storage.
    23  We'll show you below how to remove the PV that's automatically created, in case you want to create it outside of the Pachyderm deployment and just consume it.
    24  
    25  We're currently developing good rules of thumb for scaling this storage as your Pachyderm deployment grows,
    26  but it looks like 10G of disk space is sufficient for most purposes.
    27  
    28  ### Object store
    29  
    30  Size your object store generously: once you start using Pachyderm, you'll start versioning all your data.
    31  You'll need four items to configure object storage:
    32  
    33  1. The access endpoint.
    34     For example, Minio's endpoints are usually something like `minio-server:9000`. 
    35     Don't begin it with the protocol; it's an endpoint, not a URL.
    36  1. The bucket name you're dedicating to Pachyderm. Pachyderm will need exclusive access to this bucket.
    37  1. The access key id for the object store.  This is like a user name for logging into the object store.
    38  1. The secret key for the object store.  This is like the above user's password.
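
The endpoint rule above (host and port only, no protocol) is easy to get wrong when copying a value from a browser or a config file. The following is a minimal sketch, not part of Pachyderm, of a helper that strips a scheme and path if they are present. The function name `bare_endpoint` is hypothetical, and `minio-server:9000` is the example value from the list above.

```python
from urllib.parse import urlsplit


def bare_endpoint(value: str) -> str:
    """Return host[:port] with any scheme and path stripped."""
    value = value.strip()
    # urlsplit only recognises the host when a scheme or "//" is present,
    # so prepend "//" to values that are already bare endpoints.
    parts = urlsplit(value if "://" in value else "//" + value)
    if not parts.netloc:
        raise ValueError(f"no host found in {value!r}")
    return parts.netloc


print(bare_endpoint("http://minio-server:9000/"))
print(bare_endpoint("minio-server:9000"))
```

Both calls print the same bare endpoint, which is the form to pass as the access endpoint.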
    39  
    40  ### TCP/IP ports
    41  
    42  For more details on how Kubernetes networking and service definitions work, see the [Kubernetes services 
    43  documentation](https://kubernetes.io/docs/concepts/services-networking/service/).
    44  
    45  #### Incoming ports (port)
    46  
    47  These are the ports internal to the containers.
    48  You'll find these on both the `pachd` and `dash` containers.
    49  OpenShift runs containers and pods as unprivileged users, which cannot bind to port numbers below 1024.
    50  Pachyderm's default manifests use ports below 1024, so you'll have to modify the manifests to use other port numbers.
    51  It's usually as easy as adding a "1" in front of the port numbers we use.
    52  
    53  #### Pod ports (targetPort)
    54  
    55  This is the port exposed by the pod to Kubernetes; traffic sent to the Service `port` is forwarded to it.
    56  You should leave the `targetPort` set at `0` so that it defaults to the `port` value.
    57  
    58  #### External ports (nodePorts)
    59  
    60  This is the port accessible from outside of Kubernetes.
    61  You probably don't need to change `nodePort` values unless your network security requirements or architecture require a different method of access.
    62  Please see the [Kubernetes services documentation](https://kubernetes.io/docs/concepts/services-networking/service/) for details.
    63  
    64  ## The OCPify script
    65  
    66  A bash script that automates many of the substitutions below is available [at this gist](https://gist.github.com/gabrielgrant/86c1a5b590ae3f4b3fd32d7e9d622dc8). 
    67  You can use it to modify a manifest created using the `--dry-run` flag to `pachctl deploy custom`, as detailed below, and then use this guide to ensure the modifications it makes are relevant to your OpenShift environment.
    68  It requires certain prerequisites, such as [jq](https://github.com/stedolan/jq) and [sponge, found in moreutils](https://joeyh.name/code/moreutils/).
    69  
    70  This script may be useful as a basis for automating redeploys of Pachyderm as needed. 
    71  
    72  ### Best practices: Infrastructure as code
    73  
    74  We highly encourage you to apply the best practices used in developing software to managing the deployment process.
    75  
    76  1. Create scripts that automate as much of your processes as possible and keep them under version control.
    77  1. Keep copies of all artifacts, such as manifests, produced by those scripts and keep those under version control.
    78  1. Document your practices in the code and outside it.
    79  
    80  ## Preparing to deploy Pachyderm
    81  
    82  Things you'll need:
    83  1. Your PV.  It can be created separately.
    84  
    85  1. Your object store information.
    86  
    87  1. Your project in OpenShift.
    88  
    89  1. A text editor for editing your deployment manifest.
    90  ## Deploying Pachyderm
    91  ### 1. Setting up PV and object stores
    92  How you deploy Pachyderm on OpenShift is largely going to depend on where OpenShift is deployed. 
    93  Below you'll find links to the documentation for each kind of deployment you can do.
    94  Follow the instructions there for setting up persistent volumes and object storage resources.
    95  Don't deploy your manifest yet; come back here after you've set up your PV and object store.
    96  * OpenShift Deployed on [AWS](https://pachyderm.readthedocs.io/en/latest/deployment/amazon_web_services.html) 
    97  * OpenShift Deployed on [GCP](https://pachyderm.readthedocs.io/en/latest/deployment/google_cloud_platform.html)
    98  * OpenShift Deployed on [Azure](https://pachyderm.readthedocs.io/en/latest/deployment/azure.html)
    99  * OpenShift Deployed [on-premises](https://pachyderm.readthedocs.io/en/latest/deployment/on_premises.html)
   100  ### 2. Determine your role security policy
   101  Pachyderm is deployed by default with cluster roles.
   102  Many institutional OpenShift security policies require namespace-local roles rather than cluster roles.
   103  If your security policies require namespace-local roles, use the [`pachctl deploy` command below with the `--local-roles` flag](#namespace-local-roles).
   104  ### 3. Run the deploy command with --dry-run
   105  Once you have your PV, object store, and project, you can create a manifest for editing using the `--dry-run` argument to `pachctl deploy`.
   106  That step is detailed in the deployment instructions for each type of deployment, above.
   107  
   108  The examples below,
   109  with cluster roles and with namespace-local roles,
   110  use AWS Elastic Block Store (EBS) as a persistent disk with a custom deploy.
   111  We'll show how to remove this PV in case you want to use a PV you create separately.
   112  
   113  #### Cluster roles
   114  ```shell
   115  pachctl deploy custom --persistent-disk aws --object-store s3 \
   116       <pv-storage-name> <pv-storage-size> \
   117       <s3-bucket-name> <s3-access-key-id> <s3-access-secret-key> <s3-access-endpoint-url> \
   118       --static-etcd-volume=<pv-storage-name> --dry-run > manifest.json
   119  ```
   120  
   121  #### Namespace-local roles
   122  ```shell
   123  pachctl deploy custom --persistent-disk aws --object-store s3 \
   124       <pv-storage-name> <pv-storage-size> \
   125       <s3-bucket-name> <s3-access-key-id> <s3-access-secret-key> <s3-access-endpoint-url> \
   126       --static-etcd-volume=<pv-storage-name> --local-roles --dry-run > manifest.json
   127  ```
   128  
   129  ### 4. Modify pachd Service ports
   130  
   131  In the deployment manifest, which we called `manifest.json` above, find the stanza for the `pachd` Service. An example is shown below.
   132  
   133  ```
   134  {
   135  	"kind": "Service",
   136  	"apiVersion": "v1",
   137  	"metadata": {
   138  		"name": "pachd",
   139  		"namespace": "default",
   140  		"creationTimestamp": null,
   141  		"labels": {
   142  			"app": "pachd",
   143  			"suite": "pachyderm"
   144  		},
   145  		"annotations": {
   146  			"prometheus.io/port": "9091",
   147  			"prometheus.io/scrape": "true"
   148  		}
   149  	},
   150  	"spec": {
   151  		"ports": [
   152  			{
   153  				"name": "api-grpc-port",
   154  				"port": 650,
   155  				"targetPort": 0,
   156  				"nodePort": 30650
   157  			},
   158  			{
   159  				"name": "trace-port",
   160  				"port": 651,
   161  				"targetPort": 0,
   162  				"nodePort": 30651
   163  			},
   164  			{
   165  				"name": "api-http-port",
   166  				"port": 652,
   167  				"targetPort": 0,
   168  				"nodePort": 30652
   169  			},
   170  			{
   171  				"name": "saml-port",
   172  				"port": 654,
   173  				"targetPort": 0,
   174  				"nodePort": 30654
   175  			},
   176  			{
   177  				"name": "api-git-port",
   178  				"port": 999,
   179  				"targetPort": 0,
   180  				"nodePort": 30999
   181  			},
   182  			{
   183  				"name": "s3gateway-port",
   184  				"port": 600,
   185  				"targetPort": 0,
   186  				"nodePort": 30600
   187  			}
   188  		],
   189  		"selector": {
   190  			"app": "pachd"
   191  		},
   192  		"type": "NodePort"
   193  	},
   194  	"status": {
   195  		"loadBalancer": {}
   196  	}
   197  }
   198  ```
   199  
   200  While the nodePort declarations are fine, the port declarations are too low for OpenShift. Good example values are shown below.
   201  
   202  ```
   203  	"spec": {
   204  		"ports": [
   205  			{
   206  				"name": "api-grpc-port",
   207  				"port": 1650,
   208  				"targetPort": 0,
   209  				"nodePort": 30650
   210  			},
   211  			{
   212  				"name": "trace-port",
   213  				"port": 1651,
   214  				"targetPort": 0,
   215  				"nodePort": 30651
   216  			},
   217  			{
   218  				"name": "api-http-port",
   219  				"port": 1652,
   220  				"targetPort": 0,
   221  				"nodePort": 30652
   222  			},
   223  			{
   224  				"name": "saml-port",
   225  				"port": 1654,
   226  				"targetPort": 0,
   227  				"nodePort": 30654
   228  			},
   229  			{
   230  				"name": "api-git-port",
   231  				"port": 1999,
   232  				"targetPort": 0,
   233  				"nodePort": 30999
   234  			},
   235  			{
   236  				"name": "s3gateway-port",
   237  				"port": 1600,
   238  				"targetPort": 0,
   239  				"nodePort": 30600
   240  			}
   241  		],
   242  ```
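
If you would rather script this edit than make it by hand, the change can be sketched as a small transformation of the Service object. This is an illustrative sketch only, not the OCPify script: the function name `raise_service_ports` and the `offset` parameter are hypothetical, and it assumes every privileged port is moved up by 1000, which matches the "add a 1 in front" convention for the three-digit ports shown above.

```python
import copy


def raise_service_ports(service: dict, offset: int = 1000) -> dict:
    """Return a copy of a Service with every `port` below 1024 raised by `offset`."""
    patched = copy.deepcopy(service)
    for entry in patched["spec"]["ports"]:
        if entry["port"] < 1024:
            # Only `port` changes; `targetPort` and `nodePort` are left alone.
            entry["port"] += offset
    return patched


# A trimmed-down version of the pachd Service shown above.
service = {
    "kind": "Service",
    "metadata": {"name": "pachd"},
    "spec": {
        "ports": [
            {"name": "api-grpc-port", "port": 650, "targetPort": 0, "nodePort": 30650},
            {"name": "api-git-port", "port": 999, "targetPort": 0, "nodePort": 30999},
        ]
    },
}

patched = raise_service_ports(service)
print([p["port"] for p in patched["spec"]["ports"]])  # [1650, 1999]
```

The function works on a copy, so the original object can still be written out and compared with the patched one before you deploy.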
   243  
   244  ### 5. Modify pachd Deployment ports and add environment variables
   245  In this case you're editing two parts of the `pachd` Deployment JSON.
   246  Here, we'll omit the example of the unmodified version.
   247  Instead, we'll show you the modified version.
   248  #### 5.1 pachd Deployment ports
   249  The `pachd` Deployment also has a set of port numbers in the spec for the `pachd` container. 
   250  Those must be modified to match the port numbers you set above for each port.
   251  
   252  ```
   253  {
   254  	"kind": "Deployment",
   255  	"apiVersion": "apps/v1",
   256  	"metadata": {
   257  		"name": "pachd",
   258  		"namespace": "default",
   259  		"creationTimestamp": null,
   260  		"labels": {
   261  			"app": "pachd",
   262  			"suite": "pachyderm"
   263  		}
   264  	},
   265  	"spec": {
   266  		"replicas": 1,
   267  		"selector": {
   268  			"matchLabels": {
   269  				"app": "pachd",
   270  				"suite": "pachyderm"
   271  			}
   272  		},
   273  		"template": {
   274  			"metadata": {
   275  				"name": "pachd",
   276  				"namespace": "default",
   277  				"creationTimestamp": null,
   278  				"labels": {
   279  					"app": "pachd",
   280  					"suite": "pachyderm"
   281  				},
   282  				"annotations": {
   283  					"iam.amazonaws.com/role": ""
   284  				}
   285  			},
   286  			"spec": {
   287  				"volumes": [
   288  					{
   289  						"name": "pach-disk"
   290  					},
   291  					{
   292  						"name": "pachyderm-storage-secret",
   293  						"secret": {
   294  							"secretName": "pachyderm-storage-secret"
   295  						}
   296  					}
   297  				],
   298  				"containers": [
   299  					{
   300  						"name": "pachd",
   301  						"image": "pachyderm/pachd:{{ config.pach_latest_version }}",
   302  						"ports": [
   303  							{
   304  								"name": "api-grpc-port",
   305  								"containerPort": 1650,
   306  								"protocol": "TCP"
   307  							},
   308  							{
   309  								"name": "trace-port",
   310  								"containerPort": 1651
   311  							},
   312  							{
   313  								"name": "api-http-port",
   314  								"containerPort": 1652,
   315  								"protocol": "TCP"
   316  							},
   317  							{
   318  								"name": "peer-port",
   319  								"containerPort": 1653,
   320  								"protocol": "TCP"
   321  							},
   322  							{
   323  								"name": "api-git-port",
   324  								"containerPort": 1999,
   325  								"protocol": "TCP"
   326  							},
   327  							{
   328  								"name": "saml-port",
   329  								"containerPort": 1654,
   330  								"protocol": "TCP"
   331  							}
   332  						],
   333  
   334  ```
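
Because the Service and Deployment ports are edited separately, it is easy to leave one of them out of sync. The sketch below compares the two objects by port name and reports disagreements. The function name `mismatched_ports` is hypothetical, and the check assumes port names are shared between the Service and the `pachd` container, as in the examples above. Names that appear on only one side, such as `peer-port`, are skipped.

```python
def mismatched_ports(service: dict, deployment: dict, container: str = "pachd") -> dict:
    """Map port name -> (Service port, containerPort) for names that disagree."""
    service_ports = {p["name"]: p["port"] for p in service["spec"]["ports"]}
    containers = deployment["spec"]["template"]["spec"]["containers"]
    target = next(c for c in containers if c["name"] == container)
    mismatches = {}
    for p in target.get("ports", []):
        expected = service_ports.get(p["name"])
        # Ports that exist only on the container have nothing to compare against.
        if expected is not None and expected != p["containerPort"]:
            mismatches[p["name"]] = (expected, p["containerPort"])
    return mismatches


service = {"spec": {"ports": [
    {"name": "api-grpc-port", "port": 1650},
    {"name": "api-http-port", "port": 1652},
]}}
deployment = {"spec": {"template": {"spec": {"containers": [{
    "name": "pachd",
    "ports": [
        {"name": "api-grpc-port", "containerPort": 1650},
        {"name": "api-http-port", "containerPort": 652},  # forgotten edit
        {"name": "peer-port", "containerPort": 1653},
    ],
}]}}}}

print(mismatched_ports(service, deployment))  # {'api-http-port': (1652, 652)}
```

An empty result means every shared port name carries the same number in both objects.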
   335  
   336  #### 5.2 Add environment variables
   337  
   338  You need to configure the following environment variables for
   339  OpenShift:
   340  
   341  1. `WORKER_USES_ROOT`: This controls whether worker pipelines run as the root user or not. You'll need to set it to `false`.
   342  1. `PORT`: This is the gRPC port used by `pachd` for communication with `pachctl` and the API.  It should be set to the same value you set for `api-grpc-port` above.
   343  1. `HTTP_PORT`: The port for the API proxy.  It should be set to the same value you set for `api-http-port` above.
   344  1. `PEER_PORT`: Used to coordinate between `pachd` instances. It should be set to the same value you set for `peer-port` above.
   345  1. `PPS_WORKER_GRPC_PORT`: Used to talk to pipelines. Should be set to a value above 1024.  The example value of 1680 below is recommended.
   346  1. `PACH_ROOT`: The Pachyderm root directory. In an OpenShift
   347  deployment, you need to set this value to a directory to which non-root users
   348  have write access, which might depend on your container image. By
   349  default, `PACH_ROOT` is set to `/pach`, which requires root privileges.
   350  Because you do not have root access in OpenShift, you need to modify
   351  the default setting.
   352  
   353  The added values below are shown inserted above the `PACH_ROOT` value, which is typically the first value in this array.
   354  The rest of the stanza is omitted for clarity.
   355  
   356  ```
   357  						"env": [
   358  							{
   359  								"name": "WORKER_USES_ROOT",
   360  								"value": "false"
   361  							},
   362  							{
   363  								"name": "PORT",
   364  								"value": "1650"
   365  							},
   366  							{
   367  								"name": "HTTP_PORT",
   368  								"value": "1652"
   369  							},
   370  							{
   371  								"name": "PEER_PORT",
   372  								"value": "1653"
   373  							},
   374  							{
   375  								"name": "PPS_WORKER_GRPC_PORT",
   376  								"value": "1680"
   377  							},
   378  							{
   379  								"name": "PACH_ROOT",
   380  								"value": "<path-to-non-root-dir>"
   381  							},
   382  
   383  ```
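
The same edit can be sketched as an "upsert" on the container's `env` array: update a variable if it is already present, otherwise insert it ahead of the existing entries. The helper name `upsert_env` is hypothetical. The values are the example values from this guide, and `<path-to-non-root-dir>` is kept as a placeholder that you must replace yourself. Note that Kubernetes expects environment values as strings, so the port numbers are quoted.

```python
OPENSHIFT_ENV = {
    "WORKER_USES_ROOT": "false",
    "PORT": "1650",
    "HTTP_PORT": "1652",
    "PEER_PORT": "1653",
    "PPS_WORKER_GRPC_PORT": "1680",
    "PACH_ROOT": "<path-to-non-root-dir>",  # placeholder, not a real path
}


def upsert_env(container: dict, values: dict) -> None:
    """Set each variable in `values` on the container, modifying it in place."""
    env = container.setdefault("env", [])
    by_name = {item["name"]: item for item in env}
    new_items = []
    for name, value in values.items():
        if name in by_name:
            by_name[name]["value"] = value  # overwrite the existing entry
        else:
            new_items.append({"name": name, "value": value})
    env[:0] = new_items  # new variables go in front of the existing ones


# A trimmed-down pachd container with only the default PACH_ROOT entry.
container = {"name": "pachd", "env": [{"name": "PACH_ROOT", "value": "/pach"}]}
upsert_env(container, OPENSHIFT_ENV)
print([item["name"] for item in container["env"]])
```

Running the helper a second time leaves the array unchanged, which makes it safe to use in a redeploy script.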
   384  
   385  ### 6. (Optional) Remove the PV created during the deploy command
   386  If you're using a PV you've created separately, remove the PV that was added to your manifest by `pachctl deploy --dry-run`.  Here's the example PV we created with the deploy command we used above, so you can recognize it.
   387  
   388  ```
   389  {
   390  	"kind": "PersistentVolume",
   391  	"apiVersion": "v1",
   392  	"metadata": {
   393  		"name": "etcd-volume",
   394  		"namespace": "default",
   395  		"creationTimestamp": null,
   396  		"labels": {
   397  			"app": "etcd",
   398  			"suite": "pachyderm"
   399  		}
   400  	},
   401  	"spec": {
   402  		"capacity": {
   403  			"storage": "10Gi"
   404  		},
   405  		"awsElasticBlockStore": {
   406  			"volumeID": "pach-disk",
   407  			"fsType": "ext4"
   408  		},
   409  		"accessModes": [
   410  			"ReadWriteOnce"
   411  		],
   412  		"persistentVolumeReclaimPolicy": "Retain"
   413  	},
   414  	"status": {}
   415  }
   416  ```
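
Removing the stanza can also be scripted. The sketch below assumes that the manifest file is a sequence of JSON objects written one after another rather than a single JSON array; check your own `manifest.json` before relying on it. The function names `split_manifest` and `drop_persistent_volumes` are hypothetical. Only objects whose `kind` is exactly `PersistentVolume` are dropped, so other kinds, including any `PersistentVolumeClaim`, are left in place.

```python
import json


def split_manifest(text: str) -> list:
    """Parse a string of concatenated JSON objects into a list of dicts."""
    decoder = json.JSONDecoder()
    objects, pos = [], 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        objects.append(obj)
    return objects


def drop_persistent_volumes(objects: list) -> list:
    """Return the objects whose kind is not PersistentVolume."""
    return [o for o in objects if o.get("kind") != "PersistentVolume"]


sample = """
{"kind": "PersistentVolume", "metadata": {"name": "etcd-volume"}}
{"kind": "PersistentVolumeClaim", "metadata": {"name": "example-claim"}}
{"kind": "Service", "metadata": {"name": "pachd"}}
"""

kept = drop_persistent_volumes(split_manifest(sample))
print([o["kind"] for o in kept])  # ['PersistentVolumeClaim', 'Service']
```

To write the result back in the same shape, serialize each remaining object with `json.dumps` and join them with newlines.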
   417  
   418  ### 7. Deploy the Pachyderm manifest you modified
   419  
   420  ```shell
   421  oc create -f manifest.json
   422  ```
   423  
   424  You can see the cluster status by using `oc get pods` as in upstream Kubernetes:
   425  
   426  ```shell
   427      oc get pods
   428      NAME                     READY     STATUS    RESTARTS   AGE
   429      dash-6c9dc97d9c-89dv9    2/2       Running   0          1m
   430      etcd-0                   1/1       Running   0          4m
   431      pachd-65fd68d6d4-8vjq7   1/1       Running   0          4m
   432  ```
   433  
   434  ## Known issues
   435  
   436  Problems related to OpenShift deployment are tracked in [issues with the "openshift" label](https://github.com/pachyderm/pachyderm/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+label%3Aopenshift).