# Migrate to a Minor or Major Version

!!! info
    If you need to upgrade Pachyderm from one patch release
    to another, such as from x.xx.0 to x.xx.1, see
    [Upgrade Pachyderm](upgrades.md).

As new versions of Pachyderm are released, you might need to update your
cluster to get access to bug fixes and new features.

Migrations involve moving between major releases, such as 1.x.x to
2.x.x, or minor releases, such as 1.10.x to {{ config.pach_latest_version }}.

!!! tip
    Pachyderm follows the [Semantic Versioning](https://semver.org/)
    specification to manage the release process.

Pachyderm stores all of its state in the following places:

* In `etcd`, which in turn stores its state in one or more persistent volumes
  that were created when the Pachyderm cluster was deployed. `etcd` stores
  metadata about your pipelines, repositories, and other Pachyderm primitives.

* In an object store bucket, such as AWS S3, MinIO, or Azure Blob Storage.
  Actual data is stored here.

In a migration, the data structures stored in those locations need to be
read, transformed, and rewritten. Therefore, this process involves the
following steps:

1. Back up your cluster by exporting the existing Pachyderm cluster's repos,
   pipelines, and input commits to a backup file and, optionally, to an S3 bucket.
1. Bring up a new Pachyderm cluster adjacent to the old Pachyderm cluster, either
   in a separate namespace or in a separate Kubernetes cluster.
1. Restore the old cluster's repos, commits, and pipelines into the new
   cluster.

!!! warning
    Whether you are upgrading or migrating your cluster, you must back it up
    to guarantee that you can restore it after migration.

## Step 1 - Back up Your Cluster

Before migrating your cluster, create a backup that you can use to restore your
cluster from. For large amounts of data that are stored in an S3 object store,
we recommend that you use your cloud provider's capabilities to copy your data
into a new bucket while backing up information about Pachyderm objects to a
local file. For smaller deployments, you can copy everything into a local
file and then restore from that file.

To back up your cluster, complete the following steps:

1. Back up your cluster by running the `pachctl extract` command with the
   `--no-objects` flag as described in [Back up Your Cluster](../backup_restore/).
   A sketch of this command is shown after this list.

1. In your cloud provider, create a new S3 bucket with the same Permissions
   policy that you assigned to the original cluster bucket. For example,
   if your cluster is on EKS, create the same Permissions policy as described
   in [Deploy Pachyderm with an IAM Role](../../deploy/amazon_web_services/aws-deploy-pachyderm/#deploy-pachyderm-with-an-iam-role).

1. Clone the S3 bucket that you used for the old cluster to this new bucket.
   Follow the instructions for your cloud provider:

    * If you use Google Cloud, see the [gsutil instructions](https://cloud.google.com/storage/docs/gsutil/commands/cp).
    * If you use Microsoft Azure, see the [azcopy instructions](https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-linux?toc=%2fazure%2fstorage%2ffiles%2ftoc.json).
    * If you use Amazon EKS, see the [AWS CLI instructions](https://docs.aws.amazon.com/cli/latest/reference/s3/sync.html).

    **Example:**

    ```shell
    aws s3 sync s3://mybucket s3://mybucket2
    ```

1. Proceed to [Step 2](#step-2-restore-all-paused-pipelines).
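The following is a minimal sketch of the metadata-only backup referenced in the
first step above. The backup file name and the bucket URL are placeholders, and
your version of `pachctl` must support these flags:

```shell
# Extract repo, pipeline, and commit metadata only; object data stays in the bucket:
pachctl extract --no-objects > pachyderm-backup.tar

# Alternatively, write the backup directly to an object-store URL:
pachctl extract --no-objects --url s3://my-pachyderm-backup-bucket
```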
## Step 2 - Restore All Paused Pipelines

If you want to minimize downtime and run your pipelines while you are migrating
your cluster, you can restart all paused pipelines and data loading operations
after the backup and clone operations are complete.

To restore all paused pipelines, complete the following steps:

1. Run the `pachctl start pipeline` command on each paused pipeline or
   use the following multi-line shell script to restart all pipelines at once:

    ```pachctl tab="Command"
    pachctl start pipeline <pipeline-name>
    ```

    ```shell tab="Script"
    pachctl list pipeline --raw \
    | jq -r '.pipeline.name' \
    | xargs -P3 -n1 -I{} pachctl start pipeline {}
    ```

    You might need to install `jq` and other utilities to run the script.

1. Confirm that each pipeline has started by using the `list pipeline` command:

    ```shell
    pachctl list pipeline
    ```

* If you have switched the ports to stop data loading from outside sources,
  change the ports back (a spot-check sketch follows at the end of this section):

    1. Back up the current configuration:

        ```shell
        kubectl get svc/pachd -o json >pachd_service_backup_30649.json
        kubectl get svc/etcd -o json >etcd_svc_backup_32379.json
        kubectl get svc/dash -o json >dash_svc_backup_30080.json
        ```

    1. Modify the services to accept traffic on the corresponding ports to
       avoid collisions with the migration cluster:

        ```shell
        # Modify the pachd API endpoint to run on 30650:
        kubectl get svc/pachd -o json | sed 's/30649/30650/g' | kubectl apply -f -
        # Modify the pachd trace port to run on 30651:
        kubectl get svc/pachd -o json | sed 's/30648/30651/g' | kubectl apply -f -
        # Modify the pachd api-over-http port to run on 30652:
        kubectl get svc/pachd -o json | sed 's/30647/30652/g' | kubectl apply -f -
        # Modify the pachd SAML authentication port to run on 30654:
        kubectl get svc/pachd -o json | sed 's/30646/30654/g' | kubectl apply -f -
        # Modify the pachd git API callback port to run on 30655:
        kubectl get svc/pachd -o json | sed 's/30644/30655/g' | kubectl apply -f -
        # Modify the pachd s3 port to run on 30600:
        kubectl get svc/pachd -o json | sed 's/30611/30600/g' | kubectl apply -f -
        # Modify the etcd client port to run on 32379:
        kubectl get svc/etcd -o json | sed 's/32378/32379/g' | kubectl apply -f -
        # Modify the dashboard ports to run on 30080 and 30081:
        kubectl get svc/dash -o json | sed 's/30079/30080/g' | kubectl apply -f -
        kubectl get svc/dash -o json | sed 's/30078/30081/g' | kubectl apply -f -
        ```

    1. Modify your environment so that you can access `pachd` on the old port:

        ```shell
        pachctl config update context `pachctl config get active-context` --pachd-address=<cluster ip>:30650
        ```

    1. Verify that you can access `pachd`:

        ```shell
        pachctl version
        ```

        **System Response:**

        ```
        COMPONENT           VERSION
        pachctl             {{ config.pach_latest_version }}
        pachd               {{ config.pach_latest_version }}
        ```

        If the command above hangs, you might need to adjust your firewall rules.
        Your old Pachyderm cluster can operate while you are creating a migrated
        one.

1. Proceed to [Step 3](#step-3-deploy-a-pachyderm-cluster-with-the-cloned-bucket).
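If you changed the ports back as described above, a quick spot-check of the
resulting node ports can save debugging time. This is a minimal sketch that
assumes the default `pachd` and `etcd` service names used in the commands above:

```shell
# Print each service port name and the node port it is currently exposed on:
kubectl get svc/pachd -o jsonpath='{range .spec.ports[*]}{.name}{"\t"}{.nodePort}{"\n"}{end}'
kubectl get svc/etcd -o jsonpath='{range .spec.ports[*]}{.name}{"\t"}{.nodePort}{"\n"}{end}'
```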
## Step 3 - Deploy a Pachyderm Cluster with the Cloned Bucket

After you create a backup of your existing cluster, you need to create a new
Pachyderm cluster by using the bucket that you cloned in [Step 1](#step-1-back-up-your-cluster).

This new cluster can be deployed:

* On the same Kubernetes cluster in a separate namespace.
* On a different Kubernetes cluster within the same cloud provider.

If you are deploying in a namespace on the same Kubernetes cluster,
you might need to modify the Kubernetes ingress to the Pachyderm deployment in
the new namespace to avoid port conflicts in the same cluster.
Consult with your Kubernetes administrator for information on avoiding
ingress conflicts.

If you have issues with the extracted data, rerun the instructions in
[Step 1](#step-1-back-up-your-cluster).

To deploy a Pachyderm cluster with a cloned bucket, complete the following
steps:

1. Upgrade your version of `pachctl` to the latest version:

    ```shell
    brew upgrade pachyderm/tap/pachctl@1.11
    ```

* If you are deploying your cluster in a separate Kubernetes namespace,
  create a new namespace:

    ```shell
    kubectl create namespace <new-cluster-namespace>
    ```

1. Deploy your cluster in a separate namespace or on a separate Kubernetes
   cluster by using the `pachctl deploy` command for your cloud provider with
   the `--namespace` flag.

    **Examples:**

    ```pachctl tab="AWS EKS"
    pachctl deploy amazon <bucket-name> <region> <storage-size> --dynamic-etcd-nodes=<number> --iam-role <iam-role> --namespace=<namespace-name>
    ```

    ```shell tab="GKE"
    pachctl deploy google <bucket-name> <storage-size> --dynamic-etcd-nodes=1 --namespace=<namespace-name>
    ```

    ```shell tab="Azure"
    pachctl deploy microsoft <account-name> <storage-account> <storage-key> <storage-size> --dynamic-etcd-nodes=<number> --namespace=<namespace-name>
    ```

    **Note:** Parameters for your Pachyderm cluster deployment might be different.
    For more information, see [Deploy Pachyderm](../../deploy/).

1. Verify that your cluster has been deployed:

    ```pachctl tab="In a Namespace"
    kubectl get pod --namespace=<new-cluster>
    ```

    ```shell tab="On a Separate Cluster"
    kubectl get pod
    ```

    * If you have deployed your new cluster in a namespace, Pachyderm should
      have created a new context for this deployment. Verify that you are
      using this new context.

1. Proceed to [Step 4](#step-4-restore-your-cluster).

## Step 4 - Restore your Cluster

After you have created a new cluster, you can restore your backup to this
new cluster. If you have deployed your new cluster in a namespace, Pachyderm
should have created a new context for this deployment. You need to switch to
this new context to access the correct cluster. Before you run the
`pachctl restore` command, your new cluster should be empty.
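For example, a minimal sketch of switching contexts, assuming a deployment into
a namespace on the same Kubernetes cluster; `<new-cluster-context>` is a
placeholder for whatever context name the new deployment created:

```shell
# Switch pachctl to the context that the new deployment created:
pachctl config set active-context <new-cluster-context>

# Confirm which context is now active:
pachctl config get active-context
```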
To restore your cluster, complete the following steps:

* If you deployed your new cluster into a different namespace on the same
  Kubernetes cluster as your old cluster, verify that you are in the correct
  namespace:

    ```shell
    pachctl config get context `pachctl config get active-context`
    ```

    **Example System Response:**

    ``` hl_lines="5"
    {
      "source": "IMPORTED",
      "cluster_name": "test-migration.us-east-1.eksctl.io",
      "auth_info": "user@test-migration.us-east-1.eksctl.io",
      "namespace": "new-cluster"
    }
    ```

    Your active context must have the namespace into which you have deployed
    your new cluster.

1. Check that the cluster does not have any existing Pachyderm objects:

    ```shell
    pachctl list repo && pachctl list pipeline
    ```

    You should get empty output.

1. Restore your cluster from the backup that you created in
   [Step 1](#step-1-back-up-your-cluster):

    ```pachctl tab="From a Local File"
    pachctl restore < path/to/your/backup/file
    ```

    ```pachctl tab="From an S3 Bucket"
    pachctl restore --url s3://path/to/backup
    ```

    This S3 bucket is different from the S3 bucket to which you cloned
    your Pachyderm data. It is merely a bucket that you allocated to hold
    the Pachyderm backup without objects.

1. Configure any external data loading systems to point at the new,
   upgraded Pachyderm cluster and play back transactions from the checkpoint
   established in [Pause External Data Operations](./backup-migrations/#pause-external-data-loading-operations).
   Perform any reconfiguration to data loading or unloading operations.
   Confirm that the data output is as expected and the new cluster is operating
   as expected.

1. Disable the old cluster:

    * If you have deployed the new cluster on the same Kubernetes cluster,
      switch to the old cluster's Pachyderm context:

        ```shell
        pachctl config set active-context <old-context>
        ```

    * If you have deployed the new cluster to a different Kubernetes cluster,
      switch to the old cluster's Kubernetes context:

        ```shell
        kubectl config use-context <old cluster>
        ```

1. Undeploy your old cluster:

    ```pachctl
    pachctl undeploy
    ```

1. Reconfigure the new cluster as necessary. You might need to reconfigure
   the following:

    - Data loading operations from Pachyderm to processes outside
      of it to work as expected.
    - Kubernetes ingress and port changes taken to avoid conflicts
      with the old cluster.
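As a final check, the following is a minimal sketch of pointing your tooling at
the new cluster after the old one is undeployed. The IP address is a
placeholder, and the 30650 node port assumes the default mapping, which may
differ if you remapped ports to avoid conflicts:

```shell
# Point the active pachctl context (and any scripts that read it) at the new cluster's pachd:
pachctl config update context `pachctl config get active-context` --pachd-address=<new-cluster-ip>:30650

# Confirm connectivity before re-enabling external data loading:
pachctl version
```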