# Migrate to a Minor or Major Version

!!! info
    If you need to upgrade Pachyderm from one patch
    to another, such as from x.xx.0 to x.xx.1, see
    [Upgrade Pachyderm](upgrades.md).

As new versions of Pachyderm are released, you might need to update your
cluster to get access to bug fixes and new features.

Migrations involve moving between major releases, such as 1.x.x to
2.x.x, or minor releases, such as 1.11.x to 1.12.0.

!!! tip
    Pachyderm follows the [Semantic Versioning](https://semver.org/)
    specification to manage the release process.

Pachyderm stores all of its state in the following places:

* In `etcd`, which in turn stores its state in one or more persistent volumes
  that were created when the Pachyderm cluster was deployed. `etcd` stores
  metadata about your pipelines, repositories, and other Pachyderm primitives.

* In an object store bucket, such as AWS S3, MinIO, or Azure Blob Storage.
  Actual data is stored here.

In a migration, the data structures stored in those locations need to be
read, transformed, and rewritten. Therefore, this process involves the
following steps:

1. Back up your cluster by exporting the existing Pachyderm cluster's repos,
   pipelines, and input commits to a backup file and optionally to an S3 bucket.
1. Bring up a new Pachyderm cluster adjacent to the old Pachyderm cluster either
   in a separate namespace or in a separate Kubernetes cluster.
1. Restore the old cluster's repos, commits, and pipelines into the new
   cluster.

!!! warning
    Whether you are upgrading or migrating your cluster, you must back it up
    to guarantee that you can restore it after migration.

## Step 1 - Back up Your Cluster

Before migrating your cluster, create a backup that you can use to restore your
cluster from.
For large amounts of data that are stored in an S3 object store,
we recommend that you use the cloud provider capabilities to copy your data
into a new bucket while backing up information about Pachyderm objects to a
local file. For smaller deployments, you can copy everything into a local
file and then restore from that file.

To back up your cluster, complete the following steps:

1. Back up your cluster by running the `pachctl extract` command with the
   `--no-objects` flag as described in [Back up Your Cluster](../backup_restore/).

1. In your cloud provider, create a new S3 bucket with the same Permissions
   policy that you assigned to the original cluster bucket. For example,
   if your cluster is on EKS, create the same Permissions policy as described
   in [Deploy Pachyderm with an IAM Role](../../deploy/amazon_web_services/aws-deploy-pachyderm/#deploy-pachyderm-with-an-iam-role).

1. Clone the S3 bucket that you used for the old cluster to this new bucket.
   Follow the instructions for your cloud provider:

   * If you use Google Cloud, see the [gsutil instructions](https://cloud.google.com/storage/docs/gsutil/commands/cp).
   * If you use Microsoft Azure, see the [azcopy instructions](https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-linux?toc=%2fazure%2fstorage%2ffiles%2ftoc.json).
   * If you use Amazon EKS, see the [AWS CLI instructions](https://docs.aws.amazon.com/cli/latest/reference/s3/sync.html).

   **Example:**

   ```shell
   aws s3 sync s3://mybucket s3://mybucket2
   ```

1. Proceed to [Step 2](#step-2-restore-all-paused-pipelines).

## Step 2 - Restore All Paused Pipelines

If you want to minimize downtime and run your pipelines while you are migrating
your cluster, you can restart all paused pipelines and data loading operations
after the backup and clone operations are complete.

To restore all paused pipelines, complete the following steps:

1. Run the `pachctl start pipeline` command on each paused pipeline one-by-one, or
   use the multi-line shell script to restart pipelines all-at-once:

   === "one-by-one"
       ```shell
       pachctl start pipeline <pipeline-name>
       ```

   === "all-at-once"
       ```shell
       pachctl list pipeline --raw \
       | jq -r '.pipeline.name' \
       | xargs -P3 -n1 -I{} pachctl start pipeline {}
       ```

   You might need to install `jq` and other utilities to run the script.

1. Confirm that each pipeline is started using the `list pipeline` command:

   ```shell
   pachctl list pipeline
   ```

   * If you have switched the ports to stop data loading from outside sources,
     change the ports back:

     1. Back up the current configuration:

        ```shell
        kubectl get svc/pachd -o json >pachd_service_backup_30649.json
        kubectl get svc/etcd -o json >etcd_svc_backup_32379.json
        kubectl get svc/dash -o json >dash_svc_backup_30080.json
        ```

     1. Modify the services to accept traffic on the corresponding ports to
        avoid collisions with the migration cluster:

        ```shell
        # Modify the pachd API endpoint to run on 30650:
        kubectl get svc/pachd -o json | sed 's/30649/30650/g' | kubectl apply -f -
        # Modify the pachd trace port to run on 30651:
        kubectl get svc/pachd -o json | sed 's/30648/30651/g' | kubectl apply -f -
        # Modify the pachd api-over-http port to run on 30652:
        kubectl get svc/pachd -o json | sed 's/30647/30652/g' | kubectl apply -f -
        # Modify the pachd SAML authentication port to run on 30654:
        kubectl get svc/pachd -o json | sed 's/30646/30654/g' | kubectl apply -f -
        # Modify the pachd git API callback port to run on 30655:
        kubectl get svc/pachd -o json | sed 's/30644/30655/g' | kubectl apply -f -
        # Modify the pachd s3 port to run on 30600:
        kubectl get svc/pachd -o json | sed 's/30611/30600/g' | kubectl apply -f -
        # Modify the etcd client port to run on 32379:
        kubectl get svc/etcd -o json | sed 's/32378/32379/g' | kubectl apply -f -
        # Modify the dashboard ports to run on 30081 and 30080:
        kubectl get svc/dash -o json | sed 's/30079/30080/g' | kubectl apply -f -
        kubectl get svc/dash -o json | sed 's/30078/30081/g' | kubectl apply -f -
        ```

     1. Modify your environment so that you can access `pachd` on the old port:

        ```shell
        pachctl config update context `pachctl config get active-context` --pachd-address=<cluster ip>:30650
        ```

     1. Verify that you can access `pachd`:

        ```shell
        pachctl version
        ```

        **System Response:**

        ```
        COMPONENT           VERSION
        pachctl             {{ config.pach_latest_version }}
        pachd               {{ config.pach_latest_version }}
        ```

        If the command above hangs, you might need to adjust your firewall rules.
        Your old Pachyderm cluster can operate while you are creating a migrated
        one.

1. Proceed to [Step 3](#step-3-deploy-a-pachyderm-cluster-with-the-cloned-bucket).

## Step 3 - Deploy a Pachyderm Cluster with the Cloned Bucket

After you create a backup of your existing cluster, you need to create a new
Pachyderm cluster by using the bucket you cloned in [Step 1](#step-1-back-up-your-cluster).

This new cluster can be deployed:

* On the same Kubernetes cluster in a separate namespace.
* On a different Kubernetes cluster within the same cloud provider.

If you are deploying in a namespace on the same Kubernetes cluster,
you might need to modify the Kubernetes ingress to the Pachyderm deployment in the
new namespace to avoid port conflicts in the same cluster.
Consult with your Kubernetes administrator for information on avoiding
ingress conflicts.
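
When both clusters share one Kubernetes cluster, it can help to see which of the ports you plan to use are already taken before you deploy. The following is a minimal sketch, not a Pachyderm tool: the function name `find_taken_ports` is hypothetical, and the `kubectl` invocation in the usage comment is an assumption about how you might collect the node ports currently in use. The function reads one in-use port per line on standard input and prints each candidate port, given as an argument, that already appears in that list.

```shell
# Hypothetical helper (not part of pachctl or kubectl).
# stdin:     ports already in use, one per line
# arguments: candidate ports for the new deployment
# stdout:    the candidates that are already taken, in argument order
find_taken_ports() {
  in_use="$(cat)"
  for port in "$@"; do
    # -x matches the whole line, so 3065 does not match 30650.
    if printf '%s\n' "$in_use" | grep -qx -- "$port"; then
      echo "$port"
    fi
  done
}

# Example usage (assumes kubectl access to the shared cluster):
#   kubectl get svc --all-namespaces \
#     -o jsonpath='{.items[*].spec.ports[*].nodePort}' \
#     | tr ' ' '\n' \
#     | find_taken_ports 30650 30651 30652 30600
```

Any port that the function prints is already claimed by an existing service, so pick a different value for the new deployment or coordinate the change with your Kubernetes administrator.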

If you have issues with the extracted data, rerun the instructions in
[Step 1](#step-1-back-up-your-cluster).

To deploy a Pachyderm cluster with a cloned bucket, complete the following
steps:

1. Upgrade your Pachyderm version to the latest version:

   ```shell
   brew upgrade pachyderm/tap/pachctl@1.11
   ```

   * If you are deploying your cluster in a separate Kubernetes namespace,
     create a new namespace:

     ```shell
     kubectl create namespace <new-cluster-namespace>
     ```

1. Deploy your cluster in a separate namespace or on a separate Kubernetes
   cluster by using a `pachctl deploy` command for your cloud provider with the
   `--namespace` flag.

   **Examples:**

   === "AWS EKS"
       ```shell
       pachctl deploy amazon <bucket-name> <region> <storage-size> --dynamic-etcd-nodes=<number> --iam-role <iam-role> --namespace=<namespace-name>
       ```

   === "GKE"
       ```shell
       pachctl deploy google <bucket-name> <storage-size> --dynamic-etcd-nodes=1 --namespace=<namespace-name>
       ```

   === "Azure"
       ```shell
       pachctl deploy microsoft <account-name> <storage-account> <storage-key> <storage-size> --dynamic-etcd-nodes=<number> --namespace=<namespace-name>
       ```

   **Note:** Parameters for your Pachyderm cluster deployment might be different.
   For more information, see [Deploy Pachyderm](../../deploy/).

1. Verify that your cluster has been deployed:

   === "In a namespace"
       ```shell
       kubectl get pod --namespace=<new-cluster>
       ```

   === "On a cluster"
       ```shell
       kubectl get pod
       ```

   * If you have deployed your new cluster in a namespace, Pachyderm should
     have created a new context for this deployment. Verify that you are
     using this context.

1. Proceed to [Step 4](#step-4-restore-your-cluster).

## Step 4 - Restore Your Cluster

After you have created a new cluster, you can restore your backup to this
new cluster. If you have deployed your new cluster in a namespace, Pachyderm
should have created a new context for this deployment. You need to switch to
this new context to access the correct cluster. Before you run the
`pachctl restore` command, your new cluster should be empty.

To restore your cluster, complete the following steps:

* If you deployed your new cluster into a different namespace on the same
  Kubernetes cluster as your old cluster, verify that you are in the correct namespace:

  ```shell
  pachctl config get context `pachctl config get active-context`
  ```

  **Example System Response:**

  ``` hl_lines="5"
  {
    "source": "IMPORTED",
    "cluster_name": "test-migration.us-east-1.eksctl.io",
    "auth_info": "user@test-migration.us-east-1.eksctl.io",
    "namespace": "new-cluster"
  }
  ```

  Your active context must have the namespace you have deployed your new
  cluster into.

1. Check that the cluster does not have any existing Pachyderm objects:

   ```shell
   pachctl list repo && pachctl list pipeline
   ```

   You should get empty output.

1. Restore your cluster from the backup you have created in
   [Step 1](#step-1-back-up-your-cluster):

   === "Local File"
       ```shell
       pachctl restore < path/to/your/backup/file
       ```

   === "S3 Bucket"
       ```shell
       pachctl restore --url s3://path/to/backup
       ```

   This S3 bucket is different from the S3 bucket to which you cloned
   your Pachyderm data. This is merely a bucket you allocated to hold
   the Pachyderm backup without objects.

1. Configure any external data loading systems to point at the new,
   upgraded Pachyderm cluster and play back transactions from the checkpoint
   established at [Pause External Data Operations](./backup-migrations/#pause-external-data-loading-operations).
   Perform any reconfiguration to data loading or unloading operations.
   Confirm that the data output and the new cluster's behavior are as expected.

1. Disable the old cluster:

   * If you have deployed the new cluster on the same Kubernetes cluster,
     switch to the old cluster's Pachyderm context:

     ```shell
     pachctl config set active-context <old-context>
     ```

   * If you have deployed the new cluster to a different Kubernetes cluster,
     switch to the old cluster's Kubernetes context:

     ```shell
     kubectl config use-context <old cluster>
     ```

1. Undeploy your old cluster:

   ```shell
   pachctl undeploy
   ```

1. Reconfigure the new cluster as necessary.
   You might need to reconfigure the following:

   - Data loading operations from Pachyderm to processes outside
     of it to work as expected.
   - Kubernetes ingress and port changes taken to avoid conflicts
     with the old cluster.
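
Before you undeploy the old cluster, one way to confirm that the new cluster is operating as expected is to compare the pipeline names in both clusters. The following is a minimal sketch under stated assumptions: the function name `missing_in_new` and the file names in the usage comment are hypothetical, and the lists are assumed to hold one name per line, produced with the same `pachctl list pipeline --raw | jq` pipeline used in Step 2.

```shell
# Hypothetical helper (not part of pachctl).
# $1: file with names from the old cluster, one per line
# $2: file with names from the new cluster, one per line
# stdout: names present in the old cluster but absent from the new one
missing_in_new() {
  old_list="$1"
  new_list="$2"
  # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file.
  # grep exits with 1 when nothing is missing; treat that as success.
  grep -Fxv -f "$new_list" "$old_list" || [ $? -eq 1 ]
}

# Example usage (file names are placeholders):
#   # with the old cluster's context active
#   pachctl list pipeline --raw | jq -r '.pipeline.name' > old-pipelines.txt
#   # with the new cluster's context active
#   pachctl list pipeline --raw | jq -r '.pipeline.name' > new-pipelines.txt
#   missing_in_new old-pipelines.txt new-pipelines.txt
```

Empty output suggests that every pipeline from the old cluster exists in the new one. This only compares names, so it complements, rather than replaces, checking that the data output is as expected.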