---
title: Migrate data in MySQL to KubeBlocks
description: How to migrate data in MySQL v8.* to KubeBlocks by kbcli migration
keywords: [mysql, migration, kbcli migration, migrate data in MySQL to KubeBlocks]
sidebar_position: 2
sidebar_label: Migration by kbcli
---
     8  
     9  # Migrate data in MySQL to KubeBlocks
    10  
    11  ## Before you start
    12  
    13  ### Enable kbcli migration
    14  
    15  1. Install KubeBlocks: You can install KubeBlocks by [kbcli](./../../installation/install-with-kbcli/install-kubeblocks-with-kbcli.md) or by [Helm](./../../installation/install-with-helm/install-kubeblocks-with-helm.md).
    16  2. [Enable the migration add-on](./../../overview/supported-addons.md).
    17  
    18     ```bash
    19     kbcli addon list
    20  
    21     kbcli addon enable migration
    22     ```
    23  
### Configure the source

Modify the configuration of the source to support CDC.

1. Enable the `log_bin` configuration.
2. Set the `binlog_format` configuration to `row`.
3. Set the `binlog_row_image` configuration to `full`.

:::note

1. Modifying the `log_bin` configuration restarts the database. Make sure the modification is performed during off-peak hours.
2. Modifying the `binlog_format` and `binlog_row_image` configurations does not affect the existing binlog format. Make sure the log timestamp of pulling CDC (usually the time CDC starts) occurs after the modifications are completed.

:::
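
The settings above can be checked, and partly adjusted, with SQL. A minimal sketch, assuming a MySQL v8.* source you can run SQL against; note that `log_bin` itself cannot be changed at runtime and must be set in the server configuration file (e.g. `my.cnf`):

```sql
-- Check the current values of the three relevant settings.
SHOW VARIABLES WHERE Variable_name IN
  ('log_bin', 'binlog_format', 'binlog_row_image');

-- binlog_format and binlog_row_image can be changed at runtime;
-- log_bin can only be enabled in the configuration file
-- (e.g. log-bin=mysql-bin) and takes effect after a restart.
SET GLOBAL binlog_format = 'ROW';
SET GLOBAL binlog_row_image = 'FULL';
```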

### Check the account permission

Make sure both the source and sink accounts have the following permissions.

* The source account
  * REPLICATION SLAVE
  * REPLICATION CLIENT
  * SELECT
* The sink account
  * SELECT
  * INSERT
  * UPDATE
  * DELETE
  * CREATE
  * ALTER
  * DROP

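For reference, the permissions above can be granted as follows. The account name `migration_user` and the host pattern `'%'` are placeholders for this sketch, not values required by KubeBlocks:

```sql
-- On the source: replication privileges are global, so grant them ON *.*.
GRANT REPLICATION SLAVE, REPLICATION CLIENT, SELECT
  ON *.* TO 'migration_user'@'%';

-- On the sink: write and DDL access on the target database.
GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, ALTER, DROP
  ON db_test.* TO 'migration_user'@'%';
```
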
### Initialize the sink

Create a database named `db_test`.

```sql
create database if not exists db_test;
```

### Prepare data sampling

It is recommended to prepare data sampling for verification after the migration to ensure correctness.
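
For example, a simple sample can be a row count plus a checksum per table, recorded on the source before the migration and compared on the sink afterwards. The table names below follow the examples used later in this guide:

```sql
-- Record counts and checksums on the source before migrating,
-- then run the same statements on the sink to compare.
SELECT COUNT(*) FROM table_test_1;
CHECKSUM TABLE table_test_1, table_test_2;
```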

## Migrate data

### Steps

1. Create a migration task.

   ```bash
   kbcli migration create mytask --template apecloud-mysql2mysql \
   --source user:123456@127.0.0.1:3306/db_test \
   --sink user:123456@127.0.0.2:3306/db_test \
   --migration-object '"public.table_test_1","public.table_test_2"'
   ```

   :paperclip: Table 1. Options explanation

   | Option     | Description |
   | :--------- | :---------- |
   | mytask    | The name of the migration task. You can customize it. |
   | `--template` | Specifies the migration template. `--template apecloud-mysql2mysql` means that this migration task uses the template for migrating from MySQL to MySQL created by KubeBlocks. Run `kbcli migration templates` to view all available templates and the supported database information.   |
   | `--source`  | Specifies the source. `user:123456@127.0.0.1:3306/db_test` in the above example follows the format `${user_name}:${password}@${database connection url}/${database}`. For this guide, the connection URL uses the public network address. |
   | `--sink`     | Specifies the destination. `user:123456@127.0.0.2:3306/db_test` in the above example follows the format `${user_name}:${password}@${database connection url}/${database}`. For this guide, the connection URL uses the service address inside the Kubernetes cluster. |
   | `--migration-object`  | Specifies the migration objects. In the above example, the data in "public.table_test_1" and "public.table_test_2", including structure data, stock data, and incremental data generated while the migration task is running, is migrated to the sink.    |

2. (Optional) Specify migration steps with the `--steps` flag.

   The default steps follow the order precheck -> structure initialization -> data initialization -> incremental migration. You can use `--steps` to enable or disable individual steps. For example, the following task performs the precheck, data initialization, and incremental migration, but skips structure initialization.

   ```bash
   kbcli migration create mytask --template apecloud-mysql2mysql \
   --source user:123456@127.0.0.1:3306/db_test \
   --sink user:123456@127.0.0.2:3306/db_test \
   --migration-object '"public.table_test_1","public.table_test_2"' \
   --steps precheck=true,init-struct=false,init-data=true,cdc=true
   ```

3. View the task status.

   ```bash
   # View the migration task list
   kbcli migration list

   # View the details of a specified task
   kbcli migration describe ${migration-task-name}
   ```

   Pay attention to Initialization, CDC, and CDC Metrics.

   * Initialization
     * Precheck: If the status shows `Failed`, it means the initialization precheck does not pass. Troubleshoot the initialization with the commands in [Troubleshooting](#troubleshooting).
     * Init-struct: Structure initialization. Idempotent processing logic is adopted. A failure occurs only when a severe problem occurs, such as failing to connect to a database.
     * Init-data: Data initialization. If there is a large amount of stock data, this step takes a long time to perform and you should pay attention to its status.
   * CDC: Incremental migration. Based on the timestamp recorded by the system before the init-data step, the system starts data migration following eventual consistency, capturing the binlog changes of the source and writing them to the sink. Under normal circumstances, the CDC phase continues as long as the migration link is not actively terminated.
   * CDC Metrics: Incremental migration indicators. Currently, the indicators mainly provide the binlog position of the source library and the corresponding timestamp (note that the timestamp is shown in the local time zone of the Pod container runtime) at which the CDC process completed the "capturing -> writing" process.

     :::note

     The CDC Metrics are updated by the system every 10 minutes, i.e., if data is continuously written into the source, metrics.timestamp may lag up to 10 minutes behind the current time.

     :::

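   To cross-check the CDC Metrics against the source, you can query the source's current binlog position. This is a sanity check suggested by this guide, assuming SQL access to the source:

   ```sql
   -- Run on the source; reports the current binlog file and position.
   -- Compare this with the position and timestamp shown in CDC Metrics.
   SHOW MASTER STATUS;
   ```
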
4. Validate the migration with the prepared data sampling.

### Troubleshooting

If any step above fails, run the command below to troubleshoot the failure.

```bash
# --step: Specify the step. Allowed values: precheck,init-struct,init-data,cdc
kbcli migration logs ${migration-task-name} --step ${step-name}
```

## Switch applications

### Before you start

* Make sure the KubeBlocks migration task runs normally.
* To distinguish the session information and improve data security, it is recommended to create and authorize another account dedicated to data migration.
* For safety concerns, stop business writes and switch the application during off-peak hours.
* Before switching the application, it is recommended to prepare data sampling for verification after switching to ensure correctness.

### Steps

1. Check the migration task status and ensure the task runs normally.
   1. Describe the migration task details and verify that all steps in Initialization are `Complete` and CDC is `Running`.

      ```bash
      kbcli migration describe ${migration-task-name}
      ```

   2. Provided that data is continuously written into the source, observe whether the timestamp keeps advancing with little delay. For example,

      ```bash
      kbcli migration logs ${migration-task-name} --step cdc | grep current_position
      ```

      The results update every 10 seconds.

      ![Timestamp](../../../img/pgsql-migration-timestamp.png)
2. Pause the business and stop new business data from being written into the source.
3. View the migration status again and ensure the migration task runs normally, lasting at least one minute.

   Refer to the operations in step 1 and observe whether the link is normal and the timestamp meets the expectation.
4. Use the sink to restore the business.
5. Validate the switch with the prepared data sampling.

## Clean up the environment

After the migration task is completed, you can terminate the migration task and the migration function.

### Terminate the migration task

Deleting the migration task does not affect the data in the source and sink.

```bash
kbcli migration terminate ${migration-task-name}
```

### Terminate kbcli migration

1. Check whether there are running migration tasks.

   ```bash
   kbcli migration list
   ```

2. Disable the migration add-on.

   ```bash
   kbcli addon disable migration
   ```

3. Delete the Kubernetes CRDs (Custom Resource Definitions) manually.

   ```bash
   kubectl delete crd migrationtasks.datamigration.apecloud.io migrationtemplates.datamigration.apecloud.io serialjobs.common.apecloud.io
   ```