# Proposal: Enhance task manageability

- Author(s): [ehco](https://github.com/Ehco1996)
- Last updated: 2022-01-22

## Background

The main purpose of this change is to address the root cause of problems like [#3771](https://github.com/pingcap/tiflow/issues/3771). These problems arise because the DM task commands do not distinguish between dynamic configuration and static resources, which makes it impossible for users to manage their tasks intuitively. For this reason, we will redesign the state machine of DM tasks and optimize the dmctl interaction interface to provide a better user experience.

### Current State machine of task

![dm old task lifecycle](../media/dm-old-task-lifecycle.png)

### New State machine of task

![dm new task lifecycle](../media/dm-new-task-lifecycle.png)

This is mainly achieved by adding a new command, `dmctl task create`, which creates a task in the stopped state instead of creating and starting a new task with the single command `dmctl start-task`.

## Goals

- Organize and optimize the state machine of tasks for better management
- Unify style of commands in dmctl and optimize dmctl interaction experience
- Add/modify OpenAPI, enhance eco-tools

## Design and Examples

The new syntax of dmctl is `dmctl [resource type] [command] [flags] [arguments]`

where `resource type`, `command`, `flags` and `arguments` are:

- `resource type` specifies the resource you want to control. Resource types are case-insensitive and limited to a fixed set. Currently these are `task`, `source`, `relay`, `ddl-lock` and `member`.

- `command` specifies the operation that you want to perform on one or more resources, for example `create`, `get`, `update`, `delete`, etc.

- `flags` specifies optional flags. For example, you can use the `--master-addr` flag to specify the address and port of the DM-Master server.

- `arguments` specifies the required arguments for this command, such as the name of the `task` and the task config file.

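To make the four-part syntax concrete, here is a minimal sketch of how an invocation could be split into its parts. It is not the dmctl implementation (dmctl is written in Go); `parse_invocation` and `Invocation` are hypothetical names, and the sketch assumes flags are written as `--name=value` or as a bare `--name`, as in the examples in this document.

```python
from dataclasses import dataclass, field

# Resource types listed in this proposal; matching is case-insensitive.
RESOURCE_TYPES = {"task", "source", "relay", "ddl-lock", "member"}


@dataclass
class Invocation:
    resource_type: str
    command: str
    flags: dict = field(default_factory=dict)
    arguments: list = field(default_factory=list)


def parse_invocation(argv):
    """Split the tokens that follow `dmctl` into resource type, command, flags, arguments."""
    if len(argv) < 2:
        raise ValueError("expected: [resource type] [command] [flags] [arguments]")
    resource_type = argv[0].lower()
    if resource_type not in RESOURCE_TYPES:
        raise ValueError(f"unknown resource type: {argv[0]}")
    inv = Invocation(resource_type, argv[1])
    rest = list(argv[2:])
    # Flags come first; the first token that is not a flag starts the arguments.
    while rest and rest[0].startswith("--"):
        name, sep, value = rest.pop(0)[2:].partition("=")
        # Assumption: a flag without "=value" (for example --yes) is a boolean switch.
        inv.flags[name] = value if sep else True
    inv.arguments = rest
    return inv
```

For example, the tokens of `dmctl task stop --source="source1,source2" --timeout="60s" task1` (after the shell removes the quotes) would yield resource type `task`, command `stop`, two flags, and the single argument `task1`.
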
### dmctl commands for Task

| Command | Full Syntax Example                                                                                                                 | Flags                                                                            | Arguments              | Description                                                       |
|---------|-------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------|------------------------|-------------------------------------------------------------------|
| check   | `dmctl task check --error-count=1 --warn-count=1 task1.yaml`                                                                        | --error-count(default: 10), --warn-count(default: 10)                            | config-file            | check the task config yaml file.                                  |
| create  | `dmctl task create task1.yaml`                                                                                                      |                                                                                  | config-file            | create a stopped task with config file.                           |
| update  | `dmctl task update task1 task1.yaml`                                                                                                |                                                                                  | task-name, config-file | update a stopped task with config file.                           |
| delete  | `dmctl task delete --yes --force task1`                                                                                             | --yes(default: false), --force(default: false)                                   | task-name              | delete a task and remove all meta data for this task.             |
| get     | `dmctl task get --output="new_task.yaml" task1`                                                                                     | --output                                                                         | task-name              | show the task config in yaml format, also support output to file. |
| list    | `dmctl task list --stage="Running" --source="source1,source2"`                                                                      | --source, --stage(Running/Stopped/Finished)                                      |                        | list all tasks in current cluster.                                |
| status  | `dmctl task status --source="source1,source2" task1`                                                                                | --source                                                                         | task-name              | show task detail status.                                          |
| start   | `dmctl task start --source="source1,source2" --remove-meta --start-time="2021-01-01 00:00:00" --safe-mode-time-duration="1s" task1` | --source, --remove-meta(default: false), --start-time, --safe-mode-time-duration | task-name              | start a stopped task with many flags.                             |
| stop    | `dmctl task stop --source="source1,source2" --timeout="60s" task1`                                                                  | --source, --timeout(default: "10s")                                              | task-name              | stop a running task with many flags.                              |

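The descriptions in the task table imply a small set of stage transitions: `create` yields a stopped task, `start` and `update` apply to a stopped task, and `stop` applies to a running task. The sketch below models only those transitions. `TaskRegistry` and its methods are hypothetical names, the in-memory dict stands in for the task metadata kept by DM-Master, and allowing `delete` in any stage is an assumption, since the table does not say which stages it accepts.

```python
from enum import Enum


class Stage(Enum):
    STOPPED = "Stopped"
    RUNNING = "Running"
    FINISHED = "Finished"  # accepted by `task list --stage`; how a task reaches it is not modelled here


# command -> (stage required before the command, stage after the command)
TRANSITIONS = {
    "start": (Stage.STOPPED, Stage.RUNNING),
    "stop": (Stage.RUNNING, Stage.STOPPED),
    "update": (Stage.STOPPED, Stage.STOPPED),
}


class TaskRegistry:
    """In-memory stand-in for task metadata (hypothetical, for illustration only)."""

    def __init__(self):
        self.stages = {}

    def create(self, name):
        if name in self.stages:
            raise ValueError(f"task {name} already exists")
        self.stages[name] = Stage.STOPPED  # create never starts the task

    def apply(self, command, name):
        required, result = TRANSITIONS[command]
        current = self.stages[name]
        if current is not required:
            raise ValueError(f"cannot {command} task {name} in stage {current.value}")
        self.stages[name] = result

    def delete(self, name):
        # Assumption: delete is accepted in any stage.
        del self.stages[name]
```

With this model, calling `apply("update", ...)` on a running task raises an error, which mirrors the table's wording that `update` targets a stopped task.
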
### dmctl commands for Source

| Command  | Full Syntax Example                                | Flags                   | Arguments                | Description                                                         |
|----------|----------------------------------------------------|-------------------------|--------------------------|---------------------------------------------------------------------|
| create   | `dmctl source create source1.yaml`                 |                         | config-file              | create source with config file.                                     |
| update   | `dmctl source update source1 source1.yaml`         |                         | source-name, config-file | update a source with config file.                                   |
| delete   | `dmctl source delete --force source1`              | --force(default: false) | source-name              | delete a source.                                                    |
| get      | `dmctl source get --output="source1.yaml" source1` | --output                | source-name              | show the source config in yaml format, also support output to file. |
| list     | `dmctl source list`                                |                         |                          | list all sources in current cluster.                                |
| status   | `dmctl source status source1`                      |                         | source-name              | show source detail status.                                          |
| enable   | `dmctl source enable source1`                      |                         | source-name              | enable a disabled source.                                           |
| disable  | `dmctl source disable source1`                     |                         | source-name              | disable a source and also stop the running subtasks of this source. |
| transfer | `dmctl source transfer source1 worker1`            |                         | source-name, worker-name | transfers a source to a free worker.                                |

### dmctl commands for Relay

| Command | Full Syntax Example                                                                                 | Flags         | Arguments              | Description                                                                  |
|---------|-----------------------------------------------------------------------------------------------------|---------------|------------------------|------------------------------------------------------------------------------|
| start   | `dmctl relay start --worker-name="worker1" source1`                                                 | --worker-name | source-name            | start relay for a source on a worker.                                        |
| stop    | `dmctl relay stop --worker-name="worker1" source1`                                                  | --worker-name | source-name            | stop relay for a source on a worker.                                         |
| purge   | `dmctl relay purge --sub-dir="2ae76434-f79f-11e8-bde2-024ac130008.000001" source1 mysql-bin.000006` | --sub-dir     | source-name, file-name | purges relay log files of the DM-worker according to the specified filename. |

### dmctl commands for DDL-LOCK

| Command | Full Syntax Example           | Flags | Arguments | Description                                  |
|---------|-------------------------------|-------|-----------|----------------------------------------------|
| list    | `dmctl ddl-lock list task1`   |       | task-name | show shard-ddl locks information for a task. |
| unlock  | `dmctl ddl-lock unlock lock1` |       | lock-id   | force unlock un-resolved DDL locks.          |

### dmctl commands for member

| Command             | Full Syntax Example                                   | Flags                                | Arguments          | Description                                       |
|---------------------|-------------------------------------------------------|--------------------------------------|--------------------|---------------------------------------------------|
| list                | `dmctl member list --name="master-1" --role="master"` | --name, --role(master/worker/leader) |                    | show members of current cluster by name and role. |
| offline             | `dmctl member offline master-1`                       |                                      | master/worker-name | offline members of current cluster by name.       |
| evict-leader        | `dmctl member evict-leader master-1`                  |                                      | master-name        | evict leader for master node.                     |
| cancel-evict-leader | `dmctl member cancel-evict-leader master-1`           |                                      | master-name        | cancel evict leader for master node.              |

### Optimized dmctl for interaction mode (optional)

The current interactive mode of dmctl still has room for improvement. I hope to take this opportunity to improve the interaction experience, focusing on letting users quickly select the command they want from the keyboard instead of typing commands from memory.

Here is a simple prototype demo:

![dmctl](../media/new-dmctl.svg)

## Breaking Changes for OpenAPI

- `POST /api/v1/tasks` will be changed from creating and starting tasks to only creating tasks.
- `DELETE /api/v1/tasks/{task-name}` will be changed from stopping the task to stopping the task and deleting its meta data.
- `POST /api/v1/tasks/{task-name}/pause` will be changed to `POST /api/v1/tasks/{task-name}/stop`
- `POST /api/v1/tasks/{task-name}/resume` will be changed to `POST /api/v1/tasks/{task-name}/start`
- `POST /api/v1/sources/{source-name}/pause-relay` will be deleted
- `POST /api/v1/sources/{source-name}/resume-relay` will be deleted

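For a client that needs to adapt to the path changes in the list above, the renames and removals can be expressed as a small lookup. This is only an illustrative sketch: `migrate_post_path` is a hypothetical helper, not part of any DM client, and it covers only the `POST` paths named in the list. The `POST /api/v1/tasks` and `DELETE /api/v1/tasks/{task-name}` endpoints keep their paths and change only in behavior, so they are not rewritten here.

```python
import re

# Old POST path pattern -> new path template, taken from the list of breaking changes.
_RENAMED = [
    (re.compile(r"^/api/v1/tasks/([^/]+)/pause$"), "/api/v1/tasks/{}/stop"),
    (re.compile(r"^/api/v1/tasks/([^/]+)/resume$"), "/api/v1/tasks/{}/start"),
]

# Old POST paths that are deleted without a replacement.
_REMOVED = [
    re.compile(r"^/api/v1/sources/([^/]+)/pause-relay$"),
    re.compile(r"^/api/v1/sources/([^/]+)/resume-relay$"),
]


def migrate_post_path(path):
    """Return the new path for an old POST endpoint, or None if the endpoint is deleted."""
    for pattern, template in _RENAMED:
        match = pattern.match(path)
        if match:
            return template.format(match.group(1))
    if any(pattern.match(path) for pattern in _REMOVED):
        return None
    return path  # the path itself is unchanged
```
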
## Milestones

### Milestone 1 - Implementation of the modified task state machine according to the design documentation

This phase is mainly about implementing the DM-Master/DM-Worker internal logic.

For tasks, the DM-Master's internal scheduling module needs to support creating a subtask in the stopped state, in addition to handling additional parameters such as `--start-time` and `--timeout`. The DM-Worker needs to watch the task stage in etcd and operate the subtask according to the stage.

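The worker-side behavior described above can be pictured as a reconcile loop: for each observed stage, compare it with the local subtask state and act only when they differ. The sketch below is an assumption-laden illustration, not DM-Worker code. `reconcile` and `follow_stage_events` are hypothetical names, a plain list of `(task, stage)` pairs stands in for the etcd watch stream, and only the `Running` and `Stopped` stages are handled.

```python
def reconcile(desired_stage, subtask_running):
    """Decide what to do with a local subtask given the stage observed for its task."""
    if desired_stage == "Running" and not subtask_running:
        return "start"
    if desired_stage == "Stopped" and subtask_running:
        return "stop"
    return "noop"


def follow_stage_events(events):
    """Apply (task, stage) events in order; `events` stands in for an etcd watch stream."""
    running = {}  # task name -> whether the local subtask is currently running
    actions = []  # (task, action) pairs the worker would carry out
    for task, stage in events:
        action = reconcile(stage, running.get(task, False))
        if action == "start":
            running[task] = True
        elif action == "stop":
            running[task] = False
        else:
            # Remember the task even when nothing needs to change.
            running.setdefault(task, False)
        if action != "noop":
            actions.append((task, action))
    return actions, running
```
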
For sources, when the DM-Master receives a `disable source` request from a user, it will **synchronously** notify the DM-Worker and tell it to stop processing the subtask.

Note that the changes to the internal logic at this stage do not affect the existing dmctl.

### Milestone 2 - Defining the new OpenAPI Spec and implementing specific features

This phase defines and implements the new OpenAPI and uses unit and integration tests to verify that the OpenAPI meets expectations in certain scenarios. Note that code changes in this phase will introduce the incompatible changes to the existing OpenAPI listed above.

### Milestone 3 - Implementing commands in dmctl with OpenAPI and optimizing the interaction experience

This phase will focus most of the effort on implementing the commands in dmctl, optimizing the interaction experience, and completing the corresponding unit and integration tests. A lot of time is expected to be spent in this phase on modifying and testing the CI.

### Milestone 4 - Perform corresponding testing and documentation completions

This is the final stage of testing, including system and compatibility testing, as well as completing the documentation.