# Structure

## server

executor-server is in charge of:

- maintaining a heartbeat with the master.
- transferring tasks from the master to the runtime engine.

## runtime

The task runtime maintains some important structures and implements the following interfaces:

- **runtime**, the *singleton* instance that organizes and executes tasks.
  - **Run** is the function that actually drives the execution of the tasks. It is thread-safe.
  - *Queue* is the field that contains the runnable tasks.
- **taskContainer** wraps the operator logic and maintains the task status and the input/output channels.
  - **Poll** triggers one execution step of the task and returns quickly. During the poll action, the task will
    - read its input channels. If the channels are empty, it returns `blocked`. Each read fetches a batch of data.
  - *status* records the current status of the task, which is one of:
    - `runnable` means the task is in the *Queue* and can run immediately.
    - `blocked` means the task is waiting for someone to trigger it, for example when output data is consumed, input data arrives, or an I/O task completes.
    - `awaiting` means the task has been awoken by someone. If **Poll** finds this status when it ends, the task should be put back into the *Queue* and its status reset to `runnable`.
- **Operator** actually implements the user logic. Every operator is contained in a task. Ideally, operators can be composed into an operator tree, as in a typical SQL engine.
  - **prepare** is called when constructing the operator and prepares some resources.
  - **close** releases all resources gracefully.
  - **next(ctx \*taskContext, r \*record, index int)** accepts the incoming data and returns the result data or a blocked status. `index` indicates which input the record comes from.
  - **NextWantedInputIdx** returns an integer indicating which input the task should read in this poll. In normal cases it returns a non-negative value. There are two special values:
    - **DontNeedData** tells the task not to read data from any input. This happens when the operator reads data from gRPC streams, or when the operator was blocked last time and needs to digest the currently blocking record first.
    - **DontRequireData** is returned by operators, such as a union operator, that can read from any input without caring about the read order.
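
The status transitions and the input selection described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the lowercase types `task`, `operator` and `union`, the `wake` method, the integer records, and the numeric values chosen for `dontNeedData` and `dontRequireData` are all assumptions made for this example. The `next` method is also simplified so that it never reports a blocked status, and the sketch assumes that a task being polled keeps the `runnable` status until someone wakes it.

```go
package main

import (
	"fmt"
	"sync"
)

// status mirrors the three task states listed above.
type status int

const (
	runnable status = iota
	blocked
	awaiting
)

// Special input indexes. The numeric values are assumptions for this sketch.
const (
	dontNeedData    = -1
	dontRequireData = -2
)

// operator is a hypothetical, reduced form of the Operator interface.
type operator interface {
	nextWantedInputIdx() int
	next(rec *int, index int) []int
}

// union forwards records from any input, in no particular order.
type union struct{}

func (union) nextWantedInputIdx() int { return dontRequireData }

func (union) next(rec *int, _ int) []int {
	if rec == nil {
		return nil
	}
	return []int{*rec}
}

// task is a simplified stand-in for taskContainer.
type task struct {
	mu     sync.Mutex
	st     status
	op     operator
	inputs []chan int
	output []int
}

// read does a non-blocking read from the wanted input, or from the first
// non-empty input when want is dontRequireData. It returns nil if nothing
// is available.
func (t *task) read(want int) (*int, int) {
	lo, hi := want, want+1
	if want == dontRequireData {
		lo, hi = 0, len(t.inputs)
	}
	for i := lo; i < hi; i++ {
		select {
		case r := <-t.inputs[i]:
			return &r, i
		default:
		}
	}
	return nil, want
}

// poll runs one step of the task and returns the resulting status.
func (t *task) poll() status {
	progressed := false
	want := t.op.nextWantedInputIdx()
	if want == dontNeedData {
		// The operator has its own data source or a pending record.
		t.output = append(t.output, t.op.next(nil, want)...)
		progressed = true
	} else if rec, idx := t.read(want); rec != nil {
		t.output = append(t.output, t.op.next(rec, idx)...)
		progressed = true
	}
	t.mu.Lock()
	defer t.mu.Unlock()
	if progressed || t.st == awaiting {
		t.st = runnable // the task stays in, or goes back to, the queue
	} else {
		t.st = blocked
	}
	return t.st
}

// wake is called by a producer or consumer. It reports whether the caller
// has to put the task back into the queue.
func (t *task) wake() bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	switch t.st {
	case blocked:
		t.st = runnable
		return true
	case runnable:
		t.st = awaiting // poll will notice this when it ends
	}
	return false
}

func main() {
	t := &task{op: union{}, inputs: []chan int{make(chan int, 1), make(chan int, 1)}}
	fmt.Println(t.poll() == blocked) // both inputs are empty
	t.inputs[1] <- 42
	fmt.Println(t.wake()) // blocked -> runnable, caller re-queues the task
	fmt.Println(t.poll() == runnable, t.output)
}
```

Keeping `awaiting` separate from `runnable` lets a wake-up that arrives while the task is being polled be recorded without enqueueing the task a second time. The poll itself decides at the end whether the task goes back into the queue.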