# MPI Plugin User Guide

## Introduction

The **MPI plugin** is designed to make running MPI jobs easier: it lets users write less YAML and takes care of the setup an MPI job needs to run correctly.

## How the MPI Plugin Works

The MPI plugin does three things:

* Opens the ports used by MPI in all containers of the job.
* Force-enables the `ssh` and `svc` plugins.
* Adds the `MPI_HOST` environment variable to the master pod. It contains the domain names of the workers and is meant to be passed to the `--host` parameter of `mpiexec`.
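
As an illustration of the third point, the fragment below sketches roughly what the master container could look like once the plugin has processed the job from the Examples section. It is not output captured from a cluster: the host name format is an assumption based on Volcano's usual `<job>-<task>-<index>` pod naming with the job name as subdomain, and the port entry simply mirrors `--port=22`.

```yaml
# Illustrative sketch only, not captured from a real cluster.
# Assumes job "lm-mpi-job" with a worker task "mpiworker" and replicas: 2.
containers:
  - name: mpimaster
    ports:
      - containerPort: 22          # opened by the plugin (value of --port)
    env:
      - name: MPI_HOST             # added to the master pod only
        # Assumed format: <job>-<task>-<index>.<job>, comma-separated
        value: lm-mpi-job-mpiworker-0.lm-mpi-job,lm-mpi-job-mpiworker-1.lm-mpi-job
```

If the value is a comma-separated host list like this, it can be passed straight to `mpiexec --host ${MPI_HOST}`.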

## Parameters of the MPI Plugin

### Key Points

* If `master` or `worker` is configured, make sure that tasks with those names exist in the job and that each task really plays the role the parameter implies.
* If `port` is configured, make `sshd` listen on the same port.
* If the `gang` plugin is enabled, make sure that `minAvailable` is **equal** to the number of worker `replicas`.
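
The fragment below is a minimal sketch of a job that follows these points with a non-default port. Port `5000` is only an example value, `sshd -p` is the standard OpenSSH option for the listen port, and the master task's template is left out for brevity. Whether the master side needs extra configuration to connect on that port is not covered here.

```yaml
# Fragment only: fields not related to the key points are omitted.
spec:
  minAvailable: 2                  # equal to the worker replicas below (gang enabled)
  plugins:
    mpi: ["--master=mpimaster", "--worker=mpiworker", "--port=5000"]
  tasks:
    - name: mpimaster              # must match --master
      replicas: 1
      # template omitted for brevity
    - name: mpiworker              # must match --worker
      replicas: 2
      template:
        spec:
          containers:
            - name: mpiworker
              image: volcanosh/example-mpi:0.0.3
              command:
                - /bin/sh
                - -c
                # sshd listens on the same port as --port
                - mkdir -p /var/run/sshd; /usr/sbin/sshd -D -p 5000;
          restartPolicy: OnFailure
```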

### Arguments

| ID   | Name   | Type   | Default Value | Required | Description                        | Example            |
| ---- | ------ | ------ | ------------- | -------- | ---------------------------------- | ------------------ |
| 1    | master | string | master        | No       | Name of MPI master                 | --master=mpimaster |
| 2    | worker | string | worker        | No       | Name of MPI worker                 | --worker=mpiworker |
| 3    | port   | string | 22            | No       | The port to open for the container | --port=5000        |
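
Since every argument has a default, the plugin can presumably be enabled with an empty argument list, as is commonly done for other Volcano job plugins (for example `ssh: []`). The sketch below assumes that form. The task names then have to be `master` and `worker`, and `sshd` has to listen on port 22.

```yaml
# Sketch assuming an empty argument list selects the defaults from the table.
spec:
  plugins:
    mpi: []                        # same as --master=master --worker=worker --port=22
  tasks:
    - name: master                 # default value of --master
      replicas: 1
      # template omitted for brevity
    - name: worker                 # default value of --worker
      replicas: 2
      # template omitted for brevity
```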

## Examples

```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: lm-mpi-job
spec:
  minAvailable: 1
  schedulerName: volcano
  plugins:
    mpi: ["--master=mpimaster","--worker=mpiworker","--port=22"]  # register the MPI plugin
  tasks:
    - replicas: 1
      name: mpimaster
      policies:
        - event: TaskCompleted
          action: CompleteJob
      template:
        spec:
          containers:
            - command:
                - /bin/sh
                - -c
                - |
                  mkdir -p /var/run/sshd; /usr/sbin/sshd;
                  mpiexec --allow-run-as-root --host ${MPI_HOST} -np 2 mpi_hello_world;
              image: volcanosh/example-mpi:0.0.3
              name: mpimaster
              workingDir: /home
          restartPolicy: OnFailure
    - replicas: 2
      name: mpiworker
      template:
        spec:
          containers:
            - command:
                - /bin/sh
                - -c
                - |
                  mkdir -p /var/run/sshd; /usr/sbin/sshd -D;
              image: volcanosh/example-mpi:0.0.3
              name: mpiworker
              workingDir: /home
          restartPolicy: OnFailure
```