volcano.sh/volcano@v1.9.0/docs/user-guide/how_to_use_env_plugin.md (about)

# Volcano Job Plugin -- Env User Guidance

## Background
**Env Plugin** is designed for workloads in which a pod needs to know its index within its task, such as [MPI](https://www.open-mpi.org/)
and [TensorFlow](https://tensorflow.google.cn/). The indices are registered as **environment variables** automatically
when the Volcano job is created. For example, consider a TensorFlow job that consists of *1* ps and *2* workers, where each
worker maps to a slice of the raw data. To let each worker find its target slice, the worker reads its index from the
environment variables.

## Key Points
* The index is exposed under two environment variable keys, `VK_TASK_INDEX` and `VC_TASK_INDEX`, which always have the same value.
* The value is the index of the pod within its task: a number that ranges from `0` to `length - 1`, where `length` is the
number of replicas of the task.

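As a minimal sketch of how a worker might turn its index into a data slice, the Python snippet below splits a list into near-equal contiguous parts and picks the part that matches the pod's index. The function name `slice_for`, the sample data, and the hard-coded replica count are hypothetical; the env plugin only supplies the index, and how data is partitioned is up to the application.

```python
import os


def slice_for(items, index, replicas):
    """Return the contiguous share of `items` owned by pod `index` out of `replicas`."""
    if not 0 <= index < replicas:
        raise ValueError(f"index {index} is outside 0..{replicas - 1}")
    # Spread the remainder over the first pods so slice sizes differ by at most one.
    base, extra = divmod(len(items), replicas)
    start = index * base + min(index, extra)
    stop = start + base + (1 if index < extra else 0)
    return items[start:stop]


if __name__ == "__main__":
    # VC_TASK_INDEX is injected by the env plugin; falling back to 0 for local runs is an assumption.
    index = int(os.environ.get("VC_TASK_INDEX", "0"))
    replicas = 2  # assumption: must match `replicas` of the task in the job spec
    print(slice_for(list(range(10)), index, replicas))
```

Because the index always lies in `0` to `length - 1`, every element ends up in exactly one pod's slice as long as each pod uses the same replica count.
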
## Examples
```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: tensorflow-dist-mnist
spec:
  minAvailable: 3
  schedulerName: volcano
  plugins:
    env: []   ## Registers the env plugin; note that no values are needed in the array.
    svc: []
  policies:
    - event: PodEvicted
      action: RestartJob
  queue: default
  tasks:
    - replicas: 1
      name: ps
      template:
        spec:
          containers:
            - command:
                - sh
                - -c
                - |
                  PS_HOST=`cat /etc/volcano/ps.host | sed 's/$/&:2222/g' | sed 's/^/"/;s/$/"/' | tr "\n" ","`;
                  WORKER_HOST=`cat /etc/volcano/worker.host | sed 's/$/&:2222/g' | sed 's/^/"/;s/$/"/' | tr "\n" ","`;
                  export TF_CONFIG={\"cluster\":{\"ps\":[${PS_HOST}],\"worker\":[${WORKER_HOST}]},\"task\":{\"type\":\"ps\",\"index\":${VK_TASK_INDEX}},\"environment\":\"cloud\"};   ## Get the index from the environment variable and configure it in the TF job.
                  python /var/tf_dist_mnist/dist_mnist.py
              image: volcanosh/dist-mnist-tf-example:0.0.1
              name: tensorflow
              ports:
                - containerPort: 2222
                  name: tfjob-port
              resources: {}
          restartPolicy: Never
    - replicas: 2
      name: worker
      policies:
        - event: TaskCompleted
          action: CompleteJob
      template:
        spec:
          containers:
            - command:
                - sh
                - -c
                - |
                  PS_HOST=`cat /etc/volcano/ps.host | sed 's/$/&:2222/g' | sed 's/^/"/;s/$/"/' | tr "\n" ","`;
                  WORKER_HOST=`cat /etc/volcano/worker.host | sed 's/$/&:2222/g' | sed 's/^/"/;s/$/"/' | tr "\n" ","`;
                  export TF_CONFIG={\"cluster\":{\"ps\":[${PS_HOST}],\"worker\":[${WORKER_HOST}]},\"task\":{\"type\":\"worker\",\"index\":${VK_TASK_INDEX}},\"environment\":\"cloud\"};
                  python /var/tf_dist_mnist/dist_mnist.py
              image: volcanosh/dist-mnist-tf-example:0.0.1
              name: tensorflow
              ports:
                - containerPort: 2222
                  name: tfjob-port
              resources: {}
          restartPolicy: Never
```
Note:
* With the env plugin configured in the TensorFlow job above, every pod gets two environment variables, `VK_TASK_INDEX` and
`VC_TASK_INDEX`. The environment variables registered in the `ps` pod are as follows.
```
[root@tensorflow-dist-mnist-ps-0 /] env | grep TASK_INDEX
VK_TASK_INDEX=0
VC_TASK_INDEX=0
```
* The 2 workers are named `tensorflow-dist-mnist-worker-0` and `tensorflow-dist-mnist-worker-1`, and the corresponding
values of their index environment variables are `0` and `1`.
```
[root@tensorflow-dist-mnist-worker-0 /] env | grep TASK_INDEX
VK_TASK_INDEX=0
VC_TASK_INDEX=0
```
```
[root@tensorflow-dist-mnist-worker-1 /] env | grep TASK_INDEX
VK_TASK_INDEX=1
VC_TASK_INDEX=1
```
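
The `TF_CONFIG` value that the job spec assembles with `sed` and `tr` can also be built in Python. The sketch below is one possible equivalent, not part of Volcano or of the example image: `parse_hosts`, `build_tf_config`, and `tf_config_in_pod` are hypothetical helper names, while the port `2222` and the `/etc/volcano/*.host` paths are taken from the job spec above.

```python
import json
import os


def parse_hosts(text, port=2222):
    """Turn a newline-separated host list into 'host:port' strings, skipping blank lines."""
    return [f"{line.strip()}:{port}" for line in text.splitlines() if line.strip()]


def build_tf_config(ps_hosts, worker_hosts, task_type, index):
    """Return the TF_CONFIG JSON string for one pod."""
    return json.dumps({
        "cluster": {"ps": ps_hosts, "worker": worker_hosts},
        "task": {"type": task_type, "index": index},
        "environment": "cloud",
    })


def tf_config_in_pod(task_type):
    """Build TF_CONFIG inside a pod; the file paths are the ones used in the job spec."""
    with open("/etc/volcano/ps.host") as f:
        ps_hosts = parse_hosts(f.read())
    with open("/etc/volcano/worker.host") as f:
        worker_hosts = parse_hosts(f.read())
    # The index comes from the env plugin; this raises KeyError if the plugin is not enabled.
    index = int(os.environ["VC_TASK_INDEX"])
    return build_tf_config(ps_hosts, worker_hosts, task_type, index)
```

Serializing with `json.dumps` avoids the manual quote escaping in the shell version and emits the index as a JSON number, which matches the unquoted `${VK_TASK_INDEX}` in the original command.
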
## Note
* For historical reasons, both `VK_TASK_INDEX` and `VC_TASK_INDEX` exist. `VK_TASK_INDEX` will be **deprecated** in a
future release.
* No values are needed when registering the env plugin in the Volcano job.
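
Since `VK_TASK_INDEX` is slated for deprecation, application code may want to prefer `VC_TASK_INDEX` and fall back to the older key only when necessary. The following is a minimal sketch of that idea; the helper name `task_index` and its optional `environ` parameter are hypothetical and exist only for illustration.

```python
import os


def task_index(environ=None):
    """Return the pod's task index, preferring VC_TASK_INDEX over the legacy VK_TASK_INDEX."""
    env = os.environ if environ is None else environ
    for key in ("VC_TASK_INDEX", "VK_TASK_INDEX"):
        value = env.get(key)
        if value is not None:
            return int(value)
    # Neither key is present, which suggests the env plugin is not enabled for this job.
    raise KeyError("neither VC_TASK_INDEX nor VK_TASK_INDEX is set")
```

Passing a plain dictionary as `environ` makes the helper easy to exercise outside a cluster, for example `task_index({"VC_TASK_INDEX": "1"})`.
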