# NUMA Aware User Guide

## Environment setup

### Pre-Condition

- Enable cpu manager and set policy to "static"
- Enable topology manager and set the policy option you want
<br><br>
1. Set the above conditions by editing the kubelet configuration file

```
cat /var/lib/kubelet/config.yaml
```

```
{...}
cpuManagerPolicy: static
topologyManagerPolicy: best-effort
kubeReserved:
  cpu: 1000m
```

2. Restart kubelet for the changes to take effect <br>
Run the following:

```
systemctl stop kubelet
rm -rf /var/lib/kubelet/cpu_manager_state
systemctl daemon-reload
systemctl start kubelet
```

### Install volcano

#### 1. Install from source

Refer to [Install Guide](../../installer/README.md) to install volcano.

After installation, update the scheduler configuration:

```shell
kubectl edit cm -n volcano-system volcano-scheduler-configmap
```

```yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: volcano-scheduler-configmap
  namespace: volcano-system
data:
  volcano-scheduler.conf: |
    actions: "enqueue, allocate, backfill"
    tiers:
    - plugins:
      - name: priority
      - name: gang
      - name: conformance
    - plugins:
      - name: drf
      - name: predicates
      - name: proportion
      - name: nodeorder
      - name: binpack
      - name: numa-aware # add it to enable numa-aware plugin
        arguments:
          weight: 10
```

#### 2. Install from release package

Same as above: after installation, update the scheduler configuration in the `volcano-scheduler-configmap` configmap.

### Install volcano resource exporter

Please refer to [volcano resource exporter](https://github.com/volcano-sh/resource-exporter/blob/main/README.md)

### Verify environment is ready

Check that the **numatopo** CRD contains data for every node.

```
kubectl get numatopo
NAME     AGE
node-1   4h8m
node-2   4h8m
node-3   4h8m
```

## Usage

### Running volcano Job with topology policy

Volcano supports a task-level topology policy. Set **spec.tasks.topologyPolicy** to specify whether to perform topology scheduling.<br>
The supported options are the same as those of the [topology manager](https://v1-19.docs.kubernetes.io/docs/tasks/administer-cluster/topology-manager/) on kubelet:

````
1. single-numa-node
2. best-effort
3. restricted
4. none
````

For example:

```
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: vj-test
spec:
  schedulerName: volcano
  minAvailable: 1
  tasks:
    - replicas: 1
      name: "test"
      topologyPolicy: best-effort # set the topology policy for task
      template:
        spec:
          containers:
            - image: alpine
              command: ["/bin/sh", "-c", "sleep 1000"]
              imagePullPolicy: IfNotPresent
              name: running
              resources:
                limits:
                  cpu: 20
                  memory: "100Mi"
          restartPolicy: OnFailure
```

### Running TFJob with topology policy

Add the annotation **volcano.sh/numa-topology-policy** to the pod template to specify the topology policy you want.

```
apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  generateName: tfjob
  name: tfjob-test
spec:
  tfReplicaSpecs:
    PS:
      replicas: 1
      restartPolicy: OnFailure
      template:
        metadata:
          annotations:
            sidecar.istio.io/inject: "false"
            volcano.sh/numa-topology-policy: "best-effort" # set the topology policy for pod
        spec:
          containers:
            - name: tensorflow
              image: alpine:latest
              imagePullPolicy: IfNotPresent
              command: ["/bin/sh", "-c", "sleep 1000"]
              resources:
                limits:
                  cpu: 15
                  memory: 2Gi
                requests:
                  cpu: 15
                  memory: 2Gi
    Worker:
      replicas: 1
      restartPolicy: OnFailure
      template:
        metadata:
          annotations:
            sidecar.istio.io/inject: "false"
            volcano.sh/numa-topology-policy: "best-effort"
        spec:
          containers:
            - name: tensorflow
              image: alpine:latest
              imagePullPolicy: IfNotPresent
              command: ["/bin/sh", "-c", "sleep 1000"]
              resources:
                limits:
                  cpu: 15
                  memory: 2Gi
                requests:
                  cpu: 15
                  memory: 2Gi
```

### Practice

|worker node|allocatable cpu on NUMA node 0|allocatable cpu on NUMA node 1|
|-----|----|-----|
| node-1| 12 | 12|
| node-2| 20 | 20|

Submit a volcano job like the following:

```
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: vj-test
spec:
  schedulerName: volcano
  minAvailable: 1
  tasks:
    - replicas: 1
      name: "test"
      topologyPolicy: best-effort # set the topology policy for task
      template:
        spec:
          containers:
            - image: alpine
              command: ["/bin/sh", "-c", "sleep 1000"]
              imagePullPolicy: IfNotPresent
              name: running
              resources:
                limits:
                  cpu: 16
                  memory: "100Mi"
          restartPolicy: OnFailure
```

The pod will be scheduled to node-2, because node-2 can satisfy the pod's cpu request from a single NUMA node, whereas node-1 would have to split the request across two NUMA nodes.
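
The reasoning behind this placement can be checked with a few lines of arithmetic. The sketch below is only an illustration of that reasoning, not the scoring code of the numa-aware plugin: the helper names `numa_nodes_needed` and `pick_worker` are hypothetical, and the rule "prefer the worker that needs the fewest NUMA nodes" is a simplifying assumption.

```python
from typing import Dict, List, Optional


def numa_nodes_needed(cpu_request: int, free_cpus: List[int]) -> Optional[int]:
    """Return the smallest number of NUMA nodes whose free CPUs cover the
    request, or None if the worker cannot satisfy it at all.
    (Hypothetical helper, for illustration only.)"""
    remaining = cpu_request
    # Take the NUMA nodes with the most free CPUs first.
    for used, free in enumerate(sorted(free_cpus, reverse=True), start=1):
        remaining -= free
        if remaining <= 0:
            return used
    return None


def pick_worker(cpu_request: int, workers: Dict[str, List[int]]) -> Optional[str]:
    """Pick the worker that needs the fewest NUMA nodes for the request.
    (Assumed simplification, not the plugin's real scoring.)"""
    candidates = {}
    for name, free_cpus in workers.items():
        needed = numa_nodes_needed(cpu_request, free_cpus)
        if needed is not None:
            candidates[name] = needed
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# Allocatable CPUs per NUMA node, taken from the table above.
workers = {"node-1": [12, 12], "node-2": [20, 20]}

print(numa_nodes_needed(16, workers["node-1"]))  # 2: 16 CPUs do not fit in 12
print(numa_nodes_needed(16, workers["node-2"]))  # 1: 16 CPUs fit in 20
print(pick_worker(16, workers))                  # node-2
```

With a request of 16 CPUs, node-1 needs both of its NUMA nodes while node-2 needs only one, which matches the placement described above.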