github.com/containerd/nerdctl@v1.7.7/docs/gpu.md

# Using GPUs inside containers

| :zap: Requirement | nerdctl >= 0.9 |
|-------------------|----------------|

nerdctl provides docker-compatible NVIDIA GPU support.

## Prerequisites

- NVIDIA Drivers
  - Same requirement as when you use GPUs on Docker. For details, please refer to [the doc by NVIDIA](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#pre-requisites).
- `nvidia-container-cli`
  - containerd relies on this CLI for setting up GPUs inside the container. You can install it via the [`libnvidia-container` package](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/arch-overview.html#libnvidia-container).

## Options for `nerdctl run --gpus`

`nerdctl run --gpus` is compatible with [`docker run --gpus`](https://docs.docker.com/engine/reference/commandline/run/#access-an-nvidia-gpu).

You can specify the number of GPUs to use via the `--gpus` option.
The following example exposes all available GPUs:

```
nerdctl run -it --rm --gpus all nvidia/cuda:9.0-base nvidia-smi
```

You can also pass detailed configuration to the `--gpus` option as a list of key-value pairs. The following options are available:

- `count`: the number of GPUs to use. `all` exposes all available GPUs.
- `device`: IDs of the GPUs to use. Either GPU UUIDs or device indices can be specified.
- `capabilities`: [Driver capabilities](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/user-guide.html#driver-capabilities). If unset, the default capabilities `utility` and `compute` are used.

The following example exposes a specific GPU to the container.
```
nerdctl run -it --rm --gpus '"capabilities=utility,compute",device=GPU-3a23c669-1f69-c64e-cf85-44e9b07e7a2a' nvidia/cuda:9.0-base nvidia-smi
```

## Fields for `nerdctl compose`

`nerdctl compose` also supports GPUs, following the [compose-spec](https://github.com/compose-spec/compose-spec/blob/master/deploy.md#devices).

You can use GPUs on compose when you specify one or more of the following `capabilities` in `services.demo.deploy.resources.reservations.devices`:

- `gpu`
- `nvidia`
- all allowed capabilities for `nerdctl run --gpus`

The available fields are the same as for `nerdctl run --gpus`.

The following exposes all available GPUs to the container:

```
version: "3.8"
services:
  demo:
    image: nvidia/cuda:9.0-base
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
          - capabilities: ["utility"]
            count: all
```

## Troubleshooting

### `nerdctl run --gpus` fails when using the Nvidia gpu-operator

If the Nvidia driver is installed by the [gpu-operator](https://github.com/NVIDIA/gpu-operator), `nerdctl run` fails with the error message `(FATA[0000] exec: "nvidia-container-cli": executable file not found in $PATH)`.

Therefore, `nvidia-container-cli` needs to be added to the `PATH` environment variable.

You can do this by adding the following line to your `$HOME/.profile` or `/etc/profile` (for a system-wide installation):

```
export PATH=$PATH:/usr/local/nvidia/toolkit
```

The driver's shared libraries also need to be registered with the dynamic linker:

```
echo "/run/nvidia/driver/usr/lib/x86_64-linux-gnu" > /etc/ld.so.conf.d/nvidia.conf
ldconfig
```

After that, `nerdctl run --gpus` runs successfully.
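To verify the fix before retrying `nerdctl run --gpus`, you can check whether `nvidia-container-cli` is resolvable. The following is an illustrative sketch (this helper script is not part of nerdctl; it only assumes the gpu-operator toolkit path used above):

```
#!/bin/sh
# Hypothetical pre-flight check: is nvidia-container-cli reachable?
# /usr/local/nvidia/toolkit is the gpu-operator install location from above.
TOOLKIT_DIR=/usr/local/nvidia/toolkit

if command -v nvidia-container-cli >/dev/null 2>&1; then
    # Already on PATH; nerdctl run --gpus should find it.
    echo "nvidia-container-cli found: $(command -v nvidia-container-cli)"
elif [ -x "$TOOLKIT_DIR/nvidia-container-cli" ]; then
    # Installed by the gpu-operator but not yet on PATH.
    echo "nvidia-container-cli is in $TOOLKIT_DIR; add it to PATH:"
    echo "  export PATH=\$PATH:$TOOLKIT_DIR"
else
    echo "nvidia-container-cli not found; install libnvidia-container or the gpu-operator toolkit"
fi
```

If the script reports that the binary is missing from `PATH`, apply the `export PATH=...` and `ldconfig` steps above and re-run it.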