---
layout: docs
page_title: Storage Plugins
description: Learn how Nomad manages dynamic storage plugins.
---

# Storage Plugins

Nomad has built-in support for scheduling compute resources such as
CPU, memory, and networking. Nomad's storage plugin support extends
this to allow scheduling tasks with externally created storage
volumes. Storage plugins are third-party plugins that conform to the
[Container Storage Interface (CSI)][csi-spec] specification.

Storage plugins are created dynamically as Nomad jobs, unlike device
and task driver plugins that need to be installed and configured on
each client. Each dynamic plugin type has its own type-specific job
spec block; currently there is only the `csi_plugin` type. Nomad
tracks which clients have instances of a given plugin, and
communicates with plugins over a Unix domain socket that it creates
inside the plugin's tasks.

## CSI Plugins

Every storage vendor has its own APIs and workflows, and the
industry-standard Container Storage Interface specification unifies
these APIs in a way that's agnostic to both the storage vendor and the
container orchestrator. Each storage provider can build its own CSI
plugin. Jobs can claim storage volumes from AWS Elastic Block Storage
(EBS), GCP persistent disks, Ceph, Portworx, vSphere, and more. The
Nomad scheduler will be aware of volumes created by CSI plugins and
schedule workloads based on the availability of volumes on a given
Nomad client node.

A list of available CSI plugins can be found in the [Kubernetes CSI
documentation][csi-drivers-list]. Spec-compliant plugins should work with Nomad.
However, it is possible a plugin vendor has implemented their plugin to make
Kubernetes API calls, or is otherwise non-compliant with the CSI
specification. In those situations the plugin may not function correctly in a
Nomad environment. You should verify plugin compatibility with Nomad before
deploying in production.

A CSI plugin task requires the [`csi_plugin`][csi_plugin] block:

```hcl
csi_plugin {
  id                     = "csi-hostpath"
  type                   = "monolith"
  mount_dir              = "/csi"
  stage_publish_base_dir = "/local/csi"
}
```

There are three **types** of CSI plugins. **Controller Plugins**
communicate with the storage provider's APIs. For example, for a job
that needs an AWS EBS volume, Nomad will tell the controller plugin
that it needs a volume to be "published" to the client node, and the
controller will make the API calls to AWS to attach the EBS volume to
the right EC2 instance. **Node Plugins** do the work on each client
node, like creating mount points. **Monolith Plugins** are plugins
that perform both the controller and node roles in the same
instance. Not every plugin provider has or needs a controller; that's
specific to the provider implementation.

Plugins mount and unmount volumes but are not in the data path once
the volume is mounted for a task. Plugin tasks are still needed to
unmount volumes when the tasks using them stop, so plugins should be
left running on a Nomad client until all tasks using their volumes are
stopped. The `nomad node drain` command handles this automatically by
stopping plugin tasks last.

Typically, you should run node plugins as Nomad `system` jobs so they
can mount volumes on any client where they are running. Controller
plugins can create and attach volumes anywhere they can communicate
with the storage provider's API, so they can usually be run as
`service` jobs. You should always run more than one controller plugin
allocation for high availability.
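As an illustration, a node plugin can be packaged as a `system` job
that wraps the vendor's plugin container. The sketch below is not any
specific vendor's plugin: the plugin ID, image, and arguments are
placeholders, so substitute the values from your storage provider's
documentation.

```hcl
job "csi-node-plugin" {
  datacenters = ["dc1"]

  # Run one allocation on every eligible client so volumes can be
  # mounted wherever workloads are placed.
  type = "system"

  group "node" {
    task "plugin" {
      driver = "docker"

      config {
        # Placeholder image; use your vendor's CSI node plugin image.
        image = "example.com/example-csi-driver:1.0.0"

        # Most node plugins need privileged access to mount volumes;
        # the Docker driver must be configured to allow privileged
        # containers on the client.
        privileged = true

        # Placeholder arguments. Plugins typically read the socket
        # path from the CSI_ENDPOINT environment variable, which
        # Nomad sets based on mount_dir below.
        args = [
          "--endpoint=${CSI_ENDPOINT}",
          "--nodeid=${node.unique.name}",
        ]
      }

      csi_plugin {
        id        = "example-plugin"
        type      = "node"
        mount_dir = "/csi"
      }

      resources {
        cpu    = 256
        memory = 128
      }
    }
  }
}
```

A matching controller plugin would usually be a `service` job running
the same image with controller-specific arguments, a `count` of two or
more for availability, and `type = "controller"` in its `csi_plugin`
block.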
Nomad exposes a Unix domain socket named `csi.sock` inside each CSI
plugin task, and communicates over the gRPC protocol expected by the
CSI specification. The `mount_dir` field tells Nomad where the plugin
expects to find the socket file. The path to this socket is exposed in
the container as the `CSI_ENDPOINT` environment variable.

Some plugins also require the `stage_publish_base_dir` field, which
tells Nomad where to instruct the plugin to mount volumes for staging
and/or publishing.

### Plugin Lifecycle and State

CSI plugins report their health like other Nomad jobs. If the plugin
crashes or otherwise terminates, Nomad will launch it again using the
same `restart` and `reschedule` logic used for other jobs. If plugins
are unhealthy, Nomad will mark the volumes they manage as
"unschedulable".

Storage plugins don't have any responsibility (or ability) to monitor
the state of tasks that claim their volumes. Nomad sends mount and
publish requests to storage plugins when a task claims a volume, and
unmount and unpublish requests when a task stops.

The dynamic plugin registry persists state to the Nomad client so that
it can restore volume managers for plugin jobs after client restarts
without disrupting storage.

### Volume Lifecycle

The Nomad scheduler decides whether a given client can run an
allocation based on whether it has a node plugin present for the
volume. But before a task can use a volume, the client needs to
"claim" the volume for the allocation. The client makes an RPC call to
the server and waits for a response; the allocation's tasks won't
start until the volume has been claimed and is ready.

If the volume's plugin requires a controller, the server will send an
RPC to any Nomad client where that controller is running. The Nomad
client will forward this request over the controller plugin's gRPC
socket. The controller plugin will make the requested volume available
to the node that needs it.

Once the controller is done (or if there's no controller required),
the server will increment the count of claims on the volume and return
to the client. This count passes through Nomad's state store so that
Nomad has a consistent view of which volumes are available for
scheduling.

The client then makes RPC calls to the node plugin running on that
client, and the node plugin mounts the volume to a staging area in
the Nomad data directory. Nomad will bind-mount this staged directory
into each task that mounts the volume.

This cycle is reversed when a task that claims a volume becomes
terminal. The client frees the volume locally by making "unpublish"
RPCs to the node plugin. The node plugin unmounts the bind-mount from
the allocation and unmounts the volume from the plugin (if it's not in
use by another task). The client will then send an "unpublish" RPC to
the server, which will forward it to the controller plugin (if
any), and decrement the claim count for the volume. At this point the
volume's claim capacity has been freed up for scheduling.
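To make the claim workflow concrete, here is a sketch of a volume
being registered and then claimed by a job. It assumes the
hypothetical plugin ID `example-plugin` and a volume that already
exists in the external storage provider; all IDs and paths are
illustrative.

```hcl
# volume.hcl -- registered with `nomad volume register volume.hcl`
id          = "prod-data"
name        = "prod-data"
type        = "csi"
plugin_id   = "example-plugin"
external_id = "vol-0123456789abcdef0" # illustrative provider-side ID

capability {
  access_mode     = "single-node-writer"
  attachment_mode = "file-system"
}
```

A job then claims the volume at the group level and bind-mounts it
into its tasks with `volume_mount`:

```hcl
job "app" {
  datacenters = ["dc1"]

  group "web" {
    # Claiming the volume here triggers the controller publish (if
    # any) and the node plugin staging described above.
    volume "data" {
      type            = "csi"
      source          = "prod-data"
      access_mode     = "single-node-writer"
      attachment_mode = "file-system"
    }

    task "server" {
      driver = "docker"

      config {
        image = "example.com/app:1.0.0" # placeholder image
      }

      # Nomad bind-mounts the staged volume into the task here.
      volume_mount {
        volume      = "data"
        destination = "/srv/data"
      }
    }
  }
}
```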
[csi-spec]: https://github.com/container-storage-interface/spec
[csi-drivers-list]: https://kubernetes-csi.github.io/docs/drivers.html
[csi_plugin]: /docs/job-specification/csi_plugin