---
layout: docs
page_title: Storage Plugins
sidebar_title: Storage
description: Learn how Nomad manages dynamic storage plugins.
---

# Storage Plugins

Nomad has built-in support for scheduling compute resources such as
CPU, memory, and networking. Nomad's storage plugin support extends
this to allow scheduling tasks with externally created storage
volumes. Storage plugins are third-party plugins that conform to the
[Container Storage Interface (CSI)][csi-spec] specification.

Storage plugins are created dynamically as Nomad jobs, unlike device
and task driver plugins that need to be installed and configured on
each client. Each dynamic plugin type has its own type-specific job
spec block; currently there is only the `csi_plugin` type. Nomad
tracks which clients have instances of a given plugin, and
communicates with plugins over a Unix domain socket that it creates
inside the plugin's tasks.

## CSI Plugins

Every storage vendor has its own APIs and workflows, and the
industry-standard Container Storage Interface specification unifies
these APIs in a way that's agnostic to both the storage vendor and the
container orchestrator. Each storage provider can build its own CSI
plugin. Jobs can claim storage volumes from providers such as AWS
Elastic Block Store (EBS), GCP persistent disks, Ceph, Portworx,
vSphere, etc. The Nomad scheduler will be aware of volumes created by
CSI plugins and schedule workloads based on the availability of
volumes on a given Nomad client node. A list of available CSI plugins
can be found in the [Kubernetes CSI documentation][csi-drivers-list].
Any of these plugins should work with Nomad out of the box.

A CSI plugin task requires the [`csi_plugin`][csi_plugin] block:

```hcl
csi_plugin {
  id        = "csi-hostpath"
  type      = "monolith"
  mount_dir = "/csi"
}
```

There are three **types** of CSI plugins. **Controller Plugins**
communicate with the storage provider's APIs. For example, for a job
that needs an AWS EBS volume, Nomad will tell the controller plugin
that it needs a volume to be "published" to the client node, and the
controller will make the API calls to AWS to attach the EBS volume to
the right EC2 instance. **Node Plugins** do the work on each client
node, like creating mount points. **Monolith Plugins** are plugins
that perform both the controller and node roles in the same
instance. Not every plugin provider has or needs a controller; that's
specific to the provider implementation.

You should almost always run node plugins as Nomad `system` jobs to
ensure volume claims are released when a Nomad client is drained. Use
constraints for the node plugin jobs based on the availability of
volumes. For example, AWS EBS volumes are specific to particular
availability zones within a region. Controller plugins can be run as
`service` jobs.

Nomad exposes a Unix domain socket named `csi.sock` inside each CSI
plugin task, and communicates over the gRPC protocol expected by the
CSI specification. The `mount_dir` field tells Nomad where the plugin
expects to find the socket file.
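
Putting these pieces together, the following is a minimal sketch of a
node plugin run as a `system` job. It assumes the Docker task driver,
the upstream `amazon/aws-ebs-csi-driver` image, and that image's `node`
command and flags; the availability-zone constraint uses Nomad's AWS
fingerprint attribute. Check the plugin's own documentation for the
exact image and invocation it expects.

```hcl
# Sketch only: the image, args, and constraint value below are
# assumptions to verify against the plugin's documentation.
job "plugin-aws-ebs-nodes" {
  datacenters = ["dc1"]

  # Run the node plugin on every eligible client so volume claims can
  # be released when a node is drained.
  type = "system"

  # Constrain node plugins to where their volumes can attach; for EBS
  # that's a particular availability zone.
  constraint {
    attribute = "${attr.platform.aws.placement.availability-zone}"
    value     = "us-east-1a"
  }

  group "nodes" {
    task "plugin" {
      driver = "docker"

      config {
        image = "amazon/aws-ebs-csi-driver:latest"

        args = [
          "node",
          "--endpoint=unix://csi/csi.sock",
          "--logtostderr",
          "--v=5",
        ]

        # Node plugins typically need privileged access to mount volumes.
        privileged = true
      }

      csi_plugin {
        id        = "aws-ebs0"
        type      = "node"
        mount_dir = "/csi" # Nomad creates csi.sock in this directory.
      }

      resources {
        cpu    = 500
        memory = 256
      }
    }
  }
}
```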

### Plugin Lifecycle and State

CSI plugins report their health like other Nomad jobs. If the plugin
crashes or otherwise terminates, Nomad will launch it again using the
same `restart` and `reschedule` logic used for other jobs. If plugins
are unhealthy, Nomad will mark the volumes they manage as
"unschedulable".

Storage plugins don't have any responsibility (or ability) to monitor
the state of tasks that claim their volumes. Nomad sends mount and
publish requests to storage plugins when a task claims a volume, and
unmount and unpublish requests when a task stops.

The dynamic plugin registry persists state to the Nomad client so that
it can restore volume managers for plugin jobs after client restarts
without disrupting storage.

### Volume Lifecycle

The Nomad scheduler decides whether a given client can run an
allocation based on whether it has a node plugin present for the
volume. But before a task can use a volume the client needs to "claim"
the volume for the allocation. The client makes an RPC call to the
server and waits for a response; the allocation's tasks won't start
until the volume has been claimed and is ready.

If the volume's plugin requires a controller, the server will send an
RPC to the Nomad client where that controller is running. The Nomad
client will forward this request over the controller plugin's gRPC
socket. The controller plugin will make the requested volume available
to the node that needs it.

Once the controller is done (or if there's no controller required),
the server will increment the count of claims on the volume and return
to the client. This count passes through Nomad's state store so that
Nomad has a consistent view of which volumes are available for
scheduling.

The client then makes RPC calls to the node plugin running on that
client, and the node plugin mounts the volume to a staging area in
the Nomad data directory. Nomad will bind-mount this staged directory
into each task that mounts the volume.

This cycle is reversed when a task that claims a volume becomes
terminal. The client updates the server frequently about changes to
allocations, including terminal state. When the server receives a
terminal state for a job with volume claims, it creates a volume claim
garbage collection (GC) evaluation to be handled by the core job
scheduler. The GC job will send "detach" RPCs to the node plugin. The
node plugin unmounts the bind-mount from the allocation and unmounts
the volume from the plugin (if it's not in use by another task). The
GC job will then send "unpublish" RPCs to the controller plugin (if
any), and decrement the claim count for the volume. At this point the
volume's claim capacity has been freed up for scheduling.

[csi-spec]: https://github.com/container-storage-interface/spec
[csi-drivers-list]: https://kubernetes-csi.github.io/docs/drivers.html
[csi_plugin]: /docs/job-specification/csi_plugin
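
As a concrete view of the claim workflow above, a job claims a
registered CSI volume with a group-level `volume` block and mounts it
into a task with `volume_mount`. This is a minimal sketch; the volume
ID `my-ebs-volume` and the Redis image are placeholders.

```hcl
job "example" {
  datacenters = ["dc1"]

  group "cache" {
    # Claiming the volume here triggers the controller "publish" and
    # node plugin "mount" steps before the group's tasks start.
    volume "data" {
      type      = "csi"
      source    = "my-ebs-volume" # hypothetical registered volume ID
      read_only = false
    }

    task "redis" {
      driver = "docker"

      config {
        image = "redis:6" # placeholder workload
      }

      # Nomad bind-mounts the staged volume into the task here.
      volume_mount {
        volume      = "data"
        destination = "/srv/data"
        read_only   = false
      }
    }
  }
}
```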