---
layout: docs
page_title: Storage Plugins
description: Learn how Nomad manages dynamic storage plugins.
---

# Storage Plugins

Nomad has built-in support for scheduling compute resources such as
CPU, memory, and networking. Nomad's storage plugin support extends
this to allow scheduling tasks with externally created storage
volumes. Storage plugins are third-party plugins that conform to the
[Container Storage Interface (CSI)][csi-spec] specification.

Storage plugins are created dynamically as Nomad jobs, unlike device
and task driver plugins that need to be installed and configured on
each client. Each dynamic plugin type has its own type-specific job
spec block; currently there is only the `csi_plugin` type. Nomad
tracks which clients have instances of a given plugin, and
communicates with plugins over a Unix domain socket that it creates
inside the plugin's tasks.

## CSI Plugins

Every storage vendor has its own APIs and workflows, and the
industry-standard Container Storage Interface specification unifies
these APIs in a way that's agnostic to both the storage vendor and the
container orchestrator. Each storage provider can build its own CSI
plugin. Jobs can claim storage volumes from AWS Elastic Block Store
(EBS), GCP persistent disks, Ceph, Portworx, vSphere, and so on. The
Nomad scheduler is aware of volumes created by CSI plugins and
schedules workloads based on the availability of volumes on a given
Nomad client node.

A list of available CSI plugins can be found in the [Kubernetes CSI
documentation][csi-drivers-list]. Spec-compliant plugins should work with Nomad.
However, it is possible a plugin vendor has implemented their plugin to make
Kubernetes API calls, or is otherwise non-compliant with the CSI
specification. In those situations the plugin may not function correctly in a
Nomad environment. You should verify plugin compatibility with Nomad before
deploying in production.

A CSI plugin task requires the [`csi_plugin`][csi_plugin] block:

```hcl
csi_plugin {
  id                     = "csi-hostpath"
  type                   = "monolith"
  mount_dir              = "/csi"
  stage_publish_base_dir = "/local/csi"
}
```

There are three **types** of CSI plugins. **Controller Plugins**
communicate with the storage provider's APIs. For example, for a job
that needs an AWS EBS volume, Nomad will tell the controller plugin
that it needs a volume to be "published" to the client node, and the
controller will make the API calls to AWS to attach the EBS volume to
the right EC2 instance. **Node Plugins** do the work on each client
node, like creating mount points. **Monolith Plugins** are plugins
that perform both the controller and node roles in the same
instance. Not every plugin provider has or needs a controller; that's
specific to the provider implementation.

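When a vendor ships separate controller and node plugins, each runs as
its own Nomad job with a matching `type`. As a minimal sketch (the
`id` value is illustrative), the two `csi_plugin` blocks might look
like:

```hcl
# In the controller plugin's task:
csi_plugin {
  id        = "aws-ebs"
  type      = "controller"
  mount_dir = "/csi"
}

# In the node plugin's task; the same id ties the two halves
# together as one logical plugin:
csi_plugin {
  id        = "aws-ebs"
  type      = "node"
  mount_dir = "/csi"
}
```
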
Plugins mount and unmount volumes but are not in the data path once
the volume is mounted for a task. However, plugin tasks are still
needed to unmount volumes when the tasks using them stop, so plugins
should be left running on a Nomad client until all tasks using their
volumes have stopped. The `nomad node drain` command handles this
automatically by stopping plugin tasks last.

Typically, you should run node plugins as Nomad `system` jobs so they
can mount volumes on any client where they are running. Controller
plugins can create and attach volumes anywhere they can communicate
with the storage provider's API, so they can usually be run as
`service` jobs. You should always run more than one controller plugin
allocation for high availability.

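As a sketch of this deployment pattern, a node plugin job might look
like the following. The plugin image and the `privileged` flag are
illustrative and vary by vendor; check the plugin's own documentation
for the exact invocation:

```hcl
job "csi-node-plugin" {
  # Run one allocation on every client that should mount volumes.
  type = "system"

  group "node" {
    task "plugin" {
      driver = "docker"

      config {
        # Hypothetical image; substitute your vendor's CSI plugin.
        image = "example/csi-plugin:latest"

        # Node plugins typically need a privileged container to
        # perform mounts on the host.
        privileged = true
      }

      csi_plugin {
        id        = "example-plugin"
        type      = "node"
        mount_dir = "/csi"
      }
    }
  }
}
```
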
Nomad exposes a Unix domain socket named `csi.sock` inside each CSI
plugin task, and communicates over the gRPC protocol expected by the
CSI specification. The `mount_dir` field tells Nomad where the plugin
expects to find the socket file. The path to this socket is exposed in
the container as the `CSI_ENDPOINT` environment variable.

Some plugins also require the `stage_publish_base_dir` field, which
tells Nomad where to instruct the plugin to mount volumes for staging
and/or publishing.

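Plugin binaries usually take the CSI endpoint as a command-line flag,
so the task can pass the interpolated environment variable through. As
an illustrative sketch (flag names vary by plugin, and some plugins
expect a `unix://` scheme prefix on the socket path):

```hcl
task "plugin" {
  driver = "docker"

  config {
    image = "example/csi-plugin:latest" # hypothetical image

    args = [
      # Nomad sets CSI_ENDPOINT to the socket path under mount_dir.
      "--endpoint=${CSI_ENDPOINT}",
    ]
  }
}
```
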
### Plugin Lifecycle and State

CSI plugins report their health like other Nomad jobs. If the plugin
crashes or otherwise terminates, Nomad will launch it again using the
same `restart` and `reschedule` logic used for other jobs. If plugins
are unhealthy, Nomad will mark the volumes they manage as
"unschedulable".

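For example, a plugin task group can tune that behavior with an
ordinary `restart` block; the values below are illustrative, not
requirements:

```hcl
restart {
  attempts = 3     # retry the plugin task up to three times
  interval = "10m" # within a ten-minute window
  delay    = "15s" # waiting 15 seconds between attempts
  mode     = "fail" # then fail the allocation so it can be rescheduled
}
```
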
Storage plugins don't have any responsibility (or ability) to monitor
the state of tasks that claim their volumes. Nomad sends mount and
publish requests to storage plugins when a task claims a volume, and
unmount and unpublish requests when a task stops.

The dynamic plugin registry persists state to the Nomad client so that
it can restore volume managers for plugin jobs after client restarts
without disrupting storage.

### Volume Lifecycle

The Nomad scheduler decides whether a given client can run an
allocation based on whether it has a node plugin present for the
volume. But before a task can use a volume, the client needs to "claim"
the volume for the allocation. The client makes an RPC call to the
server and waits for a response; the allocation's tasks won't start
until the volume has been claimed and is ready.

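The claim is driven by the group-level `volume` block in the job
specification, which references a volume that has already been created
or registered. A minimal sketch, assuming a registered CSI volume with
the hypothetical ID `my-volume`:

```hcl
group "app" {
  volume "data" {
    type            = "csi"
    source          = "my-volume" # hypothetical registered volume ID
    attachment_mode = "file-system"
    access_mode     = "single-node-writer"
  }
}
```
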
If the volume's plugin requires a controller, the server will send an
RPC to any Nomad client where that controller is running. The Nomad
client will forward this request over the controller plugin's gRPC
socket. The controller plugin will make the requested volume available
to the node that needs it.

Once the controller is done (or if there's no controller required),
the server will increment the count of claims on the volume and return
to the client. This count passes through Nomad's state store so that
Nomad has a consistent view of which volumes are available for
scheduling.

The client then makes RPC calls to the node plugin running on that
client, and the node plugin mounts the volume to a staging area in
the Nomad data directory. Nomad will bind-mount this staged directory
into each task that mounts the volume.

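Inside the task, the bind-mounted volume is wired up with a
`volume_mount` block that references the group-level volume declared
above; the destination path here is illustrative:

```hcl
task "app" {
  driver = "docker"

  volume_mount {
    volume      = "data"         # the group-level volume name
    destination = "/var/lib/app" # path inside the task filesystem
    read_only   = false
  }
}
```
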
This cycle is reversed when a task that claims a volume becomes
terminal. The client frees the volume locally by making "unpublish"
RPCs to the node plugin. The node plugin unmounts the bind-mount from
the allocation and unmounts the volume from the plugin (if it's not in
use by another task). The client will then send an "unpublish" RPC to
the server, which will forward it to the controller plugin (if
any), and decrement the claim count for the volume. At this point the
volume's claim capacity has been freed up for scheduling.

[csi-spec]: https://github.com/container-storage-interface/spec
[csi-drivers-list]: https://kubernetes-csi.github.io/docs/drivers.html
[csi_plugin]: /docs/job-specification/csi_plugin