---
layout: docs
page_title: Task Driver Plugins
sidebar_title: Task Drivers
description: Learn how to author a Nomad task driver plugin.
---

# Task Drivers

Task drivers in Nomad are the runtime components that execute workloads. For
a real world example of a Nomad task driver plugin implementation, see the [LXC
driver source][lxcdriver].

## Authoring Task Driver Plugins

Authoring a task driver (shortened to driver in this documentation) in Nomad
consists of implementing the [DriverPlugin][driverplugin] interface and adding
a main package to launch the plugin. A driver plugin is long-lived and its
lifetime is not bound to the Nomad client. This means that the Nomad client can
be restarted without restarting the driver. Nomad will ensure that one
instance of the driver is running, meaning if the driver crashes or otherwise
terminates, Nomad will launch another instance of it.

Drivers should maintain as little state as possible. State for a task is stored
by the Nomad client on task creation. This enables a pattern where the driver
can maintain an in-memory state of the running tasks, and if necessary the
Nomad client can recover tasks into the driver state.

The [driver plugin skeleton project][skeletonproject] exists to help bootstrap
the development of new driver plugins. It provides most of the boilerplate
necessary for a driver plugin, along with detailed comments.

## Task Driver Plugin API

The [base plugin][baseplugin] must be implemented in addition to the following
functions.

### `TaskConfigSchema() (*hclspec.Spec, error)`

This function returns the schema for the driver configuration of the task. For
more information on `hclspec.Spec` see the HCL section in the [base
plugin][baseplugin] documentation.
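As a rough illustration of `TaskConfigSchema`, the sketch below declares a
schema with two options and returns it from the driver. It assumes the
`hclspec` package from the Nomad repository is available as a module
dependency. The `Driver` type and the `command` and `args` options are
hypothetical names chosen for this example, not part of any required API.

```go
package example

import (
	"github.com/hashicorp/nomad/plugins/shared/hclspec"
)

// taskConfigSpec describes the options a job author may set in the task's
// driver configuration. The "command" and "args" options are hypothetical.
var taskConfigSpec = hclspec.NewObject(map[string]*hclspec.Spec{
	// A required string attribute.
	"command": hclspec.NewAttr("command", "string", true),
	// An optional list of strings.
	"args": hclspec.NewAttr("args", "list(string)", false),
})

// Driver is a placeholder for the plugin's driver type.
type Driver struct{}

// TaskConfigSchema returns the schema for the driver configuration of the
// task.
func (d *Driver) TaskConfigSchema() (*hclspec.Spec, error) {
	return taskConfigSpec, nil
}
```

With a schema like this, a task's `config` block could set `command` and,
optionally, `args`.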

### `Capabilities() (*Capabilities, error)`

Capabilities define what features the driver implements. Example:

```go
Capabilities{
  // Does the driver support sending OS signals to the task? This capability
  // is used by 'nomad alloc signal'.
  SendSignals: true,

  // Does the driver support executing a command within the task execution
  // environment? This capability is used by 'nomad alloc exec'.
  Exec: true,

  // What filesystem isolation is supported by the driver. Options include
  // FSIsolationImage, FSIsolationChroot, and FSIsolationNone. See below for
  // more details.
  FSIsolation: FSIsolationImage,

  // NetIsolationModes lists the set of isolation modes supported by the
  // driver. Options include NetIsolationModeHost, NetIsolationModeGroup,
  // NetIsolationModeTask, and NetIsolationModeNone. See below for more
  // details. The values chosen here are illustrative.
  NetIsolationModes: []NetIsolationMode{
    NetIsolationModeHost,
    NetIsolationModeGroup,
  },

  // MustInitiateNetwork tells Nomad that the driver must create the network
  // namespace and that the CreateNetwork and DestroyNetwork RPCs are
  // implemented.
  MustInitiateNetwork: false,

  // MountConfigs tells Nomad which mounting config options the driver
  // supports. This is used to check whether mounting host volumes or CSI
  // volumes is allowed. Options include MountConfigSupportAll (default), or
  // MountConfigSupportNone.
  MountConfigs: MountConfigSupportAll,
}
```

The file system isolation options are:

- `FSIsolationImage`: The task driver isolates tasks as machine images.
- `FSIsolationChroot`: The task driver isolates tasks with `chroot` or
  `pivot_root`.
- `FSIsolationNone`: The task driver has no filesystem isolation.

The network isolation modes are:

- `NetIsolationModeHost`: The task driver supports disabling network isolation
  and using the host network.
- `NetIsolationModeGroup`: The task driver supports using the task group
  network namespace.
- `NetIsolationModeTask`: The task driver supports isolating the network to
  just the task.
- `NetIsolationModeNone`: There is no network to isolate. This is used for
  tasks that the client manages remotely.

### `Fingerprint(context.Context) (<-chan *Fingerprint, error)`

This function is called by the client when the plugin is started. It allows the
driver to indicate its health to the client. The channel returned should
immediately send an initial `Fingerprint`, then send periodic updates at an
interval that is appropriate for the driver until the context is canceled.

The fingerprint consists of a `HealthState` and `HealthDescription` to inform
the client about its health. Additionally, an `Attributes` field is available
for the driver to add attributes to the client node. The fingerprint
`HealthState` can be one of three states:

- `HealthStateUndetected`: Indicates that the necessary dependencies for the
  driver are not detected on the system. For example, the Java runtime for the
  Java driver.
- `HealthStateUnhealthy`: Indicates that something is wrong with the driver
  runtime. For example, the Docker daemon is stopped for the Docker driver.
- `HealthStateHealthy`: All systems go.

### `StartTask(*TaskConfig) (*TaskHandle, *DriverNetwork, error)`

This function takes a [`TaskConfig`][taskconfig] which includes all of the
configuration needed to launch the task. Additionally, the driver configuration
can be decoded from the `TaskConfig` by calling
`*TaskConfig.DecodeDriverConfig(t interface{})`, passing in a pointer to the
driver-specific configuration struct. The `TaskConfig` includes an `ID` field,
which future operations use to reference the task.

Drivers return a [`*TaskHandle`][taskhandle] which contains
the required information for the driver to reattach to the running task in the
case of plugin crashes or restarts.
Some of this required state
will be specific to the driver implementation, thus a `DriverState` field
exists to allow the driver to encode custom state into the struct. The helper
methods `GetDriverState` and `SetDriverState` exist on the `TaskHandle`,
removing the need for the driver to handle serialization.

A `*DriverNetwork` can optionally be returned to describe the network of the
task if it is modified by the driver. An example of this is in the Docker
driver where tasks can be attached to a specific Docker network.

If an error occurs, it is expected that the driver will clean up any created
resources prior to returning the error.

#### Logging

Nomad handles all rotation and plumbing of task logs. In order for task stdout
and stderr to be received by Nomad, they must be written to the correct
location. Prior to starting the task through the driver, the Nomad client
creates FIFOs for stdout and stderr. These paths are given to the driver in the
`TaskConfig`. The [`fifo` package][fifopackage] can be used to support
cross-platform writing to these paths.

#### TaskHandle Schema Versioning

A `Version` field is available on the `TaskHandle` struct to facilitate
backwards-compatible recovery of tasks. This field is opaque to Nomad, but
allows the driver to recover tasks that were created by an older version of the
plugin.

### `RecoverTask(*TaskHandle) error`

When a driver is restarted it is not expected to persist any internal state to
disk. To support this, Nomad will attempt to recover a task that was
previously started if the driver does not recognize the task ID. During task
recovery, Nomad calls `RecoverTask`, passing the `TaskHandle` that was
returned by the `StartTask` function. If no error is returned, it is
expected that the driver can now operate on the task by referencing the task
ID.
If an error occurs, the Nomad client will mark the task as `lost`.

### `WaitTask(context.Context, id string) (<-chan *ExitResult, error)`

The `WaitTask` function is expected to return a channel that will send an
`*ExitResult` when the task exits, or that is closed when the context is
canceled. It is also expected that calling `WaitTask` on an exited task will
immediately send an `*ExitResult` on the returned channel.

### `StopTask(taskID string, timeout time.Duration, signal string) error`

The `StopTask` function is expected to stop a running task by sending the given
signal to it. If the task does not stop during the given timeout, the driver
must forcefully kill the task.

`StopTask` does not clean up resources of the task or remove it from the
driver's internal state. A call to `WaitTask` after `StopTask` is valid and
should be handled.

### `DestroyTask(taskID string, force bool) error`

The `DestroyTask` function cleans up and removes a task that has terminated. If
`force` is set to true, the driver must destroy the task even if it is still
running. If `WaitTask` is called after `DestroyTask`, it should return
`drivers.ErrTaskNotFound` as no task state should exist after `DestroyTask` is
called.

### `InspectTask(taskID string) (*TaskStatus, error)`

The `InspectTask` function returns detailed status information for the
referenced `taskID`.

### `TaskStats(context.Context, id string, time.Duration) (<-chan *cstructs.TaskResourceUsage, error)`

The `TaskStats` function returns a channel to which the driver should send
stats at the given interval. The driver must keep sending stats until the given
context is canceled or the task terminates.

### `TaskEvents(context.Context) (<-chan *TaskEvent, error)`

The Nomad client publishes events associated with an allocation.
The
`TaskEvents` function allows the driver to publish driver-specific events about
tasks, and the Nomad client will associate them with the correct allocation.

An `Eventer` utility, available in the
`github.com/hashicorp/nomad/drivers/shared/eventer` package, implements an
event loop and publishing mechanism for use in the `TaskEvents` function.

### `SignalTask(taskID string, signal string) error`

> Optional - can be skipped by embedding `drivers.DriverSignalTaskNotSupported`

The `SignalTask` function is used by drivers which support sending OS signals
(`SIGHUP`, `SIGKILL`, `SIGUSR1`, etc.) to the task. It is an optional function
and is listed as a capability in the driver `Capabilities` struct.

### `ExecTask(taskID string, cmd []string, timeout time.Duration) (*ExecTaskResult, error)`

> Optional - can be skipped by embedding `drivers.DriverExecTaskNotSupported`

The `ExecTask` function is used by the Nomad client to execute commands inside
the task execution context. For example, the Docker driver executes commands
inside the running container. `ExecTask` is called for Consul script checks.

[lxcdriver]: https://github.com/hashicorp/nomad-driver-lxc
[driverplugin]: https://github.com/hashicorp/nomad/blob/v0.9.0/plugins/drivers/driver.go#L39-L57
[skeletonproject]: https://github.com/hashicorp/nomad-skeleton-driver-plugin
[baseplugin]: /docs/internals/plugins/base
[taskconfig]: https://godoc.org/github.com/hashicorp/nomad/plugins/drivers#TaskConfig
[taskhandle]: https://godoc.org/github.com/hashicorp/nomad/plugins/drivers#TaskHandle
[fifopackage]: https://godoc.org/github.com/hashicorp/nomad/client/lib/fifo