---
layout: docs
page_title: Task Driver Plugins
description: Learn how to author a Nomad task driver plugin.
---

# Task Drivers

Task drivers in Nomad are the runtime components that execute workloads. For
a real-world example of a Nomad task driver plugin implementation, see the [LXC
driver source][lxcdriver].

## Authoring Task Driver Plugins

Authoring a task driver (shortened to driver in this documentation) in Nomad
consists of implementing the [DriverPlugin][driverplugin] interface and adding
a main package to launch the plugin. A driver plugin is long-lived and its
lifetime is not bound to the Nomad client. This means that the Nomad client can
be restarted without restarting the driver. Nomad will ensure that one
instance of the driver is running, meaning if the driver crashes or otherwise
terminates, Nomad will launch another instance of it.

Drivers should maintain as little state as possible. State for a task is stored
by the Nomad client on task creation. This enables a pattern where the driver
can maintain an in-memory state of the running tasks, and if necessary the
Nomad client can recover tasks into the driver state.

The [driver plugin skeleton project][skeletonproject] exists to help bootstrap
the development of new driver plugins. It provides most of the boilerplate
necessary for a driver plugin, along with detailed comments.

## Task Driver Plugin API

The [base plugin][baseplugin] must be implemented in addition to the following
functions.

### `TaskConfigSchema() (*hclspec.Spec, error)`

This function returns the schema for the driver configuration of the task. For
more information on `hclspec.Spec` see the HCL section in the [base
plugin][baseplugin] documentation.
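To make the role of a task configuration schema more concrete, the following is a minimal sketch of what such a schema expresses: which attributes a task may set, their types, and which are required. It deliberately does not use the `hclspec` package. The `attrSpec` type, the `taskConfigSchema` and `validate` functions, and the `image` and `privileged` attributes are all hypothetical stand-ins invented for illustration. A real driver would build and return an `*hclspec.Spec` as described in the base plugin documentation.

```go
package main

import (
	"fmt"
	"sort"
)

// attrSpec is a hypothetical stand-in for one schema attribute. It is NOT the
// hclspec.Spec type; it only models a type name and a required flag.
type attrSpec struct {
	Type     string // "string" or "bool" in this sketch
	Required bool
}

// taskConfigSchema plays the role of TaskConfigSchema for an imaginary driver
// whose tasks accept a required "image" and an optional "privileged" flag.
func taskConfigSchema() map[string]attrSpec {
	return map[string]attrSpec{
		"image":      {Type: "string", Required: true},
		"privileged": {Type: "bool", Required: false},
	}
}

// validate checks a decoded task config against the schema and returns every
// problem it finds, sorted so the result is stable.
func validate(schema map[string]attrSpec, cfg map[string]interface{}) []string {
	var problems []string
	for name, spec := range schema {
		val, ok := cfg[name]
		if !ok {
			if spec.Required {
				problems = append(problems, "missing required attribute "+name)
			}
			continue
		}
		var typeOK bool
		switch spec.Type {
		case "string":
			_, typeOK = val.(string)
		case "bool":
			_, typeOK = val.(bool)
		}
		if !typeOK {
			problems = append(problems,
				fmt.Sprintf("attribute %s must be a %s", name, spec.Type))
		}
	}
	// Attributes that the schema does not declare are rejected.
	for name := range cfg {
		if _, ok := schema[name]; !ok {
			problems = append(problems, "unknown attribute "+name)
		}
	}
	sort.Strings(problems)
	return problems
}

func main() {
	cfg := map[string]interface{}{"privileged": "yes", "command": "/bin/true"}
	for _, p := range validate(taskConfigSchema(), cfg) {
		fmt.Println(p)
	}
}
```

In this sketch the validation is written by hand only to show what a schema encodes. It is not meant to suggest that a driver validates task configuration itself.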
### `Capabilities() (*Capabilities, error)`

Capabilities define what features the driver implements. Example:

```go
type Capabilities struct {
	// SendSignals marks the driver as being able to send signals
	SendSignals bool

	// Exec marks the driver as being able to execute arbitrary commands
	// such as health checks. Used by the ScriptExecutor interface.
	Exec bool

	// FSIsolation indicates what kind of filesystem isolation the driver supports.
	FSIsolation FSIsolation

	// NetIsolationModes lists the set of isolation modes supported by the driver
	NetIsolationModes []NetIsolationMode

	// MustInitiateNetwork tells Nomad that the driver must create the network
	// namespace and that the CreateNetwork and DestroyNetwork RPCs are implemented.
	MustInitiateNetwork bool

	// MountConfigs tells Nomad which mounting config options the driver supports.
	MountConfigs MountConfigSupport

	// RemoteTasks indicates this driver runs tasks on remote systems
	// instead of locally. The Nomad client can use this information to
	// adjust behavior such as propagating task handles between allocations
	// to avoid downtime when a client is lost.
	RemoteTasks bool
}
```

The file system isolation options are:

- `FSIsolationImage`: The task driver isolates tasks as machine images.
- `FSIsolationChroot`: The task driver isolates tasks with `chroot` or
  `pivot_root`.
- `FSIsolationNone`: The task driver has no filesystem isolation.

The network isolation modes are:

- `NetIsolationModeHost`: The task driver supports disabling network isolation
  and using the host network.
- `NetIsolationModeGroup`: The task driver supports using the task group
  network namespace.
- `NetIsolationModeTask`: The task driver supports isolating the network to
  just the task.
- `NetIsolationModeNone`: There is no network to isolate.
  This is used for tasks that the client manages remotely.

#### Remote Task Drivers

[Remote Task Drivers][rtd] should set `RemoteTasks` to `true`. Remote Task
Drivers are task driver plugins that execute tasks on a different system than
the Nomad client. This means the task's lifecycle is distinct from the Nomad
client's.

For task driver plugin authors there are two important new behaviors when
`RemoteTasks` is `true`:

1. The `TaskHandle` returned by `StartTask` will be propagated to replacement
   allocations if the Nomad client is drained or down. Nomad will call
   `RecoverTask` instead of `StartTask` for remote tasks in replacement
   allocations when a `TaskHandle` has been propagated from the previous
   allocation.
2. If the Nomad client managing a remote task is drained or if the allocation
   was `lost`, the remote task is sent a special `DETACH` kill signal. This
   indicates the plugin should stop managing the remote task, but _not_ stop
   it.

These behaviors are meant to keep remote tasks running even when the Nomad
client managing them is shut down. Like traditional tasks, remote tasks are
stopped when the job is explicitly stopped.

### `Fingerprint(context.Context) (<-chan *Fingerprint, error)`

This function is called by the client when the plugin is started. It allows the
driver to indicate its health to the client. The channel returned should
immediately send an initial Fingerprint, then send periodic updates at an
interval that is appropriate for the driver until the context is canceled.

The fingerprint consists of a `HealthState` and `HealthDescription` to inform
the client about its health. Additionally an `Attributes` field is available
for the driver to add additional attributes to the client node. The fingerprint
`HealthState` can be one of three states.

- `HealthStateUndetected`: Indicates that the necessary dependencies for the
  driver are not detected on the system. Example: the Java runtime is missing
  for the Java driver.
- `HealthStateUnhealthy`: Indicates that something is wrong with the driver
  runtime. Example: the Docker daemon is stopped for the Docker driver.
- `HealthStateHealthy`: The driver and its dependencies are working.

### `StartTask(*TaskConfig) (*TaskHandle, *DriverNetwork, error)`

This function takes a [`TaskConfig`][taskconfig] which includes all of the configuration
needed to launch the task. Additionally the driver configuration can be decoded
from the `TaskConfig` by calling `*TaskConfig.DecodeDriverConfig(t interface{})`
passing in a pointer to the driver specific configuration struct. The
`TaskConfig` includes an `ID` field by which future operations on the task will
be referenced.

Drivers return a [`*TaskHandle`][taskhandle] which contains
the required information for the driver to reattach to the running task in the
case of plugin crashes or restarts. Some of this required state
will be specific to the driver implementation, thus a `DriverState` field
exists to allow the driver to encode custom state into the struct. The helper
methods `GetDriverState` and `SetDriverState` exist on the `TaskHandle`,
removing the need for the driver to handle serialization.

A `*DriverNetwork` can optionally be returned to describe the network of the
task if it is modified by the driver. An example of this is in the Docker
driver where tasks can be attached to a specific Docker network.

If an error occurs, it is expected that the driver will clean up any created
resources prior to returning the error.

#### Logging

Nomad handles all rotation and plumbing of task logs. In order for task stdout
and stderr to be received by Nomad, they must be written to the correct
location.
Prior to starting the task through the driver, the Nomad client
creates FIFOs for stdout and stderr. These paths are given to the driver in the
`TaskConfig`. The [`fifo` package][fifopackage] can be used to support
cross-platform writing to these paths.

#### TaskHandle Schema Versioning

A `Version` field is available on the `TaskHandle` struct to facilitate
backwards-compatible recovery of tasks. This field is opaque to Nomad, but
allows the driver to recover tasks that were created by an older version of the
plugin.

### `RecoverTask(*TaskHandle) error`

When a driver is restarted it is not expected to persist any internal state to
disk. To support this, Nomad will attempt to recover a task that was
previously started if the driver does not recognize the task ID. During task
recovery, Nomad calls `RecoverTask` passing the `TaskHandle` that was
returned by the `StartTask` function. If no error is returned, it is
expected that the driver can now operate on the task by referencing the task
ID. If an error occurs, the Nomad client will mark the task as `lost`.

### `WaitTask(context.Context, id string) (<-chan *ExitResult, error)`

The `WaitTask` function is expected to return a channel that will send an
`*ExitResult` when the task exits or close the channel when the context is
canceled. It is also expected that calling `WaitTask` on an exited task will
immediately send an `*ExitResult` on the returned channel.

### `StopTask(taskID string, timeout time.Duration, signal string) error`

The `StopTask` function is expected to stop a running task by sending the given
signal to it. If the task does not stop during the given timeout, the driver
must forcefully kill the task.

`StopTask` does not clean up resources of the task or remove it from the
driver's internal state.
A call to `WaitTask` after `StopTask` is valid and
should be handled.

### `DestroyTask(taskID string, force bool) error`

The `DestroyTask` function cleans up and removes a task that has terminated. If
`force` is set to true, the driver must destroy the task even if it is still
running. If `WaitTask` is called after `DestroyTask`, it should return
`drivers.ErrTaskNotFound` as no task state should exist after `DestroyTask` is
called.

### `InspectTask(taskID string) (*TaskStatus, error)`

The `InspectTask` function returns detailed status information for the
referenced `taskID`.

### `TaskStats(context.Context, id string, time.Duration) (<-chan *cstructs.TaskResourceUsage, error)`

The `TaskStats` function returns a channel which the driver should send stats
to at the given interval. The driver must send stats at the given interval
until the given context is canceled or the task terminates.

### `TaskEvents(context.Context) (<-chan *TaskEvent, error)`

The Nomad client publishes events associated with an allocation. The
`TaskEvents` function allows the driver to publish driver specific events about
tasks and the Nomad client will associate them with the correct allocation.

An `Eventer` utility, available in the
`github.com/hashicorp/nomad/drivers/shared/eventer` package, implements an
event loop and publishing mechanism for use in the `TaskEvents` function.

### `SignalTask(taskID string, signal string) error`

> Optional - can be skipped by embedding `drivers.DriverSignalTaskNotSupported`

The `SignalTask` function is used by drivers which support sending OS signals
(`SIGHUP`, `SIGKILL`, `SIGUSR1` etc.) to the task. It is an optional function
and is listed as a capability in the driver `Capabilities` struct.

### `ExecTask(taskID string, cmd []string, timeout time.Duration) (*ExecTaskResult, error)`

> Optional - can be skipped by embedding `drivers.DriverExecTaskNotSupported`

The `ExecTask` function is used by the Nomad client to execute commands inside
the task execution context. For example, the Docker driver executes commands
inside the running container. `ExecTask` is called for Consul script checks.

[lxcdriver]: https://github.com/hashicorp/nomad-driver-lxc
[driverplugin]: https://github.com/hashicorp/nomad/blob/v0.9.0/plugins/drivers/driver.go#L39-L57
[skeletonproject]: https://github.com/hashicorp/nomad-skeleton-driver-plugin
[baseplugin]: /docs/concepts/plugins/base
[taskconfig]: https://godoc.org/github.com/hashicorp/nomad/plugins/drivers#TaskConfig
[taskhandle]: https://godoc.org/github.com/hashicorp/nomad/plugins/drivers#TaskHandle
[fifopackage]: https://godoc.org/github.com/hashicorp/nomad/client/lib/fifo
[rtd]: /plugins/drivers/remote