# SwarmKit task model

This document explains some important properties of tasks in SwarmKit. It
covers the types of state that exist for a task, a task's lifecycle, and the
slot model that associates a task with a particular replica or node.

## Task message

Tasks are defined by the `Task` protobuf message. A simplified version of this
message, showing only the fields described in this document, is presented below:

```
// Task specifies the parameters for implementing a Spec. A task is effectively
// immutable and idempotent. Once it is dispatched to a node, it will not be
// dispatched to another node.
message Task {
    string id = 1 [(gogoproto.customname) = "ID"];

    // Spec defines the desired state of the task as specified by the user.
    // The system will honor this and will *never* modify it.
    TaskSpec spec = 3 [(gogoproto.nullable) = false];

    // ServiceID indicates the service under which this task is
    // orchestrated. This should almost always be set.
    string service_id = 4 [(gogoproto.customname) = "ServiceID"];

    // Slot is the service slot number for a task.
    // For example, if a replicated service has replicas = 2, there will be
    // a task with slot = 1, and another with slot = 2.
    uint64 slot = 5;

    // NodeID indicates the node to which the task is assigned. If this
    // field is empty or not set, the task is unassigned.
    string node_id = 6 [(gogoproto.customname) = "NodeID"];

    TaskStatus status = 9 [(gogoproto.nullable) = false];

    // DesiredState is the target state for the task. It is set to
    // TaskStateRunning when a task is first created, and changed to
    // TaskStateShutdown if the manager wants to terminate the task. This
    // field is only written by the manager.
    TaskState desired_state = 10;
}
```

### ID

The `id` field contains a unique ID string for the task.

### Spec

The `spec` field contains the specification for the task. This is a part of the
service spec, which is copied to the task object when the task is created. The
spec is entirely specified by the user through the service spec. It will never
be modified by the system.

### Service ID

`service_id` links a task to the associated service. Tasks link back to the
service that created them, rather than services maintaining a list of all
associated tasks. Generally, a service's tasks are listed by querying for tasks
where `service_id` has a specific value. In some cases, there are tasks that
exist independent of any service. These do not have a value set in
`service_id`.

### Slot

`slot` is used for replicated tasks to identify which slot the task satisfies.
The slot model is discussed in more detail below.

### Node ID

`node_id` assigns the task to a specific node. This is used by both replicated
tasks and global tasks. For global tasks, the node ID is assigned when the task
is first created. For replicated tasks, it is assigned by the scheduler when
the task gets scheduled.

### Status

`status` contains the *observed* state of the task as reported by the agent.
The most important field inside `status` is `state`, which indicates where the
task is in its lifecycle (assigned, running, complete, and so on). The status
information in this field may become out of date if the node that the task is
assigned to is unresponsive. In this case, it's up to the orchestrator to
replace the task with a new one.

### Desired state

Desired state is the state that the orchestrator would like the task to
progress to. This field provides a way for the orchestrator to control when the
task can advance in state.
For example, the orchestrator may create a task with desired
state set to `READY` during a rolling update, and then advance the desired
state to `RUNNING` once the old task it is replacing has stopped. This gives it
a way to get the new task ready to start (for example, pulling the new image)
without actually starting it.

## Properties of tasks

A task is a "one-shot" execution unit. Once a task stops running, it is never
executed again. A new task may be created to replace it.

Task states change in a monotonic progression. Tasks may move to states beyond
the current state, but their states may never move backwards.

## Task history

Once a task stops running, the task object is not necessarily removed from the
distributed data store. Generally, a few historic tasks for each slot of each
service are retained to provide task history. The task reaper will garbage
collect old tasks if the limit of historic tasks for a given slot is reached.
Currently, retention of containers on the workers is tied to the presence of
the old task objects in the distributed data store, but this may change in the
future.

## Task lifecycle

Tasks are created by the orchestrator. They may be created for a new service,
to scale up an existing service, or to replace tasks for an existing service
that are no longer running for whatever reason. The orchestrator creates tasks
in the `NEW` state.

Tasks next run through the allocator, which allocates resources, such as
network attachments, that are necessary for the tasks to run. When the
allocator has processed a task, it moves the task to the `PENDING` state.

The scheduler takes `PENDING` tasks and assigns them to nodes (or verifies
that the requested node has the necessary resources, in the case of global
services' tasks). It changes their state to `ASSIGNED`.
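The manager-side progression described so far (`NEW` → `PENDING` → `ASSIGNED`) and the monotonic-progression rule can be sketched as an ordered enumeration. The state names mirror those in this document, but the Go type and helper below are illustrative only, not SwarmKit's actual definitions:

```go
package main

import "fmt"

// TaskState models the ordered lifecycle states described in this
// document (a subset, for brevity). Because the constants are ordered,
// "never move backwards" reduces to a numeric comparison.
type TaskState int

const (
	StateNew       TaskState = iota // created by the orchestrator
	StatePending                    // resources allocated by the allocator
	StateAssigned                   // placed on a node by the scheduler
	StateAccepted                   // agent-controlled states follow
	StatePreparing
	StateReady
	StateStarting
	StateRunning
)

// advance enforces the monotonic-progression rule: the observed state
// may only move forward, never backwards.
func advance(current, next TaskState) (TaskState, error) {
	if next < current {
		return current, fmt.Errorf("illegal transition: %d -> %d", current, next)
	}
	return next, nil
}

func main() {
	s := StateNew
	for _, next := range []TaskState{StatePending, StateAssigned, StateRunning} {
		var err error
		if s, err = advance(s, next); err != nil {
			panic(err)
		}
	}
	fmt.Println(s == StateRunning) // forward progression is allowed

	_, err := advance(StateRunning, StateAssigned)
	fmt.Println(err != nil) // moving backwards is rejected
}
```

Encoding the states as an ordered enumeration is what makes the monotonicity guarantee cheap to enforce at every handoff between the orchestrator, allocator, scheduler, and agent.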
From this point, control over the state passes to the agent. A task will
progress through the `ACCEPTED`, `PREPARING`, `READY`, and `STARTING` states on
the way to `RUNNING`. If a task exits without an error code, it moves to the
`COMPLETE` state. If it fails, it moves to the `FAILED` state instead.

A task may alternatively end up in the `SHUTDOWN` state if its shutdown was
requested by the orchestrator (by setting desired state to `SHUTDOWN`), the
`REJECTED` state if the agent rejected the task, or the `ORPHANED` state if the
node on which the task is scheduled is down for too long. The orchestrator will
also set the desired state of any task not already in a terminal state to
`REMOVE` when the service associated with the task is removed or scaled down by
the user. When this happens, the agent proceeds to shut the task down. The task
is removed from the store by the task reaper only after the shutdown succeeds.
This ensures that resources associated with the task are not released before
the task has shut down. Tasks that were removed because of service removal or
scale-down are not kept in task history.

The task state can never move backwards; it only increases monotonically.

## Slot model

Replicated tasks have a slot number assigned to them. This allows the system to
track the history of a particular replica over time.

For example, a replicated service with three replicas would lead to three
tasks, with slot numbers 1, 2, and 3. If the task in slot 2 fails, a new task
would be started with `Slot = 2`. Through the slot numbers, the administrator
would be able to see that the new task was a replacement for the previous one
in slot 2 that failed.

The orchestrator for replicated services tries to make sure the correct number
of slots have a running task in them.
For example, if this 3-replica service
only has running tasks with two distinct slot numbers, it will create a third
task with a different slot number. Conversely, if there are 4 slot numbers
represented among the tasks in the running state, it will kill one or more
tasks so that only 3 slot numbers remain among the running tasks.

Slot numbers may be noncontiguous. For example, when a service is scaled down,
the task that's removed may not be the one with the highest slot number.

It's normal for a slot to have multiple tasks. Generally, there will be a
single task with the desired state of `RUNNING`, and also some historic tasks
with a desired state of `SHUTDOWN` that are no longer active in the system.
However, there are also cases where a slot may have multiple tasks with the
desired state of `RUNNING`. This can happen during rolling updates when the
updates are configured to start the new task before stopping the old one. The
orchestrator isn't confused by this situation, because it only cares about
which slots are satisfied by at least one running task, not the detailed makeup
of those slots. The updater takes care of making sure that each slot converges
to having a single running task.

Also, for application availability, multiple tasks can share a single slot
number when a network partition occurs between nodes. If a node is split from
the manager nodes, the tasks that were running on that node will be recreated
on another node. However, the tasks on the split node can still continue
running, so the old tasks and the new ones can share identical slot numbers.
These tasks may be considered "orphaned" by the manager after some time. Once
the partition heals, the orphaned tasks will be killed.

Global tasks do not have slot numbers, but the concept is similar. Each node in
the system should have a single running task associated with it.
If this is not
the case, the orchestrator and updater work together to create or destroy tasks
as necessary.
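The slot reconciliation described in this section can be sketched as a pure function: given the desired replica count and the set of slot numbers that currently have at least one running task, decide which new slots to create and which surplus slots to shut down. The function name and types below are hypothetical, not SwarmKit's orchestrator API:

```go
package main

import (
	"fmt"
	"sort"
)

// reconcileSlots returns new slot numbers to create and surplus slot
// numbers to shut down, so that exactly `replicas` distinct slots end
// up with a running task. Illustrative sketch only.
func reconcileSlots(replicas int, running map[uint64]bool) (create, remove []uint64) {
	// Collect the running slots in a stable order.
	slots := make([]uint64, 0, len(running))
	for s := range running {
		slots = append(slots, s)
	}
	sort.Slice(slots, func(i, j int) bool { return slots[i] < slots[j] })

	if len(slots) > replicas {
		// Too many distinct slots: shut down the surplus.
		return nil, slots[replicas:]
	}

	// Too few: pick fresh slot numbers not already in use. Slot numbers
	// need not be contiguous, so we simply probe upward from 1.
	next := uint64(1)
	for len(slots)+len(create) < replicas {
		if !running[next] {
			create = append(create, next)
		}
		next++
	}
	return create, nil
}

func main() {
	// A 3-replica service with only slots 1 and 3 running: one new slot.
	create, remove := reconcileSlots(3, map[uint64]bool{1: true, 3: true})
	fmt.Println(create, remove) // [2] []

	// Four distinct slots running for 3 replicas: one slot is shut down.
	create, remove = reconcileSlots(3, map[uint64]bool{1: true, 2: true, 3: true, 4: true})
	fmt.Println(create, remove) // [] [4]
}
```

Note that, as the document says, the real orchestrator only counts slots satisfied by at least one running task; converging each slot to a single task is the updater's job, which this sketch deliberately omits.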