# SwarmKit task model

This document explains some important properties of tasks in SwarmKit. It
covers the types of state that exist for a task, a task's lifecycle, and the
slot model that associates a task with a particular replica or node.

## Task message

Tasks are defined by the `Task` protobuf message. A simplified version of this
message, showing only the fields described in this document, is presented below:

```
// Task specifies the parameters for implementing a Spec. A task is effectively
// immutable and idempotent. Once it is dispatched to a node, it will not be
// dispatched to another node.
message Task {
        string id = 1 [(gogoproto.customname) = "ID"];

        // Spec defines the desired state of the task as specified by the user.
        // The system will honor this and will *never* modify it.
        TaskSpec spec = 3 [(gogoproto.nullable) = false];

        // ServiceID indicates the service under which this task is
        // orchestrated. This should almost always be set.
        string service_id = 4 [(gogoproto.customname) = "ServiceID"];

        // Slot is the service slot number for a task.
        // For example, if a replicated service has replicas = 2, there will be
        // a task with slot = 1, and another with slot = 2.
        uint64 slot = 5;

        // NodeID indicates the node to which the task is assigned. If this
        // field is empty or not set, the task is unassigned.
        string node_id = 6 [(gogoproto.customname) = "NodeID"];

        TaskStatus status = 9 [(gogoproto.nullable) = false];

        // DesiredState is the target state for the task. It is set to
        // TaskStateRunning when a task is first created, and changed to
        // TaskStateShutdown if the manager wants to terminate the task. This
        // field is only written by the manager.
        TaskState desired_state = 10;
}
```

### ID

The `id` field contains a unique ID string for the task.

### Spec

The `spec` field contains the specification for the task. This is a part of the
service spec, which is copied to the task object when the task is created. The
spec is entirely specified by the user through the service spec. It will never
be modified by the system.

### Service ID

`service_id` links a task to the associated service. Tasks link back to the
service that created them, rather than services maintaining a list of all
associated tasks. Generally, a service's tasks are listed by querying the task
store for tasks where `service_id` has a specific value. In some cases, there
are tasks that exist independently of any service. These do not have a value
set in `service_id`.

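For illustration, a query of this shape can be written with the `FindTasks`
and `ByServiceID` helpers from SwarmKit's `manager/state/store` package. The
surrounding function here is hypothetical:

```
import (
	"github.com/docker/swarmkit/api"
	"github.com/docker/swarmkit/manager/state/store"
)

// tasksForService lists every task created for a given service by querying
// on service_id, rather than walking a task list stored on the service.
func tasksForService(s *store.MemoryStore, serviceID string) ([]*api.Task, error) {
	var (
		tasks []*api.Task
		err   error
	)
	s.View(func(tx store.ReadTx) {
		// ByServiceID matches tasks whose service_id equals serviceID.
		tasks, err = store.FindTasks(tx, store.ByServiceID(serviceID))
	})
	return tasks, err
}
```
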
### Slot

`slot` is used for replicated tasks to identify which slot the task satisfies.
The slot model is discussed in more detail below.

### Node ID

`node_id` assigns the task to a specific node. This is used by both replicated
tasks and global tasks. For global tasks, the node ID is assigned when the task
is first created. For replicated tasks, it is assigned by the scheduler when
the task gets scheduled.

### Status

`status` contains the *observed* state of the task, as reported by the agent.
The most important field inside `status` is `state`, which indicates where the
task is in its lifecycle (assigned, running, complete, and so on). The status
information in this field may become out of date if the node that the task is
assigned to is unresponsive. In this case, it's up to the orchestrator to
replace the task with a new one.

### Desired state

Desired state is the state that the orchestrator would like the task to progress
to. This field provides a way for the orchestrator to control when the task can
advance in state. For example, the orchestrator may create a task with desired
state set to `READY` during a rolling update, and then advance the desired state
to `RUNNING` once the old task it is replacing has stopped. This gives it a way
to get the new task ready to start (for example, pulling the new image), without
actually starting it.

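The sketch below shows how an updater-style component might advance a
replacement task from `READY` to `RUNNING` once the task it replaces has
stopped. It is illustrative only (the function name is hypothetical), and
assumes the same SwarmKit imports as the earlier sketch plus the standard
`errors` package; any state past `RUNNING` in the `TaskState` ordering is
terminal:

```
// advanceReplacement moves a READY replacement task to a desired state of
// RUNNING, but only after the old task has been observed to stop.
func advanceReplacement(s *store.MemoryStore, oldID, newID string) error {
	return s.Update(func(tx store.Tx) error {
		old, replacement := store.GetTask(tx, oldID), store.GetTask(tx, newID)
		if old == nil || replacement == nil {
			return errors.New("task not found")
		}
		// The agent reports observed state through Status.State. Any state
		// beyond RUNNING is terminal, meaning the old task has stopped.
		if old.Status.State <= api.TaskStateRunning {
			return nil // old task still active; try again later
		}
		replacement.DesiredState = api.TaskStateRunning
		return store.UpdateTask(tx, replacement)
	})
}
```
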
## Properties of tasks

A task is a "one-shot" execution unit. Once a task stops running, it is never
executed again. A new task may be created to replace it.

Task states change in a monotonic progression. Tasks may move to states
beyond the current state, but their states may never move backwards.

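Because the `TaskState` enum is ordered, monotonicity amounts to a numeric
comparison. A minimal sketch of the guard this implies (illustrative, not
SwarmKit's actual code):

```
import (
	"fmt"

	"github.com/docker/swarmkit/api"
)

// validateStateChange rejects any update that would move a task's state
// backwards: "later in the lifecycle" means "numerically greater".
func validateStateChange(current, proposed api.TaskState) error {
	if proposed < current {
		return fmt.Errorf("illegal task state change: %v -> %v", current, proposed)
	}
	return nil
}
```
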
## Task history

Once a task stops running, the task object is not necessarily removed from the
distributed data store. Generally, a few historic tasks for each slot of each
service are retained to provide task history. The task reaper garbage-collects
old tasks when the limit of historic tasks for a given slot is exceeded.
Currently, retention of containers on the workers is tied to the presence of
the old task objects in the distributed data store, but this may change in the
future.

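A sketch of this retention rule, assuming the caller passes one slot's tasks
ordered oldest-first and the same SwarmKit imports as earlier (the real reaper
lives in `manager/orchestrator/taskreaper`; names here are illustrative):

```
// reapSlot deletes the oldest dead tasks in a slot once the history limit is
// exceeded. A task whose desired state is past RUNNING will never run again,
// so it exists only as history.
func reapSlot(tx store.Tx, tasks []*api.Task, retention int) error {
	var dead []*api.Task
	for _, t := range tasks {
		if t.DesiredState > api.TaskStateRunning {
			dead = append(dead, t)
		}
	}
	for len(dead) > retention {
		if err := store.DeleteTask(tx, dead[0].ID); err != nil {
			return err
		}
		dead = dead[1:]
	}
	return nil
}
```
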
## Task lifecycle

Tasks are created by the orchestrator. They may be created for a new service,
to scale up an existing service, or to replace tasks for an existing service
that are no longer running for whatever reason. The orchestrator creates tasks
in the `NEW` state.

Tasks next run through the allocator, which allocates resources, such as
network attachments, that are necessary for the tasks to run. When the
allocator has processed a task, it moves the task to the `PENDING` state.

The scheduler takes `PENDING` tasks and assigns them to nodes (or verifies
that the requested node has the necessary resources, in the case of global
services' tasks). It changes their state to `ASSIGNED`.

From this point, control over the state passes to the agent. A task will
progress through the `ACCEPTED`, `PREPARING`, `READY`, and `STARTING` states on
the way to `RUNNING`. If a task exits successfully (without an error code), it
moves to the `COMPLETE` state. If it fails, it moves to the `FAILED` state
instead.

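Summarized as code, the happy path is an ordered sequence. The constant names
below follow the `TaskState` values in SwarmKit's `api` package:

```
// The happy-path lifecycle, in order. The orchestrator performs the first
// transition, the allocator and scheduler the next two, and the agent owns
// everything from ASSIGNED onward.
var happyPath = []api.TaskState{
	api.TaskStateNew,       // created by the orchestrator
	api.TaskStatePending,   // resources (e.g. network attachments) allocated
	api.TaskStateAssigned,  // placed on a node by the scheduler
	api.TaskStateAccepted,  // accepted by the agent
	api.TaskStatePreparing, // preparing to run, e.g. pulling the image
	api.TaskStateReady,     // prepared and waiting to start
	api.TaskStateStarting,  // container being started
	api.TaskStateRunning,   // running; terminates in COMPLETE or FAILED
}
```
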
A task may alternatively end up in the `SHUTDOWN` state if its shutdown was
requested by the orchestrator (by setting desired state to `SHUTDOWN`), the
`REJECTED` state if the agent rejected the task, or the `ORPHANED` state if
the node on which the task is scheduled is down for too long. When the service
associated with a task is removed or scaled down by the user, the orchestrator
sets the desired state of any task not already in a terminal state to
`REMOVE`, and the agent proceeds to shut the task down. The task is removed
from the store by the task reaper only after the shutdown succeeds; this
ensures that resources associated with the task are not released before the
task has shut down. Tasks that were removed because of service removal or
scale down are not kept around in task history.

The task state can never move backwards - it only increases monotonically.

## Slot model

Replicated tasks have a slot number assigned to them. This allows the system to
track the history of a particular replica over time.

For example, a replicated service with three replicas would lead to three tasks,
with slot numbers 1, 2, and 3. If the task in slot 2 fails, a new task would be
started with `Slot = 2`. Through the slot numbers, the administrator would be
able to see that the new task was a replacement for the previous one in slot 2
that failed.

The orchestrator for replicated services tries to make sure the correct number
of slots have a running task in them. For example, if this 3-replica service
only has running tasks with two distinct slot numbers, it will create a third
task with a different slot number. Conversely, if there are four slot numbers
represented among the tasks in the running state, it will kill one or more
tasks so that only three slot numbers remain among the running tasks.

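A sketch of this reconciliation, grouping one service's tasks by slot and
deciding how many slots to create or which to shut down (illustrative names
and the same `api` import as earlier; the real logic lives in
`manager/orchestrator/replicated`):

```
// reconcile counts the slots that still have a runnable task, then reports
// how many new slots to create, or which surplus slots to shut down, so that
// exactly `replicas` slots end up running.
func reconcile(slots map[uint64][]*api.Task, replicas uint64) (create uint64, shutdown []uint64) {
	var runnable []uint64
	for slot, tasks := range slots {
		for _, t := range tasks {
			// Runnable: still meant to run, and not yet past RUNNING.
			if t.DesiredState <= api.TaskStateRunning &&
				t.Status.State <= api.TaskStateRunning {
				runnable = append(runnable, slot)
				break
			}
		}
	}
	switch n := uint64(len(runnable)); {
	case n < replicas:
		create = replicas - n
	case n > replicas:
		shutdown = runnable[replicas:]
	}
	return create, shutdown
}
```
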
Slot numbers may be noncontiguous. For example, when a service is scaled down,
the task that's removed may not be the one with the highest slot number.

It's normal for a slot to have multiple tasks. Generally, there will be a single
task with the desired state of `RUNNING`, and also some historic tasks with a
desired state of `SHUTDOWN` that are no longer active in the system. However,
there are also cases where a slot may have multiple tasks with the desired state
of `RUNNING`. This can happen during rolling updates when the updates are
configured to start the new task before stopping the old one. The orchestrator
isn't confused by this situation, because it only cares about which slots are
satisfied by at least one running task, not the detailed makeup of those slots.
The updater takes care of making sure that each slot converges to having a
single running task.

Also, in the interest of application availability, multiple tasks may share a
single slot number when a network partition separates a node from the managers.
The tasks that were running on the partitioned node are recreated on other
nodes, but the tasks on the partitioned node itself may continue running, so
the old tasks and the new ones end up with identical slot numbers. After some
time, the manager may consider the unreachable tasks "orphaned"; once the
partition heals, those tasks are killed.

Global tasks do not have slot numbers, but the concept is similar. Each node in
the system should have a single running task associated with it. If this is not
the case, the orchestrator and updater work together to create or destroy tasks
as necessary.