# Job API

[@k82cn](http://github.com/k82cn); Dec 27, 2018

## Motivation

`Job` is the fundamental object of high-performance workloads; this document provides the definition of `Job` in Volcano.

## Scope

### In Scope

* Define the API of Job
* Define the behaviour of Job
* Clarify the interaction with other features

### Out of Scope

* Volumes: volume management is out of scope for job management related features
* Network: the addressing between tasks will be described in another project

## Function Detail

The definition of `Job` follows the Kubernetes style, e.g. `Status` and `Spec`; the following sections describe only
the major functions of `Job`. Refer to the [Appendix](#appendix) section for the whole definition of `Job`.

### Multiple Pod Template

As most high-performance workloads include different types of tasks, e.g. TensorFlow (ps/worker), Spark (driver/executor),
`Job` introduces `tasks` to support multiple pod templates, defined as follows. The `Policies` field is described in the
[Error Handling](#error-handling) section.

```go
// JobSpec describes how the job execution will look like and when it will actually run
type JobSpec struct {
    ...

    // Tasks specifies the task specification of Job
    // +optional
    Tasks []TaskSpec `json:"tasks,omitempty" protobuf:"bytes,5,opt,name=tasks"`
}

// TaskSpec specifies the task specification of Job
type TaskSpec struct {
    // Name specifies the name of task
    Name string `json:"name,omitempty" protobuf:"bytes,1,opt,name=name"`

    // Replicas specifies the replicas of this TaskSpec in Job
    Replicas int32 `json:"replicas,omitempty" protobuf:"bytes,2,opt,name=replicas"`

    // Specifies the pod that will be created for this TaskSpec
    // when executing a Job
    Template v1.PodTemplateSpec `json:"template,omitempty" protobuf:"bytes,3,opt,name=template"`

    // Specifies the lifecycle of tasks
    // +optional
    Policies []LifecyclePolicy `json:"policies,omitempty" protobuf:"bytes,4,opt,name=policies"`
}
```

`JobController` creates Pods based on the templates and replicas in `spec.tasks`;
the controller `OwnerReference` of each Pod is set to the `Job`. The following is
an example YAML with multiple pod templates.

```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: tf-job
spec:
  tasks:
  - name: "ps"
    replicas: 2
    template:
      spec:
        containers:
        - name: ps
          image: ps-img
  - name: "worker"
    replicas: 5
    template:
      spec:
        containers:
        - name: worker
          image: worker-img
```

### Job Input/Output

Most high-performance workloads handle data that is considered the input/output of a Job.
The following types are introduced for a Job's input/output.

```go
type VolumeSpec struct {
    MountPath string `json:"mountPath" protobuf:"bytes,1,opt,name=mountPath"`

    // defined the PVC name
    // +optional
    VolumeClaimName string `json:"volumeClaimName,omitempty" protobuf:"bytes,2,opt,name=volumeClaimName"`

    // VolumeClaim defines the PVC used by the VolumeSpec.
    // +optional
    VolumeClaim *v1.PersistentVolumeClaimSpec `json:"volumeClaim,omitempty" protobuf:"bytes,3,opt,name=volumeClaim"`
}

type JobSpec struct {
    ...

    // The volumes mount on Job
    // +optional
    Volumes []VolumeSpec `json:"volumes,omitempty" protobuf:"bytes,1,opt,name=volumes"`
}
```

The `Volumes` of a Job can be `nil`, which means that users manage data themselves. If `VolumeSpec.volumeClaim` is `nil` and `VolumeSpec.volumeClaimName` is empty or does not name an existing PersistentVolumeClaim, an `emptyDir` volume is used for each Task/Pod.

### Conditions and Phases

The following phases are introduced to give a simple, high-level summary of where the Job is in its lifecycle; the conditions array
and the reason and message fields contain more detail about the job's status.

```go
type JobPhase string

const (
    // Pending is the phase that job is pending in the queue, waiting for scheduling decision
    Pending JobPhase = "Pending"
    // Aborting is the phase that job is aborted, waiting for releasing pods
    Aborting JobPhase = "Aborting"
    // Aborted is the phase that job is aborted by user or error handling
    Aborted JobPhase = "Aborted"
    // Running is the phase that minimal available tasks of Job are running
    Running JobPhase = "Running"
    // Restarting is the phase that the Job is restarted, waiting for pod releasing and recreating
    Restarting JobPhase = "Restarting"
    // Completed is the phase that all tasks of Job are completed successfully
    Completed JobPhase = "Completed"
    // Terminating is the phase that the Job is terminated, waiting for releasing pods
    Terminating JobPhase = "Terminating"
    // Terminated is the phase that the job is finished unexpectedly, e.g. events
    Terminated JobPhase = "Terminated"
)

// JobState contains details for the current state of the job.
type JobState struct {
    // The phase of Job.
    // +optional
    Phase JobPhase `json:"phase,omitempty" protobuf:"bytes,1,opt,name=phase"`

    // Unique, one-word, CamelCase reason for the phase's last transition.
    // +optional
    Reason string `json:"reason,omitempty" protobuf:"bytes,2,opt,name=reason"`

    // Human-readable message indicating details about last transition.
    // +optional
    Message string `json:"message,omitempty" protobuf:"bytes,3,opt,name=message"`
}

// JobStatus represents the current state of a Job
type JobStatus struct {
    // Current state of Job.
    State JobState `json:"state,omitempty" protobuf:"bytes,1,opt,name=state"`

    ...
}
```

The following table shows the available transitions between phases. A phase cannot move to the target
phase if the cell is empty.

| From \ To     | Pending | Aborted | Running | Completed | Terminated |
| ------------- | ------- | ------- | ------- | --------- | ---------- |
| Pending       | *       | *       | *       |           |            |
| Aborted       | *       | *       |         |           |            |
| Running       |         | *       | *       | *         | *          |
| Completed     |         |         |         | *         |            |
| Terminated    |         |         |         |           | *          |

`Restarting`, `Aborting` and `Terminating` are temporary states that avoid race conditions, e.g. a `TerminateJobAction`
produces several `PodEvictedEvent`s, which should not be handled again.

### Error Handling

After a Job is created in the system, several events related to the Job will occur, e.g. Pod succeeded, Pod failed;
some of these events are critical to the Job, e.g. a Pod of an MPIJob failed. `LifecyclePolicy` is therefore introduced to handle
different events based on the user's configuration.

```go
// Event is the type of Event related to the Job
type Event string

const (
    // AllEvents means all events
    AllEvents Event = "*"
    // PodFailedEvent is triggered if a Pod failed
    PodFailedEvent Event = "PodFailed"
    // PodEvictedEvent is triggered if a Pod was deleted
    PodEvictedEvent Event = "PodEvicted"
    // JobUnknownEvent covers the events that can lead to job 'Unknown':
    // 1. Task Unschedulable: triggered when some pods can't be scheduled
    //    while others are already running, in the gang-scheduling case.
    JobUnknownEvent Event = "Unknown"

    // OutOfSyncEvent is triggered if a Pod/Job was updated
    OutOfSyncEvent Event = "OutOfSync"
    // CommandIssuedEvent is triggered if a command is raised by user
    CommandIssuedEvent Event = "CommandIssued"
    // TaskCompletedEvent is triggered if 'Replicas' pods of one task have succeeded
    TaskCompletedEvent Event = "TaskCompleted"
)

// Action is the type of event handling
type Action string

const (
    // AbortJobAction if this action is set, the whole job will be aborted:
    // all Pods of the Job will be evicted, and no Pod will be recreated
    AbortJobAction Action = "AbortJob"
    // RestartJobAction if this action is set, the whole job will be restarted
    RestartJobAction Action = "RestartJob"
    // TerminateJobAction if this action is set, the whole job will be terminated
    // and can not be resumed: all Pods of the Job will be evicted, and no Pod will be recreated.
    TerminateJobAction Action = "TerminateJob"
    // CompleteJobAction if this action is set, the unfinished pods will be killed and the job completed.
    CompleteJobAction Action = "CompleteJob"

    // ResumeJobAction is the action to resume an aborted job.
    ResumeJobAction Action = "ResumeJob"
    // SyncJobAction is the action to sync Job/Pod status.
    SyncJobAction Action = "SyncJob"
)

// LifecyclePolicy specifies the lifecycle and error handling of task and job.
type LifecyclePolicy struct {
    Event   Event            `json:"event,omitempty" protobuf:"bytes,1,opt,name=event"`
    Action  Action           `json:"action,omitempty" protobuf:"bytes,2,opt,name=action"`
    Timeout *metav1.Duration `json:"timeout,omitempty" protobuf:"bytes,3,opt,name=timeout"`
}
```

Both `JobSpec` and `TaskSpec` include lifecycle policies: the policies in `JobSpec` are the defaults used when a `TaskSpec`
has no policies; the policies in `TaskSpec` override the defaults.

```go
// JobSpec describes how the job execution will look like and when it will actually run
type JobSpec struct {
    ...

    // Specifies the default lifecycle of tasks
    // +optional
    Policies []LifecyclePolicy `json:"policies,omitempty" protobuf:"bytes,5,opt,name=policies"`

    // Tasks specifies the task specification of Job
    // +optional
    Tasks []TaskSpec `json:"tasks,omitempty" protobuf:"bytes,6,opt,name=tasks"`
}

// TaskSpec specifies the task specification of Job
type TaskSpec struct {
    ...

    // Specifies the lifecycle of tasks
    // +optional
    Policies []LifecyclePolicy `json:"policies,omitempty" protobuf:"bytes,4,opt,name=policies"`
}
```

The following examples demonstrate the usage of `LifecyclePolicy` for jobs and tasks.

For a training job of a machine learning framework, the whole job should be restarted if any task fails or is evicted.
To simplify the configuration, a job-level `LifecyclePolicy` is set as follows. As no `LifecyclePolicy` is set for any
task, all tasks use the policies in `spec.policies`.

```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: tf-job
spec:
  # On any event, restart the whole job.
  policies:
  - event: "*"
    action: RestartJob
  tasks:
  - name: "ps"
    replicas: 1
    template:
      spec:
        containers:
        - name: ps
          image: ps-img
  - name: "worker"
    replicas: 5
    template:
      spec:
        containers:
        - name: worker
          image: worker-img
  ...
```

Some big data frameworks (e.g. Spark) may have different requirements. Taking Spark as an example, the whole job is restarted
if a 'driver' task fails, and only the task is restarted if an 'executor' task fails. The `OnFailure` restartPolicy is set for the executor
and `RestartJob` is set in the driver's `spec.tasks.policies` as follows.

```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: spark-job
spec:
  tasks:
  - name: "driver"
    replicas: 1
    policies:
    - event: "*"
      action: RestartJob
    template:
      spec:
        containers:
        - name: driver
          image: driver-img
  - name: "executor"
    replicas: 5
    template:
      spec:
        containers:
        - name: executor
          image: executor-img
        restartPolicy: OnFailure
```

## Features Interaction

### Admission Controller

The following validations must be included to ensure the expected behaviour:

* `spec.minAvailable` <= sum(`spec.tasks.replicas`)
* no duplicated name in the `spec.tasks` array
* no duplicated event handler in a `LifecyclePolicy` array, for both job policies and task policies

### CoScheduling

CoScheduling (or gang-scheduling) is required by most high-performance workloads, e.g. TF training jobs, MPI jobs.
`spec.minAvailable` identifies how many pods are scheduled together. The default value of `spec.minAvailable`
is the sum of `spec.tasks.replicas`. The admission controller webhook checks `spec.minAvailable` against
the sum of `spec.tasks.replicas`; job creation is rejected if `spec.minAvailable` > sum(`spec.tasks.replicas`).
If `spec.minAvailable` < sum(`spec.tasks.replicas`), the pods of `spec.tasks` are created in random order;
refer to the [Task Priority within Job](#task-priority-within-job) section on how to create tasks in order.

```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: tf-job
spec:
  # minAvailable to run job
  minAvailable: 6
  tasks:
  - name: "ps"
    replicas: 1
    template:
      spec:
        containers:
        - name: "ps"
          image: "ps-img"
  - name: "worker"
    replicas: 5
    template:
      spec:
        containers:
        - name: "worker"
          image: "worker-img"
```

### Task Priority within Job

In addition to multiple pod templates, the priority of each task may be different. The `PriorityClass` of the `PodTemplate` is reused
to define the priority of a task within a job. The following example runs a Spark job with 1 driver and 5 executors; the driver's
priority is `master-pri`, which is higher than that of normal pods. As `spec.minAvailable` is 3, the scheduler makes sure that one driver
and 2 executors are scheduled when there are not enough resources for the whole job.

```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: spark-job
spec:
  minAvailable: 3
  tasks:
  - name: "driver"
    replicas: 1
    template:
      spec:
        priorityClassName: "master-pri"
        containers:
        - name: driver
          image: driver-img
  - name: "executor"
    replicas: 5
    template:
      spec:
        containers:
        - name: executor
          image: executor-img
```

**NOTE**: although the scheduler makes sure that the high-priority pods of a job are scheduled first, there is still a race
condition between different kubelets, so a low-priority pod may be launched earlier; job/task dependency will be introduced
later to handle this kind of race condition.
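
The three admission checks listed under [Admission Controller](#admission-controller) can be sketched in Go as below. This is only a minimal illustration, not Volcano's webhook code: the types are trimmed-down, hypothetical copies of the API types in this document, and `ValidateJobSpec` and `checkPolicies` are names invented for the example.

```go
package main

import "fmt"

// Hypothetical, trimmed-down copies of the API types in this document;
// only the fields needed by the checks are kept.
type LifecyclePolicy struct {
	Event  string
	Action string
}

type TaskSpec struct {
	Name     string
	Replicas int32
	Policies []LifecyclePolicy
}

type JobSpec struct {
	MinAvailable int32
	Policies     []LifecyclePolicy
	Tasks        []TaskSpec
}

// checkPolicies rejects a policy list that handles the same event twice.
func checkPolicies(scope string, policies []LifecyclePolicy) error {
	seen := map[string]bool{}
	for _, p := range policies {
		if seen[p.Event] {
			return fmt.Errorf("%s: duplicated policy for event %q", scope, p.Event)
		}
		seen[p.Event] = true
	}
	return nil
}

// ValidateJobSpec applies the three rules: unique task names, unique
// events per policy list, and minAvailable <= sum of replicas.
func ValidateJobSpec(spec JobSpec) error {
	if err := checkPolicies("job", spec.Policies); err != nil {
		return err
	}
	names := map[string]bool{}
	var total int32
	for _, t := range spec.Tasks {
		if names[t.Name] {
			return fmt.Errorf("duplicated task name %q", t.Name)
		}
		names[t.Name] = true
		total += t.Replicas
		if err := checkPolicies("task "+t.Name, t.Policies); err != nil {
			return err
		}
	}
	if spec.MinAvailable > total {
		return fmt.Errorf("minAvailable (%d) exceeds total replicas (%d)", spec.MinAvailable, total)
	}
	return nil
}

func main() {
	spec := JobSpec{
		MinAvailable: 7, // larger than 1 + 5 replicas, so this spec is rejected
		Tasks: []TaskSpec{
			{Name: "ps", Replicas: 1},
			{Name: "worker", Replicas: 5},
		},
	}
	fmt.Println(ValidateJobSpec(spec))
}
```

Setting `MinAvailable` to 6 or less in the example makes the same spec pass. Defaulting `minAvailable` to the sum of replicas when it is unset is left out of the sketch.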

### Resource sharing between Job

By default, `spec.minAvailable` is set to the sum of `spec.tasks.replicas`; if it is set to a smaller value,
the pods beyond `spec.minAvailable` share resources with other jobs.

```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: spark-job
spec:
  minAvailable: 3
  tasks:
  - name: "driver"
    replicas: 1
    template:
      spec:
        priorityClassName: "master-pri"
        containers:
        - name: driver
          image: driver-img
  - name: "executor"
    replicas: 5
    template:
      spec:
        containers:
        - name: executor
          image: executor-img
```

### Plugins for Job

Many AI framework jobs, e.g. TensorFlow, MPI, MXNet, need environment variables to be set, pods that can communicate with each other, and password-less SSH sign-in.
Job API plugins are provided to cover these needs so that users can focus on their core business.
There are currently three plugins; each plugin takes parameters, and defaults are used if none are provided.

* env: sets `VK_TASK_INDEX` in each container; it is an index that gives the container its identity.
* svc: creates a Service and `*.host` to enable pods to communicate.
* ssh: enables password-less SSH sign-in, e.g. for the `mpirun` or `mpiexec` commands.

```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: mpi-job
spec:
  minAvailable: 2
  schedulerName: volcano
  policies:
  - event: PodEvicted
    action: RestartJob
  plugins:
    ssh: []
    env: []
    svc: []
  tasks:
  - replicas: 1
    name: mpimaster
    template:
      spec:
        containers:
        - image: mpi-image
          name: mpimaster
  - replicas: 2
    name: mpiworker
    template:
      spec:
        containers:
        - image: mpi-image
          name: mpiworker
```

## Appendix

```go
type Job struct {
    metav1.TypeMeta `json:",inline"`

    metav1.ObjectMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`

    // Specification of the desired behavior of the Job, including the minAvailable
    // +optional
    Spec JobSpec `json:"spec,omitempty" protobuf:"bytes,2,opt,name=spec"`

    // Current status of Job
    // +optional
    Status JobStatus `json:"status,omitempty" protobuf:"bytes,3,opt,name=status"`
}

// JobSpec describes how the job execution will look like and when it will actually run
type JobSpec struct {
    // SchedulerName is the default value of `tasks.template.spec.schedulerName`.
    // +optional
    SchedulerName string `json:"schedulerName,omitempty" protobuf:"bytes,1,opt,name=schedulerName"`

    // The minimal available pods to run for this Job
    // +optional
    MinAvailable int32 `json:"minAvailable,omitempty" protobuf:"bytes,2,opt,name=minAvailable"`

    // The volumes mount on Job
    Volumes []VolumeSpec `json:"volumes,omitempty" protobuf:"bytes,3,opt,name=volumes"`

    // Tasks specifies the task specification of Job
    // +optional
    Tasks []TaskSpec `json:"tasks,omitempty" protobuf:"bytes,4,opt,name=tasks"`

    // Specifies the default lifecycle of tasks
    // +optional
    Policies []LifecyclePolicy `json:"policies,omitempty" protobuf:"bytes,5,opt,name=policies"`

    // Specifies the plugin of job
    // Key is plugin name, value is the arguments of the plugin
    // +optional
    Plugins map[string][]string `json:"plugins,omitempty" protobuf:"bytes,6,opt,name=plugins"`

    // Specifies the queue that will be used in the scheduler; the "default" queue is used if this is left empty.
    Queue string `json:"queue,omitempty" protobuf:"bytes,7,opt,name=queue"`

    // Specifies the maximum number of retries before marking this Job failed.
    // Defaults to 3.
    // +optional
    MaxRetry int32 `json:"maxRetry,omitempty" protobuf:"bytes,8,opt,name=maxRetry"`
}

// VolumeSpec defines the specification of Volume, e.g. PVC
type VolumeSpec struct {
    MountPath string `json:"mountPath" protobuf:"bytes,1,opt,name=mountPath"`

    // defined the PVC name
    VolumeClaimName string `json:"volumeClaimName,omitempty" protobuf:"bytes,2,opt,name=volumeClaimName"`

    // VolumeClaim defines the PVC used by the VolumeMount.
    VolumeClaim *v1.PersistentVolumeClaimSpec `json:"volumeClaim,omitempty" protobuf:"bytes,3,opt,name=volumeClaim"`
}

// Event represents the phase of Job, e.g. pod-failed.
type Event string

const (
    // AllEvents means all events
    AllEvents Event = "*"
    // PodFailedEvent is triggered if a Pod failed
    PodFailedEvent Event = "PodFailed"
    // PodEvictedEvent is triggered if a Pod was deleted
    PodEvictedEvent Event = "PodEvicted"
    // JobUnknownEvent covers the events that can lead to job 'Unknown':
    // 1. Task Unschedulable: triggered when some pods can't be scheduled
    //    while others are already running, in the gang-scheduling case.
    JobUnknownEvent Event = "Unknown"

    // OutOfSyncEvent is triggered if a Pod/Job was updated
    OutOfSyncEvent Event = "OutOfSync"
    // CommandIssuedEvent is triggered if a command is raised by user
    CommandIssuedEvent Event = "CommandIssued"
    // TaskCompletedEvent is triggered if 'Replicas' pods of one task have succeeded
    TaskCompletedEvent Event = "TaskCompleted"
)

// Action is the action that Job controller will take according to the event.
type Action string

const (
    // AbortJobAction if this action is set, the whole job will be aborted:
    // all Pods of the Job will be evicted, and no Pod will be recreated
    AbortJobAction Action = "AbortJob"
    // RestartJobAction if this action is set, the whole job will be restarted
    RestartJobAction Action = "RestartJob"
    // TerminateJobAction if this action is set, the whole job will be terminated
    // and can not be resumed: all Pods of the Job will be evicted, and no Pod will be recreated.
    TerminateJobAction Action = "TerminateJob"
    // CompleteJobAction if this action is set, the unfinished pods will be killed and the job completed.
    CompleteJobAction Action = "CompleteJob"

    // ResumeJobAction is the action to resume an aborted job.
    ResumeJobAction Action = "ResumeJob"
    // SyncJobAction is the action to sync Job/Pod status.
    SyncJobAction Action = "SyncJob"
)

// LifecyclePolicy specifies the lifecycle and error handling of task and job.
type LifecyclePolicy struct {
    // The action that will be taken to the PodGroup according to Event.
    // One of "Restart", "None".
    // Default to None.
    // +optional
    Action Action `json:"action,omitempty" protobuf:"bytes,1,opt,name=action"`

    // The Event recorded by scheduler; the controller takes actions
    // according to this Event.
    // +optional
    Event Event `json:"event,omitempty" protobuf:"bytes,2,opt,name=event"`

    // Timeout is the grace period for controller to take actions.
    // Default to nil (take action immediately).
    // +optional
    Timeout *metav1.Duration `json:"timeout,omitempty" protobuf:"bytes,3,opt,name=timeout"`
}

// TaskSpec specifies the task specification of Job
type TaskSpec struct {
    // Name specifies the name of tasks
    Name string `json:"name,omitempty" protobuf:"bytes,1,opt,name=name"`

    // Replicas specifies the replicas of this TaskSpec in Job
    Replicas int32 `json:"replicas,omitempty" protobuf:"bytes,2,opt,name=replicas"`

    // Specifies the pod that will be created for this TaskSpec
    // when executing a Job
    Template v1.PodTemplateSpec `json:"template,omitempty" protobuf:"bytes,3,opt,name=template"`

    // Specifies the lifecycle of task
    // +optional
    Policies []LifecyclePolicy `json:"policies,omitempty" protobuf:"bytes,4,opt,name=policies"`
}

type JobPhase string

const (
    // Pending is the phase that job is pending in the queue, waiting for scheduling decision
    Pending JobPhase = "Pending"
    // Aborting is the phase that job is aborted, waiting for releasing pods
    Aborting JobPhase = "Aborting"
    // Aborted is the phase that job is aborted by user or error handling
    Aborted JobPhase = "Aborted"
    // Running is the phase that minimal available tasks of Job are running
    Running JobPhase = "Running"
    // Restarting is the phase that the Job is restarted, waiting for pod releasing and recreating
    Restarting JobPhase = "Restarting"
    // Completing is the phase that required tasks of job are completed, job starts to clean up
    Completing JobPhase = "Completing"
    // Completed is the phase that all tasks of Job are completed successfully
    Completed JobPhase = "Completed"
    // Terminating is the phase that the Job is terminated, waiting for releasing pods
    Terminating JobPhase = "Terminating"
    // Terminated is the phase that the job is finished unexpectedly, e.g. events
    Terminated JobPhase = "Terminated"
    // Failed is the phase that the job has failed after its restarts reached the maximum number of retries.
    Failed JobPhase = "Failed"
)

// JobState contains details for the current state of the job.
type JobState struct {
    // The phase of Job.
    // +optional
    Phase JobPhase `json:"phase,omitempty" protobuf:"bytes,1,opt,name=phase"`

    // Unique, one-word, CamelCase reason for the phase's last transition.
    // +optional
    Reason string `json:"reason,omitempty" protobuf:"bytes,2,opt,name=reason"`

    // Human-readable message indicating details about last transition.
    // +optional
    Message string `json:"message,omitempty" protobuf:"bytes,3,opt,name=message"`
}

// JobStatus represents the current status of a Job
type JobStatus struct {
    // Current state of Job.
    State JobState `json:"state,omitempty" protobuf:"bytes,1,opt,name=state"`

    // The number of pending pods.
    // +optional
    Pending int32 `json:"pending,omitempty" protobuf:"bytes,2,opt,name=pending"`

    // The number of running pods.
    // +optional
    Running int32 `json:"running,omitempty" protobuf:"bytes,3,opt,name=running"`

    // The number of pods which reached phase Succeeded.
    // +optional
    Succeeded int32 `json:"succeeded,omitempty" protobuf:"bytes,4,opt,name=succeeded"`

    // The number of pods which reached phase Failed.
    // +optional
    Failed int32 `json:"failed,omitempty" protobuf:"bytes,5,opt,name=failed"`

    // The minimal available pods to run for this Job
    // +optional
    MinAvailable int32 `json:"minAvailable,omitempty" protobuf:"bytes,6,opt,name=minAvailable"`

    // The number of pods which reached phase Terminating.
    // +optional
    Terminating int32 `json:"terminating,omitempty" protobuf:"bytes,7,opt,name=terminating"`
}

// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object
type JobList struct {
    metav1.TypeMeta `json:",inline"`
    metav1.ListMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`

    Items []Job `json:"items" protobuf:"bytes,2,rep,name=items"`
}

```
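
As a companion to the `LifecyclePolicy`, `Event` and `Action` definitions above, the following Go sketch shows one way the rule "task policies override job defaults" from [Error Handling](#error-handling) could be evaluated. It rests on assumptions that this document does not state: the override is applied per event (a task policy wins only when it matches the event or uses `*`), and `SyncJob` is used when nothing matches. The types are trimmed-down copies, and `ResolveAction` and `match` are hypothetical names, not part of the Volcano API.

```go
package main

import "fmt"

// Trimmed-down copies of the types defined in this document.
type Event string
type Action string

const (
	AllEvents       Event = "*"
	PodFailedEvent  Event = "PodFailed"
	PodEvictedEvent Event = "PodEvicted"

	AbortJobAction   Action = "AbortJob"
	RestartJobAction Action = "RestartJob"
	SyncJobAction    Action = "SyncJob"
)

type LifecyclePolicy struct {
	Event  Event
	Action Action
}

// match returns the action of the first policy that names the event
// or uses the "*" wildcard.
func match(policies []LifecyclePolicy, e Event) (Action, bool) {
	for _, p := range policies {
		if p.Event == e || p.Event == AllEvents {
			return p.Action, true
		}
	}
	return "", false
}

// ResolveAction consults the task policies first and the job policies
// second. Falling back to SyncJobAction is an assumption of this sketch.
func ResolveAction(jobPolicies, taskPolicies []LifecyclePolicy, e Event) Action {
	if a, ok := match(taskPolicies, e); ok {
		return a
	}
	if a, ok := match(jobPolicies, e); ok {
		return a
	}
	return SyncJobAction
}

func main() {
	job := []LifecyclePolicy{{Event: AllEvents, Action: RestartJobAction}}
	driver := []LifecyclePolicy{{Event: PodFailedEvent, Action: AbortJobAction}}

	fmt.Println(ResolveAction(job, driver, PodFailedEvent))  // AbortJob: the task policy wins
	fmt.Println(ResolveAction(job, driver, PodEvictedEvent)) // RestartJob: the job default applies
	fmt.Println(ResolveAction(nil, nil, PodFailedEvent))     // SyncJob: assumed fallback
}
```

If the intended semantics are instead that any non-empty task policy list replaces the job policies entirely, only `ResolveAction` changes: it would skip the job list whenever `taskPolicies` is non-empty.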