# Monitor Job Progress

After a pipeline starts a job, you can run one of the following commands
to monitor its status:

* `pachctl list pipeline`

  This command shows all the pipelines that run in your cluster
  and the status of the last job. The `STATE` column shows the current
  state of the pipeline. If a pipeline is in the `running` state,
  pods were spun up in the underlying Kubernetes cluster for this
  pipeline. The `running` state does not necessarily mean that the
  pipeline is actively processing a job. If you see `failed` in the
  `STATE` column, the Kubernetes cluster failed to schedule a pod for
  this pipeline.

  The `LAST JOB` column shows the status of the most recent job that ran
  for this pipeline, which can be `success`, `failed`, or `crashing`.
  If a pipeline is in a `failed` state, you need to find the reason for
  the failure and fix it. The `crashing` state indicates that the
  pipeline worker is failing for potentially transient reasons. The
  most common reasons for crashing are image pull failures, such as an
  incorrect image name or registry credentials, and scheduling failures,
  such as not enough resources on your Kubernetes cluster.

  **Example:**

  ```shell
  NAME    VERSION INPUT                 CREATED       STATE / LAST JOB    DESCRIPTION
  montage 1       (edges:/ ⨯ images:/)  2 seconds ago starting / starting A montage pipeline
  edges   1       images:/*             2 seconds ago running / starting  An edge detection pipeline.
  ```

* `pachctl list job`

  This command shows the jobs that were run for each pipeline. For each job,
  Pachyderm shows the number of datums in the **PROGRESS** column, the amount
  of downloaded and uploaded data, the duration, and other important information.
  The format of the **PROGRESS** value is `DATUMS PROCESSED + DATUMS SKIPPED / TOTAL DATUMS`.

  For more information, see
  [Datum Processing States](../../concepts/pipeline-concepts/datum/datum-processing-states/).

  **Example:**

  ```shell
  $ pachctl list job
  ID                               PIPELINE STARTED       DURATION           RESTART PROGRESS  DL       UL       STATE
  7321952b9a214d3dbb64cc4369cc67da montage  6 minutes ago 1 second           0       1 + 0 / 1 371.9KiB 1.283MiB success
  95adc138e82e48949909364e8b9dbb53 edges    6 minutes ago 1 second           0       2 + 1 / 3 181.1KiB 111.4KiB success
  84fe22432f22492c9fd4f23036c3c8b5 montage  6 minutes ago Less than a second 0       1 + 0 / 1 79.49KiB 378.6KiB success
  2fbbc54ab3514d8a94d1b7a75bab96a7 edges    6 minutes ago Less than a second 0       1 + 0 / 1 57.27KiB 22.22KiB success
  ```

* `pachctl list commit <repo>`

  This command shows the status of the downstream jobs in the DAG
  that result from a commit in the repository.
  The [Hyperparameter Tuning example](https://github.com/pachyderm/pachyderm/tree/master/examples/ml/hyperparameter)
  has four pipelines, or a four-stage pipeline. Every subsequent pipeline
  takes the results in the output repository of the previous pipeline
  and performs a computation. Therefore, the steps are executed one
  after another. The **PROGRESS** bar in the output of the
  `pachctl list commit <first-repo-in-dag>` command reflects this
  step-by-step execution.

  Running the command against the first repo in the DAG displays
  a progress bar that shows job progress for all steps in your DAG.

  The following animation shows how the progress bar is updated
  when a job for each pipeline completes.

  [Progress bar animation](../assets/images/list_commit_progress_bar.gif)

  The progress bar is divided equally among the steps, or pipelines,
  in your DAG. In the example above, there are four steps.
  If one of the jobs fails, the progress bar turns red
  for that pipeline step.
  To troubleshoot, look into that particular
  pipeline job.

!!! note "See Also"
    [Pipeline Troubleshooting](../../troubleshooting/pipeline_troubleshooting/)
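
The **PROGRESS** format described under `pachctl list job` can be split into its parts with a short shell function. The sketch below is illustrative only: `parse_progress` is a hypothetical helper, not part of `pachctl`, and it assumes the value has exactly the `DATUMS PROCESSED + DATUMS SKIPPED / TOTAL DATUMS` shape shown in the example output.

```shell
# Hypothetical helper (not a pachctl command): split a PROGRESS value
# such as "2 + 1 / 3" into processed, skipped, and total datum counts.
# Assumes the "PROCESSED + SKIPPED / TOTAL" shape described above.
parse_progress() {
  # Drop the spaces so that "2 + 1 / 3" becomes "2+1/3".
  compact=$(printf '%s' "$1" | tr -d ' ')

  processed=${compact%%+*}   # text before the "+"
  rest=${compact#*+}         # text after the "+"
  skipped=${rest%%/*}        # text between "+" and "/"
  total=${rest#*/}           # text after the "/"

  # Datums accounted for so far: processed plus skipped.
  accounted=$((processed + skipped))

  printf 'processed=%s skipped=%s total=%s accounted=%s\n' \
    "$processed" "$skipped" "$total" "$accounted"
}

# Value taken from the `edges` job in the example output.
parse_progress "2 + 1 / 3"
# prints: processed=2 skipped=1 total=3 accounted=3
```

In the successful jobs from the example output, the processed and skipped datums add up to the total, which is why `accounted` equals `total` here.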