>![pach_logo](../img/pach_logo.svg) INFO - Pachyderm 2.0 introduces profound architectural changes to the product. As a result, our examples pre and post 2.0 are kept in two separate branches:
> - Branch Master: Examples using Pachyderm 2.0 and later versions - https://github.com/pachyderm/pachyderm/tree/master/examples
> - Branch 1.13.x: Examples using Pachyderm 1.13 and older versions - https://github.com/pachyderm/pachyderm/tree/1.13.x/examples

# Skip Failed Datums in Your Pipeline

This example describes how you can use the `err_cmd` and `err_stdin` fields
in your pipeline to fail a datum without failing the job and the whole
pipeline. This feature is useful when you have large datasets and multiple
datums, and you do not need to have them all processed successfully to
move to the next step in your DAG.

For more information about the `err_cmd` field, see
[the `err_cmd` documentation](../../docs/err_cmd.md).

## Prerequisites

Before you begin, verify that you have the following configured in your
environment:

* Pachyderm version 1.9.x or later
* A clone of the Pachyderm repository

To clone the Pachyderm repository, run the following command:

```shell
$ git clone git@github.com:pachyderm/pachyderm.git
```
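
If you are unsure whether your client meets the version prerequisite, a small
shell check can help. The following is a minimal sketch, not part of the
original example: `version_at_least` is a hypothetical helper, the sample value
`1.13.4` is a placeholder for whatever `pachctl version` reports in your
environment, and the comparison assumes a `sort` that supports the `-V`
(version sort) option, such as the one in GNU coreutils.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 is at least version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_at_least() {
  local have="$1" want="$2"
  # After a version sort, the smaller version comes first. If that is
  # the required version, the installed one is equal to it or newer.
  [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# Placeholder value: replace it with the version that `pachctl version` reports.
installed="1.13.4"

if version_at_least "$installed" "1.9.0"; then
  echo "Pachyderm $installed meets the 1.9.x prerequisite"
else
  echo "Pachyderm $installed is older than 1.9.x"
fi
```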

## Create a Repository

The first step is to create a repository called `input` by running the
following command:

```shell
$ pachctl create repo input
```

## Create a Pipeline

Next, you need to create a pipeline that uses the `input` repository
as input and has the `err_cmd` and `err_stdin` fields specified.

In this example, we use the [error_test.json](error_test.json)
pipeline:

```json
{
    "pipeline": {
        "name": "error_test"
    },
    "description": "A pipeline that checks if the `file1` is present in the datum.",
    "input": {
        "pfs": {
            "glob": "/*",
            "repo": "input"
        }
    },
    "transform": {
        "cmd": [ "bash" ],
        "stdin": [ "if", "[ -a /pfs/input/file1 ]", "then cp /pfs/input/* /pfs/out/", "exit 0", "fi", "exit 1" ],
        "err_cmd": [ "bash" ],
        "err_stdin": [ "if", "[ -a /pfs/input/file2 ]", "then", "exit 0", "fi", "exit 1" ]
    }
}
```

In the pipeline above, the glob pattern `/*` makes each top-level file in the
`input` repository a separate datum. The code in `stdin` checks if the datum
contains `file1`. If it does, the code copies everything in `/pfs/input/` to
the `/pfs/out` directory. If the datum does not include `file1`, the code
exits with a non-zero status, and Pachyderm runs the code in `err_stdin`
against the datum. That code checks if the datum has `file2`. If it does,
the datum is marked as recovered, and the job succeeds. If it does not,
the job fails.
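
The decision flow described above can be tried locally without a cluster. The
following is a minimal sketch that only mimics the logic and does not call
Pachyderm: `run_datum` is a hypothetical function, the temporary directories
stand in for `/pfs/input` and `/pfs/out`, and the test uses `-e` where the
pipeline uses the equivalent bash test `-a`.

```shell
#!/usr/bin/env bash
# Local sketch only: mimics the cmd / err_cmd decision for one datum directory.
# run_datum is a hypothetical name, not part of pachctl.
run_datum() {
  local input_dir="$1"
  local out_dir
  out_dir="$(mktemp -d)"   # stands in for /pfs/out

  if [ -e "$input_dir/file1" ]; then
    # Same check as "stdin": copy the files and succeed.
    cp "$input_dir"/* "$out_dir"/
    echo "success"
  elif [ -e "$input_dir/file2" ]; then
    # Same check as "err_stdin": the datum is recovered.
    echo "recovered"
  else
    # Both checks fail, so the datum (and the job) fails.
    echo "failed"
  fi
  rm -rf "$out_dir"
}

# With the glob "/*", each top-level file is its own datum,
# so each file gets its own input directory here.
work="$(mktemp -d)"
for name in file1 file2 file3; do
  mkdir -p "$work/$name"
  echo "data" > "$work/$name/$name"
  echo "$name: $(run_datum "$work/$name")"
done
rm -rf "$work"
```

Under these assumptions, the loop reports `success` for `file1`, `recovered`
for `file2`, and `failed` for `file3`.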

Create a pipeline by running the following command from the `examples/err_cmd/`
directory:

```shell
$ pachctl create pipeline -f error_test.json
```

Verify that the pipeline was successfully created:

```shell
$ pachctl list pipeline
NAME       VERSION INPUT    CREATED       STATE / LAST JOB
error_test 1       input:/* 5 seconds ago running / starting
```

## Add Files to the Input Repository

Now, let's add some files to the input repository to watch how your pipeline
code and error code work.

You will add three files, `file1`, `file2`, and `file3`, each of which contains
one line.

1. Add `file1`:

   ```shell
   $ echo "foo" | pachctl put file input@master:file1
   ```

   When you add `file1`, your pipeline should succeed:

   ```shell
   $ pachctl list job --no-pager
   ID                               PIPELINE   STARTED        DURATION           RESTART PROGRESS  DL UL STATE
   c8860dae5a054ec38a33068f75fe9690 error_test 13 seconds ago Less than a second 0       1 + 0 / 1 4B 4B success
   ```

   The `PROGRESS` column shows `1 + 0 / 1`, which means that one datum
   was processed successfully.

1. Add `file2`:

   ```shell
   $ echo "bar" | pachctl put file input@master:file2
   ```

   Processing of this datum fails, but because the `err_cmd` code ran successfully,
   the datum is marked as *recovered*, and the job finishes without errors.
   Only `file1` is available in the output commit.

   ```shell
   $ pachctl list job --no-pager
   ID                               PIPELINE   STARTED       DURATION           RESTART PROGRESS      DL UL STATE
   bc3da288ff884d5a9bcb312dd6cf07cb error_test 3 seconds ago Less than a second 0       0 + 1 + 1 / 2 0B 0B success
   c8860dae5a054ec38a33068f75fe9690 error_test 3 minutes ago Less than a second 0       1 + 0 / 1     4B 4B success
   ```

   In the `PROGRESS` column, `0 + 1 + 1 / 2` shows that the latest job did
   not process any datums successfully, but it had one skipped datum and one
   recovered datum. The skipped datum is `file1`, which the previous job
   already processed.

1. Add `file3`:

   ```shell
   $ echo "baz" | pachctl put file input@master:file3
   ```

   Because this datum contains neither `file1` nor `file2`, both the `cmd`
   and the `err_cmd` code exit with a non-zero status, and the job fails:

   ```shell
   $ pachctl list job --no-pager
   ID                               PIPELINE   STARTED        DURATION           RESTART PROGRESS      DL UL STATE
   272370ec03c24cc1be660bf97403712f error_test 26 seconds ago Less than a second 0       0 + 2 / 3     0B 0B failure: failed to process datum:...
   bc3da288ff884d5a9bcb312dd6cf07cb error_test 6 minutes ago  Less than a second 0       0 + 1 + 1 / 2 0B 0B success
   c8860dae5a054ec38a33068f75fe9690 error_test 10 minutes ago Less than a second 0       1 + 0 / 1     4B 4B success
   ```
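
The `PROGRESS` values in the job listings above can be split into named
counters for scripting. The following is a minimal sketch: `parse_progress` is
a hypothetical helper, not a `pachctl` feature, and it assumes that the column
reads `processed + skipped / total`, or `processed + skipped + recovered / total`
when a datum was recovered, as the listings above suggest.

```shell
#!/usr/bin/env bash
# Hypothetical helper: splits a PROGRESS value into named counters.
# Assumed layout: "processed + skipped [+ recovered] / total".
parse_progress() {
  local progress="$1"
  local counts total processed skipped recovered

  counts="${progress%%/*}"   # everything before the slash
  total="${progress##*/}"    # everything after the slash
  counts="${counts// /}"     # drop spaces, for example "0+1+1"
  total="${total// /}"

  # Split the counters on "+". The third field is empty when the
  # value has only two counters, so it defaults to 0 below.
  IFS='+' read -r processed skipped recovered <<< "$counts"
  echo "processed=$processed skipped=$skipped recovered=${recovered:-0} total=$total"
}

parse_progress "1 + 0 / 1"       # value from the first job
parse_progress "0 + 1 + 1 / 2"   # value from the second job
```

A helper like this could feed a monitoring script that alerts when the
recovered count is greater than zero, even though the job state is `success`.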