# Build Pipelines

Build pipelines are useful when iterating on the code in your pipeline. They let you bypass the Docker build process and submit your code directly to the pipeline. In essence, build pipelines automate Steps 2-4 of the [pipeline workflow](working-with-pipelines.md). The diagram below shows the build pipeline process.

![Developer workflow](../../assets/images/d_steps_build_pipeline.svg)

Functionally, a build pipeline relies on a base Docker image that remains unchanged during the development process. Code and build assets are stored in Pachyderm itself and copied into the pipeline pod when it executes.

To enable this feature, add a `build` object to the pipeline spec's `transform` object, with the following fields:

- `path`: An optional string specifying where the source code is, relative to the pipeline spec path (or to the current working directory if the pipeline is fed into `pachctl` via stdin).
- `language`: An optional string specifying which language builder to use (see below). Only works with official builders. If unspecified, `image` will be used instead.
- `image`: An optional string specifying which builder image to use, if a non-official builder is desired. If unspecified, the `transform` object's `image` will be used instead.
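
The rule for resolving `path` can be illustrated with a minimal Python sketch. This is not how `pachctl` implements it. The function name `resolve_source_dir` is hypothetical, and the sketch assumes that "relative to the pipeline spec path" means relative to the directory containing the spec file, and that a spec file of `None` or `-` stands for stdin.

```python
import os


def resolve_source_dir(build_path, spec_file=None):
    """Return the absolute source directory for a build pipeline.

    build_path: the value of transform.build.path.
    spec_file:  path of the spec file, or None / "-" when the spec is
                read from stdin (an assumption of this sketch).
    """
    if spec_file in (None, "-"):
        # Spec came from stdin: resolve against the current working directory.
        base = os.getcwd()
    else:
        # Spec came from a file: resolve against the directory containing it.
        base = os.path.dirname(os.path.abspath(spec_file))
    return os.path.normpath(os.path.join(base, build_path))
```

With a spec stored at `/work/map/pipeline.json` and `"path": "./source"`, this sketch resolves the source directory to `/work/map/source`.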

Below is a Python example of a build pipeline.

```json
{
  "pipeline": {
    "name": "map"
  },
  "description": "A pipeline that tokenizes scraped pages and appends counts of words to corresponding files.",
  "transform": {
    "build": {
      "language": "python",
      "path": "./source"
    }
  },
  "input": {
    "pfs": {
      "repo": "scraper",
      "glob": "/*"
    }
  }
}
```

A build pipeline can be submitted the same way as any other pipeline, for example:

```shell
pachctl update pipeline -f <pipeline spec file>
```

## How it works

When a build pipeline is submitted, the following actions occur:

1. All files from the pipeline build `path` are copied to a PFS repository, `<pipeline name>_build`, which we can think of as the source code repository. In the case above, everything in `./source` would be copied to the PFS `map_build` repository.

2. A pipeline that uses the same repo but a different branch starts, reads the source code, and runs a `build.sh` script to create build assets (for example, pulling in dependencies or compiling).

3. The running pipeline, `<pipeline name>`, is updated with the new source files and built assets, then executes `sh /pfs/build/run.sh` when a job for that pipeline is started.

!!! note
      You can optionally specify a `.pachignore` file in the source root directory, which uses [ohmyglob](https://github.com/pachyderm/ohmyglob) entries to prevent certain files from getting pushed to this repo.

The updated pipeline contains the following PFS repos mapped in as inputs:

1. `/pfs/source` - source code that is required for running the pipeline.

1. `/pfs/build` - any artifacts resulting from the build process.

1. `/pfs/<input(s)>` - any inputs specified in the pipeline spec.
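
As a rough illustration of the naming described above, the following Python sketch derives the build repo name and the mount points from a spec. The helper `build_pipeline_layout` is hypothetical and is not part of Pachyderm. It assumes a single top-level `pfs` input that is mounted under its `name` field when present and under its repo name otherwise.

```python
def build_pipeline_layout(spec):
    """Derive the build repo name and input mount points for a spec.

    Handles only a single top-level `pfs` input; other input types
    are out of scope for this sketch.
    """
    name = spec["pipeline"]["name"]
    pfs = spec["input"]["pfs"]
    # Assumption: a PFS input is mounted under its `name`, else its repo name.
    input_name = pfs.get("name", pfs["repo"])
    return {
        "build_repo": f"{name}_build",
        "mounts": ["/pfs/source", "/pfs/build", f"/pfs/{input_name}"],
    }


spec = {
    "pipeline": {"name": "map"},
    "transform": {"build": {"language": "python", "path": "./source"}},
    "input": {"pfs": {"repo": "scraper", "glob": "/*"}},
}
layout = build_pipeline_layout(spec)
print(layout["build_repo"])  # map_build
print(layout["mounts"])      # ['/pfs/source', '/pfs/build', '/pfs/scraper']
```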

## Builders

The builder interprets the pipeline spec to determine:

* A Docker image to use as the base image.
* Steps to run for the build.
* The step to run upon deployment.

The `transform.build.language` field is required only to use an official builder (currently `python` or `go`), each of which already has implementations of `build.sh` and `run.sh`.

### Python Builder

The Python builder relies on a file structure similar to the following:

```tree
./map
├── source
│   ├── requirements.txt
│   ├── ...
│   └── main.py
```

There must be a `main.py`, which acts as the entrypoint for the pipeline. Optionally, a `requirements.txt` can specify pip packages to install during the build process. Other supporting files in the directory are also copied and made available in the pipeline unless they are excluded by `.pachignore`.
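
For orientation, here is a minimal sketch of what a `main.py` for the `map` pipeline above might look like. It is a hypothetical illustration and not the code from the Pachyderm examples. It assumes the `scraper` input is mounted at `/pfs/scraper`, that output is written to `/pfs/out`, and that "appending counts" means appending one count line per input file to a file named after each word.

```python
"""main.py: hypothetical entrypoint for the `map` pipeline."""
import collections
import os
import re

INPUT_DIR = "/pfs/scraper"  # assumed mount point of the `scraper` input repo
OUTPUT_DIR = "/pfs/out"     # assumed output directory of the pipeline


def count_words(input_dir, output_dir):
    """Tokenize every file under input_dir and append per-word counts.

    For each word, the count found in one input file is appended as a
    line to output_dir/<word>.
    """
    os.makedirs(output_dir, exist_ok=True)
    for root, _dirs, files in os.walk(input_dir):
        for filename in sorted(files):
            with open(os.path.join(root, filename), encoding="utf-8") as f:
                words = re.findall(r"[a-z]+", f.read().lower())
            for word, count in sorted(collections.Counter(words).items()):
                out_path = os.path.join(output_dir, word)
                with open(out_path, "a", encoding="utf-8") as out:
                    out.write(f"{count}\n")


# Only run when the input mount exists, i.e. inside a pipeline pod.
if __name__ == "__main__" and os.path.isdir(INPUT_DIR):
    count_words(INPUT_DIR, OUTPUT_DIR)
```

Keeping the work in a function that takes the directories as arguments makes it possible to exercise the logic locally against temporary directories before submitting the pipeline.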

### Go Builder

The Go Builder follows the same format as the [Python Builder](#python-builder). There must be a main source file in the source root that imports and invokes the intended code.

### Creating a Builder

Users can author their own builders for languages other than Python and Go (or to customize the official builders). Builders are somewhat similar to buildpacks in design, and follow a convention-over-configuration approach. The current [official builders](https://github.com/pachyderm/pachyderm/tree/master/etc/pipeline-build) can be used for reference.

A builder needs three things:

- A Dockerfile to bake the image specified in the build pipeline spec.
- A `build.sh` in the image workdir, which acts as the entry-point for the build pipeline.
- A `run.sh`, injected into `/pfs/out` via `build.sh`. This will act as the entry-point for the executing pipeline. By convention, `run.sh` should take an arbitrary number of arguments and forward them to whatever executes the actual user code.

The builder file structure would look similar to the following:

```tree
<language>
├── Dockerfile
├── build.sh
└── run.sh
```

The `transform.build.image` field in the pipeline spec defines the base image for unofficial builders. The order of preference for determining the Docker image is:

1. `transform.build.language`
2. `transform.build.image`
3. `transform.image`

The convention is to provide `build.sh` and `run.sh` scripts to fulfill the build pipeline requirements; however, if a `transform.cmd` is specified, it will take precedence over `run.sh`.
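
The two precedence rules above can be summarized in a short Python sketch. The functions `resolve_image` and `resolve_command` are hypothetical helpers and not Pachyderm code. The `official_images` mapping and the image names in it are placeholders, since the actual names of the official builder images are not given here.

```python
def resolve_image(transform, official_images):
    """Pick the Docker image following the documented order of preference.

    official_images is a hypothetical mapping from language name to an
    official builder image.
    """
    build = transform.get("build", {})
    if build.get("language"):
        return official_images[build["language"]]
    if build.get("image"):
        return build["image"]
    return transform.get("image")


def resolve_command(transform):
    """Return transform.cmd if set, otherwise the default run.sh invocation."""
    return transform.get("cmd") or ["sh", "/pfs/build/run.sh"]


# Placeholder image names, for illustration only.
official = {"python": "example/python-builder", "go": "example/go-builder"}
t = {"image": "base:1", "build": {"image": "custom-builder:1", "path": "./source"}}
print(resolve_image(t, official))  # custom-builder:1
print(resolve_command(t))          # ['sh', '/pfs/build/run.sh']
```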