# Build Pipelines

!!! Warning
    Build Pipelines are an [experimental feature](../../../contributing/supported-releases/#experimental).

A build pipeline is a useful feature when iterating on the code in your pipeline. In essence, build pipelines automate or remove the need for Steps 2-4 of the [pipeline workflow](working-with-pipelines.md). They allow you to bypass the Docker build process and submit your code directly to the pipeline. A diagram of the build pipeline process is shown below.

![Developer workflow](../../assets/images/d_steps_build_pipeline.svg)

Functionally, a build pipeline relies on a base Docker image that remains unchanged during the development process. Code and build assets are stored in Pachyderm itself and copied into the pipeline pod when it executes.

To enable this feature, add a `build` object to the pipeline spec's `transform` object, with the following fields:

- `path`: An optional string specifying where the source code is located, relative to the pipeline spec path (or to the current working directory if the pipeline spec is fed into `pachctl` via stdin).
- `language`: An optional string specifying which language builder to use (see [Builders](#builders)). Only works with official builders. If unspecified, `image` is used instead.
- `image`: An optional string specifying which builder image to use, if a non-official builder is desired. If unspecified, the `transform` object's `image` is used instead.

Below is a Python example of a build pipeline.

```json
{
  "pipeline": {
    "name": "map"
  },
  "description": "A pipeline that tokenizes scraped pages and appends counts of words to corresponding files.",
  "transform": {
    "build": {
      "language": "python",
      "path": "./source"
    }
  },
  "input": {
    "pfs": {
      "repo": "scraper",
      "glob": "/*"
    }
  }
}
```

A build pipeline can be submitted the same way as any other pipeline, for example:

```shell
pachctl update pipeline -f <path to pipeline spec>
```

## How it works

When a build pipeline is submitted, the following actions occur:

1. All files from the pipeline build `path` are copied to a PFS repository, `<pipeline name>_build`, which acts as the source code repository. In the example above, everything in `./source` is copied to the PFS `map_build` repository.

2. A pipeline that uses the same repo but a different branch starts, reads the source code, and creates build assets (for example, by pulling in dependencies or compiling) by running a `build.sh` script.

3. The running pipeline, `<pipeline name>`, is updated with the new source files and built assets, and then executes `sh /pfs/build/run.sh` whenever a job for that pipeline starts.

!!! note
    You can optionally specify a `.pachignore` file in the source root directory, which uses [ohmyglob](https://github.com/pachyderm/ohmyglob) entries to prevent certain files from getting pushed to the `<pipeline name>_build` repo.

The updated pipeline contains the following PFS repos mapped in as inputs:

1. `/pfs/source` - source code that is required for running the pipeline.

1. `/pfs/build` - any artifacts resulting from the build process.

1. `/pfs/<input(s)>` - any inputs specified in the pipeline spec.

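To make the naming conventions above concrete, the following Python sketch derives the build repo name, the mount points, and the run command from a pipeline spec. It only models what this page describes and is not Pachyderm's own implementation; the function names `collect_pfs_inputs` and `describe_build_pipeline` are hypothetical. It also assumes that a PFS input is mounted under its `name` field when one is set and under its repo name otherwise.

```python
def collect_pfs_inputs(node):
    """Return the mount name of every `pfs` input found in an input tree."""
    names = []
    if isinstance(node, dict):
        pfs = node.get("pfs")
        if isinstance(pfs, dict):
            # Assumption: the mount name is `name` if set, else the repo name.
            names.append(pfs.get("name") or pfs["repo"])
        for key, value in node.items():
            if key != "pfs":
                names.extend(collect_pfs_inputs(value))
    elif isinstance(node, list):
        for item in node:
            names.extend(collect_pfs_inputs(item))
    return names


def describe_build_pipeline(spec):
    """Summarize the repos and paths a build pipeline would use (illustrative)."""
    name = spec["pipeline"]["name"]
    mounts = ["/pfs/source", "/pfs/build"]
    mounts += ["/pfs/" + n for n in collect_pfs_inputs(spec.get("input", {}))]
    return {
        "build_repo": name + "_build",  # <pipeline name>_build
        "mounts": mounts,
        "run_command": ["sh", "/pfs/build/run.sh"],
    }


# The `map` pipeline spec from the example earlier on this page.
spec = {
    "pipeline": {"name": "map"},
    "transform": {"build": {"language": "python", "path": "./source"}},
    "input": {"pfs": {"repo": "scraper", "glob": "/*"}},
}

layout = describe_build_pipeline(spec)
print(layout["build_repo"])
for mount in layout["mounts"]:
    print(mount)
```

For the `map` example, the sketch reports the `map_build` repo and the `/pfs/source`, `/pfs/build`, and `/pfs/scraper` mounts.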
## Builders

The builder interprets the pipeline spec to determine:

* A Docker image to use as the base image.
* Steps to run for the build.
* Step to run upon deployment.

The `transform.build.language` field is required only when you use an official builder (currently `python` or `go`). Official builders already provide implementations of `build.sh` and `run.sh`.

### Python Builder

The Python builder relies on a file structure similar to the following:

```tree
./map
├── source
│   ├── requirements.txt
│   ├── ...
│   └── main.py
```

A `main.py` file must exist; it acts as the entrypoint for the pipeline. Optionally, a `requirements.txt` file can specify pip packages to install during the build process. Other supporting files in the directory are also copied and made available in the pipeline unless they are excluded by `.pachignore`.

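As an illustration of what such an entrypoint might look like, here is a minimal `main.py` sketch for the `map` example. It is not the code from the Pachyderm examples. It assumes the `scraper` input is mounted at `/pfs/scraper` and that results are written to `/pfs/out`, the usual Pachyderm output directory. The regex-based tokenization, the `word<TAB>count` output format, and the `count_words` helper are all assumptions made for this sketch.

```python
import os
import re
from collections import Counter


def count_words(input_dir, output_dir):
    """For each file under input_dir, write `word<TAB>count` lines to output_dir."""
    for root, _dirs, files in os.walk(input_dir):
        for filename in files:
            src = os.path.join(root, filename)
            with open(src, encoding="utf-8", errors="replace") as f:
                # Naive tokenization: lowercase runs of letters, digits, apostrophes.
                words = re.findall(r"[a-z0-9']+", f.read().lower())
            counts = Counter(words)

            # Mirror the input's relative path under the output directory.
            dst = os.path.join(output_dir, os.path.relpath(src, input_dir))
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            with open(dst, "w", encoding="utf-8") as out:
                for word, n in sorted(counts.items()):
                    out.write(f"{word}\t{n}\n")


if __name__ == "__main__":
    # Assumed mount points for the `map` example pipeline.
    count_words("/pfs/scraper", "/pfs/out")
```

Keeping the directories as parameters makes it possible to run the same function locally against a scratch directory before submitting the pipeline.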
### Go Builder

The Go builder follows the same format as the [Python Builder](#python-builder). There must be a main source file in the source root that imports and invokes the intended code.

### Creating a Builder

Users can author their own builders for languages other than Python and Go, or to customize the official builders. Builders are somewhat similar to buildpacks in design and follow a convention-over-configuration approach. The current [official builders](https://github.com/pachyderm/pachyderm/tree/master/etc/pipeline-build) can be used for reference.

A builder needs three things:

- A Dockerfile to bake the image specified in the build pipeline spec.
- A `build.sh` in the image workdir, which acts as the entry point for the build pipeline.
- A `run.sh`, injected into `/pfs/out` via `build.sh`, which acts as the entry point for the executing pipeline. By convention, `run.sh` should take an arbitrary number of arguments and forward them to whatever executes the actual user code.

The builder's file structure looks similar to the following:

```tree
<language>
├── Dockerfile
├── build.sh
└── run.sh
```

The `transform.build.image` field in the pipeline spec defines the base image for unofficial builders. The order of preference for determining the Docker image is:

1. `transform.build.language`
2. `transform.build.image`
3. `transform.image`

The convention is to provide `build.sh` and `run.sh` scripts to fulfill the build pipeline requirements; however, if a `transform.cmd` is specified, it takes precedence over `run.sh`.
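The two precedence rules above can be sketched in Python as follows. This only models the rules as stated on this page, not Pachyderm's actual resolution code. The function names and the `my-registry/custom-builder:1.0` image are hypothetical. Because this page does not say which image an official `language` maps to, the sketch reports which field wins rather than a concrete image name for that case.

```python
def resolve_builder_image(transform):
    """Return (winning field, value) using the documented order of preference."""
    build = transform.get("build", {})
    if build.get("language"):
        return ("transform.build.language", build["language"])
    if build.get("image"):
        return ("transform.build.image", build["image"])
    return ("transform.image", transform.get("image"))


def resolve_run_command(transform):
    """Return the command a job runs: `transform.cmd` wins over `run.sh`."""
    cmd = transform.get("cmd")
    if cmd:
        return list(cmd)
    return ["sh", "/pfs/build/run.sh"]


# Hypothetical transform using an unofficial builder image and a custom command.
transform = {
    "image": "python:3.8",
    "cmd": ["python3", "/pfs/source/main.py"],
    "build": {"image": "my-registry/custom-builder:1.0", "path": "./source"},
}

print(resolve_builder_image(transform))
print(resolve_run_command(transform))
```

In this example `transform.build.image` wins over `transform.image` because no `language` is set, and the explicit `cmd` replaces the default `run.sh` invocation.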