# Mount a Volume in a Pipeline

You might have local or network-attached storage that you want your
pipeline to write files to.
You can mount that storage as a volume in Kubernetes
and make it available in your pipeline worker by using the
`pod_patch` pipeline parameter.
The `pod_patch` parameter takes a string that specifies the changes
that you want to apply to your existing manifest. To create
a patch, you need to generate a diff between the original ReplicationController
manifest and the one with your changes. You can use one of the online JSON patch
utilities, such as [JSON Patch Generator](https://extendsclass.com/json-patch.html),
to create a diff. A diff for mounting a volume might look like this:

```json
[
 {
  "op": "add",
  "path": "/volumes/-",
  "value": {
   "name": "task-pv-storage",
   "persistentVolumeClaim": {
    "claimName": "task-pv-claim"
   }
  }
 },
 {
  "op": "add",
  "path": "/containers/0/volumeMounts/-",
  "value": {
   "mountPath": "/data",
   "name": "task-pv-storage"
  }
 }
]
```

This output needs to be converted into a one-liner and added to the
pipeline spec.

We will use the [OpenCV example](../getting_started/beginner_tutorial/)
to demonstrate this functionality.

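The conversion to a one-liner can be scripted instead of done by hand.
The following Python sketch is an illustration rather than a Pachyderm tool:
the helper name `to_pod_patch_string` is hypothetical, and the patch mirrors
the example above, with `task-pv-storage` used as the volume name in both
operations. It serializes the patch onto a single line, and then serializes
the enclosing object so that the inner quotes are escaped for a JSON pipeline spec.

```python
import json

# Example patch; the volume and claim names follow the example in this guide.
patch = [
    {
        "op": "add",
        "path": "/volumes/-",
        "value": {
            "name": "task-pv-storage",
            "persistentVolumeClaim": {"claimName": "task-pv-claim"},
        },
    },
    {
        "op": "add",
        "path": "/containers/0/volumeMounts/-",
        "value": {"mountPath": "/data", "name": "task-pv-storage"},
    },
]


def to_pod_patch_string(patch_ops):
    """Return the patch as a single-line JSON string (hypothetical helper)."""
    # Compact separators keep the whole patch on one line.
    return json.dumps(patch_ops, separators=(",", ":"))


one_liner = to_pod_patch_string(patch)

# Serializing the enclosing object escapes the inner double quotes,
# which yields the backslash-escaped form that a JSON pipeline spec needs.
spec_fragment = json.dumps({"pod_patch": one_liner})
print(spec_fragment)
```

The printed fragment has the same shape as the `pod_patch` entry that you add
to the pipeline spec, so you can copy the value without escaping quotes manually.
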
To mount a volume, complete the following steps:

1. Create a PersistentVolume and a PersistentVolumeClaim as
   described in [Configure a Pod to Use a PersistentVolume for Storage](https://kubernetes.io/docs/tasks/configure-pod-container/configure-persistent-volume-storage/). Modify `mountPath` and `path` as needed.

   For testing purposes, you might want to add an `index.html`
   file as described in [Create an index.html file](https://kubernetes.io/docs/tasks/configure-pod-container/configure-persistent-volume-storage/#create-an-index-html-file-on-your-node).

1. Get the ReplicationController (RC) manifest from your pipeline:

   ```shell
   kubectl get rc <rc-pipeline> -o json > <filename>.json
   ```

   **Example:**

   ```shell
   kubectl get rc pipeline-edges-v7 -o json > test-rc.json
   ```

1. Open the generated RC manifest for editing.
1. Under `spec`, find the `volumeMounts` section.
1. Add your volume to the list of mounts.

   **Example:**

   ```json
   {
       "mountPath": "/data",
       "name": "task-pv-storage"
   }
   ```

   `mountPath` is where your volume will be mounted inside the
   container.

1. Find the `volumes` section.
1. Add the information about the volume.

   **Example:**

   ```json
   {
       "name": "task-pv-storage",
       "persistentVolumeClaim": {
           "claimName": "task-pv-claim"
       }
   }
   ```

   In this section, you need to specify the PersistentVolumeClaim that you
   created in Step 1.

1. Save these changes to a new file.
1. Copy the contents of the original RC manifest to the clipboard.
1. Go to a JSON patch generator, such as [JSON Patch Generator](https://extendsclass.com/json-patch.html),
   and paste the contents of the original RC manifest into the **Source JSON**
   field.
1. Copy the contents of the modified RC manifest to the clipboard
   as described above.
1. Paste the contents of the modified RC manifest into the **Target JSON**
   field.
1. Copy the generated JSON Patch.
1. Go to your terminal and open the pipeline manifest for editing.

   For example, if you are modifying the `edges` pipeline, open the
   `edges.json` file.

1. Add the patch as a one-liner under the `pod_patch` parameter.

   **Example:**

   ```json
   "pod_patch": "[{\"op\": \"add\",\"path\": \"/volumes/-\",\"value\": {\"name\": \"task-pv-storage\",\"persistentVolumeClaim\": {\"claimName\": \"task-pv-claim\"}}}, {\"op\": \"add\",\"path\": \"/containers/0/volumeMounts/-\",\"value\": {\"mountPath\": \"/data\",\"name\": \"task-pv-storage\"}}]"
   ```

   Because the patch is stored as a JSON string, you need to add a
   backslash (`\`) before every double quote (`"`) inside the square
   brackets (`[]`). Also, you might need to modify the paths to
   `volumeMounts` and `volumes` by removing the `/spec/template/spec`
   prefix and replacing the assigned index number with a dash (`-`).
   For example, if a path in the JSON patch is
   `/spec/template/spec/volumes/5`, you might need to replace it
   with `/volumes/-`. See the example above for details.

1. After modifying the pipeline spec, update the pipeline:

   ```shell
   pachctl update pipeline -f <pipeline-spec.yaml>
   ```

   A new pod and a new ReplicationController should be created with
   your changes.

1. Verify that the volume was mounted by connecting to your pod and
   listing the directory that you specified as the mount point. In this
   example, it is `/data`.

   **Example:**

   ```shell
   kubectl exec -it <pipeline-pod> -- /bin/bash
   ```

   ```shell
   ls /data
   ```

   If you have added the `index.html` file for testing as described
   in Step 1, you should see that file in the mounted directory.

   You might want to adjust your pipeline code to read from or write to
   the mounted directory. For example, in the aforementioned
   [OpenCV example](https://docs.pachyderm.com/latest/getting_started/beginner_tutorial/#create-a-pipeline),
   the code reads from the `/pfs/images` directory and writes to the
   `/pfs/out` directory. To read from or write to the mounted volume
   instead, change the corresponding path to `/data`.

   !!! important
       Pachyderm has no notion of the files stored in the mounted directory
       before it is mounted to Pachyderm. Moreover, if you have mounted a
       network share to which sources other than Pachyderm write files,
       Pachyderm does not guarantee the provenance of those changes.

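The path adjustment described in the `pod_patch` step can be sketched in Python.
This is an illustration only, not a Pachyderm utility: the function name
`normalize_patch_paths` and the sample input are hypothetical, and the sketch
assumes that the generated patch targets paths under `/spec/template/spec` and
appends entries to `volumes` or `volumeMounts`, as in the
`/spec/template/spec/volumes/5` example.

```python
import re

RC_PREFIX = "/spec/template/spec"  # prefix that a diff of the RC manifest adds


def normalize_patch_paths(patch_ops):
    """Rewrite RC-level paths into pod-level paths (hypothetical helper)."""
    normalized = []
    for op in patch_ops:
        path = op["path"]
        # Drop the ReplicationController prefix if it is present.
        if path.startswith(RC_PREFIX + "/"):
            path = path[len(RC_PREFIX):]
        # For "add" operations on volumes or volumeMounts, replace the
        # trailing array index with "-", which means "append" in JSON Patch.
        if op["op"] == "add":
            path = re.sub(r"/(volumes|volumeMounts)/\d+$", r"/\1/-", path)
        # Copy the operation so that the input list is left unchanged.
        normalized.append({**op, "path": path})
    return normalized


# Sample input in the shape that a patch generator might produce (assumed values).
generated = [
    {"op": "add", "path": "/spec/template/spec/volumes/5",
     "value": {"name": "task-pv-storage",
               "persistentVolumeClaim": {"claimName": "task-pv-claim"}}},
    {"op": "add", "path": "/spec/template/spec/containers/0/volumeMounts/3",
     "value": {"mountPath": "/data", "name": "task-pv-storage"}},
]

for op in normalize_patch_paths(generated):
    print(op["path"])
```

Operations other than `add`, and paths outside `volumes` and `volumeMounts`,
keep their index, so review the result before you paste it into `pod_patch`.
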