github.com/pachyderm/pachyderm@v1.13.4/doc/docs/master/deploy-manage/manage/s3gateway/supported-operations.md

     1  # Supported Operations
     2  
     3  The Pachyderm S3 gateway supports the following operations:
     4  
     5  * Create buckets: Creates a repo and branch.
     6  * Delete buckets: Deletes a branch or a repo with all branches.
     7  * List buckets: Lists all branches on all repos as S3 buckets.
     8  * Write objects: Atomically overwrites a file on a branch.
     9  * Remove objects: Atomically removes a file on a branch.
    10  * List objects: Lists the files in the HEAD of a branch.
    11  * Get objects: Gets file contents on a branch.
    12  
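Every bucket name in the examples on this page follows the `<branch>.<repo>` pattern, for example `master.raw_data` for the `master` branch of the `raw_data` repo. As a minimal sketch, the helper functions below compose and split such names in a POSIX shell. The function names are hypothetical and not part of Pachyderm or any S3 client, and the split assumes that the branch name itself contains no dot:

```shell
# Hypothetical helpers, not part of pachctl or any S3 client.

# Compose a bucket name from a branch and a repo: <branch>.<repo>
bucket_name() {
  printf '%s.%s\n' "$1" "$2"
}

# Print the branch part: everything before the first dot.
# Assumption: the branch name contains no dot.
bucket_branch() {
  printf '%s\n' "${1%%.*}"
}

# Print the repo part: everything after the first dot.
bucket_repo() {
  printf '%s\n' "${1#*.}"
}
```

For example, `bucket_name master raw_data` prints `master.raw_data`, and `bucket_repo master.pre_process` prints `pre_process`.
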
    13  ## List Filesystem Objects
    14  
    15  If your S3 client is configured correctly, you can list the
    16  filesystem objects in your Pachyderm repositories by running
    17  the `ls` command of your S3 client.
    18  To list filesystem objects, complete the following steps:
    19  
    20  1. Verify that your S3 client can access all of your Pachyderm repositories:
    21  
    22     * If you are using MinIO, type:
    23  
    24       ```shell
    25       mc ls local
    26       ```
    27  
    28       **System Response:**
    29  
    30       ```
    31       [2019-07-12 15:09:50 PDT]      0B master.train/
    32       [2019-07-12 14:58:50 PDT]      0B master.pre_process/
    33       [2019-07-12 14:58:09 PDT]      0B master.split/
    34       [2019-07-12 14:58:09 PDT]      0B stats.split/
    35       [2019-07-12 14:36:27 PDT]      0B master.raw_data/
    36       ```
    37  
    38     * If you are using AWS, type:
    39  
    40       ```shell
    41       aws --endpoint-url http://localhost:30600 s3 ls
    42       ```
    43  
    44       **System Response:**
    45  
    46       ```
    47       2019-07-12 15:09:50 master.train
    48       2019-07-12 14:58:50 master.pre_process
    49       2019-07-12 14:58:09 master.split
    50       2019-07-12 14:58:09 stats.split
    51       2019-07-12 14:36:27 master.raw_data
    52       ```
    53  
    54  1. List the contents of a repository:
    55  
    56     * If you are using MinIO, type:
    57  
    58       ```shell
    59       mc ls local/master.raw_data
    60       ```
    61  
    62       **System Response:**
    63  
    64       ```
    65       [2019-07-19 12:11:37 PDT]  2.6MiB github_issues_medium.csv
    66       ```
    67  
    68     * If you are using AWS, type:
    69  
    70       ```shell
    71       aws --endpoint-url http://localhost:30600/ s3 ls s3://master.raw_data
    72       ```
    73  
    74       **System Response:**
    75  
    76       ```
    77       2019-07-26 11:22:23    2685061 github_issues_medium.csv
    78       ```
    79  
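When you script against the gateway, you often need only the bucket names from the listing. The following sketch assumes that the AWS CLI prints one bucket per line as date, time, and name, as in the sample output above. `bucket_names` is a hypothetical helper, and the endpoint in the usage comment is the one used in the examples on this page:

```shell
# Hypothetical helper: read `aws s3 ls` bucket lines on stdin and
# print only the bucket name (the third whitespace-separated field).
# Assumption: each line has the form "<date> <time> <bucket>".
bucket_names() {
  awk 'NF >= 3 { print $3 }'
}

# Usage (requires a running S3 gateway):
#   aws --endpoint-url http://localhost:30600 s3 ls | bucket_names
```

The MinIO client prints a different line format, so this helper would need a different field selection for `mc ls` output.
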
    80  ## Create an S3 Bucket
    81  
    82  You can create an S3 bucket in Pachyderm by using the AWS CLI or
    83  the MinIO client commands.
    84  The S3 bucket that you create is a branch in a repository
    85  in Pachyderm.
    86  
    87  To create an S3 bucket, complete the following steps:
    88  
    89  1. Use the command for your S3 client to create a new S3 bucket,
    90     which Pachyderm maps to a repository with a branch:
    91  
    92     * If you are using MinIO, type:
    93  
    94       ```shell
    95       mc mb local/master.test
    96       ```
    97  
    98       **System Response:**
    99  
   100       ```
   101       Bucket created successfully `local/master.test`.
   102       ```
   103  
   104     * If you are using AWS, type:
   105  
   106       ```shell
   107       aws --endpoint-url http://localhost:30600/ s3 mb s3://master.test
   108       ```
   109  
   110       **System Response:**
   111  
   112       ```
   113       make_bucket: master.test
   114       ```
   115  
   116  1. Verify that the S3 bucket has been successfully created:
   117  
   118     * If you are using MinIO, type:
   119  
   120       ```shell
   121       mc ls local
   122       ```
   123  
   124       **System Response:**
   125  
   126       ```
   127       [2019-07-18 13:32:44 PDT]      0B master.test/
   128       [2019-07-12 15:09:50 PDT]      0B master.train/
   129       [2019-07-12 14:58:50 PDT]      0B master.pre_process/
   130       [2019-07-12 14:58:09 PDT]      0B master.split/
   131       [2019-07-12 14:58:09 PDT]      0B stats.split/
   132       [2019-07-12 14:36:27 PDT]      0B master.raw_data/
   133       ```
   134  
   135     * If you are using AWS, type:
   136  
   137       ```shell
   138       aws --endpoint-url http://localhost:30600/ s3 ls
   139       ```
   140  
   141       **System Response:**
   142  
   143       ```
   144       2019-07-26 11:35:28 master.test
   145       2019-07-12 14:58:50 master.pre_process
   146       2019-07-12 14:58:09 master.split
   147       2019-07-12 14:58:09 stats.split
   148       2019-07-12 14:36:27 master.raw_data
   149       ```
   150  
   161     * You can also use the `pachctl list repo` command to view the
   162     list of repositories:
   163  
   164       ```shell
   165       pachctl list repo
   166       ```
   167  
   168       **System Response:**
   169  
   170       ```
   171       NAME               CREATED                    SIZE (MASTER)
   172       test               About an hour ago          0B
   173       train              6 days ago                 68.57MiB
   174       pre_process        6 days ago                 1.18MiB
   175       split              6 days ago                 1.019MiB
   176       raw_data           6 days ago                 2.561MiB
   177       ```
   178  
   179       You should see the newly created repository in this list.
   180  
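If you create buckets from a script, you might want the step to be safe to rerun. The sketch below checks the bucket listing before calling `mb`. The `pach_s3` wrapper, the `ensure_bucket` function, and the `PACH_S3_ENDPOINT` variable are illustrative names, not part of Pachyderm or the AWS CLI, and the listing format is assumed to match the AWS sample output above:

```shell
# Hypothetical wrapper around the AWS CLI calls used on this page.
# PACH_S3_ENDPOINT is an assumed variable name; the default is the
# endpoint from the examples above.
pach_s3() {
  aws --endpoint-url "${PACH_S3_ENDPOINT:-http://localhost:30600}" s3 "$@"
}

# Create the bucket only if it is not already in the bucket listing.
ensure_bucket() {
  bucket="$1"
  # awk exits 0 when the third field of any line equals the bucket name.
  if pach_s3 ls | awk -v b="$bucket" '$3 == b { found = 1 } END { exit !found }'; then
    echo "exists: $bucket"
  else
    pach_s3 mb "s3://$bucket"
  fi
}

# Usage (requires a running S3 gateway):
#   ensure_bucket master.test
```
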
   181  ## Delete an S3 Bucket
   182  
   183  You can delete an S3 bucket in Pachyderm by running the corresponding
   184  command for your S3 client. The bucket must be completely empty.
   185  
   186  To remove an S3 bucket, run one of the following commands:
   187  
   188  * If you are using MinIO, type:
   189  
   190    ```shell
   191    mc rb local/master.test
   192    ```
   193  
   194    **System Response:**
   195  
   196    ```
   197    Removed `local/master.test` successfully.
   198    ```
   199  
   200  * If you are using AWS, type:
   201  
   202    ```shell
   203    aws --endpoint-url http://localhost:30600/ s3 rb s3://master.test
   204    ```
   205  
   206    **System Response:**
   207  
   208    ```
   209    remove_bucket: master.test
   210    ```
   211  
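Because the bucket must be empty, a script can check for remaining objects before it calls `rb`. The following is a sketch under stated assumptions: `pach_s3`, `remove_bucket_if_empty`, and `PACH_S3_ENDPOINT` are hypothetical names, and an empty bucket is assumed to produce an empty listing:

```shell
# Hypothetical wrapper around the AWS CLI calls used on this page.
pach_s3() {
  aws --endpoint-url "${PACH_S3_ENDPOINT:-http://localhost:30600}" s3 "$@"
}

# Remove a bucket only if listing it returns no objects.
remove_bucket_if_empty() {
  bucket="$1"
  # Stop if the listing itself fails, for example for a missing bucket.
  objects="$(pach_s3 ls "s3://$bucket")" || return 1
  if [ -n "$objects" ]; then
    echo "not empty: $bucket" >&2
    return 1
  fi
  pach_s3 rb "s3://$bucket"
}

# Usage (requires a running S3 gateway):
#   remove_bucket_if_empty master.test
```
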
   212  ## Upload and Download File Objects
   213  
   214  For input repositories at the top of your DAG, you can both add files
   215  to and download files from the repository.
   216  
   217  Not all the repositories that you see in the results of the `ls` command are
   218  input repositories that can be written to. Some of them might be read-only
   219  output repos. Check your pipeline specification to verify which
   220  repositories are the input repos.
   221  
   222  To add a file to a repository, complete the following steps:
   223  
   224  1. Run the `cp` command for your S3 client:
   225  
   226     * If you are using MinIO, type:
   227  
   228       ```shell
   229       mc cp test.csv local/master.raw_data/test.csv
   230       ```
   231  
   232       **System Response:**
   233  
   234       ```
   235       test.csv:                  62 B / 62 B  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓  100.00% 206 B/s 0s
   236       ```
   237  
   238     * If you are using AWS, type:
   239  
   240       ```shell
   241       aws --endpoint-url http://localhost:30600/ s3 cp test.csv s3://master.raw_data
   242       ```
   243  
   244       **System Response:**
   245  
   246       ```
   247       upload: ./test.csv to s3://master.raw_data/test.csv
   248       ```
   249  
   250     These commands add the `test.csv` file to the `master` branch in
   251     the `raw_data` repository. `raw_data` is an input repository.
   252  
   253  1. Check that the file was added:
   254  
   255     * If you are using MinIO, type:
   256  
   257       ```shell
   258       mc ls local/master.raw_data
   259       ```
   260  
   261       **System Response:**
   262  
   263       ```
   264       [2019-07-19 12:11:37 PDT]  2.6MiB github_issues_medium.csv
   265       [2019-07-19 12:11:37 PDT]     62B test.csv
   266       ```
   267  
   268     * If you are using AWS, type:
   269  
   270       ```shell
   271       aws --endpoint-url http://localhost:30600/ s3 ls s3://master.raw_data/
   272       ```
   273  
   274       **System Response:**
   275  
   276       ```
   277       2019-07-19 12:11:37  2685061 github_issues_medium.csv
   278       2019-07-19 12:11:37       62 test.csv
   279       ```
   280  
   281  1. Download a file from the repository to the current directory
   282     by running the command for your S3 client:
   283  
   284     * If you are using MinIO, type:
   285  
   286       ```shell
   287       mc cp local/master.raw_data/github_issues_medium.csv .
   288       ```
   289  
   290       **System Response:**
   291  
   292       ```
   293       ...hub_issues_medium.csv:  2.56 MiB / 2.56 MiB  ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 100.00% 1.26 MiB/s 2s
   294       ```
   295  
   296     * If you are using AWS, type:
   297  
   298       ```shell
   299       aws --endpoint-url http://localhost:30600/ s3 cp s3://master.raw_data/test.csv .
   300       ```
   301  
   302       **System Response:**
   303  
   304       ```
   305       download: s3://master.raw_data/test.csv to ./test.csv
   306       ```
   307  
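The upload and download steps above can be combined into a round-trip check. This is a minimal sketch, not an official workflow: `pach_s3`, `upload_and_verify`, and `PACH_S3_ENDPOINT` are hypothetical names, and the target bucket is assumed to belong to a writable input repository:

```shell
# Hypothetical wrapper around the AWS CLI calls used on this page.
pach_s3() {
  aws --endpoint-url "${PACH_S3_ENDPOINT:-http://localhost:30600}" s3 "$@"
}

# Upload a local file, download it again to a temporary file, and
# compare the two copies byte for byte. Returns 0 only if they match.
upload_and_verify() {
  file="$1"
  bucket="$2"
  key="$(basename "$file")"
  tmp="$(mktemp)" || return 1
  if pach_s3 cp "$file" "s3://$bucket/$key" >/dev/null &&
     pach_s3 cp "s3://$bucket/$key" "$tmp" >/dev/null &&
     cmp -s "$file" "$tmp"; then
    status=0
  else
    status=1
  fi
  rm -f "$tmp"
  return "$status"
}

# Usage (requires a running S3 gateway):
#   upload_and_verify test.csv master.raw_data && echo "round trip OK"
```
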
   308  ## Remove a File Object
   309  
   310  You can delete a file in the `HEAD` of a Pachyderm branch by using the
   311  MinIO or AWS command-line interface:
   312  
   313  1. List the files in the input repository:
   314  
   315     * If you are using MinIO, type:
   316  
   317       ```shell
   318       mc ls local/master.raw_data/
   319       ```
   320  
   321       **System Response:**
   322  
   323       ```
   324       [2019-07-19 12:11:37 PDT]  2.6MiB github_issues_medium.csv
   325       [2019-07-19 12:11:37 PDT]     62B test.csv
   326       ```
   327  
   328     * If you are using AWS, type:
   329  
   330       ```shell
   331       aws --endpoint-url http://localhost:30600/ s3 ls s3://master.raw_data
   332       ```
   333  
   334       **System Response:**
   335  
   336       ```
   337       2019-07-19 12:11:37    2685061 github_issues_medium.csv
   338       2019-07-19 12:11:37         62 test.csv
   339       ```
   340  
   341  1. Delete a file from the repository:
   342  
   343     * If you are using MinIO, type:
   344  
   345       ```shell
   346       mc rm local/master.raw_data/test.csv
   347       ```
   348  
   349       **System Response:**
   350  
   351       ```
   352       Removing `local/master.raw_data/test.csv`.
   353       ```
   354  
   355     * If you are using AWS, type:
   356  
   357       ```shell
   358       aws --endpoint-url http://localhost:30600/ s3 rm s3://master.raw_data/test.csv
   359       ```
   360  
   361       **System Response:**
   362  
   363       ```
   364       delete: s3://master.raw_data/test.csv
   365       ```