# Supported Operations

The Pachyderm S3 gateway supports the following operations:

* Create buckets: Creates a repo and branch.
* Delete buckets: Deletes a branch or a repo with all branches.
* List buckets: Lists all branches on all repos as S3 buckets.
* Write objects: Atomically overwrites a file on a branch.
* Remove objects: Atomically removes a file on a branch.
* List objects: Lists the files in the HEAD of a branch.
* Get objects: Gets file contents on a branch.

## List Filesystem Objects

If you have configured your S3 client correctly, you should be
able to see the list of filesystem objects in your Pachyderm
repository by running an S3 client `ls` command.

To list filesystem objects, complete the following steps:

1. Verify that your S3 client can access all of your Pachyderm repositories:

   * If you are using MinIO, type:

     ```shell
     mc ls local
     ```

     **System Response:**

     ```
     [2019-07-12 15:09:50 PDT]      0B master.train/
     [2019-07-12 14:58:50 PDT]      0B master.pre_process/
     [2019-07-12 14:58:09 PDT]      0B master.split/
     [2019-07-12 14:58:09 PDT]      0B stats.split/
     [2019-07-12 14:36:27 PDT]      0B master.raw_data/
     ```

   * If you are using AWS, type:

     ```shell
     aws --endpoint-url http://localhost:30600 s3 ls
     ```

     **System Response:**

     ```
     2019-07-12 15:09:50 master.train
     2019-07-12 14:58:50 master.pre_process
     2019-07-12 14:58:09 master.split
     2019-07-12 14:58:09 stats.split
     2019-07-12 14:36:27 master.raw_data
     ```

1. List the contents of a repository:

   * If you are using MinIO, type:

     ```shell
     mc ls local/master.raw_data
     ```

     **System Response:**

     ```
     [2019-07-19 12:11:37 PDT]  2.6MiB github_issues_medium.csv
     ```

   * If you are using AWS, type:

     ```shell
     aws --endpoint-url http://localhost:30600/ s3 ls s3://master.raw_data
     ```

     **System Response:**

     ```
     2019-07-26 11:22:23    2685061 github_issues_medium.csv
     ```

## Create an S3 Bucket

You can create an S3 bucket in Pachyderm by using the AWS CLI or
the MinIO client commands.
The S3 bucket that you create is a branch in a repository
in Pachyderm.

To create an S3 bucket, complete the following steps:

1. Run the command for your S3 client to create a new
   S3 bucket, which is a repository with a branch in Pachyderm:

   * If you are using MinIO, type:

     ```shell
     mc mb local/master.test
     ```

     **System Response:**

     ```
     Bucket created successfully `local/master.test`.
     ```

   * If you are using AWS, type:

     ```shell
     aws --endpoint-url http://localhost:30600/ s3 mb s3://master.test
     ```

     **System Response:**

     ```
     make_bucket: master.test
     ```

1. Verify that the S3 bucket has been successfully created:

   * If you are using MinIO, type:

     ```shell
     mc ls local
     ```

     **System Response:**

     ```
     [2019-07-18 13:32:44 PDT]      0B master.test/
     [2019-07-12 15:09:50 PDT]      0B master.train/
     [2019-07-12 14:58:50 PDT]      0B master.pre_process/
     [2019-07-12 14:58:09 PDT]      0B master.split/
     [2019-07-12 14:58:09 PDT]      0B stats.split/
     [2019-07-12 14:36:27 PDT]      0B master.raw_data/
     ```

   * If you are using AWS, type:

     ```shell
     aws --endpoint-url http://localhost:30600/ s3 ls
     ```

     **System Response:**

     ```
     2019-07-26 11:35:28 master.test
     2019-07-12 14:58:50 master.pre_process
     2019-07-12 14:58:09 master.split
     2019-07-12 14:58:09 stats.split
     2019-07-12 14:36:27 master.raw_data
     ```

   * You can also use the `pachctl list repo` command to view the
     list of repositories:

     ```shell
     pachctl list repo
     ```

     **System Response:**

     ```
     NAME        CREATED           SIZE (MASTER)
     test        About an hour ago 0B
     train       6 days ago        68.57MiB
     pre_process 6 days ago        1.18MiB
     split       6 days ago        1.019MiB
     raw_data    6 days ago        2.561MiB
     ```

     You should see the newly created repository in this list.

### Delete an S3 Bucket

You can delete an S3 bucket in Pachyderm by running the corresponding
command for your S3 client. The bucket must be completely empty.

To remove an S3 bucket, run one of the following commands:

* If you are using MinIO, type:

  ```shell
  mc rb local/master.test
  ```

  **System Response:**

  ```
  Removed `local/master.test` successfully.
  ```

* If you are using AWS, type:

  ```shell
  aws --endpoint-url http://localhost:30600/ s3 rb s3://master.test
  ```

  **System Response:**

  ```
  remove_bucket: master.test
  ```

## Upload and Download File Objects

For input repositories at the top of your DAG, you can both add files
to and download files from the repository.

Not all the repositories that you see in the results of the `ls` command are
input repositories that can be written to. Some of them might be read-only
output repos. Check your pipeline specification to verify which
repositories are the input repos.

To add a file to a repository and then download a file, complete the
following steps:

1. Run the `cp` command for your S3 client:

   * If you are using MinIO, type:

     ```shell
     mc cp test.csv local/master.raw_data/test.csv
     ```

     **System Response:**

     ```
     test.csv: 62 B / 62 B ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 100.00% 206 B/s 0s
     ```

   * If you are using AWS, type:

     ```shell
     aws --endpoint-url http://localhost:30600/ s3 cp test.csv s3://master.raw_data
     ```

     **System Response:**

     ```
     upload: ./test.csv to s3://master.raw_data/test.csv
     ```

   These commands add the `test.csv` file to the `master` branch in
   the `raw_data` repository. `raw_data` is an input repository.

1. Check that the file was added:

   * If you are using MinIO, type:

     ```shell
     mc ls local/master.raw_data
     ```

     **System Response:**

     ```
     [2019-07-19 12:11:37 PDT]  2.6MiB github_issues_medium.csv
     [2019-07-19 12:11:37 PDT]     62B test.csv
     ```

   * If you are using AWS, type:

     ```shell
     aws --endpoint-url http://localhost:30600/ s3 ls s3://master.raw_data/
     ```

     **System Response:**

     ```
     2019-07-19 12:11:37    2685061 github_issues_medium.csv
     2019-07-19 12:11:37         62 test.csv
     ```

1. Download a file from the repository to the
   current directory by running one of the following commands:

   * If you are using MinIO, type:

     ```shell
     mc cp local/master.raw_data/github_issues_medium.csv .
     ```

     **System Response:**

     ```
     ...hub_issues_medium.csv: 2.56 MiB / 2.56 MiB ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 100.00% 1.26 MiB/s 2s
     ```

   * If you are using AWS, type:

     ```shell
     aws --endpoint-url http://localhost:30600/ s3 cp s3://master.raw_data/test.csv .
     ```

     **System Response:**

     ```
     download: s3://master.raw_data/test.csv to ./test.csv
     ```

## Remove a File Object

You can delete a file in the `HEAD` of a Pachyderm branch by using
your S3 client:

1. List the files in the input repository:

   * If you are using MinIO, type:

     ```shell
     mc ls local/master.raw_data/
     ```

     **System Response:**

     ```
     [2019-07-19 12:11:37 PDT]  2.6MiB github_issues_medium.csv
     [2019-07-19 12:11:37 PDT]     62B test.csv
     ```

   * If you are using AWS, type:

     ```shell
     aws --endpoint-url http://localhost:30600/ s3 ls s3://master.raw_data
     ```

     **System Response:**

     ```
     2019-07-19 12:11:37    2685061 github_issues_medium.csv
     2019-07-19 12:11:37         62 test.csv
     ```

1. Delete a file from a repository. Example:

   * If you are using MinIO, type:

     ```shell
     mc rm local/master.raw_data/test.csv
     ```

     **System Response:**

     ```
     Removing `local/master.raw_data/test.csv`.
     ```

   * If you are using AWS, type:

     ```shell
     aws --endpoint-url http://localhost:30600/ s3 rm s3://master.raw_data/test.csv
     ```

     **System Response:**

     ```
     delete: s3://master.raw_data/test.csv
     ```
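
Throughout the examples on this page, bucket names follow a `<branch>.<repo>` pattern: `master.raw_data` is the `master` branch of the `raw_data` repository. When scripting against the gateway, it can help to convert between the two forms. The following is a minimal sketch, not part of `pachctl`, `mc`, or the AWS CLI. The function names `bucket_branch`, `bucket_repo`, and `bucket_name` are hypothetical, and the sketch assumes that the bucket name is split at the first dot.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.
# Assumption: a bucket name has the form <branch>.<repo>, and the
# branch is everything before the first dot.

# Print the branch part of a bucket name, e.g. master.raw_data -> master
bucket_branch() {
  printf '%s\n' "${1%%.*}"
}

# Print the repo part of a bucket name, e.g. master.raw_data -> raw_data
bucket_repo() {
  printf '%s\n' "${1#*.}"
}

# Build a bucket name from a repo (first argument) and a branch (second argument)
bucket_name() {
  printf '%s.%s\n' "$2" "$1"
}
```

For example, `bucket_name raw_data master` prints `master.raw_data`, which you could then pass to a command such as `mc ls "local/$(bucket_name raw_data master)"`.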