# Supported Operations

The Pachyderm S3 gateway supports the following operations:

* Create buckets: Creates a repo and branch.
* Delete buckets: Deletes a branch or a repo with all branches.
* List buckets: Lists all branches on all repos as S3 buckets.
* Write objects: Atomically overwrites a file on a branch.
* Remove objects: Atomically removes a file on a branch.
* List objects: Lists the files in the HEAD of a branch.
* Get objects: Gets file contents on a branch.

## List Filesystem Objects

If you have configured your S3 client correctly, you can list
the filesystem objects in your Pachyderm repository by running
your S3 client's `ls` command.
To list filesystem objects, complete the following steps:

1. Verify that your S3 client can access all of your Pachyderm repositories:

   * If you are using MinIO, type:

     ```shell
     mc ls local
     ```

     **System Response:**

     ```
     [2019-07-12 15:09:50 PDT]      0B master.train/
     [2019-07-12 14:58:50 PDT]      0B master.pre_process/
     [2019-07-12 14:58:09 PDT]      0B master.split/
     [2019-07-12 14:58:09 PDT]      0B stats.split/
     [2019-07-12 14:36:27 PDT]      0B master.raw_data/
     ```

   * If you are using AWS, type:

     ```shell
     aws --endpoint-url http://localhost:30600 s3 ls
     ```

     **System Response:**

     ```
     2019-07-12 15:09:50 master.train
     2019-07-12 14:58:50 master.pre_process
     2019-07-12 14:58:09 master.split
     2019-07-12 14:58:09 stats.split
     2019-07-12 14:36:27 master.raw_data
     ```

1. List the contents of a repository:

   * If you are using MinIO, type:

     ```shell
     mc ls local/master.raw_data
     ```

     **System Response:**

     ```
     [2019-07-19 12:11:37 PDT]  2.6MiB github_issues_medium.csv
     ```

   * If you are using AWS, type:

     ```shell
     aws --endpoint-url http://localhost:30600/ s3 ls s3://master.raw_data
     ```

     **System Response:**

     ```
     2019-07-26 11:22:23    2685061 github_issues_medium.csv
     ```

## Create an S3 Bucket

You can create an S3 bucket in Pachyderm by using the AWS CLI or
the MinIO client commands.
The S3 bucket that you create is a branch in a repository
in Pachyderm.

To create an S3 bucket, complete the following steps:

1. Run the command for your S3 client to create a new
   S3 bucket, which is a repository with a branch in Pachyderm.

   * If you are using MinIO, type:

     ```shell
     mc mb local/master.test
     ```

     **System Response:**

     ```
     Bucket created successfully `local/master.test`.
     ```

   * If you are using AWS, type:

     ```shell
     aws --endpoint-url http://localhost:30600/ s3 mb s3://master.test
     ```

     **System Response:**

     ```
     make_bucket: master.test
     ```

1. Verify that the S3 bucket has been successfully created:

   * If you are using MinIO, type:

     ```shell
     mc ls local
     ```

     **System Response:**

     ```
     [2019-07-18 13:32:44 PDT]      0B master.test/
     [2019-07-12 15:09:50 PDT]      0B master.train/
     [2019-07-12 14:58:50 PDT]      0B master.pre_process/
     [2019-07-12 14:58:09 PDT]      0B master.split/
     [2019-07-12 14:58:09 PDT]      0B stats.split/
     [2019-07-12 14:36:27 PDT]      0B master.raw_data/
     ```

   * If you are using AWS, type:

     ```shell
     aws --endpoint-url http://localhost:30600/ s3 ls
     ```

     **System Response:**

     ```
     2019-07-26 11:35:28 master.test
     2019-07-12 14:58:50 master.pre_process
     2019-07-12 14:58:09 master.split
     2019-07-12 14:58:09 stats.split
     2019-07-12 14:36:27 master.raw_data
     ```

   * You can also use the `pachctl list repo` command to view the
     list of repositories:

     ```shell
     pachctl list repo
     ```

     **System Response:**

     ```
     NAME         CREATED            SIZE (MASTER)
     test         About an hour ago  0B
     train        6 days ago         68.57MiB
     pre_process  6 days ago         1.18MiB
     split        6 days ago         1.019MiB
     raw_data     6 days ago         2.561MiB
     ```

     You should see the newly created repository in this list.

### Delete an S3 Bucket

You can delete an S3 bucket in Pachyderm by running the corresponding
command for your S3 client. The bucket must be completely empty.

To remove an S3 bucket, run one of the following commands:

* If you are using MinIO, type:

  ```shell
  mc rb local/master.test
  ```

  **System Response:**

  ```
  Removed `local/master.test` successfully.
  ```

* If you are using AWS, type:

  ```shell
  aws --endpoint-url http://localhost:30600/ s3 rb s3://master.test
  ```

  **System Response:**

  ```
  remove_bucket: master.test
  ```

## Upload and Download File Objects

For input repositories at the top of your DAG, you can both add files
to and download files from the repository.

Not all the repositories that you see in the results of the `ls` command are
input repositories that can be written to. Some of them might be read-only
output repos. Check your pipeline specification to verify which
repositories are the input repos.

To add a file to a repository, complete the following steps:

1. Run the `cp` command for your S3 client:

   * If you are using MinIO, type:

     ```shell
     mc cp test.csv local/master.raw_data/test.csv
     ```

     **System Response:**

     ```
     test.csv: 62 B / 62 B ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 100.00% 206 B/s 0s
     ```

   * If you are using AWS, type:

     ```shell
     aws --endpoint-url http://localhost:30600/ s3 cp test.csv s3://master.raw_data
     ```

     **System Response:**

     ```
     upload: ./test.csv to s3://master.raw_data/test.csv
     ```

   These commands add the `test.csv` file to the `master` branch in
   the `raw_data` repository. `raw_data` is an input repository.

1. Check that the file was added:

   * If you are using MinIO, type:

     ```shell
     mc ls local/master.raw_data
     ```

     **System Response:**

     ```
     [2019-07-19 12:11:37 PDT]  2.6MiB github_issues_medium.csv
     [2019-07-19 12:11:37 PDT]     62B test.csv
     ```

   * If you are using AWS, type:

     ```shell
     aws --endpoint-url http://localhost:30600/ s3 ls s3://master.raw_data/
     ```

     **System Response:**

     ```
     2019-07-19 12:11:37    2685061 github_issues_medium.csv
     2019-07-19 12:11:37         62 test.csv
     ```

1. Download a file from the repository to the
   current directory by running one of the following commands:

   * If you are using MinIO, type:

     ```shell
     mc cp local/master.raw_data/github_issues_medium.csv .
     ```

     **System Response:**

     ```
     ...hub_issues_medium.csv: 2.56 MiB / 2.56 MiB ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓ 100.00% 1.26 MiB/s 2s
     ```

   * If you are using AWS, type:

     ```shell
     aws --endpoint-url http://localhost:30600/ s3 cp s3://master.raw_data/test.csv .
     ```

     **System Response:**

     ```
     download: s3://master.raw_data/test.csv to ./test.csv
     ```

## Remove a File Object

You can delete a file in the `HEAD` of a Pachyderm branch by using
your S3 client:

1. List the files in the input repository:

   * If you are using MinIO, type:

     ```shell
     mc ls local/master.raw_data/
     ```

     **System Response:**

     ```
     [2019-07-19 12:11:37 PDT]  2.6MiB github_issues_medium.csv
     [2019-07-19 12:11:37 PDT]     62B test.csv
     ```

   * If you are using AWS, type:

     ```shell
     aws --endpoint-url http://localhost:30600/ s3 ls s3://master.raw_data
     ```

     **System Response:**

     ```
     2019-07-19 12:11:37    2685061 github_issues_medium.csv
     2019-07-19 12:11:37         62 test.csv
     ```

1. Delete a file from a repository. Example:

   * If you are using MinIO, type:

     ```shell
     mc rm local/master.raw_data/test.csv
     ```

     **System Response:**

     ```
     Removing `local/master.raw_data/test.csv`.
     ```

   * If you are using AWS, type:

     ```shell
     aws --endpoint-url http://localhost:30600/ s3 rm s3://master.raw_data/test.csv
     ```

     **System Response:**

     ```
     delete: s3://master.raw_data/test.csv
     ```
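
Every example on this page addresses a bucket as `<branch>.<repo>`, for instance `master.raw_data` for the `master` branch of the `raw_data` repository. If you script against the gateway, a few small helpers can keep that naming in one place. The following is a minimal sketch, not part of Pachyderm or any S3 client: the function names `pach_bucket`, `pach_bucket_branch`, and `pach_bucket_repo` are hypothetical, and the sketch assumes that the branch name contains no dot, so that the first dot separates the branch from the repository.

```shell
# Hypothetical helpers (not provided by Pachyderm, mc, or the AWS CLI).
# Assumption: the branch name contains no dot, so the first dot in a
# bucket name separates the branch from the repository.

# pach_bucket REPO [BRANCH]
# Prints the S3 gateway bucket name "<branch>.<repo>".
# BRANCH defaults to "master" when omitted.
pach_bucket() {
  printf '%s.%s\n' "${2:-master}" "$1"
}

# pach_bucket_branch BUCKET
# Prints the branch part of a bucket name (text before the first dot).
pach_bucket_branch() {
  printf '%s\n' "${1%%.*}"
}

# pach_bucket_repo BUCKET
# Prints the repository part of a bucket name (text after the first dot).
pach_bucket_repo() {
  printf '%s\n' "${1#*.}"
}

# Example use with the commands shown on this page:
#   mc ls "local/$(pach_bucket raw_data)"
#   aws --endpoint-url http://localhost:30600/ s3 ls "s3://$(pach_bucket raw_data)"
```

The helpers only build and split strings. They do not contact the gateway, so you can source them into any script regardless of which S3 client you use.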