# S3 Gateway API

This section outlines the HTTP API exposed by the S3 gateway and any peculiarities
relative to S3. The operations largely mirror those documented in S3's
[official docs](https://docs.aws.amazon.com/AmazonS3/latest/API/API_Operations_Amazon_Simple_Storage_Service.html).

Generally, you would not call these endpoints directly, but rather use a
tool or library designed to work with S3-like APIs. Because of that, some
working knowledge of S3 and HTTP is assumed.

### Authentication

If authentication is not enabled on the Pachyderm cluster, S3 gateway
endpoints can be hit without passing auth credentials.

If authentication is enabled, credentials must be passed using AWS'
[signature v2](https://docs.aws.amazon.com/AmazonS3/latest/dev/RESTAuthentication.html)
or [signature v4](https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.html)
methods. For both methods, set the AWS access key and secret key to the
same value. Both values are the Pachyderm auth token that is used to issue the
relevant PFS calls. One or both signature methods are built into most S3 tools
and libraries already, so you do not need to configure these methods manually.

### Buckets

Buckets are represented via `branch.repo`. For example, the `master.images`
bucket corresponds to the `master` branch of the `images` repo.

### Operations

#### `ListBuckets`

Route: `GET /`.

Lists all of the branches across all of the repos as S3 buckets.

#### `DeleteBucket`

Route: `DELETE /<branch>.<repo>/`.

Deletes the branch. If it is the last branch in the repo, the repo is also
deleted. Unlike S3, you can delete non-empty branches.

#### `ListObjects`

Route: `GET /<branch>.<repo>/`

Only S3's list objects v1 is supported.
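
As a rough illustration of the `branch.repo` bucket naming and the route above, the following sketch builds the path for a v1 listing request. The helper names `bucket_name` and `list_objects_path` are hypothetical and not part of any Pachyderm or S3 library. In S3 itself, v2 listing is selected with the `list-type=2` query parameter, so a v1 request simply omits it.

```python
from urllib.parse import urlencode


def bucket_name(repo: str, branch: str = "master") -> str:
    """Return the gateway bucket name for a branch of a repo (hypothetical helper)."""
    return f"{branch}.{repo}"


def list_objects_path(repo: str, branch: str = "master", prefix: str = "") -> str:
    """Build the path and query string for a v1 ListObjects request (hypothetical helper)."""
    # This document states that the delimiter, if set, must be "/".
    params = {"delimiter": "/"}
    if prefix:
        params["prefix"] = prefix
    # No "list-type=2" parameter is sent, so this is a v1 listing.
    return f"/{bucket_name(repo, branch)}/?{urlencode(params)}"


if __name__ == "__main__":
    print(list_objects_path("images", prefix="raw/"))
```

In practice an S3 client library would construct and sign this request for you; the sketch only shows how the bucket name and query parameters fit together.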

PFS directories are represented via `CommonPrefixes`. This largely mirrors how
S3 is used in practice, but leads to a couple of differences:

* If you set the delimiter parameter, it must be `/`.
* Empty directories are included in listed results.

With regard to listed results:

* Due to PFS peculiarities, the `LastModified` field references when the most
recent commit to the branch happened, which may or may not have modified the
specific object listed.
* The HTTP `ETag` field does not use MD5, but is a cryptographically secure
hash of the file contents.
* The S3 `StorageClass` and `Owner` fields always have the same filler value.

#### `GetBucketLocation`

Route: `GET /<branch>.<repo>/?location`

This will always serve the same location for every bucket, but the endpoint
is implemented to provide better compatibility with S3 clients.

#### `GetBucketVersioning`

Route: `GET /<branch>.<repo>/?versioning`

This will get whether versioning is enabled, which is always true.

#### `ListMultipartUploads`

Route: `GET /<branch>.<repo>/?uploads`

Lists the in-progress multipart uploads in the given branch. The `delimiter` query parameter is not supported.

#### `CreateBucket`

Route: `PUT /<branch>.<repo>/`.

If the repo does not exist, it is created. If the branch does not exist, it
is likewise created. As per S3's behavior in some regions (but not all),
trying to create the same bucket twice will return a `BucketAlreadyOwnedByYou`
error.

#### `DeleteObjects`

Route: `POST /<branch>.<repo>/?delete`.

Deletes multiple files specified in the request payload.

#### `DeleteObject`

Route: `DELETE /<branch>.<repo>/<filepath>`.

Deletes the PFS file `filepath` in an atomic commit on the HEAD of `branch`.

#### `GetObject`

Route: `GET /<branch>.<repo>/<filepath>`.

By default, this request gets the `HEAD` version of the file. You can use S3's
versioning API to get the object at a non-HEAD commit by specifying either a
specific commit ID or the caret syntax, for example `HEAD^`.

Range queries and conditional requests are supported; however, error
response bodies for bad requests using these headers are not standard S3 XML.

With regard to HTTP response headers:

* Due to PFS peculiarities, the HTTP `Last-Modified` header references when
the most recent commit to the branch happened, which may or may not have
modified this specific object.
* The HTTP `ETag` does not use MD5, but is a cryptographically secure hash of
the file contents.

#### `PutObject`

Route: `PUT /<branch>.<repo>/<filepath>`.

Writes the PFS file at `filepath` in an atomic commit on the HEAD of `branch`.

Any existing file content is overwritten. Unlike S3, there is no limit to
upload size; in particular, a 64 MB maximum size is not enforced on this
endpoint. As the upload size gets larger, we recommend setting the
`Content-MD5` request header to ensure data integrity.

#### `AbortMultipartUpload`

Route: `DELETE /<branch>.<repo>?uploadId=<uploadId>`

Aborts an in-progress multipart upload.

#### `CompleteMultipartUpload`

Route: `POST /<branch>.<repo>?uploadId=<uploadId>`

Completes a multipart upload. If ETags are included in the request
payload, they must be in the same format as the ETags returned by the S3
gateway when the multipart chunks were uploaded. If they are MD5
hashes or use any other hash algorithm, they are ignored.

#### `CreateMultipartUpload`

Route: `POST /<branch>.<repo>?uploads`

Initiates a multipart upload.

#### `ListParts`

Route: `GET /<branch>.<repo>?uploadId=<uploadId>`

Lists the parts of an in-progress multipart upload.

#### `UploadPart`

Route: `PUT /<branch>.<repo>?uploadId=<uploadId>&partNumber=<partNumber>`

Uploads a chunk of a multipart upload.
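
To show how the multipart routes relate to one another, here is a minimal sketch that lists the requests for a single upload in the order a client would typically issue them. It follows the routes exactly as written in this section. The function name `multipart_requests` is hypothetical, and the upload ID is a placeholder: in practice it would come from the response to `CreateMultipartUpload`.

```python
from urllib.parse import urlencode


def multipart_requests(bucket: str, upload_id: str, part_count: int) -> list:
    """Return (method, path) pairs for one multipart upload, in call order.

    Hypothetical helper: it only assembles routes, it sends nothing.
    """
    base = f"/{bucket}"
    # CreateMultipartUpload: starts the upload.
    requests = [("POST", f"{base}?uploads")]
    # UploadPart: one request per chunk; S3 part numbers start at 1.
    for part_number in range(1, part_count + 1):
        query = urlencode({"uploadId": upload_id, "partNumber": part_number})
        requests.append(("PUT", f"{base}?{query}"))
    # CompleteMultipartUpload: finishes the upload.
    requests.append(("POST", f"{base}?{urlencode({'uploadId': upload_id})}"))
    return requests


if __name__ == "__main__":
    for method, path in multipart_requests("master.images", "example-upload-id", 2):
        print(method, path)
```

`AbortMultipartUpload` uses the same path as the final request with `DELETE` instead of `POST`, and `ListParts` uses it with `GET`.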