# S3 Gateway API

This section outlines the HTTP API exposed by the S3 gateway and any peculiarities
relative to S3. The operations largely mirror those documented in S3's
[official docs](https://docs.aws.amazon.com/AmazonS3/latest/API/API_Operations_Amazon_Simple_Storage_Service.html).

Generally, you would not call these endpoints directly, but rather use a
tool or library designed to work with S3-like APIs. Because of that, some
working knowledge of S3 and HTTP is assumed.

### Authentication

If authentication is not enabled on the Pachyderm cluster, S3 gateway
endpoints can be called without passing Pachyderm auth credentials, as long
as the AWS access key and secret key are set to the same value.

If authentication is enabled, credentials must be passed using AWS'
[signature v2](https://docs.aws.amazon.com/AmazonS3/latest/dev/RESTAuthentication.html)
or [signature v4](https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.html)
methods. For both methods, set the AWS access key and secret key to the
same value. Both values are the Pachyderm auth token that is used to issue the
relevant PFS calls. One or both signature methods are built into most S3 tools
and libraries already, so you do not need to configure these methods manually.
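
As a minimal sketch of the "same value for both keys" rule, the snippet below builds the standard AWS credential environment variables from a single token. The token value and the helper name `gateway_env` are hypothetical; many S3 tools read these two variables, but check the documentation of the tool you use.

```python
import os


def gateway_env(token: str) -> dict:
    """Return AWS-style credential variables for the S3 gateway.

    The access key and the secret key are deliberately identical:
    both carry the Pachyderm auth token.
    """
    return {
        "AWS_ACCESS_KEY_ID": token,
        "AWS_SECRET_ACCESS_KEY": token,
    }


# Hypothetical token, for illustration only.
token = "example-pachyderm-token"

# Environment that could be handed to a child process running an S3 tool.
child_env = {**os.environ, **gateway_env(token)}
```

The signing itself (signature v2 or v4) is left to the S3 tool or library; this sketch only prepares the credentials it would sign with.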

### Buckets

Buckets are represented via `branch.repo`. For example, the `master.images`
bucket corresponds to the `master` branch of the `images` repo.
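
The naming scheme can be sketched with two small helpers. The function names are hypothetical, and the parser assumes that the branch name itself contains no dot, so the first dot separates branch from repo.

```python
def bucket_name(repo: str, branch: str = "master") -> str:
    """Build the S3 bucket name for a branch of a repo."""
    return f"{branch}.{repo}"


def parse_bucket(bucket: str) -> tuple:
    """Split a bucket name into (branch, repo).

    Assumption: the branch name contains no dot, so the split
    happens at the first dot.
    """
    branch, sep, repo = bucket.partition(".")
    if not sep or not branch or not repo:
        raise ValueError(f"not a branch.repo bucket name: {bucket!r}")
    return branch, repo
```

For example, `bucket_name("images")` yields `master.images`, and `parse_bucket("master.images")` yields `("master", "images")`.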

### Operations

#### `ListBuckets`

Route: `GET /`.

Lists all of the branches across all of the repos as S3 buckets.

#### `DeleteBucket`

Route: `DELETE /<branch>.<repo>/`.

Deletes the branch. If it is the last branch in the repo, the repo is also
deleted. Unlike S3, you can delete non-empty branches.

#### `ListObjects`

Route: `GET /<branch>.<repo>/`.

Only S3's list objects v1 is supported.

PFS directories are represented via `CommonPrefixes`. This largely mirrors how
S3 is used in practice, but leads to a couple of differences:

* If you set the `delimiter` parameter, it must be `/`.
* Empty directories are included in listed results.

With regard to listed results:

* Due to PFS peculiarities, the `LastModified` field reflects the time of the
most recent commit to the branch, which may or may not have modified the
specific object listed.
* The HTTP `ETag` field does not use MD5, but is a cryptographically secure
hash of the file contents.
* The S3 `StorageClass` and `Owner` fields always have the same filler value.
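
To illustrate how files and directories show up in a listing, here is a rough sketch that separates `Contents` keys from `CommonPrefixes` in a list objects v1 response. The sample document is hand-written for illustration, not captured from a gateway, and it assumes the response uses the standard S3 XML namespace.

```python
import xml.etree.ElementTree as ET

# Assumption: the response uses the standard S3 XML namespace.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}

# Hand-written sample response, for illustration only.
SAMPLE = """<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>master.images</Name>
  <Prefix></Prefix>
  <Delimiter>/</Delimiter>
  <Contents>
    <Key>liberty.png</Key>
  </Contents>
  <CommonPrefixes>
    <Prefix>thumbnails/</Prefix>
  </CommonPrefixes>
</ListBucketResult>"""


def split_listing(xml_text: str) -> tuple:
    """Return (files, directories) from a list objects v1 response."""
    root = ET.fromstring(xml_text)
    # Contents entries are PFS files.
    files = [e.text for e in root.findall("s3:Contents/s3:Key", S3_NS)]
    # CommonPrefixes entries are PFS directories (possibly empty ones).
    dirs = [e.text for e in root.findall("s3:CommonPrefixes/s3:Prefix", S3_NS)]
    return files, dirs


files, dirs = split_listing(SAMPLE)
```

Because empty directories are listed too, a prefix in `dirs` does not guarantee that any object exists beneath it.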

#### `GetBucketLocation`

Route: `GET /<branch>.<repo>/?location`.

Always serves the same location for every bucket. The endpoint is implemented
to provide better compatibility with S3 clients.

#### `GetBucketVersioning`

Route: `GET /<branch>.<repo>/?versioning`.

Reports whether versioning is enabled, which is always the case.

#### `ListMultipartUploads`

Route: `GET /<branch>.<repo>/?uploads`.

Lists the in-progress multipart uploads in the given branch. The `delimiter`
query parameter is not supported.

#### `CreateBucket`

Route: `PUT /<branch>.<repo>/`.

If the repo does not exist, it is created. If the branch does not exist, it
is likewise created. As per S3's behavior in some regions (but not all),
trying to create the same bucket twice will return a `BucketAlreadyOwnedByYou`
error.
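
A client that wants bucket creation to be idempotent can treat this error as success. The following is a sketch only: it assumes the error body follows the usual S3 `<Error><Code>…</Code></Error>` layout, and the sample body is hand-written rather than taken from a real response.

```python
import xml.etree.ElementTree as ET


def error_code(body: str):
    """Extract the S3 error code from an error response body, if any."""
    root = ET.fromstring(body)
    # Compare local names so the check works with or without a namespace.
    if root.tag.rsplit("}", 1)[-1] != "Error":
        return None
    for child in root:
        if child.tag.rsplit("}", 1)[-1] == "Code":
            return child.text
    return None


def bucket_already_exists(body: str) -> bool:
    """True if the response says the bucket was already created."""
    return error_code(body) == "BucketAlreadyOwnedByYou"


# Hand-written sample error body, for illustration only.
SAMPLE_ERROR = (
    "<Error><Code>BucketAlreadyOwnedByYou</Code>"
    "<Message>bucket exists</Message></Error>"
)
```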

#### `DeleteObjects`

Route: `POST /<branch>.<repo>/?delete`.

Deletes multiple files specified in the request payload.
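
The request payload follows S3's multi-object delete format. As a sketch (the helper name `delete_payload` is hypothetical), it can be generated like this:

```python
import xml.etree.ElementTree as ET


def delete_payload(paths) -> bytes:
    """Build an S3 multi-object delete body listing the given file paths."""
    root = ET.Element("Delete")
    for path in paths:
        obj = ET.SubElement(root, "Object")
        ET.SubElement(obj, "Key").text = path
    # Returns UTF-8 bytes, including an XML declaration.
    return ET.tostring(root, encoding="utf-8")


body = delete_payload(["liberty.png", "thumbnails/liberty.png"])
```

S3 libraries normally build this document for you; the sketch only shows what ends up on the wire.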

#### `DeleteObject`

Route: `DELETE /<branch>.<repo>/<filepath>`.

Deletes the PFS file `filepath` in an atomic commit on the HEAD of `branch`.

#### `GetObject`

Route: `GET /<branch>.<repo>/<filepath>`.

By default, this request gets the `HEAD` version of the file. You can use S3's
versioning API to get the object at a non-HEAD commit, either by specifying a
commit ID or by using the caret syntax (for example, `HEAD^`).

Range queries and conditional requests are supported. However, error
response bodies for bad requests using these headers are not standard S3 XML.

With regard to HTTP response headers:

* Due to PFS peculiarities, the HTTP `Last-Modified` header reflects the time
of the most recent commit to the branch, which may or may not have
modified this specific object.
* The HTTP `ETag` does not use MD5, but is a cryptographically secure hash of
the file contents.
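
The sketch below shows how a versioned, ranged `GetObject` request could be assembled. S3's versioning API passes the version in the `versionId` query parameter. The endpoint address and the helper name are assumptions, and the request is left unsigned, so with authentication enabled an S3 library would still need to sign it.

```python
from urllib.parse import quote, urlencode

# Assumption: address where the S3 gateway is reachable; adjust for your cluster.
ENDPOINT = "http://localhost:30600"


def get_object_request(repo, branch, filepath, version=None, byte_range=None):
    """Return (url, headers) for a GetObject call.

    version: a commit ID or a caret expression such as "HEAD^".
    byte_range: optional (first, last) byte offsets, inclusive.
    """
    url = f"{ENDPOINT}/{branch}.{repo}/{quote(filepath.lstrip('/'))}"
    if version is not None:
        url += "?" + urlencode({"versionId": version})
    headers = {}
    if byte_range is not None:
        first, last = byte_range
        headers["Range"] = f"bytes={first}-{last}"
    return url, headers


url, headers = get_object_request(
    "images", "master", "dir/liberty.png", version="HEAD^", byte_range=(0, 99)
)
```

Note that the caret is percent-encoded in the query string, so `HEAD^` is sent as `HEAD%5E`.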

#### `PutObject`

Route: `PUT /<branch>.<repo>/<filepath>`.

Writes the PFS file at `filepath` in an atomic commit on the HEAD of `branch`.

Any existing file content is overwritten. Unlike S3, there is no limit on
upload size; in particular, a 64 MB maximum is not enforced on this endpoint.
Especially for larger uploads, we recommend setting the `Content-MD5`
request header to ensure data integrity.
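
The `Content-MD5` header carries the base64 encoding of the raw 128-bit MD5 digest of the request body, not the hex string. A minimal sketch (the helper name is hypothetical):

```python
import base64
import hashlib


def content_md5(data: bytes) -> str:
    """Return the Content-MD5 header value for a request body."""
    digest = hashlib.md5(data).digest()  # raw 16-byte digest, not hex
    return base64.b64encode(digest).decode("ascii")


headers = {"Content-MD5": content_md5(b"hello, pachyderm")}
```

Most S3 libraries can compute this header for you; computing it by hand is mainly useful when issuing raw HTTP requests.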

#### `AbortMultipartUpload`

Route: `DELETE /<branch>.<repo>?uploadId=<uploadId>`.

Aborts an in-progress multipart upload.

#### `CompleteMultipartUpload`

Route: `POST /<branch>.<repo>?uploadId=<uploadId>`.

Completes a multipart upload. If ETags are included in the request
payload, they must be in the same format as the ETags returned by the S3
gateway when the multipart chunks were uploaded. If they are MD5
hashes or use any other hash algorithm, they are ignored.
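
In practice this means a client should echo back the ETags it received for each uploaded part rather than computing its own. The sketch below builds the completion payload in S3's standard format from such a mapping; the helper name and the ETag strings are placeholders.

```python
import xml.etree.ElementTree as ET


def complete_payload(parts: dict) -> bytes:
    """Build a CompleteMultipartUpload body.

    parts maps part number -> ETag string, stored exactly as the
    gateway returned it for the corresponding UploadPart call.
    """
    root = ET.Element("CompleteMultipartUpload")
    for number in sorted(parts):  # parts are listed in ascending order
        part = ET.SubElement(root, "Part")
        ET.SubElement(part, "PartNumber").text = str(number)
        ET.SubElement(part, "ETag").text = parts[number]
    return ET.tostring(root, encoding="utf-8")


# Placeholder ETags, for illustration only.
payload = complete_payload({2: "etag-of-part-2", 1: "etag-of-part-1"})
```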

#### `CreateMultipartUpload`

Route: `POST /<branch>.<repo>?uploads`.

Initiates a multipart upload.

#### `ListParts`

Route: `GET /<branch>.<repo>?uploadId=<uploadId>`.

Lists the parts of an in-progress multipart upload.

#### `UploadPart`

Route: `PUT /<branch>.<repo>?uploadId=<uploadId>&partNumber=<partNumber>`.

Uploads a chunk of a multipart upload.