github.com/Jeffail/benthos/v3@v3.65.0/website/docs/components/outputs/aws_s3.md (about)

---
title: aws_s3
type: output
status: stable
categories: ["Services","AWS"]
---

<!--
     THIS FILE IS AUTOGENERATED!

     To make changes please edit the contents of:
     lib/output/aws_s3.go
-->

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';


Sends message parts as objects to an Amazon S3 bucket. Each object is uploaded
to the path specified by the `path` field.

Introduced in version 3.36.0.


<Tabs defaultValue="common" values={[
  { label: 'Common', value: 'common', },
  { label: 'Advanced', value: 'advanced', },
]}>

<TabItem value="common">

```yaml
# Common config fields, showing default values
output:
  label: ""
  aws_s3:
    bucket: ""
    path: ${!count("files")}-${!timestamp_unix_nano()}.txt
    tags: {}
    content_type: application/octet-stream
    metadata:
      exclude_prefixes: []
    max_in_flight: 1
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
    region: eu-west-1
```

</TabItem>
<TabItem value="advanced">

```yaml
# All config fields, showing default values
output:
  label: ""
  aws_s3:
    bucket: ""
    path: ${!count("files")}-${!timestamp_unix_nano()}.txt
    tags: {}
    content_type: application/octet-stream
    content_encoding: ""
    cache_control: ""
    content_disposition: ""
    content_language: ""
    website_redirect_location: ""
    metadata:
      exclude_prefixes: []
    storage_class: STANDARD
    kms_key_id: ""
    server_side_encryption: ""
    force_path_style_urls: false
    max_in_flight: 1
    timeout: 5s
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: []
    region: eu-west-1
    endpoint: ""
    credentials:
      profile: ""
      id: ""
      secret: ""
      token: ""
      role: ""
      role_external_id: ""
```

</TabItem>
</Tabs>

To give each object a different path, use the function interpolations described
[here](/docs/configuration/interpolation#bloblang-queries), which are
evaluated per message of a batch.

### Metadata

Metadata fields on messages are sent as headers. To mutate or remove these values, see the [metadata docs](/docs/configuration/metadata).

### Tags

The tags field allows you to specify key/value pairs to attach to objects as tags, where the values support
[interpolation functions](/docs/configuration/interpolation#bloblang-queries):

```yaml
output:
  aws_s3:
    bucket: TODO
    path: ${!count("files")}-${!timestamp_unix_nano()}.tar.gz
    tags:
      Key1: Value1
      Timestamp: ${!meta("Timestamp")}
```

### Credentials

By default Benthos will use a shared credentials file when connecting to AWS
services. It's also possible to set credentials explicitly at the component level,
allowing you to transfer data across accounts. You can find out more
[in this document](/docs/guides/cloud/aws).
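
As a rough sketch of setting credentials at the component level, the config below assumes a role in another account. The role ARN and external ID are hypothetical placeholders, not values taken from this document:

```yaml
output:
  aws_s3:
    bucket: TODO
    path: ${!count("files")}-${!timestamp_unix_nano()}.txt
    region: eu-west-1
    credentials:
      # Hypothetical role in the account that owns the bucket.
      role: arn:aws:iam::123456789012:role/example-uploader
      # Placeholder; only relevant if the role expects an external ID.
      role_external_id: example-external-id
```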

### Batching

It's common to want to upload messages to S3 as batched archives. The easiest
way to do this is to batch your messages at the output level and join the batch
of messages with an
[`archive`](/docs/components/processors/archive) and/or
[`compress`](/docs/components/processors/compress) processor.

For example, if we wished to upload messages as a .tar.gz archive of documents
we could achieve that with the following config:

```yaml
output:
  aws_s3:
    bucket: TODO
    path: ${!count("files")}-${!timestamp_unix_nano()}.tar.gz
    batching:
      count: 100
      period: 10s
      processors:
        - archive:
            format: tar
        - compress:
            algorithm: gzip
```

Alternatively, if we wished to upload JSON documents as a single large document
containing an array of objects we could do that with:

```yaml
output:
  aws_s3:
    bucket: TODO
    path: ${!count("files")}-${!timestamp_unix_nano()}.json
    batching:
      count: 100
      processors:
        - archive:
            format: json_array
```

## Performance

This output benefits from sending multiple messages in flight in parallel for
improved performance. You can tune the maximum number of in-flight messages with the
field `max_in_flight`.
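
For instance, a config along these lines would raise the limit from the default of 1. The value 64 is an arbitrary illustration; a suitable number depends on your workload and network:

```yaml
output:
  aws_s3:
    bucket: TODO
    path: ${!count("files")}-${!timestamp_unix_nano()}.txt
    # Allow up to 64 messages in flight at once (illustrative value).
    max_in_flight: 64
```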

## Fields

### `bucket`

The bucket to upload messages to.


Type: `string`  
Default: `""`  

### `path`

The path of each message to upload.
This field supports [interpolation functions](/docs/configuration/interpolation#bloblang-queries).


Type: `string`  
Default: `"${!count(\"files\")}-${!timestamp_unix_nano()}.txt"`  

```yaml
# Examples

path: ${!count("files")}-${!timestamp_unix_nano()}.txt

path: ${!meta("kafka_key")}.json

path: ${!json("doc.namespace")}/${!json("doc.id")}.json
```

### `tags`

Key/value pairs to store with the object as tags.
This field supports [interpolation functions](/docs/configuration/interpolation#bloblang-queries).


Type: `object`  
Default: `{}`  

```yaml
# Examples

tags:
  Key1: Value1
  Timestamp: ${!meta("Timestamp")}
```

### `content_type`

The content type to set for each object.
This field supports [interpolation functions](/docs/configuration/interpolation#bloblang-queries).


Type: `string`  
Default: `"application/octet-stream"`  

### `content_encoding`

An optional content encoding to set for each object.
This field supports [interpolation functions](/docs/configuration/interpolation#bloblang-queries).


Type: `string`  
Default: `""`  
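
As a sketch of how `content_type` and `content_encoding` might be combined with output-level batching, the config below stores each batch as a gzipped JSON array. The specific header values are illustrative choices, and how a downstream client treats the `gzip` encoding depends on that client:

```yaml
output:
  aws_s3:
    bucket: TODO
    path: ${!count("files")}-${!timestamp_unix_nano()}.json
    # Describe the decoded payload and the encoding applied to it.
    content_type: application/json
    content_encoding: gzip
    batching:
      count: 100
      processors:
        - archive:
            format: json_array
        - compress:
            algorithm: gzip
```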

### `cache_control`

The cache control to set for each object.
This field supports [interpolation functions](/docs/configuration/interpolation#bloblang-queries).


Type: `string`  
Default: `""`  

### `content_disposition`

The content disposition to set for each object.
This field supports [interpolation functions](/docs/configuration/interpolation#bloblang-queries).


Type: `string`  
Default: `""`  

### `content_language`

The content language to set for each object.
This field supports [interpolation functions](/docs/configuration/interpolation#bloblang-queries).


Type: `string`  
Default: `""`  

### `website_redirect_location`

The website redirect location to set for each object.
This field supports [interpolation functions](/docs/configuration/interpolation#bloblang-queries).


Type: `string`  
Default: `""`  

### `metadata`

Specify criteria for which metadata values are attached to objects as headers.


Type: `object`  

### `metadata.exclude_prefixes`

Provide a list of explicit metadata key prefixes to be excluded when adding metadata to sent messages.


Type: `array`  
Default: `[]`  
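
For example, to keep Kafka-related metadata off uploaded objects, something like the following could be used. The `kafka_` prefix is an assumption about which metadata keys your input adds:

```yaml
output:
  aws_s3:
    bucket: TODO
    path: ${!meta("kafka_key")}.json
    metadata:
      exclude_prefixes:
        # Skip any metadata key starting with "kafka_" (assumed prefix).
        - kafka_
```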

### `storage_class`

The storage class to set for each object.
This field supports [interpolation functions](/docs/configuration/interpolation#bloblang-queries).


Type: `string`  
Default: `"STANDARD"`  
Options: `STANDARD`, `REDUCED_REDUNDANCY`, `GLACIER`, `STANDARD_IA`, `ONEZONE_IA`, `INTELLIGENT_TIERING`, `DEEP_ARCHIVE`.
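
As an illustration, objects can be written with a non-default storage class by picking one of the options listed above:

```yaml
output:
  aws_s3:
    bucket: TODO
    path: ${!count("files")}-${!timestamp_unix_nano()}.txt
    # Any value from the options list is accepted here.
    storage_class: STANDARD_IA
```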

### `kms_key_id`

An optional server side encryption key.


Type: `string`  
Default: `""`  

### `server_side_encryption`

An optional server side encryption algorithm.


Type: `string`  
Default: `""`  
Requires version 3.63.0 or newer  
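
A minimal sketch of enabling encryption with a KMS key follows. This page does not list the accepted algorithm values, so `aws:kms` is an assumption taken from the algorithms the S3 API itself accepts (the other being `AES256`), and the key ID is a placeholder:

```yaml
output:
  aws_s3:
    bucket: TODO
    path: ${!count("files")}-${!timestamp_unix_nano()}.txt
    # Assumed value, based on the S3 API rather than this page.
    server_side_encryption: aws:kms
    # Placeholder for the ID of the KMS key to encrypt with.
    kms_key_id: TODO
```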

### `force_path_style_urls`

Forces the client API to use path style URLs, which helps when connecting to custom endpoints.


Type: `bool`  
Default: `false`  
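
For example, when targeting an S3-compatible service running locally, a config might look like the sketch below. The endpoint URL is hypothetical:

```yaml
output:
  aws_s3:
    bucket: TODO
    path: ${!count("files")}-${!timestamp_unix_nano()}.txt
    # Hypothetical local S3-compatible endpoint.
    endpoint: http://localhost:9000
    force_path_style_urls: true
```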

### `max_in_flight`

The maximum number of messages to have in flight at a given time. Increase this to improve throughput.


Type: `int`  
Default: `1`  

### `timeout`

The maximum period to wait on an upload before abandoning it and reattempting.


Type: `string`  
Default: `"5s"`  

### `batching`

Allows you to configure a [batching policy](/docs/configuration/batching).


Type: `object`  

```yaml
# Examples

batching:
  byte_size: 5000
  count: 0
  period: 1s

batching:
  count: 10
  period: 1s

batching:
  check: this.contains("END BATCH")
  count: 0
  period: 1m
```

### `batching.count`

The number of messages at which the batch should be flushed. If set to `0`, count-based batching is disabled.


Type: `int`  
Default: `0`  

### `batching.byte_size`

The number of bytes at which the batch should be flushed. If set to `0`, size-based batching is disabled.


Type: `int`  
Default: `0`  

### `batching.period`

A period after which an incomplete batch should be flushed regardless of its size.


Type: `string`  
Default: `""`  

```yaml
# Examples

period: 1s

period: 1m

period: 500ms
```

### `batching.check`

A [Bloblang query](/docs/guides/bloblang/about/) that should return a boolean value indicating whether a message should end a batch.


Type: `string`  
Default: `""`  

```yaml
# Examples

check: this.type == "end_of_transaction"
```

### `batching.processors`

A list of [processors](/docs/components/processors/about) to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Please note that all resulting messages are flushed as a single batch, so splitting the batch into smaller batches using these processors is a no-op.


Type: `array`  
Default: `[]`  

```yaml
# Examples

processors:
  - archive:
      format: lines

processors:
  - archive:
      format: json_array

processors:
  - merge_json: {}
```

### `region`

The AWS region to target.


Type: `string`  
Default: `"eu-west-1"`  

### `endpoint`

Allows you to specify a custom endpoint for the AWS API.


Type: `string`  
Default: `""`  

### `credentials`

Optional manual configuration of AWS credentials to use. More information can be found [in this document](/docs/guides/cloud/aws).


Type: `object`  

### `credentials.profile`

A profile from `~/.aws/credentials` to use.


Type: `string`  
Default: `""`  

### `credentials.id`

The ID of credentials to use.


Type: `string`  
Default: `""`  

### `credentials.secret`

The secret for the credentials being used.


Type: `string`  
Default: `""`  

### `credentials.token`

The token for the credentials being used, required when using short term credentials.


Type: `string`  
Default: `""`  

### `credentials.role`

A role ARN to assume.


Type: `string`  
Default: `""`  

### `credentials.role_external_id`

An external ID to provide when assuming a role.


Type: `string`  
Default: `""`  