---
title: gcp_cloud_storage
type: output
status: beta
categories: ["Services","GCP"]
---

<!--
     THIS FILE IS AUTOGENERATED!

     To make changes please edit the contents of:
     lib/output/gcp_cloud_storage.go
-->

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

:::caution BETA
This component is mostly stable but breaking changes could still be made outside of major version releases if a fundamental problem with the component is found.
:::

Sends message parts as objects to a Google Cloud Storage bucket. Each object is
uploaded with the path specified by the `path` field.

Introduced in version 3.43.0.


<Tabs defaultValue="common" values={[
  { label: 'Common', value: 'common', },
  { label: 'Advanced', value: 'advanced', },
]}>

<TabItem value="common">

```yaml
# Common config fields, showing default values
output:
  label: ""
  gcp_cloud_storage:
    bucket: ""
    path: ${!count("files")}-${!timestamp_unix_nano()}.txt
    content_type: application/octet-stream
    collision_mode: overwrite
    max_in_flight: 1
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
```

</TabItem>
<TabItem value="advanced">

```yaml
# All config fields, showing default values
output:
  label: ""
  gcp_cloud_storage:
    bucket: ""
    path: ${!count("files")}-${!timestamp_unix_nano()}.txt
    content_type: application/octet-stream
    collision_mode: overwrite
    content_encoding: ""
    chunk_size: 16777216
    max_in_flight: 1
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: []
```

</TabItem>
</Tabs>

In order to have a different path for each object you should use function
interpolations described [here](/docs/configuration/interpolation#bloblang-queries), which are
calculated per message of a batch.

### Metadata

Metadata fields on messages will be sent as headers. In order to mutate these values (or remove them) check out the [metadata docs](/docs/configuration/metadata).
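For instance, metadata can be attached upstream with a [`bloblang`](/docs/components/processors/bloblang) processor and then reused in the `path` interpolation. The following is a minimal sketch; the `owning_team` key and its value are purely illustrative:

```yaml
pipeline:
  processors:
    # Attach a metadata field; it will be sent as a header on each
    # uploaded object. The key `owning_team` is just an example.
    - bloblang: |
        meta owning_team = "data-eng"
        root = this

output:
  gcp_cloud_storage:
    bucket: TODO
    path: ${!meta("owning_team")}/${!count("files")}.json
```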
### Credentials

By default Benthos will use a shared credentials file when connecting to GCP
services. You can find out more [in this document](/docs/guides/cloud/gcp).

### Batching

It's common to want to upload messages to Google Cloud Storage as batched
archives. The easiest way to do this is to batch your messages at the output
level and join the batch of messages with an
[`archive`](/docs/components/processors/archive) and/or
[`compress`](/docs/components/processors/compress) processor.

For example, if we wished to upload messages as a .tar.gz archive of documents
we could achieve that with the following config:

```yaml
output:
  gcp_cloud_storage:
    bucket: TODO
    path: ${!count("files")}-${!timestamp_unix_nano()}.tar.gz
    batching:
      count: 100
      period: 10s
      processors:
        - archive:
            format: tar
        - compress:
            algorithm: gzip
```

Alternatively, if we wished to upload JSON documents as a single large document
containing an array of objects we can do that with:

```yaml
output:
  gcp_cloud_storage:
    bucket: TODO
    path: ${!count("files")}-${!timestamp_unix_nano()}.json
    batching:
      count: 100
      processors:
        - archive:
            format: json_array
```

## Performance

This output benefits from sending multiple messages in flight in parallel for
improved performance. You can tune the maximum number of in-flight messages with
the field `max_in_flight`.

This output benefits from sending messages as a batch for improved performance.
Batches can be formed at both the input and output level. You can find out more
[in this doc](/docs/configuration/batching).

## Fields

### `bucket`

The bucket to upload messages to.


Type: `string`
Default: `""`

### `path`

The path of each message to upload.
This field supports [interpolation functions](/docs/configuration/interpolation#bloblang-queries).


Type: `string`
Default: `"${!count(\"files\")}-${!timestamp_unix_nano()}.txt"`

```yaml
# Examples

path: ${!count("files")}-${!timestamp_unix_nano()}.txt

path: ${!meta("kafka_key")}.json

path: ${!json("doc.namespace")}/${!json("doc.id")}.json
```

### `content_type`

The content type to set for each object.
This field supports [interpolation functions](/docs/configuration/interpolation#bloblang-queries).


Type: `string`
Default: `"application/octet-stream"`

### `collision_mode`

Determines how file path collisions should be dealt with.


Type: `string`
Default: `"overwrite"`
Requires version 3.53.0 or newer

| Option | Summary |
|---|---|
| `overwrite` | Replace the existing file with the new one. |
| `append` | Append the message bytes to the original file. |
| `error-if-exists` | Return an error; this is the equivalent of a nack. |
| `ignore` | Do not modify the original file; the new data will be dropped. |
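As a minimal sketch of the `append` mode, the following config accumulates messages into one object per service; the `service` field in the interpolation is purely an assumption about the message shape:

```yaml
output:
  gcp_cloud_storage:
    bucket: TODO
    # Assumes each message carries a `service` field; colliding uploads
    # are appended to the existing object instead of replacing it.
    path: logs/${!json("service")}.jsonl
    collision_mode: append
```

Since Cloud Storage has no native append operation, this mode is likely best suited to low-volume paths.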
### `content_encoding`

An optional content encoding to set for each object.
This field supports [interpolation functions](/docs/configuration/interpolation#bloblang-queries).


Type: `string`
Default: `""`

### `chunk_size`

An optional chunk size which controls the maximum number of bytes of the object that the writer will attempt to send to the server in a single request. If `chunk_size` is set to zero, chunking will be disabled.


Type: `int`
Default: `16777216`

### `max_in_flight`

The maximum number of messages to have in flight at a given time. Increase this to improve throughput.


Type: `int`
Default: `1`

### `batching`

Allows you to configure a [batching policy](/docs/configuration/batching).


Type: `object`

```yaml
# Examples

batching:
  byte_size: 5000
  count: 0
  period: 1s

batching:
  count: 10
  period: 1s

batching:
  check: this.contains("END BATCH")
  count: 0
  period: 1m
```

### `batching.count`

A number of messages at which the batch should be flushed. A value of `0` disables count-based batching.


Type: `int`
Default: `0`

### `batching.byte_size`

An amount of bytes at which the batch should be flushed. A value of `0` disables size-based batching.


Type: `int`
Default: `0`

### `batching.period`

A period in which an incomplete batch should be flushed regardless of its size.


Type: `string`
Default: `""`

```yaml
# Examples

period: 1s

period: 1m

period: 500ms
```

### `batching.check`

A [Bloblang query](/docs/guides/bloblang/about/) that should return a boolean value indicating whether a message should end a batch.


Type: `string`
Default: `""`

```yaml
# Examples

check: this.type == "end_of_transaction"
```

### `batching.processors`

A list of [processors](/docs/components/processors/about) to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Please note that all resulting messages are flushed as a single batch; therefore, splitting the batch into smaller batches using these processors is a no-op.


Type: `array`
Default: `[]`

```yaml
# Examples

processors:
  - archive:
      format: lines

processors:
  - archive:
      format: json_array

processors:
  - merge_json: {}
```
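Putting several of these fields together, the following is a hedged sketch of uploading gzip-compressed JSON batches. Whether downloads are then transparently decompressed (Cloud Storage's decompressive transcoding) depends on how the objects are fetched, so verify that behaviour for your use case:

```yaml
output:
  gcp_cloud_storage:
    bucket: TODO
    path: ${!count("files")}-${!timestamp_unix_nano()}.json.gz
    content_type: application/json
    # Label the payload so clients know it is gzip-compressed.
    content_encoding: gzip
    batching:
      count: 100
      period: 10s
      processors:
        - archive:
            format: json_array
        - compress:
            algorithm: gzip
```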