github.com/Jeffail/benthos/v3@v3.65.0/website/docs/components/inputs/gcp_cloud_storage.md

---
title: gcp_cloud_storage
type: input
status: beta
categories: ["Services","GCP"]
---

<!--
     THIS FILE IS AUTOGENERATED!

     To make changes please edit the contents of:
     lib/input/gcp_cloud_storage.go
-->

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

:::caution BETA
This component is mostly stable but breaking changes could still be made outside of major version releases if a fundamental problem with the component is found.
:::

Downloads objects within a Google Cloud Storage bucket, optionally filtered by a prefix.

Introduced in version 3.43.0.

<Tabs defaultValue="common" values={[
  { label: 'Common', value: 'common', },
  { label: 'Advanced', value: 'advanced', },
]}>

<TabItem value="common">

```yaml
# Common config fields, showing default values
input:
  label: ""
  gcp_cloud_storage:
    bucket: ""
    prefix: ""
    codec: all-bytes
```

</TabItem>
<TabItem value="advanced">

```yaml
# All config fields, showing default values
input:
  label: ""
  gcp_cloud_storage:
    bucket: ""
    prefix: ""
    codec: all-bytes
    delete_objects: false
```

</TabItem>
</Tabs>

## Downloading Large Files

When downloading large files it's often necessary to process them in streamed parts in order to avoid loading an entire file into memory at once. To do this, specify a [`codec`](#codec) that determines how to break the input into smaller individual messages.
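
As a minimal sketch of this streaming approach, the config below consumes each object line by line rather than as one message. The bucket name `my-bucket` is a placeholder, and the choice of the `lines` codec assumes the objects contain newline-delimited text.

```yaml
input:
  gcp_cloud_storage:
    bucket: my-bucket # hypothetical bucket name
    prefix: ""
    codec: lines # each line of an object becomes its own message
```

Compressed objects can be streamed in the same way by chaining codecs, for example `gzip/csv`.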

## Metadata

This input adds the following metadata fields to each message:

```
- gcs_key
- gcs_bucket
- gcs_last_modified
- gcs_last_modified_unix
- gcs_content_type
- gcs_content_encoding
- All user defined metadata
```

You can access these metadata fields using [function interpolation](/docs/configuration/interpolation#metadata).
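
For illustration, the sketch below references two of these fields from a `log` processor using interpolation functions. The bucket name and the log message are assumptions chosen for the example, not part of this component's defaults.

```yaml
input:
  gcp_cloud_storage:
    bucket: my-bucket # hypothetical bucket name
    codec: all-bytes

pipeline:
  processors:
    - log:
        level: INFO
        # meta("...") resolves a metadata field of the message being processed
        message: 'Consumed ${! meta("gcs_key") } from bucket ${! meta("gcs_bucket") }'
```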

### Credentials

By default Benthos will use a shared credentials file when connecting to GCP
services. You can find out more [in this document](/docs/guides/cloud/gcp).

## Fields

### `bucket`

The name of the bucket from which to download objects.


Type: `string`  
Default: `""`  

### `prefix`

An optional path prefix. If set, only objects with the prefix are consumed.


Type: `string`  
Default: `""`  
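
A short sketch combining `bucket` and `prefix`, assuming a hypothetical bucket `my-bucket` in which the objects of interest are stored under `logs/2021/`:

```yaml
input:
  gcp_cloud_storage:
    bucket: my-bucket  # hypothetical bucket name
    prefix: logs/2021/ # hypothetical prefix; only objects with this prefix are consumed
```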

### `codec`

Determines how the bytes of a data source are converted into discrete messages. Codecs are useful for processing large files or continuous streams of data in small chunks rather than loading everything into memory. Lines can be consumed using a custom delimiter with the `delim:x` codec, where x is the character sequence to use as the delimiter. Codecs can be chained with `/`; for example, a gzip compressed CSV file can be consumed with the codec `gzip/csv`.


Type: `string`  
Default: `"all-bytes"`  

| Option | Summary |
|---|---|
| `auto` | EXPERIMENTAL: Attempts to derive a codec for each file based on information such as the extension. For example, a .tar.gz file would be consumed with the `gzip/tar` codec. Defaults to all-bytes. |
| `all-bytes` | Consume the entire file as a single binary message. |
| `chunker:x` | Consume the file in chunks of a given number of bytes. |
| `csv` | Consume structured rows as comma separated values. The first row must be a header row. |
| `csv:x` | Consume structured rows as values separated by a custom delimiter. The first row must be a header row. The custom delimiter must be a single character, e.g. the codec `"csv:\t"` would consume a tab delimited file. |
| `delim:x` | Consume the file in segments divided by a custom delimiter. |
| `gzip` | Decompress a gzip file. This codec should precede another codec, e.g. `gzip/all-bytes`, `gzip/tar`, `gzip/csv`, etc. |
| `lines` | Consume the file in segments divided by linebreaks. |
| `multipart` | Consumes the output of another codec and batches messages together. A batch ends when an empty message is consumed. For example, the codec `lines/multipart` could be used to consume multipart messages where an empty line indicates the end of each batch. |
| `regex:(?m)^\d\d:\d\d:\d\d` | Consume the file in segments divided by a regular expression. |
| `tar` | Parse the file as a tar archive, and consume each file of the archive as a message. |


```yaml
# Examples

codec: lines

codec: "delim:\t"

codec: delim:foobar

codec: gzip/csv
```

### `delete_objects`

Whether to delete downloaded objects from the bucket once they are processed.


Type: `bool`  
Default: `false`  
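
A minimal sketch with deletion enabled, again using a hypothetical bucket name. With this setting each object is removed from the bucket once it has been processed, so the credentials in use need permission to delete objects.

```yaml
input:
  gcp_cloud_storage:
    bucket: my-bucket    # hypothetical bucket name
    codec: all-bytes
    delete_objects: true # remove each object from the bucket after it is processed
```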