---
title: gcp_cloud_storage
type: output
status: beta
categories: ["Services","GCP"]
---

<!--
     THIS FILE IS AUTOGENERATED!

     To make changes please edit the contents of:
     lib/output/gcp_cloud_storage.go
-->

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

:::caution BETA
This component is mostly stable but breaking changes could still be made outside of major version releases if a fundamental problem with the component is found.
:::

Sends message parts as objects to a Google Cloud Storage bucket. Each object is
uploaded with the path specified by the `path` field.

Introduced in version 3.43.0.


<Tabs defaultValue="common" values={[
  { label: 'Common', value: 'common', },
  { label: 'Advanced', value: 'advanced', },
]}>

<TabItem value="common">

```yaml
# Common config fields, showing default values
output:
  label: ""
  gcp_cloud_storage:
    bucket: ""
    path: ${!count("files")}-${!timestamp_unix_nano()}.txt
    content_type: application/octet-stream
    collision_mode: overwrite
    max_in_flight: 1
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
```

</TabItem>
<TabItem value="advanced">

```yaml
# All config fields, showing default values
output:
  label: ""
  gcp_cloud_storage:
    bucket: ""
    path: ${!count("files")}-${!timestamp_unix_nano()}.txt
    content_type: application/octet-stream
    collision_mode: overwrite
    content_encoding: ""
    chunk_size: 16777216
    max_in_flight: 1
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: []
```

</TabItem>
</Tabs>

In order to have a different path for each object you should use function
interpolations described [here](/docs/configuration/interpolation#bloblang-queries), which are
calculated per message of a batch.

### Metadata

Metadata fields on messages will be sent as headers. In order to mutate these values (or remove them) check out the [metadata docs](/docs/configuration/metadata).
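For instance, metadata can be attached upstream with a [`bloblang`](/docs/components/processors/bloblang) processor and then reused in the `path` interpolation. The following is a minimal sketch; the `owning_team` key and its value are purely illustrative:

```yaml
pipeline:
  processors:
    # Attach a metadata field; it will be sent as a header on each
    # uploaded object. The key `owning_team` is just an example.
    - bloblang: |
        meta owning_team = "data-eng"
        root = this

output:
  gcp_cloud_storage:
    bucket: TODO
    path: ${!meta("owning_team")}/${!count("files")}.json
```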
### Credentials

By default Benthos will use a shared credentials file when connecting to GCP
services. You can find out more [in this document](/docs/guides/cloud/gcp).

### Batching

It's common to want to upload messages to Google Cloud Storage as batched
archives. The easiest way to do this is to batch your messages at the output
level and join the batch of messages with an
[`archive`](/docs/components/processors/archive) and/or
[`compress`](/docs/components/processors/compress) processor.

For example, if we wished to upload messages as a .tar.gz archive of documents
we could achieve that with the following config:

```yaml
output:
  gcp_cloud_storage:
    bucket: TODO
    path: ${!count("files")}-${!timestamp_unix_nano()}.tar.gz
    batching:
      count: 100
      period: 10s
      processors:
        - archive:
            format: tar
        - compress:
            algorithm: gzip
```

Alternatively, if we wished to upload JSON documents as a single large document
containing an array of objects we can do that with:

```yaml
output:
  gcp_cloud_storage:
    bucket: TODO
    path: ${!count("files")}-${!timestamp_unix_nano()}.json
    batching:
      count: 100
      processors:
        - archive:
            format: json_array
```

## Performance

This output benefits from sending multiple messages in flight in parallel for
improved performance. You can tune the maximum number of in-flight messages with
the field `max_in_flight`.

This output benefits from sending messages as a batch for improved performance.
Batches can be formed at both the input and output level. You can find out more
[in this doc](/docs/configuration/batching).

## Fields

### `bucket`

The bucket to upload messages to.


Type: `string`
Default: `""`

### `path`

The path of each message to upload.
This field supports [interpolation functions](/docs/configuration/interpolation#bloblang-queries).


Type: `string`
Default: `"${!count(\"files\")}-${!timestamp_unix_nano()}.txt"`

```yaml
# Examples

path: ${!count("files")}-${!timestamp_unix_nano()}.txt

path: ${!meta("kafka_key")}.json

path: ${!json("doc.namespace")}/${!json("doc.id")}.json
```

### `content_type`

The content type to set for each object.
This field supports [interpolation functions](/docs/configuration/interpolation#bloblang-queries).


Type: `string`
Default: `"application/octet-stream"`

### `collision_mode`

Determines how file path collisions should be dealt with.


Type: `string`
Default: `"overwrite"`
Requires version 3.53.0 or newer

| Option | Summary |
|---|---|
| `overwrite` | Replace the existing file with the new one. |
| `append` | Append the message bytes to the original file. |
| `error-if-exists` | Return an error; this is the equivalent of a nack. |
| `ignore` | Do not modify the original file; the new data will be dropped. |
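As a minimal sketch of the `append` mode, the following config accumulates messages into one object per service; the `service` field in the interpolation is purely an assumption about the message shape:

```yaml
output:
  gcp_cloud_storage:
    bucket: TODO
    # Assumes each message carries a `service` field; colliding uploads
    # are appended to the existing object instead of replacing it.
    path: logs/${!json("service")}.jsonl
    collision_mode: append
```

Since Cloud Storage has no native append operation, this mode is likely best suited to low-volume paths.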
### `content_encoding`

An optional content encoding to set for each object.
This field supports [interpolation functions](/docs/configuration/interpolation#bloblang-queries).


Type: `string`
Default: `""`

### `chunk_size`

An optional chunk size which controls the maximum number of bytes of the object that the writer will attempt to send to the server in a single request. If `chunk_size` is set to zero, chunking will be disabled.


Type: `int`
Default: `16777216`

### `max_in_flight`

The maximum number of messages to have in flight at a given time. Increase this to improve throughput.


Type: `int`
Default: `1`

### `batching`

Allows you to configure a [batching policy](/docs/configuration/batching).


Type: `object`

```yaml
# Examples

batching:
  byte_size: 5000
  count: 0
  period: 1s

batching:
  count: 10
  period: 1s

batching:
  check: this.contains("END BATCH")
  count: 0
  period: 1m
```

### `batching.count`

A number of messages at which the batch should be flushed. A value of `0` disables count-based batching.


Type: `int`
Default: `0`

### `batching.byte_size`

An amount of bytes at which the batch should be flushed. A value of `0` disables size-based batching.


Type: `int`
Default: `0`

### `batching.period`

A period in which an incomplete batch should be flushed regardless of its size.


Type: `string`
Default: `""`

```yaml
# Examples

period: 1s

period: 1m

period: 500ms
```

### `batching.check`

A [Bloblang query](/docs/guides/bloblang/about/) that should return a boolean value indicating whether a message should end a batch.


Type: `string`
Default: `""`

```yaml
# Examples

check: this.type == "end_of_transaction"
```

### `batching.processors`

A list of [processors](/docs/components/processors/about) to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Please note that all resulting messages are flushed as a single batch; therefore, splitting the batch into smaller batches using these processors is a no-op.


Type: `array`
Default: `[]`

```yaml
# Examples

processors:
  - archive:
      format: lines

processors:
  - archive:
      format: json_array

processors:
  - merge_json: {}
```
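Putting several of these fields together, the following is a hedged sketch of uploading gzip-compressed JSON batches. Whether downloads are then transparently decompressed (Cloud Storage's decompressive transcoding) depends on how the objects are fetched, so verify that behaviour for your use case:

```yaml
output:
  gcp_cloud_storage:
    bucket: TODO
    path: ${!count("files")}-${!timestamp_unix_nano()}.json.gz
    content_type: application/json
    # Label the payload so clients know it is gzip-compressed.
    content_encoding: gzip
    batching:
      count: 100
      period: 10s
      processors:
        - archive:
            format: json_array
        - compress:
            algorithm: gzip
```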