---
title: s3
type: input
status: deprecated
categories: ["Services","AWS"]
---

<!--
     THIS FILE IS AUTOGENERATED!

     To make changes please edit the contents of:
     lib/input/s3.go
-->

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

:::warning DEPRECATED
This component is deprecated and will be removed in the next major version release. Please consider moving onto [alternative components](#alternatives).
:::

Downloads objects within an Amazon S3 bucket, optionally filtered by a prefix.
If an SQS queue has been configured then only object keys read from the queue
will be downloaded.


<Tabs defaultValue="common" values={[
  { label: 'Common', value: 'common', },
  { label: 'Advanced', value: 'advanced', },
]}>

<TabItem value="common">

```yaml
# Common config fields, showing default values
input:
  label: ""
  s3:
    bucket: ""
    prefix: ""
    sqs_url: ""
    sqs_body_path: Records.*.s3.object.key
    sqs_bucket_path: ""
    sqs_envelope_path: ""
    region: eu-west-1
```

</TabItem>
<TabItem value="advanced">

```yaml
# All config fields, showing default values
input:
  label: ""
  s3:
    bucket: ""
    prefix: ""
    sqs_url: ""
    sqs_body_path: Records.*.s3.object.key
    sqs_bucket_path: ""
    sqs_envelope_path: ""
    sqs_max_messages: 10
    sqs_endpoint: ""
    region: eu-west-1
    endpoint: ""
    credentials:
      profile: ""
      id: ""
      secret: ""
      token: ""
      role: ""
      role_external_id: ""
    retries: 3
    force_path_style_urls: false
    delete_objects: false
    download_manager:
      enabled: true
    timeout: 5s
```

</TabItem>
</Tabs>

## Alternatives

This input is being replaced with the shiny new [`aws_s3` input](/docs/components/inputs/aws_s3), which has improved features. Consider trying it out instead.
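
As a rough illustration of what moving to the replacement might look like, the sketch below shows an SQS-driven `aws_s3` input. The bucket name and queue URL are hypothetical, and the field names under `sqs` are assumptions that should be checked against the `aws_s3` documentation rather than taken as a definitive mapping from this component.

```yaml
# A minimal sketch, not a verified migration: field names under `aws_s3`
# are assumptions to confirm against the aws_s3 input documentation.
input:
  aws_s3:
    bucket: example-bucket   # hypothetical bucket name
    prefix: ""
    region: eu-west-1
    sqs:
      # Hypothetical queue URL
      url: https://sqs.eu-west-1.amazonaws.com/123456789012/example-queue
      key_path: Records.*.s3.object.key
      bucket_path: Records.*.s3.bucket.name
      envelope_path: ""
```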

If an SQS queue is not specified the entire list of objects found when this
input starts will be consumed. Note that the prefix configuration is only used
when downloading objects without SQS configured.

If your bucket is configured to send events directly to an SQS queue then you
need to set the `sqs_body_path` field to a
[dot path](/docs/configuration/field_paths) where the object key is found in the payload.
However, it is also common practice to send bucket events to an SNS topic which
sends enveloped events to SQS, in which case you must also set the
`sqs_envelope_path` field to where the payload can be found.

When using SQS events it's also possible to extract target bucket names from the
events by specifying a path in the field `sqs_bucket_path`. For each
SQS event, if that path exists and contains a string it will be used as the bucket
of the download instead of the `bucket` field.

Here is a guide for setting up an SQS queue that receives events for new S3
bucket objects:

https://docs.aws.amazon.com/AmazonS3/latest/dev/ways-to-add-notification-config-to-bucket.html

WARNING: When using SQS please make sure you have sensible values for
`sqs_max_messages` and also the visibility timeout of the queue
itself.

When Benthos consumes an S3 item as a result of receiving an SQS message the
message is not deleted until the S3 item has been sent onwards. This ensures
at-least-once crash resiliency, but also means that if the S3 item takes longer
to process than the visibility timeout of your queue then the same items might
be processed multiple times.

### Credentials

By default Benthos will use a shared credentials file when connecting to AWS
services. It's also possible to set them explicitly at the component level,
allowing you to transfer data across accounts. You can find out more
[in this document](/docs/guides/cloud/aws).
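
The SQS and credentials settings described above can be combined as in the following sketch. The bucket name, queue URL, and role ARN are hypothetical placeholders. The `sqs_bucket_path` value assumes the standard S3 event notification layout, and `sqs_envelope_path: Message` assumes events are routed through SNS, which wraps the payload in a `Message` field; leave it empty when the bucket publishes to SQS directly.

```yaml
# A minimal sketch under the assumptions stated above, not a definitive setup.
input:
  s3:
    bucket: example-fallback-bucket   # hypothetical; used when sqs_bucket_path yields nothing
    # Hypothetical queue URL
    sqs_url: https://sqs.eu-west-1.amazonaws.com/123456789012/example-queue
    sqs_body_path: Records.*.s3.object.key
    sqs_bucket_path: Records.*.s3.bucket.name   # assumes the standard S3 event layout
    sqs_envelope_path: Message                  # assumption: only for S3 -> SNS -> SQS
    sqs_max_messages: 10
    region: eu-west-1
    credentials:
      # Hypothetical role ARN for cross-account access
      role: arn:aws:iam::123456789012:role/example-reader
```

Keeping `sqs_max_messages` modest relative to the queue's visibility timeout reduces the chance that slow downloads cause the same objects to be processed more than once.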

### Metadata

This input adds the following metadata fields to each message:

```
- s3_key
- s3_bucket
- s3_last_modified_unix*
- s3_last_modified (RFC3339)*
- s3_content_type*
- s3_content_encoding*
- All user defined metadata*

* Only added when NOT using download manager
```

You can access these metadata fields using
[function interpolation](/docs/configuration/interpolation#metadata).

## Fields

### `bucket`

The bucket to consume from. If `sqs_bucket_path` is set this field is still required as a fallback.


Type: `string`
Default: `""`

### `prefix`

An optional path prefix. If set, only objects with the prefix are consumed. This field is ignored when SQS is used.


Type: `string`
Default: `""`

### `sqs_url`

An optional SQS URL to connect to. When specified this queue will control which objects are downloaded from the target bucket.


Type: `string`
Default: `""`

### `sqs_body_path`

A [dot path](/docs/configuration/field_paths) whereby object keys are found in SQS messages. This field is only required when an `sqs_url` is specified.


Type: `string`
Default: `"Records.*.s3.object.key"`

### `sqs_bucket_path`

An optional [dot path](/docs/configuration/field_paths) whereby the bucket of an object can be found in consumed SQS messages.


Type: `string`
Default: `""`

### `sqs_envelope_path`

An optional [dot path](/docs/configuration/field_paths) of enveloped payloads to extract from SQS messages. This is required when pushing events from S3 to SNS to SQS.


Type: `string`
Default: `""`

### `sqs_max_messages`

The maximum number of SQS messages to consume from each request.


Type: `int`
Default: `10`

### `sqs_endpoint`

A custom endpoint to use when connecting to SQS.


Type: `string`
Default: `""`

### `region`

The AWS region to target.


Type: `string`
Default: `"eu-west-1"`

### `endpoint`

Allows you to specify a custom endpoint for the AWS API.


Type: `string`
Default: `""`

### `credentials`

Optional manual configuration of AWS credentials to use. More information can be found [in this document](/docs/guides/cloud/aws).


Type: `object`

### `credentials.profile`

A profile from `~/.aws/credentials` to use.


Type: `string`
Default: `""`

### `credentials.id`

The ID of credentials to use.


Type: `string`
Default: `""`

### `credentials.secret`

The secret for the credentials being used.


Type: `string`
Default: `""`

### `credentials.token`

The token for the credentials being used, required when using short term credentials.


Type: `string`
Default: `""`

### `credentials.role`

A role ARN to assume.


Type: `string`
Default: `""`

### `credentials.role_external_id`

An external ID to provide when assuming a role.


Type: `string`
Default: `""`

### `retries`

The maximum number of times to attempt an object download.


Type: `int`
Default: `3`

### `force_path_style_urls`

Forces the client API to use path style URLs, which helps when connecting to custom endpoints.


Type: `bool`
Default: `false`

### `delete_objects`

Whether to delete downloaded objects from the bucket.


Type: `bool`
Default: `false`

### `download_manager`

Controls if and how to use the download manager API. This can help speed up file downloads, but results in file metadata not being copied.


Type: `object`

### `download_manager.enabled`

Whether to use the download manager API.


Type: `bool`
Default: `true`

### `timeout`

The period of time to wait before abandoning a request and trying again.


Type: `string`
Default: `"5s"`

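
To tie the fields above together, here is a sketch of a prefix-based configuration without SQS that disables the download manager so that the metadata fields marked with `*` are retained. The bucket name, prefix, and output directory are hypothetical, and the `file` output with its `path` and `codec` fields is an assumption used only to show metadata interpolation in practice.

```yaml
# A minimal sketch under the assumptions stated above.
input:
  s3:
    bucket: example-bucket   # hypothetical bucket name
    prefix: logs/            # hypothetical; only honoured because no sqs_url is set
    region: eu-west-1
    retries: 3
    delete_objects: false
    download_manager:
      enabled: false         # keeps s3_content_type and the other starred metadata
    timeout: 5s

output:
  file:
    # Assumed output: writes each object under a path derived from its key.
    path: './downloads/${! meta("s3_key") }'
    codec: all-bytes
```

Disabling the download manager trades some download speed for the extra metadata, so it is only worth doing when downstream components actually read those fields.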