---
title: hdfs
type: output
status: stable
categories: ["Services"]
---

<!--
     THIS FILE IS AUTOGENERATED!

     To make changes please edit the contents of:
     lib/output/hdfs.go
-->

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';


Sends message parts as files to an HDFS directory.


<Tabs defaultValue="common" values={[
  { label: 'Common', value: 'common', },
  { label: 'Advanced', value: 'advanced', },
]}>

<TabItem value="common">

```yaml
# Common config fields, showing default values
output:
  label: ""
  hdfs:
    hosts:
      - localhost:9000
    user: benthos_hdfs
    directory: ""
    path: ${!count("files")}-${!timestamp_unix_nano()}.txt
    max_in_flight: 1
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
```

</TabItem>
<TabItem value="advanced">

```yaml
# All config fields, showing default values
output:
  label: ""
  hdfs:
    hosts:
      - localhost:9000
    user: benthos_hdfs
    directory: ""
    path: ${!count("files")}-${!timestamp_unix_nano()}.txt
    max_in_flight: 1
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: []
```

</TabItem>
</Tabs>

Each file is written to the path specified by the `path` field. To give each
file a different path, use the function interpolations described
[here](/docs/configuration/interpolation#bloblang-queries).

## Performance

This output benefits from sending multiple messages in flight in parallel for
improved performance. You can tune the maximum number of in-flight messages
with the field `max_in_flight`.

## Fields

### `hosts`

A list of hosts to connect to.


Type: `array`  
Default: `["localhost:9000"]`  

```yaml
# Examples

hosts:
  - localhost:9000
```

### `user`

A user identifier.


Type: `string`  
Default: `"benthos_hdfs"`  

### `directory`

A directory to store message files within. If the directory does not exist it will be created.


Type: `string`  
Default: `""`  

### `path`

The path to upload messages as. Interpolation functions should be used to generate unique file paths.
This field supports [interpolation functions](/docs/configuration/interpolation#bloblang-queries).


Type: `string`  
Default: `"${!count(\"files\")}-${!timestamp_unix_nano()}.txt"`  

```yaml
# Examples

path: ${!count("files")}-${!timestamp_unix_nano()}.txt
```

### `max_in_flight`

The maximum number of messages to have in flight at a given time. Increase this to improve throughput.


Type: `int`  
Default: `1`  

### `batching`

Allows you to configure a [batching policy](/docs/configuration/batching).


Type: `object`  

```yaml
# Examples

batching:
  byte_size: 5000
  count: 0
  period: 1s

batching:
  count: 10
  period: 1s

batching:
  check: this.contains("END BATCH")
  count: 0
  period: 1m
```

### `batching.count`

The number of messages at which the batch should be flushed. If `0`, count-based batching is disabled.


Type: `int`  
Default: `0`  

### `batching.byte_size`

The number of bytes at which the batch should be flushed. If `0`, size-based batching is disabled.


Type: `int`  
Default: `0`  

### `batching.period`

A period in which an incomplete batch should be flushed regardless of its size.


Type: `string`  
Default: `""`  

```yaml
# Examples

period: 1s

period: 1m

period: 500ms
```

### `batching.check`

A [Bloblang query](/docs/guides/bloblang/about/) that should return a boolean value indicating whether a message should end a batch.


Type: `string`  
Default: `""`  

```yaml
# Examples

check: this.type == "end_of_transaction"
```

### `batching.processors`

A list of [processors](/docs/components/processors/about) to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Please note that all resulting messages are flushed as a single batch, therefore splitting the batch into smaller batches using these processors is a no-op.


Type: `array`  
Default: `[]`  

```yaml
# Examples

processors:
  - archive:
      format: lines

processors:
  - archive:
      format: json_array

processors:
  - merge_json: {}
```
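
To illustrate how the batching fields and batch processors combine with this output, the following is a minimal sketch rather than a recommended configuration. The directory name, `max_in_flight` value, and batch thresholds are assumptions chosen purely for illustration. Because the output sends each message part as its own file, archiving a batch into a single message should result in one file per flushed batch.

```yaml
# Illustrative sketch only: the directory, max_in_flight and batch
# thresholds below are assumed values, not defaults or recommendations.
output:
  hdfs:
    hosts:
      - localhost:9000
    user: benthos_hdfs
    directory: /benthos/archive # hypothetical directory
    path: ${!count("files")}-${!timestamp_unix_nano()}.txt
    max_in_flight: 4
    batching:
      # Flush after 100 messages, or after 10s if the batch is incomplete.
      count: 100
      period: 10s
      processors:
        # Join the batch into one newline-delimited message so that it is
        # written as a single file.
        - archive:
            format: lines
```

Without the `archive` processor, each message in a flushed batch would be expected to produce its own file, so the interpolated `path` still needs to be unique per message.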