---
title: hdfs
type: output
status: stable
categories: ["Services"]
---

<!--
     THIS FILE IS AUTOGENERATED!

     To make changes please edit the contents of:
     lib/output/hdfs.go
-->

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

Sends message parts as files to an HDFS directory.

<Tabs defaultValue="common" values={[
  { label: 'Common', value: 'common', },
  { label: 'Advanced', value: 'advanced', },
]}>

<TabItem value="common">

```yaml
# Common config fields, showing default values
output:
  label: ""
  hdfs:
    hosts:
      - localhost:9000
    user: benthos_hdfs
    directory: ""
    path: ${!count("files")}-${!timestamp_unix_nano()}.txt
    max_in_flight: 1
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
```

</TabItem>
<TabItem value="advanced">

```yaml
# All config fields, showing default values
output:
  label: ""
  hdfs:
    hosts:
      - localhost:9000
    user: benthos_hdfs
    directory: ""
    path: ${!count("files")}-${!timestamp_unix_nano()}.txt
    max_in_flight: 1
    batching:
      count: 0
      byte_size: 0
      period: ""
      check: ""
      processors: []
```

</TabItem>
</Tabs>

Each file is written to the path specified by the `path` field. To give each
file a different path, use the function interpolations described
[here](/docs/configuration/interpolation#bloblang-queries).

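As a rough illustration of per-file paths, the sketch below builds each file name from message content plus a random suffix. It assumes every message is a JSON document with an `id` field, and the `directory` value is a hypothetical placeholder; adjust the expression to suit your own data.

```yaml
# A sketch only: assumes JSON messages that carry an `id` field
output:
  hdfs:
    hosts:
      - localhost:9000
    user: benthos_hdfs
    directory: /benthos/events # hypothetical directory
    path: ${!json("id")}-${!uuid_v4()}.json
```
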
## Performance

This output benefits from sending multiple messages in parallel for improved
performance. You can tune the maximum number of in-flight messages with the
`max_in_flight` field.

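For example, a config along the following lines allows several writes to proceed concurrently. The value `8` and the directory are arbitrary illustrations rather than recommendations; whether a higher value helps will depend on your cluster and workload.

```yaml
output:
  hdfs:
    hosts:
      - localhost:9000
    directory: /benthos/out # hypothetical directory
    path: ${!count("files")}-${!timestamp_unix_nano()}.txt
    max_in_flight: 8 # illustrative value; the default is 1
```
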
## Fields

### `hosts`

A list of hosts to connect to.

Type: `array`  
Default: `["localhost:9000"]`  

```yaml
# Examples

hosts:
  - localhost:9000
```

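Because the field is an array, more than one address can be listed. The host names below are placeholders, not real defaults.

```yaml
# Hypothetical host names, shown only to illustrate the list form
hosts:
  - namenode-1.example.com:9000
  - namenode-2.example.com:9000
```
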
### `user`

A user identifier.

Type: `string`  
Default: `"benthos_hdfs"`  

### `directory`

A directory to store message files within. If the directory does not exist, it will be created.

Type: `string`  
Default: `""`  

### `path`

The path to upload each message as. Use interpolation functions to generate unique file paths.
This field supports [interpolation functions](/docs/configuration/interpolation#bloblang-queries).

Type: `string`  
Default: `"${!count(\"files\")}-${!timestamp_unix_nano()}.txt"`  

```yaml
# Examples

path: ${!count("files")}-${!timestamp_unix_nano()}.txt
```

### `max_in_flight`

The maximum number of messages to have in flight at a given time. Increase this to improve throughput.

Type: `int`  
Default: `1`  

### `batching`

Allows you to configure a [batching policy](/docs/configuration/batching).

Type: `object`  

```yaml
# Examples

batching:
  byte_size: 5000
  count: 0
  period: 1s

batching:
  count: 10
  period: 1s

batching:
  check: this.contains("END BATCH")
  count: 0
  period: 1m
```

### `batching.count`

The number of messages at which the batch should be flushed. Set to `0` to disable count-based batching.

Type: `int`  
Default: `0`  

### `batching.byte_size`

The number of bytes at which the batch should be flushed. Set to `0` to disable size-based batching.

Type: `int`  
Default: `0`  

### `batching.period`

A period after which an incomplete batch should be flushed regardless of its size.

Type: `string`  
Default: `""`  

```yaml
# Examples

period: 1s

period: 1m

period: 500ms
```

### `batching.check`

A [Bloblang query](/docs/guides/bloblang/about/) that should return a boolean value indicating whether a message should end a batch.

Type: `string`  
Default: `""`  

```yaml
# Examples

check: this.type == "end_of_transaction"
```

### `batching.processors`

A list of [processors](/docs/components/processors/about) to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Note that all resulting messages are flushed as a single batch, so splitting the batch into smaller batches with these processors is a no-op.

Type: `array`  
Default: `[]`  

```yaml
# Examples

processors:
  - archive:
      format: lines

processors:
  - archive:
      format: json_array

processors:
  - merge_json: {}
```

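To show how these fields fit together, here is a minimal sketch of a complete output that batches messages and archives each batch with the `lines` format, so that a flushed batch should arrive as a single newline-delimited message and therefore a single file. The batch sizes and the directory are assumptions chosen for illustration.

```yaml
# A sketch under assumed values, not a tuned configuration
output:
  hdfs:
    hosts:
      - localhost:9000
    user: benthos_hdfs
    directory: /benthos/batches # hypothetical directory
    path: ${!count("files")}-${!timestamp_unix_nano()}.txt
    max_in_flight: 1
    batching:
      count: 100 # illustrative: flush after 100 messages...
      period: 5s # ...or after 5 seconds, whichever comes first
      processors:
        - archive:
            format: lines
```
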