github.com/Jeffail/benthos/v3@v3.65.0/website/docs/components/processors/archive.md

---
title: archive
type: processor
status: stable
categories: ["Parsing","Utility"]
---

<!--
     THIS FILE IS AUTOGENERATED!

     To make changes please edit the contents of:
     lib/processor/archive.go
-->

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

Archives all the messages of a batch into a single message according to the
selected archive [format](#formats).

```yaml
# Config fields, showing default values
label: ""
archive:
  format: binary
  path: ${!count("files")}-${!timestamp_unix_nano()}.txt
```

Some archive formats (such as tar and zip) treat each archive item (message part)
as a file with a path. Since message parts only contain raw data, a unique path
must be generated for each part. This can be done by using function
interpolations on the `path` field as described
[here](/docs/configuration/interpolation#bloblang-queries). For formats that aren't file based
(such as binary) the `path` field is ignored.

The resulting archived message adopts the metadata of the _first_ message part
of the batch.

The functionality of this processor depends on being applied across messages
that are batched. You can find out more about batching [in this doc](/docs/configuration/batching).

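As an illustration, batching is typically configured on the input and the archive processor applied downstream. The following is a sketch assuming a Kafka input; the addresses, topic, and batching policy are placeholders, not recommendations:

```yaml
input:
  kafka:
    addresses: [ localhost:9092 ]  # placeholder broker address
    topics: [ foo ]                # placeholder topic
    batching:
      count: 10    # archive every 10 messages
      period: 1s   # or after one second, whichever comes first

pipeline:
  processors:
    - archive:
        format: lines
```

Each batch of up to ten messages would then leave the pipeline as a single newline-delimited message.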
## Fields

### `format`

The archiving [format](#formats) to apply.


Type: `string`  
Default: `"binary"`  
Options: `tar`, `zip`, `binary`, `lines`, `json_array`, `concatenate`.

### `path`

The path to set for each message in the archive (when applicable).
This field supports [interpolation functions](/docs/configuration/interpolation#bloblang-queries).


Type: `string`  
Default: `"${!count(\"files\")}-${!timestamp_unix_nano()}.txt"`  

```yaml
# Examples

path: ${!count("files")}-${!timestamp_unix_nano()}.txt

path: ${!meta("kafka_key")}-${!json("id")}.json
```

## Formats

### `concatenate`

Join the raw contents of each message into a single binary message.

### `tar`

Archive messages to a unix standard tape archive.

### `zip`

Archive messages to a zip file.

### `binary`

Archive messages to a binary blob format consisting of:

- Four bytes containing the number of messages in the batch (in big endian)
- For each message part:
  + Four bytes containing the length of the message (in big endian)
  + The contents of the message

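The framing above can be sketched in Go. This is an illustrative encoder for the layout described, not the processor's actual source; the function name `archiveBinary` is invented for the example:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// archiveBinary frames a batch of message parts as described above:
// a big-endian uint32 message count, then for each part a big-endian
// uint32 length followed by the part's raw bytes.
func archiveBinary(parts [][]byte) []byte {
	var buf bytes.Buffer
	binary.Write(&buf, binary.BigEndian, uint32(len(parts)))
	for _, p := range parts {
		binary.Write(&buf, binary.BigEndian, uint32(len(p)))
		buf.Write(p)
	}
	return buf.Bytes()
}

func main() {
	out := archiveBinary([][]byte{[]byte("foo"), []byte("hello")})
	// 4 (count) + 4+3 ("foo") + 4+5 ("hello") = 20 bytes
	fmt.Printf("%d bytes, first four: % x\n", len(out), out[:4])
}
```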
### `lines`

Join the raw contents of each message and insert a line break between each one.

### `json_array`

Attempt to parse each message as a JSON document and append the result to an
array, which becomes the contents of the resulting message.

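For instance, a batch of JSON documents could be combined into a single array message with a config sketch like:

```yaml
pipeline:
  processors:
    - archive:
        format: json_array
```

A batch containing `{"id":1}` and `{"id":2}` would then become the single message `[{"id":1},{"id":2}]`.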
## Examples

If we had JSON messages in a batch each of the form:

```json
{"doc":{"id":"foo","body":"hello world 1"}}
```

And we wished to tar archive them, setting their filenames to their respective
unique IDs (with the extension `.json`), our config might look like
this:

```yaml
pipeline:
  processors:
    - archive:
        format: tar
        path: ${!json("doc.id")}.json
```