github.com/Jeffail/benthos/v3@v3.65.0/website/docs/components/processors/about.md

---
title: Processors
sidebar_label: About
---

Benthos processors are functions applied to messages passing through a pipeline. The function signature allows a processor to mutate or drop messages depending on their content. There are many types on offer, but the most powerful is the [`bloblang` processor][processor.bloblang].

Processors are set via config, and where they are placed determines when they run: immediately after a specific input (set in the input section), on all messages (set in the pipeline section), or before a specific output (set in the output section). Most processors apply to all messages and can be placed in the pipeline section:

```yaml
pipeline:
  threads: 1
  processors:
    - label: my_cool_mapping
      bloblang: |
        root.message = this
        root.meta.link_count = this.links.length()
```

The `threads` field in the pipeline section determines how many parallel processing threads are created. You can read more about parallel processing in the [pipeline guide][pipelines].

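As a rough sketch of the other two placements described above, the following config attaches one processor to an input and another to an output. The `stdin` and `stdout` components and the mappings themselves are illustrative assumptions, not something this page prescribes:

```yaml
input:
  stdin: {}
  # Runs only on messages from this input, before the pipeline section.
  processors:
    - bloblang: root = content().string().uppercase()

output:
  stdout: {}
  # Runs only on messages about to be written to this output.
  processors:
    - bloblang: root = content().string().trim()
```

Placing a processor on a single input or output is mostly useful when a config has several of them (for example inside a broker) and only one needs the extra step.
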
## Labels

Processors have an optional field `label` that can uniquely identify them in observability data such as metrics and logs. This is useful when running configs with multiple nested processors; without a label, their metrics labels are generated based on their composition. For more information, check out the [metrics documentation][metrics.about].

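A minimal sketch of labelling nested processors might look like the following. The label names `per_message_steps` and `tag_as_processed` and the `processed` field are hypothetical, chosen only for illustration:

```yaml
pipeline:
  processors:
    # Label on the parent processor.
    - label: per_message_steps
      for_each:
        # Label on the nested child processor.
        - label: tag_as_processed
          bloblang: |
            root = this
            root.processed = true
```
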
## Error Handling

Some processors have conditions under which they might fail. Rather than throw these messages into the abyss, Benthos still attempts to send them onwards, and has mechanisms for filtering, recovering or dead-letter queuing failed messages, which are described in the [error handling guide][error_handling].

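As one possible illustration of these mechanisms, the sketch below logs and then drops messages that failed an earlier step, using the `catch` processor. The `id` field and the choice to drop rather than recover are assumptions made for the example:

```yaml
pipeline:
  processors:
    # Fails for any message where `id` cannot be parsed as a number.
    - bloblang: |
        root = this
        root.id = this.id.number()

    # `catch` runs its child processors only on messages that failed above.
    - catch:
        - log:
            level: ERROR
            message: "Processing failed due to: ${! error() }"
        # Drop the failed message instead of forwarding it.
        - bloblang: root = deleted()
```

Messages that did not fail skip the `catch` block entirely and continue through the pipeline unchanged.
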
## Using Processors as Outputs

It might be the case that a processor with a side effect, such as the [`sql_insert`][processor.sql_insert] or [`redis`][processor.redis] processors, performs the only side effect of a pipeline, and therefore could be considered the output.

In such cases it's possible to place these processors within a [`reject` output][output.reject] so that they behave the same as regular outputs, where success results in dropping the message with an acknowledgement and failure results in a nack (or retry):

```yaml
output:
  reject: 'failed to send data: ${! error() }'
  processors:
    - try:
        - redis:
            url: tcp://localhost:6379
            operator: sadd
            key: ${! json("foo") }
        - bloblang: root = deleted()
```

The way this works is that if your processor with the side effect (`redis` in this case) succeeds, then the final `bloblang` processor deletes the message, which results in an acknowledgement. If the processor fails, then the `try` block exits early without executing the `bloblang` processor and the message is instead routed to the `reject` output, which nacks the message with an error message containing the error obtained from the `redis` processor.

import ComponentsByCategory from '@theme/ComponentsByCategory';

## Categories

<ComponentsByCategory type="processors"></ComponentsByCategory>

## Batching and Multiple Part Messages

All Benthos processors support multiple part messages, which are synonymous with batches. This enables some cool [windowed processing][windowed_processing] capabilities.

Many processors are able to perform their behaviours on specific parts of a message batch, or on all parts, and have a field `parts` for specifying an array of part indexes they should apply to. If the list of target parts is empty these processors will be applied to all message parts.

Part indexes can be negative, in which case the part is selected from the end, counting backwards starting from -1. For example, if part = -1 then the last part of the message is selected, if part = -2 then the part before the last is selected, and so on.

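To make the indexing concrete, here is a small sketch that targets only the first and last parts of each batch. It assumes the `decompress` processor, which is used here purely as an example of a processor with a `parts` field:

```yaml
pipeline:
  processors:
    - decompress:
        algorithm: gzip
        # 0 selects the first part, -1 selects the last part.
        # An empty list would apply the processor to every part.
        parts: [ 0, -1 ]
```
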
Some processors, such as [`dedupe`][processor.dedupe], act across an entire batch when we might instead like to apply them to individual messages of a batch. In this case the [`for_each`][processor.for_each] processor can be used.

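A hedged sketch of that pattern is shown below. The cache label `dedupe_cache`, the `id` key and the in-memory cache settings are assumptions for the example and would need adjusting for a real deployment:

```yaml
pipeline:
  processors:
    # Evaluate `dedupe` once per message rather than once per batch.
    - for_each:
        - dedupe:
            cache: dedupe_cache
            key: ${! json("id") }

cache_resources:
  - label: dedupe_cache
    memory:
      ttl: 300 # assumed to be seconds
```
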
You can read more about batching [in this document][batching].

[error_handling]: /docs/configuration/error_handling
[batching]: /docs/configuration/batching
[windowed_processing]: /docs/configuration/windowed_processing
[pipelines]: /docs/configuration/processing_pipelines
[metrics.about]: /docs/components/metrics/about
[output.reject]: /docs/components/outputs/reject
[processor.sql_insert]: /docs/components/processors/sql_insert
[processor.redis]: /docs/components/processors/redis
[processor.bloblang]: /docs/components/processors/bloblang
[processor.split]: /docs/components/processors/split
[processor.dedupe]: /docs/components/processors/dedupe
[processor.for_each]: /docs/components/processors/for_each