---
title: Processing Pipelines
---

Within a Benthos configuration, in between `input` and `output`, is a `pipeline` section. This section describes an array of [processors][processors] that are to be applied to *all* messages, and are not bound to any particular input or output.

If you have processors that are heavy on CPU and aren't specific to a certain input or output, they are best suited for the pipeline section. It is advantageous to use the pipeline section as it allows you to set an explicit number of parallel threads of execution:

```yaml
input:
  resource: foo

pipeline:
  threads: 4
  processors:
    - bloblang: |
        root = this
        fans = fans.map_each(match {
          this.obsession > 0.5 => this
          _ => deleted()
        })

output:
  resource: bar
```

If the field `threads` is set to `0` it will automatically match the number of logical CPUs available.

By default almost all Benthos sources will utilise as many processing threads as have been configured, which makes horizontal scaling easy. However, this configuration would not be optimal if our input isn't able to utilise more than one processing thread. Any such limitation is mentioned in the input's documentation ([`kafka`][kafka-input], for example).

It's also possible that the input source isn't able to provide enough traffic to fully saturate our processing threads. The following patterns can help you to achieve a distribution of work across these processing threads even under those circumstances.

### Multiple Consumers

Sometimes our source of data can have many connected clients and will distribute a stream of messages amongst them. In that case it is possible to increase utilisation of parallel processing threads by adding more consumers.
This can be done with a [`broker` input][broker-input]:

```yaml
input:
  broker:
    copies: 8
    inputs:
      - resource: baz

pipeline:
  threads: 4
  processors:
    - bloblang: |
        root = this
        fans = fans.map_each(match {
          this.obsession > 0.5 => this
          _ => deleted()
        })

output:
  resource: bar
```

The disadvantage of this setup is that increasing the number of consuming clients potentially puts unnecessary stress on your data source.

### Add a Buffer

[Buffers][buffers] should be used with caution as they weaken the delivery guarantees of your pipeline. However, they can be very useful for horizontally scaling processing in cases where an input feed is sporadic, as they can level out throughput spikes and provide a backlog of messages during gaps.

```yaml
input:
  resource: foo

buffer:
  memory:
    limit: 5000000

pipeline:
  threads: 4
  processors:
    - bloblang: |
        root = this
        fans = fans.map_each(match {
          this.obsession > 0.5 => this
          _ => deleted()
        })

output:
  resource: bar
```

[processors]: /docs/components/processors/about
[split-proc]: /docs/components/processors/split
[broker-input]: /docs/components/inputs/broker
[kafka-input]: /docs/components/inputs/kafka
[buffers]: /docs/components/buffers/about
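
As a small illustration of the `threads` field described earlier, the following sketch sets it to `0` so that the thread count follows the number of logical CPUs rather than a fixed value. It reuses the `foo` and `bar` resource names from the examples above, and the mapping is only a placeholder standing in for a CPU-heavy processor, not a recommended configuration.

```yaml
input:
  resource: foo

pipeline:
  # 0 matches the number of logical CPUs available
  threads: 0
  processors:
    # Placeholder mapping (assumption): substitute your own CPU-heavy processors
    - bloblang: |
        root = this

output:
  resource: bar
```

This can be convenient when the same config is deployed to machines of different sizes, although the patterns above may still be needed if the input cannot keep every thread busy.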