---
title: Write a Benthos Plugin
author: Ashley Jeffs
author_url: https://github.com/Jeffail
author_image_url: /img/ash.jpg
description: I made it difficult for our job security
keywords: [
    "benthos",
    "plugin",
    "go",
    "golang",
    "stream processor",
]
tags: [ "Plugins" ]
---

I'm going to walk you through writing a Benthos plugin from scratch in Go.

<!--truncate-->

Too lazy to read? You can find a video equivalent of this post at: [https://youtu.be/Ilah_Y0uMk4](https://youtu.be/Ilah_Y0uMk4). If you prefer to dig straight into code then you should check out the [benthos-plugin-example][plugin-repo] repo.

Plugins allow you to embed your code within Benthos as a component. [Processors][benthos-proc] are the most common type of component to get plugged, which is what we're going to do in this post. If you want to run non-Go code from Benthos then you still have options, such as the [`subprocess`][subprocess-proc], [`http`][http-proc] or [`lambda`][lambda-proc] processors.

## Roleplay

Imagine you are a competent engineer. You wrote a function to detect sarcasm in internet posts on a linear scale of 0 to 100:

```go
// HowSarcastic TOTALLY detects sarcasm EVERY time.
func HowSarcastic(content []byte) float64 {
	if bytes.Contains(content, []byte("/s")) {
		return 100
	}
	return 0
}
```

You are confident that `HowSarcastic` is 100% accurate and wish to apply it to a continuous stream of data by deploying it within a stream processing solution.

You want this service to be resilient with at-least-once delivery guarantees, scalable both horizontally and vertically, and able to expose various metrics about the health of the data stream.

You have decided to use Benthos for this service because you love the logo.

### Stuff you don't need to care about yet

Since you are using Benthos you don't need to choose a queue system, metrics aggregator or deployment platform yet; those items can be configured.

You don't even need to know what format the data comes in or how it needs to look when it leaves your service, as Benthos [has plenty of processors][processors] for configuring that stuff on the fly.

## Getting Started

You're going to use Go modules for this one, so make a directory and create a `go.mod` file:

```sh
mkdir foo && cd foo
go mod init github.com/bar/foo
```

Next, you need to pull in your only dependency, [Benthos][benthos-repo]:

```sh
go get github.com/Jeffail/benthos/v3
# Look! Now you have more dependencies than friends!
```

That'll automatically add the dep to your `go.mod` file at the latest v3 tag. Next, you're going to write your stream processor service. Write this to the file `main.go`:

```go
package main

import (
	"github.com/Jeffail/benthos/v3/lib/service"
)

func main() {
	service.Run()
}
```

That's it, you've got a full Benthos. If you want to verify then you can run it:

```sh
go run ./main.go --help
```

## Write Your Plugin

Now you will write the actual plugin that executes your function. Processor plugins implement [`types.Processor`][types-processor] and have the signature:

```go
func ProcessMessage(msg types.Message) ([]types.Message, types.Response)
```

A message can have multiple parts (synonymous with a batch), and we are allowed to return either one or more messages or a response, which is either an ack or a noack.

A message part has both content and any number of metadata key/value pairs. It is therefore up to you whether you modify the contents of messages or add the sarcasm level as metadata.

Thankfully you don't need to make that decision now.
Instead, you're going to expose it as a config field and support both. The config field will be called `metadata_key`, and if left empty the contents of messages will be replaced entirely with the sarcasm level.

There won't be much code needed so for brevity you are going to write this straight into your `main.go` file:

```go
// SarcasmProc applies our sarcasm detector to messages.
type SarcasmProc struct {
	MetadataKey string `json:"metadata_key" yaml:"metadata_key"`
}

// ProcessMessage returns messages mutated with their sarcasm level.
func (s *SarcasmProc) ProcessMessage(msg types.Message) ([]types.Message, types.Response) {
	newMsg := msg.Copy()

	newMsg.Iter(func(i int, p types.Part) error {
		sarcasm := HowSarcastic(p.Get())
		sarcasmStr := strconv.FormatFloat(sarcasm, 'f', -1, 64)

		if len(s.MetadataKey) > 0 {
			p.Metadata().Set(s.MetadataKey, sarcasmStr)
		} else {
			p.Set([]byte(sarcasmStr))
		}
		return nil
	})

	return []types.Message{newMsg}, nil
}

// CloseAsync does nothing.
func (s *SarcasmProc) CloseAsync() {}

// WaitForClose does nothing.
func (s *SarcasmProc) WaitForClose(timeout time.Duration) error {
	return nil
}
```

Let's break this down. You have a struct called `SarcasmProc`, which contains a configuration field `MetadataKey`. The functions `CloseAsync` and `WaitForClose` can be ignored as your processor doesn't contain any state that requires termination.

Within your function `ProcessMessage` you iterate all the payloads within the message batch and calculate the sarcasm level with your function `HowSarcastic`. The result is converted into a string and, depending on whether a metadata key has been set, replaces the contents with the result or sets a new metadata value on the payload.

That's your processor completed. Now you need to register the plugin before calling `service.Run`.
Since this is a processor plugin you're going to call [`processor.RegisterPlugin`][proc-register-plugin]:

```go
func main() {
	processor.RegisterPlugin(
		"how_sarcastic",
		func() interface{} {
			s := SarcasmProc{}
			return &s
		},
		func(
			iconf interface{},
			mgr types.Manager,
			logger log.Modular,
			stats metrics.Type,
		) (types.Processor, error) {
			return iconf.(*SarcasmProc), nil
		},
	)

	service.Run()
}
```

The first argument is a string that identifies the type of this plugin; it's the string used to specify it within a Benthos config file.

The second argument is a function that creates our config structure, which will be embedded within the Benthos config specification. In this case our processor implementation is the same type as the configuration struct, but you can separate them if you prefer.

The third argument is the generic function that constructs our processor. In this case we've already constructed it as our configuration type and so we can simply type-assert it and return it.

You can download the full file here: [`main.go`][full-example]

Now you're going to build your custom Benthos with:

```sh
go build -o benthos
```

## Run Your Plugin

In order to execute your plugin with Benthos you need a config. Write the following to a file `config.yaml`:

```yaml
pipeline:
  processors:
  - type: how_sarcastic
```

And run it:

```sh
./benthos -c ./config.yaml
```

Your config hasn't specified an input or output so they will default to `stdin` and `stdout`. Write the line `'this is not sarcastic'`, followed by the line `'this is sarcastic /s'`. Benthos should print `0` and `100` respectively.

Cool, but this config is pretty useless, good job idiot. Now you're going to fix your mistake.
Let's imagine you are processing a stream of JSON documents of the form `{"id":"fooid","content":"this is the content"}` and you want to add a field `sarcasm` containing the sarcasm level of `content`. You can do that purely through config by using the [`json`][json-proc] and [`process_field`][process-field-proc] processors:

```yaml
pipeline:
  processors:
  - json:
      operator: copy
      path: content
      value: sarcasm
  - process_field:
      path: sarcasm
      result_type: float
      processors:
      - type: how_sarcastic
```

Run that config with some JSON documents:

```sh
echo '{"id":"foo1","content":"this is totally sarcastic /s"}
{"id":"foo2","content":"but this isnt sarcastic at all"}' |
  ./benthos -c ./config.yaml
```

You'll see some log events, but you should also see your two modified documents:

```text
{"content":"this is totally sarcastic /s","id":"foo1","sarcasm":100}
{"content":"but this isnt sarcastic at all","id":"foo2","sarcasm":0}
```

That's much more useful, but this is just barely scratching the surface of what Benthos can do. For example, here's a config that calculates sarcasm with your processor and removes anything with a sarcasm level at or above 80:

```yaml
pipeline:
  processors:
  - type: how_sarcastic
    plugin:
      metadata_key: sarcasm
  - filter_parts:
      metadata:
        operator: less_than
        key: sarcasm
        arg: 80
```

Note that it makes use of your `metadata_key` field in order to filter the documents without changing their content.

Try experimenting with other Benthos processors; you can find the documentation at [benthos.dev/docs/components/processors/about][processors].

## Next Steps

After playing around with Benthos processors you should check out the various [inputs][inputs], [outputs][outputs], [metrics aggregators][metrics] and [tracers][tracers] that it's able to hook up with.

For example, here's a modified version of the previous config where we read from Kafka and write to an S3 bucket, sending our metrics to Prometheus:

```yaml
http:
  address: 0.0.0.0:4195

input:
  kafka:
    addresses:
    - localhost:9092
    consumer_group: foo_consumer_group
    topics:
    - foo_stream

pipeline:
  processors:
  - type: how_sarcastic
    plugin:
      metadata_key: sarcasm
  - filter_parts:
      metadata:
        operator: less_than
        key: sarcasm
        arg: 80

output:
  s3:
    bucket: foo_bucket
    content_type: application/json
    path: ${!metadata:kafka_key}-${!timestamp_unix_nano}-${!count:files}.json

metrics:
  # Endpoint hosted at both :4195/stats and :4195/metrics
  type: prometheus
```

I'm sure you'll make great use of Benthos plugins with your extremely important work /s.

[benthos]: https://www.benthos.dev
[benthos-repo]: https://github.com/Jeffail/benthos
[plugin-repo]: https://github.com/benthosdev/benthos-plugin-example
[processors]: https://benthos.dev/docs/components/processors/about
[inputs]: https://benthos.dev/docs/components/inputs/about
[outputs]: https://benthos.dev/docs/components/outputs/about
[metrics]: https://benthos.dev/docs/components/metrics/about
[tracers]: https://benthos.dev/docs/components/tracers/about
[benthos-proc]: https://benthos.dev/docs/components/processors/about
[json-proc]: https://benthos.dev/docs/components/processors/json
[subprocess-proc]: https://benthos.dev/docs/components/processors/subprocess
[http-proc]: https://benthos.dev/docs/components/processors/http
[lambda-proc]: https://benthos.dev/docs/components/processors/lambda
[process-field-proc]: https://benthos.dev/docs/components/processors/process_field
[full-example]: /snippets/write-a-benthos-plugin/main.go
[types-processor]: https://godoc.org/github.com/Jeffail/benthos/lib/types#Processor
[proc-register-plugin]: https://godoc.org/github.com/Jeffail/benthos/lib/processor#RegisterPlugin