---
id: custom-metrics
title: Custom Metrics
description: Learn how to emit custom metrics from messages.
---

You can't build cool graphs without metrics, and [Benthos emits many][internal-metrics]. However, occasionally you might want to also emit custom metrics that track data extracted from messages being processed. In this cookbook we'll explore how to achieve this by configuring Benthos to pull download stats from Github, Dockerhub and Homebrew and emit them as gauges.

## The Basics

Firstly, we need to target an API so let's start with the nice and simple Homebrew API, which we'll poll every 60 seconds.

We can either do it with an [`http_client` input][inputs.http_client] and a [rate limit][rate_limits] that restricts us to one request per 60 seconds, or we can use a [`generate` input][inputs.generate] to generate a message every 60 seconds that triggers an [`http` processor][processors.http]:

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

<Tabs defaultValue="Processor" values={[
  { label: 'With Processor', value: 'Processor', },
  { label: 'With Input', value: 'Input', },
]}>

<TabItem value="Processor">

```yaml
input:
  generate:
    interval: 60s
    mapping: root = ""

pipeline:
  processors:
    - http:
        url: https://formulae.brew.sh/api/formula/benthos.json
        verb: GET
```

</TabItem>

<TabItem value="Input">

```yaml
input:
  http_client:
    url: https://formulae.brew.sh/api/formula/benthos.json
    verb: GET
    rate_limit: brewlimit

rate_limit_resources:
  - label: brewlimit
    local:
      count: 1
      interval: 60s
```

</TabItem>

</Tabs>


For this cookbook we'll continue with the processor option as it makes it easier to deploy it as a [scheduled lambda function][serverless.lambda] later on, which is how I'm currently doing it in
real life.

The homebrew formula API gives us a JSON blob that looks like this (removing fields we're not interested in, and with numbers inflated relative to my ego):

```json
{
  "name":"benthos",
  "desc":"Stream processor for mundane tasks written in Go",
  "analytics":{"install":{"30d":{"benthos":78978979},"90d":{"benthos":253339124},"365d":{"benthos":681356871}}}
}
```

This format makes it fairly easy to emit the value of `analytics.install.30d.benthos` as a gauge with the [`metric` processor][processors.metric]:

```yaml
http:
  address: 0.0.0.0:4195

input:
  generate:
    interval: 60s
    mapping: root = ""

pipeline:
  processors:
    - http:
        url: https://formulae.brew.sh/api/formula/benthos.json
        verb: GET

    - metric:
        type: gauge
        name: downloads
        labels:
          source: homebrew
        value: ${! json("analytics.install.30d.benthos") }

    - bloblang: root = deleted()

metrics:
  prometheus:
    prefix: benthos
    path_mapping: if this != "downloads" { deleted() }
```

With the above config we have selected the [`prometheus` metrics type][metrics.prometheus], which allows us to use [Prometheus][prometheus] to scrape metrics from Benthos by polling its HTTP API at the url `http://localhost:4195/stats`.

We have also specified a [`path_mapping`][metrics.prometheus.path_mapping] that deletes any internal metrics usually emitted by Benthos by filtering on our custom metric name.

Finally, there's also a [`bloblang` processor][processors.bloblang] added to the end of our pipeline that deletes all messages since we're not interested in sending the raw data anywhere after this point anyway.
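
On the Prometheus side, a scrape job along these lines should pick up the gauge. This is a minimal sketch rather than part of the cookbook's own setup: the job name `benthos` and the `localhost:4195` target are assumptions that mirror the `http.address` and `/stats` path used above, and the 60 second scrape interval is simply chosen to match the polling interval.

```yaml
# prometheus.yml (sketch): scrape the Benthos HTTP server configured above.
scrape_configs:
  - job_name: benthos            # hypothetical job name
    metrics_path: /stats         # the path Benthos serves its metrics on in this cookbook
    scrape_interval: 60s         # assumption: matches the 60s API poll interval
    static_configs:
      - targets: ['localhost:4195']
```

Scraping more often than the APIs are polled would only return the same gauge value repeatedly, which is why the two intervals are aligned here.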

With this Benthos config running, you can verify that our custom metric is emitted with `curl`:

```sh
curl -s http://localhost:4195/stats | grep downloads
```

Giving something like:

```text
# HELP benthos_downloads Benthos Gauge metric
# TYPE benthos_downloads gauge
benthos_downloads{source="homebrew"} 78978979
```

Easy! The Dockerhub API is also pretty simple, and adding it to our pipeline is just:

<Tabs defaultValue="Diff" values={[
  { label: 'Diff', value: 'Diff', },
  { label: 'Full Config', value: 'Full Config', },
]}>

<TabItem value="Diff">

```diff
           source: homebrew
         value: ${! json("analytics.install.30d.benthos") }
 
+    - bloblang: root = ""
+
+    - http:
+        url: https://hub.docker.com/v2/repositories/jeffail/benthos/
+        verb: GET
+
+    - metric:
+        type: gauge
+        name: downloads
+        labels:
+          source: dockerhub
+        value: ${! json("pull_count") }
+
     - bloblang: root = deleted()
```
</TabItem>

<TabItem value="Full Config">

```yaml
http:
  address: 0.0.0.0:4195

input:
  generate:
    interval: 60s
    mapping: root = ""

pipeline:
  processors:
    - http:
        url: https://formulae.brew.sh/api/formula/benthos.json
        verb: GET

    - metric:
        type: gauge
        name: downloads
        labels:
          source: homebrew
        value: ${! json("analytics.install.30d.benthos") }

    - bloblang: root = ""

    - http:
        url: https://hub.docker.com/v2/repositories/jeffail/benthos/
        verb: GET

    - metric:
        type: gauge
        name: downloads
        labels:
          source: dockerhub
        value: ${! json("pull_count") }

    - bloblang: root = deleted()

metrics:
  prometheus:
    prefix: benthos
    path_mapping: if this != "downloads" { deleted() }
```

</TabItem>

</Tabs>

## Harder Example

So that's the basics covered.
Next, we're going to target the Github releases API which gives a slightly more complex payload that looks something like this:

```json
[
  {
    "tag_name": "X.XX.X",
    "assets":[
      {"name":"benthos-lambda_X.XX.X_linux_amd64.zip","download_count":543534545},
      {"name":"benthos_X.XX.X_darwin_amd64.tar.gz","download_count":43242342},
      {"name":"benthos_X.XX.X_freebsd_amd64.tar.gz","download_count":534565656},
      {"name":"benthos_X.XX.X_linux_amd64.tar.gz","download_count":743282474324}
    ]
  }
]
```

It's an array of objects, one for each tagged release, each with a field `assets` which is an array of objects representing the release assets. We want to emit a separate download gauge for each of those assets. In order to do this we're going to use a [`bloblang` processor][processors.bloblang] to remap the payload from Github into an array of objects of the following form:

```json
[
  {"source":"github","dist":"lambda_linux_amd64","download_count":543534545,"version":"X.XX.X"},
  {"source":"github","dist":"darwin_amd64","download_count":43242342,"version":"X.XX.X"},
  {"source":"github","dist":"freebsd_amd64","download_count":534565656,"version":"X.XX.X"},
  {"source":"github","dist":"linux_amd64","download_count":743282474324,"version":"X.XX.X"}
]
```

Then we can use an [`unarchive` processor][processors.unarchive] with the format `json_array` to expand this array into N individual messages, one for each asset. Finally, we will follow up with a [`metric` processor][processors.metric] that dynamically sets labels following the fields `source`, `dist` and `version` so that we have a separate metrics series for each asset type for each tagged version.
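
To see the `unarchive` step in isolation, a throwaway config like the following can help. It is only a sketch: it assumes the `count` field of the `generate` input and the `stdout` output behave as in Benthos v3, and it swaps the real API call for a hard-coded array taken from the sample above.

```yaml
# Sketch: feed a hard-coded array through unarchive and print the resulting messages.
input:
  generate:
    count: 1 # emit a single message, then stop
    mapping: |
      root = [
        {"source":"github","dist":"darwin_amd64","download_count":43242342,"version":"X.XX.X"},
        {"source":"github","dist":"linux_amd64","download_count":743282474324,"version":"X.XX.X"}
      ]

pipeline:
  processors:
    - unarchive:
        format: json_array

output:
  stdout: {}
```

Running it should print each array element as its own message, which is the shape the `metric` processor needs in order to read `dist`, `source` and `version` from a single object.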

A simple pipeline of these steps would look like this (please forgive the regexp):

```yaml
http:
  address: 0.0.0.0:4195

input:
  generate:
    interval: 60s
    mapping: root = ""

pipeline:
  processors:
    - http:
        url: https://api.github.com/repos/Jeffail/benthos/releases
        verb: GET

    - bloblang: |
        root = this.map_each(release -> release.assets.map_each(asset -> {
          "source":         "github",
          "dist":           asset.name.re_replace("^benthos-?((lambda_)|_)[0-9\\.]+(-rc[0-9]+)?_([^\\.]+).*", "$2$4"),
          "download_count": asset.download_count,
          "version":        release.tag_name.trim("v"),
        }).filter(asset -> asset.dist != "checksums")).flatten()

    - unarchive:
        format: json_array

    - metric:
        type: gauge
        name: downloads
        labels:
          dist: ${! json("dist") }
          source: ${! json("source") }
          version: ${! json("version") }
        value: ${! json("download_count") }

    - bloblang: root = deleted()

metrics:
  prometheus:
    prefix: benthos
    path_mapping: if this != "downloads" { deleted() }
```

Finally, let's combine all the custom metrics into one pipeline.

## Combining into a Workflow

Okay I'm getting bored now so let's wrap this up. The following config expands on the previous examples by configuring each API poll as a [`branch` processor][processors.branch], which allows us to run them within a [`workflow` processor][processors.workflow] that can execute all three branches in parallel.

The [`metric` processors][processors.metric] have also been combined into a single reusable resource by updating the other API calls to format their payloads into the same structure as our Github remap.

```yaml
http:
  address: 0.0.0.0:4195

input:
  generate:
    interval: 60s
    mapping: root = {}

pipeline:
  processors:
    - workflow:
        meta_path: results
        order: [ [ dockerhub, github, homebrew ] ]

processor_resources:
  - label: dockerhub
    branch:
      request_map: 'root = ""'
      processors:
        - try:
          - http:
              url: https://hub.docker.com/v2/repositories/jeffail/benthos/
              verb: GET
          - bloblang: |
              root.source = "docker"
              root.dist = "docker"
              root.download_count = this.pull_count
              root.version = "all"
          - resource: metric_gauge

  - label: github
    branch:
      request_map: 'root = ""'
      processors:
        - try:
          - http:
              url: https://api.github.com/repos/Jeffail/benthos/releases
              verb: GET
          - bloblang: |
              root = this.map_each(release -> release.assets.map_each(asset -> {
                "source":         "github",
                "dist":           asset.name.re_replace("^benthos-?((lambda_)|_)[0-9\\.]+(-rc[0-9]+)?_([^\\.]+).*", "$2$4"),
                "download_count": asset.download_count,
                "version":        release.tag_name.trim("v"),
              }).filter(asset -> asset.dist != "checksums")).flatten()
          - unarchive:
              format: json_array
          - resource: metric_gauge
          - bloblang: 'root = if batch_index() != 0 { deleted() }'

  - label: homebrew
    branch:
      request_map: 'root = ""'
      processors:
        - try:
          - http:
              url: https://formulae.brew.sh/api/formula/benthos.json
              verb: GET
          - bloblang: |
              root.source = "homebrew"
              root.dist = "homebrew"
              root.download_count = this.analytics.install.30d.benthos
              root.version = "all"
          - resource: metric_gauge

  - label: metric_gauge
    metric:
      type: gauge
      name: downloads
      labels:
        dist: ${! json("dist") }
        source: ${! json("source") }
        version: ${! json("version") }
      value: ${! json("download_count") }

metrics:
  prometheus:
    prefix: benthos
    path_mapping: if this != "downloads" { deleted() }
```

[serverless.lambda]: /docs/guides/serverless/lambda
[internal-metrics]: /docs/components/metrics/about
[inputs.http_client]: /docs/components/inputs/http_client
[inputs.generate]: /docs/components/inputs/generate
[processors.workflow]: /docs/components/processors/workflow
[processors.branch]: /docs/components/processors/branch
[processors.unarchive]: /docs/components/processors/unarchive
[processors.bloblang]: /docs/components/processors/bloblang
[processors.http]: /docs/components/processors/http
[processors.metric]: /docs/components/processors/metric
[rate_limits]: /docs/components/rate_limits/about
[metrics.prometheus]: /docs/components/metrics/prometheus
[metrics.prometheus.path_mapping]: /docs/components/metrics/prometheus#path_mapping
[prometheus]: https://prometheus.io/