---
title: Advanced Bloblang
sidebar_label: Advanced
description: Some advanced Bloblang patterns
---

## Map Parameters

A map definition only has one input parameter, which is the context that it is called upon:

```coffee
map formatting {
  root = "(%v)".format(this)
}

root.a = this.a.apply("formatting")
root.b = this.b.apply("formatting")

# In:  {"a":"foo","b":"bar"}
# Out: {"a":"(foo)","b":"(bar)"}
```

However, in cases where we wish to provide multiple named parameters to a mapping we can execute it on an object literal for the same effect. Imagine we wanted a map exactly like the one above except with the pattern `[%v]` instead, with the potential for even more patterns in the future. To do that we can pass an object with a field `value` containing our target to map, and a field `pattern` which allows us to specify the pattern to apply:

```coffee
map formatting {
  root = this.pattern.format(this.value)
}

root.a = {
  "value": this.a,
  "pattern": this.pattern,
}.apply("formatting")

root.b = {
  "value": this.b,
  "pattern": this.pattern,
}.apply("formatting")

# In:  {"a":"foo","b":"bar","pattern":"[%v]"}
# Out: {"a":"[foo]","b":"[bar]"}
```

## Walking the Tree

Sometimes it's necessary to perform a mapping on all values within an unknown tree structure.
You can do that easily with recursive mapping:

```coffee
map unescape_values {
  root = match {
    this.type() == "object" => this.map_each(item -> item.value.apply("unescape_values")),
    this.type() == "array" => this.map_each(ele -> ele.apply("unescape_values")),
    this.type() == "string" => this.unescape_html(),
    this.type() == "bytes" => this.unescape_html(),
    _ => this,
  }
}
root = this.apply("unescape_values")

# In:  {"first":{"nested":"foo &amp; bar"},"second":10,"third":["1 &lt; 2",{"also_nested":"2 &gt; 1"}]}
# Out: {"first":{"nested":"foo & bar"},"second":10,"third":["1 < 2",{"also_nested":"2 > 1"}]}
```

## Message Expansion

Expanding a single message into multiple messages can be done by mapping messages into an array and following it up with an [`unarchive` processor][processors.unarchive]. For example, given documents of this format:

```json
{
  "id": "foobar",
  "items": [
    {"content":"foo"},
    {"content":"bar"},
    {"content":"baz"}
  ]
}
```

We can pull `items` out to the root with `root = this.items` in a [`bloblang` processor][processors.bloblang] and follow it with an [`unarchive` processor][processors.unarchive] to expand each element into its own independent message:

```yaml
pipeline:
  processors:
    - bloblang: root = this.items
    - unarchive:
        format: json_array
```

However, most of the time we also need to map the elements before expanding them, and often that includes copying fields from outside of our target array.
We can do that with methods such as `map_each` and `merge`:

```coffee
root = this.items.map_each(ele -> this.without("items").merge(ele))

# In:  {"id":"foobar","items":[{"content":"foo"},{"content":"bar"},{"content":"baz"}]}
# Out: [{"content":"foo","id":"foobar"},{"content":"bar","id":"foobar"},{"content":"baz","id":"foobar"}]
```

However, the above mapping is slightly inefficient, as the `this.without("items")` part creates a fresh copy of our source object for each element. A more efficient approach is to capture that query within a variable:

```coffee
let doc_root = this.without("items")
root = this.items.map_each($doc_root.merge(this))

# In:  {"id":"foobar","items":[{"content":"foo"},{"content":"bar"},{"content":"baz"}]}
# Out: [{"content":"foo","id":"foobar"},{"content":"bar","id":"foobar"},{"content":"baz","id":"foobar"}]
```

Also note that when we set `doc_root` we remove the field `items` from the target document. The full config would now be:

```yaml
pipeline:
  processors:
    - bloblang: |
        let doc_root = this.without("items")
        root = this.items.map_each($doc_root.merge(this))
    - unarchive:
        format: json_array
```

## Creating CSV

Benthos has a few different ways of outputting a stream of CSV data, but the most flexible is to convert documents into CSV rows with Bloblang, as this gives you full control over exactly how the schema is generated, how erroneous data is handled, and how column data is escaped.

A common and simple use case is to flatten documents and write out the column values in alphabetical key order, with the very first row of the output being a header row containing those column names.
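For instance, given the hypothetical input document `{"b":2,"a":"hello, world"}` (flat for simplicity), the first message written to the file should be these two lines, with the comma in the value forcing that cell to be quoted:

```csv
a,b
"hello, world",2
```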
Here's a mapping that achieves this by using a `count` function to detect the very first invocation of the mapping in a stream pipeline:

```coffee
map escape_csv {
  root = if this.re_match("[\"\n,]+") {
    "\"" + this.replace("\"", "\"\"") + "\""
  } else {
    this
  }
}

# Extract key/value pairs as an array and sort by the key
let kvs = this.key_values().sort_by(v -> v.key)

# Create a header prefix for our output only on the first row
let header = if count("rows_in_file") == 1 {
  $kvs.map_each(kv -> kv.key.apply("escape_csv")).join(",") + "\n"
} else { "" }

root = $header + $kvs.map_each(kv -> kv.value.string().apply("escape_csv")).join(",")
```

And with this mapping we can write the data to a newly created CSV file using an output with a simple `lines` codec:

```yaml
output:
  file:
    path: ./result.csv
    codec: lines
```

Perhaps the first expansion of this mapping that would be worthwhile is to add an explicit list of column names, or at least confirm that the number of values in a row matches an expected count.

[processors.bloblang]: /docs/components/processors/bloblang
[processors.unarchive]: /docs/components/processors/unarchive
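As a sketch of that expansion (the column names below are hypothetical, not part of any real schema), we could declare the expected columns up front and pull each value by key, with `not_null` causing the mapping to error on documents missing a column rather than silently writing an empty cell:

```coffee
# Hypothetical explicit schema: only these columns, in this order.
let columns = ["id", "name", "score"]

map escape_csv {
  root = if this.re_match("[\"\n,]+") {
    "\"" + this.replace("\"", "\"\"") + "\""
  } else {
    this
  }
}

# Header row only on the first invocation, as before.
let header = if count("rows_in_file") == 1 {
  $columns.map_each(col -> col.apply("escape_csv")).join(",") + "\n"
} else { "" }

# Look up each declared column; not_null() rejects documents missing one.
root = $header + $columns.map_each(col -> this.get(col).not_null().string().apply("escape_csv")).join(",")
```

Documents that fail the `not_null` check would then surface as mapping errors, which you can route or drop with your usual error-handling configuration.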