github.com/Jeffail/benthos/v3@v3.65.0/website/docs/guides/bloblang/advanced.md

---
title: Advanced Bloblang
sidebar_label: Advanced
description: Some advanced Bloblang patterns
---

## Map Parameters

A map definition only has one input parameter, which is the context that it is called upon:

```coffee
map formatting {
  root = "(%v)".format(this)
}

root.a = this.a.apply("formatting")
root.b = this.b.apply("formatting")

# In:  {"a":"foo","b":"bar"}
# Out: {"a":"(foo)","b":"(bar)"}
```

However, when we wish to provide multiple named parameters to a mapping we can apply it to an object literal for the same effect. Imagine we wanted a map that is exactly the same as above except that the pattern is `[%v]`, with the potential for even more patterns in the future. To do that we can pass an object with a field `value` containing the target to map and a field `pattern` that specifies the pattern to apply:

```coffee
map formatting {
  root = this.pattern.format(this.value)
}

root.a = {
  "value":this.a,
  "pattern":this.pattern,
}.apply("formatting")

root.b = {
  "value":this.b,
  "pattern":this.pattern,
}.apply("formatting")

# In:  {"a":"foo","b":"bar","pattern":"[%v]"}
# Out: {"a":"[foo]","b":"[bar]"}
```

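Because the parameters arrive as ordinary object fields, a map can also supply defaults for any that are missing. The following is a minimal sketch rather than a prescribed pattern: it assumes that `(%v)` is an acceptable fallback and uses the `or` method to substitute it when `pattern` is absent or null:

```coffee
# The "(%v)" fallback is an assumption made for illustration
map formatting {
  root = this.pattern.or("(%v)").format(this.value)
}

root.a = {
  "value":this.a,
  "pattern":this.pattern,
}.apply("formatting")

# In:  {"a":"foo"}
# Out: {"a":"(foo)"}
```
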
## Walking the Tree

Sometimes it's necessary to perform a mapping on all values within an unknown tree structure. You can do that easily with recursive mapping:

```coffee
map unescape_values {
  root = match {
    this.type() == "object" => this.map_each(item -> item.value.apply("unescape_values")),
    this.type() == "array" => this.map_each(ele -> ele.apply("unescape_values")),
    this.type() == "string" => this.unescape_html(),
    this.type() == "bytes" => this.unescape_html(),
    _ => this,
  }
}
root = this.apply("unescape_values")

# In:  {"first":{"nested":"foo &amp; bar"},"second":10,"third":["1 &lt; 2",{"also_nested":"2 &gt; 1"}]}
# Out: {"first":{"nested":"foo & bar"},"second":10,"third":["1 < 2",{"also_nested":"2 > 1"}]}
```

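The same recursive shape can be reused for other leaf transformations by changing only the non-container cases. As a rough sketch, the hypothetical map `trim_values` below strips surrounding whitespace from every string in the tree and leaves all other values untouched:

```coffee
# trim_values is a hypothetical name chosen for this example
map trim_values {
  root = match {
    this.type() == "object" => this.map_each(item -> item.value.apply("trim_values")),
    this.type() == "array" => this.map_each(ele -> ele.apply("trim_values")),
    this.type() == "string" => this.trim(),
    _ => this,
  }
}
root = this.apply("trim_values")

# In:  {"name":"  foo ","tags":[" a","b "],"total":1}
# Out: {"name":"foo","tags":["a","b"],"total":1}
```
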
## Message Expansion

Expanding a single message into multiple messages can be done by mapping messages into an array and following it up with an [`unarchive` processor][processors.unarchive]. For example, given documents of this format:

```json
{
  "id": "foobar",
  "items": [
    {"content":"foo"},
    {"content":"bar"},
    {"content":"baz"}
  ]
}
```

We can pull `items` out to the root with the mapping `root = this.items` in a [`bloblang` processor][processors.bloblang] and follow it with an [`unarchive` processor][processors.unarchive] to expand each element into its own independent message:

```yaml
pipeline:
  processors:
    - bloblang: root = this.items
    - unarchive:
        format: json_array
```

However, most of the time we also need to map the elements before expanding them, and often that includes copying fields outside of our target array. We can do that with methods such as `map_each` and `merge`:

```coffee
root = this.items.map_each(ele -> this.without("items").merge(ele))

# In:  {"id":"foobar","items":[{"content":"foo"},{"content":"bar"},{"content":"baz"}]}
# Out: [{"content":"foo","id":"foobar"},{"content":"bar","id":"foobar"},{"content":"baz","id":"foobar"}]
```

However, the above mapping is slightly inefficient as we would create a copy of our source object for each element with the `this.without("items")` part. A more efficient way to do this would be to capture that query within a variable:

```coffee
let doc_root = this.without("items")
root = this.items.map_each($doc_root.merge(this))

# In:  {"id":"foobar","items":[{"content":"foo"},{"content":"bar"},{"content":"baz"}]}
# Out: [{"content":"foo","id":"foobar"},{"content":"bar","id":"foobar"},{"content":"baz","id":"foobar"}]
```

Also note that `doc_root` is assigned the source document with the field `items` removed, so the array itself is not copied into each expanded message. The full config would now be:

```yaml
pipeline:
  processors:
    - bloblang: |
        let doc_root = this.without("items")
        root = this.items.map_each($doc_root.merge(this))
    - unarchive:
        format: json_array
```

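When only a few parent fields are needed on each expanded message, it may be simpler to merge those fields in explicitly instead of carrying the whole parent document. The sketch below copies just `id`. It assumes that the elements of `items` do not already contain an `id` field, and the `meta` field in the sample input is hypothetical, included only to show that other parent fields are left behind:

```coffee
# Capture the one parent field we want on every element
let id = this.id
root = this.items.map_each(ele -> ele.merge({"id": $id}))

# In:  {"id":"foobar","items":[{"content":"foo"},{"content":"bar"}],"meta":{"ignored":true}}
# Out: [{"content":"foo","id":"foobar"},{"content":"bar","id":"foobar"}]
```
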
## Creating CSV

Benthos has a few different ways of outputting a stream of CSV data. However, the best way to do it is by converting the documents into CSV rows with Bloblang as this gives you full control over exactly how the schema is generated, erroneous data is handled, and escaping of column data is performed.

A common use case is to flatten documents and write out the column values in alphabetical order of their keys. The first row we generate should also be preceded by a header row containing those column names. Here's a mapping that achieves this by using the `count` function to detect the very first invocation of the mapping in a stream pipeline:

```coffee
map escape_csv {
  root = if this.re_match("[\"\n,]+") {
    "\"" + this.replace("\"", "\"\"") + "\""
  } else {
    this
  }
}

# Extract key/value pairs as an array and sort by the key
let kvs = this.key_values().sort_by(v -> v.key)

# Create a header prefix for our output only on the first row
let header = if count("rows_in_file") == 1 {
  $kvs.map_each(kv -> kv.key.apply("escape_csv")).join(",") + "\n"
} else { "" }

root = $header + $kvs.map_each(kv -> kv.value.string().apply("escape_csv")).join(",")
```

And with this mapping we can write the data to a newly created CSV file using an output with a simple `lines` codec:

```yaml
output:
  file:
    path: ./result.csv
    codec: lines
```

Perhaps the first expansion of this mapping that would be worthwhile is to add an explicit list of column names, or at least confirm that the number of values in a row matches an expected count.

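One possible way to add an explicit list of column names is sketched below. It is an illustration under stated assumptions, not a definitive implementation: the column names `email`, `id` and `name` are hypothetical, missing or null fields are written as empty cells, and each name is passed to `get`, so a name containing dots would be read as a path into the document.

```coffee
map escape_csv {
  root = if this.re_match("[\"\n,]+") {
    "\"" + this.replace("\"", "\"\"") + "\""
  } else {
    this
  }
}

# Hypothetical fixed schema, replace with your own column names
let columns = ["email", "id", "name"]

# Create a header prefix for our output only on the first row
let header = if count("rows_in_file") == 1 {
  $columns.map_each(c -> c.apply("escape_csv")).join(",") + "\n"
} else { "" }

# Look up each column in order, writing an empty cell when it is missing
root = $header + $columns.map_each(c -> this.get(c).or("").string().apply("escape_csv")).join(",")
```
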
[processors.bloblang]: /docs/components/processors/bloblang
[processors.unarchive]: /docs/components/processors/unarchive