---
title: grok
type: processor
status: stable
categories: ["Parsing"]
---

<!--
     THIS FILE IS AUTOGENERATED!

     To make changes please edit the contents of:
     lib/processor/grok.go
-->

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';


Parses messages into a structured format by attempting to apply a list of Grok expressions. The first expression to result in at least one value replaces the original message with a JSON object containing the values.


<Tabs defaultValue="common" values={[
  { label: 'Common', value: 'common', },
  { label: 'Advanced', value: 'advanced', },
]}>

<TabItem value="common">

```yaml
# Common config fields, showing default values
label: ""
grok:
  expressions: []
  pattern_definitions: {}
  pattern_paths: []
```

</TabItem>
<TabItem value="advanced">

```yaml
# All config fields, showing default values
label: ""
grok:
  expressions: []
  pattern_definitions: {}
  pattern_paths: []
  named_captures_only: true
  use_default_patterns: true
  remove_empty_values: true
  parts: []
```

</TabItem>
</Tabs>

Type hints within patterns are respected, therefore with the pattern `%{WORD:first},%{INT:second:int}` and a payload of `foo,1` the resulting payload would be `{"first":"foo","second":1}`.

### Performance

This processor currently uses the [Go RE2](https://golang.org/s/re2syntax) regular expression engine, which is guaranteed to run in time linear to the size of the input. However, this property often makes it less performant than PCRE-based implementations of grok. For more information see [https://swtch.com/~rsc/regexp/regexp1.html](https://swtch.com/~rsc/regexp/regexp1.html).
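
As a minimal sketch of the type hint behaviour described before the performance notes, the config below applies that same pattern to each message. The pattern and the `foo,1` payload come from the text above; wiring the processor into a `pipeline` section this way is an assumption about how it would typically be used.

```yaml
pipeline:
  processors:
    - grok:
        expressions:
          # WORD is captured as a string; the :int hint casts INT to a number,
          # so `foo,1` should become {"first":"foo","second":1}.
          - '%{WORD:first},%{INT:second:int}'
```

Without the `:int` suffix, the `second` capture would be expected to remain a string.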

## Examples

<Tabs defaultValue="VPC Flow Logs" values={[
  { label: 'VPC Flow Logs', value: 'VPC Flow Logs', },
]}>

<TabItem value="VPC Flow Logs">


Grok can be used to parse unstructured logs such as VPC flow logs that look like this:

```text
2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK
```

Into structured objects that look like this:

```json
{"accountid":"123456789010","action":"ACCEPT","bytes":4249,"dstaddr":"172.31.16.21","dstport":22,"end":1418530070,"interfaceid":"eni-1235b8ca123456789","logstatus":"OK","packets":20,"protocol":6,"srcaddr":"172.31.16.139","srcport":20641,"start":1418530010,"version":2}
```

With the following config:

```yaml
pipeline:
  processors:
    - grok:
        expressions:
          - '%{VPCFLOWLOG}'
        pattern_definitions:
          VPCFLOWLOG: '%{NUMBER:version:int} %{NUMBER:accountid} %{NOTSPACE:interfaceid} %{NOTSPACE:srcaddr} %{NOTSPACE:dstaddr} %{NOTSPACE:srcport:int} %{NOTSPACE:dstport:int} %{NOTSPACE:protocol:int} %{NOTSPACE:packets:int} %{NOTSPACE:bytes:int} %{NUMBER:start:int} %{NUMBER:end:int} %{NOTSPACE:action} %{NOTSPACE:logstatus}'
```

</TabItem>
</Tabs>

## Fields

### `expressions`

One or more Grok expressions to attempt against incoming messages. The first expression to match at least one value will be used to form a result.


Type: `array`  
Default: `[]`  

### `pattern_definitions`

A map of pattern definitions that can be referenced within `expressions`.


Type: `object`  
Default: `{}`  

### `pattern_paths`

A list of paths to load Grok patterns from. This field supports wildcards, including super globs (double star).


Type: `array`  
Default: `[]`  

### `named_captures_only`

Whether to only capture values from named patterns.


Type: `bool`  
Default: `true`  

### `use_default_patterns`

Whether to use a [default set of patterns](#default-patterns).


Type: `bool`  
Default: `true`  

### `remove_empty_values`

Whether to remove values that are empty from the resulting structure.


Type: `bool`  
Default: `true`  

### `parts`

An optional array of message indexes of a batch that the processor should apply to.
If left empty all messages are processed. This field is only applicable when
batching messages [at the input level](/docs/configuration/batching).

Indexes can be negative, and if so the part will be selected from the end
counting backwards starting from -1.


Type: `array`  
Default: `[]`  

## Default Patterns

A summary of the default patterns on offer can be [found here](https://github.com/Jeffail/grok/blob/master/patterns.go#L5).
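
As a rough sketch of how custom definitions can build on the default set, the config below defines one extra pattern in terms of the default `INT` pattern and references it from an expression. The pattern name `ORDERID`, the capture names, and the log format are hypothetical and only serve as illustration.

```yaml
pipeline:
  processors:
    - grok:
        expressions:
          # Intended for payloads shaped like `ORD-123 shipped` (hypothetical format).
          - '%{ORDERID:order_id} %{WORD:status}'
        pattern_definitions:
          # Hypothetical custom pattern composed from the default INT pattern.
          ORDERID: 'ORD-%{INT}'
        # Keep the default patterns enabled so that INT and WORD resolve.
        use_default_patterns: true
```

With `use_default_patterns` set to `false`, the `INT` and `WORD` patterns used here would presumably need to be supplied through `pattern_definitions` or `pattern_paths` instead.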