## `regex_parser` operator

The `regex_parser` operator parses the string-type field selected by `parse_from` with the given regular expression pattern.

### Configuration Fields

| Field | Default | Description |
| --- | --- | --- |
| `id` | `regex_parser` | A unique identifier for the operator |
| `output` | Next in pipeline | The connected operator(s) that will receive all outbound entries |
| `regex` | required | A [Go regular expression](https://github.com/google/re2/wiki/Syntax). The named capture groups will be extracted as fields in the parsed object |
| `parse_from` | `$` | A [field](/docs/types/field.md) that indicates the field to be parsed |
| `parse_to` | `$` | A [field](/docs/types/field.md) that indicates where the parsed fields will be written |
| `preserve` | `false` | Preserve the unparsed value on the record |
| `on_error` | `send` | The behavior of the operator if it encounters an error. See [on_error](/docs/types/on_error.md) |
| `timestamp` | `nil` | An optional [timestamp](/docs/types/timestamp.md) block which will parse a timestamp field before passing the entry to the output operator |

### Example Configurations

#### Parse the field `message` with a regular expression

Configuration:
```yaml
- type: regex_parser
  parse_from: message
  regex: '^Host=(?P<host>[^,]+), Type=(?P<type>.*)$'
```

<table>
<tr><td> Input record </td> <td> Output record </td></tr>
<tr>
<td>

```json
{
  "timestamp": "",
  "record": {
    "message": "Host=127.0.0.1, Type=HTTP"
  }
}
```

</td>
<td>

```json
{
  "timestamp": "",
  "record": {
    "host": "127.0.0.1",
    "type": "HTTP"
  }
}
```

</td>
</tr>
</table>

#### Parse a nested field to a different field, preserving original

Configuration:
```yaml
- type: regex_parser
  parse_from: message.embedded
  parse_to: parsed
  regex: '^Host=(?P<host>[^,]+), Type=(?P<type>.*)$'
  preserve: true
```

<table>
<tr><td> Input record </td> <td> Output record </td></tr>
<tr>
<td>

```json
{
  "timestamp": "",
  "record": {
    "message": {
      "embedded": "Host=127.0.0.1, Type=HTTP"
    }
  }
}
```

</td>
<td>

```json
{
  "timestamp": "",
  "record": {
    "message": {
      "embedded": "Host=127.0.0.1, Type=HTTP"
    },
    "parsed": {
      "host": "127.0.0.1",
      "type": "HTTP"
    }
  }
}
```

</td>
</tr>
</table>

#### Parse the field `message` with a regular expression and also parse the timestamp

Configuration:
```yaml
- type: regex_parser
  parse_from: message
  regex: '^Time=(?P<timestamp_field>\d{4}-\d{2}-\d{2}), Host=(?P<host>[^,]+), Type=(?P<type>.*)$'
  timestamp:
    parse_from: timestamp_field
    layout_type: strptime
    layout: '%Y-%m-%d'
```

<table>
<tr><td> Input record </td> <td> Output record </td></tr>
<tr>
<td>

```json
{
  "timestamp": "",
  "record": {
    "message": "Time=2020-01-31, Host=127.0.0.1, Type=HTTP"
  }
}
```

</td>
<td>

```json
{
  "timestamp": "2020-01-31T00:00:00-00:00",
  "record": {
    "host": "127.0.0.1",
    "type": "HTTP"
  }
}
```

</td>
</tr>
</table>