## `regex_parser` operator

The `regex_parser` operator parses the string-type field selected by `parse_from` with the given regular expression pattern.

### Configuration Fields

| Field        | Default          | Description                                                                                                                                     |
| ---          | ---              | ---                                                                                                                                             |
| `id`         | `regex_parser`   | A unique identifier for the operator                                                                                                            |
| `output`     | Next in pipeline | The connected operator(s) that will receive all outbound entries                                                                                |
| `regex`      | required         | A [Go regular expression](https://github.com/google/re2/wiki/Syntax). The named capture groups will be extracted as fields in the parsed object |
| `parse_from` | $                | A [field](/docs/types/field.md) that indicates the field to be parsed                                                                           |
| `parse_to`   | $                | A [field](/docs/types/field.md) that indicates where the parsed fields are written                                                             |
| `preserve`   | false            | Preserve the unparsed value on the record                                                                                                       |
| `on_error`   | `send`           | The behavior of the operator if it encounters an error. See [on_error](/docs/types/on_error.md)                                                 |
| `timestamp`  | `nil`            | An optional [timestamp](/docs/types/timestamp.md) block which will parse a timestamp field before passing the entry to the output operator      |

### Example Configurations

#### Parse the field `message` with a regular expression

Configuration:
```yaml
- type: regex_parser
  parse_from: message
  regex: '^Host=(?P<host>[^,]+), Type=(?P<type>.*)$'
```

<table>
<tr><td> Input record </td> <td> Output record </td></tr>
<tr>
<td>

```json
{
  "timestamp": "",
  "record": {
    "message": "Host=127.0.0.1, Type=HTTP"
  }
}
```

</td>
<td>

```json
{
  "timestamp": "",
  "record": {
    "host": "127.0.0.1",
    "type": "HTTP"
  }
}
```

</td>
</tr>
</table>

#### Parse a nested field to a different field, preserving the original

Configuration:
```yaml
- type: regex_parser
  parse_from: message.embedded
  parse_to: parsed
  regex: '^Host=(?P<host>[^,]+), Type=(?P<type>.*)$'
  preserve: true
```

<table>
<tr><td> Input record </td> <td> Output record </td></tr>
<tr>
<td>

```json
{
  "timestamp": "",
  "record": {
    "message": {
      "embedded": "Host=127.0.0.1, Type=HTTP"
    }
  }
}
```

</td>
<td>

```json
{
  "timestamp": "",
  "record": {
    "message": {
      "embedded": "Host=127.0.0.1, Type=HTTP"
    },
    "parsed": {
      "host": "127.0.0.1",
      "type": "HTTP"
    }
  }
}
```

</td>
</tr>
</table>

#### Parse the field `message` with a regular expression and also parse the timestamp

Configuration:
```yaml
- type: regex_parser
  parse_from: message
  regex: '^Time=(?P<timestamp_field>\d{4}-\d{2}-\d{2}), Host=(?P<host>[^,]+), Type=(?P<type>.*)$'
  timestamp:
    parse_from: timestamp_field
    layout_type: strptime
    layout: '%Y-%m-%d'
```

<table>
<tr><td> Input record </td> <td> Output record </td></tr>
<tr>
<td>

```json
{
  "timestamp": "",
  "record": {
    "message": "Time=2020-01-31, Host=127.0.0.1, Type=HTTP"
  }
}
```

</td>
<td>

```json
{
  "timestamp": "2020-01-31T00:00:00-00:00",
  "record": {
    "host": "127.0.0.1",
    "type": "HTTP"
  }
}
```

</td>
</tr>
</table>