## `file_input` operator

The `file_input` operator reads logs from files. Each line read is placed in the `message` field of a new entry.

### Configuration Fields

| Field               | Default          | Description                                                                                                        |
| ---                 | ---              | ---                                                                                                                |
| `id`                | `file_input`     | A unique identifier for the operator                                                                               |
| `output`            | Next in pipeline | The connected operator(s) that will receive all outbound entries                                                   |
| `include`           | required         | A list of file glob patterns that match the file paths to be read                                                  |
| `exclude`           | []               | A list of file glob patterns to exclude from reading                                                               |
| `poll_interval`     | 200ms            | The duration between filesystem polls                                                                              |
| `multiline`         |                  | A `multiline` configuration block. See below for details                                                           |
| `write_to`          | $                | The record [field](/docs/types/field.md) written to when creating a new log entry                                  |
| `encoding`          | `nop`            | The encoding of the file being read. See the list of supported encodings below for available options               |
| `include_file_name` | `true`           | Whether to add the file name as the label `file_name`                                                              |
| `include_file_path` | `false`          | Whether to add the file path as the label `file_path`                                                              |
| `start_at`          | `end`            | At startup, where to start reading logs from the file. Options are `beginning` or `end`                            |
| `max_log_size`      | 1048576          | The maximum size of a log entry to read before failing. Protects against reading large amounts of data into memory |
| `labels`            | {}               | A map of `key: value` labels to add to the entry's labels                                                          |
| `resource`          | {}               | A map of `key: value` labels to add to the entry's resource                                                        |
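
As a rough illustration of how several of these fields can be combined, the sketch below reads every `.log` file in a directory except one, polls less often than the default, and attaches the file path plus a static label. The paths and the label value are hypothetical and are not taken from any real deployment.

```yaml
- type: file_input
  include:
    - /var/log/app/*.log        # hypothetical glob pattern
  exclude:
    - /var/log/app/debug.log    # hypothetical file to skip
  poll_interval: 1s             # poll less often than the 200ms default
  include_file_path: true       # adds the `file_path` label
  labels:
    env: staging                # hypothetical static label
```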

Note that because `start_at` defaults to `end`, content already present in a file at startup is skipped. By default, no logs will be read unless the monitored file is actively being written to.
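
To pick up content that already exists in a file at startup, `start_at` can be set to `beginning`. A minimal sketch, reusing the `./test.log` path from the examples in this document:

```yaml
- type: file_input
  include:
    - ./test.log
  start_at: beginning   # read existing content instead of only new writes
```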

#### `multiline` configuration

If set, the `multiline` configuration block instructs the `file_input` operator to split log entries on a pattern other than newlines.

The `multiline` configuration block must contain exactly one of `line_start_pattern` or `line_end_pattern`. These are regex patterns that match either the beginning of a new log entry or the end of a log entry.
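
A minimal sketch of the `line_end_pattern` variant follows. It assumes a hypothetical log format in which every entry finishes with the literal text `END`; the pattern is only an illustration and should be replaced with a regex that matches the end of an entry in the actual log format.

```yaml
- type: file_input
  include:
    - ./test.log
  multiline:
    line_end_pattern: 'END'   # hypothetical end-of-entry marker
```

Because exactly one of the two patterns is allowed, `line_start_pattern` is omitted here.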

### Supported encodings

| Key        | Description                                                      |
| ---        | ---                                                              |
| `nop`      | No encoding validation. Treats the file as a stream of raw bytes |
| `utf-8`    | UTF-8 encoding                                                   |
| `utf-16le` | UTF-16 encoding with little-endian byte order                    |
| `utf-16be` | UTF-16 encoding with big-endian byte order                       |
| `ascii`    | ASCII encoding                                                   |
| `big5`     | The Big5 Chinese character encoding                              |

Other less common encodings are supported on a best-effort basis. See [https://www.iana.org/assignments/character-sets/character-sets.xhtml](https://www.iana.org/assignments/character-sets/character-sets.xhtml) for other encodings available.
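
For instance, a file known to be written as little-endian UTF-16 could be configured roughly as follows. The file path is hypothetical; the `encoding` value is one of the keys from the table above.

```yaml
- type: file_input
  include:
    - ./utf16.log        # hypothetical file written as UTF-16 LE
  encoding: utf-16le
```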

### Example Configurations

#### Simple file input

Configuration:
```yaml
- type: file_input
  include:
    - ./test.log
```

<table>
<tr><td> `./test.log` </td> <td> Output records </td></tr>
<tr>
<td>

```
log1
log2
log3
```

</td>
<td>

```json
{
  "message": "log1"
},
{
  "message": "log2"
},
{
  "message": "log3"
}
```

</td>
</tr>
</table>

#### Multiline file input

Configuration:
```yaml
- type: file_input
  include:
    - ./test.log
  multiline:
    line_start_pattern: 'START '
```

<table>
<tr><td> `./test.log` </td> <td> Output records </td></tr>
<tr>
<td>

```
START log1
log2
START log3
log4
```

</td>
<td>

```json
{
  "message": "START log1\nlog2\n"
},
{
  "message": "START log3\nlog4\n"
}
```

</td>
</tr>
</table>