github.com/netdata/go.d.plugin@v0.58.1/modules/weblog/integrations/web_server_log_files.md (about)

     1  <!--startmeta
     2  custom_edit_url: "https://github.com/netdata/go.d.plugin/edit/master/modules/weblog/README.md"
     3  meta_yaml: "https://github.com/netdata/go.d.plugin/edit/master/modules/weblog/metadata.yaml"
     4  sidebar_label: "Web server log files"
     5  learn_status: "Published"
     6  learn_rel_path: "Data Collection/Web Servers and Web Proxies"
     7  most_popular: False
     8  message: "DO NOT EDIT THIS FILE DIRECTLY, IT IS GENERATED BY THE COLLECTOR'S metadata.yaml FILE"
     9  endmeta-->
    10  
    11  # Web server log files
    12  
    13  
    14  <img src="https://netdata.cloud/img/webservers.svg" width="150"/>
    15  
    16  
    17  Plugin: go.d.plugin
    18  Module: web_log
    19  
    20  <img src="https://img.shields.io/badge/maintained%20by-Netdata-%2300ab44" />
    21  
    22  ## Overview
    23  
    24  This collector monitors web servers by parsing their log files.
    25  
    26  
    27  
    28  
    29  This collector is supported on all platforms.
    30  
    31  This collector supports collecting metrics from multiple instances of this integration, including remote instances.
    32  
    33  
    34  ### Default Behavior
    35  
    36  #### Auto-Detection
    37  
    38  It automatically detects log files of web servers running on localhost.
    39  
    40  
    41  #### Limits
    42  
    43  The default configuration for this integration does not impose any limits on data collection.
    44  
    45  #### Performance Impact
    46  
    47  The default configuration for this integration is not expected to impose a significant performance impact on the system.
    48  
    49  
    50  ## Metrics
    51  
    52  Metrics grouped by *scope*.
    53  
    54  The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.
    55  
    56  
    57  
    58  ### Per Web server log files instance
    59  
    60  These metrics refer to the entire monitored application.
    61  
    62  This scope has no labels.
    63  
    64  Metrics:
    65  
    66  | Metric | Dimensions | Unit |
    67  |:------|:----------|:----|
    68  | web_log.requests | requests | requests/s |
    69  | web_log.excluded_requests | unmatched | requests/s |
    70  | web_log.type_requests | success, bad, redirect, error | requests/s |
    71  | web_log.status_code_class_responses | 1xx, 2xx, 3xx, 4xx, 5xx | responses/s |
    72  | web_log.status_code_class_1xx_responses | a dimension per 1xx code | responses/s |
    73  | web_log.status_code_class_2xx_responses | a dimension per 2xx code | responses/s |
    74  | web_log.status_code_class_3xx_responses | a dimension per 3xx code | responses/s |
    75  | web_log.status_code_class_4xx_responses | a dimension per 4xx code | responses/s |
    76  | web_log.status_code_class_5xx_responses | a dimension per 5xx code | responses/s |
    77  | web_log.bandwidth | received, sent | kilobits/s |
    78  | web_log.request_processing_time | min, max, avg | milliseconds |
    79  | web_log.requests_processing_time_histogram | a dimension per bucket | requests/s |
    80  | web_log.upstream_response_time | min, max, avg | milliseconds |
    81  | web_log.upstream_responses_time_histogram | a dimension per bucket | requests/s |
    82  | web_log.current_poll_uniq_clients | ipv4, ipv6 | clients |
    83  | web_log.vhost_requests | a dimension per vhost | requests/s |
    84  | web_log.port_requests | a dimension per port | requests/s |
    85  | web_log.scheme_requests | http, https | requests/s |
    86  | web_log.http_method_requests | a dimension per HTTP method | requests/s |
    87  | web_log.http_version_requests | a dimension per HTTP version | requests/s |
    88  | web_log.ip_proto_requests | ipv4, ipv6 | requests/s |
    89  | web_log.ssl_proto_requests | a dimension per SSL protocol | requests/s |
    90  | web_log.ssl_cipher_suite_requests | a dimension per SSL cipher suite | requests/s |
    91  | web_log.url_pattern_requests | a dimension per URL pattern | requests/s |
    92  | web_log.custom_field_pattern_requests | a dimension per custom field pattern | requests/s |
    93  
    94  ### Per custom time field
    95  
    96  TBD
    97  
    98  This scope has no labels.
    99  
   100  Metrics:
   101  
   102  | Metric | Dimensions | Unit |
   103  |:------|:----------|:----|
   104  | web_log.custom_time_field_summary | min, max, avg | milliseconds |
   105  | web_log.custom_time_field_histogram | a dimension per bucket | observations |
   106  
   107  ### Per custom numeric field
   108  
   109  TBD
   110  
   111  This scope has no labels.
   112  
   113  Metrics:
   114  
   115  | Metric | Dimensions | Unit |
   116  |:------|:----------|:----|
   117  | web_log.custom_numeric_field_{{field_name}}_summary | min, max, avg | {{units}} |
   118  
   119  ### Per URL pattern
   120  
These metrics refer to a single URL pattern defined in the `url_patterns` option.
   122  
   123  This scope has no labels.
   124  
   125  Metrics:
   126  
   127  | Metric | Dimensions | Unit |
   128  |:------|:----------|:----|
   129  | web_log.url_pattern_status_code_responses | a dimension per pattern | responses/s |
   130  | web_log.url_pattern_http_method_requests | a dimension per HTTP method | requests/s |
   131  | web_log.url_pattern_bandwidth | received, sent | kilobits/s |
   132  | web_log.url_pattern_request_processing_time | min, max, avg | milliseconds |
   133  
   134  
   135  
   136  ## Alerts
   137  
   138  
   139  The following alerts are available:
   140  
   141  | Alert name  | On metric | Description |
   142  |:------------|:----------|:------------|
   143  | [ web_log_1m_unmatched ](https://github.com/netdata/netdata/blob/master/health/health.d/web_log.conf) | web_log.excluded_requests | percentage of unparsed log lines over the last minute |
   144  | [ web_log_1m_requests ](https://github.com/netdata/netdata/blob/master/health/health.d/web_log.conf) | web_log.type_requests | ratio of successful HTTP requests over the last minute (1xx, 2xx, 304, 401) |
   145  | [ web_log_1m_redirects ](https://github.com/netdata/netdata/blob/master/health/health.d/web_log.conf) | web_log.type_requests | ratio of redirection HTTP requests over the last minute (3xx except 304) |
   146  | [ web_log_1m_bad_requests ](https://github.com/netdata/netdata/blob/master/health/health.d/web_log.conf) | web_log.type_requests | ratio of client error HTTP requests over the last minute (4xx except 401) |
   147  | [ web_log_1m_internal_errors ](https://github.com/netdata/netdata/blob/master/health/health.d/web_log.conf) | web_log.type_requests | ratio of server error HTTP requests over the last minute (5xx) |
   148  | [ web_log_web_slow ](https://github.com/netdata/netdata/blob/master/health/health.d/web_log.conf) | web_log.request_processing_time | average HTTP response time over the last 1 minute |
| [ web_log_5m_requests_ratio ](https://github.com/netdata/netdata/blob/master/health/health.d/web_log.conf) | web_log.type_requests | ratio of successful HTTP requests over the last 5 minutes, compared with the previous 5 minutes |
   150  
   151  
   152  ## Setup
   153  
   154  ### Prerequisites
   155  
   156  No action required.
   157  
   158  ### Configuration
   159  
   160  #### File
   161  
   162  The configuration file name for this integration is `go.d/web_log.conf`.
   163  
   164  
   165  You can edit the configuration file using the `edit-config` script from the
   166  Netdata [config directory](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md#the-netdata-config-directory).
   167  
   168  ```bash
   169  cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
   170  sudo ./edit-config go.d/web_log.conf
   171  ```
   172  #### Options
   173  
Weblog knows how to parse and interpret the following fields (**known fields**):
   175  
   176  > [nginx](https://nginx.org/en/docs/varindex.html)
   177  >
   178  > [apache](https://httpd.apache.org/docs/current/mod/mod_log_config.html)
   179  
   180  | nginx                   | apache   | description                                                                              |
   181  |-------------------------|----------|------------------------------------------------------------------------------------------|
   182  | $host ($http_host)      | %v       | Name of the server which accepted a request.                                             |
   183  | $server_port            | %p       | Port of the server which accepted a request.                                             |
   184  | $scheme                 | -        | Request scheme. "http" or "https".                                                       |
   185  | $remote_addr            | %a (%h)  | Client address.                                                                          |
   186  | $request                | %r       | Full original request line. The line is "$request_method $request_uri $server_protocol". |
   187  | $request_method         | %m       | Request method. Usually "GET" or "POST".                                                 |
   188  | $request_uri            | %U       | Full original request URI.                                                               |
   189  | $server_protocol        | %H       | Request protocol. Usually "HTTP/1.0", "HTTP/1.1", or "HTTP/2.0".                         |
   190  | $status                 | %s (%>s) | Response status code.                                                                    |
   191  | $request_length         | %I       | Bytes received from a client, including request and headers.                             |
| $bytes_sent             | %O       | Bytes sent to a client, including the response header.                                   |
   193  | $body_bytes_sent        | %B (%b)  | Bytes sent to a client, not counting the response header.                                |
   194  | $request_time           | %D       | Request processing time.                                                                 |
   195  | $upstream_response_time | -        | Time spent on receiving the response from the upstream server.                           |
   196  | $ssl_protocol           | -        | Protocol of an established SSL connection.                                               |
   197  | $ssl_cipher             | -        | String of ciphers used for an established SSL connection.                                |
   198  
   199  Notes:
   200  
- Apache `%h` logs the IP address only if [HostnameLookups](https://httpd.apache.org/docs/2.4/mod/core.html#hostnamelookups) is Off; otherwise it logs the hostname. The web log collector counts hostnames as IPv4 addresses. We recommend either disabling HostnameLookups or using `%a` instead of `%h`.
   202  - Since httpd 2.0, unlike 1.3, the `%b` and `%B` format strings do not represent the number of bytes sent to the client, but simply the size in bytes of the HTTP response. It will differ, for instance, if the connection is aborted, or if SSL is used. The `%O` format provided by [`mod_logio`](https://httpd.apache.org/docs/2.4/mod/mod_logio.html) will log the actual number of bytes sent over the network.
   203  - To get `%I` and `%O` working you need to enable `mod_logio` on Apache.
- NGINX logs the URI with query parameters; Apache doesn't.
- `$request` is parsed into `$request_method`, `$request_uri` and `$server_protocol`. If your log format includes `$request`, there is no need to include the others.
   206  - Don't use both `$bytes_sent` and `$body_bytes_sent` (`%O` and `%B` or `%b`). The module does not distinguish between these parameters.
   207  
   208  
   209  <details><summary>Config options</summary>
   210  
   211  | Name | Description | Default | Required |
   212  |:----|:-----------|:-------|:--------:|
   213  | update_every | Data collection frequency. | 1 | no |
   214  | autodetection_retry | Recheck interval in seconds. Zero means no recheck will be scheduled. | 0 | no |
   215  | path | Path to the web server log file. |  | yes |
   216  | exclude_path | Path to exclude. | *.gz | no |
   217  | url_patterns | List of URL patterns. | [] | no |
   218  | url_patterns.name | Used as a dimension name. |  | yes |
   219  | url_patterns.pattern | Used to match against full original request URI. Pattern syntax in [matcher](https://github.com/netdata/go.d.plugin/tree/master/pkg/matcher#supported-format). |  | yes |
   220  | parser | Log parser configuration. |  | no |
   221  | parser.log_type | Log parser type. | auto | no |
   222  | parser.csv_config | CSV log parser config. |  | no |
   223  | parser.csv_config.delimiter | CSV field delimiter. | , | no |
   224  | parser.csv_config.format | CSV log format. |  | no |
   225  | parser.ltsv_config | LTSV log parser config. |  | no |
   226  | parser.ltsv_config.field_delimiter | LTSV field delimiter. | \t | no |
   227  | parser.ltsv_config.value_delimiter | LTSV value delimiter. | : | no |
   228  | parser.ltsv_config.mapping | LTSV fields mapping to **known fields**. |  | yes |
   229  | parser.json_config | JSON log parser config. |  | no |
   230  | parser.json_config.mapping | JSON fields mapping to **known fields**. |  | yes |
   231  | parser.regexp_config | RegExp log parser config. |  | no |
   232  | parser.regexp_config.pattern | RegExp pattern with named groups. |  | yes |
   233  
   234  ##### url_patterns
   235  
   236  "URL pattern" scope metrics will be collected for each URL pattern. 
   237  
   238  Option syntax:
   239  
   240  ```yaml
   241  url_patterns:
   242    - name: name1
   243      pattern: pattern1
   244    - name: name2
   245      pattern: pattern2
   246  ```
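
As an illustration, the sketch below tracks API and static-asset traffic separately. The names `api` and `static` and both patterns are hypothetical, and the `~` (regular expression) and `*` (glob) prefixes assume the short syntax described in the matcher documentation linked in the options table; check that documentation before relying on them.

```yaml
url_patterns:
  # Hypothetical pattern: request URIs that start with /api/ (regular expression matcher).
  - name: api
    pattern: '~ ^/api/'
  # Hypothetical pattern: request URIs under /static/ (glob matcher).
  - name: static
    pattern: '* /static/*'
```

Each `name` becomes a dimension name, so short, stable names are probably easier to read on the charts.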
   247  
   248  
   249  ##### parser.log_type
   250  
   251  Weblog supports 5 different log parsers:
   252  
   253  | Parser type | Description                               |
   254  |-------------|-------------------------------------------|
   255  | auto        | Use CSV and auto-detect format            |
| csv         | Comma-separated values                    |
   257  | json        | [JSON](https://www.json.org/json-en.html) |
   258  | ltsv        | [LTSV](http://ltsv.org/)                  |
   259  | regexp      | Regular expression with named groups      |
   260  
   261  Syntax:
   262  
   263  ```yaml
   264  parser:
   265    log_type: auto
   266  ```
   267  
If the `log_type` parameter is set to `auto` (the default), weblog tries to auto-detect the appropriate log parser and log format using the last line of the log file. It:

- checks if format is `LTSV` (using regexp).
- checks if format is `JSON` (using regexp).
- assumes format is `CSV` and tries to find the appropriate `CSV` log format using a predefined list of formats. It tries to parse the line using each of them in the following order (the first one that matches is used):
   273  
   274    ```sh
   275    $host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time $upstream_response_time
   276    $host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time
   277    $host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time $upstream_response_time
   278    $host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time
   279    $host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent
   280    $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time $upstream_response_time
   281    $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time
   282    $remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time $upstream_response_time
   283    $remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time
   284    $remote_addr - - [$time_local] "$request" $status $body_bytes_sent
   285    ```
   286  
  If you're using the default Apache/NGINX log format, auto-detect will work for you. If it doesn't work, you need to set the format manually.
   288  
   289  
   290  ##### parser.csv_config.format
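
When auto-detection does not recognize the log format, the format can be set explicitly. The following is a minimal sketch that reuses the last entry from the predefined list under `parser.log_type`. The space delimiter is an assumption based on that list, so adjust both values to match your server's actual log format.

```yaml
parser:
  log_type: csv
  csv_config:
    # Assumption: fields are separated by single spaces, as in the predefined formats.
    delimiter: ' '
    # Format copied from the predefined list shown under parser.log_type.
    format: '$remote_addr - - [$time_local] "$request" $status $body_bytes_sent'
```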
   291  
   292  
   293  
   294  ##### parser.ltsv_config.mapping
   295  
The mapping is a dictionary where the key is a field name as it appears in the logs, and the value is the corresponding **known field**.
   297  
   298  > **Note**: don't use `$` and `%` prefixes for mapped field names.
   299  
   300  ```yaml
   301  parser:
   302    log_type: ltsv
   303    ltsv_config:
   304      mapping:
   305        label1: field1
   306        label2: field2
   307  ```
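
For a more concrete picture, suppose (hypothetically) that the server writes LTSV lines with the labels `host`, `req`, `status`, `size`, and `reqtime`. These label names are invented for illustration. A mapping sketch to the corresponding known fields could look like this:

```yaml
parser:
  log_type: ltsv
  ltsv_config:
    mapping:
      # <label in the log line>: <known field, without the $ or % prefix>
      host: remote_addr
      req: request
      status: status
      size: body_bytes_sent
      reqtime: request_time
```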
   308  
   309  
   310  ##### parser.json_config.mapping
   311  
The mapping is a dictionary where the key is a field name as it appears in the logs, and the value is the corresponding **known field**.
   313  
   314  > **Note**: don't use `$` and `%` prefixes for mapped field names.
   315  
   316  ```yaml
   317  parser:
   318    log_type: json
   319    json_config:
   320      mapping:
   321        label1: field1
   322        label2: field2
   323  ```
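
As a sketch, assume the server emits one JSON object per line, such as `{"client": "203.0.113.7", "method": "GET", "uri": "/", "code": 200}`. The key names are invented for illustration and are not defaults of any web server. Under that assumption, the mapping might be:

```yaml
parser:
  log_type: json
  json_config:
    mapping:
      # <key in the JSON object>: <known field, without the $ or % prefix>
      client: remote_addr
      method: request_method
      uri: request_uri
      code: status
```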
   324  
   325  
   326  ##### parser.regexp_config.pattern
   327  
Use a pattern with named subexpressions (named capture groups). The group names should be **known fields**.
   329  
   330  > **Note**: don't use `$` and `%` prefixes for mapped field names.
   331  
   332  Syntax:
   333  
   334  ```yaml
   335  parser:
   336    log_type: regexp
   337    regexp_config:
   338      pattern: PATTERN
   339  ```
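
The pattern below is a hedged sketch for a common-log-style line such as `203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 512`. It assumes Go's `(?P<name>...)` named-group syntax, since the collector is part of `go.d.plugin`. The pattern itself is illustrative and should be tested against your own log lines.

```yaml
parser:
  log_type: regexp
  regexp_config:
    # Group names are known fields without the $ prefix.
    # Single quotes keep the backslashes literal in YAML.
    pattern: '^(?P<remote_addr>[^ ]+) [^ ]+ [^ ]+ \[[^\]]+\] "(?P<request>[^"]+)" (?P<status>[0-9]{3}) (?P<body_bytes_sent>[0-9]+)'
```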
   340  
   341  
   342  </details>
   343  
   344  #### Examples
   345  There are no configuration examples.
   346  
   347  
   348  
   349  ## Troubleshooting
   350  
   351  ### Debug Mode
   352  
   353  To troubleshoot issues with the `web_log` collector, run the `go.d.plugin` with the debug option enabled. The output
   354  should give you clues as to why the collector isn't working.
   355  
   356  - Navigate to the `plugins.d` directory, usually at `/usr/libexec/netdata/plugins.d/`. If that's not the case on
   357    your system, open `netdata.conf` and look for the `plugins` setting under `[directories]`.
   358  
   359    ```bash
   360    cd /usr/libexec/netdata/plugins.d/
   361    ```
   362  
   363  - Switch to the `netdata` user.
   364  
   365    ```bash
   366    sudo -u netdata -s
   367    ```
   368  
   369  - Run the `go.d.plugin` to debug the collector:
   370  
   371    ```bash
   372    ./go.d.plugin -d -m web_log
   373    ```
   374  
   375