<!--startmeta
custom_edit_url: "https://github.com/netdata/go.d.plugin/edit/master/modules/weblog/README.md"
meta_yaml: "https://github.com/netdata/go.d.plugin/edit/master/modules/weblog/metadata.yaml"
sidebar_label: "Web server log files"
learn_status: "Published"
learn_rel_path: "Data Collection/Web Servers and Web Proxies"
most_popular: False
message: "DO NOT EDIT THIS FILE DIRECTLY, IT IS GENERATED BY THE COLLECTOR'S metadata.yaml FILE"
endmeta-->

# Web server log files


<img src="https://netdata.cloud/img/webservers.svg" width="150"/>


Plugin: go.d.plugin
Module: web_log

<img src="https://img.shields.io/badge/maintained%20by-Netdata-%2300ab44" />

## Overview

This collector monitors web servers by parsing their log files.


This collector is supported on all platforms.

This collector supports collecting metrics from multiple instances of this integration, including remote instances.


### Default Behavior

#### Auto-Detection

It automatically detects log files of web servers running on localhost.


#### Limits

The default configuration for this integration does not impose any limits on data collection.

#### Performance Impact

The default configuration for this integration is not expected to impose a significant performance impact on the system.


## Metrics

Metrics grouped by *scope*.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.



### Per Web server log files instance

These metrics refer to the entire monitored application.

This scope has no labels.

Metrics:

| Metric | Dimensions | Unit |
|:------|:----------|:----|
| web_log.requests | requests | requests/s |
| web_log.excluded_requests | unmatched | requests/s |
| web_log.type_requests | success, bad, redirect, error | requests/s |
| web_log.status_code_class_responses | 1xx, 2xx, 3xx, 4xx, 5xx | responses/s |
| web_log.status_code_class_1xx_responses | a dimension per 1xx code | responses/s |
| web_log.status_code_class_2xx_responses | a dimension per 2xx code | responses/s |
| web_log.status_code_class_3xx_responses | a dimension per 3xx code | responses/s |
| web_log.status_code_class_4xx_responses | a dimension per 4xx code | responses/s |
| web_log.status_code_class_5xx_responses | a dimension per 5xx code | responses/s |
| web_log.bandwidth | received, sent | kilobits/s |
| web_log.request_processing_time | min, max, avg | milliseconds |
| web_log.requests_processing_time_histogram | a dimension per bucket | requests/s |
| web_log.upstream_response_time | min, max, avg | milliseconds |
| web_log.upstream_responses_time_histogram | a dimension per bucket | requests/s |
| web_log.current_poll_uniq_clients | ipv4, ipv6 | clients |
| web_log.vhost_requests | a dimension per vhost | requests/s |
| web_log.port_requests | a dimension per port | requests/s |
| web_log.scheme_requests | http, https | requests/s |
| web_log.http_method_requests | a dimension per HTTP method | requests/s |
| web_log.http_version_requests | a dimension per HTTP version | requests/s |
| web_log.ip_proto_requests | ipv4, ipv6 | requests/s |
| web_log.ssl_proto_requests | a dimension per SSL protocol | requests/s |
| web_log.ssl_cipher_suite_requests | a dimension per SSL cipher suite | requests/s |
| web_log.url_pattern_requests | a dimension per URL pattern | requests/s |
| web_log.custom_field_pattern_requests | a dimension per custom field pattern | requests/s |

### Per custom time field

TBD

This scope has no labels.

Metrics:

| Metric | Dimensions | Unit |
|:------|:----------|:----|
| web_log.custom_time_field_summary | min, max, avg | milliseconds |
| web_log.custom_time_field_histogram | a dimension per bucket | observations |

### Per custom numeric field

TBD

This scope has no labels.

Metrics:

| Metric | Dimensions | Unit |
|:------|:----------|:----|
| web_log.custom_numeric_field_{{field_name}}_summary | min, max, avg | {{units}} |

### Per URL pattern

TBD

This scope has no labels.

Metrics:

| Metric | Dimensions | Unit |
|:------|:----------|:----|
| web_log.url_pattern_status_code_responses | a dimension per pattern | responses/s |
| web_log.url_pattern_http_method_requests | a dimension per HTTP method | requests/s |
| web_log.url_pattern_bandwidth | received, sent | kilobits/s |
| web_log.url_pattern_request_processing_time | min, max, avg | milliseconds |



## Alerts


The following alerts are available:

| Alert name  | On metric | Description |
|:------------|:----------|:------------|
| [ web_log_1m_unmatched ](https://github.com/netdata/netdata/blob/master/health/health.d/web_log.conf) | web_log.excluded_requests | percentage of unparsed log lines over the last minute |
| [ web_log_1m_requests ](https://github.com/netdata/netdata/blob/master/health/health.d/web_log.conf) | web_log.type_requests | ratio of successful HTTP requests over the last minute (1xx, 2xx, 304, 401) |
| [ web_log_1m_redirects ](https://github.com/netdata/netdata/blob/master/health/health.d/web_log.conf) | web_log.type_requests | ratio of redirection HTTP requests over the last minute (3xx except 304) |
| [ web_log_1m_bad_requests ](https://github.com/netdata/netdata/blob/master/health/health.d/web_log.conf) | web_log.type_requests | ratio of client error HTTP requests over the last minute (4xx except 401) |
| [ web_log_1m_internal_errors ](https://github.com/netdata/netdata/blob/master/health/health.d/web_log.conf) | web_log.type_requests | ratio of server error HTTP requests over the last minute (5xx) |
| [ web_log_web_slow ](https://github.com/netdata/netdata/blob/master/health/health.d/web_log.conf) | web_log.request_processing_time | average HTTP response time over the last 1 minute |
| [ web_log_5m_requests_ratio ](https://github.com/netdata/netdata/blob/master/health/health.d/web_log.conf) | web_log.type_requests | ratio of successful HTTP requests over the last 5 minutes, compared with the previous 5 minutes |


## Setup

### Prerequisites

No action required.

### Configuration

#### File

The configuration file name for this integration is `go.d/web_log.conf`.


You can edit the configuration file using the `edit-config` script from the
Netdata [config directory](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md#the-netdata-config-directory).

```bash
cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/web_log.conf
```
#### Options

Weblog knows how to parse and interpret the following fields (**known fields**):

> [nginx](https://nginx.org/en/docs/varindex.html)
>
> [apache](https://httpd.apache.org/docs/current/mod/mod_log_config.html)

| nginx                   | apache   | description                                                                              |
|-------------------------|----------|------------------------------------------------------------------------------------------|
| $host ($http_host)      | %v       | Name of the server which accepted a request.                                             |
| $server_port            | %p       | Port of the server which accepted a request.                                             |
| $scheme                 | -        | Request scheme. "http" or "https".                                                       |
| $remote_addr            | %a (%h)  | Client address.                                                                          |
| $request                | %r       | Full original request line. The line is "$request_method $request_uri $server_protocol". |
| $request_method         | %m       | Request method. Usually "GET" or "POST".                                                 |
| $request_uri            | %U       | Full original request URI.                                                               |
| $server_protocol        | %H       | Request protocol. Usually "HTTP/1.0", "HTTP/1.1", or "HTTP/2.0".                         |
| $status                 | %s (%>s) | Response status code.                                                                    |
| $request_length         | %I       | Bytes received from a client, including request and headers.                             |
| $bytes_sent             | %O       | Bytes sent to a client, including response and headers.                                  |
| $body_bytes_sent        | %B (%b)  | Bytes sent to a client, not counting the response header.                                |
| $request_time           | %D       | Request processing time.                                                                 |
| $upstream_response_time | -        | Time spent on receiving the response from the upstream server.                           |
| $ssl_protocol           | -        | Protocol of an established SSL connection.                                               |
| $ssl_cipher             | -        | String of ciphers used for an established SSL connection.                                |

Notes:

- Apache `%h` logs the IP address if [HostnameLookups](https://httpd.apache.org/docs/2.4/mod/core.html#hostnamelookups) is Off. The web log collector counts hostnames as IPv4 addresses. We recommend either disabling HostnameLookups or using `%a` instead of `%h`.
- Since httpd 2.0, unlike 1.3, the `%b` and `%B` format strings do not represent the number of bytes sent to the client, but simply the size in bytes of the HTTP response. The two differ, for instance, if the connection is aborted or if SSL is used. The `%O` format provided by [`mod_logio`](https://httpd.apache.org/docs/2.4/mod/mod_logio.html) logs the actual number of bytes sent over the network.
- To get `%I` and `%O` working, you need to enable `mod_logio` on Apache.
- NGINX logs the URI with query parameters; Apache does not.
- `$request` is parsed into `$request_method`, `$request_uri` and `$server_protocol`. If you have `$request` in your log format, there is no need to log the others.
- Don't use both `$bytes_sent` and `$body_bytes_sent` (`%O` and `%B` or `%b`). The module does not distinguish between these parameters.


<details><summary>Config options</summary>

| Name | Description | Default | Required |
|:----|:-----------|:-------|:--------:|
| update_every | Data collection frequency. | 1 | no |
| autodetection_retry | Recheck interval in seconds. Zero means no recheck will be scheduled. | 0 | no |
| path | Path to the web server log file. | | yes |
| exclude_path | Path to exclude. | *.gz | no |
| url_patterns | List of URL patterns. | [] | no |
| url_patterns.name | Used as a dimension name. | | yes |
| url_patterns.pattern | Used to match against the full original request URI. Pattern syntax in [matcher](https://github.com/netdata/go.d.plugin/tree/master/pkg/matcher#supported-format). | | yes |
| parser | Log parser configuration. | | no |
| parser.log_type | Log parser type. | auto | no |
| parser.csv_config | CSV log parser config. | | no |
| parser.csv_config.delimiter | CSV field delimiter. | , | no |
| parser.csv_config.format | CSV log format. | | no |
| parser.ltsv_config | LTSV log parser config. | | no |
| parser.ltsv_config.field_delimiter | LTSV field delimiter. | \t | no |
| parser.ltsv_config.value_delimiter | LTSV value delimiter. | : | no |
| parser.ltsv_config.mapping | LTSV fields mapping to **known fields**. | | yes |
| parser.json_config | JSON log parser config. | | no |
| parser.json_config.mapping | JSON fields mapping to **known fields**. | | yes |
| parser.regexp_config | RegExp log parser config. | | no |
| parser.regexp_config.pattern | RegExp pattern with named groups. | | yes |

##### url_patterns

"URL pattern" scope metrics will be collected for each URL pattern.

Option syntax:

```yaml
url_patterns:
  - name: name1
    pattern: pattern1
  - name: name2
    pattern: pattern2
```


##### parser.log_type

Weblog supports 5 different log parsers:

| Parser type | Description                               |
|-------------|-------------------------------------------|
| auto        | Use CSV and auto-detect format            |
| csv         | Comma-separated values                    |
| json        | [JSON](https://www.json.org/json-en.html) |
| ltsv        | [LTSV](http://ltsv.org/)                  |
| regexp      | Regular expression with named groups      |

Syntax:

```yaml
parser:
  log_type: auto
```

If the `log_type` parameter is set to `auto` (the default), weblog tries to auto-detect the appropriate log parser and log format using the last line of the log file. It:

- checks if the format is `LTSV` (using regexp).
- checks if the format is `JSON` (using regexp).
- otherwise assumes the format is `CSV` and tries to find an appropriate `CSV` log format in a predefined list of formats. It tries to parse the line using each of them in the following order (the first one that matches is used from then on):

```sh
$host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time $upstream_response_time
$host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time
$host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time $upstream_response_time
$host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time
$host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent
                   $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time $upstream_response_time
                   $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time
                   $remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time $upstream_response_time
                   $remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time
                   $remote_addr - - [$time_local] "$request" $status $body_bytes_sent
```

If you're using the default Apache/NGINX log format, auto-detection will work for you. If it doesn't, you need to set the format manually.


##### parser.csv_config.format



##### parser.ltsv_config.mapping

The mapping is a dictionary where the key is a field name as it appears in the logs, and the value is the corresponding **known field**.

> **Note**: don't use `$` and `%` prefixes for mapped field names.

```yaml
parser:
  log_type: ltsv
  ltsv_config:
    mapping:
      label1: field1
      label2: field2
```


##### parser.json_config.mapping

The mapping is a dictionary where the key is a field name as it appears in the logs, and the value is the corresponding **known field**.

> **Note**: don't use `$` and `%` prefixes for mapped field names.

```yaml
parser:
  log_type: json
  json_config:
    mapping:
      label1: field1
      label2: field2
```


##### parser.regexp_config.pattern

Use a pattern with named subexpressions. The names should be **known fields**.

> **Note**: don't use `$` and `%` prefixes for mapped field names.

Syntax:

```yaml
parser:
  log_type: regexp
  regexp_config:
    pattern: PATTERN
```


</details>

#### Examples
There are no configuration examples.



## Troubleshooting

### Debug Mode

To troubleshoot issues with the `web_log` collector, run the `go.d.plugin` with the debug option enabled. The output
should give you clues as to why the collector isn't working.

- Navigate to the `plugins.d` directory, usually at `/usr/libexec/netdata/plugins.d/`. If that's not the case on
  your system, open `netdata.conf` and look for the `plugins` setting under `[directories]`.

  ```bash
  cd /usr/libexec/netdata/plugins.d/
  ```

- Switch to the `netdata` user.

  ```bash
  sudo -u netdata -s
  ```

- Run the `go.d.plugin` to debug the collector:

  ```bash
  ./go.d.plugin -d -m web_log
  ```
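
If the debug run shows that log lines are not being parsed, one thing to try is setting the log format explicitly instead of relying on auto-detection. The following is a minimal sketch of a `go.d/web_log.conf` job, not a verified configuration. It assumes the usual go.d layout of a top-level `jobs` list with a `name` per job, which this page does not show. The job name and log path are hypothetical, and the format string is copied from the last entry of the auto-detection list in the `parser.log_type` section, so adapt it to the format your server actually writes.

```yaml
# Sketch only: job name and path are placeholders, adjust to your setup.
jobs:
  - name: nginx_explicit_format        # hypothetical job name
    path: /var/log/nginx/access.log    # assumed log file location
    parser:
      log_type: csv                    # skip auto-detection
      csv_config:
        delimiter: ' '                 # fields in this format are space-separated
        format: '$remote_addr - - [$time_local] "$request" $status $body_bytes_sent'
```

After editing the file, rerunning the debug command above is a quick way to check whether the lines are still reported as unmatched.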