<!--startmeta
custom_edit_url: "https://github.com/netdata/go.d.plugin/edit/master/modules/nvme/README.md"
meta_yaml: "https://github.com/netdata/go.d.plugin/edit/master/modules/nvme/metadata.yaml"
sidebar_label: "NVMe devices"
learn_status: "Published"
learn_rel_path: "Data Collection/Storage, Mount Points and Filesystems"
most_popular: False
message: "DO NOT EDIT THIS FILE DIRECTLY, IT IS GENERATED BY THE COLLECTOR'S metadata.yaml FILE"
endmeta-->

# NVMe devices

<img src="https://netdata.cloud/img/nvme.svg" width="150"/>

Plugin: go.d.plugin
Module: nvme

<img src="https://img.shields.io/badge/maintained%20by-Netdata-%2300ab44" />

## Overview

This collector monitors the health of NVMe devices using the command-line tool [nvme](https://github.com/linux-nvme/nvme-cli#nvme-cli), which can only be run by the root user. The collector invokes it through `sudo`, so `sudo` must be set up to let the `netdata` user execute `nvme` as root without a password.

This collector is supported on all platforms.

This collector supports collecting metrics from multiple instances of this integration, including remote instances.

### Default Behavior

#### Auto-Detection

This integration doesn't support auto-detection.

#### Limits

The default configuration for this integration does not impose any limits on data collection.

#### Performance Impact

The default configuration for this integration is not expected to impose a significant performance impact on the system.

## Metrics

Metrics are grouped by *scope*.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.

### Per device

These metrics refer to the NVMe device.

Labels:

| Label | Description |
|:------|:------------|
| device | NVMe device name |

Metrics:

| Metric | Dimensions | Unit |
|:------|:----------|:----|
| nvme.device_estimated_endurance_perc | used | % |
| nvme.device_available_spare_perc | spare | % |
| nvme.device_composite_temperature | temperature | celsius |
| nvme.device_io_transferred_count | read, written | bytes |
| nvme.device_power_cycles_count | power | cycles |
| nvme.device_power_on_time | power-on | seconds |
| nvme.device_critical_warnings_state | available_spare, temp_threshold, nvm_subsystem_reliability, read_only, volatile_mem_backup_failed, persistent_memory_read_only | state |
| nvme.device_unsafe_shutdowns_count | unsafe | shutdowns |
| nvme.device_media_errors_rate | media | errors/s |
| nvme.device_error_log_entries_rate | error_log | entries/s |
| nvme.device_warning_composite_temperature_time | wctemp | seconds |
| nvme.device_critical_composite_temperature_time | cctemp | seconds |
| nvme.device_thermal_mgmt_temp1_transitions_rate | temp1 | transitions/s |
| nvme.device_thermal_mgmt_temp2_transitions_rate | temp2 | transitions/s |
| nvme.device_thermal_mgmt_temp1_time | temp1 | seconds |
| nvme.device_thermal_mgmt_temp2_time | temp2 | seconds |

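Some of the units in the table can be related to the raw SMART / Health Information log defined by the NVMe specification, which reports composite temperature in kelvins, data units in thousands of 512-byte blocks, and power-on time in hours. The sketch below shows that arithmetic only. Whether the collector performs exactly these conversions is an assumption, and the function names are hypothetical.

```bash
# Convert raw NVMe SMART log values to the units used in the metrics table.
# Function names are hypothetical; integer arithmetic only.

# Composite temperature: kelvins -> degrees Celsius (integer offset of 273).
kelvin_to_celsius() { echo $(( $1 - 273 )); }

# Data units read/written: 1 unit = 1000 blocks of 512 bytes.
data_units_to_bytes() { echo $(( $1 * 1000 * 512 )); }

# Power-on time: hours -> seconds.
hours_to_seconds() { echo $(( $1 * 3600 )); }

kelvin_to_celsius 310     # prints 37
data_units_to_bytes 2     # prints 1024000
hours_to_seconds 24       # prints 86400
```
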
## Alerts

The following alerts are available:

| Alert name | On metric | Description |
|:-----------|:----------|:------------|
| [nvme_device_critical_warnings_state](https://github.com/netdata/netdata/blob/master/health/health.d/nvme.conf) | nvme.device_critical_warnings_state | NVMe device ${label:device} has critical warnings |

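The dimensions of `nvme.device_critical_warnings_state` follow the bit order of the Critical Warning byte in the NVMe SMART / Health Information log (bit 0 is available spare, bit 5 is persistent memory read-only). As a rough sketch, a raw value can be decoded as shown below. The function name is hypothetical and this is not the collector's own code.

```bash
# Decode an NVMe Critical Warning byte into the dimension names used by
# nvme.device_critical_warnings_state. Prints one "name=0|1" pair per line.
decode_critical_warning() {
  value=$1
  bit=0
  for name in available_spare temp_threshold nvm_subsystem_reliability \
              read_only volatile_mem_backup_failed persistent_memory_read_only; do
    echo "$name=$(( (value >> bit) & 1 ))"
    bit=$(( bit + 1 ))
  done
}

# 9 is binary 001001: bits 0 and 3 are set,
# so available_spare=1 and read_only=1, all others 0.
decode_critical_warning 9
```

Any dimension reported as `1` means the device has raised that warning, which is the condition the alert above reports.
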
## Setup

### Prerequisites

#### Install nvme-cli

See [Distro Support](https://github.com/linux-nvme/nvme-cli#distro-support). Install `nvme-cli` using your distribution's package manager.

#### Allow netdata to execute nvme

Add a rule for the `netdata` user to `/etc/sudoers` (use `which nvme` to find the full path to the binary):

```bash
netdata ALL=(root) NOPASSWD: /usr/sbin/nvme
```

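As a minimal sketch, the same rule can be generated for whatever path `nvme` has on your system and installed as a drop-in file instead of editing `/etc/sudoers` by hand. The function name `sudoers_entry` and the file name `netdata-nvme` are illustrative assumptions, and the drop-in approach only works if your `sudoers` includes `/etc/sudoers.d`. The commented commands need root.

```bash
# Print a sudoers rule that lets the netdata user run one binary as root
# without a password. The argument is the full path to the nvme binary.
sudoers_entry() {
  printf 'netdata ALL=(root) NOPASSWD: %s\n' "$1"
}

# Example (run as root; the file name is a hypothetical choice):
#   sudoers_entry "$(command -v nvme)" > /etc/sudoers.d/netdata-nvme
#   chmod 0440 /etc/sudoers.d/netdata-nvme
#   visudo -c -f /etc/sudoers.d/netdata-nvme    # syntax check
#
# Then confirm that the rule works without a password prompt:
#   sudo -u netdata sudo -n "$(command -v nvme)" list
```
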
### Configuration

#### File

The configuration file name for this integration is `go.d/nvme.conf`.

You can edit the configuration file using the `edit-config` script from the
Netdata [config directory](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md#the-netdata-config-directory).

```bash
cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/nvme.conf
```

#### Options

The following options can be defined globally: `update_every`, `autodetection_retry`.

<details><summary>Config options</summary>

| Name | Description | Default | Required |
|:----|:-----------|:-------|:--------:|
| update_every | Data collection frequency in seconds. | 10 | no |
| autodetection_retry | Recheck interval in seconds. Zero means no recheck will be scheduled. | 0 | no |
| binary_path | Path to the nvme binary. The default is "nvme", and the executable is looked for in the directories specified in the PATH environment variable. | nvme | no |
| timeout | Execution timeout for the nvme binary, in seconds. | 2 | no |

</details>

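To illustrate how these options combine, the sketch below writes a sample job definition to a scratch file for review. The values are arbitrary examples rather than recommendations, and the temporary file is only a staging location: the real file is `go.d/nvme.conf`.

```bash
# Write a sample nvme job definition to a scratch file for review.
# Values are illustrative; copy what you need into go.d/nvme.conf.
sample_conf="$(mktemp)"
cat > "$sample_conf" <<'EOF'
update_every: 10            # global: collect every 10 seconds
autodetection_retry: 60     # global: recheck a failed job every 60 seconds

jobs:
  - name: nvme
    binary_path: /usr/sbin/nvme   # full path instead of a PATH lookup
    timeout: 5                    # allow the nvme binary up to 5 seconds
EOF
cat "$sample_conf"
```
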
#### Examples

##### Custom binary path

Use this when the executable is not in any of the directories specified in the `PATH` environment variable.

<details><summary>Config</summary>

```yaml
jobs:
  - name: nvme
    binary_path: /usr/local/sbin/nvme
```
</details>

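To decide whether `binary_path` is needed, you can check whether the name resolves through `PATH`. This is a small sketch, and the helper name `find_binary` is hypothetical. Keep in mind that the `PATH` seen by the `netdata` user may differ from the one in your login shell.

```bash
# Report where a binary resolves via PATH, or fail if it does not.
find_binary() {
  if path="$(command -v "$1" 2>/dev/null)"; then
    echo "$path"
  else
    echo "$1 not found in PATH; set binary_path to its full path" >&2
    return 1
  fi
}

# Prints the resolved path, or the hint above if nvme is not in PATH.
find_binary nvme || true
```
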
## Troubleshooting

### Debug Mode

To troubleshoot issues with the `nvme` collector, run the `go.d.plugin` with the debug option enabled. The output
should give you clues as to why the collector isn't working.

- Navigate to the `plugins.d` directory, usually at `/usr/libexec/netdata/plugins.d/`. If that's not the case on
  your system, open `netdata.conf` and look for the `plugins` setting under `[directories]`.

  ```bash
  cd /usr/libexec/netdata/plugins.d/
  ```

- Switch to the `netdata` user.

  ```bash
  sudo -u netdata -s
  ```

- Run the `go.d.plugin` to debug the collector:

  ```bash
  ./go.d.plugin -d -m nvme
  ```

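The three steps above can also be collapsed into a single command. The sketch below only prints that command for a given plugins directory so that it can be reviewed before it is run. The function name is hypothetical, and the default path is the usual location mentioned above.

```bash
# Print a one-line equivalent of the debug steps for a plugins directory.
# With no argument, the usual /usr/libexec/netdata/plugins.d is assumed.
debug_command() {
  plugins_dir="${1:-/usr/libexec/netdata/plugins.d}"
  echo "sudo -u netdata $plugins_dir/go.d.plugin -d -m nvme"
}

debug_command
```

Copy the printed line into a terminal to run the collector in debug mode as the `netdata` user.
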