github.com/netdata/go.d.plugin@v0.58.1/modules/postgres/integrations/postgresql.md

github.com/netdata/go.d.plugin@v0.58.1/modules/postgres/integrations/postgresql.md (about)

     1  <!--startmeta
     2  custom_edit_url: "https://github.com/netdata/go.d.plugin/edit/master/modules/postgres/README.md"
     3  meta_yaml: "https://github.com/netdata/go.d.plugin/edit/master/modules/postgres/metadata.yaml"
     4  sidebar_label: "PostgreSQL"
     5  learn_status: "Published"
     6  learn_rel_path: "Data Collection/Databases"
     7  most_popular: True
     8  message: "DO NOT EDIT THIS FILE DIRECTLY, IT IS GENERATED BY THE COLLECTOR'S metadata.yaml FILE"
     9  endmeta-->
    10  
    11  # PostgreSQL
    12  
    13  
    14  <img src="https://netdata.cloud/img/postgres.svg" width="150"/>
    15  
    16  
    17  Plugin: go.d.plugin
    18  Module: postgres
    19  
    20  <img src="https://img.shields.io/badge/maintained%20by-Netdata-%2300ab44" />
    21  
    22  ## Overview
    23  
    24  This collector monitors the activity and performance of Postgres servers, collects replication statistics, metrics for each database, table and index, and more.
    25  
    26  
    27  It establishes a connection to the Postgres instance via a TCP or UNIX socket.
    28  To collect metrics for database tables and indexes, it establishes an additional connection for each discovered database.
    29  
    30  
    31  This collector is supported on all platforms.
    32  
    33  This collector supports collecting metrics from multiple instances of this integration, including remote instances.
    34  
    35  
    36  ### Default Behavior
    37  
    38  #### Auto-Detection
    39  
    40  By default, it detects instances running on localhost by trying to connect as root and netdata using known PostgreSQL TCP and UNIX sockets:
    41  
    42  - 127.0.0.1:5432
    43  - /var/run/postgresql/
    44  
    45  
    46  #### Limits
    47  
    48  Table and index metrics are not collected for databases with more than 50 tables or 250 indexes.
    49  These limits can be changed in the configuration file.
    50  
    51  
    52  #### Performance Impact
    53  
    54  The default configuration for this integration is not expected to impose a significant performance impact on the system.
    55  
    56  
    57  ## Metrics
    58  
    59  Metrics grouped by *scope*.
    60  
    61  The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.
    62  
    63  
    64  
    65  ### Per PostgreSQL instance
    66  
    67  These metrics refer to the entire monitored application.
    68  
    69  This scope has no labels.
    70  
    71  Metrics:
    72  
    73  | Metric | Dimensions | Unit |
    74  |:------|:----------|:----|
    75  | postgres.connections_utilization | used | percentage |
    76  | postgres.connections_usage | available, used | connections |
    77  | postgres.connections_state_count | active, idle, idle_in_transaction, idle_in_transaction_aborted, disabled | connections |
    78  | postgres.transactions_duration | a dimension per bucket | transactions/s |
    79  | postgres.queries_duration | a dimension per bucket | queries/s |
    80  | postgres.locks_utilization | used | percentage |
    81  | postgres.checkpoints_rate | scheduled, requested | checkpoints/s |
    82  | postgres.checkpoints_time | write, sync | milliseconds |
    83  | postgres.bgwriter_halts_rate | maxwritten | events/s |
    84  | postgres.buffers_io_rate | checkpoint, backend, bgwriter | B/s |
    85  | postgres.buffers_backend_fsync_rate | fsync | calls/s |
    86  | postgres.buffers_allocated_rate | allocated | B/s |
    87  | postgres.wal_io_rate | write | B/s |
    88  | postgres.wal_files_count | written, recycled | files |
    89  | postgres.wal_archiving_files_count | ready, done | files/s |
    90  | postgres.autovacuum_workers_count | analyze, vacuum_analyze, vacuum, vacuum_freeze, brin_summarize | workers |
    91  | postgres.txid_exhaustion_towards_autovacuum_perc | emergency_autovacuum | percentage |
    92  | postgres.txid_exhaustion_perc | txid_exhaustion | percentage |
    93  | postgres.txid_exhaustion_oldest_txid_num | xid | xid |
    94  | postgres.catalog_relations_count | ordinary_table, index, sequence, toast_table, view, materialized_view, composite_type, foreign_table, partitioned_table, partitioned_index | relations |
    95  | postgres.catalog_relations_size | ordinary_table, index, sequence, toast_table, view, materialized_view, composite_type, foreign_table, partitioned_table, partitioned_index | B |
    96  | postgres.uptime | uptime | seconds |
    97  | postgres.databases_count | databases | databases |
    98  
    99  ### Per repl application
   100  
   101  These metrics refer to the replication application.
   102  
   103  Labels:
   104  
   105  | Label      | Description     |
   106  |:-----------|:----------------|
   107  | application | application name |
   108  
   109  Metrics:
   110  
   111  | Metric | Dimensions | Unit |
   112  |:------|:----------|:----|
   113  | postgres.replication_app_wal_lag_size | sent_lag, write_lag, flush_lag, replay_lag | B |
   114  | postgres.replication_app_wal_lag_time | write_lag, flush_lag, replay_lag | seconds |
   115  
   116  ### Per repl slot
   117  
   118  These metrics refer to the replication slot.
   119  
   120  Labels:
   121  
   122  | Label      | Description     |
   123  |:-----------|:----------------|
   124  | slot | replication slot name |
   125  
   126  Metrics:
   127  
   128  | Metric | Dimensions | Unit |
   129  |:------|:----------|:----|
   130  | postgres.replication_slot_files_count | wal_keep, pg_replslot_files | files |
   131  
   132  ### Per database
   133  
   134  These metrics refer to the database.
   135  
   136  Labels:
   137  
   138  | Label      | Description     |
   139  |:-----------|:----------------|
   140  | database | database name |
   141  
   142  Metrics:
   143  
   144  | Metric | Dimensions | Unit |
   145  |:------|:----------|:----|
   146  | postgres.db_transactions_ratio | committed, rollback | percentage |
   147  | postgres.db_transactions_rate | committed, rollback | transactions/s |
   148  | postgres.db_connections_utilization | used | percentage |
   149  | postgres.db_connections_count | connections | connections |
   150  | postgres.db_cache_io_ratio | miss | percentage |
   151  | postgres.db_io_rate | memory, disk | B/s |
   152  | postgres.db_ops_fetched_rows_ratio | fetched | percentage |
   153  | postgres.db_ops_read_rows_rate | returned, fetched | rows/s |
   154  | postgres.db_ops_write_rows_rate | inserted, deleted, updated | rows/s |
   155  | postgres.db_conflicts_rate | conflicts | queries/s |
   156  | postgres.db_conflicts_reason_rate | tablespace, lock, snapshot, bufferpin, deadlock | queries/s |
   157  | postgres.db_deadlocks_rate | deadlocks | deadlocks/s |
   158  | postgres.db_locks_held_count | access_share, row_share, row_exclusive, share_update, share, share_row_exclusive, exclusive, access_exclusive | locks |
   159  | postgres.db_locks_awaited_count | access_share, row_share, row_exclusive, share_update, share, share_row_exclusive, exclusive, access_exclusive | locks |
   160  | postgres.db_temp_files_created_rate | created | files/s |
   161  | postgres.db_temp_files_io_rate | written | B/s |
   162  | postgres.db_size | size | B |
   163  
   164  ### Per table
   165  
   166  These metrics refer to the database table.
   167  
   168  Labels:
   169  
   170  | Label      | Description     |
   171  |:-----------|:----------------|
   172  | database | database name |
   173  | schema | schema name |
   174  | table | table name |
   175  | parent_table | parent table name |
   176  
   177  Metrics:
   178  
   179  | Metric | Dimensions | Unit |
   180  |:------|:----------|:----|
   181  | postgres.table_rows_dead_ratio | dead | percentage |
   182  | postgres.table_rows_count | live, dead | rows |
   183  | postgres.table_ops_rows_rate | inserted, deleted, updated | rows/s |
   184  | postgres.table_ops_rows_hot_ratio | hot | percentage |
   185  | postgres.table_ops_rows_hot_rate | hot | rows/s |
   186  | postgres.table_cache_io_ratio | miss | percentage |
   187  | postgres.table_io_rate | memory, disk | B/s |
   188  | postgres.table_index_cache_io_ratio | miss | percentage |
   189  | postgres.table_index_io_rate | memory, disk | B/s |
   190  | postgres.table_toast_cache_io_ratio | miss | percentage |
   191  | postgres.table_toast_io_rate | memory, disk | B/s |
   192  | postgres.table_toast_index_cache_io_ratio | miss | percentage |
   193  | postgres.table_toast_index_io_rate | memory, disk | B/s |
   194  | postgres.table_scans_rate | index, sequential | scans/s |
   195  | postgres.table_scans_rows_rate | index, sequential | rows/s |
   196  | postgres.table_autovacuum_since_time | time | seconds |
   197  | postgres.table_vacuum_since_time | time | seconds |
   198  | postgres.table_autoanalyze_since_time | time | seconds |
   199  | postgres.table_analyze_since_time | time | seconds |
   200  | postgres.table_null_columns | null | columns |
   201  | postgres.table_size | size | B |
   202  | postgres.table_bloat_size_perc | bloat | percentage |
   203  | postgres.table_bloat_size | bloat | B |
   204  
   205  ### Per index
   206  
   207  These metrics refer to the table index.
   208  
   209  Labels:
   210  
   211  | Label      | Description     |
   212  |:-----------|:----------------|
   213  | database | database name |
   214  | schema | schema name |
   215  | table | table name |
   216  | parent_table | parent table name |
   217  | index | index name |
   218  
   219  Metrics:
   220  
   221  | Metric | Dimensions | Unit |
   222  |:------|:----------|:----|
   223  | postgres.index_size | size | B |
   224  | postgres.index_bloat_size_perc | bloat | percentage |
   225  | postgres.index_bloat_size | bloat | B |
   226  | postgres.index_usage_status | used, unused | status |
   227  
   228  
   229  
   230  ## Alerts
   231  
   232  
   233  The following alerts are available:
   234  
   235  | Alert name  | On metric | Description |
   236  |:------------|:----------|:------------|
   237  | [ postgres_total_connection_utilization ](https://github.com/netdata/netdata/blob/master/health/health.d/postgres.conf) | postgres.connections_utilization | average total connection utilization over the last minute |
   238  | [ postgres_acquired_locks_utilization ](https://github.com/netdata/netdata/blob/master/health/health.d/postgres.conf) | postgres.locks_utilization | average acquired locks utilization over the last minute |
   239  | [ postgres_txid_exhaustion_perc ](https://github.com/netdata/netdata/blob/master/health/health.d/postgres.conf) | postgres.txid_exhaustion_perc | percent towards TXID wraparound |
   240  | [ postgres_db_cache_io_ratio ](https://github.com/netdata/netdata/blob/master/health/health.d/postgres.conf) | postgres.db_cache_io_ratio | average cache hit ratio in db ${label:database} over the last minute |
   241  | [ postgres_db_transactions_rollback_ratio ](https://github.com/netdata/netdata/blob/master/health/health.d/postgres.conf) | postgres.db_cache_io_ratio | average aborted transactions percentage in db ${label:database} over the last five minutes |
   242  | [ postgres_db_deadlocks_rate ](https://github.com/netdata/netdata/blob/master/health/health.d/postgres.conf) | postgres.db_deadlocks_rate | number of deadlocks detected in db ${label:database} in the last minute |
   243  | [ postgres_table_cache_io_ratio ](https://github.com/netdata/netdata/blob/master/health/health.d/postgres.conf) | postgres.table_cache_io_ratio | average cache hit ratio in db ${label:database} table ${label:table} over the last minute |
   244  | [ postgres_table_index_cache_io_ratio ](https://github.com/netdata/netdata/blob/master/health/health.d/postgres.conf) | postgres.table_index_cache_io_ratio | average index cache hit ratio in db ${label:database} table ${label:table} over the last minute |
   245  | [ postgres_table_toast_cache_io_ratio ](https://github.com/netdata/netdata/blob/master/health/health.d/postgres.conf) | postgres.table_toast_cache_io_ratio | average TOAST hit ratio in db ${label:database} table ${label:table} over the last minute |
   246  | [ postgres_table_toast_index_cache_io_ratio ](https://github.com/netdata/netdata/blob/master/health/health.d/postgres.conf) | postgres.table_toast_index_cache_io_ratio | average index TOAST hit ratio in db ${label:database} table ${label:table} over the last minute |
   247  | [ postgres_table_bloat_size_perc ](https://github.com/netdata/netdata/blob/master/health/health.d/postgres.conf) | postgres.table_bloat_size_perc | bloat size percentage in db ${label:database} table ${label:table} |
   248  | [ postgres_table_last_autovacuum_time ](https://github.com/netdata/netdata/blob/master/health/health.d/postgres.conf) | postgres.table_autovacuum_since_time | time elapsed since db ${label:database} table ${label:table} was vacuumed by the autovacuum daemon |
   249  | [ postgres_table_last_autoanalyze_time ](https://github.com/netdata/netdata/blob/master/health/health.d/postgres.conf) | postgres.table_autoanalyze_since_time | time elapsed since db ${label:database} table ${label:table} was analyzed by the autovacuum daemon |
   250  | [ postgres_index_bloat_size_perc ](https://github.com/netdata/netdata/blob/master/health/health.d/postgres.conf) | postgres.index_bloat_size_perc | bloat size percentage in db ${label:database} table ${label:table} index ${label:index} |
   251  
   252  
   253  ## Setup
   254  
   255  ### Prerequisites
   256  
   257  #### Create netdata user
   258  
   259  Create a user with granted `pg_monitor`
   260  or `pg_read_all_stat` [built-in role](https://www.postgresql.org/docs/current/predefined-roles.html).
   261  
   262  To create the `netdata` user with these permissions, execute the following in the psql session, as a user with CREATEROLE privileges:
   263  
   264  ```postgresql
   265  CREATE USER netdata;
   266  GRANT pg_monitor TO netdata;
   267  ```
   268  
   269  After creating the new user, restart the Netdata agent with `sudo systemctl restart netdata`, or
   270  the [appropriate method](https://github.com/netdata/netdata/blob/master/docs/configure/start-stop-restart.md) for your
   271  system.
   272  
   273  
   274  
   275  ### Configuration
   276  
   277  #### File
   278  
   279  The configuration file name for this integration is `go.d/postgres.conf`.
   280  
   281  
   282  You can edit the configuration file using the `edit-config` script from the
   283  Netdata [config directory](https://github.com/netdata/netdata/blob/master/docs/configure/nodes.md#the-netdata-config-directory).
   284  
   285  ```bash
   286  cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
   287  sudo ./edit-config go.d/postgres.conf
   288  ```
   289  #### Options
   290  
   291  The following options can be defined globally: update_every, autodetection_retry.
   292  
   293  
   294  <details><summary>Config options</summary>
   295  
   296  | Name | Description | Default | Required |
   297  |:----|:-----------|:-------|:--------:|
   298  | update_every | Data collection frequency. | 5 | no |
   299  | autodetection_retry | Recheck interval in seconds. Zero means no recheck will be scheduled. | 0 | no |
   300  | dsn | Postgres server DSN (Data Source Name). See [DSN syntax](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING). | postgres://postgres:postgres@127.0.0.1:5432/postgres | yes |
   301  | timeout | Query timeout in seconds. | 2 | no |
   302  | collect_databases_matching | Databases selector. Determines which database metrics will be collected. Syntax is [simple patterns](https://github.com/netdata/go.d.plugin/tree/master/pkg/matcher#simple-patterns-matcher). |  | no |
   303  | max_db_tables | Maximum number of tables in the database. Table metrics will not be collected for databases that have more tables than max_db_tables. 0 means no limit. | 50 | no |
   304  | max_db_indexes | Maximum number of indexes in the database. Index metrics will not be collected for databases that have more indexes than max_db_indexes. 0 means no limit. | 250 | no |
   305  
   306  </details>
   307  
   308  #### Examples
   309  
   310  ##### TCP socket
   311  
   312  An example configuration.
   313  
   314  ```yaml
   315  jobs:
   316    - name: local
   317      dsn: 'postgresql://netdata@127.0.0.1:5432/postgres'
   318  
   319  ```
   320  ##### Unix socket
   321  
   322  An example configuration.
   323  
   324  <details><summary>Config</summary>
   325  
   326  ```yaml
   327  jobs:
   328    - name: local
   329      dsn: 'host=/var/run/postgresql dbname=postgres user=netdata'
   330  
   331  ```
   332  </details>
   333  
   334  ##### Multi-instance
   335  
   336  > **Note**: When you define multiple jobs, their names must be unique.
   337  
   338  Local and remote instances.
   339  
   340  
   341  <details><summary>Config</summary>
   342  
   343  ```yaml
   344  jobs:
   345    - name: local
   346      dsn: 'postgresql://netdata@127.0.0.1:5432/postgres'
   347  
   348    - name: remote
   349      dsn: 'postgresql://netdata@203.0.113.0:5432/postgres'
   350  
   351  ```
   352  </details>
   353  
   354  
   355  
   356  ## Troubleshooting
   357  
   358  ### Debug Mode
   359  
   360  To troubleshoot issues with the `postgres` collector, run the `go.d.plugin` with the debug option enabled. The output
   361  should give you clues as to why the collector isn't working.
   362  
   363  - Navigate to the `plugins.d` directory, usually at `/usr/libexec/netdata/plugins.d/`. If that's not the case on
   364    your system, open `netdata.conf` and look for the `plugins` setting under `[directories]`.
   365  
   366    ```bash
   367    cd /usr/libexec/netdata/plugins.d/
   368    ```
   369  
   370  - Switch to the `netdata` user.
   371  
   372    ```bash
   373    sudo -u netdata -s
   374    ```
   375  
   376  - Run the `go.d.plugin` to debug the collector:
   377  
   378    ```bash
   379    ./go.d.plugin -d -m postgres
   380    ```
   381  
   382