
## Metrics

Heapster exports the following metrics to its backends.

| Metric Name | Description |
|------------|-------------|
| cpu/limit | CPU hard limit in millicores. |
| cpu/node_reservation | Share of CPU that is reserved on the node. |
| cpu/node_utilization | CPU utilization as a share of node capacity. |
| cpu/request | CPU request (the guaranteed amount of resources) in millicores. |
| cpu/usage | Cumulative CPU usage on all cores. |
| cpu/usage_rate | CPU usage on all cores in millicores. |
| filesystem/usage | Total number of bytes consumed on a filesystem. |
| filesystem/limit | The total size of the filesystem in bytes. |
| filesystem/available | The number of available bytes remaining on the filesystem. |
| memory/limit | Memory hard limit in bytes. |
| memory/major_page_faults | Number of major page faults. |
| memory/major_page_faults_rate | Number of major page faults per second. |
| memory/node_reservation | Share of memory that is reserved on the node. |
| memory/node_utilization | Memory utilization as a share of memory capacity. |
| memory/page_faults | Number of page faults. |
| memory/page_faults_rate | Number of page faults per second. |
| memory/request | Memory request (the guaranteed amount of resources) in bytes. |
| memory/usage | Total memory usage. |
| memory/working_set | Total working set usage. The working set is memory that is in use and not easily dropped by the kernel. |
| network/rx | Cumulative number of bytes received over the network. |
| network/rx_errors | Cumulative number of errors while receiving over the network. |
| network/rx_errors_rate | Number of errors while receiving over the network per second. |
| network/rx_rate | Number of bytes received over the network per second. |
| network/tx | Cumulative number of bytes sent over the network. |
| network/tx_errors | Cumulative number of errors while sending over the network. |
| network/tx_errors_rate | Number of errors while sending over the network per second. |
| network/tx_rate | Number of bytes sent over the network per second. |
| uptime | Number of milliseconds since the container was started. |

All custom (aka application) metrics are prefixed with `custom/`.

## Labels

Heapster tags each metric with the following labels.

| Label Name     | Description                                                                   |
|----------------|-------------------------------------------------------------------------------|
| pod_id         | Unique ID of a Pod                                                            |
| pod_name       | User-provided name of a Pod                                                   |
| pod_namespace  | The namespace of a Pod                                                        |
| container_base_image | Base image for the container                                            |
| container_name | User-provided name of the container, or full cgroup name for system containers |
| host_id        | Cloud-provider-specified or user-specified identifier of a node               |
| hostname       | Hostname where the container ran                                              |
| labels         | Comma-separated list of user-provided labels. Format is `key:value`           |
| namespace_id   | UID of the namespace of a Pod                                                 |
| resource_id    | A unique identifier used to differentiate multiple metrics of the same type, e.g. filesystem partitions under filesystem/usage |

## Aggregates

Metrics are initially collected for nodes and containers, and later aggregated for pods, namespaces, and clusters.
Disk and network metrics are not available at the container level (only at the pod and node level).

## Storage Schema

### InfluxDB

##### Default

Each metric translates to a separate 'series' in InfluxDB. Labels are stored as tags.
The metric name is not modified.

##### Using fields

If you want to use InfluxDB fields, add `with_fields=true` as a parameter in the InfluxDB sink URL.
(More information here: https://docs.influxdata.com/influxdb/v0.9/concepts/key_concepts/)

In that case, each metric translates to a separate field in a 'series' in InfluxDB. This means that some metrics are grouped in the same 'measurement'.
For example, the measurement 'cpu' has the fields 'node_reservation', 'node_utilization', 'request', 'usage', and 'usage_rate'.
All labels are still stored as tags.
The measurement list is: cpu, filesystem, memory, network, uptime.

Also, the standard Grafana dashboards do not work with this schema; you have to use the [new dashboards](/grafana/dashboards/influxdb_withfields).

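The difference between the two schemas shows up directly in queries. The following InfluxQL queries are illustrative only (pod name and field names are example values; the default schema is assumed to store the sample under a single `value` field):

```sql
-- Default schema: one measurement per metric, labels as tags.
SELECT "value" FROM "cpu/usage_rate" WHERE "pod_name" = 'frontend-1'

-- with_fields=true: related metrics grouped as fields of one measurement.
SELECT "usage_rate" FROM "cpu" WHERE "pod_name" = 'frontend-1'
```
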
### Google Cloud Monitoring

The metrics mentioned above are stored along with their corresponding labels as [custom metrics](https://cloud.google.com/monitoring/custom-metrics/) in Google Cloud Monitoring.

* Metrics are collected every 2 minutes by default and pushed with 1-minute precision.
* Each metric has the custom metric prefix `custom.cloudmonitoring.googleapis.com`.
* Each metric is pushed with an additional namespace prefix, `kubernetes.io`.
* GCM does not support visualizing cumulative metrics yet. To work around that, Heapster exports an equivalent gauge metric for every cumulative metric mentioned above.

  The gauge metrics use their parent cumulative metric name as the prefix, followed by a "_rate" suffix.
  E.g. "cpu/usage", which is cumulative, has a corresponding gauge metric "cpu/usage_rate".
  NOTE: The gauge metrics will be deprecated as soon as GCM supports visualizing cumulative metrics.

TODO: Add a snapshot of all the metrics stored in GCM.

### Hawkular

Each metric is stored as a separate time series (metric) in Hawkular-Metrics, with tags inherited from the common ancestor type. The metric name is created with the following format: `containerName/podId/metricName` (`/` is the separator). Each definition stores the labels as tags, with the following additions:

* All the label descriptions are stored as `label_description`
* The ancestor metric name (such as cpu/usage) is stored under the tag `descriptor_name`
* To ease searching, a `group_id` tag stores the key `containerName/metricName`, so each podId can be linked under a single time series if necessary
* Units are stored under the `units` tag
* If the `labelToTenant` parameter is given, any metric with that label uses the label's value as the target tenant. If the metric doesn't have the label defined, the default tenant is used.

At startup, all the definitions are fetched from the Hawkular-Metrics tenant and filtered to cache only the Heapster metrics. If you have lots of metrics from other systems, it is recommended (but not required) to use a separate tenant for Heapster information.

The Hawkular-Metrics instance can be a standalone installation of Hawkular-Metrics or the full installation of Hawkular.