github.com/jonaz/heapster@v1.3.0-beta.0.0.20170208112634-cd3c15ca3d29/docs/proposals/push-metrics.md

     1  # Heapster Push Metrics
     2  
     3  ## Overview and Motivation
     4  
     5  Currently, Heapster supports pulling metrics from kubelet, and defines an
     6  interface for pulling from other sources.  However, in certain cases, it is
     7  more useful to be able to have services push metrics into Heapster, instead
     8  of having Heapster pull metrics.
     9  
    10  For instance, supporting push metrics makes it easy for cluster admins to add
    11  custom metrics into Heapster with relatively minimal effort: they can simply
    12  write a program or script which collects and processes the information, and
    13  then add a recurring Job or cron job which pushes the metrics.  This also
    14  enables existing tooling designed with the push model in mind to be adapted to
    15  provide metrics through Heapster.
    16  
    17  ### Target Audience and Metrics
    18  
    19  Like the existing custom metrics pull mechanism, the metrics pushed through the
    20  push mechanism are intended to be those metrics that are useful for consumption
    21  by system components, such as metrics intended for use with autoscaling.  The
    22  custom metrics pushed through the push mechanisms are still intended to follow
    23  the overall guidelines for Heapster custom metrics (keep the number of
    24  different metric names relatively limited, etc).
    25  
    26  The current custom metrics mechanism can only be used to collect custom metrics
    27  which describe the producing pod.  This proposal is designed to provide a
    28  method of collecting metrics which describe multiple resources (other pods,
    29  services, etc) across the cluster.  Such producers are generally add-on cluster
    30  infrastructure components deployed by the cluster admins.  The push mechanisms
    31  are not, in general, intended for use by arbitrary cluster users (although the
    32  metrics transferred would probably apply to the users' pods).  It is up to
    33  cluster admins to decide which applications are permitted to use the push
    34  mechanism (see the "Authentication" section below for more information on how
    35  cluster admins can control access).
    36  
    37  For these reasons, the push metrics mechanism is not designed to replace the
    38  existing pull mechanism for custom metrics, but instead to support additional
    39  use cases not already supported.
    40  
    41  ## API
    42  
    43  ### Authentication, Segregation, and Flow Control
    44  
    45  Producers will be authenticated similarly to the way Heapster currently
    46  authenticates clients: producers will present a certificate signed by a CA, and
    47  Heapster will be configurable to only allow certain names to push metrics, or
    48  to allow any certificate signed by the CA.
    49  
    50  Additionally, the name presented during authentication will be used as a prefix
    51  for all metrics added.  This will prevent two different metrics producers from
    52  accidentally overwriting each other's custom metrics (otherwise, push metrics
    53  will be stored and retrieved identically to pull-based custom metrics).
    54  
    55  In order to prevent push metrics from overwhelming the Heapster instance, it
    56  will be possible to limit the total number of custom metrics each producer is
    57  allowed to add, and the frequency at which producers are allowed to push new
    58  sets of custom metrics.  By default, no limits will be enforced unless
    59  explicitly set via command line arguments.
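
The prefixing and flow-control rules above could be sketched roughly as follows. This is only an illustration, not Heapster code: `ProducerLimiter` and `prefixed_name` are hypothetical names, the `/` separator is inferred from the example paths later in this document, and the metric-count limit is interpreted here as distinct metric names per push.

```python
import time


class ProducerLimiter:
    """Tracks per-producer push limits (hypothetical helper, not Heapster code)."""

    def __init__(self, max_metrics=None, min_interval_seconds=None):
        # None means "no limit", mirroring the proposed default.
        self.max_metrics = max_metrics
        self.min_interval_seconds = min_interval_seconds
        self._last_push = {}  # producer name -> time of last accepted push

    def allow(self, producer, metric_names, now=None):
        """Return True if this push is within the configured limits."""
        now = time.monotonic() if now is None else now
        # Assumption: the limit counts distinct metric names in one push.
        if self.max_metrics is not None and len(set(metric_names)) > self.max_metrics:
            return False
        last = self._last_push.get(producer)
        if (self.min_interval_seconds is not None and last is not None
                and now - last < self.min_interval_seconds):
            return False
        self._last_push[producer] = now
        return True


def prefixed_name(producer, metric_name):
    """Scope a metric name by the authenticated producer name."""
    return f"{producer}/{metric_name}"
```

Because the prefix comes from the authenticated name rather than from the request body, two producers pushing a metric with the same name end up with distinct stored names.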
    60  
    61  ### Paths
    62  
    63  To add metrics, metrics producers will `POST` new metrics to
    64  `/api/v1/push/{format}/{subpath}` in the format specified by `{format}`.  The
    65  metrics are pushed in bulk -- it is up to the format to support a way to name
    66  specific metrics, namespaces, pods, and containers (for instance, the
    67  Prometheus format uses metric names and labels for this purpose).  The
    68  `{subpath}` option enables different formats to have format-specific sub-paths.
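
From the producer's side, a push might look like the sketch below. It only builds the request and does not send it. The helper name `build_push_request`, the host name in the usage note, and the `Content-Type` value (the one commonly used for the Prometheus text format) are assumptions for illustration and are not specified by this proposal.

```python
import urllib.request


def build_push_request(base_url, fmt, subpath, body):
    """Build (but do not send) a POST request for the proposed push endpoint."""
    path = f"/api/v1/push/{fmt}/{subpath.lstrip('/')}"
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body.encode("utf-8"),
        method="POST",
        # Assumption: the Prometheus text exposition content type.
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )
```

A producer would pass the resulting request to `urllib.request.urlopen`, together with an `ssl.SSLContext` that has its client certificate loaded, for example `build_push_request("https://heapster.example", "prometheus", "metrics/job/http_gatherer", text)`.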
    69  
    70  ### Metrics Format
    71  
    72  The underlying design can support multiple format "backends".  The initial
    73  backend, detailed here, will be based on the Prometheus format, and will be
    74  available at `/api/v1/push/prometheus/`,
    75  `/api/v1/push/prometheus/metrics/job/{producer_name}`, or
    76  `/api/v1/push/prometheus/metrics/jobs/{producer_name}`.  If either of the latter
    77  two paths is used, `{producer_name}` should match the name used when
    78  authenticating with Heapster.
    79  
    80  Both the Prometheus text and protobuf formats will be supported.
    81  
    82  The metric name specified on each metric line will be used as the custom metric
    83  name in Heapster (except prefixed as discussed in the "Authentication"
    84  section).  The following labels will be used to determine which object a metric
    85  is associated with:
    86  
    87  - For Kubernetes-related metrics, the `namespace` label will indicate
    88    namespace, the `pod` label will indicate pod, and the `container` label will
    89    indicate container.  If only `namespace` is present, then the metric will be
    90    considered namespace-level.  If `namespace` and `pod` are present, then the
    91    metric will be considered pod-level.  If all three are present, the metric
    92    will be considered container-level.
    93  
    94  - For Kubernetes-related metrics, the `service` label can be used in
    95    conjunction with the `namespace` label to indicate a service-level metric.
    96    Currently, this will be stored in Heapster as a namespace-level metric
    97    prefixed with the service name.
    98  
    99  - For non-Kubernetes-related metrics, the `node` label will indicate node, and
   100    the `container` label will indicate free container.  If only `node` is
   101    present, the metric will be considered node-level.  Otherwise, the metric
   102    will be considered free-container-level.
   103  
   104  If the `node`, `namespace`, `pod`, `container`, and `service` labels are
   105  not present in one of the configurations listed above, the metric line is
   106  invalid and the batch should be rejected.
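
The label rules above amount to a small lookup from the set of scoping labels to a metric level. The following is a minimal sketch under a strict reading of those rules, where any combination not listed is rejected. The function name `metric_level` and the level strings are hypothetical.

```python
def metric_level(labels):
    """Return the level a metric line describes, or raise ValueError if invalid."""
    scoping = {"node", "namespace", "pod", "container", "service"}
    # Only the scoping labels decide the level; other labels are ignored here.
    present = frozenset(k for k in labels if k in scoping)
    levels = {
        frozenset({"namespace"}): "namespace",
        frozenset({"namespace", "pod"}): "pod",
        frozenset({"namespace", "pod", "container"}): "container",
        frozenset({"namespace", "service"}): "service",
        frozenset({"node"}): "node",
        frozenset({"node", "container"}): "free-container",
    }
    if present not in levels:
        raise ValueError(f"invalid label combination: {sorted(present)}")
    return levels[present]
```

A receiver following this sketch would reject the whole batch as soon as any line raises `ValueError`.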
   107  
   108  Any additional labels will be treated the same as labels on existing custom
   109  metrics (currently multiple custom metrics with the same name, but different
   110  labels, are ignored, but this seems like an oversight and should probably be
   111  fixed).
   112  
   113  Timestamps will be assigned by using the next Heapster metrics batch timestamp
   114  after the time at which the metrics are received.  If a timestamp is provided
   115  as part of the metric line, this may be stored as a separate field for
   116  posterity, but the "official" timestamp will be that of the assigned batch.
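
As a rough illustration of the timestamp rule, assume that batches fall on fixed boundaries that are multiples of some resolution. That alignment, the `resolution_seconds` parameter, and the function name are assumptions made for this sketch and are not stated by the proposal.

```python
import math


def assign_batch_timestamp(received_at, resolution_seconds=60):
    """Return the first batch boundary strictly after received_at (Unix seconds)."""
    return (math.floor(received_at / resolution_seconds) + 1) * resolution_seconds
```

Under these assumptions, a push received at second 125 with a 60-second resolution would be recorded with the batch timestamp 180, regardless of any timestamp carried on the metric line.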
   117  
   118  As with normal Prometheus metrics, the `TYPE` line should be used to provide
   119  the type of the metric.
   120  
   121  #### Example
   122  
   123  Suppose a producer with the name "http_gatherer" sent the following metrics to
   124  `/api/v1/push/prometheus`:
   125  
   126  ```
   127  # This is a pod-level metric (it might be used for autoscaling)
   128  # TYPE http_requests_per_minute gauge
   129  http_requests_per_minute{namespace="webapp",pod="frontend-server-a-1"} 20
   130  http_requests_per_minute{namespace="webapp",pod="frontend-server-a-2"} 5
   131  http_requests_per_minute{namespace="webapp",pod="frontend-server-b-1"} 25
   132  
   133  # This is a service-level metric, which will be stored as frontend/hits_total
   134  # and restapi/hits_total (these might be used for auto-idling)
   135  # TYPE hits_total counter
   136  hits_total{namespace="webapp",service="frontend"} 5000
   137  hits_total{namespace="webapp",service="restapi"} 6000
   138  ```
   139  
   140  This would result in the metrics being available at:
   141  
   142  ```
   143  /api/v1/model/namespaces/webapp/pods/frontend-server-a-1/metrics/custom/http_gatherer/http_requests_per_minute
   144  /api/v1/model/namespaces/webapp/pods/frontend-server-a-2/metrics/custom/http_gatherer/http_requests_per_minute
   145  /api/v1/model/namespaces/webapp/pods/frontend-server-b-1/metrics/custom/http_gatherer/http_requests_per_minute
   146  /api/v1/model/namespaces/webapp/metrics/custom/http_gatherer/frontend/hits_total
   147  /api/v1/model/namespaces/webapp/metrics/custom/http_gatherer/restapi/hits_total
   148  ```
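
The mapping from pushed samples to the paths listed above can be sketched as follows. Only the pod-level and service-level cases shown in the example are covered. The function name `model_path` is hypothetical, and the path layout is copied from the listing above rather than from Heapster's code.

```python
def model_path(producer, name, labels):
    """Build the model API path for a pushed sample, following the example above."""
    base = f"/api/v1/model/namespaces/{labels['namespace']}"
    if "pod" in labels:
        # Pod-level metric: scoped by the producer name only.
        return f"{base}/pods/{labels['pod']}/metrics/custom/{producer}/{name}"
    if "service" in labels:
        # Service-level metric: stored at namespace level, prefixed by service.
        return f"{base}/metrics/custom/{producer}/{labels['service']}/{name}"
    raise ValueError("only pod-level and service-level samples are sketched here")
```

Calling `model_path("http_gatherer", "hits_total", {"namespace": "webapp", "service": "frontend"})` yields the fourth path in the listing.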
   149  
   150  ## Discussed Alternatives
   151  
   152  A number of alternatives came up during the discussion of this proposal.  They
   153  are discussed briefly below.  Note that most of these alternatives do not deal
   154  particularly well with a case where metrics need to come from a source that is
   155  not running as a pod on the cluster.  While it is expected that many of the
   156  producers will be running as components on the cluster (e.g. as DaemonSets or
   157  PetSets), it could still be advantageous to support metrics coming from
   158  components that are not in the form of pods.
   159  
   160  ### Writing directly into sinks
   161  
   162  This alternative would have producers write directly into sinks in the Heapster
   163  storage schema, and then use a mechanism similar to the Oldtimer API to read
   164  the metrics back.
   165  
   166  This would require every producer to know how to talk to every sink, would make
   167  configuring the sinks more complicated, and would most likely lead to software
   168  only being able to talk to one of the sinks supported by Heapster.
   169  Additionally, you lose the benefits of the Heapster model, and either have to
   170  adapt the existing Heapster model to fall back to an Oldtimer-like approach, or
   171  teach all cluster components to be able to read from both the Heapster model
   172  and Oldtimer simultaneously.
   173  
   174  ### Reworking the existing cAdvisor-Kubelet-Heapster Pull Mechanism
   175  
   176  This alternative would involve reworking the existing pull mechanism to allow
   177  certain pods to produce metrics that describe other resources besides the
   178  themselves, as opposed to the current situation, where all custom metrics
   179  collected via the current pull mechanism are marked as describing the producer
   180  pod.
   181  
   182  This would require a mechanism for indicating to Heapster which pod names were
   183  allowed to produce metrics that describe other resources, since admins would
   184  generally want most pods producing metrics to continue to just have metrics
   185  which describe only the producer pod.  It would also conceptually blend
   186  together pods producing metrics about themselves versus pods producing metrics
   187  about others.  Additionally, the current cAdvisor-based custom metrics
   188  collection is not secured, so all metrics would be available to anyone with
   189  knowledge of the appropriate port, but this may change in the future.
   190  
   191  ### Using a new daemon per node to produce metrics
   192  
   193  This alternative would involve running a daemon on each node that aggregated
   194  all the separate custom metrics producers' results together.
   195  
   196  It was suggested that an approach similar to the Prometheus Node Exporter
   197  Textfile Collector could be used, in which sources would write their metrics to
   198  files in a directory, which would later be read by the collector when polled for
   199  metrics.  When the producers are containerized, you'd need to use a hostPath
   200  volume, have the daemon look for specific emptyDir mounts in containers (and
   201  use one directory per container), or something similar.
   202  
   203  Alternatively, a new daemon could be run on each node that was responsible for
   204  collecting metrics from producers who produce bulk metrics describing other
   205  resources.
   206  
   207  This would still require some sort of auth to limit which pods were allowed to
   208  do so (while the scoping above prevents collision, cluster admins would most
   209  likely still want to limit which pods are allowed to post metrics which appear
   210  in another pod's list of custom metrics).  Unlike the proposal above, admins
   211  could not simply rely on a "whoever is allowed to authenticate" rule, since
   212  cAdvisor does not check certificates like the normal Heapster auth mechanism.
   213  
   214  Additionally, this adds a bit of complexity on the producer's side, since it
   215  requires continuously serving the metrics (this could be made easier by
   216  providing a tool like the Prometheus Node Exporter Textfile Collector, which
   217  just serves up metrics based on text files in a directory).
   218  
   219  ### Adding an additional standard pull mechanism
   220  
   221  This alternative would involve writing a pull mechanism which, for instance, was
   222  just able to read Prometheus metrics directly.  This would either require
   223  admins to configure the Heapster instance to know about every custom metrics
   224  source (and restart Heapster when a new source needed to be added), or would
   225  require teaching Heapster how to look for an annotation on certain pods to
   226  determine which pods to query (Heapster would have a list-watch on the pods,
   227  and look for pods added/removed/changed with the appropriate annotation).
   228  
   229  When used in the latter form, this mechanism would still likely require a
   230  similar auth setup to the one proposed above, in order to allow the admin to
   231  restrict which pods actually were allowed to produce the metrics.  It also has
   232  restrictions and disadvantages similar to those of the "Using a new daemon
   233  per node" method discussed above.