# Heapster Push Metrics

## Overview and Motivation

Currently, Heapster supports pulling metrics from kubelet, and defines an
interface for pulling from other sources. However, in certain cases, it is
more useful to be able to have services push metrics into Heapster, instead
of having Heapster pull metrics.

For instance, supporting push metrics makes it easy for cluster admins to add
custom metrics into Heapster with relatively minimal effort: they can simply
write a program or script which collects and processes the information, and
then add a recurring Job or cron job which pushes the metrics. This also
enables existing tooling designed with the push model in mind to be adapted to
provide metrics through Heapster.

### Target Audience and Metrics

Like the existing custom metrics pull mechanism, the metrics pushed through the
push mechanism are intended to be those metrics that are useful for consumption
by system components, such as metrics intended for use with autoscaling. The
custom metrics pushed through the push mechanisms are still intended to follow
the overall guidelines for Heapster custom metrics (keep the number of
different metric names relatively limited, etc).

The current custom metrics mechanism can only be used to collect custom metrics
which describe the producing pod. This proposal is designed to provide a
method of collecting metrics which describe multiple resources (other pods,
services, etc) across the cluster. Such producers are generally add-on cluster
infrastructure components deployed by the cluster admins. The push mechanisms
are not, in general, intended for use by arbitrary cluster users (although the
metrics transferred would probably apply to the users' pods).
It is up to
cluster admins to decide which applications are permitted to use the push
mechanism (see the "Authentication, Segregation, and Flow Control" section
below for more information on how cluster admins can control access).

For these reasons, the push metrics mechanism is not designed to replace the
existing pull mechanism for custom metrics, but instead to support additional
use cases not already supported.

## API

### Authentication, Segregation, and Flow Control

Producers will be authenticated similarly to the way Heapster currently
authenticates clients: producers will present a certificate signed by a CA, and
Heapster will be configurable to only allow certain names to push metrics, or
to allow any certificate signed by the CA.

Additionally, the name presented during authentication will be used as a prefix
for all metrics added. This will prevent two different metrics producers from
accidentally overwriting each other's custom metrics (otherwise, push metrics
will be stored and retrieved identically to pull-based custom metrics).

In order to prevent push metrics from overwhelming the Heapster instance, it
will be possible to limit the total number of custom metrics each producer is
allowed to add, and the frequency at which producers are allowed to push new
sets of custom metrics. By default, no limits will be enforced unless
explicitly set via command line arguments.

### Paths

To add metrics, metrics producers will `POST` new metrics to
`/api/v1/push/{format}/{subpath}` in the format specified by `{format}`. The
metrics are pushed in bulk; it is up to the format to support a way to name
specific metrics, namespaces, pods, and containers (for instance, the
Prometheus format uses metric names and labels for this purpose). The
`{subpath}` option enables different formats to have format-specific sub-paths.
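
As a rough illustration of the path scheme, the sketch below builds (but does
not send) a push request for a given format and sub-path. The helper name
`build_push_request`, the Heapster address, and the producer name are
hypothetical and used only for illustration; the `Content-Type` value is the
one used by the Prometheus text exposition format.

```python
from urllib.request import Request

# Content type of the Prometheus text exposition format.
PROMETHEUS_TEXT = "text/plain; version=0.0.4"


def build_push_request(base_url, fmt, subpath, body):
    """Build (but do not send) a POST to /api/v1/push/{format}/{subpath}."""
    url = f"{base_url.rstrip('/')}/api/v1/push/{fmt}/{subpath.lstrip('/')}"
    return Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": PROMETHEUS_TEXT},
        method="POST",
    )


# Hypothetical Heapster address and producer name, for illustration only.
request = build_push_request(
    "https://heapster.example:8082",
    "prometheus",
    "metrics/job/http_gatherer",
    'hits_total{namespace="webapp",service="frontend"} 5000\n',
)
```

A real producer would presumably send such a request over TLS while presenting
its client certificate, since the certificate name is what Heapster would use
to authenticate the producer and to prefix its metrics.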

### Metrics Format

The underlying design can support multiple format "backends". The initial
backend, detailed here, will be based on the Prometheus format, and will be
available at `/api/v1/push/prometheus/`,
`/api/v1/push/prometheus/metrics/job/{producer_name}`, or
`/api/v1/push/prometheus/metrics/jobs/{producer_name}`. If either of the latter
two paths is used, `{producer_name}` should match the name used when
authenticating with Heapster.

Both the Prometheus text and protobuf formats will be supported.

The metric name specified on each metric line will be used as the custom metric
name in Heapster (except prefixed as discussed in the "Authentication"
section). The following labels will be used to determine which object a metric
is associated with:

- For Kubernetes-related metrics, the `namespace` label will indicate
  namespace, the `pod` label will indicate pod, and the `container` label will
  indicate container. If only `namespace` is present, then the metric will be
  considered namespace-level. If `namespace` and `pod` are present, then the
  metric will be considered pod-level. If all three are present, the metric
  will be considered container-level.

- For Kubernetes-related metrics, the `service` label can be used in
  conjunction with the `namespace` label to indicate a service-level metric.
  Currently, this will be stored in Heapster as a namespace-level metric
  prefixed with the service name.

- For non-Kubernetes-related metrics, the `node` label will indicate node, and
  the `container` label will indicate free container. If only `node` is
  present, the metric will be considered node-level. Otherwise, the metric
  will be considered free-container-level.

If the `node`, `namespace`, `pod`, `container`, and `service` labels are
not present in one of the configurations listed above, the metric line is
invalid and the batch should be rejected.
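
The label rules above can be sketched as a small lookup table. This is only an
illustration of the rules as written, not Heapster code: the function name
`classify` and the level names it returns are hypothetical.

```python
# Labels that decide which object a metric line describes.
OBJECT_LABELS = ("node", "namespace", "pod", "container", "service")

# Each valid combination of object labels and the level it implies.
LEVELS = {
    frozenset({"namespace"}): "namespace",
    frozenset({"namespace", "pod"}): "pod",
    frozenset({"namespace", "pod", "container"}): "container",
    # Stored as a namespace-level metric prefixed with the service name.
    frozenset({"namespace", "service"}): "service",
    frozenset({"node"}): "node",
    frozenset({"node", "container"}): "free-container",
}


def classify(labels):
    """Return the level a metric line describes, or raise ValueError."""
    present = frozenset(name for name in OBJECT_LABELS if name in labels)
    if present not in LEVELS:
        raise ValueError(f"invalid label combination: {sorted(present)}")
    return LEVELS[present]


level = classify({"namespace": "webapp", "pod": "frontend-server-a-1"})
```

Any labels outside `OBJECT_LABELS` are ignored by this check, and a
`ValueError` on any single line would correspond to rejecting the whole batch.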

Any additional labels will be treated the same as labels on existing custom
metrics (currently multiple custom metrics with the same name, but different
labels, are ignored, but this seems like an oversight and should probably be
fixed).

Timestamps will be assigned by using the next Heapster metrics batch timestamp
after the time at which the metrics are received. If a timestamp is provided
as part of the metric line, this may be stored as a separate field for
posterity, but the "official" timestamp will be that of the assigned batch.

As with normal Prometheus metrics, the `TYPE` line should be used to provide
the type of the metric.

#### Example

Suppose a producer with the name "http_gatherer" sent the following metrics to
`/api/v1/push/prometheus`:

```
# This is a pod-level metric (it might be used for autoscaling)
# TYPE http_requests_per_minute gauge
http_requests_per_minute{namespace="webapp",pod="frontend-server-a-1"} 20
http_requests_per_minute{namespace="webapp",pod="frontend-server-a-2"} 5
http_requests_per_minute{namespace="webapp",pod="frontend-server-b-1"} 25

# This is a service-level metric, which will be stored as frontend_hits_total
# and restapi_hits_total (these might be used for auto-idling)
# TYPE hits_total counter
hits_total{namespace="webapp",service="frontend"} 5000
hits_total{namespace="webapp",service="restapi"} 6000
```

This would result in the metrics being available at:

```
/api/v1/model/namespaces/webapp/pods/frontend-server-a-1/metrics/custom/http_gatherer/http_requests_per_minute
/api/v1/model/namespaces/webapp/pods/frontend-server-a-2/metrics/custom/http_gatherer/http_requests_per_minute
/api/v1/model/namespaces/webapp/pods/frontend-server-b-1/metrics/custom/http_gatherer/http_requests_per_minute
/api/v1/model/namespaces/webapp/metrics/custom/http_gatherer/frontend/hits_total
/api/v1/model/namespaces/webapp/metrics/custom/http_gatherer/restapi/hits_total
```

## Discussed Alternatives

A number of alternatives came up during the discussion of this proposal. They
are discussed briefly below. Note that most of these alternatives do not deal
particularly well with a case where metrics need to come from a source that is
not running as a pod on the cluster. While it is expected that many of the
producers will be running as components on the cluster (e.g. as DaemonSets or
PetSets), it could still be advantageous to support metrics coming from
components that are not in the form of pods.

### Writing directly into sinks

This alternative would have producers write directly into sinks in the Heapster
storage schema, and then use a mechanism similar to the Oldtimer API to read
the metrics back.

This would require every producer to know how to talk to every sink, would make
configuring the sinks more complicated, and would most likely lead to software
only being able to talk to one of the sinks supported by Heapster.
Additionally, you lose the benefits of the Heapster model, and either have to
adapt the existing Heapster model to fall back to an Oldtimer-like approach, or
teach all cluster components to be able to read from both the Heapster model
and Oldtimer simultaneously.

### Reworking the existing cAdvisor-Kubelet-Heapster Pull Mechanism

This alternative would involve reworking the existing pull mechanism to allow
certain pods to produce metrics that describe resources other than
themselves, as opposed to the current situation, where all custom metrics
collected via the current pull mechanism are marked as describing the producer
pod.

This would require a mechanism for indicating to Heapster which pod names were
allowed to produce metrics that describe other resources, since admins would
generally want most pods producing metrics to continue to just have metrics
which describe only the producer pod. It would also conceptually blend
together pods producing metrics about themselves versus pods producing metrics
about others. Additionally, the current cAdvisor-based custom metrics
collection is not secured, so all metrics would be available to anyone with
knowledge of the appropriate port, but this may change in the future.

### Using a new daemon per node to produce metrics

This alternative would involve running a daemon on each node that aggregated
all the separate custom metrics producers' results together.

It was suggested that an approach similar to the Prometheus Node Exporter
Textfile Collector could be used, in which sources would write their metrics to
files in a directory, which would later be read by the collector when polled for
metrics. When the producers are containerized, you'd need to use a hostPath
volume, have the daemon look for specific emptyDir mounts in containers (and
use one directory per container), or something similar.

Alternatively, a new daemon could be run on each node that was responsible for
collecting metrics from producers who produce bulk metrics describing other
resources.

This would still require some sort of auth to limit which pods were allowed to
do so (while the scoping above prevents collision, cluster admins would most
likely still want to limit which pods are allowed to post metrics which appear
in another pod's list of custom metrics). Unlike the proposal above, admins
could not simply rely on a "whoever is allowed to authenticate" rule, since
cAdvisor does not check certificates like the normal Heapster auth mechanism.
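
As a rough illustration of the textfile-style collection suggested in this
section, the sketch below concatenates the metrics files found in one
directory, as a collector might do when polled. The function name
`collect_textfiles` and the file names are hypothetical; the `*.prom` suffix
follows the Node Exporter Textfile Collector convention and is an assumption
here, since the proposal does not specify one.

```python
import tempfile
from pathlib import Path


def collect_textfiles(directory):
    """Concatenate every *.prom file in `directory`, in file-name order."""
    chunks = []
    for path in sorted(Path(directory).glob("*.prom")):
        text = path.read_text(encoding="utf-8")
        # Make sure each file's content ends with a newline before joining.
        chunks.append(text if text.endswith("\n") else text + "\n")
    return "".join(chunks)


# Two producers write their metrics; an unrelated file is ignored.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.prom").write_text("hits_total 1\n", encoding="utf-8")
    Path(tmp, "b.prom").write_text("errors_total 2", encoding="utf-8")
    Path(tmp, "notes.txt").write_text("ignored", encoding="utf-8")
    exposition = collect_textfiles(tmp)
```

In a containerized setup, `directory` would be whatever shared location
(hostPath volume or per-container emptyDir mount) the daemon can read.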

Additionally, this adds a bit of complexity on the producer's side, since it
requires continuously serving the metrics (this could be made easier by
providing a tool like the Prometheus Node Exporter Textfile Collector, which
just serves up metrics based on text files in a directory).

### Adding an additional standard pull mechanism

This alternative would involve writing a pull mechanism which, for instance, was
just able to read Prometheus metrics directly. This would either require
admins to configure the Heapster instance to know about every custom metrics
source (and restart Heapster when a new source needed to be added), or would
require teaching Heapster how to look for an annotation on certain pods to
determine which pods to query (Heapster would have a list-watch on the pods,
and look for pods added/removed/changed with the appropriate annotation).

When used in the latter form, this mechanism would still likely require a
similar auth setup to the one proposed above, in order to allow the admin to
restrict which pods actually were allowed to produce the metrics. It also has
restrictions and disadvantages similar to those of the "Using a new daemon per
node" method discussed above.
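
The annotation-based variant could look roughly like the following sketch,
which picks out the pods to query from a list of pod objects represented as
plain dictionaries. The annotation key and the function name `pods_to_query`
are hypothetical, since the proposal does not name an annotation, and a real
implementation would react to list-watch events rather than scan a static
list.

```python
# Hypothetical annotation key; the proposal does not name one.
SCRAPE_ANNOTATION = "heapster.example/custom-metrics"


def pods_to_query(pods):
    """Return (namespace, name) for each pod annotated with the value "true"."""
    selected = []
    for pod in pods:
        meta = pod.get("metadata", {})
        annotations = meta.get("annotations") or {}
        if annotations.get(SCRAPE_ANNOTATION) == "true":
            selected.append((meta.get("namespace"), meta.get("name")))
    return selected


pods = [
    {"metadata": {"namespace": "webapp", "name": "exporter-1",
                  "annotations": {SCRAPE_ANNOTATION: "true"}}},
    {"metadata": {"namespace": "webapp", "name": "frontend-server-a-1"}},
    {"metadata": {"namespace": "webapp", "name": "exporter-2",
                  "annotations": {SCRAPE_ANNOTATION: "false"}}},
]
targets = pods_to_query(pods)
```

Because any pod owner could set such an annotation, a check like this would
still need to be combined with the auth restrictions described above.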