# TimeSeriesPrediction

Knowing the future makes things easier for us.

---

Many businesses are naturally cyclical over time, especially those that directly or indirectly serve "people". This periodicity follows the regularity of people’s daily activities. For example, people tend to order take-out at noon and in the evening; traffic peaks every morning and evening; even services without such obvious patterns, such as search, receive far fewer requests at night than during business hours. For applications tied to this kind of business, it is natural to infer the next day's metrics from the historical data of the past few days, or to infer the coming Monday's traffic from last Monday's data. With predicted metrics or traffic patterns for the next 24 hours, we can better manage application instances, stabilize the system, and reduce cost.

TimeSeriesPrediction forecasts metrics for a Kubernetes object. It relies on PredictionCore to do the forecasting.

# Features
A sample TimeSeriesPrediction YAML looks like this:
```yaml
apiVersion: prediction.crane.io/v1alpha1
kind: TimeSeriesPrediction
metadata:
  name: node-resource-percentile
  namespace: default
spec:
  targetRef:
    kind: Node
    name: 192.168.56.166
  predictionWindowSeconds: 600
  predictionMetrics:
    - resourceIdentifier: node-cpu
      type: ResourceQuery
      resourceQuery: cpu
      algorithm:
        algorithmType: "percentile"
        percentile:
          sampleInterval: "1m"
          minSampleWeight: "1.0"
          histogram:
            maxValue: "10000.0"
            epsilon: "1e-10"
            halfLife: "12h"
            bucketSize: "10"
            firstBucketSize: "40"
            bucketSizeGrowthRatio: "1.5"
    - resourceIdentifier: node-mem
      type: ResourceQuery
      resourceQuery: memory
      algorithm:
        algorithmType: "percentile"
        percentile:
          sampleInterval: "1m"
          minSampleWeight: "1.0"
          histogram:
            maxValue: "1000000.0"
            epsilon: "1e-10"
            halfLife: "12h"
            bucketSize: "10"
            firstBucketSize: "40"
            bucketSizeGrowthRatio: "1.5"
```

* `spec.targetRef` references the Kubernetes object to predict for, which can be a Node or a workload such as a Deployment.
* `spec.predictionMetrics` defines the metrics to predict for the object in `spec.targetRef`.
* `spec.predictionWindowSeconds` is the length of the predicted time series, in seconds. The TimeSeriesPredictionController rotates the predicted data in the object's `status` so that consumers can read the predicted time series.

## PredictionMetrics
```yaml
apiVersion: prediction.crane.io/v1alpha1
kind: TimeSeriesPrediction
metadata:
  name: node-resource-percentile
  namespace: default
spec:
  predictionMetrics:
    - resourceIdentifier: node-cpu
      type: ResourceQuery
      resourceQuery: cpu
      algorithm:
        algorithmType: "percentile"
        percentile:
          sampleInterval: "1m"
          minSampleWeight: "1.0"
          histogram:
            maxValue: "10000.0"
            epsilon: "1e-10"
            halfLife: "12h"
            bucketSize: "10"
            firstBucketSize: "40"
            bucketSizeGrowthRatio: "1.5"
```

### MetricType

There are three types of metric query:

- `ResourceQuery` queries a Kubernetes built-in resource metric such as cpu or memory. Crane currently supports only cpu and memory.
- `RawQuery` is a query written in a DSL, such as the Prometheus query language. Currently only Prometheus is supported.
- `ExpressionQuery` is a query by expression selector.

Currently only Prometheus is supported as a data source. `MetricType` is designed to be orthogonal to the data source, but some data sources may not support every metric type.

### Algorithm
`Algorithm` defines the algorithm type and parameters used to predict the metric. There are currently two algorithms:

- `dsp` forecasts a time series. It is based on the FFT (Fast Fourier Transform) and works well for time series with seasonality and periodicity.
- `percentile` estimates a time series and finds a recommended value that represents its past. It is based on histogram statistics with exponentially decaying weights. It is not suited to forecasting: although it can output a predicted time series, every point in it has the same value. To predict a time series, `dsp` is the better choice.


#### dsp params

#### percentile params
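
To make the `percentile` histogram settings more concrete, here is a minimal Python sketch of a histogram with exponentially growing buckets and exponentially decaying sample weights. It illustrates the general technique only and is not Crane's implementation: the class `DecayingHistogram` and its method names are hypothetical, its arguments loosely mirror `maxValue`, `firstBucketSize`, `bucketSizeGrowthRatio`, `halfLife`, and `epsilon`, and it does not model `bucketSize` or `minSampleWeight`. Timestamps are assumed to be seconds since a fixed reference time, kept small so that the decay factor does not overflow.

```python
import bisect


class DecayingHistogram:
    """Histogram with exponentially growing buckets and time-decayed weights."""

    def __init__(self, max_value, first_bucket_size, growth_ratio,
                 half_life_s, epsilon=1e-10):
        self.half_life_s = half_life_s
        self.epsilon = epsilon
        # Lower bound of every bucket; each bucket is growth_ratio times
        # wider than the one before it.
        self.starts = [0.0]
        size = float(first_bucket_size)
        while self.starts[-1] + size <= max_value:
            self.starts.append(self.starts[-1] + size)
            size *= growth_ratio
        self.weights = [0.0] * len(self.starts)

    def add_sample(self, value, weight, t_s):
        """Record a sample taken t_s seconds after a fixed reference time."""
        idx = max(0, bisect.bisect_right(self.starts, value) - 1)
        # A sample one half-life newer than another counts twice as much.
        self.weights[idx] += weight * 2.0 ** (t_s / self.half_life_s)

    def percentile(self, p):
        """Return the upper bound of the bucket holding the p-quantile."""
        total = sum(self.weights)
        if total < self.epsilon:
            return 0.0  # effectively empty histogram
        acc = 0.0
        for i, w in enumerate(self.weights):
            acc += w
            if acc >= p * total:
                # The last bucket has no upper bound, so use its start.
                return self.starts[min(i + 1, len(self.starts) - 1)]
        return self.starts[-1]


h = DecayingHistogram(max_value=10000.0, first_bucket_size=40.0,
                      growth_ratio=1.5, half_life_s=12 * 3600)
for _ in range(10):
    h.add_sample(500.0, 1.0, t_s=0)          # old, high usage
    h.add_sample(50.0, 1.0, t_s=24 * 3600)   # two half-lives newer: weight x4
print(h.percentile(0.5))  # 100.0
print(h.percentile(0.9))  # 527.5
```

Because the recent low-usage samples carry four times the weight of the old ones, the median lands in the 40 to 100 bucket even though both values were observed equally often. The result is a single representative value, which is why a percentile-based prediction is flat over the whole prediction window.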