github.com/gocrane/crane@v0.11.0/docs/tutorials/using-time-series-prediction.md

# TimeSeriesPrediction

Knowing the future makes things easier for us.

---

Many businesses are naturally cyclical over time, especially those that directly or indirectly serve people. This periodicity follows the regularity of people's daily activities. For example, people tend to order take-out at noon and in the evening; traffic always peaks in the morning and evening; and even for services without such obvious patterns, such as search, the number of requests at night is much lower than during business hours. For applications tied to this kind of business, it is natural to infer the next day's metrics from the historical data of the past few days, or to infer the coming Monday's traffic from last Monday's data. With predicted metrics or traffic patterns for the next 24 hours, we can better manage application instances, stabilize the system, and reduce cost at the same time.

TimeSeriesPrediction is used to forecast the metrics of a Kubernetes object. It relies on PredictionCore to do the forecasting.


# Features
A sample TimeSeriesPrediction YAML looks like this:
```yaml
apiVersion: prediction.crane.io/v1alpha1
kind: TimeSeriesPrediction
metadata:
  name: node-resource-percentile
  namespace: default
spec:
  targetRef:
    kind: Node
    name: 192.168.56.166
  predictionWindowSeconds: 600
  predictionMetrics:
    - resourceIdentifier: node-cpu
      type: ResourceQuery
      resourceQuery: cpu
      algorithm:
        algorithmType: "percentile"
        percentile:
          sampleInterval: "1m"
          minSampleWeight: "1.0"
          histogram:
            maxValue: "10000.0"
            epsilon: "1e-10"
            halfLife: "12h"
            bucketSize: "10"
            firstBucketSize: "40"
            bucketSizeGrowthRatio: "1.5"
    - resourceIdentifier: node-mem
      type: ResourceQuery
      resourceQuery: memory
      algorithm:
        algorithmType: "percentile"
        percentile:
          sampleInterval: "1m"
          minSampleWeight: "1.0"
          histogram:
            maxValue: "1000000.0"
            epsilon: "1e-10"
            halfLife: "12h"
            bucketSize: "10"
            firstBucketSize: "40"
            bucketSizeGrowthRatio: "1.5"
```

* `spec.targetRef` references the Kubernetes object to predict for, which can be a Node or a workload such as a Deployment.
* `spec.predictionMetrics` defines the metrics to predict for the object named in `spec.targetRef`.
* `spec.predictionWindowSeconds` is the duration of the predicted time series. The TimeSeriesPredictionController rotates the predicted data in the object's `status` so that consumers can read the predicted time series.

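
When several nodes need the same kind of prediction, the manifest can be generated instead of written by hand. The sketch below is one possible way to do that: `build_tsp_manifest` is a hypothetical helper (not part of crane), and it uses only the fields shown in the sample above. The output is JSON, which `kubectl apply -f` accepts as well as YAML.

```python
import json


def build_tsp_manifest(name, node_name, window_seconds, namespace="default"):
    """Hypothetical helper: build a TimeSeriesPrediction manifest as a dict.

    Only fields that appear in the sample manifest are used.
    """

    def percentile_metric(identifier, resource, max_value):
        # One predictionMetrics entry, with the histogram values from the sample.
        return {
            "resourceIdentifier": identifier,
            "type": "ResourceQuery",
            "resourceQuery": resource,
            "algorithm": {
                "algorithmType": "percentile",
                "percentile": {
                    "sampleInterval": "1m",
                    "minSampleWeight": "1.0",
                    "histogram": {
                        "maxValue": max_value,
                        "epsilon": "1e-10",
                        "halfLife": "12h",
                        "bucketSize": "10",
                        "firstBucketSize": "40",
                        "bucketSizeGrowthRatio": "1.5",
                    },
                },
            },
        }

    return {
        "apiVersion": "prediction.crane.io/v1alpha1",
        "kind": "TimeSeriesPrediction",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "targetRef": {"kind": "Node", "name": node_name},
            "predictionWindowSeconds": window_seconds,
            "predictionMetrics": [
                percentile_metric("node-cpu", "cpu", "10000.0"),
                percentile_metric("node-mem", "memory", "1000000.0"),
            ],
        },
    }


if __name__ == "__main__":
    manifest = build_tsp_manifest("node-resource-percentile", "192.168.56.166", 600)
    print(json.dumps(manifest, indent=2))
```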
## PredictionMetrics
```yaml
apiVersion: prediction.crane.io/v1alpha1
kind: TimeSeriesPrediction
metadata:
  name: node-resource-percentile
  namespace: default
spec:
  predictionMetrics:
    - resourceIdentifier: node-cpu
      type: ResourceQuery
      resourceQuery: cpu
      algorithm:
        algorithmType: "percentile"
        percentile:
          sampleInterval: "1m"
          minSampleWeight: "1.0"
          histogram:
            maxValue: "10000.0"
            epsilon: "1e-10"
            halfLife: "12h"
            bucketSize: "10"
            firstBucketSize: "40"
            bucketSizeGrowthRatio: "1.5"
```

### MetricType

There are three types of metric query:

 - `ResourceQuery` queries a Kubernetes built-in resource metric such as cpu or memory. Crane currently supports only cpu and memory.
 - `RawQuery` is a query written in a DSL such as the Prometheus query language. Only Prometheus is supported for now.
 - `ExpressionQuery` is a query by expression selector.

Prometheus is currently the only supported data source. `MetricType` is designed to be orthogonal to the data source, although some data sources may not support every metric type.

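
The constraints above can be checked on the client side before a manifest is applied. The following is a minimal sketch, not crane's own validation: `check_prediction_metric` is a hypothetical helper that only encodes what this section states, namely the three type names and the cpu/memory limit for `ResourceQuery`.

```python
# The three metric query types named in this section.
METRIC_TYPES = {"ResourceQuery", "RawQuery", "ExpressionQuery"}
# Resources that ResourceQuery is documented to support at the moment.
SUPPORTED_RESOURCES = {"cpu", "memory"}


def check_prediction_metric(metric):
    """Return a list of problems found in one predictionMetrics entry.

    An empty list means no problem was found by this (hypothetical) check.
    """
    problems = []
    metric_type = metric.get("type")
    if metric_type not in METRIC_TYPES:
        problems.append(f"unknown type: {metric_type!r}")
    elif metric_type == "ResourceQuery":
        resource = metric.get("resourceQuery")
        if resource not in SUPPORTED_RESOURCES:
            problems.append(f"unsupported resourceQuery: {resource!r}")
    return problems


if __name__ == "__main__":
    entry = {"resourceIdentifier": "node-cpu", "type": "ResourceQuery", "resourceQuery": "cpu"}
    print(check_prediction_metric(entry))
```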
### Algorithm
`Algorithm` defines the algorithm type and the parameters used to predict the metric. There are currently two algorithms:

 - `dsp` forecasts a time series. It is based on the FFT (Fast Fourier Transform) and is good at predicting time series with seasonality and periodicity.
 - `percentile` estimates a time series and finds a recommended value that represents its past. It is based on histogram statistics with exponentially decaying weights. It is meant for estimation rather than forecasting: it can output a predicted time series, but every point has the same value. If you want to forecast a time series, `dsp` is the better choice.

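
To make the idea behind `dsp` more concrete, here is a rough sketch of frequency-based forecasting. It is not crane's implementation: it uses a plain O(n²) discrete Fourier transform from the standard library instead of an FFT, and the forecasting step (repeating the last detected cycle) is a simplifying assumption. The names `dominant_period` and `forecast` are hypothetical.

```python
import cmath
import math


def dominant_period(series):
    """Return the period, in samples, of the strongest non-constant frequency."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # drop the constant (k = 0) component
    best_k, best_mag = 1, -1.0
    for k in range(1, n // 2 + 1):
        # k-th DFT coefficient of the centered series
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(centered)
        )
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return round(n / best_k)


def forecast(series, steps):
    """Forecast `steps` future samples by repeating the last detected cycle."""
    period = dominant_period(series)
    last_cycle = series[-period:]
    return [last_cycle[i % period] for i in range(steps)]


if __name__ == "__main__":
    # Four "days" of hourly samples with a 24-sample cycle.
    history = [10 + 5 * math.sin(2 * math.pi * t / 24) for t in range(96)]
    print(dominant_period(history))
    print(forecast(history, 3))
```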
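
The `percentile` idea can be sketched in a similar way. The class below is a simplified illustration of a histogram with exponentially decaying weights, not crane's code: it assumes fixed-width buckets and takes the half-life in the same unit as the timestamps, and the names `DecayingHistogram`, `add_sample`, `percentile` and `predict` are hypothetical. It also shows why the predicted series is flat: every predicted point is the same percentile value.

```python
import math


class DecayingHistogram:
    """Fixed-width histogram in which newer samples weigh exponentially more."""

    def __init__(self, max_value, bucket_size, half_life):
        self.bucket_size = bucket_size
        self.half_life = half_life
        self.num_buckets = int(math.ceil(max_value / bucket_size))
        self.weights = [0.0] * self.num_buckets
        self.total = 0.0

    def add_sample(self, value, timestamp, weight=1.0):
        # A sample one half-life newer counts twice as much. Timestamps are
        # relative to t = 0; a long-running implementation would need to shift
        # the reference time to keep the weights within floating-point range.
        w = weight * 2.0 ** (timestamp / self.half_life)
        index = int(value // self.bucket_size)
        index = max(0, min(index, self.num_buckets - 1))
        self.weights[index] += w
        self.total += w

    def percentile(self, p):
        """Return the upper bound of the bucket that holds the p-quantile."""
        if self.total == 0.0:
            return 0.0
        threshold = p * self.total
        cumulative = 0.0
        for index, w in enumerate(self.weights):
            cumulative += w
            if cumulative >= threshold:
                return (index + 1) * self.bucket_size
        return self.num_buckets * self.bucket_size

    def predict(self, p, steps):
        """The 'predicted series' is the same estimate repeated `steps` times."""
        return [self.percentile(p)] * steps


if __name__ == "__main__":
    hist = DecayingHistogram(max_value=100, bucket_size=10, half_life=3600)
    for _ in range(10):
        hist.add_sample(5, timestamp=0)        # old, low usage
    for _ in range(10):
        hist.add_sample(55, timestamp=36000)   # ten half-lives later, higher usage
    print(hist.percentile(0.5))
    print(hist.predict(0.5, 3))
```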
#### dsp params

#### percentile params