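# Resource Recommendation

Kubernetes users often set request and limit from experience when creating application resources. Resource recommendation analyzes an application's actual usage and recommends a better-fitting resource configuration, which you can review and adopt to improve the cluster's resource utilization.

## Features

Resource recommendation is a lightweight and more flexible implementation of VPA.

1. Algorithm: the model uses VPA's Moving Window algorithm and lets you customize its key parameters, which gives more flexibility.
2. Batch analysis: with the ResourceSelector of `Analytics`, users can analyze multiple workloads at once instead of creating VPA objects one by one.
3. More lightweight: VPA's Auto mode recreates containers when it updates their resource configuration, which makes it hard to use in production. Resource recommendation only provides suggestions and leaves the decision to apply them to the user.

## Create a Resource Analysis

The example below uses the `nginx` Deployment and an `Analytics` object to show how to start a resource recommendation:

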
=== "Main"

      ```bash
      kubectl apply -f https://raw.githubusercontent.com/gocrane/crane/main/examples/analytics/nginx-deployment.yaml
      kubectl apply -f https://raw.githubusercontent.com/gocrane/crane/main/examples/analytics/analytics-resource.yaml
      kubectl get analytics
      ```

=== "Mirror"

      ```bash
      kubectl apply -f https://gitee.com/finops/crane/raw/main/examples/analytics/nginx-deployment.yaml
      kubectl apply -f https://gitee.com/finops/crane/raw/main/examples/analytics/analytics-resource.yaml
      kubectl get analytics
      ```


```yaml title="analytics-resource.yaml"
apiVersion: analysis.crane.io/v1alpha1
kind: Analytics
metadata:
  name: nginx-resource
spec:
  type: Resource                        # This can only be "Resource" or "Replicas".
  completionStrategy:
    completionStrategyType: Periodical  # This can only be "Once" or "Periodical".
    periodSeconds: 86400                # analyze the selected resources once a day
  resourceSelectors:                    # defines the resources to be selected
    - kind: Deployment
      apiVersion: apps/v1
      name: nginx-deployment
```

The output looks like this:

```bash
NAME             AGE
nginx-resource   16m
```

View the details of the Analytics object:

```bash
kubectl get analytics nginx-resource -o yaml
```

The output looks like this:

```yaml
apiVersion: analysis.crane.io/v1alpha1
kind: Analytics
metadata:
  name: nginx-resource
  namespace: default
spec:
  completionStrategy:
    completionStrategyType: Periodical
    periodSeconds: 86400
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      labelSelector: {}
      name: nginx-deployment
  type: Resource
status:
  conditions:
    - lastTransitionTime: "2022-05-15T14:38:35Z"
      message: Analytics is ready
      reason: AnalyticsReady
      status: "True"
      type: Ready
  lastUpdateTime: "2022-05-15T14:38:35Z"
  recommendations:
    - lastStartTime: "2022-05-15T14:38:35Z"
      message: Success
      name: nginx-resource-resource-w45nq
      namespace: default
      targetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: nginx-deployment
        namespace: default
      uid: 750cb3bd-0b87-4f87-acbe-57e621af0a1e
```

## View the Analysis Result

View the resulting **Recommendation**:

```bash
kubectl get recommend -l analysis.crane.io/analytics-name=nginx-resource -o yaml
```

The analysis result looks like this:

```yaml
apiVersion: v1
items:
  - apiVersion: analysis.crane.io/v1alpha1
    kind: Recommendation
    metadata:
      creationTimestamp: "2022-06-15T15:26:25Z"
      generateName: nginx-resource-resource-
      generation: 1
      labels:
        analysis.crane.io/analytics-name: nginx-resource
        analysis.crane.io/analytics-type: Resource
        analysis.crane.io/analytics-uid: 9e78964b-f8ae-40de-9740-f9a715d16280
        app: nginx
      name: nginx-resource-resource-t4xpn
      namespace: default
      ownerReferences:
        - apiVersion: analysis.crane.io/v1alpha1
          blockOwnerDeletion: false
          controller: false
          kind: Analytics
          name: nginx-resource
          uid: 9e78964b-f8ae-40de-9740-f9a715d16280
      resourceVersion: "2117439429"
      selfLink: /apis/analysis.crane.io/v1alpha1/namespaces/default/recommendations/nginx-resource-resource-t4xpn
      uid: 8005e3e0-8fe9-470b-99cf-5ce9dd407529
    spec:
      adoptionType: StatusAndAnnotation
      completionStrategy:
        completionStrategyType: Once
      targetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: nginx-deployment
        namespace: default
      type: Resource
    status:
      recommendedValue: |
        resourceRequest:
          containers:
          - containerName: nginx
            target:
              cpu: 100m
              memory: 100Mi
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
```

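To act on a recommendation, the `target` values have to be copied into the workload's container requests. The sketch below is one hypothetical way to prepare that change and is not part of Crane: it assumes `recommendedValue` has already been parsed from YAML into a Python dict (for example with a YAML library), and `build_resource_patch` is an illustrative helper name. It only sets `resources.requests` and leaves limits untouched.

```python
import json


def build_resource_patch(recommended: dict) -> dict:
    """Turn a parsed `recommendedValue` into a strategic-merge patch body.

    Hypothetical helper: each recommended container becomes an entry that
    sets only `resources.requests`; containers are matched by name.
    """
    containers = []
    for container in recommended["resourceRequest"]["containers"]:
        containers.append({
            "name": container["containerName"],
            "resources": {"requests": dict(container["target"])},
        })
    return {"spec": {"template": {"spec": {"containers": containers}}}}


# Values copied from the sample Recommendation output, already parsed from YAML.
recommended = {
    "resourceRequest": {
        "containers": [
            {"containerName": "nginx", "target": {"cpu": "100m", "memory": "100Mi"}},
        ]
    }
}

patch = build_resource_patch(recommended)
print(json.dumps(patch))
```

The printed JSON could then be passed to `kubectl patch deployment nginx-deployment --patch '<json>'`. Review it first: changing the Pod template triggers a rollout of the Deployment.
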
## Batch Recommendation

The following example shows how to use `Analytics` to generate recommendations for all Deployments and StatefulSets in the cluster:

```yaml
apiVersion: analysis.crane.io/v1alpha1
kind: Analytics
metadata:
  name: workload-resource
  namespace: crane-system               # An Analytics in crane-system selects resources across all namespaces.
spec:
  type: Resource                        # This can only be "Resource" or "Replicas".
  completionStrategy:
    completionStrategyType: Periodical  # This can only be "Once" or "Periodical".
    periodSeconds: 86400                # analyze the selected resources once a day
  resourceSelectors:                    # defines the resources to be selected
    - kind: Deployment
      apiVersion: apps/v1
    - kind: StatefulSet
      apiVersion: apps/v1
```

1. When the namespace is `crane-system`, `Analytics` selects resources from every namespace in the cluster. Otherwise, `Analytics` only selects resources in its own namespace.
2. `resourceSelectors` is an array of the resources to analyze. `kind` and `apiVersion` are required; `name` is optional.
3. `resourceSelectors` accepts any resource that supports the [Scale Subresource](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#scale-subresource).

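The selection rules above can be sketched as a small predicate. This is an illustration of the described behavior, not Crane's code: `Workload`, `Selector`, and `is_selected` are hypothetical names, and the sketch ignores the `labelSelector` field that appears in the Analytics output.

```python
from dataclasses import dataclass
from typing import List, Optional

ROOT_NAMESPACE = "crane-system"  # namespace with cluster-wide scope, per rule 1


@dataclass
class Workload:
    api_version: str
    kind: str
    name: str
    namespace: str


@dataclass
class Selector:
    api_version: str             # required
    kind: str                    # required
    name: Optional[str] = None   # optional; None matches any name


def is_selected(analytics_namespace: str, selectors: List[Selector], workload: Workload) -> bool:
    """Return True if the workload falls under the Analytics' scope and selectors."""
    # Rule 1: crane-system sees every namespace, any other namespace only itself.
    in_scope = (
        analytics_namespace == ROOT_NAMESPACE
        or workload.namespace == analytics_namespace
    )
    if not in_scope:
        return False
    # Rule 2: kind and apiVersion must match; name is checked only when set.
    for selector in selectors:
        if (
            selector.api_version == workload.api_version
            and selector.kind == workload.kind
            and (selector.name is None or selector.name == workload.name)
        ):
            return True
    return False


selectors = [Selector("apps/v1", "Deployment"), Selector("apps/v1", "StatefulSet")]
nginx = Workload("apps/v1", "Deployment", "nginx-deployment", "default")
print(is_selected("crane-system", selectors, nginx))  # True
print(is_selected("other-team", selectors, nginx))    # False
```
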
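## Resource Recommendation Calculation Model

### Filtering

Workloads without Pods: if a workload has no Pods, it cannot be analyzed and is filtered out.

### Recommendation

VPA's Moving Window algorithm is applied to each container's CPU and memory separately to produce the corresponding recommended values.
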
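To give a feel for what a moving-window recommendation does, here is a deliberately simplified sketch. It does not reproduce VPA's or Crane's actual implementation: it keeps only the samples inside a recent window, takes a nearest-rank percentile, and adds a safety margin. The function names and the default window, percentile, and margin are all assumptions chosen for illustration.

```python
import math
from datetime import datetime, timedelta


def window_percentile(samples, now, window, percentile):
    """Nearest-rank percentile of the values sampled within [now - window, now].

    `samples` is a list of (timestamp, value) pairs. Returns None when the
    window contains no samples.
    """
    recent = sorted(value for ts, value in samples if now - window <= ts <= now)
    if not recent:
        return None
    rank = max(1, math.ceil(percentile * len(recent)))
    return recent[rank - 1]


def recommend(samples, now, window=timedelta(days=7), percentile=0.99, margin=0.15):
    """Percentile over the window plus a safety margin (all defaults are assumed)."""
    base = window_percentile(samples, now, window, percentile)
    if base is None:
        return None
    return base * (1 + margin)


now = datetime(2022, 5, 15, 14, 0, 0)
# Ten hourly CPU samples in millicores: 100, 200, ..., 1000.
cpu_samples = [(now - timedelta(hours=i), 100 * (i + 1)) for i in range(10)]
# A spike from 30 days ago falls outside the window and is ignored.
cpu_samples.append((now - timedelta(days=30), 5000))

print(window_percentile(cpu_samples, now, timedelta(days=7), 0.5))  # 500
```

Because old samples drop out as the window moves forward, a one-off spike stops influencing the recommendation once it ages past the window length.
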
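## FAQ

### How to make the recommendation more accurate

The longer an application's history in the monitoring system (for example, Prometheus), the more accurate the recommendation. In production, more than two weeks of history is recommended. Predictions for newly created applications are often inaccurate; a configuration parameter can restrict recommendations to workloads whose history exceeds a given number of days.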
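
The history-length guard can be pictured as a simple check before a recommendation is produced. The sketch below is an assumption about how such a check might look, not Crane's implementation or its parameter name: `has_enough_history` and `min_history` are hypothetical, and the 14-day default mirrors the two-week guidance above.

```python
from datetime import datetime, timedelta


def has_enough_history(sample_times, now, min_history=timedelta(days=14)):
    """Return True if the oldest sample is at least `min_history` old.

    `sample_times` is a list of datetimes at which metrics were recorded.
    An empty list means there is no history at all.
    """
    if not sample_times:
        return False
    return now - min(sample_times) >= min_history


now = datetime(2022, 5, 15, 14, 0, 0)
# A workload created three days ago, sampled once a day.
new_app = [now - timedelta(days=d) for d in range(3)]
# A workload with twenty days of daily samples.
old_app = [now - timedelta(days=d) for d in range(20)]

print(has_enough_history(new_app, now))  # False
print(has_enough_history(old_app, now))  # True
```
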