# Resource Recommendation

Kubernetes users often set request and limit from experience when they create application resources. Resource recommendation analyzes an application's actual usage and recommends a more suitable resource configuration. You can review and adopt it to improve the resource utilization of your cluster.

## Features

Resource recommendation is a lightweight and more flexible implementation of VPA.

1. Algorithm: the model uses VPA's Moving Window algorithm and lets you customize its key settings, which gives more flexibility.
2. Batch analysis: with the ResourceSelector of `Analytics`, you can analyze many workloads at once instead of creating VPA objects one by one.
3. More lightweight: VPA's Auto mode recreates containers when it updates their resource configuration, which makes it hard to use in production. Resource recommendation only provides suggestions and leaves the decision to apply them to the user.

## Create a resource analysis

The following example uses the deployment `nginx` and an `Analytics` object to show how to start a resource recommendation:

=== "Main"

    ```bash
    kubectl apply -f https://raw.githubusercontent.com/gocrane/crane/main/examples/analytics/nginx-deployment.yaml
    kubectl apply -f https://raw.githubusercontent.com/gocrane/crane/main/examples/analytics/analytics-resource.yaml
    kubectl get analytics
    ```

=== "Mirror"

    ```bash
    kubectl apply -f https://gitee.com/finops/crane/raw/main/examples/analytics/nginx-deployment.yaml
    kubectl apply -f https://gitee.com/finops/crane/raw/main/examples/analytics/analytics-resource.yaml
    kubectl get analytics
    ```

```yaml title="analytics-resource.yaml"
apiVersion: analysis.crane.io/v1alpha1
kind: Analytics
metadata:
  name: nginx-resource
spec:
  type: Resource                        # This can only be "Resource" or "Replicas".
  completionStrategy:
    completionStrategyType: Periodical  # This can only be "Once" or "Periodical".
    periodSeconds: 86400                # analyze the selected resources once a day
  resourceSelectors:                    # defines the resources to be selected
    - kind: Deployment
      apiVersion: apps/v1
      name: nginx-deployment
```

The output looks like this:

```bash
NAME             AGE
nginx-resource   16m
```

View the details of the Analytics object:

```bash
kubectl get analytics nginx-resource -o yaml
```

The output looks like this:

```yaml
apiVersion: analysis.crane.io/v1alpha1
kind: Analytics
metadata:
  name: nginx-resource
  namespace: default
spec:
  completionStrategy:
    completionStrategyType: Periodical
    periodSeconds: 86400
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      labelSelector: {}
      name: nginx-deployment
  type: Resource
status:
  conditions:
    - lastTransitionTime: "2022-05-15T14:38:35Z"
      message: Analytics is ready
      reason: AnalyticsReady
      status: "True"
      type: Ready
  lastUpdateTime: "2022-05-15T14:38:35Z"
  recommendations:
    - lastStartTime: "2022-05-15T14:38:35Z"
      message: Success
      name: nginx-resource-resource-w45nq
      namespace: default
      targetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: nginx-deployment
        namespace: default
      uid: 750cb3bd-0b87-4f87-acbe-57e621af0a1e
```

## View the analysis result

View the resulting **Recommendation**:

```bash
kubectl get recommend -l analysis.crane.io/analytics-name=nginx-resource -o yaml
```

The analysis result looks like this:

```yaml
apiVersion: v1
items:
  - apiVersion: analysis.crane.io/v1alpha1
    kind: Recommendation
    metadata:
      creationTimestamp: "2022-06-15T15:26:25Z"
      generateName: nginx-resource-resource-
      generation: 1
      labels:
        analysis.crane.io/analytics-name: nginx-resource
        analysis.crane.io/analytics-type: Resource
        analysis.crane.io/analytics-uid: 9e78964b-f8ae-40de-9740-f9a715d16280
        app: nginx
      name: nginx-resource-resource-t4xpn
      namespace: default
      ownerReferences:
        - apiVersion: analysis.crane.io/v1alpha1
          blockOwnerDeletion: false
          controller: false
          kind: Analytics
          name: nginx-resource
          uid: 9e78964b-f8ae-40de-9740-f9a715d16280
      resourceVersion: "2117439429"
      selfLink: /apis/analysis.crane.io/v1alpha1/namespaces/default/recommendations/nginx-resource-resource-t4xpn
      uid: 8005e3e0-8fe9-470b-99cf-5ce9dd407529
    spec:
      adoptionType: StatusAndAnnotation
      completionStrategy:
        completionStrategyType: Once
      targetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: nginx-deployment
        namespace: default
      type: Resource
    status:
      recommendedValue: |
        resourceRequest:
          containers:
          - containerName: nginx
            target:
              cpu: 100m
              memory: 100Mi
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
```

## Batch recommendation

The following example shows how to use `Analytics` to generate recommendations for every Deployment and StatefulSet in the cluster:

```yaml
apiVersion: analysis.crane.io/v1alpha1
kind: Analytics
metadata:
  name: workload-resource
  namespace: crane-system               # An Analytics in crane-system selects resources across all namespaces.
spec:
  type: Resource                        # This can only be "Resource" or "Replicas".
  completionStrategy:
    completionStrategyType: Periodical  # This can only be "Once" or "Periodical".
    periodSeconds: 86400                # analyze the selected resources once a day
  resourceSelectors:                    # defines the resources to be selected
    - kind: Deployment
      apiVersion: apps/v1
    - kind: StatefulSet
      apiVersion: apps/v1
```

1. When the namespace is `crane-system`, `Analytics` selects resources from every namespace in the cluster. For any other namespace, `Analytics` selects only resources in its own namespace.
2. resourceSelectors is an array of the resources to analyze. kind and apiVersion are required; name is optional.
3. resourceSelectors accepts any resource that supports the [Scale Subresource](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#scale-subresource).

## Resource recommendation calculation model

### Filtering

Workloads without Pods: a workload that has no Pods cannot be analyzed by the algorithm.

### Recommendation

VPA's Moving Window algorithm computes CPU and memory separately for each container and produces a recommended value for each.

## FAQ

### How to make recommendations more accurate

The longer an application's history in the monitoring system (for example Prometheus), the more accurate the recommendation. For production, more than two weeks of data is recommended. Predictions for newly created applications are often inaccurate, so you can configure a parameter that restricts recommendations to workloads whose history exceeds a given number of days.
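To make the moving-window idea and the history-length guard more concrete, here is a minimal Python sketch. It is not Crane's or VPA's actual implementation: the function name `recommend`, its parameters (`window`, `min_history`, `percentile`, `margin`) and their default values are all hypothetical and chosen only for illustration. The sketch keeps the samples inside the window, declines to recommend when the retained history is too short, and otherwise returns a high percentile of usage plus a safety margin.

```python
import math
from datetime import datetime, timedelta
from typing import Optional, Sequence, Tuple


def recommend(
    samples: Sequence[Tuple[datetime, float]],
    now: datetime,
    window: timedelta = timedelta(days=14),      # hypothetical default
    min_history: timedelta = timedelta(days=7),  # hypothetical default
    percentile: float = 0.95,                    # hypothetical default
    margin: float = 0.15,                        # hypothetical default
) -> Optional[float]:
    """Return a recommended value for one container resource, or None.

    samples holds (timestamp, usage) pairs, e.g. CPU cores or memory bytes.
    """
    # Moving window: only samples from the last `window` are considered.
    in_window = [(ts, v) for ts, v in samples if now - window <= ts <= now]
    if not in_window:
        return None  # nothing to analyze, e.g. a workload without Pods

    # History guard: skip workloads whose data does not go back far enough.
    oldest = min(ts for ts, _ in in_window)
    if now - oldest < min_history:
        return None

    # Nearest-rank percentile over the retained usage values.
    values = sorted(v for _, v in in_window)
    rank = max(1, math.ceil(percentile * len(values)))
    return values[rank - 1] * (1 + margin)


if __name__ == "__main__":
    now = datetime(2022, 6, 15)
    # One sample per day for the last 20 days; usage is in CPU cores.
    cpu = [(now - timedelta(days=d), 0.05 + 0.01 * (d % 5)) for d in range(20)]
    print(recommend(cpu, now))
```

Returning `None` for short histories mirrors the advice above: a new workload receives no recommendation until enough data has accumulated, instead of a value based on a few samples.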