---
title: Advanced CPUSet Manager
authors:
  - "@szy441687879"
reviewers:
  - "@yan234280533"
  - "@mfanjie"
creation-date: 2022-02-28
last-updated: 2022-03-16
status: provisional
---

# Advanced CPUSet Manager
- Kubelet supports a static CPU manager: when a Guaranteed Pod runs on a node, kubelet allocates specific CPU cores exclusively to its processes, which generally keeps the CPU utilization of the node low.
  This proposal provides a new mechanism for managing cpusets that allows bound CPU cores to be shared with other processes. It also allows a pod's cpuset to be revised while the pod is running and relaxes kubelet's restrictions on binding CPUs.

## Table of Contents

<!-- TOC -->

- [Advanced CPUSet Manager](#advanced-cpuset-manager)
  - [Table of Contents](#table-of-contents)
  - [Motivation](#motivation)
    - [Goals](#goals)
    - [Non-Goals/Future Work](#non-goalsfuture-work)
  - [Proposal](#proposal)
    - [Relax restrictions of cpuset allocation](#relax-restrictions-of-cpuset-allocation)
    - [Add new annotation to describe the requirement of cpuset control manager](#add-new-annotation-to-describe-the-requirement-of-cpuset-control-manager)
    - [Advanced CPU Manager component](#advanced-cpu-manager-component)
    - [User Stories](#user-stories)
      - [Story 1](#story-1)
      - [Story 2](#story-2)
    - [Risks and Mitigations](#risks-and-mitigations)

<!-- /TOC -->
## Motivation
Some latency-sensitive applications achieve lower latency and CPU usage when they run on specific cores, because pinning results in fewer context switches and higher cache affinity.
However, kubelet always excludes the assigned cores from the shared pool, which may waste resources: offline pods and other online pods could in fact run on those cores. In our experiments, the impact on service performance was barely noticeable in most cases.

### Goals

- Provide a new mechanism to manage cpusets that bypasses kubelet
- Provide a new cpuset manager method, "shared"
- Allow the cpuset to be revised while the pod is running
- Relax the restrictions on binding CPUs


### Non-Goals/Future Work

- Resolving conflicts with the kubelet static CPU manager; you need to set the kubelet CPU manager policy to "none"
- NUMA-aware management will be supported in the future; CCX/CCD-aware management is also being considered

## Proposal
### Relax restrictions of cpuset allocation
For kubelet to allocate CPUs to a container, the following conditions must be met:

1. Requests and limits are specified for all the containers, and they are equal.

2. The container's CPU limit is an integer greater than or equal to one and is equal to its CPU request.

In Crane, only condition No.2 needs to be met.
### Add new annotation to describe the requirement of cpuset control manager
```yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    qos.gocrane.io/cpu-manager: none/exclusive/share
```
Three policies are provided for the cpuset manager:
- none: containers of this pod share the set of CPUs that are not allocated to exclusive containers.
- exclusive: containers of this pod monopolize the allocated CPUs; other containers are not allowed to use them.
- share: containers of this pod run on the allocated CPUs, but other containers can also use them.

### Advanced CPU Manager component

- Crane-agent uses the podLister informer to detect pod creation.
- Crane-agent allocates CPUs once the pod is bound to a node, then loops calling addContainer (which changes the cpuset) until the containers are created.
- Pod updates and deletions are handled during state reconciliation.
- state.State is referenced from kubelet, and topology_cpu_assignment is copied from kubelet.


### User Stories

- Users can update the pod annotation to control the cpuset policy flexibly.

#### Story 1
Switch a pod from none to share without recreating the pod.
#### Story 2
Switch a pod from exclusive to share, so that offline processes can use these CPUs.

### Risks and Mitigations

- The kubelet CPU manager policy needs to be set to none; otherwise it will conflict with crane-agent.
- If crane-agent cannot allocate CPUs for a pod, it will not refuse to start the pod as kubelet does.
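
To make the Proposal section more concrete, the following Go sketch shows one way an agent might read the `qos.gocrane.io/cpu-manager` annotation and apply condition No.2. It is not taken from the Crane code base: `CPUPolicy`, `policyFromAnnotations`, `milliCPU`, and `eligibleForCPUSet` are hypothetical names, CPU quantities are simplified to strings such as `"2"` or `"500m"` instead of Kubernetes `resource.Quantity` values, and treating a missing annotation as `none` is an assumption rather than something this proposal specifies.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// CPUPolicy is a hypothetical type for the three policies named in the proposal.
type CPUPolicy string

const (
	PolicyNone      CPUPolicy = "none"
	PolicyExclusive CPUPolicy = "exclusive"
	PolicyShare     CPUPolicy = "share"

	// Annotation key taken from the proposal.
	cpuManagerAnnotation = "qos.gocrane.io/cpu-manager"
)

// policyFromAnnotations reads the cpuset policy from pod annotations.
// Assumption: a pod without the annotation is treated as "none".
func policyFromAnnotations(annotations map[string]string) (CPUPolicy, error) {
	value, ok := annotations[cpuManagerAnnotation]
	if !ok {
		return PolicyNone, nil
	}
	switch policy := CPUPolicy(strings.TrimSpace(value)); policy {
	case PolicyNone, PolicyExclusive, PolicyShare:
		return policy, nil
	default:
		return PolicyNone, fmt.Errorf("unknown cpu-manager policy %q", value)
	}
}

// milliCPU converts a simplified CPU quantity ("2", "1.5", "500m") to millicores.
// It is a stand-in for real quantity parsing and handles only these forms.
func milliCPU(quantity string) (int64, error) {
	if strings.HasSuffix(quantity, "m") {
		return strconv.ParseInt(strings.TrimSuffix(quantity, "m"), 10, 64)
	}
	cores, err := strconv.ParseFloat(quantity, 64)
	if err != nil {
		return 0, err
	}
	return int64(math.Round(cores * 1000)), nil
}

// eligibleForCPUSet applies condition No.2: the CPU limit is a whole number
// of cores, at least one, and equal to the CPU request.
func eligibleForCPUSet(request, limit string) bool {
	req, err := milliCPU(request)
	if err != nil {
		return false
	}
	lim, err := milliCPU(limit)
	if err != nil {
		return false
	}
	return lim >= 1000 && lim%1000 == 0 && lim == req
}

func main() {
	annotations := map[string]string{cpuManagerAnnotation: "share"}
	policy, err := policyFromAnnotations(annotations)
	if err != nil {
		fmt.Println("invalid annotation:", err)
		return
	}
	fmt.Println("policy:", policy)
	fmt.Println("2 cores eligible:", eligibleForCPUSet("2", "2"))
	fmt.Println("500m eligible:", eligibleForCPUSet("500m", "500m"))
}
```

In this sketch a container with a fractional CPU limit is simply reported as ineligible rather than rejected, which matches the mitigation above that the agent does not refuse to start a pod when it cannot allocate CPUs.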