# Crane-scheduler

## Overview
Crane-scheduler is a collection of scheduler plugins built on the Kubernetes [scheduler framework](https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/), including:

- [Dynamic scheduler: a load-aware scheduler plugin](./dynamic-scheduler-plugin.md)

## Get Started

### Install Prometheus
Make sure your Kubernetes cluster has Prometheus installed. If not, please refer to [Install Prometheus](https://github.com/gocrane/fadvisor/blob/main/README.md#prerequests).

### Configure Prometheus Rules

Configure the following Prometheus recording rules to produce the aggregated metrics that Crane-scheduler expects:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-record
spec:
  groups:
  # Instantaneous CPU and memory usage per node, in percent.
  - name: cpu_mem_usage_active
    interval: 30s
    rules:
    - record: cpu_usage_active
      expr: 100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[30s])) * 100)
    - record: mem_usage_active
      expr: 100*(1-node_memory_MemAvailable_bytes/node_memory_MemTotal_bytes)
  # Peaks of the 5-minute CPU average over the last hour and day.
  - name: cpu-usage-5m
    interval: 5m
    rules:
    - record: cpu_usage_max_avg_1h
      expr: max_over_time(cpu_usage_avg_5m[1h])
    - record: cpu_usage_max_avg_1d
      expr: max_over_time(cpu_usage_avg_5m[1d])
  - name: cpu-usage-1m
    interval: 1m
    rules:
    - record: cpu_usage_avg_5m
      expr: avg_over_time(cpu_usage_active[5m])
  # Peaks of the 5-minute memory average over the last hour and day.
  - name: mem-usage-5m
    interval: 5m
    rules:
    - record: mem_usage_max_avg_1h
      expr: max_over_time(mem_usage_avg_5m[1h])
    - record: mem_usage_max_avg_1d
      expr: max_over_time(mem_usage_avg_5m[1d])
  - name: mem-usage-1m
    interval: 1m
    rules:
    - record: mem_usage_avg_5m
      expr: avg_over_time(mem_usage_active[5m])
```

!!! warning "Troubleshooting"
    The Prometheus scrape interval must be shorter than 30 seconds. Otherwise, rules that use a 30-second range (such as `cpu_usage_active`) may not take effect, because `irate` needs at least two samples within the range.

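Once the rules are applied, it can help to confirm that each recorded series returns data before installing the scheduler. The sketch below queries the Prometheus HTTP API (`/api/v1/query`) for the six aggregated metrics. The `PROM_ADDR` default and the helper name `check_recorded_metric` are assumptions for illustration and are not part of Crane.

```bash
# Assumption: Prometheus is reachable at this address; adjust it for your cluster.
PROM_ADDR="${PROM_ADDR:-http://localhost:9090}"

# Hypothetical helper: run an instant query for one recorded metric.
check_recorded_metric() {
  curl -sG "${PROM_ADDR}/api/v1/query" --data-urlencode "query=$1"
}

# Metric names recorded by the rules above.
RECORDED_METRICS="cpu_usage_avg_5m cpu_usage_max_avg_1h cpu_usage_max_avg_1d mem_usage_avg_5m mem_usage_max_avg_1h mem_usage_max_avg_1d"
```

To use it, run `for m in $RECORDED_METRICS; do check_recorded_metric "$m"; echo; done`. A response whose `result` array is empty suggests that the rule has not produced data yet; the groups with a 5-minute interval may need a few evaluation cycles.
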
### Install Crane-scheduler
There are two options:

- Install Crane-scheduler as a second scheduler
- Replace native Kube-scheduler with Crane-scheduler

#### Install Crane-scheduler as a second scheduler

=== "Main"

    ```bash
    helm repo add crane https://gocrane.github.io/helm-charts
    helm install scheduler -n crane-system --create-namespace --set global.prometheusAddr="REPLACE_ME_WITH_PROMETHEUS_ADDR" crane/scheduler
    ```

=== "Mirror"

    ```bash
    helm repo add crane https://finops-helm.pkg.coding.net/gocrane/gocrane
    helm install scheduler -n crane-system --create-namespace --set global.prometheusAddr="REPLACE_ME_WITH_PROMETHEUS_ADDR" crane/scheduler
    ```
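
After the install completes, it may be worth confirming that the release is deployed and its pods are running before scheduling workloads. This is a minimal sketch: the release name `scheduler` and the namespace `crane-system` come from the commands above, while the helper name `check_crane_scheduler` is hypothetical.

```bash
# Release name and namespace used by the helm install command above.
CRANE_RELEASE="scheduler"
CRANE_NS="crane-system"

# Hypothetical helper: show the release status, then the pods in the namespace.
check_crane_scheduler() {
  helm status "${CRANE_RELEASE}" -n "${CRANE_NS}" &&
    kubectl get pods -n "${CRANE_NS}" -o wide
}
```

Call `check_crane_scheduler` and check that the listed pods reach the `Running` state.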

#### Replace native Kube-scheduler with Crane-scheduler

1. Back up `/etc/kubernetes/manifests/kube-scheduler.yaml`:
    ```bash
    cp /etc/kubernetes/manifests/kube-scheduler.yaml /etc/kubernetes/
    ```
2. Modify the kube-scheduler configuration file (`scheduler-config.yaml`) to enable the Dynamic scheduler plugin and configure its arguments:
    ```yaml title="scheduler-config.yaml"
    apiVersion: kubescheduler.config.k8s.io/v1beta2
    kind: KubeSchedulerConfiguration
    ...
    profiles:
    - schedulerName: default-scheduler
      plugins:
        filter:
          enabled:
          - name: Dynamic
        score:
          enabled:
          - name: Dynamic
            weight: 3
      pluginConfig:
      - name: Dynamic
        args:
          policyConfigPath: /etc/kubernetes/policy.yaml
    ...
    ```
3. Create `/etc/kubernetes/policy.yaml` to serve as the scheduling policy of the Dynamic plugin:
    ```yaml title="/etc/kubernetes/policy.yaml"
    apiVersion: scheduler.policy.crane.io/v1alpha1
    kind: DynamicSchedulerPolicy
    spec:
      syncPolicy:
        ## cpu usage
        - name: cpu_usage_avg_5m
          period: 3m
        - name: cpu_usage_max_avg_1h
          period: 15m
        - name: cpu_usage_max_avg_1d
          period: 3h
        ## memory usage
        - name: mem_usage_avg_5m
          period: 3m
        - name: mem_usage_max_avg_1h
          period: 15m
        - name: mem_usage_max_avg_1d
          period: 3h

      predicate:
        ## cpu usage
        - name: cpu_usage_avg_5m
          maxLimitPecent: 0.65
        - name: cpu_usage_max_avg_1h
          maxLimitPecent: 0.75
        ## memory usage
        - name: mem_usage_avg_5m
          maxLimitPecent: 0.65
        - name: mem_usage_max_avg_1h
          maxLimitPecent: 0.75

      priority:
        ## cpu usage
        - name: cpu_usage_avg_5m
          weight: 0.2
        - name: cpu_usage_max_avg_1h
          weight: 0.3
        - name: cpu_usage_max_avg_1d
          weight: 0.5
        ## memory usage
        - name: mem_usage_avg_5m
          weight: 0.2
        - name: mem_usage_max_avg_1h
          weight: 0.3
        - name: mem_usage_max_avg_1d
          weight: 0.5

      hotValue:
        - timeRange: 5m
          count: 5
        - timeRange: 1m
          count: 2
    ```
4. Modify `kube-scheduler.yaml` and replace the kube-scheduler image with the Crane-scheduler image:
    ```yaml title="kube-scheduler.yaml"
    ...
     image: docker.io/gocrane/crane-scheduler:0.0.23
    ...
    ```
5. Install [crane-scheduler-controller](https://github.com/gocrane/crane-scheduler/tree/main/deploy/controller):

=== "Main"

    ```bash
    kubectl apply -f https://raw.githubusercontent.com/gocrane/crane-scheduler/main/deploy/controller/rbac.yaml
    kubectl apply -f https://raw.githubusercontent.com/gocrane/crane-scheduler/main/deploy/controller/deployment.yaml
    ```

=== "Mirror"

    ```bash
    kubectl apply -f https://gitee.com/finops/crane-scheduler/raw/main/deploy/controller/rbac.yaml
    kubectl apply -f https://gitee.com/finops/crane-scheduler/raw/main/deploy/controller/deployment.yaml
    ```

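To check that the controller is running and syncing metrics, one option is to look at its pod and at the node annotations. This sketch assumes that the controller publishes the synced metrics as node annotations keyed by metric name, as described in the Dynamic scheduler plugin documentation. Both helper names and the `grep` patterns are illustrative, and because the namespace depends on the applied manifests, all namespaces are searched.

```bash
# Hypothetical helper: find the controller pod in any namespace.
find_scheduler_controller() {
  kubectl get pods --all-namespaces | grep crane-scheduler-controller
}

# Hypothetical helper: print the load-related annotations of one node.
# Assumption: annotation keys match the metric names used in policy.yaml.
show_node_load_annotations() {
  kubectl describe node "$1" | grep -E '(cpu|mem)_usage_'
}
```

For example, `show_node_load_annotations <node-name>` prints only the matching annotation lines. If it prints nothing after a few sync periods, the Prometheus address and the recording rules are the first things to recheck.
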
### Schedule Pods With Crane-scheduler
Test Crane-scheduler with the following example:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-stress
spec:
  selector:
    matchLabels:
      app: cpu-stress
  replicas: 1
  template:
    metadata:
      labels:
        app: cpu-stress
    spec:
      schedulerName: crane-scheduler
      hostNetwork: true
      tolerations:
      - key: node.kubernetes.io/network-unavailable
        operator: Exists
        effect: NoSchedule
      containers:
      - name: stress
        image: docker.io/gocrane/stress:latest
        command: ["stress", "-c", "1"]
        resources:
          requests:
            memory: "1Gi"
            cpu: "1"
          limits:
            memory: "1Gi"
            cpu: "1"
```

!!! Note
    Change `crane-scheduler` to `default-scheduler` if Crane-scheduler is used as the default scheduler.

The following event appears if the test pod is scheduled successfully:
```bash
Type    Reason     Age   From             Message
----    ------     ----  ----             -------
Normal  Scheduled  28s   crane-scheduler  Successfully assigned default/cpu-stress-7669499b57-zmrgb to vm-162-247-ubuntu
```
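
The event above can be retrieved with standard `kubectl` commands. The sketch below selects the test pods by the `app=cpu-stress` label from the Deployment; the helper names are hypothetical.

```bash
# Label taken from the Deployment above.
APP_LABEL="app=cpu-stress"

# Show which node each test pod was placed on.
show_test_pod_placement() {
  kubectl get pods -l "${APP_LABEL}" -o wide
}

# Show the details and events of the test pods, including the Scheduled event.
show_test_pod_events() {
  kubectl describe pods -l "${APP_LABEL}"
}
```

Assuming the manifest is saved as `cpu-stress.yaml` (a hypothetical file name), run `kubectl apply -f cpu-stress.yaml` first, then call `show_test_pod_placement` and `show_test_pod_events`. The `From` column of the `Scheduled` event shows which scheduler placed the pod.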