.. only:: not (epub or latex or html)

    WARNING: You are looking at unreleased Cilium documentation.
    Please use the official rendered version released here:
    https://docs.cilium.io

.. _api_rate_limiting:

*****************
API Rate Limiting
*****************

The per-node Cilium agent is essentially event-driven. For example, when a new
workload is scheduled onto the node, the CNI plugin is invoked, which in turn
makes an API call to the Cilium agent to allocate an IP address and create the
Cilium endpoint. Another example is the loading of network policy or service
definitions, where each change to these definitions creates an event that
notifies the Cilium agent that a modification is required.

Because the agent is event-driven, the amount of work it performs depends
heavily on the rate of external events it receives. To constrain the resources
that the Cilium agent consumes, it can be helpful to restrict the rate and the
number of parallel executions of API calls.

Default Rate Limits
===================

The following API calls are currently subject to rate limiting:

========================== ====== ===== ============= ============ ================= =========== ===============================
API Call                   Limit  Burst Max Parallel  Min Parallel Max Wait Duration Auto Adjust Estimated Processing Duration
========================== ====== ===== ============= ============ ================= =========== ===============================
``PUT /endpoint/{id}``     0.5/s  4     4                          15s               True        2s
``DELETE /endpoint/{id}``               4             4                              True        200ms
``GET /endpoint/{id}/*``   4/s    4     4             2            10s               True        200ms
``PATCH /endpoint/{id}*``  0.5/s  4     4                          15s               True        1s
``GET /endpoint``          1/s    4     2             2                              True        300ms
========================== ====== ===== ============= ============ ================= =========== ===============================

Configuration
=============

The ``api-rate-limit`` option can be used to override individual settings of the
default configuration::

    --api-rate-limit endpoint-create=rate-limit:2/s,rate-burst:4

API call to Configuration mapping
---------------------------------

========================== ====================
API Call                   Config Name
========================== ====================
``PUT /endpoint/{id}``     ``endpoint-create``
``DELETE /endpoint/{id}``  ``endpoint-delete``
``GET /endpoint/{id}/*``   ``endpoint-get``
``PATCH /endpoint/{id}*``  ``endpoint-patch``
``GET /endpoint``          ``endpoint-list``
========================== ====================

Configuration Parameters
------------------------

================================= ========= ========= =====================================================================================
Configuration Key                 Example   Default   Description
================================= ========= ========= =====================================================================================
``rate-limit``                    ``5/m``   None      Allowed requests per time unit in the format ``<number>/<duration>``.
``rate-burst``                    ``4``     None      Burst of API requests allowed by rate limiter.
``min-wait-duration``             ``10ms``  ``0``     Minimum duration each API call has to wait before being processed.
``max-wait-duration``             ``15s``   ``0``     Maximum duration an API call is allowed to wait before it fails.
``estimated-processing-duration`` ``100ms`` ``0``     Estimated processing duration of an average API call. Used for automatic adjustment.
``auto-adjust``                   ``true``  ``false`` Enable automatic adjustment of ``rate-limit``, ``rate-burst`` and ``parallel-requests``.
``parallel-requests``             ``4``     ``0``     Number of parallel API calls allowed.
``min-parallel-requests``         ``2``     ``0``     Lower limit of parallel requests when auto-adjusting.
``max-parallel-requests``         ``6``     ``0``     Upper limit of parallel requests when auto-adjusting.
``mean-over``                     ``10``    ``10``    Number of API calls to calculate mean processing duration for auto adjustment.
``log``                           ``true``  ``false`` Log an Info message for each API call processed.
``delayed-adjustment-factor``     ``0.25``  ``0.5``   Factor for slower adjustment of ``rate-burst`` and ``parallel-requests``.
``max-adjustment-factor``         ``10.0``  ``100.0`` Maximum factor the auto-adjusted values can deviate from the initial base values configured.
================================= ========= ========= =====================================================================================

Valid duration values
---------------------

The ``rate-limit`` option expects a value in the form ``<number>/<duration>``
where ``<duration>`` is a value that can be parsed with `ParseDuration()
<https://golang.org/pkg/time/#ParseDuration>`_. The supported units are:
``ns``, ``us``, ``ms``, ``s``, ``m``, ``h``.
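
To make the ``<number>/<duration>`` format concrete, the following is a minimal
sketch of how such a value could be converted into a per-second rate. The
function name ``parseRateLimit`` is hypothetical and this is not the agent's
own parser. It also assumes that a bare unit such as ``s`` or ``h`` stands for
one unit, because ``time.ParseDuration`` itself requires a leading number.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// parseRateLimit converts "<number>/<duration>" into requests per second.
// Hypothetical helper for illustration only; not Cilium's implementation.
func parseRateLimit(s string) (float64, error) {
	num, dur, ok := strings.Cut(s, "/")
	if !ok {
		return 0, fmt.Errorf("%q is not in the form <number>/<duration>", s)
	}
	n, err := strconv.ParseFloat(num, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid number %q: %w", num, err)
	}
	// Assumption: a bare unit ("s", "m", "h") means one unit ("1s", "1m", "1h").
	if dur != "" && (dur[0] < '0' || dur[0] > '9') && dur[0] != '.' {
		dur = "1" + dur
	}
	d, err := time.ParseDuration(dur)
	if err != nil {
		return 0, fmt.Errorf("invalid duration %q: %w", dur, err)
	}
	if d <= 0 {
		return 0, fmt.Errorf("duration %q must be positive", dur)
	}
	return n / d.Seconds(), nil
}

func main() {
	for _, s := range []string{"2/s", "10/2m", "3.5/h", "1/100ms"} {
		perSecond, err := parseRateLimit(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-8s -> %.6f requests/s\n", s, perSecond)
	}
}
```

Normalizing every value to requests per second makes it easy to compare
settings written with different units, for example ``10/2m`` and ``5/m``.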

**Examples:**

* ``rate-limit:10/2m``
* ``rate-limit:3.5/h``
* ``rate-limit:1/100ms``

Automatic Adjustment
====================

Static values are of limited use because the Cilium agent runs on many different
machine types. Deriving rate limits from the number of available CPU cores or
the amount of available memory can be misleading as well, because the Cilium
agent may be subject to CPU and memory constraints.

For this reason, all API call rate limiting is done with automatic adjustment
of the limits, with the goal of keeping the mean processing duration as close
as possible to the configured estimated processing duration. This processing
duration is specified for each group of API calls and is constantly monitored.

On completion of every API call, new limits are calculated. For this purpose, an
adjustment factor is calculated:

.. code-block:: go

    AdjustmentFactor := EstimatedProcessingDuration / MeanProcessingDuration
    AdjustmentFactor = Min(Max(AdjustmentFactor, 1.0/MaxAdjustmentFactor), MaxAdjustmentFactor)

This adjustment factor is then applied to ``rate-limit``, ``rate-burst`` and
``parallel-requests`` and steers the mean processing duration closer to the
estimated processing duration.

If ``delayed-adjustment-factor`` is specified, this additional factor is
used to slow the growth of ``rate-burst`` and ``parallel-requests``, as both
values should typically adjust more slowly than ``rate-limit``:

.. code-block:: go

    NewValue = OldValue * AdjustmentFactor
    NewValue = OldValue + ((NewValue - OldValue) * DelayedAdjustmentFactor)

Metrics
=======

All API calls subject to rate limiting will expose :ref:`metrics_api_rate_limiting`.
Example::

    cilium_api_limiter_adjustment_factor api_call="endpoint-create" 0.695787
    cilium_api_limiter_processed_requests_total api_call="endpoint-create" outcome="success" return_code="200" 7.000000
    cilium_api_limiter_processing_duration_seconds api_call="endpoint-create" value="estimated" 2.000000
    cilium_api_limiter_processing_duration_seconds api_call="endpoint-create" value="mean" 2.874443
    cilium_api_limiter_rate_limit api_call="endpoint-create" value="burst" 4.000000
    cilium_api_limiter_rate_limit api_call="endpoint-create" value="limit" 0.347894
    cilium_api_limiter_requests_in_flight api_call="endpoint-create" value="in-flight" 0.000000
    cilium_api_limiter_requests_in_flight api_call="endpoint-create" value="limit" 0.000000
    cilium_api_limiter_wait_duration_seconds api_call="endpoint-create" value="max" 15.000000
    cilium_api_limiter_wait_duration_seconds api_call="endpoint-create" value="mean" 0.000000
    cilium_api_limiter_wait_duration_seconds api_call="endpoint-create" value="min" 0.000000

Understanding the log output
============================

The API rate limiter logs under the ``rate`` subsystem. An example message can
be seen below::

    level=info msg="API call has been processed" name=endpoint-create processingDuration=772.847247ms subsys=rate totalDuration=14.923958916s uuid=d34a2e1f-1ac9-11eb-8663-42010a8a0fe1 waitDurationTotal=14.151023084s

The following is an explanation for all the API rate limiting messages:

::

    "Processing API request with rate limiter"

The request was admitted into the rate limiter. The associated HTTP context
(caller's request) has not yet timed out. The request will now be rate-limited
according to the configuration of the rate limiter. It will enter the waiting
stage according to the computed waiting duration.

::

    "API request released by rate limiter"

The request has finished waiting its computed duration to achieve
rate-limiting. The underlying HTTP API action will now take place. This means
that this request was not rejected with a 429 HTTP status code.

This is a common message when the requests are being processed within the
configured bounds of the rate limiter.

::

    "API call has been processed"

The API rate limiter has processed this request and the underlying HTTP API
action has finished. This means the request is no longer actively waiting or,
in other words, no longer being rate-limited. This does not mean the underlying
HTTP action has succeeded; only that this request has been dealt with.

::

    "Not processing API request due to cancelled context"

The underlying HTTP context (request) was cancelled. In other words, the caller
has given up on the request. This most likely means that the HTTP request timed
out. A 429 HTTP response status code is returned to the caller, which may or
may not receive it anyway.

::

    "Not processing API request. Wait duration for maximum parallel requests exceeds maximum"

The request has been denied by the rate limiter because too many parallel
requests are already in flight. The caller will receive a 429 HTTP status
response.

This is a common message when the rate limiter is doing its job of preventing
too many parallel requests at once.

::

    "Not processing API request. Wait duration exceeds maximum"

The request has been denied by the rate limiter because the request's waiting
duration would exceed the maximum configured waiting duration. For example, if
the maximum waiting duration was ``5s`` and, due to the backlog of the rate
limiter, this request would need to wait ``10s``, then this request would be
rejected. A 429 HTTP response status code would be returned to the caller.

This is the most common message when the rate limiter is doing its job of
pacing the incoming requests into Cilium.

::

    "Not processing API request due to cancelled context while waiting"

The request has been denied by the rate limiter because, after the request
waited its calculated waiting duration, the context associated with the request
was cancelled. In the most likely scenario, this means that there was an
HTTP timeout while the request was actively being rate-limited or, in other
words, actively being delayed by the rate limiter. A 429 HTTP response status
code is returned to the caller.
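
The wait durations behind these messages depend on the automatically adjusted
limits. As a rough check of the formulas in the Automatic Adjustment section,
the sketch below recomputes the adjustment factor from the sample metrics shown
earlier (estimated ``2.0`` s, mean ``2.874443`` s). The function names are
hypothetical, the default ``max-adjustment-factor`` of ``100.0`` is used, and
the factor is assumed to be applied to the configured base ``rate-limit`` of
``0.5/s``. Any rounding the agent applies to integer values such as
``rate-burst`` is not modeled.

```go
package main

import (
	"fmt"
	"math"
)

// adjustmentFactor follows the documented formula: estimated/mean,
// clamped to the range [1/maxFactor, maxFactor].
// Hypothetical helper for illustration only.
func adjustmentFactor(estimated, mean, maxFactor float64) float64 {
	f := estimated / mean
	return math.Min(math.Max(f, 1.0/maxFactor), maxFactor)
}

// delayedAdjust applies the documented delayed-adjustment step, which moves
// only a fraction (delayedFactor) of the way from oldValue to the new value.
func delayedAdjust(oldValue, factor, delayedFactor float64) float64 {
	newValue := oldValue * factor
	return oldValue + (newValue-oldValue)*delayedFactor
}

func main() {
	// Values taken from the sample metrics for endpoint-create.
	f := adjustmentFactor(2.0, 2.874443, 100.0)
	fmt.Printf("adjustment factor:   %.6f\n", f)

	// Assumption: the factor scales the configured base rate of 0.5/s.
	fmt.Printf("adjusted rate-limit: %.6f\n", 0.5*f)

	// Illustrative delayed adjustment of a value of 4 with the default 0.5.
	fmt.Printf("delayed adjustment:  %.6f\n", delayedAdjust(4, f, 0.5))
}
```

The computed factor and rate should land close to the
``cilium_api_limiter_adjustment_factor`` and ``value="limit"`` samples above.
Small differences in the last digit are expected because the published mean is
itself rounded.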