github.com/cilium/cilium@v1.16.2/Documentation/configuration/api-rate-limiting.rst

.. only:: not (epub or latex or html)

    WARNING: You are looking at unreleased Cilium documentation.
    Please use the official rendered version released here:
    https://docs.cilium.io

.. _api_rate_limiting:

*****************
API Rate Limiting
*****************

The per-node Cilium agent is essentially event-driven. For example, when a new
workload is scheduled onto the node, the CNI plugin is invoked and in turn
makes an API call to the Cilium agent to allocate an IP address and create the
Cilium endpoint. Another example is the loading of network policy or service
definitions: each change to these definitions creates an event that notifies
the Cilium agent that a modification is required.

Because the agent is event-driven, the amount of work it performs depends
heavily on the rate of external events it receives. To constrain the resources
that the Cilium agent consumes, it can be helpful to restrict both the rate and
the number of parallel executions of API calls.

Default Rate Limits
===================

The following API calls are currently subject to rate limiting:

========================== ====== ===== ============= ============ ================= =========== ===============================
API Call                   Limit  Burst Max Parallel  Min Parallel Max Wait Duration Auto Adjust Estimated Processing Duration
========================== ====== ===== ============= ============ ================= =========== ===============================
``PUT /endpoint/{id}``     0.5/s  4     4                          15s               True        2s
``DELETE /endpoint/{id}``               4             4                              True        200ms
``GET /endpoint/{id}/*``   4/s    4     4             2            10s               True        200ms
``PATCH /endpoint/{id}*``  0.5/s  4     4                          15s               True        1s
``GET /endpoint``          1/s    4     2             2                              True        300ms
========================== ====== ===== ============= ============ ================= =========== ===============================

Configuration
=============

The ``api-rate-limit`` option can be used to override individual settings of the
default configuration::

   --api-rate-limit endpoint-create=rate-limit:2/s,rate-burst:4
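
The structure of the option value above (an API call name, ``=``, then
comma-separated ``key:value`` settings) can be illustrated with a minimal Go
sketch. ``parseAPIRateLimit`` is a hypothetical helper written for this
example; it is not Cilium's own parser and performs no validation of the
individual keys.

.. code-block:: go

    package main

    import (
        "fmt"
        "strings"
    )

    // parseAPIRateLimit splits a value such as
    // "endpoint-create=rate-limit:2/s,rate-burst:4" into the API call name
    // and its settings. Hypothetical helper, for illustration only.
    func parseAPIRateLimit(value string) (string, map[string]string, error) {
        name, settings, ok := strings.Cut(value, "=")
        if !ok || name == "" {
            return "", nil, fmt.Errorf("missing API call name or '=' in %q", value)
        }
        parsed := make(map[string]string)
        for _, pair := range strings.Split(settings, ",") {
            // Cut at the first ':' only, so values such as "1/100ms" stay intact.
            key, val, ok := strings.Cut(pair, ":")
            if !ok || key == "" || val == "" {
                return "", nil, fmt.Errorf("setting %q is not in key:value form", pair)
            }
            parsed[key] = val
        }
        return name, parsed, nil
    }

    func main() {
        name, settings, err := parseAPIRateLimit("endpoint-create=rate-limit:2/s,rate-burst:4")
        if err != nil {
            panic(err)
        }
        fmt.Println(name, settings["rate-limit"], settings["rate-burst"])
    }

Settings that are not listed in the value keep their defaults, which is why
the example only names the two keys it changes.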

API call to Configuration mapping
---------------------------------

========================== ====================
API Call                   Config Name
========================== ====================
``PUT /endpoint/{id}``     ``endpoint-create``
``DELETE /endpoint/{id}``  ``endpoint-delete``
``GET /endpoint/{id}/*``   ``endpoint-get``
``PATCH /endpoint/{id}*``  ``endpoint-patch``
``GET /endpoint``          ``endpoint-list``
========================== ====================

Configuration Parameters
------------------------

================================= ========= ========= =====================================================================================
Configuration Key                 Example   Default   Description
================================= ========= ========= =====================================================================================
``rate-limit``                    ``5/m``   None      Allowed requests per time unit in the format ``<number>/<duration>``.
``rate-burst``                    ``4``     None      Burst of API requests allowed by rate limiter.
``min-wait-duration``             ``10ms``  ``0``     Minimum wait duration each API call has to wait before being processed.
``max-wait-duration``             ``15s``   ``0``     Maximum duration an API call is allowed to wait before it fails.
``estimated-processing-duration`` ``100ms`` ``0``     Estimated processing duration of an average API call. Used for automatic adjustment.
``auto-adjust``                   ``true``  ``false`` Enable automatic adjustment of ``rate-limit``, ``rate-burst`` and ``parallel-requests``.
``parallel-requests``             ``4``     ``0``     Number of parallel API calls allowed.
``min-parallel-requests``         ``2``     ``0``     Lower limit of parallel requests when auto-adjusting.
``max-parallel-requests``         ``6``     ``0``     Upper limit of parallel requests when auto-adjusting.
``mean-over``                     ``10``    ``10``    Number of API calls to calculate mean processing duration for auto adjustment.
``log``                           ``true``  ``false`` Log an Info message for each API call processed.
``delayed-adjustment-factor``     ``0.25``  ``0.5``   Factor for slower adjustment of ``rate-burst`` and ``parallel-requests``.
``max-adjustment-factor``         ``10.0``  ``100.0`` Maximum factor the auto-adjusted values can deviate from the initial base values configured.
================================= ========= ========= =====================================================================================

Valid duration values
---------------------

The ``rate-limit`` option expects a value in the form ``<number>/<duration>``
where ``<duration>`` is a value that can be parsed with `ParseDuration()
<https://golang.org/pkg/time/#ParseDuration>`_. The supported units are:
``ns``, ``us``, ``ms``, ``s``, ``m``, ``h``.

**Examples:**

* ``rate-limit:10/2m``
* ``rate-limit:3.5/h``
* ``rate-limit:1/100ms``
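
To make the examples above concrete, the following Go sketch converts a
``<number>/<duration>`` value into requests per second using
``time.ParseDuration``. ``requestsPerSecond`` is a hypothetical helper and not
the agent's parser. It assumes that a bare unit such as ``s`` in ``2/s`` means
one unit (``1s``), which is how the values elsewhere on this page read.

.. code-block:: go

    package main

    import (
        "fmt"
        "strconv"
        "strings"
        "time"
    )

    // requestsPerSecond converts a "number/duration" value into a rate in
    // requests per second. Hypothetical helper, for illustration only.
    func requestsPerSecond(value string) (float64, error) {
        number, interval, ok := strings.Cut(value, "/")
        if !ok || interval == "" {
            return 0, fmt.Errorf("%q is not in number/duration form", value)
        }
        n, err := strconv.ParseFloat(number, 64)
        if err != nil {
            return 0, fmt.Errorf("invalid number %q: %w", number, err)
        }
        d, err := time.ParseDuration(interval)
        if err != nil {
            // Assumption: a bare unit such as "s" stands for "1s".
            d, err = time.ParseDuration("1" + interval)
        }
        if err != nil {
            return 0, fmt.Errorf("invalid duration %q: %w", interval, err)
        }
        if d <= 0 {
            return 0, fmt.Errorf("duration must be positive, got %s", d)
        }
        return n / d.Seconds(), nil
    }

    func main() {
        for _, v := range []string{"10/2m", "3.5/h", "1/100ms", "2/s"} {
            r, err := requestsPerSecond(v)
            if err != nil {
                panic(err)
            }
            fmt.Printf("%-8s = %.6f requests/s\n", v, r)
        }
    }

In this sketch a duration without any unit, such as ``10/10``, is rejected
because ``time.ParseDuration`` requires a unit suffix.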

Automatic Adjustment
====================

Static values are of limited use because the Cilium agent runs on different
machine types. Deriving rate limits from the number of available CPU cores or
the amount of available memory can be misleading as well, because the Cilium
agent may be subject to CPU and memory constraints.

For this reason, all API call rate limiting is done with automatic adjustment
of the limits, with the goal of staying as close as possible to the configured
estimated processing duration. This processing duration is specified for each
group of API calls and is constantly monitored.

On completion of every API call, new limits are calculated. For this purpose, an
adjustment factor is calculated:

.. code-block:: go

    AdjustmentFactor := EstimatedProcessingDuration / MeanProcessingDuration
    AdjustmentFactor = Min(Max(AdjustmentFactor, 1.0/MaxAdjustmentFactor), MaxAdjustmentFactor)

This adjustment factor is then applied to ``rate-limit``, ``rate-burst`` and
``parallel-requests`` and steers the mean processing duration closer
to the estimated processing duration.
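
As a worked example, the Go sketch below evaluates the two formula lines for
an estimated processing duration of 2s and a measured mean of about 2.87s.
The function name ``adjustmentFactor`` and the handling of a zero mean are
assumptions made for this example, not the agent's implementation.

.. code-block:: go

    package main

    import "fmt"

    // adjustmentFactor follows the formula above: the ratio of estimated to
    // mean processing duration, clamped to [1/maxFactor, maxFactor].
    func adjustmentFactor(estimated, mean, maxFactor float64) float64 {
        if mean <= 0 {
            // Assumption: without measurements, leave the limits unchanged.
            return 1.0
        }
        factor := estimated / mean
        if lower := 1.0 / maxFactor; factor < lower {
            return lower
        }
        if factor > maxFactor {
            return maxFactor
        }
        return factor
    }

    func main() {
        // Durations in seconds; 100.0 is the default max-adjustment-factor.
        factor := adjustmentFactor(2.0, 2.874443, 100.0)
        fmt.Printf("adjustment factor: %.4f\n", factor)
        // Applied to a base rate-limit of 0.5/s.
        fmt.Printf("adjusted rate-limit: %.4f/s\n", 0.5*factor)
    }

Because calls take longer than estimated, the factor is below 1 (roughly 0.70)
and the rate limit shrinks from 0.5/s to roughly 0.35/s.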

If ``delayed-adjustment-factor`` is specified, then this additional factor is
used to slow the growth of the ``rate-burst`` and ``parallel-requests`` as both
values should typically adjust slower than ``rate-limit``:

.. code-block:: go

    NewValue = OldValue * AdjustmentFactor
    NewValue = OldValue + ((NewValue - OldValue) * DelayedAdjustmentFactor)
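
The effect of the delayed adjustment can be checked with a few numbers. The
sketch below is a direct transcription of the two lines above into a runnable
Go function; the name ``adjust`` and the sample values are chosen for this
example only.

.. code-block:: go

    package main

    import "fmt"

    // adjust scales oldValue by adjustmentFactor, then moves only a fraction
    // (delayedAdjustmentFactor) of the way from oldValue towards that result.
    func adjust(oldValue, adjustmentFactor, delayedAdjustmentFactor float64) float64 {
        newValue := oldValue * adjustmentFactor
        return oldValue + (newValue-oldValue)*delayedAdjustmentFactor
    }

    func main() {
        fmt.Printf("%.2f\n", adjust(4, 2.0, 1.0)) // 8.00: no delay, full adjustment
        fmt.Printf("%.2f\n", adjust(4, 2.0, 0.5)) // 6.00: half of the increase
        fmt.Printf("%.2f\n", adjust(4, 0.5, 0.5)) // 3.00: half of the decrease
    }

As written, the formula dampens decreases as well as increases: with a factor
of ``0.5``, the value moves halfway towards the fully adjusted result in
either direction.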

Metrics
=======

All API calls subject to rate limiting will expose :ref:`metrics_api_rate_limiting`. Example::

    cilium_api_limiter_adjustment_factor                  api_call="endpoint-create"                                                0.695787
    cilium_api_limiter_processed_requests_total           api_call="endpoint-create" outcome="success" return_code="200"            7.000000
    cilium_api_limiter_processing_duration_seconds        api_call="endpoint-create" value="estimated"                              2.000000
    cilium_api_limiter_processing_duration_seconds        api_call="endpoint-create" value="mean"                                   2.874443
    cilium_api_limiter_rate_limit                         api_call="endpoint-create" value="burst"                                  4.000000
    cilium_api_limiter_rate_limit                         api_call="endpoint-create" value="limit"                                  0.347894
    cilium_api_limiter_requests_in_flight                 api_call="endpoint-create" value="in-flight"                              0.000000
    cilium_api_limiter_requests_in_flight                 api_call="endpoint-create" value="limit"                                  0.000000
    cilium_api_limiter_wait_duration_seconds              api_call="endpoint-create" value="max"                                    15.000000
    cilium_api_limiter_wait_duration_seconds              api_call="endpoint-create" value="mean"                                   0.000000
    cilium_api_limiter_wait_duration_seconds              api_call="endpoint-create" value="min"                                    0.000000

Understanding the log output
============================

The API rate limiter logs under the ``rate`` subsystem. An example message can
be seen below::

   level=info msg="API call has been processed" name=endpoint-create processingDuration=772.847247ms subsys=rate totalDuration=14.923958916s uuid=d34a2e1f-1ac9-11eb-8663-42010a8a0fe1 waitDurationTotal=14.151023084s

The following is an explanation for all the API rate limiting messages:

::

   "Processing API request with rate limiter"

The request was admitted into the rate limiter. The associated HTTP context
(caller's request) has not yet timed out. The request will now be rate-limited
according to the configuration of the rate limiter. It will enter the waiting
stage according to the computed waiting duration.

::

   "API request released by rate limiter"

The request has finished waiting its computed duration to achieve
rate-limiting. The underlying HTTP API action will now take place. This means
that this request was not rejected with a 429 HTTP status code.

This is a common message when the requests are being processed within the
configured bounds of the rate limiter.

::

   "API call has been processed"

The API rate limiter has processed this request and the underlying HTTP API
action has finished. This means the request is no longer actively waiting; in
other words, it is no longer being rate-limited. This does not mean the
underlying HTTP action has succeeded; only that this request has been dealt
with.

::

   "Not processing API request due to cancelled context"

The underlying HTTP context (request) was cancelled. In other words, the caller
has given up on the request. This most likely means that the HTTP request timed
out. A 429 HTTP response status code is returned to the caller, which may or
may not receive it anyway.

::

   "Not processing API request. Wait duration for maximum parallel requests exceeds maximum"

The request has been denied by the rate limiter because too many parallel
requests are already in flight. The caller will receive a 429 HTTP status
response.

This is a common message when the rate limiter is doing its job of preventing
too many parallel requests at once.

::

   "Not processing API request. Wait duration exceeds maximum"

The request has been denied by the rate limiter because the request's waiting
duration would exceed the maximum configured waiting duration. For example, if
the maximum waiting duration was ``5s`` and due to the backlog of the rate
limiter, this request would need to wait ``10s``, then this request would be
rejected. A 429 HTTP response status code would be returned to the caller.

This is the most common message when the rate limiter is doing its job of
pacing the incoming requests into Cilium.
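
The ``5s`` versus ``10s`` example can be expressed as a small Go sketch.
``exceedsMaxWait`` is a hypothetical helper, not the agent's code, and it
assumes that a ``max-wait-duration`` of ``0`` (the default in the parameter
table) disables the check.

.. code-block:: go

    package main

    import (
        "fmt"
        "net/http"
        "time"
    )

    // exceedsMaxWait reports whether a request that would have to wait for
    // `wait` is rejected under the configured maximum wait duration.
    // Assumption: maxWait == 0 means no maximum is enforced.
    func exceedsMaxWait(wait, maxWait time.Duration) bool {
        return maxWait > 0 && wait > maxWait
    }

    func main() {
        maxWait := 5 * time.Second
        for _, wait := range []time.Duration{2 * time.Second, 10 * time.Second} {
            if exceedsMaxWait(wait, maxWait) {
                fmt.Printf("wait %s: rejected with HTTP %d\n", wait, http.StatusTooManyRequests)
                continue
            }
            fmt.Printf("wait %s: request waits, then is released\n", wait)
        }
    }

With a maximum of ``5s``, the request that needs ``2s`` proceeds to the
waiting stage, while the one that needs ``10s`` is rejected immediately
instead of occupying the queue.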

::

   "Not processing API request due to cancelled context while waiting"

The request has been denied by the rate limiter because, after the request had
waited its calculated waiting duration, the context associated with the request
was found to be cancelled. In the most likely scenario, this means that there
was an HTTP timeout while the request was actively being rate-limited; in other
words, while it was being delayed by the rate limiter. A 429 HTTP response
status code is returned to the caller.