# Etcd quotas

Package etcd (and its subpackages) contains an etcd-based
[quota.Manager](https://github.com/google/trillian/blob/3cf59cdfd0/quota/quota.go#L101)
implementation, with a corresponding REST-based configuration service.

## Usage

First, ensure both `logserver` and `logsigner` are started with the
`--etcd_servers` and `--quota_system=etcd` flags, in addition to other flags.
`logserver` must also be started with a non-empty `--http_endpoint` flag, so the
REST quota API can be bound.

For example:

```bash
trillian_log_server \
  --etcd_servers=... \
  --http_endpoint=localhost:8091 \
  --quota_system=etcd

trillian_log_signer --etcd_servers=... --quota_system=etcd
```

If correctly started, the servers will be using etcd quotas. The default
configuration is empty, which means no quotas are enforced.

The REST quota API may be used to create and update configurations.

For example, the command below creates a sequencing-based, `global/write` quota.
Assuming an expected sequencing performance of 50 QPS, the `max_tokens`
specified below implies a backlog of about 1.6h (288000 / 50 = 5760 seconds).

```bash
curl \
  -d '@-' \
  -s \
  -H 'Content-Type: application/json' \
  -X POST \
  'localhost:8091/v1beta1/quotas/global/write/config' <<EOF
{
  "name": "quotas/global/write/config",
  "config": {
    "state": "ENABLED",
    "max_tokens": 288000,
    "sequencing_based": {
    }
  }
}
EOF
```

To list all configured quotas, run:

```bash
curl 'localhost:8091/v1beta1/quotas?view=FULL'
```

Quotas may also be retrieved (individually or via a series of filters), updated
and deleted through the REST API. See
[quotapb.proto](https://github.com/google/trillian/blob/master/quota/etcd/quotapb/quotapb.proto)
for an in-depth description of entities and available methods.
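
When choosing `max_tokens` for a sequencing-based quota, one way to think about
the value is as the expected sequencing rate multiplied by the backlog that is
acceptable to accumulate. The helper below is only a sketch of that arithmetic;
the function name `max_tokens_for` and the sample figures are illustrative and
are not part of Trillian.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Trillian): estimates max_tokens for a
# sequencing-based quota from an expected sequencing rate and a tolerated
# backlog.
max_tokens_for() {
  local qps="$1"            # expected sequencing throughput, in leaves per second
  local backlog_hours="$2"  # largest backlog to tolerate, in whole hours
  echo $(( qps * backlog_hours * 3600 ))
}

# Example: 100 QPS with a one-hour backlog.
max_tokens_for 100 1
```

The result can then be used as the `max_tokens` value in a configuration
request such as the one shown earlier in this section.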

### Maintenance and token exhaustion

During regular system operation, no quota-related maintenance should be
required, as the system should generate at least as many tokens as it spends.

If token exhaustion occurs, there are a few built-in mechanisms that allow for
manual intervention. The question of whether intervention is needed, though, is
an important one and should be answered before any attempts are made to bypass
the system. For example:

* is the `logsigner` working properly and able to keep up with the current
  demand?
* is there a spike in requests that may justify the current token exhaustion?

For "genuine" token exhaustion (i.e. the system really is under a load it can't
cope with), it may be beneficial to let the quota system deny requests until
regular operation is resumed.

That said, the sections below describe actions that may be taken to deal with
token exhaustion. All examples use `global/read` as the quota in question;
substitute the name as appropriate.

#### Resetting quotas

Resetting a quota restores its current token count to the configured
`max_tokens` value.

```bash
curl -X PATCH \
  'localhost:8091/v1beta1/quotas/global/read/config?reset_quota=true'
```

#### Disabling quotas

Disabling a quota makes it inactive, effective immediately. Disabled quotas may
be enabled again with a similar update (changing "DISABLED" to "ENABLED").

```bash
curl \
  -d '@-' \
  -s \
  -H 'Content-Type: application/json' \
  -X PATCH \
  'localhost:8091/v1beta1/quotas/global/read/config' <<EOF
{
  "config": {
    "state": "DISABLED"
  },
  "update_mask": ["state"]
}
EOF
```

#### Deleting quotas

Deleting a quota removes it permanently. Consider disabling it instead for a
temporary solution.

```bash
curl -X DELETE 'localhost:8091/v1beta1/quotas/global/read/config'
```

### Flags

The following flags apply to etcd quotas:

* [--quota_dry_run](https://github.com/google/trillian/blob/3cf59cdfd0/server/trillian_log_server/main.go#L61)
  (log and map servers)
* [--quota_increase_factor](https://github.com/google/trillian/blob/3cf59cdfd0/server/trillian_log_signer/main.go#L60)
  (logsigner)
* [--quota_max_cache_entries](https://github.com/google/trillian/blob/c0a332878f/server/trillian_log_server/main.go#L71)
  (log and map servers)
* [--quota_min_batch_size](https://github.com/google/trillian/blob/c0a332878f/server/trillian_log_server/main.go#L69)
  (log and map servers)

`--quota_dry_run`, when set to true, stops quota depletion from blocking
requests. This applies to all quotas, so it's only recommended in early
evaluations of the quota system.

`--quota_increase_factor` is related to token leakage protection. It applies
only to sequencing-based quotas. If `--quota_increase_factor` is 1, each new
leaf sequenced by `logsigner` restores exactly one token. If it's higher than 1,
more tokens are restored per leaf batch. A value slightly higher than 1 (e.g.
1.1) is recommended, so there is some protection against token leakage without
too much compromise of the quota system in exceptional situations.

`--quota_max_cache_entries` and `--quota_min_batch_size` are related to token
caching. Some level of token caching (i.e. both flags having values > 0) is
recommended to lessen the latency impact of rate limiting.

`--quota_min_batch_size` is the minimum number of tokens acquired from etcd. If
a particular request demands fewer tokens than the minimum batch size, the
remaining tokens are kept in memory, potentially sparing later requests a round
trip to etcd until those tokens are consumed.
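
The effect of batching can be illustrated with a toy model. The sketch below
assumes every request needs exactly one token and counts how often a batch
would have to be acquired from etcd. The function name `etcd_fetches` is
hypothetical, and the model does not reproduce Trillian's actual cache
implementation.

```bash
#!/usr/bin/env bash
# Illustrative model only: counts how many times tokens would be fetched from
# etcd when each request needs one token and tokens are acquired in batches.
etcd_fetches() {
  local requests="$1" batch_size="$2"
  local cached=0 fetches=0 i
  if (( batch_size < 1 )); then
    batch_size=1   # a batch always contains at least the requested token
  fi
  for (( i = 0; i < requests; i++ )); do
    if (( cached == 0 )); then
      cached=$batch_size          # acquire a whole batch from etcd
      fetches=$(( fetches + 1 ))
    fi
    cached=$(( cached - 1 ))      # serve the request from in-memory tokens
  done
  echo "$fetches"
}

etcd_fetches 250 1     # no batching: one fetch per request
etcd_fetches 250 100   # batching: most requests are served from memory
```

In this model, a larger batch size trades fewer etcd round trips for more
tokens held in memory by a single server.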

`--quota_max_cache_entries` determines how many quota Specs are cached. Tokens
are cached per Spec using an LRU replacement policy. In systems with a high
number of trees or users, the least recently used Specs are evicted from the
cache (and their tokens returned).

### Monitoring

The following metrics are relevant when considering quota behavior:

* [interceptor_request_count](https://github.com/google/trillian/blob/3cf59cdfd0/server/interceptor/interceptor.go#L91)
* [interceptor_request_denied_count](https://github.com/google/trillian/blob/3cf59cdfd0/server/interceptor/interceptor.go#L95)
* [quota_acquired_tokens](https://github.com/google/trillian/blob/3cf59cdfd0/quota/metrics.go#L70)
* [quota_returned_tokens](https://github.com/google/trillian/blob/3cf59cdfd0/quota/metrics.go#L71)
* [quota_replenished_tokens](https://github.com/google/trillian/blob/3cf59cdfd0/quota/metrics.go#L71)

Requests denied due to token shortage are labeled on
**interceptor_request_denied_count** as
[insufficient_tokens](https://github.com/google/trillian/blob/3cf59cdfd0/server/interceptor/interceptor.go#L38).
The ratio between requests denied with **insufficient_tokens** and
**interceptor_request_count** is a strong indicator of token exhaustion.

## General concepts

Trillian quotas have a finite number of tokens that get consumed by requests.
Once a quota reaches zero tokens, all requests that would otherwise consume a
token from it will fail with a **resource_exhausted** error. Tokens are
replenished by different mechanisms, depending on the quota configuration (e.g.,
X tokens every Y seconds).

Quotas are designed so that a set of quotas, at different levels of granularity,
applies to a single request.

A quota
[Spec](https://github.com/google/trillian/blob/3cf59cdfd0/quota/quota.go#L56)
identifies a particular quota and the requests to which it applies.
Specs contain a
[Group](https://github.com/google/trillian/blob/3cf59cdfd0/quota/quota.go#L27)
(`global`, `tree` or `user`) and a
[Kind](https://github.com/google/trillian/blob/3cf59cdfd0/quota/quota.go#L44)
(`read` or `write`).

A few Spec examples are:

* `global/read` (all read requests)
* `global/write` (all write requests)
* `trees/123/write` (write requests for tree 123)
* `users/alice/read` (read requests made by user "alice")

Each request, depending on whether it's a read or write request, subtracts
tokens from the following Specs:

| read requests  | write requests  |
| -------------- | --------------- |
| users/$id/read | users/$id/write |
| trees/$id/read | trees/$id/write |
| global/read    | global/write    |

Quotas that aren't explicitly configured are considered infinite and won't block
requests.

## Etcd quotas

Etcd quotas implement the concepts described above by storing the quota
configuration and token count in etcd.

Two replenishment mechanisms are available: sequencing-based and time-based.

Sequencing-based replenishment is tied to `logsigner`'s progress. A token is
restored for each leaf sequenced from the `Unsequenced` table. As such, it's
only applicable to `global/write` and `trees/write` quotas.

Time-based replenishment restores X tokens every Y seconds. It may be applied to
all quotas.

### MMD protection

Sequencing-based quotas may be used as a form of MMD (maximum merge delay)
protection. If the number of write requests accepted by Trillian goes beyond
`logsigner`'s configured processing capability, tokens will eventually be
exhausted and the system will fail new write requests with a
**resource_exhausted** error. While not ideal, this helps avoid an eventual MMD
loss, which may be a graver offense than temporary loss of availability.
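
As a rough illustration of how long a sequencing-based quota can absorb a
sustained overload, the sketch below divides the token budget by the net drain
rate. It is a simplified model, not Trillian code: it assumes a full quota at
the start, constant rates, and one token restored per sequenced leaf. The
function name `seconds_to_exhaustion` and the sample rates are hypothetical.

```bash
#!/usr/bin/env bash
# Simplified model, not Trillian code: seconds until a sequencing-based quota
# runs out of tokens when writes arrive faster than logsigner sequences them.
# Assumes one token is restored per sequenced leaf (increase factor of 1).
seconds_to_exhaustion() {
  local max_tokens="$1" write_qps="$2" sequencing_qps="$3"
  if (( write_qps <= sequencing_qps )); then
    echo "never"   # tokens are restored at least as fast as they are spent
    return
  fi
  echo $(( max_tokens / (write_qps - sequencing_qps) ))
}

seconds_to_exhaustion 288000 80 50   # sustained overload
seconds_to_exhaustion 288000 40 50   # within sequencing capacity
```

Once the budget is spent in this model, new write requests are denied until
sequencing restores tokens.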

Both `global/write` and `trees/write` quotas may be used for MMD protection
purposes. It's strongly recommended that `global/write` is set up as a last line
of defense for all systems.

### QPS limits

Time-based quotas effectively work as QPS (queries-per-second) limits (X tokens
in Y seconds is roughly equivalent to X/Y QPS).

All quotas may be configured as time-based, but time-based quotas may be
particularly useful as per-tree limits (e.g. limiting test or archival trees) or
per-user limits.

### Default quotas

Default quotas are pre-configured limits that get automatically applied to new
trees or users.

TODO(codingllama): Default quotas are not yet implemented.

### Quota users

User level quotas are applied to "quota users". Trillian makes no assumptions
about what a quota user is. Therefore, initially, there's a single default user
that is charged for all requests (note that, since no quotas are created by
default, this user is charged against quotas that are effectively infinite).
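
To make the role of the quota user concrete, the sketch below prints the Specs
a single request would be charged against, following the `users`, `trees` and
`global` layout from the General concepts section. The function name
`charged_specs` and the identifiers `alice` and `123` are placeholders, not
values defined by Trillian.

```bash
#!/usr/bin/env bash
# Illustrative only: prints the quota Specs a request would be charged
# against, one per line, from most to least specific.
charged_specs() {
  local kind="$1"     # "read" or "write"
  local tree_id="$2"  # tree targeted by the request
  local user_id="$3"  # quota user charged for the request
  printf '%s\n' \
    "users/${user_id}/${kind}" \
    "trees/${tree_id}/${kind}" \
    "global/${kind}"
}

# "alice" and tree 123 are placeholder identifiers.
charged_specs write 123 alice
```

With a single default user, the first Spec printed is the same for every
request, so per-user limits only become meaningful once requests are
attributed to distinct quota users.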