# Query Frontend

The `thanos query-frontend` command implements a service that can be put in front of Thanos Queriers to improve the read path. It is based on the [Cortex Query Frontend](https://cortexmetrics.io/docs/architecture/#query-frontend) component, so you can find some common features like `Splitting` and `Results Caching`.

Query Frontend is fully stateless and horizontally scalable.

Example command to run Query Frontend:

```bash
thanos query-frontend \
    --http-address "0.0.0.0:9090" \
    --query-frontend.downstream-url="<thanos-querier>:<querier-http-port>"
```

_**NOTE:** Currently only range queries (`/api/v1/query_range` API call) are actually processed through Query Frontend. All other API calls go directly to the downstream Querier, which means only range queries are split and cached. But we are planning to support instant queries as well._

For more information please check out [initial design proposal](../proposals-done/202004-embedd-cortex-frontend.md).

## Features

### Splitting

Query Frontend splits a long query into multiple short queries based on the configured `--query-range.split-interval` flag. The default value of `--query-range.split-interval` is `24h`. When caching is enabled it should be greater than `0`.

There are some benefits from query splitting:

1. Safeguard. It prevents large queries from causing OOM issues in Queriers.
2. Better parallelization.
3. Better load balancing for Queriers.

### Retry

Query Frontend supports a retry mechanism that retries a query when the HTTP request fails. The `--query-range.max-retries-per-request` flag limits the maximum number of retries.

### Caching

Query Frontend supports caching query results and reuses them on subsequent queries.
If the cached results are incomplete, Query Frontend calculates the required subqueries and executes them in parallel on downstream queriers. Query Frontend can optionally align queries with their step parameter to improve the cacheability of the query results. Currently, in-memory cache (fifo cache), memcached, and redis are supported.

#### Excluded from caching

* Requests that support deduplication and have it disabled with `dedup=false`. Read more about deduplication in [Dedup documentation](query.md#deduplication-enabled).
* Requests that specify Store Matchers.
* Requests where downstream queriers set the header `Cache-Control=no-store` in the response:
  * Requests with a partial response.
  * Requests with other warnings.

#### In-memory

```yaml mdox-exec="go run scripts/cfggen/main.go --name=queryfrontend.InMemoryResponseCacheConfig"
type: IN-MEMORY
config:
  max_size: ""
  max_size_items: 0
  validity: 0s
```

`max_size`: Maximum memory size of the cache in bytes. A unit suffix (KB, MB, GB) may be applied.

**_NOTE:_** If neither `max_size` nor `max_size_items` is set, the cache is not created.

If only one of `max_size` and `max_size_items` is set, there is no limit on the other field. For example, if only `max_size_items` is set to 1000, then `max_size` is unlimited. Similarly, if only `max_size` is set, then `max_size_items` is unlimited.
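
The sizing rule above can be sketched as a config file written from the shell. This is only an illustration: the file name `in-memory-cache.yaml` is a hypothetical choice, and the limit of 1000 items mirrors the example in the text rather than a recommended value.

```bash
# Write an in-memory response cache config that caps only the item count.
# The file name is an assumption for this sketch; any path works.
cat > in-memory-cache.yaml <<'EOF'
type: IN-MEMORY
config:
  max_size: ""          # left unset, so there is no byte-size limit
  max_size_items: 1000  # the cache holds at most 1000 items
  validity: 0s          # kept at the generated default shown above
EOF
```

The file can then be passed to Query Frontend with `--query-range.response-cache-config-file=in-memory-cache.yaml`.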

Example configuration: [kube-thanos](https://github.com/thanos-io/kube-thanos/blob/master/examples/all/manifests/thanos-query-frontend-deployment.yaml#L50-L54)

#### Memcached

```yaml mdox-exec="go run scripts/cfggen/main.go --name=queryfrontend.MemcachedResponseCacheConfig"
type: MEMCACHED
config:
  addresses: []
  timeout: 0s
  max_idle_connections: 0
  max_async_concurrency: 0
  max_async_buffer_size: 0
  max_get_multi_concurrency: 0
  max_item_size: 0
  max_get_multi_batch_size: 0
  dns_provider_update_interval: 0s
  auto_discovery: false
  expiration: 0s
```

`expiration` specifies how long memcached keeps cached entries. If set to `0s`, the default expiration time of 24 hours is used.

If a `set` operation is skipped because the item size is larger than `max_item_size`, the event is tracked by the counter metric `cortex_memcache_client_set_skip_total`.

For the other cache configuration parameters, refer to [memcached-index-cache](store.md#memcached-index-cache).

The default memcached config is:

```yaml
type: MEMCACHED
config:
  addresses: [your-memcached-addresses]
  timeout: 500ms
  max_idle_connections: 100
  max_item_size: 1MiB
  max_async_concurrency: 10
  max_async_buffer_size: 10000
  max_get_multi_concurrency: 100
  max_get_multi_batch_size: 0
  dns_provider_update_interval: 10s
  expiration: 24h
```

#### Redis

The default redis config is:

```yaml mdox-exec="go run scripts/cfggen/main.go --name=queryfrontend.RedisResponseCacheConfig"
type: REDIS
config:
  addr: ""
  username: ""
  password: ""
  db: 0
  dial_timeout: 5s
  read_timeout: 3s
  write_timeout: 3s
  max_get_multi_concurrency: 100
  get_multi_batch_size: 100
  max_set_multi_concurrency: 100
  set_multi_batch_size: 100
  tls_enabled: false
  tls_config:
    ca_file: ""
    cert_file: ""
    key_file: ""
    server_name: ""
    insecure_skip_verify: false
  cache_size: 0
  master_name: ""
  max_async_buffer_size: 10000
  max_async_concurrency: 20
  expiration: 24h0m0s
```

`expiration` specifies how long redis keeps cached entries. If set to `0s`, the default expiration time of 24 hours is used.

For the other cache configuration parameters, refer to [redis-index-cache](store.md#redis-index-cache).

### Slow Query Log

Query Frontend supports the `--query-frontend.log-queries-longer-than` flag to log queries running longer than the given duration.

## Naming

Naming is hard :) Please check [here](https://github.com/thanos-io/thanos/pull/2434#discussion_r408300683) to see why we chose `query-frontend` as the name.

## Recommended Downstream Tripper Configuration

You can configure the parameters of the HTTP client that `query-frontend` uses for the downstream URL with the flags `--query-frontend.downstream-tripper-config` and `--query-frontend.downstream-tripper-config-file`.
If the downstream URL points to a single host, most likely a load balancer, it is highly recommended to increase `max_idle_conns_per_host` via these flags to at least 100. Otherwise `query-frontend` will not be able to leverage HTTP keep-alive connections, and the latency will be 10 - 20% higher. By default, the Go HTTP client keeps only two idle connections per host.

Keys which denote a duration are strings that can end with `s` or `m` to indicate seconds or minutes respectively. All of the other keys are integers. Supported keys are:

* `idle_conn_timeout` - timeout of idle connections (string);
* `response_header_timeout` - maximum duration to wait for a response header (string);
* `tls_handshake_timeout` - maximum duration of a TLS handshake (string);
* `expect_continue_timeout` - [Go source code](https://github.com/golang/go/blob/912f0750472dd4f674b69ca1616bfaf377af1805/src/net/http/transport.go#L220-L226) (string);
* `max_idle_conns` - maximum number of idle connections to all hosts (integer);
* `max_idle_conns_per_host` - maximum number of idle connections to each host (integer);
* `max_conns_per_host` - maximum number of connections to each host (integer).

You can find the default values [here](https://github.com/thanos-io/thanos/blob/55cb8ca38b3539381dc6a781e637df15c694e50a/pkg/exthttp/transport.go#L12-L27).

## Forward Headers to Downstream Queriers

The `--query-frontend.forward-header` flag specifies the list of request headers that Query Frontend forwards to downstream queriers.

If downstream queriers require basic authentication, run query-frontend as follows:

```bash
thanos query-frontend \
    --http-address "0.0.0.0:9090" \
    --query-frontend.forward-header "Authorization" \
    --query-frontend.downstream-url="<thanos-querier>:<querier-http-port>"
```

## Flags

```$ mdox-exec="thanos query-frontend --help"
usage: thanos query-frontend [<flags>]

Query frontend command implements a service deployed in front of queriers to
improve query parallelization and caching.

Flags:
      --cache-compression-type=""
                                 Use compression in results cache.
                                 Supported values are: 'snappy' and '' (disable
                                 compression).
  -h, --help                     Show context-sensitive help (also try
                                 --help-long and --help-man).
      --http-address="0.0.0.0:10902"
                                 Listen host:port for HTTP endpoints.
      --http-grace-period=2m     Time to wait after an interrupt received for
                                 HTTP Server.
      --http.config=""           [EXPERIMENTAL] Path to the configuration file
                                 that can enable TLS or authentication for all
                                 HTTP endpoints.
      --labels.default-time-range=24h
                                 The default metadata time range duration for
                                 retrieving labels through Labels and Series API
                                 when the range parameters are not specified.
      --labels.max-query-parallelism=14
                                 Maximum number of labels requests will be
                                 scheduled in parallel by the Frontend.
      --labels.max-retries-per-request=5
                                 Maximum number of retries for a single
                                 label/series API request; beyond this,
                                 the downstream error is returned.
      --labels.partial-response  Enable partial response for labels requests
                                 if no partial_response param is specified.
                                 --no-labels.partial-response for disabling.
      --labels.response-cache-config=<content>
                                 Alternative to
                                 'labels.response-cache-config-file' flag
                                 (mutually exclusive). Content of YAML file that
                                 contains response cache configuration.
      --labels.response-cache-config-file=<file-path>
                                 Path to YAML file that contains response cache
                                 configuration.
      --labels.response-cache-max-freshness=1m
                                 Most recent allowed cacheable result for
                                 labels requests, to prevent caching very recent
                                 results that might still be in flux.
      --labels.split-interval=24h
                                 Split labels requests by an interval and
                                 execute in parallel, it should be greater
                                 than 0 when labels.response-cache-config is
                                 configured.
      --log.format=logfmt        Log format to use. Possible options: logfmt or
                                 json.
      --log.level=info           Log filtering level.
      --log.request.decision=    Deprecation Warning - This flag would
                                 be soon deprecated, and replaced with
                                 `request.logging-config`. Request Logging
                                 for logging the start and end of requests.
                                 By default this flag is disabled. LogFinishCall
                                 : Logs the finish call of the requests.
                                 LogStartAndFinishCall : Logs the start and
                                 finish call of the requests. NoLogCall :
                                 Disable request logging.
      --query-frontend.compress-responses
                                 Compress HTTP responses.
      --query-frontend.downstream-tripper-config=<content>
                                 Alternative to
                                 'query-frontend.downstream-tripper-config-file'
                                 flag (mutually exclusive). Content of YAML file
                                 that contains downstream tripper configuration.
                                 If your downstream URL is localhost or
                                 127.0.0.1 then it is highly recommended to
                                 increase max_idle_conns_per_host to at least
                                 100.
      --query-frontend.downstream-tripper-config-file=<file-path>
                                 Path to YAML file that contains downstream
                                 tripper configuration. If your downstream URL
                                 is localhost or 127.0.0.1 then it is highly
                                 recommended to increase max_idle_conns_per_host
                                 to at least 100.
      --query-frontend.downstream-url="http://localhost:9090"
                                 URL of downstream Prometheus Query compatible
                                 API.
      --query-frontend.forward-header=<http-header-name> ...
                                 List of headers forwarded by the query-frontend
                                 to downstream queriers, default is empty
      --query-frontend.log-queries-longer-than=0
                                 Log queries that are slower than the specified
                                 duration. Set to 0 to disable. Set to < 0 to
                                 enable on all queries.
      --query-frontend.org-id-header=<http-header-name> ...
                                 Request header names used to identify the
                                 source of slow queries (repeated flag).
                                 The values of the header will be added to
                                 the org id field in the slow query log. If
                                 multiple headers match the request, the first
                                 matching arg specified will take precedence.
                                 If no headers match 'anonymous' will be used.
      --query-frontend.vertical-shards=QUERY-FRONTEND.VERTICAL-SHARDS
                                 Number of shards to use when
                                 distributing shardable PromQL queries.
                                 For more details, you can refer to
                                 the Vertical query sharding proposal:
                                 https://thanos.io/tip/proposals-accepted/202205-vertical-query-sharding.md
      --query-range.align-range-with-step
                                 Mutate incoming queries to align their
                                 start and end with their step for better
                                 cache-ability. Note: Grafana dashboards do that
                                 by default.
      --query-range.horizontal-shards=0
                                 Split queries in this many requests
                                 when query duration is below
                                 query-range.max-split-interval.
      --query-range.max-query-length=0
                                 Limit the query time range (end - start time)
                                 in the query-frontend, 0 disables it.
      --query-range.max-query-parallelism=14
                                 Maximum number of query range requests will be
                                 scheduled in parallel by the Frontend.
      --query-range.max-retries-per-request=5
                                 Maximum number of retries for a single query
                                 range request; beyond this, the downstream
                                 error is returned.
      --query-range.max-split-interval=0
                                 Split query range below this interval in
                                 query-range.horizontal-shards. Queries with a
                                 range longer than this value will be split in
                                 multiple requests of this length.
      --query-range.min-split-interval=0
                                 Split query range requests above this
                                 interval in query-range.horizontal-shards
                                 requests of equal range. Using
                                 this parameter is not allowed with
                                 query-range.split-interval. One should also set
                                 query-range.split-min-horizontal-shards to a
                                 value greater than 1 to enable splitting.
      --query-range.partial-response
                                 Enable partial response for query range
                                 requests if no partial_response param is
                                 specified. --no-query-range.partial-response
                                 for disabling.
      --query-range.request-downsampled
                                 Make additional query for downsampled data in
                                 case of empty or incomplete response to range
                                 request.
      --query-range.response-cache-config=<content>
                                 Alternative to
                                 'query-range.response-cache-config-file' flag
                                 (mutually exclusive). Content of YAML file that
                                 contains response cache configuration.
      --query-range.response-cache-config-file=<file-path>
                                 Path to YAML file that contains response cache
                                 configuration.
      --query-range.response-cache-max-freshness=1m
                                 Most recent allowed cacheable result for query
                                 range requests, to prevent caching very recent
                                 results that might still be in flux.
      --query-range.split-interval=24h
                                 Split query range requests by an interval and
                                 execute in parallel, it should be greater than
                                 0 when query-range.response-cache-config is
                                 configured.
      --request.logging-config=<content>
                                 Alternative to 'request.logging-config-file'
                                 flag (mutually exclusive). Content
                                 of YAML file with request logging
                                 configuration. See format details:
                                 https://thanos.io/tip/thanos/logging.md/#configuration
      --request.logging-config-file=<file-path>
                                 Path to YAML file with request logging
                                 configuration. See format details:
                                 https://thanos.io/tip/thanos/logging.md/#configuration
      --tracing.config=<content>
                                 Alternative to 'tracing.config-file' flag
                                 (mutually exclusive).
                                 Content of YAML file
                                 with tracing configuration. See format details:
                                 https://thanos.io/tip/thanos/tracing.md/#configuration
      --tracing.config-file=<file-path>
                                 Path to YAML file with tracing
                                 configuration. See format details:
                                 https://thanos.io/tip/thanos/tracing.md/#configuration
      --version                  Show application version.
      --web.disable-cors         Whether to disable CORS headers to be set by
                                 Thanos. By default Thanos sets CORS headers to
                                 be allowed by all.

```
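
To make the recommendation from the Recommended Downstream Tripper Configuration section concrete, the following sketch writes a tripper config that raises the idle connection limit per host. The file name `downstream-tripper.yaml` is hypothetical, and the values other than the recommended minimum of 100 for `max_idle_conns_per_host` are illustrative assumptions, not documented defaults.

```bash
# Write a downstream tripper config using keys listed in this document.
# The file name and the two "assumed" values are choices made for this sketch.
cat > downstream-tripper.yaml <<'EOF'
max_idle_conns_per_host: 100  # at least 100, as recommended for a single downstream host
max_idle_conns: 100           # assumed value; idle connections across all hosts
idle_conn_timeout: 90s        # assumed value; duration keys are strings ending in s or m
EOF
```

The file can then be passed to Query Frontend with `--query-frontend.downstream-tripper-config-file=downstream-tripper.yaml`.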