github.com/thanos-io/thanos@v0.32.5/docs/components/query-frontend.md

# Query Frontend

The `thanos query-frontend` command implements a service that can be put in front of Thanos Queriers to improve the read path. It is based on the [Cortex Query Frontend](https://cortexmetrics.io/docs/architecture/#query-frontend) component, so you can find some common features like `Splitting` and `Results Caching`.

Query Frontend is fully stateless and horizontally scalable.

Example command to run Query Frontend:

```bash
thanos query-frontend \
    --http-address     "0.0.0.0:9090" \
    --query-frontend.downstream-url="<thanos-querier>:<querier-http-port>"
```

**NOTE:** Currently only range queries (the `/api/v1/query_range` API call) are processed by Query Frontend. All other API calls go directly to the downstream Querier, which means only range queries are split and cached. Support for instant queries is planned.

For more information, please check out the [initial design proposal](../proposals-done/202004-embedd-cortex-frontend.md).

## Features

### Splitting

Query Frontend splits a long query into multiple short queries based on the configured `--query-range.split-interval` flag. The default value of `--query-range.split-interval` is `24h`. When caching is enabled, it should be greater than `0`.

Query splitting has several benefits:

1. Safeguard. It prevents large queries from causing OOM issues in Queriers.
2. Better parallelization.
3. Better load balancing for Queriers.

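As a rough illustration of interval-based splitting, the sketch below cuts a time range at multiples of the split interval. It is a simplified model, not the Thanos implementation: the name `split_by_interval`, the use of half-open `[start, end)` sub-ranges in seconds, and the alignment of boundaries to multiples of the interval are assumptions made for the example. A real implementation also has to account for the query step so that samples on a boundary are not evaluated twice.

```python
from typing import List, Tuple


def split_by_interval(start: int, end: int, interval: int) -> List[Tuple[int, int]]:
    """Split [start, end) into sub-ranges that never cross an interval boundary."""
    if interval <= 0:
        # Splitting disabled: hand the range through unchanged.
        return [(start, end)]
    parts = []
    cur = start
    while cur < end:
        # Next multiple of `interval` strictly after `cur`.
        next_boundary = (cur // interval + 1) * interval
        parts.append((cur, min(next_boundary, end)))
        cur = next_boundary
    # An empty range (start == end) is returned as a single piece.
    return parts or [(start, end)]


day = 24 * 60 * 60
for sub_start, sub_end in split_by_interval(3600, 2 * day + 60, day):
    print(sub_start, sub_end)
```

With a `24h` interval, a request that starts one hour into a day and ends shortly after the second midnight is cut into three pieces, each of which could be sent to a different Querier.
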
### Retry

Query Frontend can retry a query when the HTTP request to the downstream Querier fails. The `--query-range.max-retries-per-request` flag limits the maximum number of retries.

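The retry behaviour can be pictured as a bounded loop around the downstream call. The sketch below is only a model: `do_with_retries` and the `send` callable are hypothetical names, treating every 5xx status as retryable is an assumption, and so is counting the first attempt separately from the retries. The default of 5 mirrors the documented default of `--query-range.max-retries-per-request`.

```python
from typing import Callable, Tuple

Response = Tuple[int, str]  # (HTTP status code, body)


def do_with_retries(send: Callable[[], Response], max_retries: int = 5) -> Response:
    """Call send() once, then retry while it keeps returning a 5xx status."""
    status, body = send()
    retries = 0
    while status >= 500 and retries < max_retries:
        retries += 1
        status, body = send()
    # Either a non-5xx response or the last downstream error is returned.
    return status, body


# Simulated downstream that fails twice before succeeding.
attempts = []


def flaky_downstream() -> Response:
    attempts.append(1)
    return (503, "unavailable") if len(attempts) < 3 else (200, "ok")


print(do_with_retries(flaky_downstream))
```
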
### Caching

Query Frontend supports caching query results and reuses them on subsequent queries. If the cached results are incomplete, Query Frontend calculates the required subqueries and executes them in parallel on downstream queriers. Query Frontend can optionally align queries with their step parameter to improve the cacheability of the query results. Currently, in-memory cache (fifo cache), memcached, and redis are supported.

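Two ideas from the caching description can be sketched in a few lines: aligning a range to its step, and working out which parts of a request are not covered by cached extents. This is a simplified model under stated assumptions, not the actual cache code: `align_to_step` and `missing_ranges` are hypothetical helpers, timestamps are integer milliseconds, alignment is assumed to round down, and cached extents are assumed to be sorted and non-overlapping.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open [start, end) in milliseconds


def align_to_step(start: int, end: int, step: int) -> Range:
    """Round start and end down to multiples of step (assumed rounding rule)."""
    return (start - start % step, end - end % step)


def missing_ranges(request: Range, cached: List[Range]) -> List[Range]:
    """Return the parts of `request` that no cached extent covers."""
    start, end = request
    gaps = []
    cur = start  # everything before `cur` is already accounted for
    for c_start, c_end in cached:
        if c_end <= cur or c_start >= end:
            continue  # extent does not overlap the part still needed
        if c_start > cur:
            gaps.append((cur, c_start))
        cur = max(cur, c_end)
    if cur < end:
        gaps.append((cur, end))
    return gaps


request = align_to_step(1_003, 100_999, 1_000)
print(request)
print(missing_ranges(request, [(20_000, 40_000), (60_000, 80_000)]))
```

Each gap returned by `missing_ranges` corresponds to one subquery that would still have to be executed downstream; aligned requests make it more likely that two clients asking for nearly the same window share the same cached extents.
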
#### Excluded from caching

* Requests that support deduplication but have it disabled with `dedup=false`. Read more about deduplication in the [Dedup documentation](query.md#deduplication-enabled).
* Requests that specify Store Matchers.
* Requests where downstream queriers set the header `Cache-Control: no-store` in the response:
  * Requests with a partial response.
  * Requests with other warnings.

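The exclusion rules above amount to a predicate over the request parameters and the response headers. The following sketch models the rules as listed, not the actual implementation: `is_cacheable` is a hypothetical helper, and the assumption that Store Matchers arrive as a `storeMatch[]` query parameter is made for the example.

```python
from typing import Dict, List, Mapping


def is_cacheable(params: Mapping[str, List[str]], response_headers: Dict[str, str]) -> bool:
    """Apply the documented exclusion rules to one request/response pair."""
    # Rule 1: deduplication explicitly disabled by the client.
    if "false" in params.get("dedup", []):
        return False
    # Rule 2: Store Matchers present (parameter name is an assumption).
    if params.get("storeMatch[]"):
        return False
    # Rule 3: the downstream querier asked not to store the response.
    # HTTP header names are case-insensitive, so normalise them first.
    headers = {name.lower(): value for name, value in response_headers.items()}
    if "no-store" in headers.get("cache-control", "").lower():
        return False
    return True


print(is_cacheable({"query": ["up"]}, {}))
print(is_cacheable({"query": ["up"], "dedup": ["false"]}, {}))
print(is_cacheable({"query": ["up"]}, {"Cache-Control": "no-store"}))
```
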
#### In-memory

```yaml mdox-exec="go run scripts/cfggen/main.go --name=queryfrontend.InMemoryResponseCacheConfig"
type: IN-MEMORY
config:
  max_size: ""
  max_size_items: 0
  validity: 0s
```

`max_size`: Maximum memory size of the cache in bytes. A unit suffix (KB, MB, GB) may be applied.

**NOTE:** If neither `max_size` nor `max_size_items` is set, the cache is not created.

If only one of `max_size` or `max_size_items` is set, there is no limit on the other field. For example, if only `max_size_items` is set to 1000, then `max_size` is unlimited. Similarly, if only `max_size` is set, then `max_size_items` is unlimited.

Example configuration: [kube-thanos](https://github.com/thanos-io/kube-thanos/blob/master/examples/all/manifests/thanos-query-frontend-deployment.yaml#L50-L54)

#### Memcached

```yaml mdox-exec="go run scripts/cfggen/main.go --name=queryfrontend.MemcachedResponseCacheConfig"
type: MEMCACHED
config:
  addresses: []
  timeout: 0s
  max_idle_connections: 0
  max_async_concurrency: 0
  max_async_buffer_size: 0
  max_get_multi_concurrency: 0
  max_item_size: 0
  max_get_multi_batch_size: 0
  dns_provider_update_interval: 0s
  auto_discovery: false
  expiration: 0s
```

`expiration` specifies how long cached items remain valid in memcached. If set to `0s`, the default expiration time of 24 hours is used.

If a `set` operation is skipped because the item size is larger than `max_item_size`, the event is tracked by the counter metric `cortex_memcache_client_set_skip_total`.

For the other cache configuration parameters, refer to [memcached-index-cache](store.md#memcached-index-cache).

The default memcached config is:

```yaml
type: MEMCACHED
config:
  addresses: [your-memcached-addresses]
  timeout: 500ms
  max_idle_connections: 100
  max_item_size: 1MiB
  max_async_concurrency: 10
  max_async_buffer_size: 10000
  max_get_multi_concurrency: 100
  max_get_multi_batch_size: 0
  dns_provider_update_interval: 10s
  expiration: 24h
```

#### Redis

The default redis config is:

```yaml mdox-exec="go run scripts/cfggen/main.go --name=queryfrontend.RedisResponseCacheConfig"
type: REDIS
config:
  addr: ""
  username: ""
  password: ""
  db: 0
  dial_timeout: 5s
  read_timeout: 3s
  write_timeout: 3s
  max_get_multi_concurrency: 100
  get_multi_batch_size: 100
  max_set_multi_concurrency: 100
  set_multi_batch_size: 100
  tls_enabled: false
  tls_config:
    ca_file: ""
    cert_file: ""
    key_file: ""
    server_name: ""
    insecure_skip_verify: false
  cache_size: 0
  master_name: ""
  max_async_buffer_size: 10000
  max_async_concurrency: 20
  expiration: 24h0m0s
```

`expiration` specifies how long cached items remain valid in redis. If set to `0s`, the default expiration time of 24 hours is used.

For the other cache configuration parameters, refer to [redis-index-cache](store.md#redis-index-cache).

### Slow Query Log

Query Frontend supports the `--query-frontend.log-queries-longer-than` flag to log queries that run longer than the specified duration.

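The help text for this flag (see the Flags section) distinguishes three cases: `0` disables the log, a negative value logs every query, and a positive value acts as a threshold. A minimal sketch of that decision, using a hypothetical `should_log` helper with durations expressed in seconds:

```python
def should_log(query_duration: float, log_queries_longer_than: float) -> bool:
    """Decide whether a finished query belongs in the slow query log."""
    if log_queries_longer_than == 0:
        return False  # slow query log disabled
    if log_queries_longer_than < 0:
        return True  # log every query
    return query_duration > log_queries_longer_than


print(should_log(12.5, 10))
print(should_log(12.5, 0))
print(should_log(0.1, -1))
```
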
## Naming

Naming is hard :) Please check [here](https://github.com/thanos-io/thanos/pull/2434#discussion_r408300683) to see why we chose `query-frontend` as the name.

## Recommended Downstream Tripper Configuration

You can configure the parameters of the HTTP client that `query-frontend` uses for the downstream URL with the `--query-frontend.downstream-tripper-config` and `--query-frontend.downstream-tripper-config-file` flags. If the downstream URL points to a single host, most likely a load balancer, it is highly recommended to increase `max_idle_conns_per_host` via these flags to at least 100. Otherwise `query-frontend` will not be able to leverage HTTP keep-alive connections, and the latency will be 10 - 20% higher. By default, the Go HTTP client keeps only two idle connections per host.

Keys that denote a duration are strings that can end with `s` or `m` to indicate seconds or minutes respectively. All other keys are integers. Supported keys are:

* `idle_conn_timeout` - timeout of idle connections (string);
* `response_header_timeout` - maximum duration to wait for a response header (string);
* `tls_handshake_timeout` - maximum duration of a TLS handshake (string);
* `expect_continue_timeout` - maximum duration to wait for the server's first response headers after sending a request with `Expect: 100-continue`, see the [Go source code](https://github.com/golang/go/blob/912f0750472dd4f674b69ca1616bfaf377af1805/src/net/http/transport.go#L220-L226) (string);
* `max_idle_conns` - maximum number of idle connections to all hosts (integer);
* `max_idle_conns_per_host` - maximum number of idle connections to each host (integer);
* `max_conns_per_host` - maximum number of connections to each host (integer).

You can find the default values [here](https://github.com/thanos-io/thanos/blob/55cb8ca38b3539381dc6a781e637df15c694e50a/pkg/exthttp/transport.go#L12-L27).

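As an illustration of the key types listed above, the sketch below checks a tripper configuration and prints it as JSON, which YAML parsers generally accept as flow-style YAML. It is not part of Thanos: `validate_tripper_config` is a hypothetical helper, the `90s` timeout is an arbitrary example value rather than a recommended default, and the duration check is deliberately stricter than what the flag may accept, since it only covers the simple `s` and `m` forms described here.

```python
import json
import re

DURATION_KEYS = {
    "idle_conn_timeout",
    "response_header_timeout",
    "tls_handshake_timeout",
    "expect_continue_timeout",
}
INTEGER_KEYS = {"max_idle_conns", "max_idle_conns_per_host", "max_conns_per_host"}


def validate_tripper_config(cfg: dict) -> None:
    """Raise ValueError if a key is unsupported or its value has the wrong shape."""
    for key, value in cfg.items():
        if key in DURATION_KEYS:
            # Only the simple "<number>s" / "<number>m" forms are checked.
            if not (isinstance(value, str) and re.fullmatch(r"\d+[sm]", value)):
                raise ValueError(f"{key}: expected a duration string such as '90s' or '2m'")
        elif key in INTEGER_KEYS:
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                raise ValueError(f"{key}: expected an integer")
        else:
            raise ValueError(f"{key}: unsupported key")


cfg = {"max_idle_conns_per_host": 100, "idle_conn_timeout": "90s"}
validate_tripper_config(cfg)
rendered = json.dumps(cfg)
print(rendered)
```

The printed string is the kind of content that could be passed to `--query-frontend.downstream-tripper-config`, or written to a file for `--query-frontend.downstream-tripper-config-file`.
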
## Forward Headers to Downstream Queriers

The `--query-frontend.forward-header` flag provides a list of request headers that Query Frontend forwards to downstream queriers.

If downstream queriers require basic authentication, you can run query-frontend as follows:

```bash
thanos query-frontend \
    --http-address     "0.0.0.0:9090" \
    --query-frontend.forward-header "Authorization" \
    --query-frontend.downstream-url="<thanos-querier>:<querier-http-port>"
```

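Conceptually, header forwarding is an allow-list: only the headers named by the flag are copied onto the downstream request. A minimal sketch of that idea, assuming a hypothetical `headers_to_forward` helper and case-insensitive matching (HTTP header names are case-insensitive); the header values are made up for the example:

```python
from typing import Dict, Iterable


def headers_to_forward(incoming: Dict[str, str], allowed: Iterable[str]) -> Dict[str, str]:
    """Keep only the incoming headers whose names are on the allow-list."""
    allowed_lower = {name.lower() for name in allowed}
    return {name: value for name, value in incoming.items() if name.lower() in allowed_lower}


incoming = {
    "authorization": "Basic dXNlcjpwYXNz",
    "User-Agent": "Grafana",
    "X-Request-Id": "abc123",
}
print(headers_to_forward(incoming, ["Authorization"]))
```
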
## Flags

```$ mdox-exec="thanos query-frontend --help"
usage: thanos query-frontend [<flags>]

Query frontend command implements a service deployed in front of queriers to
improve query parallelization and caching.

Flags:
      --cache-compression-type=""
                                 Use compression in results cache.
                                 Supported values are: 'snappy' and '' (disable
                                 compression).
  -h, --help                     Show context-sensitive help (also try
                                 --help-long and --help-man).
      --http-address="0.0.0.0:10902"
                                 Listen host:port for HTTP endpoints.
      --http-grace-period=2m     Time to wait after an interrupt received for
                                 HTTP Server.
      --http.config=""           [EXPERIMENTAL] Path to the configuration file
                                 that can enable TLS or authentication for all
                                 HTTP endpoints.
      --labels.default-time-range=24h
                                 The default metadata time range duration for
                                 retrieving labels through Labels and Series API
                                 when the range parameters are not specified.
      --labels.max-query-parallelism=14
                                 Maximum number of labels requests will be
                                 scheduled in parallel by the Frontend.
      --labels.max-retries-per-request=5
                                 Maximum number of retries for a single
                                 label/series API request; beyond this,
                                 the downstream error is returned.
      --labels.partial-response  Enable partial response for labels requests
                                 if no partial_response param is specified.
                                 --no-labels.partial-response for disabling.
      --labels.response-cache-config=<content>
                                 Alternative to
                                 'labels.response-cache-config-file' flag
                                 (mutually exclusive). Content of YAML file that
                                 contains response cache configuration.
      --labels.response-cache-config-file=<file-path>
                                 Path to YAML file that contains response cache
                                 configuration.
      --labels.response-cache-max-freshness=1m
                                 Most recent allowed cacheable result for
                                 labels requests, to prevent caching very recent
                                 results that might still be in flux.
      --labels.split-interval=24h
                                 Split labels requests by an interval and
                                 execute in parallel, it should be greater
                                 than 0 when labels.response-cache-config is
                                 configured.
      --log.format=logfmt        Log format to use. Possible options: logfmt or
                                 json.
      --log.level=info           Log filtering level.
      --log.request.decision=    Deprecation Warning - This flag would
                                 be soon deprecated, and replaced with
                                 `request.logging-config`. Request Logging
                                 for logging the start and end of requests.
                                 By default this flag is disabled. LogFinishCall
                                 : Logs the finish call of the requests.
                                 LogStartAndFinishCall : Logs the start and
                                 finish call of the requests. NoLogCall :
                                 Disable request logging.
      --query-frontend.compress-responses
                                 Compress HTTP responses.
      --query-frontend.downstream-tripper-config=<content>
                                 Alternative to
                                 'query-frontend.downstream-tripper-config-file'
                                 flag (mutually exclusive). Content of YAML file
                                 that contains downstream tripper configuration.
                                 If your downstream URL is localhost or
                                 127.0.0.1 then it is highly recommended to
                                 increase max_idle_conns_per_host to at least
                                 100.
      --query-frontend.downstream-tripper-config-file=<file-path>
                                 Path to YAML file that contains downstream
                                 tripper configuration. If your downstream URL
                                 is localhost or 127.0.0.1 then it is highly
                                 recommended to increase max_idle_conns_per_host
                                 to at least 100.
      --query-frontend.downstream-url="http://localhost:9090"
                                 URL of downstream Prometheus Query compatible
                                 API.
      --query-frontend.forward-header=<http-header-name> ...
                                 List of headers forwarded by the query-frontend
                                 to downstream queriers, default is empty
      --query-frontend.log-queries-longer-than=0
                                 Log queries that are slower than the specified
                                 duration. Set to 0 to disable. Set to < 0 to
                                 enable on all queries.
      --query-frontend.org-id-header=<http-header-name> ...
                                 Request header names used to identify the
                                 source of slow queries (repeated flag).
                                 The values of the header will be added to
                                 the org id field in the slow query log. If
                                 multiple headers match the request, the first
                                 matching arg specified will take precedence.
                                 If no headers match 'anonymous' will be used.
      --query-frontend.vertical-shards=QUERY-FRONTEND.VERTICAL-SHARDS
                                 Number of shards to use when
                                 distributing shardable PromQL queries.
                                 For more details, you can refer to
                                 the Vertical query sharding proposal:
                                 https://thanos.io/tip/proposals-accepted/202205-vertical-query-sharding.md
      --query-range.align-range-with-step
                                 Mutate incoming queries to align their
                                 start and end with their step for better
                                 cache-ability. Note: Grafana dashboards do that
                                 by default.
      --query-range.horizontal-shards=0
                                 Split queries in this many requests
                                 when query duration is below
                                 query-range.max-split-interval.
      --query-range.max-query-length=0
                                 Limit the query time range (end - start time)
                                 in the query-frontend, 0 disables it.
      --query-range.max-query-parallelism=14
                                 Maximum number of query range requests will be
                                 scheduled in parallel by the Frontend.
      --query-range.max-retries-per-request=5
                                 Maximum number of retries for a single query
                                 range request; beyond this, the downstream
                                 error is returned.
      --query-range.max-split-interval=0
                                 Split query range below this interval in
                                 query-range.horizontal-shards. Queries with a
                                 range longer than this value will be split in
                                 multiple requests of this length.
      --query-range.min-split-interval=0
                                 Split query range requests above this
                                 interval in query-range.horizontal-shards
                                 requests of equal range. Using
                                 this parameter is not allowed with
                                 query-range.split-interval. One should also set
                                 query-range.split-min-horizontal-shards to a
                                 value greater than 1 to enable splitting.
      --query-range.partial-response
                                 Enable partial response for query range
                                 requests if no partial_response param is
                                 specified. --no-query-range.partial-response
                                 for disabling.
      --query-range.request-downsampled
                                 Make additional query for downsampled data in
                                 case of empty or incomplete response to range
                                 request.
      --query-range.response-cache-config=<content>
                                 Alternative to
                                 'query-range.response-cache-config-file' flag
                                 (mutually exclusive). Content of YAML file that
                                 contains response cache configuration.
      --query-range.response-cache-config-file=<file-path>
                                 Path to YAML file that contains response cache
                                 configuration.
      --query-range.response-cache-max-freshness=1m
                                 Most recent allowed cacheable result for query
                                 range requests, to prevent caching very recent
                                 results that might still be in flux.
      --query-range.split-interval=24h
                                 Split query range requests by an interval and
                                 execute in parallel, it should be greater than
                                 0 when query-range.response-cache-config is
                                 configured.
      --request.logging-config=<content>
                                 Alternative to 'request.logging-config-file'
                                 flag (mutually exclusive). Content
                                 of YAML file with request logging
                                 configuration. See format details:
                                 https://thanos.io/tip/thanos/logging.md/#configuration
      --request.logging-config-file=<file-path>
                                 Path to YAML file with request logging
                                 configuration. See format details:
                                 https://thanos.io/tip/thanos/logging.md/#configuration
      --tracing.config=<content>
                                 Alternative to 'tracing.config-file' flag
                                 (mutually exclusive). Content of YAML file
                                 with tracing configuration. See format details:
                                 https://thanos.io/tip/thanos/tracing.md/#configuration
      --tracing.config-file=<file-path>
                                 Path to YAML file with tracing
                                 configuration. See format details:
                                 https://thanos.io/tip/thanos/tracing.md/#configuration
      --version                  Show application version.
      --web.disable-cors         Whether to disable CORS headers to be set by
                                 Thanos. By default Thanos sets CORS headers to
                                 be allowed by all.

```