
---
title: "Cortex Arguments"
linkTitle: "Cortex Arguments Explained"
weight: 2
slug: arguments
---

## General Notes

Cortex has evolved over several years, and the command-line options sometimes reflect this heritage. In some cases the default value for options is not the recommended value, and in some cases names do not reflect the true meaning. We do intend to clean this up, but it requires a lot of care to avoid breaking existing installations. In the meantime we regret the inconvenience.

Duration arguments should be specified with a unit like `5s` or `3h`. Valid time units are "ms", "s", "m", "h".

**Warning: some of the following config options apply only to chunks storage, which has been deprecated. You're encouraged to use the [blocks storage](../blocks-storage/_index.md).**

## Querier

- `-querier.max-concurrent`

   The maximum number of top-level PromQL queries that will execute at the same time, per querier process.
   If using the query frontend, this should be set to at least (`-querier.worker-parallelism` * number of query frontend replicas). Otherwise queries may queue in the queriers and not the frontend, which will affect QoS. Alternatively, consider using `-querier.worker-match-max-concurrent` to force worker parallelism to match `-querier.max-concurrent`. See the worked example at the end of this section.

- `-querier.query-parallelism`

   This refers to database queries against the store when running the deprecated Cortex chunks storage (e.g. Bigtable or DynamoDB). This is the maximum number of subqueries run in parallel per higher-level query.

- `-querier.timeout`

   The timeout for a top-level PromQL query.

- `-querier.max-samples`

   Maximum number of samples a single query can load into memory, to avoid blowing up on enormous queries.

The following options only apply when the querier is used together with the Query Frontend or Query Scheduler:

- `-querier.frontend-address`

   Address of the query frontend service, used by workers to find the frontend which will give them queries to execute.

- `-querier.scheduler-address`

   Address of the query scheduler service, used by workers to find the scheduler which will give them queries to execute. If set, `-querier.frontend-address` is ignored and the querier will use the query scheduler.

- `-querier.dns-lookup-period`

   How often the workers will query DNS to re-check where the query frontend or query scheduler is.

- `-querier.worker-parallelism`

   Number of simultaneous queries to process, per query frontend or scheduler.
   See the note on `-querier.max-concurrent`.

- `-querier.worker-match-max-concurrent`

   Force worker concurrency to match the `-querier.max-concurrent` option. Overrides `-querier.worker-parallelism`.
   See the note on `-querier.max-concurrent`.

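As a worked example of the sizing note on `-querier.max-concurrent` (the replica count and parallelism values below are illustrative, not recommendations): with 2 query-frontend replicas and `-querier.worker-parallelism=8`, each querier should allow at least 8 * 2 = 16 concurrent queries.

```
# illustrative values only
-querier.worker-parallelism=8
-querier.max-concurrent=16            # >= worker-parallelism (8) * query-frontend replicas (2)
# or, instead of doing the arithmetic by hand:
-querier.worker-match-max-concurrent=true
```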

## Querier and Ruler

The ingester query API was improved over time, but defaults to the old behaviour for backwards-compatibility. For best results both of the next two flags should be set to `true` (see the example at the end of this section):

- `-querier.batch-iterators`

   This uses iterators to execute the query, as opposed to fully materialising the series in memory, and fetches multiple results per loop.

- `-querier.ingester-streaming`

   Use streaming RPCs to query the ingester, to reduce memory pressure in the ingester.

- `-querier.iterators`

   This is similar to `-querier.batch-iterators` but less efficient.
   If both `iterators` and `batch-iterators` are `true`, `batch-iterators` will take precedence.

- `-promql.lookback-delta`

   Time since the last sample after which a time series is considered stale and ignored by expression evaluations.
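
A minimal sketch of the recommended settings from this section (they apply to both queriers and rulers):

```
-querier.batch-iterators=true
-querier.ingester-streaming=true
```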

## Query Frontend

- `-querier.parallelise-shardable-queries`

   If set to true, will cause the query frontend to mutate incoming queries when possible by turning `sum` operations into sharded `sum` operations. This requires a shard-compatible schema (v10+). An abridged example:
   `sum by (foo) (rate(bar{baz="blip"}[1m]))` ->
   ```
   sum by (foo) (
    sum by (foo) (rate(bar{baz="blip",__cortex_shard__="0of16"}[1m])) or
    sum by (foo) (rate(bar{baz="blip",__cortex_shard__="1of16"}[1m])) or
    ...
    sum by (foo) (rate(bar{baz="blip",__cortex_shard__="15of16"}[1m]))
   )
   ```
   When enabled, the query frontend requires a schema config to determine how/when to shard queries, either from a file or from flags (i.e. via the `-schema-config-file` CLI flag). This is the same schema config the queriers consume.
   It's also advised to increase downstream concurrency controls to account for more queries of smaller sizes (see the example at the end of this section):

   - `querier.max-outstanding-requests-per-tenant`
   - `querier.max-query-parallelism`
   - `querier.max-concurrent`
   - `server.grpc-max-concurrent-streams` (for both query-frontends and queriers)

   Furthermore, both the querier and query-frontend components require the `querier.query-ingesters-within` parameter to know when to start sharding requests (ingester queries are not sharded). It's recommended to align this with `ingester.max-chunk-age`.

   Instrumentation (traces) also scales with the number of sharded queries and it's suggested to account for increased throughput there as well (for instance via `JAEGER_REPORTER_MAX_QUEUE_SIZE`).

- `-querier.align-querier-with-step`

   If set to true, will cause the query frontend to mutate incoming queries and align their start and end parameters to the step parameter of the query. This improves the cacheability of the query results.

- `-querier.split-queries-by-day`

   If set to true, will cause the query frontend to split multi-day queries into multiple single-day queries and execute them in parallel.

- `-querier.cache-results`

   If set to true, will cause the querier to cache query results. The cache will be used to answer future, overlapping queries. The query frontend calculates extra queries required to fill gaps in the cache.

- `-frontend.max-cache-freshness`

   When caching query results, it is desirable to prevent the caching of very recent results that might still be in flux. Use this parameter to configure the age of results that should be excluded.

- `-frontend.memcached.{hostname, service, timeout}`

   Use these flags to specify the location and timeout of the memcached cluster used to cache query results.

- `-frontend.redis.{endpoint, timeout}`

   Use these flags to specify the location and timeout of the Redis service used to cache query results.
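
As referenced above, a hedged sketch of enabling query sharding together with increased downstream concurrency (all numeric values are purely illustrative and the schema file path is a placeholder; tune them for your deployment):

```
-querier.parallelise-shardable-queries=true
-schema-config-file=/etc/cortex/schema.yaml
-querier.max-query-parallelism=224
-querier.max-outstanding-requests-per-tenant=200
-server.grpc-max-concurrent-streams=1000      # on both query-frontends and queriers
-querier.query-ingesters-within=13h           # aligned with -ingester.max-chunk-age (12h by default)
```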

## Distributor

- `-distributor.shard-by-all-labels`

   In the original Cortex design, samples were sharded amongst ingesters by the combination of (userid, metric name). Sharding by metric name was designed to reduce the number of ingesters you need to hit on the read path; the downside was that you could hotspot the write path.

   In hindsight, this seems like the wrong choice: we do many orders of magnitude more writes than reads, and ingester reads are in-memory and cheap. It seems the right thing to do is to use all the labels to shard, improving load balancing and support for very high cardinality metrics.

   Set this flag to `true` for the new behaviour.

   Note that when setting this flag to `true`, it has to be set on both the distributor and the querier (it is also called `-distributor.shard-by-all-labels` on the querier); see the example at the end of this list. If the flag is only set on the distributor and not on the querier, you will get incomplete query results because not all ingesters are queried.

   **Upgrade notes**: As this flag also makes all queries always read from all ingesters, the upgrade path is pretty trivial; just enable the flag. When you do enable it, you'll see a spike in the number of active series as the writes are "reshuffled" amongst the ingesters, but over the next stale period all the old series will be flushed, and you should end up with much better load balancing. With this flag enabled in the queriers, reads will always catch all the data from all ingesters.

   **Warning**: disabling this flag can lead to a much less balanced distribution of load among the ingesters.

- `-distributor.extra-query-delay`
   This is used by components with an embedded distributor (the querier and ruler) to control how long to wait before sending more than the minimum number of queries needed for a successful response.

- `distributor.ha-tracker.enable-for-all-users`
   Flag to enable, for all users, handling of samples with external labels identifying replicas in an HA Prometheus setup. This defaults to false, and is technically defined in the Distributor limits.

- `distributor.ha-tracker.enable`
   Enable the distributors' HA tracker so that they can accept samples from Prometheus HA replicas gracefully (requires labels). This is a global setting for the distributors: it ensures that the necessary internal data structures for HA handling are created. The option `enable-for-all-users` is still needed to enable ingestion of HA samples for all users.

- `distributor.drop-label`
   This flag can be used to specify label names to drop during sample ingestion within the distributor; it can be repeated in order to drop multiple labels.
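
As referenced above, a minimal sketch showing the flag set consistently on both components (`-target` selects which Cortex module to run in a hypothetical microservices deployment; all other flags are omitted for brevity):

```
cortex -target=distributor -distributor.shard-by-all-labels=true
cortex -target=querier -distributor.shard-by-all-labels=true
```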

### Ring/HA Tracker Store

The KVStore client is used by both the Ring and the HA Tracker (the HA Tracker doesn't support memberlist as a KV store).
- `{ring,distributor.ha-tracker}.prefix`
   The prefix for the keys in the store. Should end with a `/`. For example, with a prefix of `foo/`, the key `bar` would be stored under `foo/bar`.
- `{ring,distributor.ha-tracker}.store`
   Backend storage to use for the HA Tracker (consul, etcd, inmemory, multi).
- `{ring,distributor.ring}.store`
   Backend storage to use for the Ring (consul, etcd, inmemory, memberlist, multi).

#### Consul

By default these flags configure the Consul instance used for the ring. To configure Consul for the HA tracker,
prefix these flags with `distributor.ha-tracker.` (see the example below).

- `consul.hostname`
   Hostname and port of Consul.
- `consul.acl-token`
   ACL token used to interact with Consul.
- `consul.client-timeout`
   HTTP timeout when talking to Consul.
- `consul.consistent-reads`
   Enable consistent reads to Consul.
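
For example, to use one Consul for the ring and a different one for the HA tracker, set the plain and the prefixed variants of the same flag (both hostnames are placeholders):

```
-consul.hostname=consul.ring.example:8500
-distributor.ha-tracker.consul.hostname=consul.ha.example:8500
```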

#### etcd

By default these flags configure the etcd cluster used for the ring. To configure etcd for the HA tracker,
prefix these flags with `distributor.ha-tracker.`

- `etcd.endpoints`
   The etcd endpoints to connect to.
- `etcd.dial-timeout`
   The timeout for the etcd connection.
- `etcd.max-retries`
   The maximum number of retries to do for failed ops.
- `etcd.tls-enabled`
   Enable TLS.
- `etcd.tls-cert-path`
   The TLS certificate file path.
- `etcd.tls-key-path`
   The TLS private key file path.
- `etcd.tls-ca-path`
   The trusted CA file path.
- `etcd.tls-insecure-skip-verify`
   Skip validating the server certificate.

#### memberlist

Warning: memberlist KV works only for the [hash ring](../architecture.md#the-hash-ring), not for the HA Tracker, because propagation of changes is too slow for HA Tracker purposes.

When using the memberlist-based KV store, each node maintains its own copy of the hash ring.
Updates generated locally and updates received from other nodes are merged together to form the current state of the ring on the node.
Updates are also propagated to other nodes.
All nodes run the following two loops:

1. Every "gossip interval", pick a random "gossip nodes" number of nodes, and send recent ring updates to them.
2. Every "push/pull sync interval", choose a single random node and exchange full ring information with it (push/pull sync). After this operation, the rings on both nodes are the same.

When a node receives a ring update, it merges the update into its own ring state; if that results in a change, the node adds the update to the list of gossiped updates.
Such an update will be gossiped `R * log(N+1)` times by this node (R = retransmit multiplication factor, N = number of gossiping nodes in the cluster).

If you find the propagation to be too slow, there are some tuning possibilities (default values are the memberlist settings for LAN networks):
- Decrease the gossip interval (default: 200ms)
- Increase the number of gossip nodes (default 3)
- Decrease the push/pull sync interval (default 30s)
- Increase the retransmit multiplication factor (default 4)
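
For example, a hedged sketch of more aggressive gossip settings (values are purely illustrative; faster propagation costs more network traffic and CPU):

```
-memberlist.gossip-interval=100ms
-memberlist.gossip-nodes=6
-memberlist.pullpush-interval=15s
-memberlist.retransmit-factor=8
```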

To measure the propagation delay, you can use the `cortex_ring_oldest_member_timestamp{state="ACTIVE"}` metric.

Flags for configuring the memberlist-based KV store:

- `memberlist.nodename`
   Name of the node in the memberlist cluster. Defaults to the hostname.
- `memberlist.randomize-node-name`
   This flag adds an extra random suffix to the node name used by memberlist. Defaults to true. Using a random suffix helps to prevent issues when running multiple memberlist nodes on the same machine, or when node names are reused (eg. in stateful sets).
- `memberlist.retransmit-factor`
   Multiplication factor used when sending out messages (factor * log(N+1)). If not set, the default value is used.
- `memberlist.join`
   Other cluster members to join. Can be specified multiple times.
- `memberlist.min-join-backoff`, `memberlist.max-join-backoff`, `memberlist.max-join-retries`
   These flags control backoff settings when joining the cluster.
- `memberlist.abort-if-join-fails`
   If this node fails to join the memberlist cluster, abort.
- `memberlist.rejoin-interval`
   How often to try to rejoin the memberlist cluster. Defaults to 0, meaning no rejoining. An occasional rejoin may be useful in some configurations, and is otherwise harmless.
- `memberlist.left-ingesters-timeout`
   How long to keep LEFT ingesters in the ring. Note: this is only used for gossiping; LEFT ingesters are otherwise invisible.
- `memberlist.leave-timeout`
   Timeout for leaving the memberlist cluster.
- `memberlist.gossip-interval`
   How often to gossip with other cluster members. Uses the memberlist LAN defaults if 0.
- `memberlist.gossip-nodes`
   How many nodes to gossip with in each gossip interval. Uses the memberlist LAN defaults if 0.
- `memberlist.pullpush-interval`
   How often to use pull/push sync. Uses the memberlist LAN defaults if 0.
- `memberlist.bind-addr`
   IP address to listen on for gossip messages. Multiple addresses may be specified. Defaults to 0.0.0.0.
- `memberlist.bind-port`
   Port to listen on for gossip messages. Defaults to 7946.
- `memberlist.packet-dial-timeout`
   Timeout used when connecting to other nodes to send a packet.
- `memberlist.packet-write-timeout`
   Timeout for writing 'packet' data.
- `memberlist.transport-debug`
   Log debug transport messages. Note: the global log.level must be at debug level as well.
- `memberlist.gossip-to-dead-nodes-time`
   How long to keep gossiping to nodes that seem to be dead. After this time, a dead node is removed from the list of nodes. If a "dead" node appears again, it will simply join the cluster again, provided its name has not been reused by another node in the meantime. If the name has been reused, such a reanimated node will be ignored by other members.
- `memberlist.dead-node-reclaim-time`
   How soon a dead node's name can be reused by a new node (using a different IP). Disabled by default; in that case the name cannot be reclaimed until `gossip-to-dead-nodes-time` expires. Setting this to a low value can be useful when node names are reused, eg. in stateful sets.
   If the memberlist library detects that a new node is trying to reuse the name of a previous node, it will log a message like this: `Conflicting address for ingester-6. Mine: 10.44.12.251:7946 Theirs: 10.44.12.54:7946 Old state: 2`. Node states are: "alive" = 0, "suspect" = 1 (not responding; will be marked as dead if it keeps failing to respond), "dead" = 2.
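
A minimal sketch of pointing the ring at a memberlist cluster (the DNS name is a hypothetical headless service; the DNS service discovery prefixes described later in this document can be used with `-memberlist.join`):

```
-ring.store=memberlist
-memberlist.join=dnssrv+cortex-gossip-ring.cortex.svc.cluster.local
-memberlist.bind-port=7946
-memberlist.abort-if-join-fails=false
```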

#### Multi KV

This is a special key-value implementation that uses two different KV stores (eg. consul, etcd or memberlist). One of them is always marked as primary, and all reads and writes go to the primary store. The other one, the secondary store, is only used for writes. The idea is that an operator can use the multi KV store to migrate from the primary to the secondary store at runtime.

For example, a migration from Consul to etcd would look like this:

- Set `ring.store` to use the `multi` store. Set `-multi.primary=consul` and `-multi.secondary=etcd`. All Consul and etcd settings must still be specified.
- Start all Cortex microservices. They will still use Consul as the primary KV, but they will also write the ring to etcd.
- The operator can now use the "runtime config" mechanism to switch the primary store to etcd.
- After all Cortex microservices have picked up the new primary store, and everything looks correct, the operator can modify the Cortex configuration to use `-ring.store=etcd` only.
- At this point, Consul can be shut down.
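
A hedged sketch of the flags for the first step of the migration (the Consul and etcd addresses are placeholders):

```
-ring.store=multi
-multi.primary=consul
-multi.secondary=etcd
-multi.mirror-enabled=true
-consul.hostname=consul.example:8500
-etcd.endpoints=etcd.example:2379
```

The primary switch in the third step is then performed via the `multi_kv_config` runtime configuration block shown below.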

Multi KV has the following parameters:

- `multi.primary` - name of the primary KV store. The same values as in `ring.store` are supported, except `multi`.
- `multi.secondary` - name of the secondary KV store.
- `multi.mirror-enabled` - enable mirroring of values to the secondary store. Defaults to true.
- `multi.mirror-timeout` - wait at most this long for a write to the secondary store to finish. Defaults to 2 seconds. Errors writing to the secondary store are not reported to the caller, but are logged and also reported via the `cortex_multikv_mirror_write_errors_total` metric.

Multi KV also reacts to changes made via the runtime configuration. It uses this section:

```yaml
multi_kv_config:
    mirror_enabled: false
    primary: memberlist
```

Note that runtime configuration values take precedence over command line options.

### HA Tracker

HA tracking has two of its own flags:
- `distributor.ha-tracker.cluster`
   Prometheus label to look for in samples to identify a Prometheus HA cluster. (default "cluster")
- `distributor.ha-tracker.replica`
   Prometheus label to look for in samples to identify a Prometheus HA replica. (default "`__replica__`")

It's reasonable to assume people probably already have a `cluster` label, or something similar. If not, they should add one along with `__replica__` via external labels in their Prometheus config. If you stick to these default values, your Prometheus config could look like this (`POD_NAME` is an environment variable which must be set by you):

```yaml
global:
  external_labels:
    cluster: clustername
    __replica__: $POD_NAME
```

HA tracking looks for these two labels (which can be overridden per user).

It also talks to a KVStore and has its own copies of the same flags the distributor uses to connect to the ring KV store; see the example at the end of this section.
- `distributor.ha-tracker.failover-timeout`
   If we don't receive any samples from the accepted replica for a cluster in this amount of time, we will failover to the next replica we receive a sample from. This value must be greater than the update timeout. (default 30s)
- `distributor.ha-tracker.store`
   Backend storage to use for the HA tracker (consul, etcd, inmemory, multi). Inmemory only works if there is a single distributor and ingester running in the same process (for testing purposes). (default "consul")
- `distributor.ha-tracker.update-timeout`
   Update the timestamp in the KV store for a given cluster/replica only after this amount of time has passed since the currently stored timestamp. (default 15s)
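
As referenced above, a hedged example of enabling the HA tracker backed by etcd (the etcd endpoint is a placeholder; the timeouts shown are the documented defaults):

```
-distributor.ha-tracker.enable=true
-distributor.ha-tracker.enable-for-all-users=true
-distributor.ha-tracker.store=etcd
-distributor.ha-tracker.etcd.endpoints=etcd.example:2379
-distributor.ha-tracker.failover-timeout=30s
-distributor.ha-tracker.update-timeout=15s
```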

## Ingester

- `-ingester.max-chunk-age`

  The maximum duration of a timeseries chunk in memory. If a timeseries runs for longer than this, the current chunk will be flushed to the store and a new chunk created. (default 12h)

- `-ingester.max-chunk-idle`

  If a series doesn't receive a sample for this duration, it is flushed and removed from memory.

- `-ingester.max-stale-chunk-idle`

  If a series receives a [staleness marker](https://www.robustperception.io/staleness-and-promql), then we wait for this duration to get another sample before we close and flush this series, removing it from memory. You want it to be at least 2x the scrape interval, as you don't want a single failed scrape to cause a chunk flush.

- `-ingester.chunk-age-jitter`

  To reduce load on the database exactly 12 hours (the default `max-chunk-age`) after starting, the age limit is reduced by a varying amount, up to this value. Don't enable this along with `-ingester.spread-flushes`. (default 0m)

- `-ingester.spread-flushes`

  Makes the ingester flush each timeseries at a specific point in the `max-chunk-age` cycle. This means multiple replicas of a chunk are very likely to contain the same contents, which cuts chunk storage space by up to 66%. Set `-ingester.chunk-age-jitter` to `0` when using this option (see the example at the end of this section). If a chunk cache is configured (via `-store.chunks-cache.memcached.hostname`) then duplicate chunk writes are skipped, which cuts write IOPs.

- `-ingester.join-after`

   How long to wait in PENDING state during the [hand-over process](../guides/ingesters-rolling-updates.md#chunks-storage-with-wal-disabled-hand-over) (supported only by the [chunks storage](../chunks-storage/_index.md)). (default 0s)

- `-ingester.max-transfer-retries`

   How many times a LEAVING ingester tries to find a PENDING ingester during the [hand-over process](../guides/ingesters-rolling-updates.md#chunks-storage-with-wal-disabled-hand-over) (supported only by the [chunks storage](../chunks-storage/_index.md)). A negative value or zero disables the hand-over process completely. (default 10)

- `-ingester.normalise-tokens`

   Deprecated. New ingesters always write "normalised" tokens to the ring. Normalised tokens consume less memory to encode and decode; as the ring is unmarshalled regularly, this significantly reduces the memory usage of anything that watches the ring.

   Cortex 0.4.0 is the last version that can *write* denormalised tokens. Cortex 0.5.0 and above always write normalised tokens.

   Cortex 0.6.0 is the last version that can *read* denormalised tokens. Starting with Cortex 0.7.0, only normalised tokens are supported, and ingesters writing denormalised tokens to the ring (running Cortex 0.4.0 or earlier with `-ingester.normalise-tokens=false`) are ignored by distributors. Such ingesters should either switch to using normalised tokens, or be upgraded to Cortex 0.5.0 or later.

- `-ingester.chunk-encoding`

  Pick one of the encoding formats for timeseries data, which have different performance characteristics.
  `Bigchunk` uses the Prometheus V2 code, and expands in memory to arbitrary length.
  `Varbit`, `Delta` and `DoubleDelta` use the Prometheus V1 code, and are fixed at 1KB per chunk.
  Defaults to `Bigchunk` starting with version 0.7.0.

- `-store.bigchunk-size-cap-bytes`

   When using bigchunks, start a new bigchunk and flush the old one if the old one reaches this size. Use this setting to limit the memory growth of ingesters with a lot of timeseries that last for days.

- `-ingester-client.expected-timeseries`

   When `push` requests arrive, pre-allocate this many slots to decode them. Tune this setting to reduce memory allocations and garbage. This should match the `max_samples_per_send` in your Prometheus `queue_config` (see the example at the end of this section).

- `-ingester-client.expected-samples-per-series`

   When `push` requests arrive, pre-allocate this many slots to decode them. Tune this setting to reduce memory allocations and garbage. Under normal conditions, Prometheus scrapes should arrive with one sample per series.

- `-ingester-client.expected-labels`

   When `push` requests arrive, pre-allocate this many slots to decode them. Tune this setting to reduce memory allocations and garbage. The optimum value will depend on how many labels are sent with your timeseries samples.

- `-store.chunk-cache.cache-stubs`

   Where you don't want to cache every chunk written by ingesters, but you do want to take advantage of chunk write deduplication, this option will make ingesters write a placeholder to the cache for each chunk.
   Make sure you configure ingesters with a different cache from the queriers, which need the whole value.
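
As referenced above, a hedged example combining the chunk-flushing and pre-allocation settings (all values are illustrative; `-ingester.max-chunk-idle` is shown at its 5-minute default):

```
-ingester.max-chunk-age=12h
-ingester.max-chunk-idle=5m
-ingester.spread-flushes=true
-ingester.chunk-age-jitter=0
-ingester-client.expected-timeseries=500    # match max_samples_per_send in the Prometheus queue_config
```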

#### Flusher

- `-flusher.wal-dir`
   Directory where the WAL data should be recovered from.

- `-flusher.concurrent-flushes`
   Number of concurrent flushes.

- `-flusher.flush-op-timeout`
   Duration after which a flush should time out.

## Runtime Configuration file

Cortex has a concept of a "runtime config" file, which is simply a file that is reloaded while Cortex is running. It is used by some Cortex components to allow an operator to change some aspects of the Cortex configuration without restarting it. The file is specified using the `-runtime-config.file=<filename>` flag, and the reload period (which defaults to 10 seconds) can be changed with the `-runtime-config.reload-period=<duration>` flag. Previously this mechanism was only used for limits overrides, and the flags were called `-limits.per-user-override-config=<filename>` and `-limits.per-user-override-period=10s` respectively. These are still used if `-runtime-config.file=<filename>` is not specified.

At the moment the runtime configuration may contain per-user limits, multi KV store settings, and ingester instance limits.

Example runtime configuration file:

```yaml
overrides:
  tenant1:
    ingestion_rate: 10000
    max_series_per_metric: 100000
    max_series_per_query: 100000
  tenant2:
    max_samples_per_query: 1000000
    max_series_per_metric: 100000
    max_series_per_query: 100000

multi_kv_config:
    mirror_enabled: false
    primary: memberlist

ingester_limits:
  max_ingestion_rate: 42000
  max_inflight_push_requests: 10000
```

When running Cortex on Kubernetes, store this file in a config map and mount it in each service's containers. When changing the values there is no need to restart the services, unless otherwise specified.

The `/runtime_config` endpoint returns the whole runtime configuration, including the overrides. If you only want the non-default values of the configuration, you can pass the `mode` parameter with the value `diff`.
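
For example (the host and port are deployment-specific placeholders):

```
curl http://cortex.example:8080/runtime_config?mode=diff
```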

## Ingester, Distributor & Querier limits

Cortex implements various limits on the requests it can process, in order to prevent a single tenant from overwhelming the cluster. There are various default global limits which apply to all tenants and which can be set on the command line. These limits can also be overridden on a per-tenant basis by using the `overrides` field of the runtime configuration file.

The `overrides` field is a map of tenant ID (the same values as passed in the `X-Scope-OrgID` header) to the various limits. An example could look like:

```yaml
overrides:
  tenant1:
    ingestion_rate: 10000
    max_series_per_metric: 100000
    max_series_per_query: 100000
  tenant2:
    max_samples_per_query: 1000000
    max_series_per_metric: 100000
    max_series_per_query: 100000
```

Valid per-tenant limits are (with their corresponding flags for default values):

- `ingestion_rate_strategy` / `-distributor.ingestion-rate-limit-strategy`
- `ingestion_rate` / `-distributor.ingestion-rate-limit`
- `ingestion_burst_size` / `-distributor.ingestion-burst-size`

  The per-tenant rate limit (and burst size), in samples per second. It supports two strategies: `local` (default) and `global`.

  The `local` strategy enforces the limit on a per-distributor basis; the actual effective rate limit will be N times higher, where N is the number of distributor replicas.

  The `global` strategy enforces the limit globally, configuring a per-distributor local rate limiter as `ingestion_rate / N`, where N is the number of distributor replicas (it's automatically adjusted if the number of replicas changes). The `ingestion_burst_size` refers to the per-distributor local rate limiter (even in the case of the `global` strategy) and should be set at least to the maximum number of samples expected in a single push request. For this reason, the `global` strategy requires that push requests are evenly distributed across the pool of distributors; if you use a load balancer in front of the distributors you should already be covered, while if you have a custom setup (e.g. an authentication gateway in front) make sure traffic is evenly balanced across distributors. See the worked example at the end of this section.

  The `global` strategy requires the distributors to form their own ring, which is used to keep track of the current number of healthy distributor replicas. The ring is configured by `distributor: { ring: {}}` / `-distributor.ring.*`.

- `max_label_name_length` / `-validation.max-length-label-name`
- `max_label_value_length` / `-validation.max-length-label-value`
- `max_label_names_per_series` / `-validation.max-label-names-per-series`

  Also enforced by the distributor: limits on the length of label names and their values, and on the total number of labels allowed per series.

- `reject_old_samples` / `-validation.reject-old-samples`
- `reject_old_samples_max_age` / `-validation.reject-old-samples.max-age`
- `creation_grace_period` / `-validation.create-grace-period`

  Also enforced by the distributor: limits on how far in the past (and future) accepted timestamps can be.

- `max_series_per_user` / `-ingester.max-series-per-user`
- `max_series_per_metric` / `-ingester.max-series-per-metric`

  Enforced by the ingesters; limits the number of active series a user (or a given metric) can have. When running with `-distributor.shard-by-all-labels=false` (the default), this limit will enforce the maximum number of series a metric can have 'globally', as all series for a single metric will be sent to the same replication set of ingesters. This is not the case when running with `-distributor.shard-by-all-labels=true`, so the actual limit will be N/RF times higher, where N is the number of ingester replicas and RF is the configured replication factor.

  An active series is a series to which a sample has been written in the last `-ingester.max-chunk-idle` duration, which defaults to 5 minutes.

- `max_global_series_per_user` / `-ingester.max-global-series-per-user`
- `max_global_series_per_metric` / `-ingester.max-global-series-per-metric`

   Like `max_series_per_user` and `max_series_per_metric`, but the limit is enforced across the cluster. Each ingester is configured with a local limit based on the replication factor, the `-distributor.shard-by-all-labels` setting and the current number of healthy ingesters, and is kept updated whenever the number of ingesters changes.

   Requires `-distributor.replication-factor`, `-distributor.shard-by-all-labels`, `-distributor.sharding-strategy` and `-distributor.zone-awareness-enabled` to be set for the ingesters too.

- `max_series_per_query` / `-ingester.max-series-per-query`

- `max_samples_per_query` / `-ingester.max-samples-per-query`

  Limits on the number of timeseries and samples returned by a single ingester during a query.

- `max_metadata_per_user` / `-ingester.max-metadata-per-user`
- `max_metadata_per_metric` / `-ingester.max-metadata-per-metric`
  Enforced by the ingesters; limits the number of active metadata entries a user (or a given metric) can have. When running with `-distributor.shard-by-all-labels=false` (the default), this limit will enforce the maximum number of metadata entries a metric can have 'globally', as all metadata for a single metric will be sent to the same replication set of ingesters. This is not the case when running with `-distributor.shard-by-all-labels=true`, so the actual limit will be N/RF times higher, where N is the number of ingester replicas and RF is the configured replication factor.

- `max_fetched_series_per_query` / `-querier.max-fetched-series-per-query`
  When running Cortex with blocks storage, this limit is enforced in the queriers on the unique series fetched from ingesters and store-gateways (long-term storage).

- `max_global_metadata_per_user` / `-ingester.max-global-metadata-per-user`
- `max_global_metadata_per_metric` / `-ingester.max-global-metadata-per-metric`

   Like `max_metadata_per_user` and `max_metadata_per_metric`, but the limit is enforced across the cluster. Each ingester is configured with a local limit based on the replication factor, the `-distributor.shard-by-all-labels` setting and the current number of healthy ingesters, and is kept updated whenever the number of ingesters changes.

   Requires `-distributor.replication-factor`, `-distributor.shard-by-all-labels`, `-distributor.sharding-strategy` and `-distributor.zone-awareness-enabled` to be set for the ingesters too.
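
As referenced above, a hedged worked example of the `global` ingestion rate strategy (the numbers are illustrative): with `ingestion_rate: 100000` and 10 healthy distributor replicas, each distributor configures its local rate limiter to 100000 / 10 = 10000 samples/s, while the burst size remains per-distributor.

```yaml
overrides:
  tenant1:
    ingestion_rate_strategy: global
    ingestion_rate: 100000      # samples/s across the whole cluster
    ingestion_burst_size: 50000 # per-distributor; at least the largest expected push request
```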

## Ingester Instance Limits

Cortex ingesters support limits that are applied per-instance, meaning they apply to each ingester process. These can be used to ensure individual ingesters are not overwhelmed regardless of any per-user limits. These limits can be set under the `ingester.instance_limits` block in the global configuration file, with command line flags, or under the `ingester_limits` field in the runtime configuration file.

An example as part of the runtime configuration file:

```yaml
ingester_limits:
  max_ingestion_rate: 20000
  max_series: 1500000
  max_tenants: 1000
  max_inflight_push_requests: 30000
```

Valid ingester instance limits are (with their corresponding flags):

- `max_ingestion_rate` / `-ingester.instance-limits.max-ingestion-rate`

  Limit the ingestion rate in samples per second for an ingester. When this limit is reached, new requests will fail with an HTTP 500 error.

- `max_series` / `-ingester.instance-limits.max-series`

  Limit the total number of series that an ingester keeps in memory, across all users. When this limit is reached, requests that create new series will fail with an HTTP 500 error.

- `max_tenants` / `-ingester.instance-limits.max-tenants`

  Limit the maximum number of users an ingester will accept metrics for. When this limit is reached, requests from new users will fail with an HTTP 500 error.

- `max_inflight_push_requests` / `-ingester.instance-limits.max-inflight-push-requests`

  Limit the maximum number of requests being handled by an ingester at once. This setting is critical for preventing ingesters from using an excessive amount of memory during high load or temporary slowdowns. When this limit is reached, new requests will fail with an HTTP 500 error.

## Storage

- `s3.force-path-style`

  Set this to `true` to force the request to use path-style addressing (`http://s3.amazonaws.com/BUCKET/KEY`). By default, the S3 client will use virtual hosted bucket addressing when possible (`http://BUCKET.s3.amazonaws.com/KEY`).

## DNS Service Discovery

Some clients in Cortex support service discovery via DNS to find the addresses of backend servers to connect to (e.g. caching servers). The clients supporting it are:

- [Blocks storage's memcached cache](../blocks-storage/store-gateway.md#caching)
- [All caching memcached servers](./config-file-reference.md#memcached-client-config)
- [Memberlist KV store](./config-file-reference.md#memberlist-config)

### Supported discovery modes

The DNS service discovery, inspired by Thanos' DNS SD, supports different discovery modes. A discovery mode is selected by adding a specific prefix to the address. The supported prefixes are:

- **`dns+`**<br />
  The domain name after the prefix is looked up as an A/AAAA query. For example: `dns+memcached.local:11211`
- **`dnssrv+`**<br />
  The domain name after the prefix is looked up as a SRV query, and then each SRV record is resolved as an A/AAAA record. For example: `dnssrv+_memcached._tcp.memcached.namespace.svc.cluster.local`
- **`dnssrvnoa+`**<br />
  The domain name after the prefix is looked up as a SRV query, with no A/AAAA lookup made after that. For example: `dnssrvnoa+_memcached._tcp.memcached.namespace.svc.cluster.local`

If **no prefix** is provided, the provided IP or hostname will be used straightaway without pre-resolving it.

If you are using a managed memcached service from [Google Cloud](https://cloud.google.com/memorystore/docs/memcached/auto-discovery-overview) or [AWS](https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/AutoDiscovery.HowAutoDiscoveryWorks.html), use the [auto-discovery](./config-file-reference.md#memcached-client-config) flag instead of DNS discovery, and then use the discovery/configuration endpoint as the domain name, without any prefix.

## Logging of IP of reverse proxy

If a reverse proxy is used in front of Cortex, it might be difficult to troubleshoot errors. The following three settings can be used to log the IP address passed along by the reverse proxy in headers such as X-Forwarded-For.

- `-server.log-source-ips-enabled`

  Set this to `true` to add logging of the IP when a Forwarded, X-Real-IP or X-Forwarded-For header is used. A field called `sourceIPs` will be added to error logs when data is pushed into Cortex.

- `-server.log-source-ips-header`

  Header field storing the source IPs. It is only used if `-server.log-source-ips-enabled` is true and if `-server.log-source-ips-regex` is set. If not set, the default Forwarded, X-Real-IP or X-Forwarded-For headers are searched.

- `-server.log-source-ips-regex`

  Regular expression for matching the source IPs. It should contain at least one capturing group, the first of which will be returned. It is only used if `-server.log-source-ips-enabled` is true and if `-server.log-source-ips-header` is set. If not set, the default Forwarded, X-Real-IP or X-Forwarded-For headers are searched.
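
For example, a hypothetical setup that logs the first (client) address from a comma-separated X-Forwarded-For value:

```
-server.log-source-ips-enabled=true
-server.log-source-ips-header=X-Forwarded-For
-server.log-source-ips-regex=^([^,]+)    # capture the first entry in the comma-separated list
```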