
---
title: "Production tips"
linkTitle: "Production tips"
weight: 4
slug: production-tips
---

This page shares some tips and considerations for setting up a production Cortex cluster based on the blocks storage.

## Ingester

### Ensure a high number of max open file descriptors

The ingester stores received series into per-tenant TSDB blocks. The TSDB WAL, head and compacted blocks are composed of a relatively large number of files, which get loaded via mmap. This means that the ingester keeps file descriptors open for TSDB WAL segments, chunk files and compacted blocks which haven't reached the retention period yet.

If your Cortex cluster has many tenants, or the ingester is running with a long `-blocks-storage.tsdb.retention-period`, the ingester may hit the **`file-max` ulimit** (maximum number of open file descriptors per process); in such a case, we recommend increasing the limit on your system or enabling [shuffle sharding](../guides/shuffle-sharding.md).

The rule of thumb is that a production system shouldn't have the `file-max` ulimit below `65536`, but higher values are recommended (e.g. `1048576`).
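
For example, if you run the ingester via Docker Compose (a hypothetical deployment; the service name, image tag and command below are illustrative), the limit can be raised per container via `ulimits`:

```yaml
# docker-compose.yml - illustrative sketch, not an official deployment manifest.
services:
  ingester:
    image: quay.io/cortexproject/cortex:v1.9.1   # assumed image tag
    command: ["-target=ingester", "-config.file=/etc/cortex/config.yaml"]
    ulimits:
      nofile:
        soft: 1048576
        hard: 1048576
```

On bare-metal or systemd deployments the equivalent is raising the OS-level limit (e.g. `LimitNOFILE` in the systemd unit).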

### Ingester disk space

Ingesters create blocks on disk as samples come in; every 2 hours (configurable) they cut the current block and start a new one.

We typically configure ingesters to retain these blocks for longer, to allow time to recover if something goes wrong uploading to the long-term store and to reduce work in queriers - more detail [here](#how-to-estimate--querierquery-store-after).

If you configure ingesters with `-blocks-storage.tsdb.retention-period=24h`, a rule of thumb for the required disk space is to take the number of time series after replication and multiply it by 30KB.

For example, if you have 20M active series replicated 3 ways, this gives approx 1.7TB. Divide by the number of ingesters and allow some margin for growth, e.g. if you have 20 ingesters then 100GB each should work, or 150GB each to be more comfortable.
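
A minimal configuration and sizing sketch for the example above (the numbers are purely illustrative and match the worked example):

```yaml
# YAML equivalent of -blocks-storage.tsdb.retention-period.
blocks_storage:
  tsdb:
    retention_period: 24h

# Back-of-the-envelope sizing:
#   20M active series x 3 (replication) x 30KB ≈ 1.7TB
#   1.7TB / 20 ingesters                       ≈ 85GB per ingester
#   -> 100GB each should work; 150GB each leaves more headroom for growth
```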

## Querier

### Ensure caching is enabled

The querier relies on caching to reduce the number of API calls to the storage bucket. Ensure [caching](./querier.md#caching) is properly configured and [properly scaled](#ensure-memcached-is-properly-scaled).

### Ensure bucket index is enabled

The bucket index reduces the number of API calls to the storage bucket and, when enabled, the querier is up and running immediately after startup (there's no need to run an initial bucket scan). Ensure the [bucket index](./bucket-index.md) is enabled for the querier.
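
A minimal YAML sketch of the equivalent of the `-blocks-storage.bucket-store.bucket-index.enabled` flag (see the [bucket index](./bucket-index.md) documentation for details on rolling it out; the same setting also applies to the store-gateway covered below):

```yaml
blocks_storage:
  bucket_store:
    bucket_index:
      enabled: true
```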

### Avoid querying non compacted blocks

When running a Cortex blocks storage cluster at scale, querying non compacted blocks may be inefficient for two reasons:

1. Non compacted blocks contain duplicated samples (as an effect of the replication of ingested samples)
2. The overhead introduced by querying many small indexes

Because of this, we suggest avoiding querying non compacted blocks. In order to do so, you should (see the example configuration after this list):

1. Run the [compactor](./compactor.md)
1. Configure the queriers' `-querier.query-store-after` to a value large enough to give the compactor enough time to compact newly uploaded blocks (_see below_)
1. Configure the queriers' `-querier.query-ingesters-within` equal to `-querier.query-store-after` plus 5m (the 5 minutes are just a delta to query the boundary both from the ingesters and the storage)
1. Configure the ingesters' `-blocks-storage.tsdb.retention-period` to be at least as long as `-querier.query-ingesters-within`
1. Lower `-blocks-storage.bucket-store.ignore-deletion-marks-delay` to 1h, otherwise non compacted blocks could be queried anyway, even if their compacted replacement is available
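
For example, assuming the worst-case timings estimated in the next section plus a comfortable safety margin, the settings above could look like the following YAML sketch (the 12h value is purely illustrative; derive your own from the estimation below):

```yaml
querier:
  query_store_after: 12h        # must leave the compactor enough time (see estimation below)
  query_ingesters_within: 12h5m # query-store-after + 5m boundary delta

blocks_storage:
  tsdb:
    retention_period: 13h       # at least as long as query-ingesters-within
  bucket_store:
    ignore_deletion_marks_delay: 1h
```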

#### How to estimate `-querier.query-store-after`

The `-querier.query-store-after` should be set to a duration large enough to give the compactor enough time to compact newly uploaded blocks, and queriers and store-gateways enough time to discover and sync newly compacted blocks.

The following diagram shows all the timings involved in the estimation. This diagram should be used only as a template and you're expected to tweak the assumptions based on real measurements in your Cortex cluster. In this example, the following assumptions have been made:

- An ingester takes up to 30 minutes to upload a block to the storage
- The compactor takes up to 3 hours to compact 2h blocks shipped from all ingesters
- Queriers and store-gateways take up to 15 minutes to discover and load a new compacted block

Given these assumptions, in the worst case scenario it takes up to 6h and 45m from when a sample is ingested until that sample has been appended to a block flushed to the storage and that block has been [vertically compacted](./compactor.md) with all other overlapping 2h blocks shipped from ingesters.

![Avoid querying non compacted blocks](/images/blocks-storage/avoid-querying-non-compacted-blocks.png)
<!-- Diagram source at https://docs.google.com/presentation/d/1bHp8_zcoWCYoNU2AhO2lSagQyuIrghkCncViSqn14cU/edit -->

## Store-gateway

### Ensure caching is enabled

The store-gateway heavily relies on caching both to speed up queries and to reduce the number of API calls to the storage bucket. Ensure [caching](./store-gateway.md#caching) is properly configured and [properly scaled](#ensure-memcached-is-properly-scaled).
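
As a reference, a minimal sketch of the store-gateway caches backed by memcached (the addresses are hypothetical; see the [store-gateway caching](./store-gateway.md#caching) documentation for all the available options):

```yaml
blocks_storage:
  bucket_store:
    index_cache:
      backend: memcached
      memcached:
        addresses: memcached-index.cortex.svc.cluster.local:11211
    chunks_cache:
      backend: memcached
      memcached:
        addresses: memcached-chunks.cortex.svc.cluster.local:11211
    metadata_cache:
      backend: memcached
      memcached:
        addresses: memcached-metadata.cortex.svc.cluster.local:11211
```

Note the three distinct memcached clusters, one per cache type, as recommended in the [Caching](#caching) section below.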

### Ensure bucket index is enabled

The bucket index reduces the number of API calls to the storage bucket and the startup time of the store-gateway. Ensure the [bucket index](./bucket-index.md) is enabled for the store-gateway.

### Ensure a high number of max open file descriptors

The store-gateway stores each block’s index-header on the local disk and loads it via mmap. This means that the store-gateway keeps a file descriptor open for each loaded block. If your Cortex cluster has many blocks in the bucket, the store-gateway may hit the **`file-max` ulimit** (maximum number of open file descriptors per process); in such a case, we recommend increasing the limit on your system or running more store-gateway instances with blocks sharding enabled.

The rule of thumb is that a production system shouldn't have the `file-max` ulimit below `65536`, but higher values are recommended (e.g. `1048576`).

## Compactor

### Ensure the compactor has enough disk space

The compactor generally needs a lot of disk space in order to download source blocks from the bucket and store the compacted block before uploading it to the storage. Please refer to [Compactor disk utilization](./compactor.md#compactor-disk-utilization) for more information about how to do capacity planning.
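
A minimal sketch of the related setting, assuming the compactor's working directory is backed by a dedicated volume (the path is hypothetical; size the volume according to the capacity planning guidance linked above):

```yaml
compactor:
  data_dir: /data/compactor   # mount a dedicated, appropriately sized volume here
```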

### Ensure deletion marks migration is disabled after first run

Cortex 1.7.0 introduced a change to store block deletion marks in a per-tenant global `markers/` location in the object store. If your Cortex cluster was created with a Cortex version older than 1.7.0, the compactor needs to migrate deletion marks to the global location.

The migration is a one-time operation run at compactor startup. Running it at every compactor startup is a waste of resources and increases the compactor startup time, so once it has run successfully you should disable it via `-compactor.block-deletion-marks-migration-enabled=false` (or its respective YAML config option).

You can verify that the initial migration has completed by looking for the following message in the compactor's logs:
`msg="migrated block deletion marks to the global markers location"`.
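
A sketch of the flag's YAML equivalent:

```yaml
compactor:
  block_deletion_marks_migration_enabled: false
```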

## Caching

### Ensure memcached is properly scaled

The rule of thumb to ensure memcached is properly scaled is to make sure evictions happen infrequently. When that's not the case and evictions affect query performance, the suggestion is to scale out the memcached cluster, adding more nodes or increasing the memory limit of the existing ones.

We also recommend running a separate memcached cluster for each cache type (metadata, index, chunks). It's not required, but it's suggested so that memory pressure on one cache type doesn't affect the others.

## Alertmanager

### Ensure Alertmanager networking is hardened

If the Alertmanager API is enabled, users with access to Cortex can autonomously configure the Alertmanager, including receiver integrations that allow issuing network requests to the configured URL (e.g. a webhook). If the Alertmanager network is not hardened, Cortex users may have the ability to issue network requests to any network endpoint, including services running in the local network accessible by the Alertmanager itself.

Although hardening the system is out of the scope of Cortex, Cortex provides a basic built-in firewall to block connections created by Alertmanager receiver integrations:

- `-alertmanager.receivers-firewall-block-cidr-networks`
- `-alertmanager.receivers-firewall-block-private-addresses`

_These settings can also be overridden on a per-tenant basis via overrides specified in the [runtime config](../configuration/arguments.md#runtime-configuration-file)._
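
For example, a hedged sketch of what enabling the built-in firewall could look like in YAML, assuming these options live under the `limits` block (the option names are the YAML form of the flags above and the CIDR value is illustrative):

```yaml
# Cluster-wide defaults (assumed limits block; overridable per tenant via the runtime config).
limits:
  alertmanager_receivers_firewall_block_private_addresses: true
  alertmanager_receivers_firewall_block_cidr_networks: 192.168.0.0/16   # illustrative CIDR list
```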