---
title: "Production tips"
linkTitle: "Production tips"
weight: 4
slug: production-tips
---

This page shares some tips and things to take into consideration when setting up a production Cortex cluster based on the blocks storage.

## Ingester

### Ensure a high number of max open file descriptors

The ingester stores received series into per-tenant TSDB blocks. The TSDB WAL, head and compacted blocks are composed of a relatively large number of files, which get loaded via mmap. This means that the ingester keeps file descriptors open for TSDB WAL segments, chunk files and compacted blocks which haven't reached the retention period yet.

If your Cortex cluster has many tenants, or the ingester is running with a long `-blocks-storage.tsdb.retention-period`, the ingester may hit the **`file-max` ulimit** (maximum number of open file descriptors per process); in this case, we recommend increasing the limit on your system or enabling [shuffle sharding](../guides/shuffle-sharding.md).

The rule of thumb is that a production system shouldn't have the `file-max` ulimit below `65536`, but higher values are recommended (e.g. `1048576`).

### Ingester disk space

Ingesters create blocks on disk as samples come in, then every 2 hours (configurable) they cut those blocks and start a new one.

We typically configure ingesters to retain these blocks for longer, to allow time to recover if something goes wrong uploading to the long-term store and to reduce work in queriers - more detail [here](#how-to-estimate--querierquery-store-after).

If you configure ingesters with `-blocks-storage.tsdb.retention-period=24h`, a rule of thumb for the disk space required is to take the number of timeseries after replication and multiply it by 30KB.

For example, if you have 20M active series replicated 3 ways, this gives approx 1.7TB. Divide by the number of ingesters and allow some margin for growth, e.g. if you have 20 ingesters then 100GB each should work, or 150GB each to be more comfortable.
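For reference, the retention and block range settings discussed above should map to the following YAML options. This is a minimal sketch with illustrative values, assuming the YAML equivalents of the `-blocks-storage.tsdb.*` flags; check the configuration reference of your Cortex version for the exact option names and defaults:

```yaml
# Minimal sketch, not a complete configuration.
blocks_storage:
  tsdb:
    # How long ingesters keep blocks on local disk after shipping them to
    # the bucket (the 30KB-per-series disk estimate above assumes 24h).
    retention_period: 24h
    # Range of the blocks cut by ingesters (2h by default).
    block_ranges_period: [2h]
```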
## Querier

### Ensure caching is enabled

The querier relies on caching to reduce the number of API calls to the storage bucket. Ensure [caching](./querier.md#caching) is properly configured and [properly scaled](#ensure-memcached-is-properly-scaled).

### Ensure bucket index is enabled

The bucket index reduces the number of API calls to the storage bucket and, when enabled, the querier is up and running immediately after startup (there is no need to run an initial bucket scan). Ensure [bucket index](./bucket-index.md) is enabled for the querier.

### Avoid querying non compacted blocks

When running a Cortex blocks storage cluster at scale, querying non compacted blocks may be inefficient for two reasons:

1. Non compacted blocks contain duplicated samples (as an effect of the replication of ingested samples)
2. The overhead introduced by querying many small indexes

Because of this, we suggest avoiding querying non compacted blocks. In order to do so, you should do the following (a configuration sketch is shown after the list):

1. Run the [compactor](./compactor.md)
1. Configure queriers' `-querier.query-store-after` large enough to give the compactor enough time to compact newly uploaded blocks (_see below_)
1. Configure queriers' `-querier.query-ingesters-within` equal to `-querier.query-store-after` plus 5m (the 5 minutes are just a delta to ensure the boundary is queried both from ingesters and from the storage)
1. Configure ingesters' `-blocks-storage.tsdb.retention-period` to be at least as large as `-querier.query-ingesters-within`
1. Lower `-blocks-storage.bucket-store.ignore-deletion-marks-delay` to 1h, otherwise non compacted blocks could be queried anyway, even if their compacted replacement is available
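Putting these options together, the following is a minimal YAML sketch with purely illustrative durations; the real values must come from the estimation described in the next section, and the option names should be checked against the configuration reference of your Cortex version:

```yaml
# Minimal sketch, not a complete configuration; durations are illustrative.
querier:
  # Query the long-term storage only for data older than this.
  query_store_after: 12h
  # query_store_after plus a 5m delta, so the boundary is covered by both
  # ingesters and the storage.
  query_ingesters_within: 12h5m

blocks_storage:
  tsdb:
    # Must be at least as large as query_ingesters_within (24h >= 12h5m here).
    retention_period: 24h
  bucket_store:
    # Lowered so that non compacted blocks stop being queried as soon as
    # their compacted replacement is available.
    ignore_deletion_marks_delay: 1h
```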
#### How to estimate `-querier.query-store-after`

The `-querier.query-store-after` should be set to a duration large enough to give the compactor enough time to compact newly uploaded blocks, and queriers and store-gateways enough time to discover and sync newly compacted blocks.

The following diagram shows all the timings involved in the estimation. This diagram should be used only as a template and you're expected to tweak the assumptions based on real measurements in your Cortex cluster. In this example, the following assumptions have been made:

- An ingester takes up to 30 minutes to upload a block to the storage
- The compactor takes up to 3 hours to compact the 2h blocks shipped from all ingesters
- Queriers and store-gateways take up to 15 minutes to discover and load a new compacted block

Given these assumptions, in the worst case scenario it would take up to 6h and 45m from the moment a sample is ingested until that sample has been appended to a block flushed to the storage and that block has been [vertically compacted](./compactor.md) with all other overlapping 2h blocks shipped from ingesters.

![Avoid querying non compacted blocks](/images/blocks-storage/avoid-querying-non-compacted-blocks.png)
<!-- Diagram source at https://docs.google.com/presentation/d/1bHp8_zcoWCYoNU2AhO2lSagQyuIrghkCncViSqn14cU/edit -->

## Store-gateway

### Ensure caching is enabled

The store-gateway heavily relies on caching both to speed up queries and to reduce the number of API calls to the storage bucket. Ensure [caching](./store-gateway.md#caching) is properly configured and [properly scaled](#ensure-memcached-is-properly-scaled).

### Ensure bucket index is enabled

The bucket index reduces the number of API calls to the storage bucket and the startup time of the store-gateway. Ensure [bucket index](./bucket-index.md) is enabled for the store-gateway.

### Ensure a high number of max open file descriptors

The store-gateway stores each block’s index-header on the local disk and loads it via mmap. This means that the store-gateway keeps a file descriptor open for each loaded block. If your Cortex cluster has many blocks in the bucket, the store-gateway may hit the **`file-max` ulimit** (maximum number of open file descriptors per process); in this case, we recommend increasing the limit on your system or running more store-gateway instances with blocks sharding enabled.

The rule of thumb is that a production system shouldn't have the `file-max` ulimit below `65536`, but higher values are recommended (e.g. `1048576`).

## Compactor

### Ensure the compactor has enough disk space

The compactor generally needs a lot of disk space in order to download source blocks from the bucket and store the compacted block before uploading it to the storage. Please refer to [Compactor disk utilization](./compactor.md#compactor-disk-utilization) for more information on how to do capacity planning.

### Ensure deletion marks migration is disabled after first run

Cortex 1.7.0 introduced a change to store block deletion marks in a per-tenant global `markers/` location in the object store. If your Cortex cluster has been created with a Cortex version older than 1.7.0, the compactor needs to migrate deletion marks to the global location.

The migration is a one-time operation run at compactor startup. Running it at every compactor startup is a waste of resources and increases the compactor startup time, so once it has run successfully you should disable it via `-compactor.block-deletion-marks-migration-enabled=false` (or its respective YAML config option).

You can verify the initial migration has completed by looking for the following message in the compactor's logs:
`msg="migrated block deletion marks to the global markers location"`.

## Caching

### Ensure memcached is properly scaled

The rule of thumb to ensure memcached is properly scaled is to make sure evictions happen infrequently. When that's not the case and they affect query performance, the suggestion is to scale out the memcached cluster, adding more nodes or increasing the memory limit of existing ones.

We also recommend running a separate memcached cluster for each cache type (metadata, index, chunks). It's not required, but it avoids memory pressure on one cache type affecting the others.

## Alertmanager

### Ensure Alertmanager networking is hardened

If the Alertmanager API is enabled, users with access to Cortex can autonomously configure the Alertmanager, including receiver integrations that issue network requests to the configured URL (e.g. a webhook). If the Alertmanager network is not hardened, Cortex users may have the ability to issue network requests to any network endpoint, including services running in the local network accessible by the Alertmanager itself.

Although hardening the system is out of the scope of Cortex, Cortex provides a basic built-in firewall to block connections created by Alertmanager receiver integrations:

- `-alertmanager.receivers-firewall-block-cidr-networks`
- `-alertmanager.receivers-firewall-block-private-addresses`

_These settings can also be overridden on a per-tenant basis via overrides specified in the [runtime config](../configuration/arguments.md#runtime-configuration-file)._
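As an illustration, a minimal sketch of enabling this firewall through the YAML configuration could look like the following. The key names are assumed to sit in the per-tenant limits block and the CIDR ranges are illustrative; verify both against the configuration reference of your Cortex version:

```yaml
# Minimal sketch, not a complete configuration; verify the key names and
# adapt the CIDR ranges to your own network.
limits:
  # Block Alertmanager receiver integrations from reaching private and
  # link-local addresses.
  alertmanager_receivers_firewall_block_private_addresses: true
  # Additionally block specific CIDR ranges.
  alertmanager_receivers_firewall_block_cidr_networks: "10.0.0.0/8,192.168.0.0/16"
```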