
---
title: "Production tips"
linkTitle: "Production tips"
weight: 4
slug: production-tips
---

This page shares some tips and considerations for setting up a production Cortex cluster based on the blocks storage.

## Ingester

### Ensure a high number of max open file descriptors

The ingester stores received series into per-tenant TSDB blocks. The TSDB WAL, head and compacted blocks are composed of a relatively large number of files, which get loaded via mmap. This means that the ingester keeps file descriptors open for TSDB WAL segments, chunk files and compacted blocks which haven't reached the retention period yet.

If your Cortex cluster has many tenants, or the ingester is running with a long `-blocks-storage.tsdb.retention-period`, the ingester may hit the **`file-max` ulimit** (maximum number of open file descriptors per process); in such a case, we recommend increasing the limit on your system or enabling [shuffle sharding](../guides/shuffle-sharding.md).

The rule of thumb is that a production system shouldn't have the `file-max` ulimit below `65536`, but higher values are recommended (e.g. `1048576`).
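
For example, if you run the ingester via Docker Compose (a hypothetical deployment; the service name, image tag and command below are illustrative), the limit can be raised per container via `ulimits`:

```yaml
# docker-compose.yml - illustrative sketch, not an official deployment manifest.
services:
  ingester:
    image: quay.io/cortexproject/cortex:v1.9.1   # assumed image tag
    command: ["-target=ingester", "-config.file=/etc/cortex/config.yaml"]
    ulimits:
      nofile:
        soft: 1048576
        hard: 1048576
```

On bare-metal or systemd deployments the equivalent is raising the OS-level limit (e.g. `LimitNOFILE` in the systemd unit).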

### Ingester disk space

Ingesters create blocks on disk as samples come in; every 2 hours (configurable) they cut the current block and start a new one.

We typically configure ingesters to retain these blocks for longer, to allow time to recover if something goes wrong uploading to the long-term store and to reduce work in queriers - more detail [here](#how-to-estimate--querierquery-store-after).

If you configure ingesters with `-blocks-storage.tsdb.retention-period=24h`, a rule of thumb for the required disk space is to take the number of time series after replication and multiply it by 30KB.

For example, if you have 20M active series replicated 3 ways, this gives approx 1.7TB. Divide by the number of ingesters and allow some margin for growth, e.g. if you have 20 ingesters then 100GB each should work, or 150GB each to be more comfortable.
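
A minimal configuration and sizing sketch for the example above (the numbers are purely illustrative and match the worked example):

```yaml
# YAML equivalent of -blocks-storage.tsdb.retention-period.
blocks_storage:
  tsdb:
    retention_period: 24h

# Back-of-the-envelope sizing:
#   20M active series x 3 (replication) x 30KB ≈ 1.7TB
#   1.7TB / 20 ingesters                       ≈ 85GB per ingester
#   -> 100GB each should work; 150GB each leaves more headroom for growth
```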

## Querier

### Ensure caching is enabled

The querier relies on caching to reduce the number of API calls to the storage bucket. Ensure [caching](./querier.md#caching) is properly configured and [properly scaled](#ensure-memcached-is-properly-scaled).

### Ensure bucket index is enabled

The bucket index reduces the number of API calls to the storage bucket and, when enabled, the querier is up and running immediately after startup (there's no need to run an initial bucket scan). Ensure the [bucket index](./bucket-index.md) is enabled for the querier.
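
A minimal YAML sketch of the equivalent of the `-blocks-storage.bucket-store.bucket-index.enabled` flag (see the [bucket index](./bucket-index.md) documentation for details on rolling it out; the same setting also applies to the store-gateway covered below):

```yaml
blocks_storage:
  bucket_store:
    bucket_index:
      enabled: true
```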

### Avoid querying non compacted blocks

When running a Cortex blocks storage cluster at scale, querying non compacted blocks may be inefficient for two reasons:

1. Non compacted blocks contain duplicated samples (as an effect of the replication of ingested samples)
2. The overhead introduced by querying many small indexes

Because of this, we suggest avoiding querying non compacted blocks. In order to do so, you should (see the example configuration after this list):

1. Run the [compactor](./compactor.md)
1. Configure the queriers' `-querier.query-store-after` to a value large enough to give the compactor enough time to compact newly uploaded blocks (_see below_)
1. Configure the queriers' `-querier.query-ingesters-within` equal to `-querier.query-store-after` plus 5m (the 5 minutes are just a delta to query the boundary both from the ingesters and the storage)
1. Configure the ingesters' `-blocks-storage.tsdb.retention-period` to be at least as long as `-querier.query-ingesters-within`
1. Lower `-blocks-storage.bucket-store.ignore-deletion-marks-delay` to 1h, otherwise non compacted blocks could be queried anyway, even if their compacted replacement is available
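
For example, assuming the worst-case timings estimated in the next section plus a comfortable safety margin, the settings above could look like the following YAML sketch (the 12h value is purely illustrative; derive your own from the estimation below):

```yaml
querier:
  query_store_after: 12h        # must leave the compactor enough time (see estimation below)
  query_ingesters_within: 12h5m # query-store-after + 5m boundary delta

blocks_storage:
  tsdb:
    retention_period: 13h       # at least as long as query-ingesters-within
  bucket_store:
    ignore_deletion_marks_delay: 1h
```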

#### How to estimate `-querier.query-store-after`

The `-querier.query-store-after` should be set to a duration large enough to give the compactor enough time to compact newly uploaded blocks, and queriers and store-gateways enough time to discover and sync newly compacted blocks.

The following diagram shows all the timings involved in the estimation. This diagram should be used only as a template and you're expected to tweak the assumptions based on real measurements in your Cortex cluster. In this example, the following assumptions have been made:

- An ingester takes up to 30 minutes to upload a block to the storage
- The compactor takes up to 3 hours to compact 2h blocks shipped from all ingesters
- Queriers and store-gateways take up to 15 minutes to discover and load a new compacted block

Given these assumptions, in the worst case scenario it takes up to 6h and 45m from when a sample is ingested until that sample has been appended to a block flushed to the storage and that block has been [vertically compacted](./compactor.md) with all other overlapping 2h blocks shipped from ingesters.

![Avoid querying non compacted blocks](/images/blocks-storage/avoid-querying-non-compacted-blocks.png)
<!-- Diagram source at https://docs.google.com/presentation/d/1bHp8_zcoWCYoNU2AhO2lSagQyuIrghkCncViSqn14cU/edit -->

## Store-gateway

### Ensure caching is enabled

The store-gateway heavily relies on caching both to speed up queries and to reduce the number of API calls to the storage bucket. Ensure [caching](./store-gateway.md#caching) is properly configured and [properly scaled](#ensure-memcached-is-properly-scaled).
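
As a reference, a minimal sketch of the store-gateway caches backed by memcached (the addresses are hypothetical; see the [store-gateway caching](./store-gateway.md#caching) documentation for all the available options):

```yaml
blocks_storage:
  bucket_store:
    index_cache:
      backend: memcached
      memcached:
        addresses: memcached-index.cortex.svc.cluster.local:11211
    chunks_cache:
      backend: memcached
      memcached:
        addresses: memcached-chunks.cortex.svc.cluster.local:11211
    metadata_cache:
      backend: memcached
      memcached:
        addresses: memcached-metadata.cortex.svc.cluster.local:11211
```

Note the three distinct memcached clusters, one per cache type, as recommended in the [Caching](#caching) section below.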

### Ensure bucket index is enabled

The bucket index reduces the number of API calls to the storage bucket and the startup time of the store-gateway. Ensure the [bucket index](./bucket-index.md) is enabled for the store-gateway.

### Ensure a high number of max open file descriptors

The store-gateway stores each block’s index-header on the local disk and loads it via mmap. This means that the store-gateway keeps a file descriptor open for each loaded block. If your Cortex cluster has many blocks in the bucket, the store-gateway may hit the **`file-max` ulimit** (maximum number of open file descriptors per process); in such a case, we recommend increasing the limit on your system or running more store-gateway instances with blocks sharding enabled.

The rule of thumb is that a production system shouldn't have the `file-max` ulimit below `65536`, but higher values are recommended (e.g. `1048576`).

## Compactor

### Ensure the compactor has enough disk space

The compactor generally needs a lot of disk space in order to download source blocks from the bucket and store the compacted block before uploading it to the storage. Please refer to [Compactor disk utilization](./compactor.md#compactor-disk-utilization) for more information about how to do capacity planning.
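
A minimal sketch of the related setting, assuming the compactor's working directory is backed by a dedicated volume (the path is hypothetical; size the volume according to the capacity planning guidance linked above):

```yaml
compactor:
  data_dir: /data/compactor   # mount a dedicated, appropriately sized volume here
```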

### Ensure deletion marks migration is disabled after first run

Cortex 1.7.0 introduced a change to store block deletion marks in a per-tenant global `markers/` location in the object store. If your Cortex cluster was created with a Cortex version older than 1.7.0, the compactor needs to migrate deletion marks to the global location.

The migration is a one-time operation run at compactor startup. Running it at every compactor startup is a waste of resources and increases the compactor startup time, so once it has run successfully you should disable it via `-compactor.block-deletion-marks-migration-enabled=false` (or its respective YAML config option).

You can verify that the initial migration has completed by looking for the following message in the compactor's logs:
`msg="migrated block deletion marks to the global markers location"`.
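
A sketch of the flag's YAML equivalent:

```yaml
compactor:
  block_deletion_marks_migration_enabled: false
```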

## Caching

### Ensure memcached is properly scaled

The rule of thumb to ensure memcached is properly scaled is to make sure evictions happen infrequently. When that's not the case and evictions affect query performance, the suggestion is to scale out the memcached cluster, adding more nodes or increasing the memory limit of the existing ones.

We also recommend running a separate memcached cluster for each cache type (metadata, index, chunks). It's not required, but it's suggested so that memory pressure on one cache type doesn't affect the others.

## Alertmanager

### Ensure Alertmanager networking is hardened

If the Alertmanager API is enabled, users with access to Cortex can autonomously configure the Alertmanager, including receiver integrations that allow issuing network requests to the configured URL (e.g. a webhook). If the Alertmanager network is not hardened, Cortex users may have the ability to issue network requests to any network endpoint, including services running in the local network accessible by the Alertmanager itself.

Although hardening the system is out of the scope of Cortex, Cortex provides a basic built-in firewall to block connections created by Alertmanager receiver integrations:

- `-alertmanager.receivers-firewall-block-cidr-networks`
- `-alertmanager.receivers-firewall-block-private-addresses`

_These settings can also be overridden on a per-tenant basis via overrides specified in the [runtime config](../configuration/arguments.md#runtime-configuration-file)._
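
For example, a hedged sketch of what enabling the built-in firewall could look like in YAML, assuming these options live under the `limits` block (the option names are the YAML form of the flags above and the CIDR value is illustrative):

```yaml
# Cluster-wide defaults (assumed limits block; overridable per tenant via the runtime config).
limits:
  alertmanager_receivers_firewall_block_private_addresses: true
  alertmanager_receivers_firewall_block_cidr_networks: 192.168.0.0/16   # illustrative CIDR list
```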