---
title: "Migrate Cortex cluster from chunks to blocks"
linkTitle: "Migrate Cortex cluster from chunks to blocks"
weight: 5
slug: migrate-cortex-cluster-from-chunks-to-blocks
---

This article describes how to migrate an existing Cortex cluster from chunks storage to blocks storage,
and highlights possible issues you may encounter in the process.

_This document replaces the [Cortex proposal](https://cortexmetrics.io/docs/proposals/ingesters-migration/),
which was written before support for migration was in place._

## Introduction

This article **assumes** that:

- The Cortex cluster is managed by Kubernetes
- Cortex is using chunks storage
- Ingesters are using WAL
- Cortex is version 1.4.0 or later

_If your ingesters are not using WAL, the documented procedure still applies, but the presented migration script will not work properly without changes, as it assumes that ingesters are managed via StatefulSet._

The migration procedure is composed of 3 steps:

1. [Preparation](#step-1-preparation)
1. [Ingesters migration](#step-2-ingesters-migration)
1. [Cleanup](#step-3-cleanup)

_In case of any issue during or after the migration, this document also outlines a [Rollback](#rollback) strategy._

## Step 1: Preparation

Before starting the migration of ingesters, we need to prepare other services.

### Querier and Ruler

_Everything discussed for the querier applies to the ruler as well, since it shares the querier configuration – the CLI flags prefix is `-querier` even when used by the ruler._

The querier and ruler need to be reconfigured as follows:

- `-querier.second-store-engine=blocks`
- `-querier.query-store-after=0`

#### `-querier.second-store-engine=blocks`

The querier (and ruler) needs to be reconfigured to query both the chunks storage and the blocks storage at the same time.
This is achieved by using the `-querier.second-store-engine=blocks` option, and providing the querier with the full blocks configuration, but keeping the "primary" store set to `-store.engine=chunks`.

#### `-querier.query-store-after=0`

The querier (and ruler) has an option `-querier.query-store-after` to query the store only if the query hits data older than some period of time.
For example, if ingesters keep 12h of data in memory, there is no need to hit the store for queries that only need the last 1h of data.
During the migration, this flag needs to be set to 0, to make queriers always consult the store when handling queries.
As chunks ingesters shut down, they flush chunks to the storage. They are then replaced with new ingesters configured
to use blocks. Queriers cannot fetch recent chunks from ingesters directly (as blocks ingesters don't reload chunks),
and need to use the storage instead.
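
Putting this together, during the migration the querier (and ruler) keeps its existing chunks configuration, gains the full blocks storage configuration (bucket settings and so on, omitted here), and runs with flags like the following sketch:

```
-store.engine=chunks
-querier.second-store-engine=blocks
-querier.query-store-after=0
```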

### Query-frontend

The query-frontend needs to be reconfigured as follows:

- `-querier.parallelise-shardable-queries=false`

#### `-querier.parallelise-shardable-queries=false`

The query-frontend has an option `-querier.parallelise-shardable-queries` to split some incoming queries into multiple queries, based on the sharding factor used in the v11 schema of the chunks storage.
As the description implies, it only works when using the chunks storage.
During and after the migration to blocks (and also after a possible rollback), this option needs to be disabled, otherwise the query-frontend will generate queries that cannot be satisfied by the blocks storage.

### Compactor and Store-gateway

The [compactor](./compactor.md) and [store-gateway](./store-gateway.md) services should be deployed and successfully up and running before migrating ingesters.

### Ingester – blocks

The migration script presented in Step 2 assumes that there are two StatefulSets of ingesters: the existing one configured with chunks, and a new one with blocks.
The new StatefulSet with blocks ingesters should have 0 replicas at the beginning of the migration.
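
For example, assuming the new StatefulSet is named `ingester-blocks` (the name used in the jsonnet appendix below; yours may differ), you can verify it starts at 0 replicas:

```
$ kubectl --namespace <namespace> get statefulset ingester-blocks
```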

### Table-Manager – chunks

If you use a store with provisioned IO, e.g. DynamoDB, scale up the provision before starting the migration.
Each ingester will need to flush all chunks before exiting, so it will write to the store at many times the normal rate.

Stop or reconfigure the table-manager to prevent it from adjusting the provision back to normal.
(Don't do the migration on Wednesday night when a new weekly table might be required.)
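
For example, assuming the table-manager runs as a Kubernetes Deployment named `table-manager` (an assumption; adjust to your deployment), it can be stopped for the duration of the migration with:

```
$ kubectl --namespace <namespace> scale deployment table-manager --replicas=0
```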

## Step 2: Ingesters migration

We have developed a script available in Cortex [`tools/migrate-ingester-statefulsets.sh`](https://github.com/cortexproject/cortex/blob/master/tools/migrate-ingester-statefulsets.sh) to migrate ingesters between two StatefulSets, shutting down ingesters one by one.

It can be used like this:

```
$ tools/migrate-ingester-statefulsets.sh <namespace> <ingester-old> <ingester-new> <num-instances>
```

Where the parameters are:
- `<namespace>`: Kubernetes namespace where the Cortex cluster is running
- `<ingester-old>`: name of the ingesters StatefulSet to scale down (running the chunks storage)
- `<ingester-new>`: name of the ingesters StatefulSet to scale up (running the blocks storage)
- `<num-instances>`: number of instances to scale down (in the `ingester-old` StatefulSet) and scale up (in `ingester-new`), or "all" – which will scale down all remaining instances in the `ingester-old` StatefulSet

After starting a new pod in the `ingester-new` StatefulSet, the script triggers the `/shutdown` endpoint on the old ingester. When the flushing on the old ingester is complete, the scale-down of the StatefulSet continues, and the process repeats.
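
For example, to first move 5 ingesters between hypothetical StatefulSets named `ingester` (chunks) and `ingester-blocks` (blocks) in the `cortex` namespace, and then move all the remaining ones:

```
$ tools/migrate-ingester-statefulsets.sh cortex ingester ingester-blocks 5
$ tools/migrate-ingester-statefulsets.sh cortex ingester ingester-blocks all
```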

_The script supports both migration from chunks to blocks, and vice versa (e.g. for rollback)._

### Known issues

There are a few known issues with the script:

- If the expected messages don't appear in the log, but the pod keeps running, the script will never finish.
- The script doesn't verify that the flush finished without any error.

## Step 3: Cleanup

When the ingesters migration finishes, there are still two StatefulSets, with the original StatefulSet (running the chunks storage) now having 0 instances.

At this point, we can delete the old StatefulSet and its persistent volumes, recreate it with the final blocks storage configuration (e.g. changing PVs), and use the script again to move pods from `ingester-blocks` to `ingester`.

The querier (and ruler) can be reconfigured to use `blocks` as the "primary" store to search, and `chunks` as the secondary:

- `-store.engine=blocks`
- `-querier.second-store-engine=chunks`
- `-querier.use-second-store-before-time=<timestamp after ingesters migration has completed>`
- `-querier.ingester-streaming=true`

#### `-querier.use-second-store-before-time`

The CLI flag `-querier.use-second-store-before-time` (or its respective YAML config option) is only available for the secondary store.
It can be set to a timestamp just after the migration has finished, and it avoids querying the secondary store (chunks) for queries that don't need any data from before the given time.

Both the primary and secondary stores are queried for times before this timestamp, so the overlap period where some data is in chunks and some in blocks is covered.
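
For example, if the ingesters migration finished shortly before 2020-07-28 17:00 UTC (the timestamp below is only an illustration; use your own), the final querier configuration could look like:

```
-store.engine=blocks
-querier.second-store-engine=chunks
-querier.use-second-store-before-time=2020-07-28T17:00:00Z
-querier.ingester-streaming=true
```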

## Rollback

If a rollback to chunks is needed for any reason, it is possible to use the same migration script with reversed arguments:

- Scale down the ingesters StatefulSet running the blocks storage
- Scale up the ingesters StatefulSet running the chunks storage

_Blocks ingesters support the same `/shutdown` endpoint for flushing data._

During the rollback, queriers and rulers need to use the same configuration changes as during the migration. You should also make sure the following settings are applied:

- `-store.engine=chunks`
- `-querier.second-store-engine=blocks`
- `-querier.use-second-store-before-time` should not be set
- `-querier.ingester-streaming=false`

Once the rollback is complete, some configuration changes need to stay in place, because some data has already been stored to blocks:

- The query sharding in the query-frontend must be kept disabled, otherwise querying blocks will not work correctly
- The `store-gateway` needs to keep running, otherwise querying blocks will fail
- The `compactor` may be shut down, once it has no more compaction work to do

Kubernetes resources related to the ingesters running the blocks storage may be deleted.
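
For example, assuming the temporary blocks resources are named as in the jsonnet appendix below (your names may differ):

```
$ kubectl --namespace <namespace> delete statefulset ingester-blocks
$ kubectl --namespace <namespace> delete poddisruptionbudget ingester-blocks-pdb
$ kubectl --namespace <namespace> delete service ingester-blocks
```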

### Known issues

After the rollback, chunks ingesters will replay their old write-ahead log, thus loading old chunks into memory.
The WAL doesn't remember whether these old chunks were already flushed or not, so they will be flushed again to the storage.
Until that flush happens, Cortex reports those chunks as unflushed, which may trigger some alerts based on the `cortex_oldest_unflushed_chunk_timestamp_seconds` metric.

## Appendix

### Jsonnet config

This section shows how to use [cortex-jsonnet](https://github.com/grafana/cortex-jsonnet) to configure the additional services.

We will assume that `main.jsonnet` is the main configuration for the cluster, and that it also imports `temp.jsonnet` – with our temporary configuration for the migration.

In `main.jsonnet` we have something like this:

```jsonnet
local cortex = import 'cortex/cortex.libsonnet';
local wal = import 'cortex/wal.libsonnet';
local temp = import 'temp.jsonnet';

// Note that 'tsdb' is not imported here.
cortex + wal + temp {
  _images+:: (import 'images.libsonnet'),

  _config+:: {
    cluster: 'k8s-cluster',
    namespace: 'k8s-namespace',

...
```

To configure the querier to use the secondary store for querying, we need to add the following to the `_config` object in `main.jsonnet`:

```jsonnet
    querier_second_storage_engine: 'blocks',
    blocks_storage_bucket_name: 'bucket-for-storing-blocks',
```

Let's now generate the blocks configuration in `temp.jsonnet`.
There are comments inside that should give you an idea of what's happening.
The most important thing is generating resources with the blocks configuration, and exposing some of them.

```jsonnet
{
  local cortex = import 'cortex/cortex.libsonnet',
  local tsdb = import 'cortex/tsdb.libsonnet',
  local rootConfig = self._config,
  local statefulSet = $.apps.v1beta1.statefulSet,

  // Prepare TSDB resources, but hide them. Cherry-picked resources will be exposed later.
  tsdb_config:: cortex + tsdb + {
    _config+:: {
      cluster: rootConfig.cluster,
      namespace: rootConfig.namespace,
      external_url: rootConfig.external_url,

      // This Cortex cluster is using the blocks storage.
      storage_tsdb_bucket_name: rootConfig.storage_tsdb_bucket_name,
      cortex_store_gateway_data_disk_size: '100Gi',
      cortex_compactor_data_disk_class: 'fast',
    },

    // We create another statefulset for ingesters here, with a different name.
    ingester_blocks_statefulset: self.newIngesterStatefulSet('ingester-blocks', self.ingester_container) +
                                 statefulSet.mixin.spec.withReplicas(0),

    ingester_blocks_pdb: self.newIngesterPdb('ingester-blocks-pdb', 'ingester-blocks'),
    ingester_blocks_service: $.util.serviceFor(self.ingester_blocks_statefulset, self.ingester_service_ignored_labels),
  },

  _config+: {
    queryFrontend+: {
      // Disabled because querying blocks-data breaks if query is rewritten for sharding.
      sharded_queries_enabled: false,
    },
  },

  // Expose some services from TSDB configuration, needed for running Querier with Chunks as primary and TSDB as secondary store.
  tsdb_store_gateway_pdb: self.tsdb_config.store_gateway_pdb,
  tsdb_store_gateway_service: self.tsdb_config.store_gateway_service,
  tsdb_store_gateway_statefulset: self.tsdb_config.store_gateway_statefulset,

  tsdb_memcached_metadata: self.tsdb_config.memcached_metadata,

  tsdb_ingester_statefulset: self.tsdb_config.ingester_blocks_statefulset,
  tsdb_ingester_pdb: self.tsdb_config.ingester_blocks_pdb,
  tsdb_ingester_service: self.tsdb_config.ingester_blocks_service,

  tsdb_compactor_statefulset: self.tsdb_config.compactor_statefulset,

  // Querier and ruler configuration used during migration, and after.
  query_config_during_migration:: {
    // Disable streaming, as it is broken when querying both chunks and blocks ingesters at the same time.
    'querier.ingester-streaming': 'false',

    // query-store-after is required during migration, since new ingesters running on blocks will not load any chunks from chunks-WAL.
    // All such chunks are however flushed to the store.
    'querier.query-store-after': '0',
  },

  query_config_after_migration:: {
    'querier.ingester-streaming': 'true',
    'querier.query-ingesters-within': '13h',  // TSDB ingesters have data for up to 4d.
    'querier.query-store-after': '12h',  // Can be enabled once blocks ingesters are running for 12h.

    // Switch TSDB and chunks. TSDB is "primary" now so that we can skip querying chunks for old queries.
    // We can do this, because querier/ruler have both configurations.
    'store.engine': 'blocks',
    'querier.second-store-engine': 'chunks',

    'querier.use-second-store-before-time': '2020-07-28T17:00:00Z',  // If migration from chunks finished around 18:10 CEST, no need to query chunk store for queries before this time.
  },

  querier_args+:: self.tsdb_config.blocks_metadata_caching_config + self.query_config_during_migration,  // + self.query_config_after_migration,
  ruler_args+:: self.tsdb_config.blocks_metadata_caching_config + self.query_config_during_migration,  // + self.query_config_after_migration,
}
```
   280  ```