---
title: "Retention of Tenant Data from Blocks Storage"
linkTitle: "Retention of Tenant Data from Blocks Storage"
weight: 1
slug: tenant-retention
---

- Author: [Allenzhli](https://github.com/Allenzhli)
- Date: January 2021
- Status: Proposed

## Retention of tenant data

## Problem

Per-tenant metric data grows over time, while the value of older data decreases. We want a retention policy similar to the one Prometheus provides. In Cortex, data retention is typically achieved via a bucket policy. However, this has two main issues:

1. Not every storage backend supports bucket policies
2. Bucket policies don't easily allow a custom per-tenant retention

## Background

### tenants
When using blocks storage, Cortex stores each tenant’s data as blocks in the object store for long-term storage, with the tenant ID as part of the object store path. We discover all tenants by scanning the root directory of the bucket.

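
As a rough illustration of the discovery step, the sketch below derives tenant IDs from a list of object keys by taking the first path segment. It assumes a `<tenant ID>/<block ID>/...` key layout and works on an in-memory list of keys rather than a real bucket client; `tenantsFromKeys` and the sample keys are hypothetical names used only for this example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// tenantsFromKeys returns the sorted, de-duplicated first path segments of
// the given object keys. Hypothetical helper: it assumes every tenant's
// objects live under a "<tenant ID>/" prefix at the bucket root.
func tenantsFromKeys(keys []string) []string {
	seen := map[string]struct{}{}
	for _, key := range keys {
		idx := strings.IndexByte(key, '/')
		if idx <= 0 {
			continue // object at the bucket root, not under a tenant prefix
		}
		seen[key[:idx]] = struct{}{}
	}

	tenants := make([]string, 0, len(seen))
	for tenant := range seen {
		tenants = append(tenants, tenant)
	}
	sort.Strings(tenants)
	return tenants
}

func main() {
	// Sample keys, invented for illustration.
	keys := []string{
		"tenant-a/block-1/meta.json",
		"tenant-b/block-2/meta.json",
		"tenant-a/block-3/index",
	}
	fmt.Println(tenantsFromKeys(keys))
}
```

A real implementation would list only the top-level prefixes of the bucket instead of walking every object key.
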
### runtime config
The "overrides" mechanism (part of the runtime config) already allows per-tenant settings. See [runtime-configuration-file](https://cortexmetrics.io/docs/configuration/arguments/#runtime-configuration-file) for more details. Using it for tenant retention would fit nicely: an admin could set a per-tenant retention there, and also set a single global value for tenants that don't have a custom value.

## Proposal

### retention period field

We propose to introduce just one new field, `RetentionPeriod`, in the `Limits` struct (defined in `pkg/util/validation/limits.go`).

`RetentionPeriod` sets how long a tenant's historical metric data is retained. A value of `0` disables retention, so no data is deleted.

The runtime config is reloaded periodically (every 10 seconds by default), so retention settings can be updated on-the-fly.

For each tenant, if a tenant-specific *runtime_config* value exists, it is used directly. Otherwise, if a default *limits_config* value exists, the default value is used. If neither exists, nothing is done.

### Implementation

A `BlocksCleaner` within the Compactor runs periodically (every 15 minutes by default), and the retention logic will be inserted into it. The logic should compare the retention value to each block's `maxTime`: blocks that match `maxTime < now - retention` will be marked for deletion.

Block deletion is not immediate, but follows a two-step process. See [soft-and-hard-blocks-deletion](https://cortexmetrics.io/docs/blocks-storage/compactor/#soft-and-hard-blocks-deletion).