---
title: "Retention of Tenant Data from Blocks Storage"
linkTitle: "Retention of Tenant Data from Blocks Storage"
weight: 1
slug: tenant-retention
---

- Author: [Allenzhli](https://github.com/Allenzhli)
- Date: January 2021
- Status: Proposed

## Retention of tenant data

## Problem

Per-tenant metric data grows over time, while the value of older data decreases. We want a retention policy like the one Prometheus provides. In Cortex, data retention is typically achieved via a bucket policy. However, this has two main issues:

1. Not every storage backend supports bucket policies
2. Bucket policies don't easily allow a custom per-tenant retention

## Background

### tenants
When using blocks storage, Cortex stores each tenant's blocks in an object store for long-term storage, with the tenant ID as part of the object path. All tenants are discovered by scanning the root directory of the bucket.

### runtime config
The "overrides" mechanism (part of the runtime config) already allows per-tenant settings. See [runtime-configuration-file](https://cortexmetrics.io/docs/configuration/arguments/#runtime-configuration-file) for more details. Using it for tenant retention would fit nicely: an admin could set a per-tenant retention there, and also a single global value for tenants that don't have a custom value set.

## Proposal

### retention period field

We propose to introduce just one new field, `RetentionPeriod`, in the Limits struct (defined at pkg/util/validation/limits.go).

`RetentionPeriod` sets how long historical metric data is retained for a tenant. A value of `0` disables retention.

Runtime config is reloaded periodically (every 10 seconds by default), so the retention settings can be updated on the fly.

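
As a rough illustration of how the new field could sit alongside the overrides mechanism, here is a minimal Go sketch. It is not the actual Cortex code: the `Overrides` type, the `exampleOverrides` helper, and the `retention_period` YAML tag are hypothetical names chosen for this example, and only the `RetentionPeriod` field comes from the proposal.

```go
package main

import (
	"fmt"
	"time"
)

// Limits shows only the field this proposal adds.
// The yaml tag is a hypothetical name, not taken from the proposal.
type Limits struct {
	RetentionPeriod time.Duration `yaml:"retention_period"`
}

// Overrides is a simplified, hypothetical stand-in for the runtime-config
// overrides: one global default plus optional per-tenant limits.
type Overrides struct {
	defaults     Limits
	tenantLimits map[string]Limits
}

// RetentionPeriod returns the tenant-specific value when the tenant has an
// entry in the overrides, and the global default otherwise.
// A result of 0 means retention is disabled for that tenant.
func (o *Overrides) RetentionPeriod(tenantID string) time.Duration {
	if l, ok := o.tenantLimits[tenantID]; ok {
		return l.RetentionPeriod
	}
	return o.defaults.RetentionPeriod
}

// exampleOverrides builds sample data: a 30-day default and a 24-hour
// override for "tenant-a". The values are illustrative only.
func exampleOverrides() *Overrides {
	return &Overrides{
		defaults: Limits{RetentionPeriod: 30 * 24 * time.Hour},
		tenantLimits: map[string]Limits{
			"tenant-a": {RetentionPeriod: 24 * time.Hour},
		},
	}
}

func main() {
	o := exampleOverrides()
	for _, tenant := range []string{"tenant-a", "tenant-b"} {
		fmt.Printf("%s: %s\n", tenant, o.RetentionPeriod(tenant))
	}
}
```

Because the runtime config is reloaded periodically, a real implementation would read the value through the overrides on every cleanup run rather than caching it at startup.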

For each tenant, if a tenant-specific *runtime_config* value exists, it is used directly. Otherwise, if a default *limits_config* value exists, the default value is used. If neither exists, nothing is done.

### Implementation

A BlocksCleaner within the Compactor runs periodically (every 15 minutes by default), and the retention logic will be added to it. The logic should compare the retention value to each block's `maxTime`; blocks that match `maxTime < now - retention` will be marked for deletion.

Blocks deletion is not immediate, but follows a two-step process. See [soft-and-hard-blocks-deletion](https://cortexmetrics.io/docs/blocks-storage/compactor/#soft-and-hard-blocks-deletion).
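
The `maxTime < now - retention` check can be sketched in Go as follows. This is only an illustration under stated assumptions: `blockMeta`, `blocksPastRetention`, and `exampleExpired` are hypothetical names, and `MaxTime` is assumed to be in milliseconds since the Unix epoch, as in a TSDB block's `meta.json`. The sketch only selects candidate blocks; actually marking them for deletion would go through the existing soft-deletion path.

```go
package main

import (
	"fmt"
	"time"
)

// blockMeta is a hypothetical, trimmed-down view of a block's metadata.
// MaxTime is assumed to be milliseconds since the Unix epoch.
type blockMeta struct {
	ID      string
	MaxTime int64
}

// blocksPastRetention returns the IDs of blocks whose MaxTime is older than
// now - retention. A retention of 0 (or less) disables the check.
func blocksPastRetention(blocks []blockMeta, retention time.Duration, now time.Time) []string {
	if retention <= 0 {
		return nil
	}
	cutoff := now.Add(-retention).UnixMilli()
	var expired []string
	for _, b := range blocks {
		if b.MaxTime < cutoff {
			expired = append(expired, b.ID)
		}
	}
	return expired
}

// exampleExpired runs the check against two sample blocks with a fixed
// "now": one block ending 48h ago and one ending 1h ago.
func exampleExpired(retention time.Duration) []string {
	now := time.Date(2021, time.January, 31, 0, 0, 0, 0, time.UTC)
	blocks := []blockMeta{
		{ID: "old", MaxTime: now.Add(-48 * time.Hour).UnixMilli()},
		{ID: "recent", MaxTime: now.Add(-1 * time.Hour).UnixMilli()},
	}
	return blocksPastRetention(blocks, retention, now)
}

func main() {
	fmt.Println(exampleExpired(24 * time.Hour))
	fmt.Println(exampleExpired(0))
}
```

Comparing against `maxTime` rather than `minTime` means a block is only selected once all of its samples fall outside the retention window.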