---
type: proposal
title: Delete series for object storage
status: approved
owner: bwplotka, Harshitha1234, metalmatze
menu: proposals-accepted
---

### Ticket: https://github.com/thanos-io/thanos/issues/1598

## Summary

This design document proposes series deletion for object storage in Thanos. The feature mainly changes the Store component: it masks the series in object storage that match a user's deletion request. It does not trigger actual deletions in the object storage; that is proposed as future work extending this proposal.

## Motivation

The main motivations for considering deletions in the object storage are the following use cases:

* **Accidental insertion of confidential data:** A user accidentally inserts confidential data and wants to delete it immediately. In this case, the user expects their request to trigger an immediate deletion of the matching series from the blocks containing that data.
* **GDPR:** Masking of data and its eventual deletion is expected.
* **Deletions to sustain user requirements:** Let’s assume the user has some data which leads to unexpected results or causes performance degradation (e.g. due to high cardinality), and the user wants to restore the previous data set-up to obtain the desired results. In this scenario the user would want to mask the data for the time being, as there isn’t a high-priority requirement to delete it; eventually, during compaction, the user can expect the data to be deleted by the compactor without any major performance issues.
* **Helps achieve series-based retention (e.g. rule aggregation) [#903](https://github.com/thanos-io/thanos/issues/903).**

## Goals

* Unblock users and allow series deletion in the object storage using tombstones.
* Perform deletions at the admin level.

## Proposed Approach

* We propose to implement deletions via the tombstones approach using a CLI tool.
* A tombstone is proposed to be a **Custom format global - single file per request**. An example tombstone file looks as follows:

```
{
    "matchers":     "up{source=\"prometheus\"}",
    "minTime":      -62167219200000,
    "maxTime":      253402300799000,
    "creationTime": 1598367375935,
    "author":       "John Gabriel",
    "reason":       "not specified"
}
```

* **Why Custom format global - single file per request?**:
  * Since global tombstones are immutable, concurrency of deletion requests is not an issue (each request creates a new object). The compactor loads all tombstones upfront (before compacting), performs the compaction, makes the necessary changes to the blocks, and then deletes the tombstones once done. If new tombstones are created in the meanwhile, that is not a problem: the next compaction run will take them into account.
  * We can easily have multiple writers.
  * There is no need to read the index data of the blocks.
* A user is expected to enter the following details to perform a deletion:
  * **label matchers**
  * **start timestamp**
  * **end timestamp** (start and end timestamps of the series data the user expects to be deleted)
  * **creation timestamp**
  * **author name**
  * **reason for deletion**
* The entered details are processed by the CLI tool to create a tombstone file (unique per request, created irrespective of whether matching series exist), and the file is uploaded to the object storage, making it accessible to all components.
* **Filename optimization:** The filename is derived from the hash of the matchers, minTime and maxTime. This lets an identical future request re-write the existing tombstone, avoiding duplication of the same request. (NOTE: different requests whose deletions overlap still create different tombstones.) See the sketch after this list.
* Store Gateway masks the series while processing the global tombstone files from the object storage. At the chunk level, whenever the data matches at least one of the tombstones, we skip the chunk, potentially masking it entirely.
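
To make the write path concrete, here is a minimal Go sketch of what the CLI tool could do: marshal the tombstone shown above and derive a deterministic object name from a hash of the matchers, minTime and maxTime. All names here (`Tombstone`, `Filename`, the `tombstones/` prefix) are illustrative assumptions, not the final API.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"time"
)

// Tombstone mirrors the custom global tombstone format shown above.
type Tombstone struct {
	Matchers     string `json:"matchers"`
	MinTime      int64  `json:"minTime"`
	MaxTime      int64  `json:"maxTime"`
	CreationTime int64  `json:"creationTime"`
	Author       string `json:"author"`
	Reason       string `json:"reason"`
}

// Filename derives a deterministic object name from the request identity
// (matchers, minTime, maxTime). Re-issuing an identical request therefore
// re-writes the existing tombstone instead of creating a duplicate; fields
// like creationTime and author live only in the body and do not affect it.
func (t Tombstone) Filename() string {
	h := sha256.Sum256([]byte(fmt.Sprintf("%s:%d:%d", t.Matchers, t.MinTime, t.MaxTime)))
	return "tombstones/" + hex.EncodeToString(h[:]) + ".json"
}

func main() {
	t := Tombstone{
		Matchers:     `up{source="prometheus"}`,
		MinTime:      -62167219200000, // default: beginning of time
		MaxTime:      253402300799000, // default: end of time
		CreationTime: time.Now().UnixMilli(),
		Author:       "John Gabriel",
		Reason:       "not specified",
	}
	body, err := json.MarshalIndent(t, "", "    ")
	if err != nil {
		panic(err)
	}
	// The real tool would upload body to the bucket under t.Filename(),
	// making the tombstone visible to the Store Gateway and the Compactor.
	fmt.Printf("upload %s:\n%s\n", t.Filename(), body)
}
```

Because the object name depends only on the request identity, re-running the same request is idempotent: it overwrites the same object rather than accumulating duplicates.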

## Considerations

* When either of the timestamps, i.e. the start timestamp, the end timestamp, or both, is left unspecified, we fall back to default values (the example above uses the widest representable time range).
* We don’t want to add this feature to the sidecar. The sidecar is expected to be kept lightweight.

## Alternatives

#### Where the deletion API sits

1. **Compactor:**
* Pros: Easiest, as it is the single writer in Thanos (though not in Cortex)
* Cons: Does not have quick access to querying or to block indexes
2. **Store:**
* Pros: Does have quick access to querying and to block indexes
* Cons: There would be many writers, and the Store was never meant to write anything; it only needs read access

#### How we store the tombstone and in what format

1. Prometheus format, per block
2. Prometheus format, global
3. Custom format, per block
4. Custom format, global - appendable file

Reasons for not choosing tombstone formats 1 & 2, i.e. the **Prometheus format**:
* Creating a tombstone in this format requires the series ID. To find the series ID we would need to pull the index of every block, which will never scale.
* In Prometheus this is a mutable file, which is not possible in the Thanos object store.

Reason for not choosing 3, i.e. **custom format, per block**:
* The number of tombstone files becomes a function of the number of blocks, which hinders scalability and adds complexity.

Reason for not choosing 4, i.e. **custom format, global - appendable file**:
* Read-after-write consistency cannot be guaranteed.

## Action Plan

* Add a CLI tool which creates tombstones.
* Store Gateway should be able to mask series based on the tombstones from object storage (a minimal sketch follows below).
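
For the second item, here is a minimal sketch of the chunk-level masking described in the proposed approach. The type and function names (`tombstone`, `maskChunk`) are assumptions for illustration, and parsing the `matchers` string into `labels.Matcher` values (e.g. via the PromQL parser) is omitted; a chunk is skipped when its time range overlaps a tombstone whose matchers all match the series labels.

```go
package main

import (
	"fmt"

	"github.com/prometheus/prometheus/model/labels"
)

// tombstone is a simplified, already-parsed view of one tombstone file.
type tombstone struct {
	matchers         []*labels.Matcher
	minTime, maxTime int64 // millisecond timestamps, inclusive
}

// matches reports whether all of the tombstone's matchers accept the series.
func (t tombstone) matches(lset labels.Labels) bool {
	for _, m := range t.matchers {
		if !m.Matches(lset.Get(m.Name)) {
			return false
		}
	}
	return true
}

// maskChunk reports whether a chunk spanning [chkMin, chkMax] of a series
// with labels lset should be skipped because some tombstone covers it.
func maskChunk(lset labels.Labels, chkMin, chkMax int64, tombstones []tombstone) bool {
	for _, t := range tombstones {
		// Skip the chunk when its time range overlaps the tombstone's
		// range and the series labels satisfy all of its matchers.
		if chkMin <= t.maxTime && chkMax >= t.minTime && t.matches(lset) {
			return true
		}
	}
	return false
}

func main() {
	ts := []tombstone{{
		matchers: []*labels.Matcher{
			labels.MustNewMatcher(labels.MatchEqual, "source", "prometheus"),
		},
		minTime: 0,
		maxTime: 1000,
	}}
	series := labels.FromStrings("__name__", "up", "source", "prometheus")
	fmt.Println(maskChunk(series, 100, 200, ts)) // true: overlapping and matching
}
```

Note that in this sketch a chunk that only partially overlaps the deleted range is still masked entirely; rewriting chunks to drop individual samples is left to the compactor as future work.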

## Challenges

* How to "compact" / remove **Custom format global - single file per request** files once they have been applied.
* Do we want those deletions to be applied only during compaction, or also to already-compacted blocks? If yes, how? Only once some share of a block, say 5%, is affected? And how do we tell when that 5% is reached? What component would do that?
* Any rate limiting? What if there are too many files? Would that scale?

## Future Work

* Performing actual deletions in the object storage, with the compactor rewriting the blocks based on the tombstones.
* A max waiting duration for performing deletions, where a default value is used if not explicitly specified by the user. (Only after this time passes are the deletions performed by the compactor in the next compaction cycle.)
* An ability to undo deletions; there are two proposed ways:
  * An API to undelete a time series - maybe delete the whole tombstone file?
  * An “imaginary” deletion that can delete other tombstones