# Uncommitted Garbage Collection - Milestone 2 Execution Plan

Uncommitted Garbage Collection [Proposal](https://github.com/treeverse/lakeFS/blob/master/design/accepted/gc_plus/uncommitted-gc.md)

## Milestone 1

The first beta version that was released included:
1. Implementation of the clean run flow for old and new repository structures (without optimizations)
2. Mark & Sweep
3. Integration tests
4. Backup & Restore - minimal support using rclone

## Milestone 2

### Goals
1. Removing the limitation of a read-only lakeFS during the GC+ job run

### Non-Goals
1. Performance improvements:
   * [Optimized listing on old repository structure](https://github.com/treeverse/lakeFS/issues/4620)
   * [Efficient listing on committed entries](https://github.com/treeverse/lakeFS/issues/4600)
   * Benchmarks - verify uncommitted GC [performance requirements](https://github.com/treeverse/lakeFS/blob/e316cafe7717bb3203e4018837a41415aa61f74b/design/accepted/gc_plus/uncommitted-gc.md?plain=1#L185) are kept
2. [Implement optimized run flow](https://github.com/treeverse/lakeFS/issues/4489)
3. Support for non-S3 repositories
   * Azure
   * GCP
4. Incorporation of committed & uncommitted GC into a single job
   * Including GC changes, configuration, and behavior changes to fit GC
5. Metrics and Logging additions
6. Deployment to lakeFS Cloud
7. Improve Backup & Restore

### Plan

\* marks dependency

1. Required changes by lakeFS (including breaking changes):
   * [[Get/Link]PhysicalAddress](https://github.com/treeverse/lakeFS/issues/4476)
   * [Validation of cutoff time](https://github.com/treeverse/lakeFS/issues/4695)
   * [StageObject API](https://github.com/treeverse/lakeFS/issues/4480)
   * [CopyObject API](https://github.com/treeverse/lakeFS/issues/4477)
   * [S3 Gateway CopyObject](https://github.com/treeverse/lakeFS/issues/4478)
   * [lakeFSFS renameObject method](https://github.com/treeverse/lakeFS/issues/4479)
   * [Track copied objects in ref-store](https://github.com/treeverse/lakeFS/issues/4562)

2. [Integration tests](https://github.com/treeverse/lakeFS/issues/4830) to verify lakeFS is safe while GC is running

By the end of this milestone, we will release a lakeFS version that includes all the additions.

**Due date: 15/01/2023**