# Next Generation Metastore - Project


## Milestone 1 - lakeFS Metastore Proposal


- *High-level design*: Map the two ways to implement the lakeFS Metastore and select one. Understand each suggested design option and list its pros and cons. The potential architectures are:
  - Metastore as additional functionality inside lakeFS
  - Metastore as an external client to lakeFS

- *Data model*: Define the data model for Metastore entities. Consider metadata access patterns, typical Metastore operations, and the operations lakeFS will provide over metadata.
  Questions we would like to address:
  - Can (and should) we use Graveler to model metadata?
  - What will diff, merge, and commit operations look like?
  - How are we going to tie data versioning to metadata versioning?
  - What will conflict resolution look like, and will it be affected?
  - How can we enable import from, and export to, an existing Metastore?
  - How do we model the Metastore's statistics?

- *Communication with lakeFS*: Investigate the options for passing lakeFS references (repository/branch/ref) from Metastore clients to the lakeFS Metastore.
  Questions we would like to address:
  - How can the lakeFS Metastore get this information as a remote Hive Metastore? Are there any alternatives that avoid passing the data?
  - How can the lakeFS Metastore co-exist with other Metastores?

- *Metastore hooks*: What does it take to support hooks? Does anybody use them? Are they still relevant?

- *Authentication* with lakeFS - optional.

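To make the data-model questions on diff, merge, and conflict resolution more concrete, here is a minimal sketch of a three-way merge over metadata. It assumes, purely for illustration, that a branch's metadata can be flattened into a map from an entity key (such as `db.table`) to an opaque identity string (such as a schema hash). The `Metadata` type and the `diff` and `merge` functions are hypothetical names; this is not Graveler's implementation or any existing lakeFS API.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model (an assumption, not a lakeFS type): a branch's metadata
# is a flat map from an entity key to an opaque identity string.
Metadata = Dict[str, str]
Change = Tuple[Optional[str], Optional[str]]


def diff(base: Metadata, other: Metadata) -> Dict[str, Change]:
    """Return {key: (old, new)} for every key whose value differs."""
    changes: Dict[str, Change] = {}
    for key in base.keys() | other.keys():
        old, new = base.get(key), other.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


def merge(base: Metadata, ours: Metadata, theirs: Metadata) -> Tuple[Metadata, List[str]]:
    """Three-way merge. Returns (merged metadata, sorted conflicting keys).

    On a conflict the merged result keeps our side's state for that key.
    """
    merged = dict(ours)
    conflicts: List[str] = []
    ours_changes = diff(base, ours)
    for key, (_, new) in diff(base, theirs).items():
        if key in ours_changes and ours_changes[key][1] != new:
            conflicts.append(key)  # both sides changed the key differently
        elif new is None:
            merged.pop(key, None)  # entity was deleted on their side
        else:
            merged[key] = new  # entity was added or changed on their side
    return merged, sorted(conflicts)
```

In this sketch a conflict is reported only when both branches changed the same entity to different values. Whether metadata conflicts should also be tied to conflicts in the underlying data is exactly the kind of question the design document needs to answer.
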

Open items for the design document:

- Define the relationship between Metastore and lakeFS repositories: 1:1, 1:n, or m:n?

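One option worth evaluating for passing lakeFS references is to derive them from a table's storage location rather than sending them separately. The sketch below assumes that table locations stored in the Metastore are lakeFS URIs of the form `lakefs://<repository>/<ref>/<path>`; the `LakeFSRef` type and the `parse_table_location` function are hypothetical names used only for illustration, and this document does not commit to the approach.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class LakeFSRef(NamedTuple):
    """Hypothetical container for the parts of a lakeFS reference."""
    repository: str
    ref: str
    path: str


def parse_table_location(location: str) -> LakeFSRef:
    """Split a lakefs:// URI into repository, ref, and path.

    Assumes the location has the form lakefs://<repository>/<ref>/<path>.
    """
    parsed = urlparse(location)
    if parsed.scheme != "lakefs" or not parsed.netloc:
        raise ValueError(f"not a lakeFS URI: {location!r}")
    # The first path segment is the ref; the remainder is the object path.
    ref, _, path = parsed.path.lstrip("/").partition("/")
    if not ref:
        raise ValueError(f"missing ref in lakeFS URI: {location!r}")
    return LakeFSRef(parsed.netloc, ref, path)
```

For example, `parse_table_location("lakefs://example-repo/main/tables/orders")` yields the repository `example-repo`, the ref `main`, and the path `tables/orders`. Under this assumption a single Metastore can point at several repositories, which bears on the 1:1, 1:n, or m:n open item.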