
# Iceberg support: branching & isolation

## Problem Description

Iceberg stores metadata files that represent a snapshot of a given Iceberg table.
Those metadata files store the location of the data hard-coded.
This is problematic with lakeFS, since different branches cannot work on the same files.

For example, when working on branch `main`, the metadata files will be written with the property `"location" : "s3a://example-repo/main/db/table1"`.
This means the data is located in `"s3a://example-repo/main/db/table1"`.
However, when branching out from `main` to `feature-branch`, the metadata files still point to the main branch.
With this we lose isolation.


## Goals

Enable working in isolation with an Iceberg table: create a branch in lakeFS, and operate on the table on that branch.
In the first stage: support Iceberg without being catalog-agnostic, meaning work only with the lakeFS catalog, without compatibility with other catalogs.
In the second stage: become catalog-agnostic. Users will be able to configure the lakeFS catalog together with other catalogs.
- We will start by supporting a limited set of catalogs, for example Glue, Snowflake and Tabular.

## Non Goals

- Support merge operation.
- Enable lakeFS operations using a dedicated SQL syntax.


## Proposed Design: Dedicated lakeFS catalog

The catalog stores the current metadata pointer for Iceberg tables.
With lakeFS, this pointer needs to be versioned as well.
To avoid adding versioning capabilities to existing catalogs, we'll extend a catalog that can use lakeFS to keep the pointer.
Therefore, we'll extend the Hadoop catalog to work with lakeFS (see the sketch after this list):
- Every operation of the catalog happens within the scope of a reference in lakeFS. The reference will be extracted from the table path.
- Locations written to metadata files will be relative: the repo and branch will not be written.
- The catalog will know to use the metadata files from the relevant reference.

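For illustration only, here is a minimal sketch of the path handling described above: extracting the repository and reference from a table location and keeping only a relative path. The class and method names are hypothetical and are not the actual `io.lakefs.iceberg` implementation.

```java
// Hypothetical sketch: splitting a lakeFS table location into its parts so the
// catalog can scope operations to a reference and store only a relative path.
import java.net.URI;

public final class LakeFSTablePath {
    public final String repo;      // e.g. "example-repo"
    public final String ref;       // branch, tag or commit, e.g. "main"
    public final String relative;  // path kept in metadata, e.g. "db/table1"

    private LakeFSTablePath(String repo, String ref, String relative) {
        this.repo = repo;
        this.ref = ref;
        this.relative = relative;
    }

    // Parses a location such as "s3a://example-repo/main/db/table1".
    public static LakeFSTablePath parse(String location) {
        URI uri = URI.create(location);
        String repo = uri.getHost();
        String[] parts = uri.getPath().replaceFirst("^/", "").split("/", 2);
        if (repo == null || parts.length < 2) {
            throw new IllegalArgumentException("expected <scheme>://<repo>/<ref>/<path>: " + location);
        }
        return new LakeFSTablePath(repo, parts[0], parts[1]);
    }

    // Re-assembles an absolute location for a (possibly different) reference.
    public String absolute(String scheme, String otherRef) {
        return scheme + "://" + repo + "/" + otherRef + "/" + relative;
    }

    public static void main(String[] args) {
        LakeFSTablePath p = LakeFSTablePath.parse("s3a://example-repo/main/db/table1");
        System.out.println(p.repo + " / " + p.ref + " / " + p.relative);  // example-repo / main / db/table1
        System.out.println(p.absolute("s3a", "feature-branch"));          // s3a://example-repo/feature-branch/db/table1
    }
}
```

Because only `db/table1` is kept in the metadata, the same metadata files resolve correctly no matter which branch they are read from.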
Implement LakeFSFileIO:
[FileIO](https://iceberg.apache.org/javadoc/master/org/apache/iceberg/io/FileIO.html) is the primary interface between the core Iceberg library and underlying storage (read more [here](https://tabular.io/blog/iceberg-fileio/#:~:text=FileIO%20is%20the%20primary%20interface,how%20straightforward%20the%20interface%20is.)). In practice, it is just two operations: open for read and open for write.
We'll extend the Hadoop FileIO to work with lakeFS and to be compatible with the lakeFS catalog.
As with the catalog, the reference in lakeFS will be extracted from the table path.
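
To make the shape of this concrete, here is a rough sketch of a FileIO that resolves relative paths against a repository and reference before delegating to an underlying FileIO (for example, the Hadoop FileIO over S3AFileSystem). Everything except the Iceberg `FileIO` interface itself is an assumption, not the actual `LakeFSFileIO`.

```java
// Illustrative only: a delegate-based FileIO that anchors relative paths at a
// lakeFS repository and reference before handing them to an underlying FileIO.
import org.apache.iceberg.io.FileIO;
import org.apache.iceberg.io.InputFile;
import org.apache.iceberg.io.OutputFile;

public class RelativePathFileIO implements FileIO {
    private final FileIO delegate;  // performs the actual reads/writes
    private final String prefix;    // e.g. "s3a://example-repo/main/"

    public RelativePathFileIO(FileIO delegate, String repo, String ref) {
        this.delegate = delegate;
        this.prefix = "s3a://" + repo + "/" + ref + "/";
    }

    // Relative paths (as stored in metadata) are anchored at the current reference;
    // absolute paths are passed through untouched.
    private String resolve(String path) {
        return path.contains("://") ? path : prefix + path;
    }

    @Override
    public InputFile newInputFile(String path) {
        return delegate.newInputFile(resolve(path));
    }

    @Override
    public OutputFile newOutputFile(String path) {
        return delegate.newOutputFile(resolve(path));
    }

    @Override
    public void deleteFile(String path) {
        delegate.deleteFile(resolve(path));
    }
}
```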

This design will require users to have only a single writer per branch. This is because the Hadoop catalog uses a rename operation, which is not atomic in lakeFS.
We think it's a reasonable limitation to start with. We may be able to overcome this limitation in the future, using the `IfNoneMatch` flag in the lakeFS API.
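
To sketch how that could look: instead of renaming a temporary metadata file over the next version, a writer would create the next version only if it does not already exist. `ObjectStore.putIfAbsent` below is a hypothetical stand-in for a lakeFS upload carrying an `If-None-Match` precondition; it is not an existing API.

```java
// Hypothetical sketch of a conditional metadata commit: only one writer can
// create a given metadata version, so a lost race is detected instead of
// silently overwriting another writer's commit.
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConditionalCommit {
    // Stand-in for the branch's object store; a real implementation would call lakeFS.
    interface ObjectStore {
        boolean putIfAbsent(String path, byte[] data); // false => object already exists
    }

    // Try to publish metadata as version (currentVersion + 1).
    static boolean commit(ObjectStore store, String tablePath, int currentVersion, String newMetadataJson) {
        String target = tablePath + "/metadata/v" + (currentVersion + 1) + ".metadata.json";
        return store.putIfAbsent(target, newMetadataJson.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        Map<String, byte[]> fake = new ConcurrentHashMap<>();
        ObjectStore store = (path, data) -> fake.putIfAbsent(path, data) == null;

        System.out.println(commit(store, "db/table1", 3, "{...}")); // true: this writer wins
        System.out.println(commit(store, "db/table1", 3, "{...}")); // false: a concurrent writer got there first
    }
}
```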

## Alternatives Considered

### Implementation of lakeFS FileIO alone
To enable users to choose their preferred catalog, we considered implementing only a lakeFS FileIO.
As mentioned above, this would require the catalog to be able to version metadata pointers.
This would be very hard and wouldn't enable us to fail fast.


## Usage

Iceberg operations:

The user will need to set the catalog implementation to be the lakeFS catalog. For example, if working in Spark:
`spark.sql.catalog.lakefs.catalog-impl=io.lakefs.iceberg.LakeFSCatalog`
The user will also need to configure a Hadoop FileSystem that can interact with objects on lakeFS, like the S3AFileSystem or LakeFSFileSystem.
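
For example, a Spark session could be configured roughly as follows. This is a sketch only: the endpoint and credentials are placeholders, and registering the lakeFS catalog under the name `lakefs` through Iceberg's `SparkCatalog` is an assumption about the final packaging.

```java
// Illustrative Spark setup: the lakeFS Iceberg catalog plus an S3AFileSystem
// pointed at the lakeFS S3 gateway. Endpoint and credentials are placeholders.
import org.apache.spark.sql.SparkSession;

public class IcebergOnLakeFS {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("iceberg-on-lakefs")
                // Register a Spark catalog named "lakefs" backed by the lakeFS Iceberg catalog.
                .config("spark.sql.catalog.lakefs", "org.apache.iceberg.spark.SparkCatalog")
                .config("spark.sql.catalog.lakefs.catalog-impl", "io.lakefs.iceberg.LakeFSCatalog")
                // Route object reads and writes through S3AFileSystem against lakeFS.
                .config("spark.hadoop.fs.s3a.endpoint", "https://lakefs.example.com")
                .config("spark.hadoop.fs.s3a.access.key", "<lakeFS access key id>")
                .config("spark.hadoop.fs.s3a.secret.key", "<lakeFS secret access key>")
                .config("spark.hadoop.fs.s3a.path.style.access", "true")
                .getOrCreate();

        // Iceberg tables can now be addressed with the repo and reference in the
        // table path, as in the SQL example below.
    }
}
```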

Every operation on an Iceberg table (writing, reading, compaction, etc.) will be performed on the table in lakeFS.
The tables will be defined in Iceberg, meaning the user will need to specify the full path of the table, containing the repo and a reference in lakeFS.

For example:

`SELECT * FROM example-repo.main.table1;`


lakeFS:
lakeFS operations will stay as is.
Interacting with lakeFS will be done through lakeFS clients, not through Iceberg.
Branch, diff and commit operations should work out of the box (since the locations in the metadata files are relative).
Diff will not show changes made to the table (like the Delta diff does).
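
To illustrate the surrounding workflow, here is a sketch of branching and committing over the lakeFS HTTP API with plain `java.net.http`. The endpoint, credentials and request payloads are abbreviated placeholders; in practice they should be taken from the lakeFS API reference or, more simply, one of the lakeFS clients.

```java
// Sketch of the isolation workflow around Iceberg operations: branch out,
// run the Iceberg reads/writes on the branch, then commit on that branch.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public class BranchWorkflow {
    static final String API = "https://lakefs.example.com/api/v1";
    static final String AUTH = "Basic " + Base64.getEncoder()
            .encodeToString("<access-key>:<secret-key>".getBytes());

    static HttpResponse<String> post(HttpClient client, String path, String json) throws Exception {
        HttpRequest req = HttpRequest.newBuilder(URI.create(API + path))
                .header("Authorization", AUTH)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
        return client.send(req, HttpResponse.BodyHandlers.ofString());
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // 1. Branch out from main: the Iceberg table is now isolated on feature-branch.
        post(client, "/repositories/example-repo/branches",
                "{\"name\": \"feature-branch\", \"source\": \"main\"}");
        // 2. ...run Iceberg reads/writes against feature-branch via Spark...
        // 3. Commit the table changes on the branch.
        post(client, "/repositories/example-repo/branches/feature-branch/commits",
                "{\"message\": \"update table1 on feature-branch\"}");
    }
}
```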

## Packaging

We will publish this catalog to Maven Central.
To start using Iceberg with lakeFS, the user will need to add the lakeFS catalog as a dependency.

## Open Questions
- Can this catalog become part of the Iceberg source code?
- Migration: how to move existing tables to lakeFS?
- Iceberg diff: how to view table changes in lakeFS?
- Merge: how to allow merging between branches?
- Catalog agnosticism: how can we use the benefits of other catalogs with lakeFS?
- Add-files operation in Iceberg.