---
title: Migrating to 1.0
parent: The lakeFS Project
description: Code migration guide detailing API and SDK upgrades, deprecated and new API operations, and SDK migration processes for both Java/JVM and Python with code refactoring examples for improved stability and compatibility.

---

# lakeFS 1.0 - Code Migration Guide

Version 1.0.0 promises API and SDK stability. By "API" we mean any access to a lakeFS REST endpoint. By "SDK" we mean the auto-generated lakeFS clients: `lakefs-sdk` for Python and `io.lakefs:sdk` for Java. This guide describes how to upgrade your code to benefit from that stability.

Avoid using APIs and SDKs labeled as `experimental`, `internal`, or `deprecated`. If you must use them, be prepared to adjust your application whenever the lakeFS server is updated.

Software that avoids such APIs should remain compatible with all minor version updates of the lakeFS server that follow the version you originally developed against.

Publicly released APIs and SDKs adhere to semantic versioning, so moving your application to a newer minor SDK version should be smooth.
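
As a rough illustration of the compatibility expectation described above, the following sketch checks whether a given server version is one that code developed against a baseline version would be expected to work with. The function names and the parsing rule are assumptions made for this example; they are not part of lakeFS or its SDKs.

```python
import re
from typing import Tuple

# Matches "MAJOR.MINOR.PATCH" with an optional leading "v"; anything after
# the patch number (pre-release or build suffix) is ignored by this sketch.
_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)")


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse a version string into a (major, minor, patch) tuple."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def is_expected_compatible(developed_against: str, server: str) -> bool:
    """Return True if `server` falls under the stability promise for code
    written against `developed_against` (hypothetical helper)."""
    dev = parse_version(developed_against)
    srv = parse_version(server)
    # The promise starts at 1.0.0 and holds within one major version, for
    # servers at or after the version the code was developed against.
    return dev[0] >= 1 and srv[0] == dev[0] and srv >= dev
```

Under this rule, code developed against `1.0.0` is expected to work with a `1.24.1` server, while a server on a different major version, or one older than the baseline, is not covered.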

The operation names and tags used below are taken from the [`api/swagger.yml`](https://github.com/treeverse/lakeFS/blob/7d9feeb0211a637e2b8a63abaa629efc968d7c9e/api/swagger.yml) specification. The names exposed to your code might differ depending on the SDK or programming language in use.

### Deleted API Operations

The following API operations have been removed:

- `updatePassword`
- `forgotPassword`
- `logBranchCommits`
- `expandTemplate`
- `createMetaRange`
- `ingestRange`
- `updateBranchToken`

### Internal API Operations

The following operations are for `internal` use only and should not be used in your application code. Some deprecated operations have alternatives provided.

- `setupCommPrefs`
- `getSetupState`
- `setup`
- `getAuthCapabilities`
- `uploadObjectPreflight`
- `setGarbageCollectionRulesPreflight`
- `createBranchProtectionRulePreflight`
- `postStatsEvents`
- `dumpRefs` (will be replaced with a long-running API later)
- `restoreRefs` (will be replaced with a long-running API later)
- `createSymlinkFile` (Deprecated)
- `getStorageConfig` (Deprecated. Alternative: `getConfig`)
- `getLakeFSVersion` (Deprecated. Alternative: `getConfig`)
- `stageObject` (Deprecated. Alternatives: `get/link physical address` or `import`)
- `internalDeleteBranchProtectionRule` (Deprecated. Temporary backward support. Alternative: `setBranchProtectionRules`)
- `internalCreateBranchProtectionRule` (Deprecated. Temporary backward support. Alternative: `setBranchProtectionRules`)
- `internalGetBranchProtectionRule` (Deprecated. Temporary backward support. Alternative: `getBranchProtectionRules`)
- `internalDeleteGarbageCollectionRules` (Deprecated. Temporary backward support. Alternative: `deleteGCRules`)
- `internalSetGarbageCollectionRules` (Deprecated. Temporary backward support. Alternative: `setGCRules`)
- `internalGetGarbageCollectionRules` (Deprecated. Temporary backward support. Alternative: `getGCRules`)
- `prepareGarbageCollectionCommits`
- `getGarbageCollectionConfig`

### New/Updated API Operations

Here are the newly added or updated operations:

- `getConfig` (Retrieve lakeFS version and storage info)
- `setBranchProtectionRules` (Route updated)
- `getBranchProtectionRules` (Route updated)
- `getGCRules` (New route introduced)
- `setGCRules` (New route introduced)
- `deleteGCRules` (New route introduced)
- `importStatus` (Response structure updated: 'ImportStatusResp' to 'ImportStatus')
- `uploadObject` (Parameters 'if-none-match' and 'storageClass' are now deprecated)
- `prepareGarbageCollectionCommits` (Request body removed)
- `getOtfDiffs` & `otfDiff` (Removed from 'otf diff' tag; retained in 'experimental' tag)

## Migrating SDK Code for Java and JVM-based Languages

### Introduction

If you are using the lakeFS client for Java or any other JVM-based language, be aware that the older `io.lakefs:lakefs-client` package is not stable with respect to minor version upgrades. Transitioning from `io.lakefs:lakefs-client` to `io.lakefs:sdk` requires rewriting your API calls to fit the new design.

### Problem with the Old Style

Previously, API calls required developers to pass all parameters, including optional ones, in a single function call, as in this older-style example:

```java
ObjectStats objectStat = objectsApi.statObject(
    objectLoc.getRepository(), objectLoc.getRef(), objectLoc.getPath(),
    false, false);
```

This method posed a couple of challenges:

1. **Inflexibility with Upgrades:** If an optional parameter were introduced in newer versions, existing code would fail to compile.
2. **Maintenance Difficulty:** Long argument lists can be challenging to manage and understand, leading to potential mistakes and readability issues.

### Adopting the Fluent Style

In the revised SDK, API calls adopt a fluent style, making the code more modular and adaptive to changes.

Here's an example of the new style:

```java
ObjectStats objectStat = objectsApi
    .statObject(
        objectLoc.getRepository(), objectLoc.getRef(), objectLoc.getPath()
    )
    .userMetadata(true)
    .execute();
```

### Here's a breakdown of the changes:

1. **Initial Function Call:** Begin by invoking the desired function with all required parameters.
2. **Modifying Optional Parameters:** Chain any modifications to optional parameters after the initial function. For instance, `userMetadata` is changed in the example above.
3. **Unused Optional Parameters:** You can safely ignore these. For instance, this code ignores the `presign` optional parameter because it never uses it.
4. **Execution:** Complete the call with the `.execute()` method.

This new design offers several advantages:

- **Compatibility with Upgrades:** When a new optional parameter is introduced, existing code will use its default value, preserving compatibility with minor server version upgrades.
- **Improved Readability:** The fluent style makes it evident which parameters are required and which ones are optional, enhancing code clarity.
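
The shape of such a request object can be sketched independently of the generated Java classes. The following Python illustration uses a hypothetical `StatObjectRequest` class (not part of any lakeFS SDK), with assumed defaults of `False` for both optional parameters, to show why callers that never mention an optional parameter are unaffected when another one is added.

```python
class StatObjectRequest:
    """Hypothetical stand-in for a generated request object (not the real SDK)."""

    def __init__(self, repository, ref, path):
        # Required parameters are fixed when the request is created.
        self._required = {"repository": repository, "ref": ref, "path": path}
        # Optional parameters start at their (assumed) default values.
        self._optional = {"user_metadata": False, "presign": False}

    def user_metadata(self, value):
        self._optional["user_metadata"] = value
        return self  # returning self is what makes chaining possible

    def presign(self, value):
        self._optional["presign"] = value
        return self

    def execute(self):
        # A real client would send an HTTP request here; this sketch only
        # returns the parameters that would be sent.
        return {**self._required, **self._optional}


# The caller names only the optional parameter it cares about.
params = StatObjectRequest("repo", "main", "a/b.txt").user_metadata(True).execute()
```

Adding a third optional setter to this class would leave the call above unchanged, which is the property the fluent style relies on.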

When migrating your code, ensure you refactor all your API calls to adopt the new fluent style. This ensures that your application remains maintainable and is safeguarded against potential issues arising from minor SDK version upgrades.

For an illustrative example of the transition between styles, you can view the changes made in this pull request: [lakeFS pull request #6529](https://github.com/treeverse/lakeFS/pull/6529/files#diff-4c50b9ac3bf6bfc05e3b6ff0fbe2fd3214f31afb5b449732d90efe5f97f67167R666).

## Migrating SDK Code for Python

### Introduction

If you are using the lakeFS client for Python, be aware that the older `lakefs-client` package is not stable with respect to minor version upgrades. Transitioning from `lakefs-client` to `lakefs-sdk` requires rewriting your API calls.

### Here's a breakdown of the changes:

1. **Modules change**
   - The previous `model` module was renamed to `models`, so `lakefs_client.model` imports should be replaced with `lakefs_sdk.models` imports.
   - The `apis` module in `lakefs_client` is deprecated and no longer supported. To migrate to the new `api` module in `lakefs_sdk`, replace all imports of `lakefs_client.apis` with imports of `lakefs_sdk.api`. We still recommend using the `lakefs_sdk.LakeFSClient` class instead of using the `api` module directly. The `LakeFSClient` class provides a higher-level interface to the lakeFS API and makes it easier to use lakeFS in your applications.
2. **`upload_object` API call:** The `content` parameter passed to `objects_api.upload_object` should be either a `str` containing the path of the file to upload, or `bytes` holding the data to upload.
3. **`get_object` API call:** The return value of `client.objects_api.get_object(...)` is a `bytearray` containing the content of the object.
4. **`client.{operation}_api`:** The deprecated `client.{operation}` accessors of the `lakefs-client` package's `LakeFSClient` class are not available in the `lakefs-sdk` package's `LakeFSClient` class. Use `client.{operation}_api` instead.
5. **Minimum Python Version:** 3.7
6. **Fetching results from response objects:** Response objects now hold domain-specific entities rather than dictionaries. Instead of reading a property with `response_result.get_property(prop_name)`, access it as an attribute: `response_result.prop_name`. For example, instead of:

```python
response = lakefs_client.branches.diff_branch(repository='repo', branch='main')
diff = response.results[0] # 'response' is a 'DiffList' object
path = diff.get_property('path') # 'diff' is a dictionary
```

You should use:

```python
response = lakefs_client.branches_api.diff_branch(repository='repo', branch='main')
diff = response.results[0] # 'response' is a 'DiffList' object
path = diff.path # 'diff' is a 'Diff' object
```
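
The module renames in item 1 can be applied mechanically to import statements. The helper below is a minimal sketch, not an official migration tool: `rewrite_imports` is a hypothetical name, and the final catch-all rule that maps the bare `lakefs_client` package name to `lakefs_sdk` is an assumption rather than something this guide states. It only touches lines that start with `import` or `from`, so call sites still need the manual changes described in items 2 to 6.

```python
import re

# Rewrite rules, applied in order. The first two come from the module
# changes listed in this guide; the last one is an assumption of this sketch.
_RULES = [
    (re.compile(r"\blakefs_client\.model\b"), "lakefs_sdk.models"),
    (re.compile(r"\blakefs_client\.apis\b"), "lakefs_sdk.api"),
    (re.compile(r"\blakefs_client\b"), "lakefs_sdk"),
]


def rewrite_imports(source: str) -> str:
    """Return `source` with old module paths replaced in import lines only."""
    out_lines = []
    for line in source.splitlines(keepends=True):
        stripped = line.lstrip()
        # Leave non-import code untouched so it can be reviewed by hand.
        if stripped.startswith(("import ", "from ")):
            for pattern, replacement in _RULES:
                line = pattern.sub(replacement, line)
        out_lines.append(line)
    return "".join(out_lines)


old_source = (
    "from lakefs_client.model.commit import Commit\n"
    "from lakefs_client.apis import BranchesApi\n"
    "x = lakefs_client.branches\n"
)
new_source = rewrite_imports(old_source)
```

In this run the two import lines are rewritten, while the third line is left alone because it is not an import statement. Names that happen to match, such as a variable called `lakefs_client`, are therefore not changed by accident.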