---
title: Airbyte
description: Use Airbyte with lakeFS to easily sync data between applications and S3 with lakeFS version control.
parent: Integrations
---

# Airbyte

[Airbyte](https://airbyte.io/) is an open-source platform for syncing data from applications, APIs, and databases to
warehouses, lakes, and other destinations. You can use Airbyte's connectors to build data pipelines that consolidate
many input sources.

The integration between Airbyte and lakeFS brings resilience and manageability when you use Airbyte
connectors to sync data to your S3 buckets, by leveraging lakeFS branches and atomic commits and merges.

## Use cases

You can take advantage of lakeFS consistency guarantees and [Data Lifecycle Management]({% link understand/data_lifecycle_management/index.md %}) when ingesting data to S3 using lakeFS:

1. Consolidate many data sources on a single branch and expose them to consumers simultaneously by merging to the `main` branch.
1. Test incoming data for breaking schema changes using [lakeFS hooks][data-quality-gates].
1. Prevent consumers from reading partial data from connectors that failed halfway through a sync.
1. Experiment with ingested data on a branch before exposing it.

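The use cases above all rely on syncing to a branch other than `main` and merging only once the data is complete. A minimal sketch of that flow follows, assuming the `lakectl` CLI is installed and configured. The helper `ingest_plan` and the names `example-repo` and `airbyte-ingest` are hypothetical placeholders, and the function only builds the commands; it does not run them.

```python
import shlex


def ingest_plan(repo, ingest_branch, target_branch="main", message="Airbyte sync"):
    """Build lakectl commands for an isolated ingest (hypothetical helper).

    Nothing is executed here; the function only returns argv lists.
    """
    ingest_uri = f"lakefs://{repo}/{ingest_branch}"
    target_uri = f"lakefs://{repo}/{target_branch}"
    return {
        # Run once, before pointing the Airbyte destination at the ingest branch.
        "before_sync": [
            ["lakectl", "branch", "create", ingest_uri, "--source", target_uri],
        ],
        # Run only after every connector has finished successfully.
        "after_sync": [
            ["lakectl", "commit", ingest_uri, "-m", message],
            ["lakectl", "merge", ingest_uri, target_uri],
        ],
    }


if __name__ == "__main__":
    plan = ingest_plan("example-repo", "airbyte-ingest")
    for stage, commands in plan.items():
        print(f"# {stage}")
        for argv in commands:
            print(shlex.join(argv))
```

If a connector fails partway through, skipping the `after_sync` commands leaves `main` unchanged, so consumers never see the partial data.
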
## S3 Connector

lakeFS exposes an [S3 Gateway][s3-gateway] that enables applications to communicate
with lakeFS the same way they would with Amazon S3.
You can use Airbyte's [S3 Connector](https://airbyte.com/connectors/s3) to upload data to lakeFS.

{: .warning-title}
> Note
>
> If using Airbyte OSS, please ensure you are using S3 destination connector version [0.3.17 or higher](https://docs.airbyte.com/integrations/destinations/s3#changelog).
> Previous connector versions are not supported.

### Configuring lakeFS using the connector

Set the following parameters when creating a new Destination of type S3:

| Name             | Value                                                        | Example                                                                                                         |
|------------------|--------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------|
| Endpoint         | The lakeFS S3 gateway URL                                    | `https://cute-axolotol.lakefs-demo.io`                                                                          |
| S3 Bucket Name   | The lakeFS repository where the data will be written         | `example-repo`                                                                                                  |
| S3 Bucket Path   | The branch and the path where the data will be written       | `main/data/from/airbyte`, where `main` is the branch name and `data/from/airbyte` is the path under the branch. |
| S3 Bucket Region | Not applicable to lakeFS, use `us-east-1`                    | `us-east-1`                                                                                                     |
| S3 Key ID        | The lakeFS access key ID used to authenticate to lakeFS      | `AKIAlakefs12345EXAMPLE`                                                                                        |
| S3 Access Key    | The lakeFS secret access key used to authenticate to lakeFS  | `abc/lakefs/1234567bPxRfiCYEXAMPLEKEY`                                                                          |

The UI configuration will look as follows:

![S3 Destination Connector Configuration]({{ site.baseurl }}/assets/img/airbyte.png)

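The mapping from lakeFS concepts to the S3 destination fields can be sketched in a few lines of Python. This is an illustration only: the helper `lakefs_destination_settings` is hypothetical, its dictionary keys mirror the UI labels in the table above rather than any Airbyte API field names, and the rule that a branch name contains no `/` is an assumption made so that the first path segment is unambiguously the branch.

```python
def lakefs_destination_settings(endpoint, repository, branch, path, key_id, secret_key):
    """Assemble S3 destination values for a lakeFS repository (hypothetical helper).

    Keys mirror the UI labels in the table above, not an Airbyte API schema.
    """
    if not endpoint.startswith(("http://", "https://")):
        raise ValueError("endpoint must be a full URL, e.g. https://host")
    # Assumption: the branch is a single path segment, so it may not contain '/'.
    if not branch or "/" in branch:
        raise ValueError("branch must be a non-empty name without '/'")

    # The bucket path is '<branch>/<path under the branch>'.
    clean_path = path.strip("/")
    bucket_path = f"{branch}/{clean_path}" if clean_path else branch

    return {
        "Endpoint": endpoint,
        "S3 Bucket Name": repository,      # the lakeFS repository
        "S3 Bucket Path": bucket_path,     # branch, then path under the branch
        "S3 Bucket Region": "us-east-1",   # not applicable to lakeFS; fixed value
        "S3 Key ID": key_id,
        "S3 Access Key": secret_key,
    }


if __name__ == "__main__":
    settings = lakefs_destination_settings(
        endpoint="https://cute-axolotol.lakefs-demo.io",
        repository="example-repo",
        branch="main",
        path="data/from/airbyte",
        key_id="AKIAlakefs12345EXAMPLE",
        secret_key="abc/lakefs/1234567bPxRfiCYEXAMPLEKEY",
    )
    for name, value in settings.items():
        print(f"{name}: {value}")
```

Switching the destination to another branch only changes the first segment of `S3 Bucket Path`; the repository, endpoint, and credentials stay the same.
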
[data-quality-gates]: {% link understand/use_cases/cicd_for_data.md %}#using-hooks-as-data-quality-gates
[s3-gateway]: {% link understand/architecture.md %}#s3-gateway