---
title: Run lakeFS Enterprise
description: Start using lakeFS-enterprise
parent: lakeFS Enterprise
grand_parent: Understanding lakeFS
---

# Run lakeFS Enterprise

{% include toc.html %}

## Overview

The lakeFS Enterprise solution consists of two main components:

1. lakeFS - Open Source: [treeverse/lakeFS](https://hub.docker.com/r/treeverse/lakefs),
   release info found in [GitHub releases](https://github.com/treeverse/lakeFS/releases).
2. Fluffy - Proprietary: In charge of the Enterprise features. Can be retrieved from
   [Treeverse Docker Hub](https://hub.docker.com/u/treeverse) using the granted token.

### Prerequisites

1. A KV database, such as Postgres, configured and shared by Fluffy and lakeFS.
1. Access to configure an SSO IdP, such as an Azure AD Enterprise Application.
1. A proxy server configured to route traffic between the two servers.

There are several ways to run lakeFS Enterprise. You can follow the examples below,
or create your own setup using the resources available in our documentation.

## Using lakeFS Helm chart

With every new release of lakeFS or Fluffy, the lakeFS team releases a new
[lakeFS Helm chart](https://artifacthub.io/packages/helm/lakefs/lakefs). Together with the granted
Fluffy token, you can use it to run the full lakeFS Enterprise solution.

As an example, the following `values` file runs lakeFS Enterprise with OIDC integration.

```yaml
lakefsConfig: |
  logging:
    level: "INFO"
  blockstore:
    type: s3
  database:
    type: postgres
    postgres:
      connection_string: <postgres-connection-string>
  auth:
    oidc:
      # the claim returned by the OIDC provider (e.g. Okta) after successful authentication that is used as the username
      friendly_name_claim_name: "<some-oidc-provider-claim-name>"
      default_initial_groups: ["Developers"]
    ui_config:
      login_cookie_names:
        - internal_auth_session
        - oidc_auth_session
ingress:
  enabled: true
  ingressClassName: <class-name>
  hosts:
    # the ingress that will be created for lakeFS
    - host: <lakefs.ingress.domain>
      paths:
        - /

##################################################
########### lakeFS enterprise - FLUFFY ###########
##################################################

fluffy:
  enabled: true
  image:
    repository: treeverse/fluffy
    tag: '0.4.0'
    pullPolicy: IfNotPresent
    privateRegistry:
      enabled: true
      secretToken: <dockerhub-token-fluffy-image>
  fluffyConfig: |
    logging:
      format: "json"
      level: "INFO"
    database:
      type: postgres
      postgres:
        connection_string: <postgres-connection-string>
    auth:
      serve_listen_address: 0.0.0.0:9000
      logout_redirect_url: https://oidc-provider-url.com/logout/example
      oidc:
        enabled: true
        url: https://oidc-provider-url.com/
        client_id: <oidc-client-id>
        callback_base_url: https://<lakefs.ingress.domain>
        is_default_login: true
        # the name of the query parameter that carries the client identifier for the OIDC provider (e.g. Okta)
        logout_client_id_query_parameter: client_id
        # the query parameters that will be used to redirect the user to the OIDC provider (e.g. Okta) after logout
        logout_endpoint_query_parameters:
          - returnTo
          - https://<lakefs.ingress.domain>/oidc/login
  secrets:
    create: true
  sso:
    enabled: true
    oidc:
      enabled: true
      # secret given by the OIDC provider (e.g. auth0, Okta, etc.)
      client_secret: <oidc-client-secret>
  rbac:
    enabled: true

useDevPostgres: true
```

1. The example uses the OIDC authentication method. For other methods, see [SSO]({% link reference/security/sso.md %}).
2. The database configuration must be identical between lakeFS and Fluffy.
3. `useDevPostgres` creates a local development Postgres instance and isn't suitable for production.

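The values file sets three logout-related keys for Fluffy: `logout_redirect_url`, `logout_client_id_query_parameter`, and `logout_endpoint_query_parameters`. The source does not spell out how Fluffy combines them, so the following Python sketch only illustrates one plausible reading: the list is treated as alternating name/value pairs appended to the logout URL as a query string. The function name `build_logout_url`, the client ID, and the host are hypothetical.

```python
from urllib.parse import urlencode


def build_logout_url(logout_redirect_url, client_id_param, client_id, extra_params):
    """Compose an IdP logout URL from Fluffy-style settings (illustrative only).

    ASSUMPTION: extra_params is a flat list read as alternating name/value
    pairs, e.g. ["returnTo", "https://lakefs.example.com/oidc/login"].
    """
    if len(extra_params) % 2 != 0:
        raise ValueError("expected an even number of entries (name/value pairs)")
    query = [(client_id_param, client_id)]
    # Entries 0, 2, 4... are parameter names; entries 1, 3, 5... are their values.
    query.extend(zip(extra_params[0::2], extra_params[1::2]))
    return f"{logout_redirect_url}?{urlencode(query)}"


url = build_logout_url(
    "https://oidc-provider-url.com/logout/example",
    "client_id",
    "my-client-id",  # hypothetical client ID
    ["returnTo", "https://lakefs.example.com/oidc/login"],  # hypothetical host
)
print(url)
```

If your provider expects a different parameter name than `returnTo`, only the list entries change; the pairing logic stays the same under this assumption.
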
## Using docker compose

The following docker-compose file spins up lakeFS, Fluffy, and Postgres as a shared KV database.
This setup uses OIDC as the SSO authentication method.
Using a local Postgres is not suitable for production use cases.

```yaml
version: "3"
services:
  lakefs:
    image: "treeverse/lakefs:1.20.0"
    command: "RUN"
    ports:
      - "8080:8000"
    depends_on:
      - "postgres"
    environment:
      - LAKEFS_LOGGING_LEVEL=DEBUG
      - LAKEFS_AUTH_ENCRYPT_SECRET_KEY="some random secret string"
      - LAKEFS_AUTH_API_ENDPOINT=http://fluffy:9000/api/v1
      - LAKEFS_AUTH_API_SUPPORTS_INVITES=true
      - LAKEFS_AUTH_LOGOUT_REDIRECT_URL=http://fluffy:8000/oidc/logout
      - LAKEFS_AUTH_UI_CONFIG_LOGIN_URL=http://fluffy:8000/oidc/login
      - LAKEFS_AUTH_UI_CONFIG_LOGOUT_URL=http://fluffy:8000/oidc/logout
      - LAKEFS_AUTH_UI_CONFIG_RBAC=internal
      - LAKEFS_AUTH_UI_CONFIG_LOGIN_COOKIE_NAMES=[internal_auth_session,oidc_auth_session]
      - LAKEFS_AUTH_OIDC_FRIENDLY_NAME_CLAIM_NAME="nickname"
      - LAKEFS_AUTH_OIDC_DEFAULT_INITIAL_GROUPS=["Admins"]
      - LAKEFS_AUTH_AUTHENTICATION_API_ENDPOINT=http://fluffy:8000/api/v1
      - LAKEFS_AUTH_AUTHENTICATION_API_EXTERNAL_PRINCIPALS_ENABLED=true
      - LAKEFS_DATABASE_TYPE=postgres
      - LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING=postgres://lakefs:lakefs@postgres/postgres?sslmode=disable
      - LAKEFS_BLOCKSTORE_TYPE=local
      - LAKEFS_BLOCKSTORE_LOCAL_PATH=/home/lakefs
      - LAKEFS_BLOCKSTORE_LOCAL_IMPORT_ENABLED=true
    entrypoint: ["/app/wait-for", "postgres:5432", "--", "/app/lakefs", "run"]

  postgres:
    image: "postgres:11"
    ports:
      - "5433:5432"
    environment:
      POSTGRES_USER: lakefs
      POSTGRES_PASSWORD: lakefs

  fluffy:
    image: "${FLUFFY_REPO:-treeverse}/fluffy:${TAG:-0.4.0}"
    command: "${COMMAND:-run}"
    ports:
      - "8000:8000"
      - "9000:9000"
    depends_on:
      - "postgres"
    environment:
      - FLUFFY_LOGGING_LEVEL=DEBUG
      - FLUFFY_DATABASE_TYPE=postgres
      - FLUFFY_DATABASE_POSTGRES_CONNECTION_STRING=postgres://lakefs:lakefs@postgres/postgres?sslmode=disable
      - FLUFFY_AUTH_ENCRYPT_SECRET_KEY="some random secret string"
      - FLUFFY_AUTH_SERVE_LISTEN_ADDRESS=0.0.0.0:9000
      - FLUFFY_LISTEN_ADDRESS=0.0.0.0:8000
      - FLUFFY_AUTH_SERVE_DISABLE_AUTHENTICATION=true
      - FLUFFY_AUTH_LOGOUT_REDIRECT_URL=<oidc-login-url>
      - FLUFFY_AUTH_POST_LOGIN_REDIRECT_URL=http://lakefs:8000/
      - FLUFFY_AUTH_OIDC_ENABLED=true
      - FLUFFY_AUTH_OIDC_URL=<oidc-endpoint>
      - FLUFFY_AUTH_OIDC_CLIENT_ID=<client-id>
      - FLUFFY_AUTH_OIDC_CLIENT_SECRET=<client-secret>
      - FLUFFY_AUTH_OIDC_CALLBACK_BASE_URL=http://fluffy:8000
      - FLUFFY_AUTH_OIDC_IS_DEFAULT_LOGIN=true
      - FLUFFY_AUTH_OIDC_LOGOUT_CLIENT_ID_QUERY_PARAMETER=client_id
    entrypoint: ["/app/wait-for", "postgres:5432", "--", "/app/fluffy"]
    configs:
      - source: fluffy.yaml
        target: /etc/fluffy/config.yaml

# This workaround is unfortunate but necessary: logout_endpoint_query_parameters is a list
# of strings, which isn't parsed nicely from env vars.
configs:
  fluffy.yaml:
    content: |
      auth:
        oidc:
          logout_endpoint_query_parameters:
            - returnTo
            - http://localhost:8080/oidc/login
```

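The compose file sets the database settings and the encryption secret key to the same values on both services. As a rough aid for catching drift between the two, the Python sketch below compares the `LAKEFS_`- and `FLUFFY_`-prefixed variables for a list of shared suffixes. The function `find_mismatched_settings` and the `SHARED` list are hypothetical helpers for illustration, not part of lakeFS or Fluffy.

```python
def find_mismatched_settings(env, shared_suffixes):
    """Return suffixes whose LAKEFS_ and FLUFFY_ values differ or are missing."""
    mismatched = []
    for suffix in shared_suffixes:
        lakefs_value = env.get(f"LAKEFS_{suffix}")
        fluffy_value = env.get(f"FLUFFY_{suffix}")
        if lakefs_value is None or lakefs_value != fluffy_value:
            mismatched.append(suffix)
    return mismatched


# ASSUMPTION: these are the settings that should match, based on the
# compose file above; extend the list to fit your own setup.
SHARED = [
    "DATABASE_TYPE",
    "DATABASE_POSTGRES_CONNECTION_STRING",
    "AUTH_ENCRYPT_SECRET_KEY",
]

conn = "postgres://lakefs:lakefs@postgres/postgres?sslmode=disable"
env = {
    "LAKEFS_DATABASE_TYPE": "postgres",
    "FLUFFY_DATABASE_TYPE": "postgres",
    "LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING": conn,
    "FLUFFY_DATABASE_POSTGRES_CONNECTION_STRING": conn,
    "LAKEFS_AUTH_ENCRYPT_SECRET_KEY": "some random secret string",
    "FLUFFY_AUTH_ENCRYPT_SECRET_KEY": "a different secret",  # deliberate mismatch
}
print(find_mismatched_settings(env, SHARED))
```

In practice you could pass `dict(os.environ)` or the parsed `environment` entries of each service instead of the inline dictionary.
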
## More examples

A few more examples for running lakeFS Enterprise, based on the SSO provider and the setup.

### Active Directory Federation Services (AD FS) (using SAML) without Helm
{:.no_toc}

**Note:** To run this example on a k8s cluster, follow this section
and replace the Fluffy and lakeFS configuration in the Helm `values.yaml` file accordingly.
{: .note }

#### Azure Configuration

1. Create an Enterprise Application with the SAML toolkit - see [Azure quickstart](https://learn.microsoft.com/en-us/entra/identity/enterprise-apps/add-application-portal)
1. Add users: **App > Users and groups**: Attach users and roles from the existing AD users
   list - only attached users will be able to log in to lakeFS.
1. Configure SAML: **App > Single sign-on > SAML**:
   1. Entity ID: Add two IDs, lakefs-url and lakefs-url/saml/metadata (e.g. https://lakefs.acme.com and https://lakefs.acme.com/saml/metadata)
   1. Reply URL: lakefs-url/saml (e.g. https://lakefs.acme.com/saml)
   1. Sign on URL: lakefs-url/sso/login-saml (e.g. https://lakefs.acme.com/sso/login-saml)
   1. Relay State (Optional): /

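All four SAML values above derive from the lakeFS base URL. As a quick way to generate them for copy-pasting into the Azure portal, here is a minimal Python sketch; the function name `saml_app_settings` and the dictionary keys are hypothetical, and only the URL paths come from the steps above.

```python
def saml_app_settings(lakefs_url):
    """Derive the Azure SAML fields listed above from the lakeFS base URL."""
    base = lakefs_url.rstrip("/")  # tolerate a trailing slash in the input
    return {
        "entity_ids": [base, f"{base}/saml/metadata"],
        "reply_url": f"{base}/saml",
        "sign_on_url": f"{base}/sso/login-saml",
        "relay_state": "/",  # optional in the Azure form
    }


settings = saml_app_settings("https://lakefs.acme.com/")
for name, value in settings.items():
    print(f"{name}: {value}")
```
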
#### Fluffy Configuration

**Note:** The full Fluffy configuration can be found [here]({% link understand/enterprise/fluffy-configuration.md %}).
{: .note }

1. `auth.saml.idp_metadata_url`: Set from the Azure app created above, under _SAML configuration > "App Federation Metadata Url"_.
1. `auth.saml.external_user_id_claim_name`: The claim that represents the user ID.
   The same claim name must also be set in the lakeFS config key `auth.cookie_auth_verification.external_user_id_claim_name`.
   The example below uses `http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name` as the claim name.

```yaml
# fluffy configuration.yaml

# SSO address (i.e. login with Azure) and healthcheck address /_health
listen_address: :8000
logging:
  level: "INFO"
  audit_log_level: "INFO"
# everything under database must match the lakeFS config
database:
  type: postgres
  postgres:
    # same as lakeFS
    connection_string: <postgres connection string>
auth:
  # RBAC service (default address is 0.0.0.0:9000), also used for the lakeFS healthcheck
  serve_listen_address: ':9000'
  encrypt:
    # same as lakeFS
    secret_key: shared-secret-key
  logout_redirect_url: https://<lakefs-url>
  post_login_redirect_url: https://<lakefs-url>
  saml:
    enabled: true
    sp_root_url: https://<lakefs-url>
    # generated SSL key; if not enforced at the IdP level, it can be anything as long as it is a valid cert structure
    sp_x509_key_path: '/etc/saml_certs/rsa_saml_private.cert'
    sp_x509_cert_path: '/etc/saml_certs/rsa_saml_public.pem'
    sp_sign_request: false
    sp_signature_method: "http://www.w3.org/2001/04/xmldsig-more#rsa-sha256"
    idp_metadata_url: https://login.microsoftonline.com/<...>/federationmetadata/2007-06/federationmetadata.xml?appid=<app-id>
    external_user_id_claim_name: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name
    idp_skip_verify_tls_cert: true
```

#### lakeFS Configuration

```yaml
# lakeFS configuration.yaml

database:
  type: postgres
  postgres:
    # same as fluffy
    connection_string: <postgres connection string>
auth:
  encrypt:
    # same as fluffy
    secret_key: shared-secret-key
  api:
    # RBAC endpoint
    endpoint: http://<fluffy>:9000/api/v1
  cookie_auth_verification:
    auth_source: saml
    friendly_name_claim_name: displayName
    external_user_id_claim_name: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name
    default_initial_groups:
      - "Developers"
  ui_config:
    login_url: "https://<lakefs-url>/sso/login-saml"
    logout_url: "https://<lakefs-url>/sso/logout-saml"
    rbac: internal
    login_cookie_names:
      - internal_auth_session
      - saml_auth_session
```

### LDAP `values.yaml` file for Helm deployments
{:.no_toc}

```yaml
lakefsConfig: |
  logging:
    level: "INFO"
  blockstore:
    type: local
  auth:
    remote_authenticator:
      enabled: true
      # RBAC group for first time users
      default_user_group: "Developers"
    ui_config:
      login_cookie_names:
        - internal_auth_session

ingress:
  enabled: true
  ingressClassName: <class-name>
  hosts:
    - host: <lakefs.ingress.domain>
      paths:
        - /

##################################################
########### lakeFS enterprise - FLUFFY ###########
##################################################

fluffy:
  enabled: true
  image:
    privateRegistry:
      enabled: true
      secretToken: <dockerhub-token-fluffy-image>
  fluffyConfig: |
    logging:
      level: "INFO"
    auth:
      post_login_redirect_url: /
      ldap:
        server_endpoint: 'ldaps://ldap.company.com:636'
        bind_dn: uid=<bind-user-name>,ou=Users,o=<org-id>,dc=<company>,dc=com
        username_attribute: uid
        user_base_dn: ou=Users,o=<org-id>,dc=<company>,dc=com
        user_filter: (objectClass=inetOrgPerson)
        connection_timeout_seconds: 15
        request_timeout_seconds: 17

  secrets:
    create: true

  sso:
    enabled: true
    ldap:
      enabled: true
      bind_password: <ldap bind password>
  rbac:
    enabled: true

useDevPostgres: true
```

## Log Collection

The recommended practice for collecting logs is to send them to the container's standard output (the default configuration)
and let an external service collect them into a sink. An example of a log collector is [fluentbit](https://fluentbit.io/),
which can collect container logs, format them, and ship them to a target such as S3.

There are two kinds of logs: regular logs, such as an API error or an event description used for debugging,
and audit logs, which describe a user action (e.g. creating a branch).
Audit logs are distinguished from regular logs by the boolean field `log_audit`.
lakeFS and Fluffy share the same configuration structure under the `logging.*` section of the config.
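
To illustrate how a downstream consumer might separate the two kinds of logs, here is a minimal Python sketch. It assumes the logs are emitted as one JSON object per line (for example with `logging.format` set to `"json"`) and that regular entries either omit `log_audit` or set it to false. The function `split_logs` and the sample lines are hypothetical.

```python
import json


def split_logs(lines):
    """Split JSON log lines into (regular, audit) lists using the log_audit field."""
    regular, audit = [], []
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # not JSON (e.g. a blank or plain-text line): skip it
        if not isinstance(entry, dict):
            continue  # valid JSON but not a log object
        # ASSUMPTION: regular entries either omit log_audit or set it to false.
        if entry.get("log_audit") is True:
            audit.append(entry)
        else:
            regular.append(entry)
    return regular, audit


# Hypothetical log lines; every field except log_audit is made up for illustration.
sample = [
    '{"level": "error", "msg": "request failed"}',
    '{"level": "info", "msg": "create branch", "log_audit": true}',
    "plain text line that is not JSON",
]
regular, audit = split_logs(sample)
print(len(regular), len(audit))
```

A log collector such as fluentbit would normally do this filtering itself; the sketch is only meant to show what the `log_audit` distinction looks like in the data.
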