---
title: Run lakeFS Enterprise
description: Start using lakeFS-enterprise
parent: lakeFS Enterprise
grand_parent: Understanding lakeFS
---

# Run lakeFS Enterprise

{% include toc.html %}

## Overview

The lakeFS Enterprise solution consists of 2 main components:
1. lakeFS - Open Source: [treeverse/lakeFS](https://hub.docker.com/r/treeverse/lakefs),
   release info found in [GitHub releases](https://github.com/treeverse/lakeFS/releases).
2. Fluffy - Proprietary: In charge of the Enterprise features. Can be retrieved from
   [Treeverse Dockerhub](https://hub.docker.com/u/treeverse) using the granted token.

### Prerequisites

1. A KV Database, like postgres, should be configured and shared by Fluffy and lakeFS.
1. Access to configure an SSO IdP, like an Azure AD Enterprise Application.
1. A proxy server should be configured to route traffic between the 2 servers.


There are several ways to run lakeFS Enterprise. You can follow the examples below,
or create your own setup using the resources available in our documentation.

## Using lakeFS Helm chart

With every new release of lakeFS or Fluffy, the lakeFS team releases a new [lakeFS
helm chart](https://artifacthub.io/packages/helm/lakefs/lakefs). Together with the granted
Fluffy token, you can run the full lakeFS Enterprise solution.

As an example, the following `values` file will run lakeFS Enterprise with OIDC integration.

```yaml
lakefsConfig: |
  logging:
    level: "INFO"
  blockstore:
    type: s3
  database:
    type: postgres
    postgres:
      connection_string: <postgres-connection-string>
  auth:
    oidc:
      # the claim that's provided by the OIDC provider (e.g Okta) that will be used as the username according to OIDC provider claims provided after successful authentication
      friendly_name_claim_name: "<some-oidc-provider-claim-name>"
      default_initial_groups: ["Developers"]
    ui_config:
      login_cookie_names:
        - internal_auth_session
        - oidc_auth_session
ingress:
  enabled: true
  ingressClassName: <class-name>
  hosts:
    # the ingress that will be created for lakeFS
    - host: <lakefs.ingress.domain>
      paths:
        - /

##################################################
########### lakeFS enterprise - FLUFFY ###########
##################################################

fluffy:
  enabled: true
  image:
    repository: treeverse/fluffy
    tag: '0.4.0'
    pullPolicy: IfNotPresent
    privateRegistry:
      enabled: true
      secretToken: <dockerhub-token-fluffy-image>
  fluffyConfig: |
    logging:
      format: "json"
      level: "INFO"
    database:
      type: postgres
      postgres:
        connection_string: <postgres-connection-string>
    auth:
      serve_listen_address: 0.0.0.0:9000
      logout_redirect_url: https://oidc-provider-url.com/logout/example
      oidc:
        enabled: true
        url: https://oidc-provider-url.com/
        client_id: <oidc-client-id>
        callback_base_url: https://<lakefs.ingress.domain>
        is_default_login: true
        # the claim name that represents the client identifier in the OIDC provider (e.g Okta)
        logout_client_id_query_parameter: client_id
        # the query parameters that will be used to redirect the user to the OIDC provider (e.g Okta) after logout
        logout_endpoint_query_parameters:
          - returnTo
          - https://<lakefs.ingress.domain>/oidc/login
  secrets:
    create: true
  sso:
    enabled: true
    oidc:
      enabled: true
      # secret given by the OIDC provider (e.g auth0, Okta, etc)
      client_secret: <oidc-client-secret>
  rbac:
    enabled: true

useDevPostgres: true
```

1. The example uses the OIDC authentication method. For other methods, see [SSO]({% link reference/security/sso.md %}).
2. Database configuration must be identical between lakeFS and Fluffy.
3. `useDevPostgres` isn't suitable for production. When used, a local dev postgres is created.


## Using docker compose

The following docker-compose file will spin up lakeFS, Fluffy and postgres as a shared KV database.
This setup uses OIDC as the SSO authentication method.
Using a local postgres is not suitable for production use-cases.

```yaml
version: "3"
services:
  lakefs:
    image: "treeverse/lakefs:1.20.0"
    command: "RUN"
    ports:
      - "8080:8000"
    depends_on:
      - "postgres"
    environment:
      - LAKEFS_LOGGING_LEVEL=DEBUG
      - LAKEFS_AUTH_ENCRYPT_SECRET_KEY="some random secret string"
      - LAKEFS_AUTH_API_ENDPOINT=http://fluffy:9000/api/v1
      - LAKEFS_AUTH_API_SUPPORTS_INVITES=true
      - LAKEFS_AUTH_LOGOUT_REDIRECT_URL=http://fluffy:8000/oidc/logout
      - LAKEFS_AUTH_UI_CONFIG_LOGIN_URL=http://fluffy:8000/oidc/login
      - LAKEFS_AUTH_UI_CONFIG_LOGOUT_URL=http://fluffy:8000/oidc/logout
      - LAKEFS_AUTH_UI_CONFIG_RBAC=internal
      - LAKEFS_AUTH_UI_CONFIG_LOGIN_COOKIE_NAMES=[internal_auth_session,oidc_auth_session]
      - LAKEFS_AUTH_OIDC_FRIENDLY_NAME_CLAIM_NAME="nickname"
      - LAKEFS_AUTH_OIDC_DEFAULT_INITIAL_GROUPS=["Admins"]
      - LAKEFS_AUTH_AUTHENTICATION_API_ENDPOINT=http://fluffy:8000/api/v1
      - LAKEFS_AUTH_AUTHENTICATION_API_EXTERNAL_PRINCIPALS_ENABLED=true
      - LAKEFS_DATABASE_TYPE=postgres
      - LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING=postgres://lakefs:lakefs@postgres/postgres?sslmode=disable
      - LAKEFS_BLOCKSTORE_TYPE=local
      - LAKEFS_BLOCKSTORE_LOCAL_PATH=/home/lakefs
      - LAKEFS_BLOCKSTORE_LOCAL_IMPORT_ENABLED=true
    entrypoint: ["/app/wait-for", "postgres:5432", "--", "/app/lakefs", "run"]

  postgres:
    image: "postgres:11"
    ports:
      - "5433:5432"
    environment:
      POSTGRES_USER: lakefs
      POSTGRES_PASSWORD: lakefs

  fluffy:
    image: "${FLUFFY_REPO:-treeverse}/fluffy:${TAG:-0.4.0}"
    command: "${COMMAND:-run}"
    ports:
      - "8000:8000"
      - "9000:9000"
    depends_on:
      - "postgres"
    environment:
      - FLUFFY_LOGGING_LEVEL=DEBUG
      - FLUFFY_DATABASE_TYPE=postgres
      - FLUFFY_DATABASE_POSTGRES_CONNECTION_STRING=postgres://lakefs:lakefs@postgres/postgres?sslmode=disable
      - FLUFFY_AUTH_ENCRYPT_SECRET_KEY="some random secret string"
      - FLUFFY_AUTH_SERVE_LISTEN_ADDRESS=0.0.0.0:9000
      - FLUFFY_LISTEN_ADDRESS=0.0.0.0:8000
      - FLUFFY_AUTH_SERVE_DISABLE_AUTHENTICATION=true
      - FLUFFY_AUTH_LOGOUT_REDIRECT_URL=<oidc-login-url>
      - FLUFFY_AUTH_POST_LOGIN_REDIRECT_URL=http://lakefs:8000/
      - FLUFFY_AUTH_OIDC_ENABLED=true
      - FLUFFY_AUTH_OIDC_URL=<oidc-endpoint>
      - FLUFFY_AUTH_OIDC_CLIENT_ID=<client-id>
      - FLUFFY_AUTH_OIDC_CLIENT_SECRET=<client-secret>
      - FLUFFY_AUTH_OIDC_CALLBACK_BASE_URL=http://fluffy:8000
      - FLUFFY_AUTH_OIDC_IS_DEFAULT_LOGIN=true
      - FLUFFY_AUTH_OIDC_LOGOUT_CLIENT_ID_QUERY_PARAMETER=client_id
    entrypoint: [ "/app/wait-for", "postgres:5432", "--", "/app/fluffy" ]
    configs:
      - source: fluffy.yaml
        target: /etc/fluffy/config.yaml

# This tweak is unfortunate but also necessary. logout_endpoint_query_parameters is a list
# of strings which isn't parsed nicely as env vars.
configs:
  fluffy.yaml:
    content: |
      auth:
        oidc:
          logout_endpoint_query_parameters:
            - returnTo
            - http://localhost:8080/oidc/login
```

## More examples

A few more examples for running lakeFS Enterprise, based on the SSO provider and the setup.
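
Every setup on this page requires lakeFS and Fluffy to share the same database settings and the same encryption secret key. The following is a minimal sketch, not part of lakeFS or Fluffy, of a pre-deployment sanity check over environment entries written in the docker-compose list syntax. The function names `parse_env` and `find_mismatches` are hypothetical helpers introduced for illustration; the environment variable names are the ones used in the docker-compose example.

```python
# Hypothetical helper script: checks that the settings lakeFS and Fluffy
# must share are identical. Not an official lakeFS tool.

# Suffixes of the variables that have to match once the LAKEFS_ / FLUFFY_
# prefix is removed (names taken from the docker-compose example).
SHARED_SUFFIXES = (
    "DATABASE_TYPE",
    "DATABASE_POSTGRES_CONNECTION_STRING",
    "AUTH_ENCRYPT_SECRET_KEY",
)


def parse_env(entries):
    """Turn ['KEY=value', ...] (compose list syntax) into a dict.

    Only the first '=' separates key from value, so values such as
    connection strings containing '=' are kept intact.
    """
    env = {}
    for entry in entries:
        key, _, value = entry.partition("=")
        env[key.strip()] = value
    return env


def find_mismatches(lakefs_env, fluffy_env):
    """Return the shared suffixes whose values differ or are missing on one side."""
    mismatches = []
    for suffix in SHARED_SUFFIXES:
        lakefs_value = lakefs_env.get("LAKEFS_" + suffix)
        fluffy_value = fluffy_env.get("FLUFFY_" + suffix)
        if lakefs_value is None or lakefs_value != fluffy_value:
            mismatches.append(suffix)
    return mismatches


if __name__ == "__main__":
    conn = "postgres://lakefs:lakefs@postgres/postgres?sslmode=disable"
    lakefs = parse_env([
        "LAKEFS_DATABASE_TYPE=postgres",
        "LAKEFS_DATABASE_POSTGRES_CONNECTION_STRING=" + conn,
        'LAKEFS_AUTH_ENCRYPT_SECRET_KEY="some random secret string"',
    ])
    fluffy = parse_env([
        "FLUFFY_DATABASE_TYPE=postgres",
        "FLUFFY_DATABASE_POSTGRES_CONNECTION_STRING=" + conn,
        'FLUFFY_AUTH_ENCRYPT_SECRET_KEY="some random secret string"',
    ])
    print(find_mismatches(lakefs, fluffy))  # an empty list means the shared settings agree
```

An empty result means the compared settings agree; any returned suffix names a setting to review before starting the services.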

### Active Directory Federation Services (AD FS) (using SAML) without helm
{:.no_toc}

**Note:** If you'd like to run this example on a k8s cluster, follow this section
and replace the Fluffy and lakeFS configuration in the helm's `values.yaml` file.
{: .note }

#### Azure Configuration

1. Create an Enterprise Application with SAML toolkit - see [Azure quickstart](https://learn.microsoft.com/en-us/entra/identity/enterprise-apps/add-application-portal)
1. Add users: **App > Users and groups**: Attach users and roles from the existing AD users
   list - only attached users will be able to log in to lakeFS.
1. Configure SAML: App > Single sign-on > SAML:
   1. Entity ID: Add 2 IDs, lakefs-url + lakefs-url/saml/metadata (e.g. https://lakefs.acme.com and https://lakefs.acme.com/saml/metadata)
   1. Reply URL: lakefs-url/saml (e.g. https://lakefs.acme.com/saml)
   1. Sign on URL: lakefs-url/sso/login-saml (e.g. https://lakefs.acme.com/sso/login-saml)
   1. Relay State (Optional): /

#### Fluffy Configuration

**Note:** Full Fluffy configuration can be found [here]({% link understand/enterprise/fluffy-configuration.md %}).
{: .note }

1. `auth.saml.idp_metadata_url`: Set from the Azure app created above _SAML configuration > “App Federation Metadata Url”_
1. `auth.saml.external_user_id_claim_name`: The claim that represents the UserID.
   The same claim name must also be set in the lakeFS config key `auth.cookie_auth_verification.external_user_id_claim_name`.
   The example below uses `http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name` as the claim name.

```yaml
# fluffy configuration.yaml

# SSO address (i.e login with Azure) and health check address /_health
listen_address: :8000
logging:
  level: "INFO"
  audit_log_level: "INFO"
# everything under database must equal the config in lakeFS
database:
  type: postgres
  postgres:
    # same as lakeFS
    connection_string: <postgres connection string>
auth:
  # RBAC Service, default address is 0.0.0.0:9000, also lakeFS health check
  serve_listen_address: ':9000'
  encrypt:
    # same as lakeFS
    secret_key: shared-secret-key
  logout_redirect_url: https://<lakefs-url>
  post_login_redirect_url: https://<lakefs-url>
  saml:
    enabled: true
    sp_root_url: https://<lakefs-url>
    # generated SSL key, if not enforced on the IdP level then it can be anything as long as it is a valid cert structure.
    sp_x509_key_path: '/etc/saml_certs/rsa_saml_private.cert'
    sp_x509_cert_path: '/etc/saml_certs/rsa_saml_public.pem'
    sp_sign_request: false
    sp_signature_method: "http://www.w3.org/2001/04/xmldsig-more#rsa-sha256"
    idp_metadata_url: https://login.microsoftonline.com/<...>/federationmetadata/2007-06/federationmetadata.xml?appid=<app-id>
    external_user_id_claim_name: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name
    idp_skip_verify_tls_cert: true
```

#### lakeFS Configuration

```yaml
# lakeFS configuration.yaml

database:
  type: postgres
  postgres:
    # same as fluffy
    connection_string: <postgres connection string>
auth:
  encrypt:
    # same as fluffy
    secret_key: shared-secret-key
  api:
    # RBAC endpoint
    endpoint: http://<fluffy>:9000/api/v1
  cookie_auth_verification:
    auth_source: saml
    friendly_name_claim_name: displayName
    external_user_id_claim_name: http://schemas.xmlsoap.org/ws/2005/05/identity/claims/name
    default_initial_groups:
      - "Developers"
  ui_config:
    login_url: "https://<lakefs-url>/sso/login-saml"
    logout_url: "https://<lakefs-url>/sso/logout-saml"
    rbac: internal
    login_cookie_names:
      - internal_auth_session
      - saml_auth_session
```

### LDAP `values.yaml` file for helm deployments
{:.no_toc}

```yaml
lakefsConfig: |
  logging:
    level: "INFO"
  blockstore:
    type: local
  auth:
    remote_authenticator:
      enabled: true
      # RBAC group for first time users
      default_user_group: "Developers"
    ui_config:
      login_cookie_names:
        - internal_auth_session

ingress:
  enabled: true
  ingressClassName: <class-name>
  hosts:
    - host: <lakefs.ingress.domain>
      paths:
        - /

##################################################
########### lakeFS enterprise - FLUFFY ###########
##################################################

fluffy:
  enabled: true
  image:
    privateRegistry:
      enabled: true
      secretToken: <dockerhub-token-fluffy-image>
  fluffyConfig: |
    logging:
      level: "INFO"
    auth:
      post_login_redirect_url: /
      ldap:
        server_endpoint: 'ldaps://ldap.company.com:636'
        bind_dn: uid=<bind-user-name>,ou=Users,o=<org-id>,dc=<company>,dc=com
        username_attribute: uid
        user_base_dn: ou=Users,o=<org-id>,dc=<company>,dc=com
        user_filter: (objectClass=inetOrgPerson)
        connection_timeout_seconds: 15
        request_timeout_seconds: 17

  secrets:
    create: true

  sso:
    enabled: true
    ldap:
      enabled: true
      bind_password: <ldap bind password>
  rbac:
    enabled: true

useDevPostgres: true
```

## Log Collection

The recommended practice for collecting logs is to send them to the container's standard output (the default configuration)
and let an external service collect them to a sink. An example of a log collector is [fluentbit](https://fluentbit.io/),
which can collect container logs, format them and ship them to a target like S3.

There are 2 kinds of logs: regular logs, such as an API error or an event description used for debugging,
and audit logs (`audit_logs`), which describe a user action (e.g. creating a branch).
The two are distinguished by the boolean field `log_audit`.
lakeFS and Fluffy share the same configuration structure under the `logging.*` section in the config.
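
As an illustration of the `log_audit` distinction, here is a minimal sketch that separates audit records from regular records once the logs have been collected. It assumes the logs are emitted in JSON format, one record per line; `split_logs` is a hypothetical helper, and every field in the sample other than `log_audit` is invented for the example.

```python
import json


def split_logs(lines):
    """Partition JSON log lines into (audit, regular) lists.

    A record counts as an audit log only when its boolean field
    log_audit is true. Blank lines and lines that are not JSON
    objects are skipped.
    """
    audit, regular = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a JSON line, e.g. a startup banner
        if not isinstance(record, dict):
            continue
        if record.get("log_audit") is True:
            audit.append(record)
        else:
            regular.append(record)
    return audit, regular


if __name__ == "__main__":
    # Made-up sample records; only the log_audit field comes from the docs.
    sample = [
        '{"level": "info", "log_audit": true, "msg": "create branch"}',
        '{"level": "error", "msg": "API error"}',
    ]
    audit, regular = split_logs(sample)
    print(len(audit), "audit record(s),", len(regular), "regular record(s)")
```

A collector such as fluentbit can apply the same rule in its own filter stage; the sketch only shows the logic of the split.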