---
title: Login to lakeFS with AWS IAM Roles
description: This section covers how to authenticate to lakeFS using AWS IAM.
grand_parent: Reference
parent: Security
redirect_from:
  - /reference/external-principals-aws.html
---

# Authenticate to lakeFS with AWS IAM Roles

{: .d-inline-block }
<a style="color: white;" href="#sso-for-lakefs-enterprise">lakeFS Enterprise</a>
{: .label .label-purple }

{: .note}
> The external principals API is available in lakeFS Enterprise. If you're using the open-source version, see the [pluggable APIs](https://docs.lakefs.io/reference/security/rbac.html#pluggable-authentication-and-authorization).

{% include toc.html %}

## Overview

lakeFS supports authenticating users programmatically with AWS IAM roles, instead of static lakeFS access and secret keys.
This method lets you bind IAM principal ARNs to lakeFS users.
A single lakeFS user may have many AWS principal ARNs attached to it. When a client authenticates to a lakeFS server with an AWS session, the actions performed by the client are on behalf of the user attached to the ARN.

### Using Session Names

The bound ARN can be attached to a single lakeFS user with or without a `SessionName`, so the same role can serve different users.
For example, consider the following mapping:

| Principal ARN                                       | lakeFS User |
|-----------------------------------------------------|-------------|
| arn:aws:sts::123456:assumed-role/Dev                | foo         |
| arn:aws:sts::123456:assumed-role/Dev/john@acme.com  | john        |

If the bound ARN is `arn:aws:sts::123456:assumed-role/Dev/<SessionName>`, any principal assuming the `Dev` role in AWS account `123456` can log in to it.
If the `SessionName` is `john@acme.com`, lakeFS returns a token for the `john` user.

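One way to read the mapping table is as a most-specific-match lookup. The sketch below is illustrative only and is not lakeFS code: `BINDINGS` and `resolve_user` are hypothetical names, and the fallback from a session-level ARN to the role-level ARN is an assumption drawn from the table, not a documented algorithm.

```python
from typing import Optional

# Hypothetical in-memory copy of the mapping table above.
BINDINGS = {
    "arn:aws:sts::123456:assumed-role/Dev": "foo",
    "arn:aws:sts::123456:assumed-role/Dev/john@acme.com": "john",
}


def resolve_user(caller_arn: str, bindings: dict) -> Optional[str]:
    """Return the lakeFS user bound to an assumed-role ARN, or None if unbound."""
    # An exact match (role plus session name) is the most specific binding.
    if caller_arn in bindings:
        return bindings[caller_arn]
    # Assumption: otherwise drop the session name and try the role-level binding.
    prefix, _, resource = caller_arn.rpartition(":")
    parts = resource.split("/")
    if len(parts) == 3 and parts[0] == "assumed-role":
        return bindings.get(f"{prefix}:{parts[0]}/{parts[1]}")
    return None


print(resolve_user("arn:aws:sts::123456:assumed-role/Dev/john@acme.com", BINDINGS))
print(resolve_user("arn:aws:sts::123456:assumed-role/Dev/alice@acme.com", BINDINGS))
```

Under these assumptions the first call resolves to `john` and the second falls back to `foo`.
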
### How AWS authentication works

The AWS STS API includes a method, `sts:GetCallerIdentity`, which allows you to validate the identity of a client. The client signs a `GetCallerIdentity` query using the AWS Signature v4 algorithm and sends it to the lakeFS server.

The `GetCallerIdentity` query consists of four pieces of information: the request URL, the request body, the request headers and the request method. The AWS signature is computed over those fields. The lakeFS server reconstructs the query using this information and forwards it to the AWS STS service. Depending on the response from the STS service, the server authenticates the client.

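To make the four pieces concrete, here is a minimal sketch of signing a `GetCallerIdentity` request with AWS Signature v4 using only the Python standard library. It is not the lakeFS client implementation: the function name `sign_get_caller_identity`, the choice of a `POST` to the global `sts.amazonaws.com` endpoint, and the exact set of signed headers are assumptions made for illustration. The payload that the lakeFS login API expects is defined by that API and is not shown here.

```python
import datetime
import hashlib
import hmac

STS_HOST = "sts.amazonaws.com"  # global STS endpoint, signed for us-east-1
STS_BODY = "Action=GetCallerIdentity&Version=2011-06-15"


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def sign_get_caller_identity(access_key, secret_key, server_id,
                             session_token=None, region="us-east-1", now=None):
    """Return method, URL, body and signed headers for GetCallerIdentity."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")

    # Headers covered by the signature, keyed by lowercase name.
    headers = {
        "content-type": "application/x-www-form-urlencoded; charset=utf-8",
        "host": STS_HOST,
        "x-amz-date": amz_date,
        "x-lakefs-server-id": server_id,
    }
    if session_token:  # temporary (assumed-role) credentials carry a token
        headers["x-amz-security-token"] = session_token

    names = sorted(headers)
    signed_headers = ";".join(names)
    canonical_headers = "".join(f"{n}:{headers[n].strip()}\n" for n in names)
    canonical_request = "\n".join([
        "POST", "/", "",  # method, path, empty query string
        canonical_headers, signed_headers,
        hashlib.sha256(STS_BODY.encode("utf-8")).hexdigest(),
    ])

    scope = f"{date_stamp}/{region}/sts/aws4_request"
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256", amz_date, scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # Derive the signing key: date -> region -> service -> "aws4_request".
    key = _hmac(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    for part in (region, "sts", "aws4_request"):
        key = _hmac(key, part)
    signature = hmac.new(key, string_to_sign.encode("utf-8"),
                         hashlib.sha256).hexdigest()

    headers["authorization"] = (
        f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
        f"SignedHeaders={signed_headers}, Signature={signature}"
    )
    return {"method": "POST", "url": f"https://{STS_HOST}/",
            "body": STS_BODY, "headers": headers}
```

The returned dictionary holds the method, URL, body and headers described above. Because `x-lakefs-server-id` is part of the signed headers in this sketch, changing its value would invalidate the signature.
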
Notably, clients don't need network-level access themselves to talk to the AWS STS API endpoint; they only need access to the credentials to sign the request. However, the lakeFS server does need network-level access to send requests to the STS endpoint.

Each signed AWS request includes the current timestamp to mitigate the risk of replay attacks. In addition, lakeFS allows you to require an additional header, `X-LakeFS-Server-ID` (added by default), to mitigate other types of replay attacks (such as a signed `GetCallerIdentity` request stolen from a dev lakeFS instance and used to authenticate to a prod lakeFS instance).

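The idea behind a required header can be sketched as a simple server-side check. This is an illustration, not lakeFS or Fluffy code: `missing_required_headers` and the example host names are hypothetical. The check only adds protection if the header is part of the signed header set, so that an attacker cannot change its value without breaking the signature.

```python
def missing_required_headers(signed_headers: dict, required: dict) -> list:
    """Return names of required headers that are absent or have the wrong value."""
    # HTTP header names are case-insensitive, so compare them lowercased.
    present = {name.lower(): value for name, value in signed_headers.items()}
    return [name for name, expected in required.items()
            if present.get(name.lower()) != expected]


# Hypothetical values: a prod server receiving a request signed for dev.
required = {"X-LakeFS-Server-ID": "lakefs.prod.example.com"}
stolen_from_dev = {
    "host": "sts.amazonaws.com",
    "x-lakefs-server-id": "lakefs.dev.example.com",
}
print(missing_required_headers(stolen_from_dev, required))
```

A non-empty result means the request should be rejected before it is forwarded to STS.
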
It's also important to note that Amazon does NOT appear to include any sort of authorization around calls to `GetCallerIdentity`. For example, if you have an IAM policy on your credential that requires all access to be MFA authenticated, non-MFA authenticated credentials will still be able to authenticate to lakeFS using this method.

## Server Configuration

{: .note}
> Note: The lakeFS Helm chart supports this configuration since version `1.2.11`. For usage, see the [values.yaml example](https://github.com/treeverse/charts/blob/master/examples/lakefs/enterprise/values-external-aws.yaml).

* In lakeFS, `auth.authentication_api.external_principals_enabled` must be set to `true` in the configuration file. Other settings (`auth.authentication_api.*`) are described in the [configuration reference]({% link reference/configuration.md %}).

For the full list of Fluffy server configuration options, see [Fluffy Configuration]({% link understand/enterprise/fluffy-configuration.md %}) under `auth.external.aws_auth`.

{: .note}
> By default, lakeFS clients add the parameter `X-LakeFS-Server-ID: <lakefs.ingress.domain>` to the initial [login request][login-api] for STS.

**Example configuration with required headers:**

Configuration for `lakefs.yaml`:

```yaml
auth:
  authentication_api:
    endpoint: http://<fluffy-sso>/api/v1
    external_principals_enabled: true
  api:
    endpoint: http://<fluffy-rbac>/api/v1
```

Configuration for `fluffy.yaml`:

```yaml
# fluffy address for lakeFS auth.authentication_api.endpoint
# used by lakeFS to log in and get the token
listen_address: <fluffy-sso>
auth:
  # fluffy address for lakeFS auth.api.endpoint
  # used by lakeFS to manage the lifecycle (attach/detach) of the external principals
  serve_listen_address: <fluffy-rbac>
  external:
    aws_auth:
      enabled: true
      # headers that the client must send with the login request
      required_headers:
        # same host as the lakeFS server ingress
        X-LakeFS-Server-ID: <lakefs.ingress.domain>
```

## Administration of IAM Roles in lakeFS

Administration refers to the management of the IAM roles that are allowed to authenticate to lakeFS.
It covers operations such as attaching IAM roles to a user, detaching them, listing the roles attached to a user, and listing the users attached to a role.
Currently this is done through the lakeFS [External Principals API][external-principal-admin] and generated clients.

Example of attaching IAM roles to a user:

```python
import lakefs_sdk as lakefs

configuration = lakefs.Configuration(host="...", username="...", password="...")
username = "<lakefs-user>"
api = lakefs.ApiClient(configuration)
auth_api = lakefs.AuthApi(api)
# attach the role(s) to a lakeFS user
auth_api.create_user_external_principal(user_id=username, principal_id='arn:aws:sts::<id>:assumed-role/<role A>/<optional session name>')
auth_api.create_user_external_principal(user_id=username, principal_id='arn:aws:sts::<id>:assumed-role/<role B>')
# list the roles attached to the user
resp = auth_api.list_user_external_principals(user_id=username)
for p in resp.results:
    # do something with each attached principal, for example print it
    print(p)
```

## Get lakeFS API Token

Logging in to lakeFS is done by calling the [login API][login-api] with the `GetCallerIdentity` request signed by the client.
Currently, the login operation is supported out of the box in [lakeFS Hadoop FileSystem][lakefs-hadoopfs] version 0.2.4; see [Spark usage][lakefs-spark].
Other clients (e.g. HTTP, Python) can use the login endpoint to authenticate to lakeFS, but you will have to build the request input yourself.


[external-principal-admin]:  {% link reference/cli.md %}#external
[login-api]: {% link reference/api.md %}#auth/externalPrincipalLogin
[lakefs-hadoopfs]:  {% link integrations/spark.md %}#lakefs-hadoop-filesystem
[lakefs-spark]:  {% link integrations/spark.md %}#usage-with-temporaryawscredentialslakefstokenprovider