# Configure Access Controls

!!! note
    Access Controls is an enterprise feature that requires
    an active enterprise token.

Pachyderm Access Controls enable you to log in to Pachyderm
as a user configured in a third-party identity management
platform and to work with the data stored in Pachyderm as
that user. Pachyderm supports the following identity providers
with the specified authentication protocols:

- Okta™ with Security Assertion Markup Language (SAML)
- Keycloak with OpenID Connect (OIDC)
- Google™ Identity Platform (OIDC)
- Auth0
- GitHub™ OAuth

Other configurations might work as well, but the list
above covers the officially supported platforms. In general,
most identity providers that support OIDC or SAML
can integrate with Pachyderm. However, you might have to
perform additional configuration steps.

## Roles

By default, Pachyderm preconfigures one type of user called
`admin`. Users in the `admin` group can perform any
action on the cluster, including appointing other admins.
Furthermore, if you do not enable Pachyderm authentication,
all users are admin users.

After you activate access controls, Pachyderm keeps the initial
`admin` user and also associates an Access Control List (ACL)
with each repository. Each ACL can include the following
roles:

- `READER` - users who can consume data from the repo, but cannot edit it.
Readers can run commands such as `pachctl get file` and
`pachctl list file`, as well as create pipelines that use data
from this repo.
- `WRITER` - users who can read and modify data in the repository by
adding, deleting, or updating the files in the repo. Users with the
`WRITER` role can perform operations such as `pachctl put file`
and `pachctl delete commit`.
- `OWNER` - users with `READER` and `WRITER` access who can also
modify the repository's ACL.
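
The way each role builds on the previous one can be modeled as an ordered enumeration. The following Python sketch is illustrative only and is not Pachyderm code: the `Role` enum, the `has_access` helper, and the sample user names are assumptions made for this example.

```python
from enum import IntEnum
from typing import Mapping


class Role(IntEnum):
    """Repo-level roles, ordered so that a higher value includes the lower ones."""
    NONE = 0
    READER = 1
    WRITER = 2
    OWNER = 3


def has_access(acl: Mapping[str, Role], user: str, required: Role) -> bool:
    """Return True if `user` holds at least the `required` role in this ACL."""
    # Users that are absent from the ACL have no role on the repo.
    return acl.get(user, Role.NONE) >= required


# Hypothetical ACL for a single repo.
acl = {"alice": Role.OWNER, "bob": Role.READER}

print(has_access(acl, "alice", Role.WRITER))  # True: OWNER includes WRITER
print(has_access(acl, "bob", Role.WRITER))    # False: READER cannot modify data
```

Treating the roles as ordered values keeps the check to a single comparison, which mirrors the description of `WRITER` as including read access and `OWNER` as including both.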

## User Account Types

Pachyderm defines the following account types:

* For smaller teams and testing:

  * **GitHub user** is a user account that is associated with
  a GitHub account and logs in through the GitHub OAuth flow. If you do not
  use any third-party identity provider, use this option. When a user tries
  to log in with a GitHub account, Pachyderm verifies the identity and
  sends a Pachyderm token for that account.

  * **Robot user** is a user account that logs in with a Pachyderm-generated
  authentication token. Typically, you create a robot user in simplified
  workflow scenarios, such as initial SAML configuration.

  * **Pipeline** is an account that Pachyderm creates for
  data pipelines. A pipeline inherits access control from its creator.

  * **SAML user** is a user account that is associated with a SAML identity provider.
  When a user tries to log in through a SAML identity provider, the system
  confirms the identity, associates that identity with a SAML identity
  provider account, and responds with the SAML identity provider token
  for that user. Pachyderm verifies the token, discards it, and creates
  a new internal token that encapsulates the information about the user.

  * **OIDC user** is a user account that is associated with an OIDC identity provider,
  such as Keycloak, Okta, or another provider. If you have a user or group configured
  in your identity provider, you can give them access to Pachyderm by configuring
  the Pachyderm authentication config.

## Access to Pipelines

In Pachyderm, you do not explicitly grant users access to
pipelines. Instead, access to a pipeline is derived from its input
and output repositories. To update a pipeline, you must have
at least `READER`-level access to all pipeline inputs and at
least `WRITER`-level access to the pipeline output. This is
because pipelines read from their input repos and write
to their output repos, and you cannot grant a pipeline
more access than you have yourself.

An `OWNER`, `WRITER`, or `READER` of a repo can subscribe a
pipeline to that repo. When a user subscribes a pipeline
to a repo, Pachyderm sets that user as an `OWNER` of that
pipeline's output repo. If additional users need access
to the output repository, the initial `OWNER` of the
pipeline's output repo, or an admin, needs to configure
these access rules.
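
The update rule in this section can be sketched as a check over repo ACLs. This is a simplified model under stated assumptions, not the Pachyderm implementation: `can_update_pipeline`, the `Role` enum, and the repo names `images` and `edges` are hypothetical names chosen for illustration.

```python
from enum import IntEnum
from typing import Iterable, Mapping


class Role(IntEnum):
    """Repo-level roles, ordered so that a higher value includes the lower ones."""
    NONE = 0
    READER = 1
    WRITER = 2
    OWNER = 3


def can_update_pipeline(
    acls: Mapping[str, Mapping[str, Role]],  # repo name -> {user -> role}
    user: str,
    input_repos: Iterable[str],
    output_repo: str,
) -> bool:
    """Require READER on every input repo and WRITER on the output repo."""

    def role_on(repo: str) -> Role:
        # A missing repo or a missing user entry means no access.
        return acls.get(repo, {}).get(user, Role.NONE)

    reads_all_inputs = all(role_on(repo) >= Role.READER for repo in input_repos)
    writes_output = role_on(output_repo) >= Role.WRITER
    return reads_all_inputs and writes_output


# Hypothetical ACLs: alice created the pipeline, so she owns its output repo.
acls = {
    "images": {"alice": Role.READER, "bob": Role.READER},
    "edges": {"alice": Role.OWNER},
}

print(can_update_pipeline(acls, "alice", ["images"], "edges"))  # True
print(can_update_pipeline(acls, "bob", ["images"], "edges"))    # False
```

In this model, `bob` can read the input repo but has no role on the output repo, so the check fails until an owner of `edges` or an admin grants him at least `WRITER` access there.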

## Deactivating Authentication

When an enterprise activation code expires, a
Pachyderm cluster with authentication enabled goes into an
`admin-only` state. In this state, only admins have
access to data that is stored in Pachyderm.
This safety measure keeps sensitive data protected, even when
an enterprise subscription lapses. As soon as the enterprise
activation code is updated by using the dashboard or CLI, the
Pachyderm cluster returns to its previous state.

When you deactivate access controls on a Pachyderm cluster
by running `pachctl auth deactivate`, the cluster returns
to its original state, which includes the following changes:

- All ACLs are deleted.
- The cluster returns to being a blank slate with regard to
access control, which means that everyone who can connect
to Pachyderm can access and modify the data in all repos.
- No users are present in Pachyderm, and no one can log in to Pachyderm.
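
The cluster states described in this section can be summarized as a small decision function. This is only a rough model of the documented behavior, not Pachyderm code: `can_access_data` and its parameters are hypothetical names, and `acl_allows` stands in for whatever per-repo ACL check applies to the request.

```python
def can_access_data(
    auth_enabled: bool,
    enterprise_code_valid: bool,
    is_admin: bool,
    acl_allows: bool,
) -> bool:
    """Decide whether a connected user can reach data in a repo."""
    if not auth_enabled:
        # Access controls deactivated: no ACLs exist, so anyone who can
        # connect to the cluster can access and modify all repos.
        return True
    if not enterprise_code_valid:
        # Activation code expired: the cluster is in the admin-only state.
        return is_admin
    # Normal operation: admins can do anything, others are bound by the ACL.
    return is_admin or acl_allows


# A regular user with READER access on a repo, before and after expiry.
print(can_access_data(True, True, False, True))   # True
print(can_access_data(True, False, False, True))  # False: admin-only state
```

The ordering of the checks reflects the text: deactivation removes access control entirely, while an expired activation code keeps it in place and restricts access to admins until the code is updated.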