# Implementing Auth Package with KV Database

This document describes the entities of the lakeFS auth package and the relationships between them, and offers
a possible representation of them in the KV database.

## Entities

Some entities, like `User` and `Group`, have an `ID` field which is a serial int key handled by Postgres.
It will be migrated to a generated random string ID.

### User

- Can be looked up by `ID`, `Username` & `Email`.
- Deleted by `Username`.
- Listed by `Username`.

### Group

- Looked up, deleted and listed by `DisplayName`.

### Policies

- Looked up, deleted and listed by `DisplayName`.

### Credentials

- Has a foreign ID reference to `User.ID`.
- The `User.ID` is used for looking up the `User.Username`.
- Almost all actions are in the context of `User.Username`, i.e. it is always passed.
- The only action without a `User.Username` passed is credentials lookup by `AccessKeyID`.

## Entities Relationship

### Group <-> User

- Add/Remove `User` to `Group` by `User.Username` & `Group.DisplayName`.
- List all the `Group`s for a `User` by `User.Username`.
- List all the `User`s in a `Group` by `Group.DisplayName`.

### Policy <-> User

- Attach/Detach `Policy` to `User` by `User.Username` & `Policy.DisplayName`.
- List all `Policies` for a `User` by `User.Username`.
- List all effective `Policies` for a `User` by `User.Username`. This includes all `Group Policies` for Groups the user is a member of.

### Policy <-> Group

- Attach/Detach `Policy` to `Group` by `Group.DisplayName` & `Policy.DisplayName`.
- List all `Policies` for a `Group` by `Group.DisplayName`.
- `Group` cannot be a member of another `Group`, therefore all effective policies are attached to it directly.
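
As an illustration of the "effective policies" rule from the Policy <-> User relationship, here is a minimal in-memory sketch. It is not the lakeFS implementation: `AuthState`, `list_user_groups` and `list_effective_policies` are hypothetical names, and plain dictionaries stand in for the real store. Because a `Group` cannot contain another `Group`, a single pass over the user's groups is enough.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class AuthState:
    """Hypothetical in-memory stand-in for the auth relationships."""

    # User.Username -> attached Policy.DisplayName values
    user_policies: Dict[str, Set[str]] = field(default_factory=dict)
    # Group.DisplayName -> attached Policy.DisplayName values
    group_policies: Dict[str, Set[str]] = field(default_factory=dict)
    # Group.DisplayName -> member User.Username values
    group_users: Dict[str, Set[str]] = field(default_factory=dict)


def list_user_groups(state: AuthState, username: str) -> List[str]:
    """List all the Groups for a User by User.Username."""
    return sorted(
        group for group, members in state.group_users.items() if username in members
    )


def list_effective_policies(state: AuthState, username: str) -> List[str]:
    """Union of the user's own policies and those of every group it belongs to."""
    effective = set(state.user_policies.get(username, set()))
    # Groups are never nested, so no recursion is needed here.
    for group in list_user_groups(state, username):
        effective |= state.group_policies.get(group, set())
    return sorted(effective)
```

The result is returned sorted by `DisplayName` only to make the output deterministic; the document does not prescribe an ordering for effective policies.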

## Representation in the KV world

- All keys are prefixed by the reserved package prefix `auth/`.

- `User` keys will be in the form of `users/<UserName>`.
- `Group` keys will be in the form of `groups/<DisplayName>`.
- `Policy` keys will be in the form of `policies/<DisplayName>`.
- `Credentials` keys will be in the form of `users/<UserName>/credentials/<AccessKeyID>`.

### Handling Secondary Indexes

Below are 2 options that vary in efficiency and complexity of the implementation.

#### Store just the minimum indexes

Keep only the minimum that is required to represent the relationship of the entities:

- A `User` membership of a `Group` under `groups/<DisplayName>/users/<Username>`.
- A `Policy` attached to a `User` under `users/<Username>/policies/<DisplayName>`.
- A `Policy` attached to a `Group` under `groups/<DisplayName>/policies/<DisplayName>`.

Any deletion of an entity would first remove all its secondary indexes, then the entity itself.
For example, deleting a `Policy` would first have to list the `Policies` of all `User`s & `Group`s and delete any attachment that exists.
Only then should it delete the `Policy` entity from the store.

Listing by anything other than the key will require listing all the relevant entities.
Some examples:

1. Finding a `User` by `AccessKeyID` will require listing all the users.
1. Listing a `User`'s effective policies will require listing all entities under `groups/`.

The number of entities in the Auth world is small (<10k) and they are cached in the server, so it's unlikely that this will incur a
notable performance degradation.

#### Store all secondary indexes

In addition to the indexes in the above suggestion, also store:

- A `User` membership of a `Group` under `users/<Username>/groups/<DisplayName>`.
- `Credentials` attached to a `User` under `credentials/<AccessKeyID>/<Username>`.
- A `Policy` attached to a `User` under `policies/<DisplayName>/users/<Username>`.
- A `Policy` attached to a `Group` under `policies/<DisplayName>/groups/<DisplayName>`.

Every operation that updates a relationship between two entities would have to update every key that stores it.
The upside is a possible increase in performance ("possible" since we read fewer entities, but perform many more round trips to the cache/store).

### Working with no locks

In the Postgres era, we rely on it to keep the entities and relationships consistent.
For example, a deleted `Policy` would cascade to the `auth_user_policies` & `auth_group_policies` tables.

In the KV world, we must clean up the secondary indexes first, before deleting the entity itself.
Since the KV is lock-free, we might still run into trouble.
For example, a `Policy` is requested to be deleted. The operation starts by listing all `User`s & `Group`s with that `Policy` attached to them.
While iterating, it's possible that the `Policy` gets attached to another `User` that was not listed. Once the `Policy` is deleted,
we would be left with a dangling attachment to it. There are at least 2 mitigations for it:

1. Do (a) list and delete of secondary indexes (b) delete the entity (c) another round of list and delete.
2. Any store access should consider that the secondary indexes are fragile, and should always retrieve the entities pointed to by them.
   The "truth" is in the entities themselves and not in the secondary index.

### Decision

Due to its simplicity and without clear evidence of lakeFS installations with many users, I believe that option #1 (storing just the minimum indexes) is better.
As OIDC would likely be introduced soon, it's even likelier that users will handle the auth part elsewhere for installations with many users.
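
To make the chosen layout more concrete, the following is a rough sketch of the minimal-index key scheme together with the two mitigations from "Working with no locks". It rests on several assumptions: `DictKV` is a hypothetical dictionary-backed stand-in for the KV store, all function names are invented for illustration, and entity names are assumed not to contain `/`. It is not the lakeFS implementation.

```python
AUTH_PREFIX = "auth/"  # reserved package prefix from the document


def user_key(username):
    return f"{AUTH_PREFIX}users/{username}"


def policy_key(policy):
    return f"{AUTH_PREFIX}policies/{policy}"


def user_policy_key(username, policy):
    return f"{AUTH_PREFIX}users/{username}/policies/{policy}"


def group_policy_key(group, policy):
    return f"{AUTH_PREFIX}groups/{group}/policies/{policy}"


class DictKV:
    """Hypothetical lock-free KV store: a dict with a sorted prefix scan."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

    def delete(self, key):
        self.data.pop(key, None)

    def scan(self, prefix):
        return sorted(k for k in self.data if k.startswith(prefix))


def detach_policy_everywhere(kv, policy):
    """List every User and Group key and delete attachments to `policy`."""
    suffix = f"/policies/{policy}"
    removed = 0
    for owner_prefix in (f"{AUTH_PREFIX}users/", f"{AUTH_PREFIX}groups/"):
        for key in kv.scan(owner_prefix):
            if key.endswith(suffix):
                kv.delete(key)
                removed += 1
    return removed


def delete_policy(kv, policy):
    """Mitigation 1: clean indexes, delete the entity, clean indexes again."""
    detach_policy_everywhere(kv, policy)  # (a) first list-and-delete pass
    kv.delete(policy_key(policy))         # (b) delete the entity itself
    detach_policy_everywhere(kv, policy)  # (c) catch attachments added during (a)


def list_user_policies(kv, username):
    """Mitigation 2: follow each attachment and keep only existing policies."""
    prefix = f"{AUTH_PREFIX}users/{username}/policies/"
    names = [key[len(prefix):] for key in kv.scan(prefix)]
    return [name for name in names if kv.get(policy_key(name)) is not None]
```

Note that the second pass only narrows the race window: an attachment written after step (c) would still dangle, which is why the read path in `list_user_policies` also checks that the `Policy` entity exists.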