github.com/git-lfs/git-lfs@v2.5.2+incompatible/docs/proposals/locking.md

# Locking feature proposal

We need the ability to lock files to discourage (we can never prevent) parallel
editing of binary files which will result in an unmergeable situation. This is
not a common theme in git (for obvious reasons, it conflicts with its
distributed, parallel nature), but is a requirement of any binary management
system, since files are very often completely unmergeable, and no-one likes
having to throw their work away & do it again.

## What not to do: single branch model

The simplest way to organise locking is to require that binary files are only
ever edited on a single branch, so that editing a file can follow a simple
sequence:

1. File starts out read-only locally
2. User locks the file, user is required to have the latest version locally from
   the 'main' branch
3. User edits file & commits 1 or more times
4. User pushes these commits to the main branch
5. File is unlocked (and made read only locally again)

## A more usable approach: multi-branch model

In practice teams need to work on more than one branch, and sometimes that work
will have corresponding binary edits.

It's important to remember that the core requirement is to prevent *unintended
parallel edits of an unmergeable file*.

One way to address this would be to say that locking a file locks it across all
branches, and that lock is only released when the branch containing the edit is
merged back into a 'primary' branch. The problem is that although that allows
branching and also prevents merge conflicts, it forces merging of feature
branches before a further edit can be made by someone else.

An alternative is that locking a file locks it across all branches, but when the
lock is released, further locks on that file can only be taken on a descendant
of the latest edit that has been made, whichever branch it is on. That means
a change to the rules of the lock sequence, namely:

1. File starts out read-only locally
2. User tries to lock a file. This is only allowed if:
   * The file is not already locked by anyone else, AND
   * One of the following is true:
      * The user has, or agrees to check out, a descendant of the latest commit
        that was made for that file, whatever branch that was on, OR
      * The user stays on their current commit but resets the locked file to the
        state of the latest commit (making it modified locally, and
        also cherry-picking changes for that file in practice).
3. User edits file & commits 1 or more times, on any branch they like
4. User pushes the commits
5. File is unlocked if:
   * the latest commit to that file has been pushed (on any branch), and
   * the file is not locally edited

This means that long-running branches can be maintained but that editing of a
binary file must always incorporate the latest binary edits. This means that if
this system is always respected, there is only ever one linear stream of
development for this binary file, even though that 'thread' may wind its way
across many different branches in the process.

This does mean that no-one's changes are accidentally lost, but it does mean
that we are either making new branches dependent on others, OR we're
cherry-picking changes to individual files across branches. This does change
the traditional git workflow, but importantly it achieves the core requirement
of never *accidentally* losing anyone's changes. How changes are threaded
across branches is always under the user's control.

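The multi-branch lock rules described above can be sketched as a small decision
function. This is only an illustration under stated assumptions: `LockAttempt`,
`CanLock` and their fields are hypothetical names, not part of Git LFS, and the
ancestry check is reduced to a boolean that a real client would have to compute
from the commit graph.

```go
package main

import "fmt"

// LockAttempt is a hypothetical summary of the facts needed to apply the
// multi-branch lock rules. None of these names come from Git LFS.
type LockAttempt struct {
	LockedByOther     bool // someone else already holds a lock on the file
	HasLatestEdit     bool // user has (or will check out) a descendant of the file's latest commit
	WillResetToLatest bool // user stays put but resets the file to its latest committed state
}

// CanLock reports whether the lock may be taken, plus a short reason.
func CanLock(a LockAttempt) (bool, string) {
	if a.LockedByOther {
		return false, "file is already locked by someone else"
	}
	if a.HasLatestEdit || a.WillResetToLatest {
		return true, "lock granted"
	}
	return false, "working copy does not include the latest edit of the file"
}

func main() {
	ok, why := CanLock(LockAttempt{HasLatestEdit: true})
	fmt.Println(ok, why)
}
```
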
## Breaking the rules
We must allow the user to break the rules if they know what they are doing.
Locking is there to prevent unintended binary merge conflicts, but sometimes you
might want to intentionally create one, with the full knowledge that you're
going to have to manually merge the result (or more likely, pick one side and
discard the other) later down the line. There are 2 cases of rule breaking to
support:

1. **Break someone else's lock**
   People lock files and forget they've locked them, then go on holiday, or
   worse, leave the company. You can't be stuck not being able to edit that file
   so must be able to forcibly break someone else's lock. Ideally this should
   result in some kind of notification to the original locker (might need to be a
   special value-add on BB/Stash). This effectively removes the other person's
   lock and is likely to cause them problems if they had edited and try to push
   next time.

2. **Allow a parallel lock**
   Actually similar to breaking someone else's lock, except it lets you take
   another lock on a file in parallel, leaving their lock in place too, and
   knowing that you're going to have to resolve the merge problem later. You
   could handle this just by manually making files read/write, then using 'force
   push' to override hooks that prevent pushing when not locked. However by
   explicitly registering a parallel lock (possible form: 'git lfs lock
   --force') this could be recorded and communicated to anyone else with a lock,
   letting them know about possible merge issues down the line.

## Detailed feature points
|No | Feature | Notes
|---|---------|------------------
|1 |Lock server must be available at same API URL|
|2 |Identify unmergeable files as subset of lfs files|`git lfs track -b` ?
|3 |Make unmergeable files read-only on checkout|Perform in smudge filter
|4 |Lock a file<ul><li>Check with server which must atomically check/set</li><li>Check person requesting the lock is checked out on a commit which is a descendant of the last edit of that file (locally or on server, although last lock shouldn't have been released until push anyway), or allow --force to break rule</li><li>Record lock on server</li><li>Make file read/write locally if success</li></ul>|`git lfs lock <file>`?
|5 |Release a lock<ul><li>Check if locally modified, if so must discard</li><li>Check if user has more recent commit of this file than server, if so must push first</li><li>Release lock on server atomically</li><li>Make local file read-only</li></ul>|`git lfs unlock <file>`?
|6 |Break a lock, ie override someone else's lock and take it yourself.<ul><li>Release lock on server atomically</li><li>Proceed as per 'Lock a file'</li><li>Notify original lock holder HOW?</li></ul>|`git lfs lock -break <file>`?
|7 |Release lock on reset (maybe). Configurable option / prompt? May be resetting just to start editing again|
|8 |Release lock on push (maybe, if unmodified). See above|
|9 |Cater for read-only binary files when merging locally<ul><li>Because files are read-only this might prevent merge from working when actually it's valid.</li><li>Always fine to merge the latest version of a binary file to anywhere else</li><li>Fine to merge the non-latest version if user is aware that this may cause merge problems (see Breaking the rules)</li><li>Therefore this feature is about dealing with the read-only flag and issuing a warning if not the latest</li></ul>|
|10 |List current locks<ul><li>That the current user has</li><li>That anyone has</li><li>Potentially scoped to folder</li></ul>|`git lfs lock --list [paths...]`
|11 |Reject a push containing a binary file currently locked by someone else|pre-receive hook on server, allow --force to override (i.e. existing parameter to git push)

## Locking challenges

### Making files read-only

This is useful because it provides a reminder that the user should be
locking the file before they start to edit it, to avoid the case of an unexpected
merge later on.

I've done some tests with chmod and discovered:

* Removing the write bit doesn't cause the file to be marked modified (good)
* In most editors it either prevents saving or (in Apple tools) prompts to
  'unlock'. The latter is slightly unhelpful
* In terms of marking files that need locking, adding custom flags to
  .gitattributes (like 'lock') seems to work; `git check-attr -a <file>`
  correctly lists the custom attribute
* Once a file is marked read-only however, `git checkout` replaces it without
  prompting, with the write bit set
* We can use the `post-checkout` hook to make files read-only, but we don't get
  any file information, only refs. This means we'd have to scan the whole working
  copy to figure out what we needed to mark read-only. To do this we'd have to
  have the attribute information and all the current lock information. This
  could be time consuming.
* A way to speed up the `post-checkout` would be to diff the pre- and post-ref
  information that's provided and only check the files that changed. In the case
  of single-file checkouts I'm not sure this is possible though.
* We could also feed either the diff or a file scan into `git check-attr --stdin`
  in order to share the exe, or do our own attribute matching
* It's not entirely clear yet how merge & rebase might operate. May also need
  the `post-merge` hook
* See contrib/hooks/setgitperms.perl for an example; so this isn't unprecedented

#### Test cases for post-checkout

* Checkout a branch
  * Calls `post-checkout` with pre/post SHA and branch=1
* Checkout a tag
  * Calls `post-checkout` with pre/post SHA and branch=1 (even though it's a tag)
* Checkout by commit SHA
  * Calls `post-checkout` with pre/post SHA and branch=1 (even though it's a plain SHA)
* Checkout named files (e.g. discard changes)
  * Calls `post-checkout` with identical pre/post SHA (HEAD) and branch=0
* Reset all files (discard all changes ie git reset --hard HEAD)
  * Doesn't call `post-checkout` - could restore write bit, but must have been
    set anyway for file to be edited, so not a problem?
* Reset a branch to a previous commit
  * Doesn't call `post-checkout` - PROBLEM because can restore write bit & file
    was not modified. BUT: rare & maybe liveable
* Merge a branch with lockable file changes (non-conflicting)
* Rebase a branch with lockable files (non-conflicting)
* Merge conflicts - fix then commit
* Rebase conflicts - fix then continue

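The three arguments git passes to `post-checkout` (previous ref, new ref, and
the branch flag seen in the test cases above) could be interpreted along these
lines. `CheckoutEvent`, `ParseCheckoutArgs` and the strategy names are
hypothetical. The sketch only decides between diffing the two refs and falling
back to a full scan; it does not run git.

```go
package main

import (
	"errors"
	"fmt"
)

// CheckoutEvent holds the three arguments passed to a post-checkout hook.
type CheckoutEvent struct {
	PrevRef  string
	NewRef   string
	IsBranch bool // flag "1": branch-style checkout, "0": file checkout
}

// ParseCheckoutArgs expects exactly <prev-ref> <new-ref> <flag>.
func ParseCheckoutArgs(args []string) (CheckoutEvent, error) {
	if len(args) != 3 {
		return CheckoutEvent{}, errors.New("expected <prev-ref> <new-ref> <flag>")
	}
	return CheckoutEvent{PrevRef: args[0], NewRef: args[1], IsBranch: args[2] == "1"}, nil
}

// Strategy picks how to find the files whose write bit may need resetting.
func (e CheckoutEvent) Strategy() string {
	if e.IsBranch && e.PrevRef != e.NewRef {
		return "diff-refs" // only files changed between the two refs need checking
	}
	return "full-scan" // file checkout or identical refs: no useful diff available
}

func main() {
	// Sample arguments; a real hook would pass os.Args[1:] instead.
	ev, err := ParseCheckoutArgs([]string{"1111111", "2222222", "1"})
	if err != nil {
		panic(err)
	}
	fmt.Println(ev.Strategy())
}
```
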
## Implementation details (Initial simple API-only pass)
### Types
To make implementing locking on the lfs-test-server, as well as on other servers
in the future, easier, it makes sense to create a `lock` package that can be
depended upon from any server. This will go along with Steve's refactor which
touches the `lfs` package quite a bit.

Below are enumerated some of the types that will presumably land in this
sub-package.

```go
import "time"

// Lock represents a single lock held against a particular path.
//
// Locks returned from the API may or may not be currently active, according to
// the UnlockedAt field (see Active).
type Lock struct {
	// Id is the unique identifier corresponding to this particular Lock. It
	// must be consistent with the local copy, and the server's copy.
	Id string `json:"id"`
	// Path is an absolute path to the file that is locked as a part of this
	// lock.
	Path string `json:"path"`
	// Committer is the author who initiated this lock.
	Committer struct {
		Name  string `json:"name"`
		Email string `json:"email"`
	} `json:"creator"`
	// CommitSHA is the commit that this Lock was created against. It is
	// strictly equal to the SHA of the minimum commit negotiated in order
	// to create this lock.
	CommitSHA string `json:"commit_sha"`
	// LockedAt is a required parameter that represents the instant in time
	// that this lock was created. For most server implementations, this
	// should be set to the instant at which the lock was initially
	// received.
	LockedAt time.Time `json:"locked_at"`
	// UnlockedAt is an optional parameter that represents the instant in
	// time that the lock stopped being active. If the lock is still active,
	// the server can either a) not send this field, or b) send the
	// zero-value of time.Time.
	UnlockedAt time.Time `json:"unlocked_at,omitempty"`
}

// Active returns whether or not the given lock is still active against the file
// that it is protecting.
func (l *Lock) Active() bool {
	return l.UnlockedAt.IsZero()
}
```

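To illustrate the two ways a server may signal a still-active lock, here is a
small sketch that decodes three made-up payloads into a trimmed copy of the
`Lock` type and calls `Active()` on each. The payload values and the `decode`
helper are assumptions for illustration. Note that Go's `encoding/json` never
treats a `time.Time` as empty, so `omitempty` will not drop the field when
marshalling; a Go server would in practice send the zero value.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Lock is a trimmed copy of the proposed type, keeping only what Active needs.
type Lock struct {
	Id         string    `json:"id"`
	Path       string    `json:"path"`
	LockedAt   time.Time `json:"locked_at"`
	UnlockedAt time.Time `json:"unlocked_at,omitempty"`
}

// Active reports whether the lock has not been released yet.
func (l *Lock) Active() bool { return l.UnlockedAt.IsZero() }

// decode is a hypothetical helper that parses one JSON-encoded lock.
func decode(payload string) (Lock, error) {
	var l Lock
	err := json.Unmarshal([]byte(payload), &l)
	return l, err
}

func main() {
	payloads := []string{
		// unlocked_at omitted entirely
		`{"id":"1","path":"a.psd","locked_at":"2016-05-17T15:49:06Z"}`,
		// unlocked_at sent as the zero value of time.Time
		`{"id":"1","path":"a.psd","locked_at":"2016-05-17T15:49:06Z","unlocked_at":"0001-01-01T00:00:00Z"}`,
		// unlocked_at set: the lock has been released
		`{"id":"1","path":"a.psd","locked_at":"2016-05-17T15:49:06Z","unlocked_at":"2016-05-18T09:00:00Z"}`,
	}
	for _, p := range payloads {
		l, err := decode(p)
		if err != nil {
			panic(err)
		}
		fmt.Println(l.Active())
	}
}
```
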
### Proposed Commands

#### `git lfs lock <path>`

The `lock` command will be used in accordance with the multi-branch flow as
proposed above to request that a lock be granted on the specific path passed as
an argument to the command.

```go
// LockRequest encapsulates the payload sent across the API when a client would
// like to obtain a lock against a particular path on a given remote.
type LockRequest struct {
	// Path is the path that the client would like to obtain a lock against.
	Path string `json:"path"`
	// LatestRemoteCommit is the SHA of the last known commit from the
	// remote that we are trying to create the lock against, as found in
	// `.git/refs/remotes/origin/<name>`.
	LatestRemoteCommit string `json:"latest_remote_commit"`
	// Committer is the individual that wishes to obtain the lock.
	Committer struct {
		// Name is the name of the individual who would like to obtain the
		// lock, for instance: "Rick Olson".
		Name string `json:"name"`
		// Email is the email associated with the individual who would
		// like to obtain the lock, for instance: "rick@github.com".
		Email string `json:"email"`
	} `json:"committer"`
}
```

```go
// LockResponse encapsulates the information sent over the API in response to
// a `LockRequest`.
type LockResponse struct {
	// Lock is the Lock that was optionally created in response to the
	// payload that was sent (see above). If the lock already exists, then
	// the existing lock is sent in this field instead, and the author of
	// that lock remains the same, meaning that the client failed to obtain
	// that lock. An HTTP status of "409 - Conflict" is used here.
	//
	// If the lock was unable to be created because of an error, this field
	// will hold the zero-value of Lock and the Err field will provide a
	// more detailed set of information.
	Lock Lock `json:"lock"`
	// CommitNeeded holds the minimum commit SHA that client must have to
	// obtain the lock.
	CommitNeeded string `json:"commit_needed"`
	// Err is the optional error that was encountered while trying to create
	// the above lock.
	Err error `json:"error,omitempty"`
}
```

#### `git lfs unlock <path>`

The `unlock` command is responsible for releasing the lock against a particular
file. The command takes a `<path>` argument which the LFS client will have to
internally resolve into an Id to unlock.

The API associated with this command can also be used on the server to remove
existing locks after a push.

```go
// An UnlockRequest is sent by the client over the API when they wish to remove
// a lock associated with the given Id.
type UnlockRequest struct {
	// Id is the identifier of the lock that the client wishes to remove.
	Id string `json:"id"`
}
```

```go
// UnlockResult is the result sent back from the API when asked to remove a
// lock.
type UnlockResult struct {
	// Lock is the lock corresponding to the asked-about lock in the
	// `UnlockRequest` (see above). If no matching lock was found, this
	// field will take the zero-value of Lock, and Err will be non-nil.
	Lock Lock `json:"lock"`
	// Err is an optional field which holds any error that was experienced
	// while removing the lock.
	Err error `json:"error,omitempty"`
}
```

Clients can determine whether or not their lock was removed by calling the
`Active()` method on the returned Lock, if `UnlockResult.Err` is nil.

#### `git lfs locks (-r <remote>|-b <branch>|-p <path>)|(-i id)`

For many operations, the LFS client will need to have knowledge of existing
locks on the server. Additionally, the client should not have to self-sort/index
this (potentially) large set. To remove this need, both the `locks` command and
corresponding API method take several filters.

Clients should turn the flag-values that were passed during the command
invocation into `Filter`s as described below, and batch them up into the
`Filters` field in the `LockListRequest`.

```go
// Property is a constant-type that narrows fields pertaining to the server's
// Locks.
type Property string

const (
	Branch Property = "branch"
	Id     Property = "id"
	// (etc) ...
)

// LockListRequest encapsulates the request sent to the server when the client
// would like a list of locks that match the given criteria.
type LockListRequest struct {
	// Filters is the set of filters to query against. If the client wishes
	// to obtain a list of all locks, an empty array should be passed here.
	Filters []struct {
		// Prop is the property to search against.
		Prop Property `json:"prop"`
		// Value is the value that the property must take.
		Value string `json:"value"`
	} `json:"filters"`
	// Cursor is an optional field used to tell the server which lock was
	// seen last, if scanning through multiple pages of results.
	//
	// Servers must return a list of locks sorted in reverse chronological
	// order, so the Cursor provides a consistent method of viewing all
	// locks, even if more were created between two requests.
	Cursor string `json:"cursor,omitempty"`
	// Limit is the maximum number of locks to return in a single page.
	Limit int `json:"limit"`
}
```

```go
// LockList encapsulates a set of Locks.
type LockList struct {
	// Locks is the set of locks returned back, typically matching the query
	// parameters sent in the LockListRequest call. If no locks were matched
	// from a given query, then `Locks` will be represented as an empty
	// array.
	Locks []Lock `json:"locks"`
	// NextCursor returns the Id of the Lock the client should update its
	// cursor to, if there are multiple pages of results for a particular
	// `LockListRequest`.
	NextCursor string `json:"next_cursor,omitempty"`
	// Err populates any error that was encountered during the search. If no
	// error was encountered and the operation was successful, then a value
	// of nil will be passed here.
	Err error `json:"error,omitempty"`
}
367 // error was encountered and the operation was succesful, then a value
368 // of nil will be passed here.
369 Err error `json:"error,omitempty"`
370 }