github.com/git-lfs/git-lfs@v2.5.2+incompatible/docs/proposals/locking.md

# Locking feature proposal

We need the ability to lock files to discourage (we can never prevent) parallel
editing of binary files which will result in an unmergeable situation. This is
not a common theme in git (for obvious reasons, it conflicts with its
distributed, parallel nature), but is a requirement of any binary management
system, since files are very often completely unmergeable, and no-one likes
having to throw their work away & do it again.

## What not to do: single branch model

The simplest way to organise locking is to require that binary files are only
ever edited on a single branch, so that editing a file can follow a simple
sequence:

1. File starts out read-only locally
2. User locks the file, user is required to have the latest version locally from
   the 'main' branch
3. User edits file & commits 1 or more times
4. User pushes these commits to the main branch
5. File is unlocked (and made read only locally again)

## A more usable approach: multi-branch model

In practice teams need to work on more than one branch, and sometimes that work
will have corresponding binary edits.

It's important to remember that the core requirement is to prevent *unintended
parallel edits of an unmergeable file*.

One way to address this would be to say that locking a file locks it across all
branches, and that lock is only released when the branch containing the edit is
merged back into a 'primary' branch. The problem is that although that allows
branching and also prevents merge conflicts, it forces merging of feature
branches before a further edit can be made by someone else.

An alternative is that locking a file locks it across all branches, but when the
lock is released, further locks on that file can only be taken on a descendant
of the latest edit that has been made, whichever branch it is on. That means
a change to the rules of the lock sequence, namely:

1. File starts out read-only locally
2. User tries to lock a file. This is only allowed if:
   * The file is not already locked by anyone else, AND
   * One of the following is true:
      * The user has, or agrees to check out, a descendant of the latest commit
        that was made for that file, whatever branch that was on, OR
      * The user stays on their current commit but resets the locked file to the
        state of the latest commit (making it modified locally, and
        also cherry-picking changes for that file in practice).
3. User edits file & commits 1 or more times, on any branch they like
4. User pushes the commits
5. File is unlocked if:
   * the latest commit to that file has been pushed (on any branch), and
   * the file is not locally edited

This means that long-running branches can be maintained but that editing of a
binary file must always incorporate the latest binary edits. As a result, if
this system is always respected, there is only ever one linear stream of
development for this binary file, even though that 'thread' may wind its way
across many different branches in the process.

This does mean that no-one's changes are accidentally lost, but it also means
that we are either making new branches dependent on others, OR we're
cherry-picking changes to individual files across branches. This does change
the traditional git workflow, but importantly it achieves the core requirement
of never *accidentally* losing anyone's changes. How changes are threaded
across branches is always under the user's control.

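To make rule 2 of the multi-branch sequence concrete, here is a minimal sketch
of the check a lock request might go through. Everything in it is an assumption
for illustration: `commitGraph`, `isDescendant` and `canLock` are hypothetical
names, history is modelled as an in-memory map rather than a real repository,
and having the latest commit itself checked out is treated as acceptable. The
"reset the file to the latest state" alternative is not modelled.

```go
package main

import "fmt"

// commitGraph maps a commit ID to its parent commit IDs. It is a hypothetical,
// in-memory stand-in for real repository history.
type commitGraph map[string][]string

// isDescendant reports whether commit is ancestor itself or descends from it.
func (g commitGraph) isDescendant(commit, ancestor string) bool {
	seen := map[string]bool{}
	stack := []string{commit}
	for len(stack) > 0 {
		c := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if c == ancestor {
			return true
		}
		if seen[c] {
			continue
		}
		seen[c] = true
		stack = append(stack, g[c]...)
	}
	return false
}

// canLock applies the two conditions of rule 2: nobody else holds the lock,
// and the user's checkout (head) contains the latest edit of the file.
func canLock(g commitGraph, lockedBy, user, head, latestEdit string) error {
	if lockedBy != "" && lockedBy != user {
		return fmt.Errorf("already locked by %s", lockedBy)
	}
	if !g.isDescendant(head, latestEdit) {
		return fmt.Errorf("checkout %s does not contain latest edit %s", head, latestEdit)
	}
	return nil
}

func main() {
	// History: a <- b <- c on one branch, a <- d on another.
	g := commitGraph{"b": {"a"}, "c": {"b"}, "d": {"a"}}
	fmt.Println(canLock(g, "", "alice", "c", "b"))    // allowed
	fmt.Println(canLock(g, "", "alice", "d", "b"))    // refused: d lacks edit b
	fmt.Println(canLock(g, "bob", "alice", "c", "b")) // refused: bob holds it
}
```

A `--force` style override, as discussed under breaking the rules, would simply
skip one or both of these checks.
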
## Breaking the rules
We must allow the user to break the rules if they know what they are doing.
Locking is there to prevent unintended binary merge conflicts, but sometimes you
might want to intentionally create one, with the full knowledge that you're
going to have to manually merge the result (or more likely, pick one side and
discard the other) later down the line. There are 2 cases of rule breaking to
support:

1. **Break someone else's lock**
  People lock files and forget they've locked them, then go on holiday, or
  worse, leave the company. You can't be left unable to edit that file, so you
  must be able to forcibly break someone else's lock. Ideally this should
  result in some kind of notification to the original locker (might need to be a
  special value-add on BB/Stash). This effectively removes the other person's
  lock and is likely to cause them problems if they have edited the file and
  try to push next time.

2. **Allow a parallel lock**
  Actually similar to breaking someone else's lock, except it lets you take
  another lock on a file in parallel, leaving their lock in place too, and
  knowing that you're going to have to resolve the merge problem later. You
  could handle this just by manually making files read/write, then using 'force
  push' to override hooks that prevent pushing when not locked. However by
  explicitly registering a parallel lock (possible form: 'git lfs lock
  --force') this could be recorded and communicated to anyone else with a lock,
  letting them know about possible merge issues down the line.

## Detailed feature points
|No | Feature | Notes
|---|---------|------------------
|1  |Lock server must be available at same API URL|
|2  |Identify unmergeable files as subset of lfs files|`git lfs track -b` ?
|3  |Make unmergeable files read-only on checkout|Perform in smudge filter
|4  |Lock a file<ul><li>Check with server which must atomically check/set</li><li>Check person requesting the lock is checked out on a commit which is a descendant of the last edit of that file (locally or on server, although last lock shouldn't have been released until push anyway), or allow --force to break rule</li><li>Record lock on server</li><li>Make file read/write locally if success</li></ul>|`git lfs lock <file>`?
|5  |Release a lock<ul><li>Check if locally modified, if so must discard</li><li>Check if user has more recent commit of this file than server, if so must push first</li><li>Release lock on server atomically</li><li>Make local file read-only</li></ul>|`git lfs unlock <file>`?
|6  |Break a lock, ie override someone else's lock and take it yourself.<ul><li>Release lock on server atomically</li><li>Proceed as per 'Lock a file'</li><li>Notify original lock holder HOW?</li></ul>|`git lfs lock -break <file>`?
|7  |Release lock on reset (maybe). Configurable option / prompt? May be resetting just to start editing again|
|8  |Release lock on push (maybe, if unmodified). See above|
|9  |Cater for read-only binary files when merging locally<ul><li>Because files are read-only this might prevent merge from working when actually it's valid.</li><li>Always fine to merge the latest version of a binary file to anywhere else</li><li>Fine to merge the non-latest version if user is aware that this may cause merge problems (see Breaking the rules)</li><li>Therefore this feature is about dealing with the read-only flag and issuing a warning if not the latest</li></ul>|
|10 |List current locks<ul><li>That the current user has</li><li>That anyone has</li><li>Potentially scoped to folder</li></ul>|`git lfs lock --list [paths...]`
|11 |Reject a push containing a binary file currently locked by someone else|pre-receive hook on server, allow --force to override (i.e. existing parameter to git push)

## Locking challenges

### Making files read-only

This is useful because it reminds the user that they should lock the file
before they start to edit it, to avoid the case of an unexpected merge later
on.

I've done some tests with chmod and discovered:

* Removing the write bit doesn't cause the file to be marked modified (good)
* In most editors it either prevents saving or (in Apple tools) prompts to
  'unlock'. The latter is slightly unhelpful
* In terms of marking files that need locking, adding custom flags to
  .gitattributes (like 'lock') seems to work; `git check-attr -a <file>`
  correctly lists the custom attribute
* Once a file is marked read-only however, `git checkout` replaces it without
  prompting, with the write bit set
* We can use the `post-checkout` hook to make files read-only, but we don't get
  any file information, only refs. This means we'd have to scan the whole
  working copy to figure out what we needed to mark read-only. To do this we'd
  have to have the attribute information and all the current lock information.
  This could be time consuming.
  * A way to speed up the `post-checkout` would be to diff the pre- and post-ref
    information that's provided and only check the files that changed. In the case
    of single-file checkouts I'm not sure this is possible though.
  * We could also feed either the diff or a file scan into `git check-attr --stdin`
    in order to share the exe, or do our own attribute matching
* It's not entirely clear yet how merge & rebase might operate. May also need
  the `post-merge` hook
* See contrib/hooks/setgitperms.perl for an example, so this isn't unprecedented

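As a rough illustration of the chmod approach tested above, the following
sketch clears and restores the write bit from Go. The helper names
(`setReadOnly`, `setWritable`, `demo`) and the temporary file are hypothetical;
the sketch assumes POSIX-style permission bits and says nothing about how a
real client would decide which files to touch.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// setReadOnly clears every write bit on path and keeps the other permissions.
func setReadOnly(path string) error {
	info, err := os.Stat(path)
	if err != nil {
		return err
	}
	return os.Chmod(path, info.Mode().Perm()&^0o222)
}

// setWritable restores the owner's write bit, e.g. once a lock is granted.
func setWritable(path string) error {
	info, err := os.Stat(path)
	if err != nil {
		return err
	}
	return os.Chmod(path, info.Mode().Perm()|0o200)
}

// demo creates a temporary file, marks it read-only, then writable again, and
// returns the permission bits observed after each step.
func demo() (os.FileMode, os.FileMode, error) {
	dir, err := os.MkdirTemp("", "lockdemo")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "model.bin")
	if err := os.WriteFile(path, []byte("data"), 0o644); err != nil {
		return 0, 0, err
	}
	if err := setReadOnly(path); err != nil {
		return 0, 0, err
	}
	ro, err := os.Stat(path)
	if err != nil {
		return 0, 0, err
	}
	if err := setWritable(path); err != nil {
		return 0, 0, err
	}
	rw, err := os.Stat(path)
	if err != nil {
		return 0, 0, err
	}
	return ro.Mode().Perm(), rw.Mode().Perm(), nil
}

func main() {
	ro, rw, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("read-only: %v, writable again: %v\n", ro, rw)
}
```
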
#### Test cases for post-checkout

* Checkout a branch
  * Calls `post-checkout` with pre/post SHA and branch=1
* Checkout a tag
  * Calls `post-checkout` with pre/post SHA and branch=1 (even though it's a tag)
* Checkout by commit SHA
  * Calls `post-checkout` with pre/post SHA and branch=1 (even though it's a plain SHA)
* Checkout named files (e.g. discard changes)
  * Calls `post-checkout` with identical pre/post SHA (HEAD) and branch=0
* Reset all files (discard all changes, i.e. `git reset --hard HEAD`)
  * Doesn't call `post-checkout` - could restore write bit, but must have been
    set anyway for file to be edited, so not a problem?
* Reset a branch to a previous commit
  * Doesn't call `post-checkout` - PROBLEM because can restore write bit & file
    was not modified. BUT: rare & maybe liveable
* Merge a branch with lockable file changes (non-conflicting)
* Rebase a branch with lockable files (non-conflicting)
* Merge conflicts - fix then commit
* Rebase conflicts - fix then continue

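The cases above suggest how a hook could interpret the three arguments git
passes to `post-checkout` (previous HEAD, new HEAD, and the branch flag). The
sketch below is only one possible reading: `checkoutEvent`,
`parsePostCheckout` and `canDiffRefs` are hypothetical names, and a real hook
would read its arguments from `os.Args[1:]`.

```go
package main

import (
	"errors"
	"fmt"
)

// checkoutEvent holds the three arguments git passes to a post-checkout hook.
type checkoutEvent struct {
	PrevRef  string // ref of the previous HEAD
	NewRef   string // ref of the new HEAD
	IsBranch bool   // true when the flag argument is "1"
}

// parsePostCheckout validates and decodes the hook arguments.
func parsePostCheckout(args []string) (checkoutEvent, error) {
	if len(args) != 3 {
		return checkoutEvent{}, errors.New("post-checkout expects 3 arguments")
	}
	switch args[2] {
	case "0", "1":
		return checkoutEvent{
			PrevRef:  args[0],
			NewRef:   args[1],
			IsBranch: args[2] == "1",
		}, nil
	}
	return checkoutEvent{}, fmt.Errorf("unexpected flag %q", args[2])
}

// canDiffRefs reports whether diffing PrevRef against NewRef can narrow down
// the files to re-check. For a named-file checkout the refs are identical, so
// the hook would have to fall back to scanning the working copy.
func (e checkoutEvent) canDiffRefs() bool {
	return e.IsBranch && e.PrevRef != e.NewRef
}

func main() {
	for _, args := range [][]string{
		{"1111111", "2222222", "1"}, // branch, tag or SHA checkout
		{"1111111", "1111111", "0"}, // named-file checkout
	} {
		ev, err := parsePostCheckout(args)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%+v diffable=%v\n", ev, ev.canDiffRefs())
	}
}
```
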
## Implementation details (Initial simple API-only pass)
### Types
To make it easier to implement locking on the lfs-test-server, as well as on
other servers in the future, it makes sense to create a `lock` package that can
be depended upon from any server. This will go along with Steve's refactor which
touches the `lfs` package quite a bit.

Below are enumerated some of the types that will presumably land in this
sub-package.

```go
import "time"

// Lock represents a single lock held against a particular path.
//
// Locks returned from the API may or may not be currently active; use Active()
// to check.
type Lock struct {
	// Id is the unique identifier corresponding to this particular Lock. It
	// must be consistent with the local copy, and the server's copy.
	Id string `json:"id"`
	// Path is an absolute path to the file that is locked as a part of this
	// lock.
	Path string `json:"path"`
	// Committer is the author who initiated this lock.
	Committer struct {
		Name  string `json:"name"`
		Email string `json:"email"`
	} `json:"creator"`
	// CommitSHA is the commit that this Lock was created against. It is
	// strictly equal to the SHA of the minimum commit negotiated in order
	// to create this lock.
	CommitSHA string `json:"commit_sha"`
	// LockedAt is a required parameter that represents the instant in time
	// that this lock was created. For most server implementations, this
	// should be set to the instant at which the lock was initially
	// received.
	LockedAt time.Time `json:"locked_at"`
	// UnlockedAt is an optional parameter that represents the instant in
	// time that the lock stopped being active. If the lock is still active,
	// the server can either a) not send this field, or b) send the
	// zero-value of time.Time.
	UnlockedAt time.Time `json:"unlocked_at,omitempty"`
}

// Active returns whether or not the given lock is still active against the file
// that it is protecting.
func (l *Lock) Active() bool {
	return l.UnlockedAt.IsZero()
}
```

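As an illustration of how a client might consume this type, the sketch below
decodes two JSON payloads and calls `Active()` on each. The `Lock` here is a
trimmed copy for brevity, and the payloads, the file path and the `decodeLock`
helper are made up for the example rather than taken from any real server.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Lock is a trimmed copy of the proposed Lock type, for illustration only.
type Lock struct {
	Id         string    `json:"id"`
	Path       string    `json:"path"`
	LockedAt   time.Time `json:"locked_at"`
	UnlockedAt time.Time `json:"unlocked_at,omitempty"`
}

// Active reports whether the lock has not been released yet.
func (l *Lock) Active() bool {
	return l.UnlockedAt.IsZero()
}

// decodeLock parses a single JSON-encoded lock (hypothetical helper).
func decodeLock(data []byte) (Lock, error) {
	var l Lock
	err := json.Unmarshal(data, &l)
	return l, err
}

func main() {
	// The first payload omits unlocked_at, so the lock is still held.
	held := []byte(`{"id":"1","path":"/art/logo.psd","locked_at":"2016-05-17T15:49:06Z"}`)
	released := []byte(`{"id":"2","path":"/art/icon.psd","locked_at":"2016-05-17T15:49:06Z","unlocked_at":"2016-05-18T09:00:00Z"}`)

	for _, payload := range [][]byte{held, released} {
		l, err := decodeLock(payload)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s active=%v\n", l.Path, l.Active())
	}
}
```

Both server options described for `UnlockedAt` (omitting the field, or sending
the zero value) leave the decoded field at its zero value, so `Active()` treats
them the same way.
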
### Proposed Commands

#### `git lfs lock <path>`

The `lock` command will be used in accordance with the multi-branch flow as
proposed above to request that a lock be granted on the specific path passed as
an argument to the command.

```go
// LockRequest encapsulates the payload sent across the API when a client would
// like to obtain a lock against a particular path on a given remote.
type LockRequest struct {
	// Path is the path that the client would like to obtain a lock against.
	Path string `json:"path"`
	// LatestRemoteCommit is the SHA of the last known commit from the
	// remote that we are trying to create the lock against, as found in
	// `.git/refs/remotes/origin/<name>`.
	LatestRemoteCommit string `json:"latest_remote_commit"`
	// Committer is the individual that wishes to obtain the lock.
	Committer struct {
		// Name is the name of the individual who would like to obtain the
		// lock, for instance: "Rick Olson".
		Name string `json:"name"`
		// Email is the email associated with the individual who would
		// like to obtain the lock, for instance: "rick@github.com".
		Email string `json:"email"`
	} `json:"committer"`
}
```

```go
// LockResponse encapsulates the information sent over the API in response to
// a `LockRequest`.
type LockResponse struct {
	// Lock is the Lock that was optionally created in response to the
	// payload that was sent (see above). If the lock already exists, then
	// the existing lock is sent in this field instead, and the author of
	// that lock remains the same, meaning that the client failed to obtain
	// that lock. An HTTP status of "409 - Conflict" is used here.
	//
	// If the lock was unable to be created because of an error, this field
	// will hold the zero-value of Lock and the Err field will provide a
	// more detailed set of information.
	Lock Lock `json:"lock"`
	// CommitNeeded holds the minimum commit SHA that the client must have
	// to obtain the lock.
	CommitNeeded string `json:"commit_needed"`
	// Err is the optional error that was encountered while trying to create
	// the above lock.
	Err error `json:"error,omitempty"`
}
```

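On the server side, the check for an existing lock and the creation of a new
one have to happen atomically. A minimal in-memory sketch of that check/set,
returning the lock together with an HTTP status, might look like the
following. The `lockStore` type, its `acquire` method and the use of
"201 Created" for success are assumptions made for this example; only the
"409 - Conflict" case comes from the description above.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
)

// Lock is a trimmed stand-in for the proposed Lock type.
type Lock struct {
	Id        string
	Path      string
	Committer string
}

// lockStore is a hypothetical in-memory lock table guarded by a mutex.
type lockStore struct {
	mu     sync.Mutex
	byPath map[string]Lock
	nextID int
}

func newLockStore() *lockStore {
	return &lockStore{byPath: map[string]Lock{}}
}

// acquire atomically checks for an existing lock on path and records a new one
// if there is none. On conflict it returns the existing lock and 409.
func (s *lockStore) acquire(path, committer string) (Lock, int) {
	s.mu.Lock()
	defer s.mu.Unlock()

	if existing, ok := s.byPath[path]; ok {
		return existing, http.StatusConflict
	}
	s.nextID++
	l := Lock{Id: fmt.Sprint(s.nextID), Path: path, Committer: committer}
	s.byPath[path] = l
	return l, http.StatusCreated // assumed success status
}

func main() {
	store := newLockStore()
	for _, who := range []string{"alice", "bob"} {
		l, status := store.acquire("/art/logo.psd", who)
		fmt.Printf("%s requested lock: status=%d holder=%s\n", who, status, l.Committer)
	}
}
```

Holding the mutex across both the lookup and the insert is what makes the
operation a single check/set; a real server would need the equivalent guarantee
from its database.
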
#### `git lfs unlock <path>`

The `unlock` command is responsible for releasing the lock against a particular
file. The command takes a `<path>` argument which the LFS client will have to
internally resolve into an Id to unlock.

The API associated with this command can also be used on the server to remove
existing locks after a push.

```go
// An UnlockRequest is sent by the client over the API when they wish to remove
// a lock associated with the given Id.
type UnlockRequest struct {
	// Id is the identifier of the lock that the client wishes to remove.
	Id string `json:"id"`
}
```

```go
// UnlockResult is the result sent back from the API when asked to remove a
// lock.
type UnlockResult struct {
	// Lock is the lock corresponding to the asked-about lock in the
	// `UnlockRequest` (see above). If no matching lock was found, this
	// field will take the zero-value of Lock, and Err will be non-nil.
	Lock Lock `json:"lock"`
	// Err is an optional field which holds any error that was experienced
	// while removing the lock.
	Err error `json:"error,omitempty"`
}
```

Clients can determine whether or not their lock was removed by calling the
`Active()` method on the returned Lock, if `UnlockResult.Err` is nil.

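The path-to-Id resolution that `unlock` needs could be as simple as the sketch
below, which searches a list of locks already fetched from the server. The
`resolveLockID` helper and the trimmed `Lock` type are hypothetical, and the
sketch assumes the list contains only active locks.

```go
package main

import (
	"errors"
	"fmt"
)

// Lock is a trimmed stand-in for the proposed Lock type.
type Lock struct {
	Id   string
	Path string
}

// resolveLockID returns the Id of the single lock held against path. It fails
// if no lock, or more than one lock, matches.
func resolveLockID(locks []Lock, path string) (string, error) {
	id := ""
	for _, l := range locks {
		if l.Path != path {
			continue
		}
		if id != "" {
			return "", fmt.Errorf("multiple locks found for %s", path)
		}
		id = l.Id
	}
	if id == "" {
		return "", errors.New("no lock found for " + path)
	}
	return id, nil
}

func main() {
	locks := []Lock{
		{Id: "7", Path: "/art/logo.psd"},
		{Id: "9", Path: "/art/icon.psd"},
	}
	fmt.Println(resolveLockID(locks, "/art/icon.psd"))
	fmt.Println(resolveLockID(locks, "/art/missing.psd"))
}
```

The resulting Id is what would be placed in the `UnlockRequest`.
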
#### `git lfs locks (-r <remote>|-b <branch>|-p <path>)|(-i id)`

For many operations, the LFS client will need to have knowledge of existing
locks on the server. Additionally, the client should not have to self-sort/index
this (potentially) large set. To remove this need, both the `locks` command and
corresponding API method take several filters.

Clients should turn the flag-values that were passed during the command
invocation into `Filter`s as described below, and batch them up into the
`Filters` field in the `LockListRequest`.

```go
// Property is a constant-type that narrows fields pertaining to the server's
// Locks.
type Property string

const (
	Branch Property = "branch"
	Id     Property = "id"
	// (etc) ...
)

// LockListRequest encapsulates the request sent to the server when the client
// would like a list of locks that match the given criteria.
type LockListRequest struct {
	// Filters is the set of filters to query against. If the client wishes
	// to obtain a list of all locks, an empty array should be passed here.
	Filters []struct {
		// Prop is the property to search against.
		Prop Property `json:"prop"`
		// Value is the value that the property must take.
		Value string `json:"value"`
	} `json:"filters"`
	// Cursor is an optional field used to tell the server which lock was
	// seen last, if scanning through multiple pages of results.
	//
	// Servers must return a list of locks sorted in reverse chronological
	// order, so the Cursor provides a consistent method of viewing all
	// locks, even if more were created between two requests.
	Cursor string `json:"cursor,omitempty"`
	// Limit is the maximum number of locks to return in a single page.
	Limit int `json:"limit"`
}
```

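A possible way for the client to turn flag values into filters is sketched
below. The named `Filter` type, the `Path` and `Remote` properties, and the
`buildFilters` and `encodeRequest` helpers are all assumptions introduced for
the example; only `Branch` and `Id` appear in the list of properties above.
The slice is created non-nil so that an unfiltered request encodes as an empty
array rather than `null`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Property string

const (
	Branch Property = "branch"
	Id     Property = "id"
	Path   Property = "path"   // hypothetical, not defined by the proposal
	Remote Property = "remote" // hypothetical, not defined by the proposal
)

// Filter names the anonymous filter struct so it can be built piece by piece.
type Filter struct {
	Prop  Property `json:"prop"`
	Value string   `json:"value"`
}

type LockListRequest struct {
	Filters []Filter `json:"filters"`
	Cursor  string   `json:"cursor,omitempty"`
	Limit   int      `json:"limit"`
}

// buildFilters converts the -r, -b, -p and -i flag values into filters,
// skipping flags that were not supplied.
func buildFilters(remote, branch, path, id string) []Filter {
	filters := []Filter{} // non-nil, so it marshals as [] and not null
	add := func(p Property, v string) {
		if v != "" {
			filters = append(filters, Filter{Prop: p, Value: v})
		}
	}
	add(Remote, remote)
	add(Branch, branch)
	add(Path, path)
	add(Id, id)
	return filters
}

// encodeRequest returns the JSON body for a request.
func encodeRequest(req LockListRequest) (string, error) {
	b, err := json.Marshal(req)
	return string(b), err
}

func main() {
	all, _ := encodeRequest(LockListRequest{Filters: buildFilters("", "", "", ""), Limit: 100})
	fmt.Println(all)

	byBranch, _ := encodeRequest(LockListRequest{Filters: buildFilters("", "main", "", ""), Limit: 100})
	fmt.Println(byBranch)
}
```
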
```go
// LockList encapsulates a set of Locks.
type LockList struct {
	// Locks is the set of locks returned back, typically matching the query
	// parameters sent in the LockListRequest call. If no locks were matched
	// from a given query, then `Locks` will be represented as an empty
	// array.
	Locks []Lock `json:"locks"`
	// NextCursor returns the Id of the Lock the client should update its
	// cursor to, if there are multiple pages of results for a particular
	// `LockListRequest`.
	NextCursor string `json:"next_cursor,omitempty"`
	// Err populates any error that was encountered during the search. If no
	// error was encountered and the operation was successful, then a value
	// of nil will be passed here.
	Err error `json:"error,omitempty"`
}