# Extending LFS

Teams who use Git LFS often have custom requirements for how the pointer files
and blobs should be handled.  Some examples of extensions that could be built:

* Compress large files on clean, uncompress them on smudge/fetch
* Encrypt files on clean, decrypt on smudge/fetch
* Scan files on clean to make sure they don't contain sensitive information

The basic extensibility model is that LFS extensions must be registered
explicitly, and they will be invoked on clean and smudge to manipulate the
contents of the files as needed.  On clean, LFS itself ensures that the pointer
file is updated with all the information needed to be able to smudge correctly,
and the extensions never modify the pointer file directly.

NOTE: This feature is considered experimental, and is included so developers can
work on extensions. Exact details of how extensions work are subject to change
based on feedback. It is possible for buggy extensions to leave your repository
in a bad state, so don't rely on them with a production Git repository without
extensive testing.

## Registration

To register an LFS extension, add it to the Git config.  Each extension needs
to define:

* Its unique name.  This will be used as part of the key in the pointer file.
* The command to run on clean (when files are added to Git).
* The command to run on smudge (when files are downloaded and checked out).
* The priority of the extension, which must be a unique, non-negative integer.

The sequence `%f` in the clean and smudge commands will be replaced by the
filename being processed.

Here's an example extension registration in the Git config:

```
[lfs "extension.foo"]
  clean = foo clean %f
  smudge = foo smudge %f
  priority = 0
[lfs "extension.bar"]
  clean = bar clean %f
  smudge = bar smudge %f
  priority = 1
```
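
To make the role of `priority` and `%f` concrete, here is a minimal Python sketch. It is not how LFS itself reads the config: the `Extension` class, the `REGISTERED` list, and the filename `big.bin` are hypothetical, and plain string replacement of `%f` is an assumption (quoting of unusual filenames is not covered here).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extension:
    """One registered extension; mirrors the four config values above."""
    name: str
    clean: str
    smudge: str
    priority: int


# Hypothetical in-memory copy of the example config, deliberately out of order.
REGISTERED = [
    Extension("bar", "bar clean %f", "bar smudge %f", 1),
    Extension("foo", "foo clean %f", "foo smudge %f", 0),
]


def in_priority_order(extensions):
    """Return the extensions sorted by ascending priority."""
    return sorted(extensions, key=lambda ext: ext.priority)


def expand(command, filename):
    """Replace the %f sequence with the name of the file being processed."""
    return command.replace("%f", filename)


ordered = in_priority_order(REGISTERED)
clean_commands = [expand(ext.clean, "big.bin") for ext in ordered]
```

With the example registration, `foo` sorts ahead of `bar`, so `foo clean big.bin` comes first in `clean_commands`.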

## Clean

When staging a file, Git invokes the LFS clean filter, as described earlier.  If
no extensions are installed, the LFS clean filter reads bytes from STDIN,
calculates the SHA-256 signature, and writes the bytes to a temp file.  It then
moves the temp file into the appropriate place in `.git/lfs/objects` and writes a
valid pointer file to STDOUT.

When an extension is installed, LFS will invoke the extension to do additional
processing on the bytes before writing them into the temp file.  If multiple
extensions are installed, they are invoked in the order defined by their
priority.  LFS will also insert a key in the pointer file for each extension
that was invoked, indicating both the order in which the extension was invoked
and the oid of the file before that extension was invoked. All of that
information is required to be able to reliably smudge the file later.  Each new
line in the pointer file will be of the form:

`ext-{order}-{name} {hash-method}:{hash-of-input-to-extension}`

This naming ensures that all extensions are written in both alphabetical and
priority order, and also shows the progression of changes to the oid as it is
processed by the extensions.

Here's an example sequence, assuming extensions foo and bar are installed, as
shown in the previous section.

* Git passes the original contents of the file to LFS clean over STDIN.
* LFS reads those bytes and calculates the original SHA-256 signature.
* LFS streams the bytes to STDIN of `foo clean`, which is expected to write
  those bytes, modified or not, to its STDOUT.
* LFS reads the bytes from STDOUT of `foo clean`, calculates the SHA-256
  signature, and writes them to STDIN of `bar clean`, which then writes those
  bytes, modified or not, to its STDOUT.
* LFS reads the bytes from STDOUT of `bar clean`, calculates the SHA-256
  signature, and writes the bytes to a temp file.
* When finished, LFS atomically moves the temp file into `.git/lfs/objects`.
* LFS generates the pointer file, with some changes:
  * The oid and size keys are calculated from the final bytes written to LFS
    local storage.
  * LFS also writes keys named `ext-0-foo` and `ext-1-bar` into the pointer, along
    with their respective input oids.
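
The sequence above can be sketched in Python under a few assumptions. The real filter streams bytes between processes and writes a temp file, whereas this sketch keeps everything in memory and uses plain functions as hypothetical stand-ins for `foo clean` and `bar clean`. It also assumes `{order}` is the extension's position in the invocation sequence, which matches the `ext-0-foo` and `ext-1-bar` keys named above.

```python
import hashlib
import zlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 signature of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def clean(original: bytes, extensions):
    """Run the extensions in order; return (stored_bytes, pointer_text).

    `extensions` is a list of (name, clean_function) pairs that is
    already sorted by priority.
    """
    data = original
    ext_lines = []
    for order, (name, clean_fn) in enumerate(extensions):
        # Record the oid of the bytes *before* this extension runs.
        input_oid = sha256_hex(data)
        data = clean_fn(data)
        ext_lines.append(f"ext-{order}-{name} sha256:{input_oid}")
    # oid and size describe the final bytes that go into local storage.
    lines = (
        ["version https://git-lfs.github.com/spec/v1"]
        + ext_lines
        + [f"oid sha256:{sha256_hex(data)}", f"size {len(data)}"]
    )
    return data, "\n".join(lines) + "\n"


# Hypothetical stand-ins: "foo" compresses, "bar" reverses the bytes.
stored, pointer = clean(
    b"example file contents",
    [("foo", zlib.compress), ("bar", lambda b: b[::-1])],
)
```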

Here's an example pointer file, for a file processed by extensions foo and bar:

```
version https://git-lfs.github.com/spec/v1
ext-0-foo sha256:{original hash}
ext-1-bar sha256:{hash after foo}
oid sha256:{hash after bar}
size 123
(ending \n)
```

Note: as an optimization, if an extension just does a pass-through, its key can
be omitted from the pointer file.  This will make smudging the file a bit more
efficient since that extension can be skipped.  LFS can detect a pass-through
extension because the input and output oids will be the same.

This implies that extensions must have no side effects other than writing to
their STDOUT. Otherwise LFS has no way to know which extensions modified a file.
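
As an illustration of an extension that follows this rule, here is a minimal sketch of a compression extension written in Python. The command layout (`clean` or `smudge` as the first argument, then the `%f` filename) follows the example registration; the choice of zlib and the function names are assumptions made for the sketch.

```python
import sys
import zlib


def clean_bytes(data: bytes) -> bytes:
    """Clean direction: compress the content."""
    return zlib.compress(data)


def smudge_bytes(data: bytes) -> bytes:
    """Smudge direction: exactly undo clean_bytes."""
    return zlib.decompress(data)


def main(argv, stdin, stdout) -> int:
    """Read everything from stdin, transform it, write it to stdout.

    argv[1] is "clean" or "smudge"; argv[2] would be the %f filename,
    which this sketch does not need. Nothing else is read or written,
    so the only observable effect is the bytes on stdout.
    """
    if len(argv) < 2 or argv[1] not in ("clean", "smudge"):
        return 2  # non-zero exit code signals a usage error
    transform = clean_bytes if argv[1] == "clean" else smudge_bytes
    stdout.write(transform(stdin.read()))
    return 0


# When saved as an executable script, the entry point would be:
#   sys.exit(main(sys.argv, sys.stdin.buffer, sys.stdout.buffer))
```

Because `smudge_bytes` is the exact inverse of `clean_bytes`, the bytes produced on smudge hash to the same oid that was recorded as the extension's input on clean.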

## Smudge

When a file is checked out, Git invokes the LFS smudge filter, as described
earlier. If no extensions are installed, the LFS smudge filter inspects the
first 100 bytes read from STDIN, and if it is a pointer file, uses the
oid to find the correct object in the LFS storage, and writes those bytes to
STDOUT so that Git can write them to the working directory.

If the pointer file indicates that extensions were invoked on that file, then
those extensions must be installed in order to smudge.  If they are not
installed, not found, or unusable for any reason, LFS will fail to smudge the
file, and will output an error indicating which extension is missing.

Each of the extensions indicated in the pointer file must be invoked in reverse
order to undo the changes they made to the contents of the file.  After each
extension is invoked, LFS will compare the SHA-256 signature of the bytes output
by the extension with the oid stored in the pointer file as the original input
to that same extension.  Those signatures must match, otherwise the extension
did not undo its changes correctly.  In that case, LFS fails to smudge the file,
and outputs an error indicating which extension is failing.

Here's an example sequence, indicating how LFS will smudge the pointer file
shown in the previous section:

* Git passes the bytes of the pointer file to LFS smudge over STDIN.  Note that
  when using `git lfs checkout`, LFS reads the files directly from disk rather
  than off STDIN.  The rest of the steps are unaffected either way.
* LFS reads those bytes and inspects them to see if this is a pointer file.  If
  it was not, the bytes would just be passed through to STDOUT.
* Since it is a pointer file, LFS reads the whole file off STDIN, parses it, and
  determines that extensions foo and bar both processed the file, in that order.
* LFS uses the value of the oid key to find the blob in the `.git/lfs/objects`
  folder, or downloads it from the server as needed.
* LFS writes the contents of the blob to STDIN of `bar smudge`, which modifies
  them as needed and writes them to its STDOUT.
* LFS reads the bytes from STDOUT of `bar smudge`, calculates the SHA-256
  signature, and writes the bytes to STDIN of `foo smudge`, which modifies them
  as needed and writes them to its STDOUT.
* LFS reads the bytes from STDOUT of `foo smudge`, calculates the SHA-256
  signature, and writes the bytes to its own STDOUT.
* At the end, LFS ensures that the hashes calculated on the outputs of foo and
  bar match their corresponding input hashes from the pointer file.  If not, it
  writes a descriptive error message indicating which extension failed to undo
  its changes.
* Question: On error, should we overwrite the file in the working directory with
  the original pointer file?  Can this be done reliably?
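
The reverse-order processing and hash checks can be sketched in Python as follows. This is an in-memory illustration only: the blob is passed in directly rather than looked up by oid, the smudge commands are represented by a hypothetical `smudge_fns` mapping of plain functions, and `SmudgeError` is a name invented for the sketch. The check runs after each extension, as described in the prose above.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 signature of the given bytes."""
    return hashlib.sha256(data).hexdigest()


class SmudgeError(Exception):
    """Raised when an extension is missing or fails to undo its changes."""


def parse_extensions(pointer: str):
    """Return [(order, name, input_oid)] for every ext-* key, sorted by order."""
    found = []
    for line in pointer.splitlines():
        key, _, value = line.partition(" ")
        if key.startswith("ext-"):
            # Key format: ext-{order}-{name}; the name may itself contain "-".
            _, order, name = key.split("-", 2)
            input_oid = value.split(":", 1)[1]
            found.append((int(order), name, input_oid))
    return sorted(found)


def smudge(pointer: str, blob: bytes, smudge_fns):
    """Undo each extension in reverse order, verifying every intermediate oid.

    `smudge_fns` maps an extension name to a function that reverses
    that extension's clean step.
    """
    data = blob
    for _order, name, input_oid in reversed(parse_extensions(pointer)):
        if name not in smudge_fns:
            raise SmudgeError(f"extension {name!r} is not installed")
        data = smudge_fns[name](data)
        # The output must hash to what went *into* this extension on clean.
        if sha256_hex(data) != input_oid:
            raise SmudgeError(f"extension {name!r} failed to undo its changes")
    return data
```

For the example pointer file, `bar` is undone first and its output is compared against the `ext-1-bar` oid, then `foo` is undone and compared against the `ext-0-foo` oid.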

## Handling errors

If there are errors in the configuration of LFS extensions, such as invalid
extension names, duplicate priorities, etc., then any LFS commands that rely on
them will abort with a descriptive error message.
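
A configuration check of this kind might look like the following Python sketch. The rules for what makes a name invalid are not spelled out here, so the sketch assumes only that a name must be non-empty and free of whitespace (a space separates key from value in the pointer line format). `ExtensionConfigError` and the dictionary layout are hypothetical.

```python
class ExtensionConfigError(Exception):
    """Raised when the registered extensions are not usable."""


def validate(extensions):
    """Check a mapping of name -> {"clean", "smudge", "priority"}.

    Collects every problem found and raises one descriptive error.
    """
    errors = []
    seen = {}  # priority -> name of the extension that claimed it first
    for name, cfg in extensions.items():
        # Assumed rule: names must be non-empty and contain no whitespace.
        if not name or any(ch.isspace() for ch in name):
            errors.append(f"invalid extension name: {name!r}")
        priority = cfg.get("priority")
        is_int = isinstance(priority, int) and not isinstance(priority, bool)
        if not is_int or priority < 0:
            errors.append(
                f"extension {name!r}: priority must be a non-negative integer"
            )
        elif priority in seen:
            errors.append(
                f"extensions {seen[priority]!r} and {name!r} "
                f"share priority {priority}"
            )
        else:
            seen[priority] = name
        for key in ("clean", "smudge"):
            if not cfg.get(key):
                errors.append(f"extension {name!r}: missing {key} command")
    if errors:
        raise ExtensionConfigError("; ".join(errors))
```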

If an extension is unable to perform its task, it can indicate this error by
returning a non-zero error code and writing a descriptive error message to its
STDERR. The behavior on an error depends on whether we are cleaning or smudging.
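
From the caller's side, this contract can be sketched with Python's `subprocess` module. The `run_extension` helper and the `ExtensionFailed` exception are hypothetical names, and the two commands at the bottom are stand-ins that run small snippets through the current Python interpreter instead of a real extension binary.

```python
import subprocess
import sys


class ExtensionFailed(Exception):
    """Raised when an extension exits with a non-zero code."""


def run_extension(argv, data: bytes) -> bytes:
    """Feed `data` to the command's STDIN and return what it wrote to STDOUT.

    A non-zero exit code is treated as failure, and the message the
    extension wrote to STDERR is carried along in the exception.
    """
    result = subprocess.run(argv, input=data, capture_output=True)
    if result.returncode != 0:
        message = result.stderr.decode("utf-8", errors="replace").strip()
        raise ExtensionFailed(
            f"{argv[0]} exited with {result.returncode}: {message}"
        )
    return result.stdout


# Hypothetical stand-ins for real extension commands.
PASS_THROUGH = [
    sys.executable, "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]
FAILING = [
    sys.executable, "-c",
    "import sys; sys.stderr.write('cannot process file'); sys.exit(3)",
]
```

Calling `run_extension(FAILING, b"data")` raises `ExtensionFailed`, which the caller can then handle differently for clean and smudge.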

### Clean

If an extension fails to clean a file, it will return a non-zero error code and
write an error message to its STDERR.  Because the file was not cleaned
correctly, it can't be added to the index.  LFS will ensure that no pointer file
is added or updated for failed files.  In addition, it will display the error
messages for any files that could not be cleaned (and keep those errors in a
log), so that the user can diagnose the failure, and then rerun `git add` on
those files.

### Smudge

If an extension fails to smudge a file, it will return a non-zero error code and
write an error message to its STDERR.  Because the file was not smudged
correctly, LFS cannot update that file in the working directory.  LFS will
ensure that the pointer file is written to both the index and working directory.
In addition, it will display the error messages for any files that could not be
smudged (and keep those errors in a log), so that the user can diagnose the
failure and then rerun `git lfs checkout` to fix up any remaining pointer files.