github.com/git-lfs/git-lfs@v2.5.2+incompatible/docs/custom-transfers.md

     1  # Adding Custom Transfer Agents to LFS
     2  
     3  ## Introduction
     4  
     5  Git LFS supports multiple ways to transfer (upload and download) files. In the
     6  core client, the basic way to do this is a one-off HTTP request to the URL
     7  returned from the LFS API for a given object. The core client also supports
     8  extensions to allow resuming of downloads (via `Range` headers) and uploads (via
     9  the [tus.io](http://tus.io) protocol).
    10  
    11  Some people might want to be able to transfer content in other ways, however.
    12  To enable this, git-lfs allows configuring Custom Transfers, which are
    13  simply processes which must adhere to the protocol defined later in this
    14  document. git-lfs will invoke the process at the start of all transfers,
    15  and will communicate with the process via stdin/stdout for each transfer.
    16  
    17  ## Custom Transfer Type Selection
    18  
    19  In the LFS API request, the client includes a list of transfer types it
    20  supports. When replying, the API server will pick one of these and make any
    21  necessary adjustments to the returned object actions, in case the picked
    22  transfer type needs custom details about how to do each transfer.
    23  
    24  ## Using a Custom Transfer Type without the API server
    25  
    26  In some cases the transfer agent can figure out by itself how and where
    27  the transfers should be made, without having to query the API server.
    28  In this case it's possible to use the custom transfer agent directly,
    29  without querying the server, by using the following config option:
    30  
    31  * `lfs.standalonetransferagent`, `lfs.<url>.standalonetransferagent`
    32  
    33    Specifies a custom transfer agent to be used if the API server URL matches as
    34    in `git config --get-urlmatch lfs.standalonetransferagent <apiurl>`.
    35    `git-lfs` will not contact the API server.  It instead sets stage 2 transfer
    36    actions to `null`.  `lfs.<url>.standalonetransferagent` can be used to
    37    configure a custom transfer agent for individual remotes.
    38    `lfs.standalonetransferagent` unconditionally configures a custom transfer
    39    agent for all remotes.  The custom transfer agent must be specified in
    40    a `lfs.customtransfer.<name>` settings group.
    41  
    42  ## Defining a Custom Transfer Type
    43  
    44  A custom transfer process is defined under a settings group called
    45  `lfs.customtransfer.<name>`, where `<name>` is an identifier (see
    46  [Naming](#naming) below).
    47  
    48  * `lfs.customtransfer.<name>.path`
    49  
    50    `path` should point to the process you wish to invoke. This will be invoked at
    51    the start of all transfers (possibly many times, see the `concurrent` option
    52    below) and the protocol over stdin/stdout is defined below in the
    53    [Protocol](#protocol) section.
    54  
    55  * `lfs.customtransfer.<name>.args`
    56  
    57    If the custom transfer process requires any arguments, these can be provided
    58    here. Typically you would only need this if your process is multi-purpose or
    59    particularly flexible; most of the time you won't need it.
    60  
    61  * `lfs.customtransfer.<name>.concurrent`
    62  
    63    If true (the default), git-lfs will invoke the custom transfer process
    64    multiple times in parallel, according to `lfs.concurrenttransfers`, splitting
    65    the transfer workload between the processes.
    66  
    67    If you would prefer that only one instance of the transfer process is invoked,
    68    maybe because you want to do your own parallelism internally (e.g. slicing
    69    files into parts), set this to false.
    70  
    71  * `lfs.customtransfer.<name>.direction`
    72  
    73    Specifies which direction the custom transfer process supports, either
    74    `download`, `upload`, or `both`. The default if unspecified is `both`.
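
As a rough illustration of how these four keys fit together, the sketch below builds the `git config` commands for one transfer definition. The helper name `custom_transfer_config`, the agent path `/usr/local/bin/lfs-nfs-agent`, and the `--verbose` argument are hypothetical; only the `lfs.customtransfer.<name>.*` keys come from this document.

```python
import shlex


def custom_transfer_config(name, path, args=None, concurrent=True, direction="both"):
    """Build the `git config` commands that define one custom transfer."""
    if direction not in ("download", "upload", "both"):
        raise ValueError("direction must be 'download', 'upload', or 'both'")
    prefix = f"lfs.customtransfer.{name}"
    commands = [["git", "config", f"{prefix}.path", path]]
    if args:
        # Only needed when the process takes command-line arguments.
        commands.append(["git", "config", f"{prefix}.args", args])
    commands.append(
        ["git", "config", f"{prefix}.concurrent", "true" if concurrent else "false"]
    )
    commands.append(["git", "config", f"{prefix}.direction", direction])
    return commands


# Hypothetical agent path and argument, printed as shell commands.
for command in custom_transfer_config(
    "nfs", "/usr/local/bin/lfs-nfs-agent", args="--verbose", concurrent=False
):
    print(shlex.join(command))
```

Setting `concurrent=False` here corresponds to the case described above where the agent prefers to manage its own parallelism.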
    75  
    76  ## Naming
    77  
    78  Each custom transfer must have a name which is unique to the underlying
    79  mechanism, and the client and the server must agree on that name. The client
    80  will advertise this name to the server as a supported transfer approach, and if
    81  the server supports it, it will return relevant object action links. Because
    82  these may be very different from standard HTTP URLs it's important that the
    83  client and server agree on the name.
    84  
    85  For example, let's say I've implemented a custom transfer process which uses
    86  NFS. I could call this transfer type `nfs` - although it's not specific to my
    87  configuration exactly, it is specific to the way NFS works, and the server will
    88  need to give me different URLs. Assuming I define my transfer like this, and the
    89  server supports it, I might start getting object action links back like
    90  `nfs://<host>/path/to/object`
    91  
    92  ## Protocol
    93  
    94  The git-lfs client communicates with the custom transfer process via the stdin
    95  and stdout streams. No file content is communicated on these streams, only
    96  request / response metadata. The metadata exchanged is always in JSON format.
    97  External files will be referenced when actual content is exchanged.
    98  
    99  ### Line Delimited JSON
   100  
   101  Because multiple JSON messages will be exchanged on the same stream, it's useful
   102  to delimit them explicitly rather than have the parser find the closing `}` in
   103  an arbitrary stream. Each JSON structure is therefore sent and received on
   104  a **single line** as per [Line Delimited
   105  JSON](https://en.wikipedia.org/wiki/JSON_Streaming#Line_delimited_JSON_2).
   106  
   107  In other words when git-lfs sends a JSON message to the custom transfer it will
   108  be on a single line, with a line feed at the end. The transfer process must
   109  respond the same way by writing a JSON structure back to stdout with a single
   110  line feed at the end (and flush the output).
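
One possible way for a transfer process to follow this framing is sketched below. The helper names `write_message` and `read_message` are assumptions made for illustration, not part of git-lfs; the sketch relies on `json.dumps` escaping newlines inside strings, so each message stays on one line.

```python
import io
import json


def write_message(stream, message):
    """Serialize `message` as one line of JSON, then flush."""
    # With no `indent` argument, json.dumps emits no literal newlines.
    stream.write(json.dumps(message) + "\n")
    stream.flush()


def read_message(stream):
    """Read one line and parse it; return None once the stream is closed."""
    line = stream.readline()
    if not line:
        return None
    return json.loads(line)


# Round trip through an in-memory stream standing in for stdin/stdout.
demo = io.StringIO()
write_message(demo, {"event": "terminate"})
demo.seek(0)
print(read_message(demo))
```

In a real agent the streams would be `sys.stdout` and `sys.stdin`; the explicit `flush()` matters because stdout is usually block-buffered when connected to a pipe.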
   111  
   112  ### Protocol Stages
   113  
   114  The protocol consists of 3 stages:
   115  
   116  #### Stage 1: Initiation
   117  
   118  Immediately after invoking a custom transfer process, git-lfs sends initiation
   119  data to the process over stdin. This tells the process useful information about
   120  the configuration.
   121  
   122  The message will look like this:
   123  
   124  ```json
   125  { "event": "init", "operation": "download", "remote": "origin", "concurrent": true, "concurrenttransfers": 3 }
   126  ```
   127  
   128  * `event`: Always `init` to identify this message
   129  * `operation`: will be `upload` or `download` depending on transfer direction
   130  * `remote`: The Git remote.  It can be a remote name like `origin` or a URL
   131    like `ssh://git.example.com//path/to/repo`.  A standalone transfer agent can
   132    use it to determine the location of remote files.
   133  * `concurrent`: reflects the value of `lfs.customtransfer.<name>.concurrent`, in
   134    case the process needs to know
   135  * `concurrenttransfers`: reflects the value of `lfs.concurrenttransfers`, in
   136    case the transfer process implements its own concurrency and wants to
   137    respect this setting.
   138  
   139  The transfer process should use the information it needs from the initiation
   140  structure, and also perform any one-off setup tasks it needs to do. It should
   141  then respond on stdout with a simple empty confirmation structure, as follows:
   142  
   143  ```json
   144  { }
   145  ```
   146  
   147  Or if there was an error:
   148  
   149  ```json
   150  { "error": { "code": 32, "message": "Some init failure message" } }
   151  ```
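
The initiation exchange could be handled along the following lines. This is a minimal sketch: the function name `handle_init` is hypothetical, and the error code `32` is simply reused from the example above, since this document does not define specific code values.

```python
def handle_init(message):
    """Return the reply to an `init` message: `{}` on success, else an error."""
    if message.get("event") != "init":
        return {"error": {"code": 32, "message": "expected an init event"}}
    if message.get("operation") not in ("upload", "download"):
        return {"error": {"code": 32, "message": "unknown operation"}}
    # One-off setup (opening connections, reading credentials) would go here,
    # using `remote`, `concurrent`, and `concurrenttransfers` as needed.
    return {}


print(handle_init({"event": "init", "operation": "download", "remote": "origin",
                   "concurrent": True, "concurrenttransfers": 3}))
```

The returned dictionary would then be written to stdout as a single line of JSON.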
   152  
   153  #### Stage 2: 0..N Transfers
   154  
   155  After the initiation exchange, git-lfs will send any number of transfer
   156  requests to the stdin of the transfer process, in a serial sequence. Once a
   157  transfer request is sent to the process, git-lfs awaits a completion response
   158  before sending the next request.
   159  
   160  ##### Uploads
   161  
   162  For uploads the request sent from git-lfs to the transfer process will look
   163  like this:
   164  
   165  ```json
   166  { "event": "upload", "oid": "bf3e3e2af9366a3b704ae0c31de5afa64193ebabffde2091936ad2e7510bc03a", "size": 346232, "path": "/path/to/file.png", "action": { "href": "nfs://server/path", "header": { "key": "value" } } }
   167  ```
   168  
   169  * `event`: Always `upload` to identify this message
   170  * `oid`: the identifier of the LFS object
   171  * `size`: the size of the LFS object
   172  * `path`: the file which the transfer process should read the upload data from
   173  * `action`: the `upload` action copied from the response from the batch API.
   174    This contains `href` and `header` contents, which are named per HTTP
   175    conventions, but can be interpreted however the custom transfer agent wishes
   176    (this is an NFS example, but it doesn't even have to be a URL). Generally,
   177    `href` will give the primary connection details, with `header` containing any
   178    miscellaneous information needed.  `action` is `null` for standalone transfer
   179    agents.
   180  
   181  The transfer process should post one or more [progress messages](#progress) and
   182  then a final completion message as follows:
   183  
   184  ```json
   185  { "event": "complete", "oid": "bf3e3e2af9366a3b704ae0c31de5afa64193ebabffde2091936ad2e7510bc03a" }
   186  ```
   187  
   188  * `event`: Always `complete` to identify this message
   189  * `oid`: the identifier of the LFS object
   190  
   191  Or if there was an error in the transfer:
   192  
   193  ```json
   194  { "event": "complete", "oid": "bf3e3e2af9366a3b704ae0c31de5afa64193ebabffde2091936ad2e7510bc03a", "error": { "code": 2, "message": "Explain what happened to this transfer" } }
   195  ```
   196  
   197  * `event`: Always `complete` to identify this message
   198  * `oid`: the identifier of the LFS object
   199  * `error`: Should contain a `code` and `message` explaining the error
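
To make the upload exchange concrete, here is a sketch of a handler that treats a local directory as the remote store. The function name `handle_upload` and the `store_dir` parameter are assumptions for illustration; a real agent would interpret `action` (or the `remote` from the init message) to decide where the data goes. The handler returns the messages to write to stdout, one progress message followed by the completion message.

```python
import os
import shutil
import tempfile


def handle_upload(request, store_dir):
    """Copy the file named in an `upload` request into `store_dir`.

    Returns the messages to write to stdout, ending with `complete`. A failure
    is reported inside that message rather than raised, so the process can go
    on to the next request.
    """
    oid = request["oid"]
    try:
        shutil.copyfile(request["path"], os.path.join(store_dir, oid))
    except OSError as exc:
        return [{"event": "complete", "oid": oid,
                 "error": {"code": 2, "message": str(exc)}}]
    size = request["size"]
    return [
        {"event": "progress", "oid": oid, "bytesSoFar": size, "bytesSinceLast": size},
        {"event": "complete", "oid": oid},
    ]


# Demo with throwaway directories standing in for the work tree and the store.
work, store = tempfile.mkdtemp(), tempfile.mkdtemp()
source = os.path.join(work, "file.png")
with open(source, "wb") as handle:
    handle.write(b"example")
print(handle_upload({"event": "upload", "oid": "abc123", "size": 7,
                     "path": source, "action": None}, store))
```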
   200  
   201  ##### Downloads
   202  
   203  For downloads the request sent from git-lfs to the transfer process will look
   204  like this:
   205  
   206  ```json
   207  { "event": "download", "oid": "22ab5f63670800cc7be06dbed816012b0dc411e774754c7579467d2536a9cf3e", "size": 21245, "action": { "href": "nfs://server/path", "header": { "key": "value" } } }
   208  ```
   209  
   210  * `event`: Always `download` to identify this message
   211  * `oid`: the identifier of the LFS object
   212  * `size`: the size of the LFS object
   213  * `action`: the `download` action copied from the response from the batch API.
   214    This contains `href` and `header` contents, which are named per HTTP
   215    conventions, but can be interpreted however the custom transfer agent wishes
   216    (this is an NFS example, but it doesn't even have to be a URL). Generally,
   217    `href` will give the primary connection details, with `header` containing any
   218    miscellaneous information needed.  `action` is `null` for standalone transfer
   219    agents.
   220  
   221  Note there is no file path included in the download request; the transfer
   222  process should create a file itself and return the path in the final response
   223  after completion (see below).
   224  
   225  The transfer process should post one or more [progress messages](#progress) and
   226  then a final completion message as follows:
   227  
   228  ```json
   229  { "event": "complete", "oid": "22ab5f63670800cc7be06dbed816012b0dc411e774754c7579467d2536a9cf3e", "path": "/path/to/file.png" }
   230  ```
   231  
   232  * `event`: Always `complete` to identify this message
   233  * `oid`: the identifier of the LFS object
   234  * `path`: the path to a file containing the downloaded data, which the transfer
   235    process relinquishes control of to git-lfs. git-lfs will move the file into
   236    LFS storage.
   237  
   238  Or, if there was a failure transferring this item:
   239  
   240  ```json
   241  { "event": "complete", "oid": "22ab5f63670800cc7be06dbed816012b0dc411e774754c7579467d2536a9cf3e", "error": { "code": 2, "message": "Explain what happened to this transfer" } }
   242  ```
   243  
   244  * `event`: Always `complete` to identify this message
   245  * `oid`: the identifier of the LFS object
   246  * `error`: Should contain a `code` and `message` explaining the error
   247  
   248  Errors for a single transfer request should not terminate the process. The error
   249  should be returned in the response structure instead.
   250  
   251  The custom transfer adapter does not need to check the SHA of the file content
   252  it has downloaded; git-lfs will do that before moving the final content into
   253  the LFS store.
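
The download side can be sketched in the same spirit. As before, `handle_download` and `store_dir` are hypothetical names, and a local directory stands in for whatever store the agent really talks to. The points being illustrated are that the agent chooses the destination file itself, returns its path in the completion message, and reports a failure in the response instead of exiting.

```python
import os
import shutil
import tempfile


def handle_download(request, store_dir):
    """Fetch an object from `store_dir` into a new temporary file."""
    oid = request["oid"]
    # The request carries no path, so the agent picks where to write.
    fd, tmp_path = tempfile.mkstemp(prefix="lfs-download-")
    try:
        with os.fdopen(fd, "wb") as dst, \
                open(os.path.join(store_dir, oid), "rb") as src:
            shutil.copyfileobj(src, dst)
    except OSError as exc:
        os.remove(tmp_path)
        return [{"event": "complete", "oid": oid,
                 "error": {"code": 2, "message": str(exc)}}]
    written = os.path.getsize(tmp_path)
    return [
        {"event": "progress", "oid": oid,
         "bytesSoFar": written, "bytesSinceLast": written},
        # git-lfs takes over this file and moves it into LFS storage.
        {"event": "complete", "oid": oid, "path": tmp_path},
    ]
```

No checksum is computed here, in line with the note above that git-lfs verifies the content itself.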
   254  
   255  ##### Progress
   256  
   257  In order to support progress reporting while data is uploading / downloading,
   258  the transfer process should post messages to stdout as follows before sending
   259  the final completion message:
   260  
   261  ```json
   262  { "event": "progress", "oid": "22ab5f63670800cc7be06dbed816012b0dc411e774754c7579467d2536a9cf3e", "bytesSoFar": 1234, "bytesSinceLast": 64 }
   263  ```
   264  
   265  * `event`: Always `progress` to identify this message
   266  * `oid`: the identifier of the LFS object
   267  * `bytesSoFar`: the total number of bytes transferred so far
   268  * `bytesSinceLast`: the number of bytes transferred since the last progress
   269    message
   270  
   271  The transfer process should post these messages such that the last one sent
   272  has `bytesSoFar` equal to the file size on success.
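
The relationship between `bytesSoFar` and `bytesSinceLast` can be sketched with a small generator. The name `progress_messages` and the `chunk_size` parameter are assumptions for illustration; a real agent would emit a message after each chunk it actually transfers.

```python
def progress_messages(oid, size, chunk_size=64 * 1024):
    """Yield one `progress` message per chunk of a `size`-byte transfer."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    sent = 0
    while sent < size:
        step = min(chunk_size, size - sent)
        sent += step
        # bytesSoFar is cumulative; bytesSinceLast is the size of this chunk.
        yield {"event": "progress", "oid": oid,
               "bytesSoFar": sent, "bytesSinceLast": step}


for message in progress_messages("abc123", 150, chunk_size=64):
    print(message)
```

For a 150-byte object sent in 64-byte chunks, the `bytesSinceLast` values sum to 150 and the final `bytesSoFar` equals the object size.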
   273  
   274  #### Stage 3: Finish & Cleanup
   275  
   276  When all transfers have been processed, git-lfs will send the following message
   277  to the stdin of the transfer process:
   278  
   279  ```json
   280  { "event": "terminate" }
   281  ```
   282  
   283  On receiving this message the transfer process should clean up and terminate.
   284  No response is expected.
   285  
   286  ## Error handling
   287  
   288  Any unexpected fatal errors in the transfer process (not errors specific to a
   289  transfer request) should set the exit code to non-zero and print information to
   290  stderr. Otherwise the exit code should be 0 even if some transfers failed.
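
Putting the three stages and the exit-code rule together, a transfer process could be organized roughly as follows. This is a sketch under stated assumptions: `run_agent` and the `handle_transfer` callback are hypothetical names, the callback returns only the final `complete` message (progress reporting is left out for brevity), and the code `2` is borrowed from the examples above.

```python
import io
import json
import sys


def run_agent(stdin, stdout, stderr, handle_transfer):
    """Drive the three protocol stages; return the process exit code."""
    def send(message):
        stdout.write(json.dumps(message) + "\n")
        stdout.flush()

    try:
        init = json.loads(stdin.readline())
        if init.get("event") != "init":
            raise ValueError("expected init as the first message")
        send({})
        for line in iter(stdin.readline, ""):
            request = json.loads(line)
            if request.get("event") == "terminate":
                break
            try:
                reply = handle_transfer(request)
            except Exception as exc:  # per-transfer failure: report, keep going
                reply = {"event": "complete", "oid": request.get("oid"),
                         "error": {"code": 2, "message": str(exc)}}
            send(reply)
        return 0
    except Exception as exc:  # fatal: malformed input or broken streams
        print(f"fatal: {exc}", file=stderr)
        return 1


# In-memory demo; a real agent would pass sys.stdin and sys.stdout and call
# sys.exit() with the returned code.
requests = io.StringIO(
    '{"event": "init", "operation": "upload", "remote": "origin", '
    '"concurrent": true, "concurrenttransfers": 3}\n'
    '{"event": "upload", "oid": "abc123", "size": 1, '
    '"path": "/path/to/file.png", "action": null}\n'
    '{"event": "terminate"}\n'
)
replies = io.StringIO()
exit_code = run_agent(requests, replies, sys.stderr,
                      lambda request: {"event": "complete", "oid": request["oid"]})
```

A failure inside `handle_transfer` becomes an `error` in that transfer's response and leaves the exit code at 0, while a failure in the protocol itself is printed to stderr and produces a non-zero exit code.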
   291  
   292  ## A Note On Verify Actions
   293  
   294  You may have noticed that only the `upload` and `download` actions are
   295  passed to the custom transfer agent for processing. What about the `verify`
   296  action, if the API returns one?
   297  
   298  Custom transfer agents do not handle the verification process, only the
   299  upload and download of content. The verify link is typically used to notify
   300  a system *other* than the actual content store after an upload has completed,
   301  so it makes more sense for that to be handled via the normal API process.