# Adding Custom Transfer Agents to LFS

## Introduction

Git LFS supports multiple ways to transfer (upload and download) files. In the
core client, the basic way to do this is a one-off HTTP request to the URL
returned from the LFS API for a given object. The core client also supports
extensions to allow resuming of downloads (via `Range` headers) and uploads (via
the [tus.io](http://tus.io) protocol).

Some people might want to be able to transfer content in other ways, however.
To enable this, git-lfs allows configuring Custom Transfers, which are
simply processes which must adhere to the protocol defined later in this
document. git-lfs will invoke the process at the start of all transfers,
and will communicate with the process via stdin/stdout for each transfer.

## Custom Transfer Type Selection

In the LFS API request, the client includes a list of transfer types it
supports. When replying, the API server will pick one of these and make any
necessary adjustments to the returned object actions, in case the picked
transfer type needs custom details about how to do each transfer.

## Using a Custom Transfer Type without the API server

In some cases the transfer agent can figure out by itself how and where
the transfers should be made, without having to query the API server.
In this case it's possible to use the custom transfer agent directly,
without querying the server, by using the following config option:

* `lfs.standalonetransferagent`, `lfs.<url>.standalonetransferagent`

  Specifies a custom transfer agent to be used if the API server URL matches as
  in `git config --get-urlmatch lfs.standalonetransferagent <apiurl>`.
  `git-lfs` will not contact the API server. It instead sets stage 2 transfer
  actions to `null`.
  `lfs.<url>.standalonetransferagent` can be used to
  configure a custom transfer agent for individual remotes.
  `lfs.standalonetransferagent` unconditionally configures a custom transfer
  agent for all remotes. The custom transfer agent must be specified in
  a `lfs.customtransfer.<name>` settings group.

## Defining a Custom Transfer Type

A custom transfer process is defined under a settings group called
`lfs.customtransfer.<name>`, where `<name>` is an identifier (see
[Naming](#naming) below).

* `lfs.customtransfer.<name>.path`

  `path` should point to the process you wish to invoke. This will be invoked at
  the start of all transfers (possibly many times, see the `concurrent` option
  below) and the protocol over stdin/stdout is defined below in the
  [Protocol](#protocol) section.

* `lfs.customtransfer.<name>.args`

  If the custom transfer process requires any arguments, these can be provided
  here. Typically you would only need this if your process was multi-purpose or
  particularly flexible; most of the time you won't need it.

* `lfs.customtransfer.<name>.concurrent`

  If true (the default), git-lfs will invoke the custom transfer process
  multiple times in parallel, according to `lfs.concurrenttransfers`, splitting
  the transfer workload between the processes.

  If you would prefer that only one instance of the transfer process is invoked,
  maybe because you want to do your own parallelism internally (e.g. slicing
  files into parts), set this to false.

* `lfs.customtransfer.<name>.direction`

  Specifies which direction the custom transfer process supports, either
  `download`, `upload`, or `both`. The default if unspecified is `both`.

## Naming

Each custom transfer must have a name which is unique to the underlying
mechanism, and the client and the server must agree on that name.
The client
will advertise this name to the server as a supported transfer approach, and if
the server supports it, it will return relevant object action links. Because
these may be very different from standard HTTP URLs it's important that the
client and server agree on the name.

For example, let's say I've implemented a custom transfer process which uses
NFS. I could call this transfer type `nfs` - although it's not specific to my
configuration exactly, it is specific to the way NFS works, and the server will
need to give me different URLs. Assuming I define my transfer like this, and the
server supports it, I might start getting object action links back like
`nfs://<host>/path/to/object`.

## Protocol

The git-lfs client communicates with the custom transfer process via the stdin
and stdout streams. No file content is communicated on these streams, only
request / response metadata. The metadata exchanged is always in JSON format.
External files will be referenced when actual content is exchanged.

### Line Delimited JSON

Because multiple JSON messages will be exchanged on the same stream it's useful
to delimit them explicitly rather than have the parser find the closing `}` in
an arbitrary stream. Therefore each JSON structure will be sent and received on
a **single line** as per [Line Delimited
JSON](https://en.wikipedia.org/wiki/JSON_Streaming#Line_delimited_JSON_2).

In other words when git-lfs sends a JSON message to the custom transfer it will
be on a single line, with a line feed at the end. The transfer process must
respond the same way by writing a JSON structure back to stdout with a single
line feed at the end (and flush the output).

### Protocol Stages

The protocol consists of 3 stages:

#### Stage 1: Initiation

Immediately after invoking a custom transfer process, git-lfs sends initiation
data to the process over stdin.
This tells the process useful information about
the configuration.

The message will look like this:

```json
{ "event": "init", "operation": "download", "remote": "origin", "concurrent": true, "concurrenttransfers": 3 }
```

* `event`: Always `init` to identify this message
* `operation`: will be `upload` or `download` depending on transfer direction
* `remote`: The Git remote. It can be a remote name like `origin` or a URL
  like `ssh://git.example.com//path/to/repo`. A standalone transfer agent can
  use it to determine the location of remote files.
* `concurrent`: reflects the value of `lfs.customtransfer.<name>.concurrent`, in
  case the process needs to know
* `concurrenttransfers`: reflects the value of `lfs.concurrenttransfers`, in
  case the transfer process wants to implement its own concurrency and
  respect this setting.

The transfer process should use the information it needs from the initiation
structure, and also perform any one-off setup tasks it needs to do. It should
then respond on stdout with a simple empty confirmation structure, as follows:

```json
{ }
```

Or if there was an error:

```json
{ "error": { "code": 32, "message": "Some init failure message" } }
```

#### Stage 2: 0..N Transfers

After the initiation exchange, git-lfs will send any number of transfer
requests to the stdin of the transfer process, in a serial sequence. Once a
transfer request is sent to the process, git-lfs awaits a completion response
before sending the next request.
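
As an illustration of the initiation exchange and the line-delimited framing, here is a minimal Python sketch. The helper names (`read_message`, `send_message`, `handle_init`) are hypothetical and not part of git-lfs, and the error code 32 is only borrowed from the example above; treat this as one possible shape for an agent rather than a reference implementation.

```python
import io
import json


def read_message(stream):
    """Read one line-delimited JSON message; return None at end of input."""
    line = stream.readline()
    if not line:
        return None
    return json.loads(line)


def send_message(stream, message):
    """Write one JSON message on a single line and flush immediately."""
    stream.write(json.dumps(message) + "\n")
    stream.flush()


def handle_init(stdin, stdout):
    """Answer the init message with `{}` or an error structure.

    Returns the parsed init message on success, otherwise None.
    """
    message = read_message(stdin)
    if message is None or message.get("event") != "init":
        # Code 32 only mirrors the example in this document (assumption).
        send_message(stdout, {"error": {"code": 32, "message": "expected init event"}})
        return None
    send_message(stdout, {})
    return message


# A real agent would pass sys.stdin and sys.stdout; in-memory streams are
# used here so the sketch can run on its own.
demo_in = io.StringIO(
    '{"event": "init", "operation": "download", "remote": "origin", '
    '"concurrent": true, "concurrenttransfers": 3}\n'
)
demo_out = io.StringIO()
init = handle_init(demo_in, demo_out)
```

Passing the streams as parameters keeps the framing logic easy to exercise without spawning a process.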

##### Uploads

For uploads the request sent from git-lfs to the transfer process will look
like this:

```json
{ "event": "upload", "oid": "bf3e3e2af9366a3b704ae0c31de5afa64193ebabffde2091936ad2e7510bc03a", "size": 346232, "path": "/path/to/file.png", "action": { "href": "nfs://server/path", "header": { "key": "value" } } }
```

* `event`: Always `upload` to identify this message
* `oid`: the identifier of the LFS object
* `size`: the size of the LFS object
* `path`: the file which the transfer process should read the upload data from
* `action`: the `upload` action copied from the response from the batch API.
  This contains `href` and `header` contents, which are named per HTTP
  conventions, but can be interpreted however the custom transfer agent wishes
  (this is an NFS example, but it doesn't even have to be a URL). Generally,
  `href` will give the primary connection details, with `header` containing any
  miscellaneous information needed. `action` is `null` for standalone transfer
  agents.
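
The request fields listed above could be parsed along these lines. This is a sketch only: `UploadRequest`, `parse_upload_request` and `destination_of` are hypothetical names, and the main point is that `action` may be `null` when the agent runs standalone.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class UploadRequest:
    oid: str
    size: int
    path: str
    action: Optional[dict]  # None for standalone transfer agents


def parse_upload_request(line: str) -> UploadRequest:
    """Parse one upload request line sent by git-lfs."""
    message = json.loads(line)
    if message.get("event") != "upload":
        raise ValueError("not an upload request")
    return UploadRequest(
        oid=message["oid"],
        size=message["size"],
        path=message["path"],
        action=message.get("action"),
    )


def destination_of(request: UploadRequest) -> Optional[str]:
    """Return `href` when the API server supplied an action, else None."""
    if request.action is None:
        return None
    return request.action.get("href")
```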

The transfer process should post one or more [progress messages](#progress) and
then a final completion message as follows:

```json
{ "event": "complete", "oid": "bf3e3e2af9366a3b704ae0c31de5afa64193ebabffde2091936ad2e7510bc03a" }
```

* `event`: Always `complete` to identify this message
* `oid`: the identifier of the LFS object

Or if there was an error in the transfer:

```json
{ "event": "complete", "oid": "bf3e3e2af9366a3b704ae0c31de5afa64193ebabffde2091936ad2e7510bc03a", "error": { "code": 2, "message": "Explain what happened to this transfer" } }
```

* `event`: Always `complete` to identify this message
* `oid`: the identifier of the LFS object
* `error`: Should contain a `code` and `message` explaining the error

##### Downloads

For downloads the request sent from git-lfs to the transfer process will look
like this:

```json
{ "event": "download", "oid": "22ab5f63670800cc7be06dbed816012b0dc411e774754c7579467d2536a9cf3e", "size": 21245, "action": { "href": "nfs://server/path", "header": { "key": "value" } } }
```

* `event`: Always `download` to identify this message
* `oid`: the identifier of the LFS object
* `size`: the size of the LFS object
* `action`: the `download` action copied from the response from the batch API.
  This contains `href` and `header` contents, which are named per HTTP
  conventions, but can be interpreted however the custom transfer agent wishes
  (this is an NFS example, but it doesn't even have to be a URL). Generally,
  `href` will give the primary connection details, with `header` containing any
  miscellaneous information needed. `action` is `null` for standalone transfer
  agents.

Note there is no file path included in the download request; the transfer
process should create a file itself and return the path in the final response
after completion (see below).
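
Because the download request carries no path, the agent has to pick a location itself. One way to do that, sketched here in Python, is to write into a fresh temporary file and report its path. `fetch_to_temp_file`, `download_complete` and the `fetch_chunks` callable are hypothetical stand-ins for whatever actually retrieves the bytes.

```python
import os
import tempfile


def fetch_to_temp_file(oid, fetch_chunks, directory=None):
    """Write downloaded chunks to a new file and return its path.

    `fetch_chunks` is a hypothetical callable yielding the object's bytes.
    """
    fd, path = tempfile.mkstemp(prefix=oid + "-", dir=directory)
    try:
        with os.fdopen(fd, "wb") as handle:
            for chunk in fetch_chunks():
                handle.write(chunk)
    except Exception:
        os.unlink(path)  # do not leave a partial file behind
        raise
    return path


def download_complete(oid, path):
    """Build the completion message that hands the file over to git-lfs."""
    return {"event": "complete", "oid": oid, "path": path}
```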

The transfer process should post one or more [progress messages](#progress) and
then a final completion message as follows:

```json
{ "event": "complete", "oid": "22ab5f63670800cc7be06dbed816012b0dc411e774754c7579467d2536a9cf3e", "path": "/path/to/file.png" }
```

* `event`: Always `complete` to identify this message
* `oid`: the identifier of the LFS object
* `path`: the path to a file containing the downloaded data, which the transfer
  process relinquishes control of to git-lfs. git-lfs will move the file into
  LFS storage.

Or, if there was a failure transferring this item:

```json
{ "event": "complete", "oid": "22ab5f63670800cc7be06dbed816012b0dc411e774754c7579467d2536a9cf3e", "error": { "code": 2, "message": "Explain what happened to this transfer" } }
```

* `event`: Always `complete` to identify this message
* `oid`: the identifier of the LFS object
* `error`: Should contain a `code` and `message` explaining the error

Errors for a single transfer request should not terminate the process. The error
should be returned in the response structure instead.

The custom transfer adapter does not need to check the SHA of the file content
it has downloaded; git-lfs will do that before moving the final content into
the LFS store.
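
The rule that a failed transfer is reported rather than allowed to kill the process can be sketched as a small wrapper. `run_transfer` and the `transfer` callable are hypothetical, and the error code 2 is simply copied from the examples above, since this document does not define specific codes.

```python
def run_transfer(request, transfer):
    """Run `transfer(request)` and always return a `complete` message.

    `transfer` is a hypothetical callable; for downloads it returns the path
    of the downloaded file, for uploads its return value is ignored.
    """
    response = {"event": "complete", "oid": request["oid"]}
    try:
        path = transfer(request)
    except Exception as exc:
        # Report the failure in the response instead of terminating.
        response["error"] = {"code": 2, "message": str(exc)}
        return response
    if request["event"] == "download":
        response["path"] = path
    return response
```

Catching the exception at this level keeps the process alive for the next request that git-lfs sends.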

##### Progress

In order to support progress reporting while data is uploading / downloading,
the transfer process should post messages to stdout as follows before sending
the final completion message:

```json
{ "event": "progress", "oid": "22ab5f63670800cc7be06dbed816012b0dc411e774754c7579467d2536a9cf3e", "bytesSoFar": 1234, "bytesSinceLast": 64 }
```

* `event`: Always `progress` to identify this message
* `oid`: the identifier of the LFS object
* `bytesSoFar`: the total number of bytes transferred so far
* `bytesSinceLast`: the number of bytes transferred since the last progress
  message

The transfer process should post these messages such that the last one sent
has `bytesSoFar` equal to the file size on success.

#### Stage 3: Finish & Cleanup

When all transfers have been processed, git-lfs will send the following message
to the stdin of the transfer process:

```json
{ "event": "terminate" }
```

On receiving this message the transfer process should clean up and terminate.
No response is expected.

## Error handling

Any unexpected fatal errors in the transfer process (not errors specific to a
transfer request) should set the exit code to non-zero and print information to
stderr. Otherwise the exit code should be 0 even if some transfers failed.

## A Note On Verify Actions

You may have noticed that only the `upload` and `download` actions are
passed to the custom transfer agent for processing. What about the `verify`
action, if the API returns one?

Custom transfer agents do not handle the verification process, only the
upload and download of content. The verify link is typically used to notify
a system *other* than the actual content store after an upload has completed,
so it makes more sense for that to be handled via the normal API process.
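
As a closing illustration of the progress messages described in the Progress section, the following Python sketch keeps the two byte counters consistent. `progress_messages` is a hypothetical helper, and it assumes the agent knows the size of each chunk as it is transferred.

```python
def progress_messages(oid, chunk_sizes):
    """Yield one progress message per transferred chunk."""
    bytes_so_far = 0
    for chunk_size in chunk_sizes:
        bytes_so_far += chunk_size
        yield {
            "event": "progress",
            "oid": oid,
            "bytesSoFar": bytes_so_far,
            "bytesSinceLast": chunk_size,
        }
```

If the chunk sizes sum to the object size, the last message has `bytesSoFar` equal to the file size, as the protocol expects on success.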