<h1>Please cache</h1>

<p>In various places in these docs you might find reference to caches.
Please can make use of several caches to speed up its performance; they are described here.</p>

<p>In all cases artifacts are only stored in the cache after a successful build or test run.<br/>
Please takes a <code>--nocache</code> flag which disables all caches for an individual run.</p>

<h2>The directory cache</h2>

<p>This is the simplest kind of cache; it's on by default and is simply a directory tree
(by default <code>~/.cache/please</code> or <code>~/Library/Caches/please</code>)
containing various versions of built artifacts. The main advantage of this is that it allows
extremely fast rebuilds when swapping between different versions of code
(notably git branches).</p>

<p>Note that the dir cache is <b>not</b> threadsafe or locked in any way beyond plz's normal
repo lock, so sharing the same directory between multiple projects is probably a Bad Idea.</p>

<h2>The HTTP cache</h2>

<p>This is a more advanced cache which, as one would expect, can run on a centralised machine
to share artifacts between multiple clients. It has a reasonably simple API, for reference:
<ul>
  <li><code>GET /artifact/{os_name}/{artifact}</code>: Retrieves a particular artifact.</li>
  <li><code>POST /artifact/{os_name}/{artifact}</code>: Stores a particular artifact.</li>
  <li><code>DELETE /artifact/{artifact}</code>: Deletes all versions of a given artifact.</li>
  <li><code>DELETE /</code>: Deletes all artifacts.</li>
</ul>

We should document this in more detail, especially since the formats can be subtle
(an awkward corner case means some requests need multipart encoding), but as described below
it is probably preferable to implement the RPC cache instead.
</p>

<p>The cache runs as a daemon and is fully threadsafe, so it is safe for multiple clients
to attempt to store / retrieve artifacts simultaneously. Because it's a daemon it maintains
its own stats about what artifacts it has, so it can be a little more intelligent than the dir
cache about what it should delete and when.</p>

<p>Please comes with an implementation of this cache as a standalone binary.</p>

<p>Thanks to Diana Costea who implemented the original version of this as part of her internship
with us, and prodded us into getting on and actually deploying it for our CI servers.</p>

<h2>The RPC cache</h2>

<p>This is very similar conceptually to the HTTP cache, but uses <a href="http://grpc.io">gRPC</a>
for communication (of course that uses HTTP itself underneath). The motivation was partly that
we realised fairly late that our cache semantics sometimes require multiple files to be communicated
in a single request, which implied multipart encodings, which are of course not great fun. We
also felt that we could get better performance this way. An awful lot of the internal code is
shared with the HTTP server, so only the transport layer really differs.</p>

<p>The API can be found in <code>src/cache/proto/rpc_cache.proto</code>. It's not very complex
so would not be hard to implement, although again Please comes with an implementation of this
cache as a standalone binary.</p>

<h2>Notes</h2>

<p>Our current CI setup leans very heavily on these caches; every checkin to master triggers a build
of our repo in a clean environment, so initially nothing is present in plz-out.
The build machines maintain a local directory cache though, which ensures things are pretty fast
(the penalty for a cache hit is obviously a bit worse than having the artifact in plz-out already,
but it's pretty quick).</p>

<p>The build machines all push artifacts into a single central RPC cache and pull them back again as
needed, so generally after an initial test run of a PR subsequent builds are fast. Developer machines use
the RPC cache in read-only mode so they can benefit from these too; it would be nice if everyone were read-write
but generally the shared cache is vulnerable to developers having incompatible machine setups
(for example, different Go versions, even minor ones, cannot share compiled artifacts, so many bad things
will happen if developers aren't all on exactly the same one). A very long-term goal is for Please
to have better insight into the machine-level deps of these things, which would potentially allow
everyone to have read-write access.</p>

<p>Theoretically Please ensures hashes are as expected before storing or retrieving artifacts, but as just
noted there are ways to cause problems nonetheless. Hence the various caches store artifacts in
a pretty obvious filesystem structure analogous to the repo structure itself, so if any single
artifact is wrong it's not hard to excise it.</p>
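
<p>As a rough illustration of the HTTP cache endpoints listed in the HTTP cache section above, the
following Python sketch maps each of the four operations to an HTTP method and URL. It is not part of
Please: the server address, the function names, and the example OS name and artifact path are all
hypothetical, and request bodies are deliberately left out because their format is not documented here.</p>

```python
from urllib.parse import quote

# Hypothetical address of a running HTTP cache server (an assumption, not from the docs).
BASE_URL = "http://localhost:8080"


def request_for(action, artifact=None, os_name=None, base_url=BASE_URL):
    """Return the (HTTP method, URL) pair for one of the four listed operations.

    Only the method and path shapes come from the endpoint list; everything
    else here (names, validation, escaping) is illustrative.
    """
    root = base_url.rstrip("/")
    if action == "delete_all":
        # DELETE / removes every artifact.
        return "DELETE", root + "/"

    if artifact is None:
        raise ValueError("this action needs an artifact name")
    # Assumption: artifact names are path-like, so slashes are kept as-is.
    artifact_path = quote(artifact, safe="/")

    if action == "delete":
        # DELETE /artifact/{artifact} removes all versions; no OS name in the path.
        return "DELETE", f"{root}/artifact/{artifact_path}"

    if action in ("retrieve", "store"):
        if os_name is None:
            raise ValueError("retrieve and store need an os_name")
        method = "GET" if action == "retrieve" else "POST"
        return method, f"{root}/artifact/{quote(os_name, safe='')}/{artifact_path}"

    raise ValueError(f"unknown action: {action}")
```

<p>For example, <code>request_for("retrieve", "src/core/core.a", "linux_amd64")</code> yields a
<code>GET</code> to <code>/artifact/linux_amd64/src/core/core.a</code> on the assumed server. Keeping the
function free of network calls makes the URL scheme easy to check in isolation; an actual client would
still need to handle the request and response bodies.</p>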