github.com/sercand/please@v13.4.0+incompatible/docs/cache.html

    <h1>Please cache</h1>

    <p>These docs refer to caches in various places.
      Please can make use of several kinds of cache to speed up builds; they are described here.</p>

    <p>In all cases artifacts are only stored in the cache after a successful build or test run.<br/>
      Please takes a <code>--nocache</code> flag which disables all caches for an individual run.</p>

    <h2>The directory cache</h2>

    <p>This is the simplest kind of cache; it's on by default and is simply a directory tree
      (by default <code>~/.cache/please</code> or <code>~/Library/Caches/please</code>)
      containing various versions of built artifacts. Its main advantage is that it allows
      extremely fast rebuilds when swapping between different versions of code
      (notably git branches).</p>

    <p>Note that the dir cache is <b>not</b> threadsafe or locked in any way beyond plz's normal
      repo lock, so sharing the same directory between multiple projects is probably a Bad Idea.</p>
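
    <p>To make the "various versions" idea concrete, here is a minimal sketch of a directory
      cache that keeps one subdirectory per version of a target. The layout, the class and
      function names, and the use of a SHA-256 of the source as the version key are all
      assumptions made for illustration; they are not Please's actual on-disk format.</p>

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


class DirCache:
    """Toy directory cache laid out as <root>/<package>/<target>/<key>/<file>.

    Hypothetical layout for illustration only, not Please's real format.
    """

    def __init__(self, root: Path) -> None:
        self.root = Path(root)

    def _entry(self, label: str, key: str) -> Path:
        # "//src/core:core" -> <root>/src/core/core/<key>
        package, _, name = label.lstrip("/").partition(":")
        return self.root / package / name / key

    def store(self, label: str, key: str, artifact: Path) -> None:
        """Copy a built artifact into the entry for this version."""
        entry = self._entry(label, key)
        entry.mkdir(parents=True, exist_ok=True)
        shutil.copy2(artifact, entry / artifact.name)

    def retrieve(self, label: str, key: str, out_dir: Path) -> bool:
        """Copy cached files into out_dir; return False on a cache miss."""
        entry = self._entry(label, key)
        if not entry.is_dir():
            return False
        out_dir.mkdir(parents=True, exist_ok=True)
        for cached in entry.iterdir():
            shutil.copy2(cached, out_dir / cached.name)
        return True


def source_key(source: bytes) -> str:
    """Version key derived from the source (an assumption for this sketch)."""
    return hashlib.sha256(source).hexdigest()


def demo() -> list:
    """Build on two 'branches', then switch back to the first one."""
    with tempfile.TemporaryDirectory() as tmp_name:
        tmp = Path(tmp_name)
        cache = DirCache(tmp / "cache")
        built = tmp / "core.a"
        for source in (b"branch A", b"branch B"):
            built.write_bytes(b"compiled:" + source)
            cache.store("//src/core:core", source_key(source), built)
        out = tmp / "out"
        # Branch A's artifact is still in the cache, so no rebuild is needed.
        hit = cache.retrieve("//src/core:core", source_key(b"branch A"), out)
        content = (out / "core.a").read_bytes()
        # A version that was never built is a miss.
        miss = cache.retrieve("//src/core:core", source_key(b"branch C"), out)
        return [hit, content, miss]
```

    <p>Calling <code>demo()</code> stores two versions of the same target and then restores the
      first one without rebuilding, which is the branch-swapping case described above. Note
      that the sketch does no locking at all, so it has the same sharing caveat.</p>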

    <h2>The HTTP cache</h2>

    <p>This is a more advanced cache which, as one would expect, can run on a centralised machine
      to share artifacts between multiple clients. It has a reasonably simple API, for reference:</p>
    <ul>
      <li><code>GET /artifact/{os_name}/{artifact}</code>: Retrieves a particular artifact.</li>
      <li><code>POST /artifact/{os_name}/{artifact}</code>: Stores a particular artifact.</li>
      <li><code>DELETE /artifact/{artifact}</code>: Deletes all versions of a given artifact.</li>
      <li><code>DELETE /</code>: Deletes all artifacts.</li>
    </ul>

    <p>We should document this in more detail, especially since the formats can be subtle
      (some awkward corner cases require multipart encoding), but as described below it is probably
      preferable to implement the RPC cache instead.</p>
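
    <p>As a rough illustration of the four endpoints listed above, the following sketch builds
      (but does not send) the corresponding requests. The host, port, <code>os_name</code> value
      and artifact path in the usage note are made up, and the function name is hypothetical.
      The body format for <code>POST</code> is not documented here, so the sketch deliberately
      leaves the request body out.</p>

```python
import urllib.request
from urllib.parse import quote


def artifact_request(base_url: str, method: str, artifact: str = "",
                     os_name: str = "") -> urllib.request.Request:
    """Build a request object for one of the four HTTP cache endpoints.

    Nothing is sent over the network; the caller decides what to do with it.
    """
    base = base_url.rstrip("/")
    if method in ("GET", "POST"):
        if not (artifact and os_name):
            raise ValueError(f"{method} needs both os_name and artifact")
        # GET/POST /artifact/{os_name}/{artifact}
        url = f"{base}/artifact/{quote(os_name)}/{quote(artifact)}"
    elif method == "DELETE" and artifact:
        # DELETE /artifact/{artifact}: all versions of one artifact
        url = f"{base}/artifact/{quote(artifact)}"
    elif method == "DELETE":
        # DELETE /: everything in the cache
        url = f"{base}/"
    else:
        raise ValueError(f"unsupported method: {method}")
    return urllib.request.Request(url, method=method)
```

    <p>For example, <code>artifact_request("http://localhost:8080", "GET", "src/core/core.a", "linux_amd64")</code>
      produces a request for <code>/artifact/linux_amd64/src/core/core.a</code>, assuming a cache
      server were listening at that (hypothetical) address.</p>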

    <p>The cache runs as a daemon and is fully threadsafe, so it is safe for multiple clients
      to store and retrieve artifacts simultaneously. Because it's a daemon it maintains
      its own stats about what artifacts it has, so it can be a little more intelligent than the dir
      cache about what it should delete and when.</p>
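
    <p>The exact cleaning policy is not described here. Purely as an illustration of how
      per-artifact stats could drive deletion, the sketch below picks least-recently-used
      artifacts until the cache fits under a size limit. The <code>Stat</code> fields, the
      function name and the LRU policy itself are assumptions, not the server's documented
      behaviour.</p>

```python
from dataclasses import dataclass


@dataclass
class Stat:
    """Hypothetical per-artifact bookkeeping kept by a cache daemon."""
    size: int           # bytes on disk
    last_access: float  # timestamp of the most recent store or retrieve


def pick_evictions(stats: dict, max_bytes: int) -> list:
    """Return artifact names to delete, least recently used first,
    stopping as soon as the remaining total fits within max_bytes."""
    total = sum(stat.size for stat in stats.values())
    victims = []
    for name, stat in sorted(stats.items(), key=lambda kv: kv[1].last_access):
        if total <= max_bytes:
            break
        victims.append(name)
        total -= stat.size
    return victims
```

    <p>A plain directory cache has no running process to keep this kind of bookkeeping, which
      is why a daemon is better placed to make such decisions.</p>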

    <p>Please comes with an implementation of this cache as a standalone binary.</p>

    <p>Thanks to Diana Costea who implemented the original version of this as part of her internship
      with us, and prodded us into getting on and actually deploying it for our CI servers.</p>

    <h2>The RPC cache</h2>

    <p>This is very similar conceptually to the HTTP cache, but uses <a href="http://grpc.io">gRPC</a>
      for communication (which of course itself uses HTTP underneath). The motivation was partly that
      we realised fairly late that our cache semantics sometimes require multiple files to be communicated
      in a single request, which implied multipart encodings, which are of course not great fun. We
      also felt that we could get better performance this way. An awful lot of the internal code is
      shared with the HTTP server, so only the transport layer really differs.</p>

    <p>The API can be found in <code>src/cache/proto/rpc_cache.proto</code>. It's not very complex,
      so it would not be hard to implement, although again Please comes with an implementation of this
      cache as a standalone binary.</p>

    <h2>Notes</h2>

    <p>Our current CI setup leans very heavily on these caches; every checkin to master triggers a build
      of our repo in a clean environment, so initially nothing is present in plz-out. The build machines
      maintain a local directory cache, though, which ensures things are pretty fast (a
      cache hit is obviously a bit slower than having the artifact in plz-out already, but it's pretty quick).</p>

    <p>The build machines all push artifacts into a single central RPC cache and pull them back again as
      needed, so generally after the initial test run of a PR subsequent builds are fast. Developer machines use
      the RPC cache in read-only mode, so they can benefit from these too; it would be nice if everyone was read-write,
      but generally the shared cache is vulnerable to developers having incompatible machine setups
      (for example, different Go versions, even minor ones, cannot share compiled artifacts, so many bad things
      will happen if developers aren't all on exactly the same one). A very long-term goal is for Please
      to have better insight into the machine-level deps of these things, which would potentially allow
      everyone to have read-write access.</p>
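
    <p>The setup described above (a local cache in front of a shared one, with build machines
      writing and developer machines only reading) can be modelled roughly as follows. This is
      an in-memory sketch with hypothetical class names; in particular, copying remote hits
      back into the local layer is an assumption of the sketch rather than documented
      behaviour.</p>

```python
class Cache:
    """One cache layer backed by a dict; several clients may share a backing dict."""

    def __init__(self, backing=None, writeable=True):
        self.backing = {} if backing is None else backing
        self.writeable = writeable

    def store(self, key, data):
        # Read-only clients silently skip writes.
        if self.writeable:
            self.backing[key] = data

    def retrieve(self, key):
        return self.backing.get(key)


class LayeredCache:
    """Tries each layer in order, fastest first."""

    def __init__(self, *layers):
        self.layers = layers

    def store(self, key, data):
        for layer in self.layers:
            layer.store(key, data)

    def retrieve(self, key):
        for index, layer in enumerate(self.layers):
            data = layer.retrieve(key)
            if data is not None:
                # Back-fill the faster layers so the next lookup is local.
                for faster in self.layers[:index]:
                    faster.store(key, data)
                return data
        return None


# One central store shared by every client.
central = {}
ci_machine = LayeredCache(Cache(), Cache(central, writeable=True))
developer = LayeredCache(Cache(), Cache(central, writeable=False))
```

    <p>With this arrangement anything <code>ci_machine</code> stores is visible to
      <code>developer</code>, while anything <code>developer</code> stores stays in its own local
      layer and never reaches <code>central</code>, so an oddly configured developer machine
      cannot affect anyone else.</p>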

    <p>In theory Please ensures hashes are as expected before storing or retrieving artifacts, but as just
      noted there are ways to cause problems nonetheless. Hence the various caches store artifacts in
      a pretty obvious filesystem structure analogous to the repo structure itself, so if any single
      artifact is wrong it's not hard to excise it.</p>
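
    <p>A minimal sketch of the "check the hash before storing or retrieving" idea follows. It
      assumes the hash is a SHA-256 of the artifact's contents and that a bad entry is simply
      dropped; both are assumptions made for illustration, as are the class and exception names.
      Note that a check like this only catches corrupted contents. It cannot detect the
      incompatible-machine problem described above, where the artifact is intact but was built
      in a different environment.</p>

```python
import hashlib


class HashMismatch(Exception):
    """Raised when data does not match the hash it was offered under."""


class VerifyingCache:
    """In-memory cache that checks content hashes on the way in and out."""

    def __init__(self):
        self._blobs = {}

    @staticmethod
    def _digest(data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()

    def store(self, expected_hash: str, data: bytes) -> None:
        # Refuse to store anything whose contents don't match the claimed hash.
        if self._digest(data) != expected_hash:
            raise HashMismatch(expected_hash)
        self._blobs[expected_hash] = data

    def retrieve(self, expected_hash: str):
        data = self._blobs.get(expected_hash)
        if data is None:
            return None  # plain cache miss
        if self._digest(data) != expected_hash:
            # The stored entry went bad: excise it and report a miss.
            del self._blobs[expected_hash]
            return None
        return data
```

    <p>Treating a bad entry as a miss means the worst case is a rebuild rather than a wrong
      artifact being used.</p>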