
# Greenhouse

Greenhouse is our [bazel remote caching](https://docs.bazel.build/versions/master/remote-caching.html) setup.
We use this to provide faster build & test presubmits with a globally shared cache (per repo).

Most Bazel users should probably visit [the official docs](https://docs.bazel.build/versions/master/remote-caching.html) and select one of the options outlined there; with Prow/Kubernetes we use a custom setup to explore:

- better support for multiple repos / cache invalidation by changing the cache URL suffix
  (see also: `images/bootstrap/create_bazel_cache_rcs.sh`, and the sketch below)
- customized cache eviction / management
- integration with our logging and metrics stacks
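
To make the URL-suffix scheme concrete, here is a hedged sketch of a generated
bazelrc entry; the host, port, and suffix format are illustrative assumptions,
not the exact output of `create_bazel_cache_rcs.sh`:

```
# Hypothetical values: check images/bootstrap/create_bazel_cache_rcs.sh for
# the real generation logic before relying on any of these names.
CACHE_URL="http://bazel-cache.default:8080"   # assumed greenhouse Service DNS
REPO_KEY="k8s.io/kubernetes"                  # repo under test, used as suffix
cat >> "${HOME}/.bazelrc" <<EOF
build --remote_http_cache=${CACHE_URL}/${REPO_KEY}
EOF
```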

## Setup (on a Kubernetes Cluster)
We use this with [Prow](./../prow); to set it up we do the following:

 - Install [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/) and [bazel](https://bazel.build/), and point `KUBECONFIG` at your cluster.
   - for k8s.io use `make -C prow get-build-cluster-credentials`
 - Create a dedicated node. We use a GKE node-pool with a single node. Tag this node with the label `dedicated=greenhouse` and the taint `dedicated=greenhouse:NoSchedule` so your other workloads don't schedule on it.
   - for k8s.io (running on GKE) this is:
   ```
   gcloud beta container node-pools create greenhouse --cluster=prow --project=k8s-prow-builds --zone=us-central1-f --node-taints=dedicated=greenhouse:NoSchedule --node-labels=dedicated=greenhouse --machine-type=n1-standard-32 --num-nodes=1
   ```
   - if you're not on GKE you'll probably want to pick a node to dedicate and do something like:
   ```
   kubectl label nodes $GREENHOUSE_NODE_NAME dedicated=greenhouse
   kubectl taint nodes $GREENHOUSE_NODE_NAME dedicated=greenhouse:NoSchedule
   ```
 - Create the Kubernetes service so jobs can talk to it conveniently: `kubectl apply -f greenhouse/service.yaml`
 - Create a `StorageClass` / `PersistentVolumeClaim` for fast cache storage; we use `kubectl apply -f greenhouse/gce-fast-storage.yaml` for 3TB of pd-ssd storage
 - Finally, deploy with `kubectl apply -f greenhouse/deployment.yaml` and sanity-check the result (see the sketch after this list)
   <!--TODO(bentheelder): make this easier to consume by other users?-->
   - NOTE: other users will likely need to tweak this step to their needs, in particular the service and storage definitions
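
A quick sanity check after deploying (a hedged sketch; `app=greenhouse` and the
`bazel-cache` service name are assumptions, so match them to whatever
`greenhouse/deployment.yaml` and `greenhouse/service.yaml` actually declare):

```
# Confirm the pod was scheduled onto the dedicated node and that the
# service has endpoints backing it.
kubectl get pods -l app=greenhouse -o wide
kubectl get endpoints bazel-cache
```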

## Optional Setup

- tweak `metrics-service.yaml` and point Prometheus at this service to collect metrics (a verification sketch follows)
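
To verify the metrics endpoint is serving (a rough sketch; the service name,
port, and `/metrics` path are assumptions, so check `metrics-service.yaml` and
the greenhouse flags for the real values):

```
# Port-forward the assumed metrics service and read a few lines locally.
kubectl port-forward service/greenhouse-metrics 9090:9090 &
curl -s http://127.0.0.1:9090/metrics | head
```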

## Cache Keying

See [./../images/bootstrap/create_bazel_cache_rcs.sh](./../images/bootstrap/create_bazel_cache_rcs.sh)
for our cache keying algorithm.

In short (see the sketch below):
- we locate a number of host binaries known to be used by bazel (e.g. the
system C compiler) within our image
- we then look up the package that owns each binary
- from that we look up the package's exact installed version
- we combine these with the repo under test / built to compute a primary cache key

This avoids [bazel#4558](https://github.com/bazelbuild/bazel/issues/4558).
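
A minimal sketch of that idea, assuming a Debian-based image with `dpkg`
available (the binary list, key format, and hashing are illustrative, not the
exact logic of `create_bazel_cache_rcs.sh`):

```
# Illustrative only: derive a key from toolchain package versions plus the repo.
REPO="k8s.io/kubernetes"   # hypothetical repo under test
key_material="${REPO}"
for bin in cc gcc g++ ld; do
  path="$(command -v "${bin}")" || continue
  pkg="$(dpkg -S "${path}" 2>/dev/null | cut -d: -f1)" || continue
  ver="$(dpkg-query -W -f '${Version}' "${pkg}" 2>/dev/null)"
  key_material="${key_material},${pkg}=${ver}"
done
# Hash the collected versions into a stable primary cache key.
echo -n "${key_material}" | md5sum | cut -d' ' -f1
```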

## Playbook

For operational details on working with greenhouse, see the [Greenhouse Playbook][greenhouse-playbook].

[greenhouse-playbook]: /docs/playbooks/greenhouse.md