# Greenhouse

Greenhouse is our [bazel remote caching](https://docs.bazel.build/versions/master/remote-caching.html) setup.
We use this to provide faster build & test presubmits with a globally shared cache (per repo).

Most Bazel users should probably visit [the official docs](https://docs.bazel.build/versions/master/remote-caching.html) and select one of the options outlined there. With Prow / Kubernetes we use a custom setup to explore:

- better support for multiple repos / cache invalidation by changing the cache URL suffix
  (see also: `images/bootstrap/create_bazel_cache_rcs.sh`; a hypothetical example of the resulting flags appears in the appendix at the end of this document)
- customized cache eviction / management
- integration with our logging and metrics stacks


## Setup (on a Kubernetes Cluster)

We use this with [Prow](./../prow). To set it up we do the following:

- Install [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/) and [bazel](https://bazel.build/), and point `KUBECONFIG` at your cluster.
  - for k8s.io use `make -C prow get-build-cluster-credentials`
- Create a dedicated node. We use a GKE node-pool with a single node. Tag this node with the label `dedicated=greenhouse` and the taint `dedicated=greenhouse:NoSchedule` so your other tasks don't schedule on it.
  - for k8s.io (running on GKE) this is:
    ```
    gcloud beta container node-pools create greenhouse --cluster=prow --project=k8s-prow-builds --zone=us-central1-f --node-taints=dedicated=greenhouse:NoSchedule --node-labels=dedicated=greenhouse --machine-type=n1-standard-32 --num-nodes=1
    ```
  - if you're not on GKE you'll probably want to pick a node to dedicate and do something like:
    ```
    kubectl label nodes $GREENHOUSE_NODE_NAME dedicated=greenhouse
    kubectl taint nodes $GREENHOUSE_NODE_NAME dedicated=greenhouse:NoSchedule
    ```
- Create the Kubernetes service so jobs can talk to it conveniently: `kubectl apply -f greenhouse/service.yaml`
- Create a `StorageClass` / `PersistentVolumeClaim` for fast cache storage; we use `kubectl apply -f greenhouse/gce-fast-storage.yaml` for 3TB of pd-ssd storage.
- Finally, deploy with `kubectl apply -f greenhouse/deployment.yaml`. Some hypothetical commands for sanity-checking the deployment appear in the appendix at the end of this document.
<!--TODO(bentheelder): make this easier to consume by other users?-->
- NOTE: other users will likely need to tweak these steps to their needs, in particular the service and storage definitions.


## Optional Setup

- tweak `metrics-service.yaml` and point Prometheus at this service to collect metrics

## Cache Keying

See [./../images/bootstrap/create_bazel_cache_rcs.sh](./../images/bootstrap/create_bazel_cache_rcs.sh)
for our cache keying algorithm.

In short:
- we locate a number of host binaries known to be used by bazel (e.g. the
  system C compiler) within our image
- we then look up the package that owns each binary
- from that we look up the package's exact installed version
- we use these, in conjunction with the repo under test / being built, to compute a primary cache key

This avoids [bazel#4558](https://github.com/bazelbuild/bazel/issues/4558). A rough shell sketch of this keying scheme appears in the appendix below.

## Playbook

For operational details on working with greenhouse, see the [Greenhouse Playbook][greenhouse-playbook].

[greenhouse-playbook]: /docs/playbooks/greenhouse.md
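
## Appendix: Illustrative Examples

The snippets below are sketches to illustrate the setup described above; names and labels in them are assumptions, not checked-in configuration.

### Sanity-checking the deployment

After applying `deployment.yaml`, you can check that the cache pod landed on the dedicated node and that the service exists. The label selector and service name here are assumptions; match them to your actual `deployment.yaml` and `service.yaml`:

```
# assumes the deployment labels its pods with app=greenhouse
kubectl get pods -o wide -l app=greenhouse
# assumes service.yaml creates a service named bazel-cache in the default namespace
kubectl get service bazel-cache
```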
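
### Example cache flags

Jobs select a per-repo cache by varying a URL suffix when talking to the service. The following is a hypothetical sketch of the kind of `.bazelrc` contents that `images/bootstrap/create_bazel_cache_rcs.sh` generates; the host name, port, and exact flag are assumptions here:

```
# hypothetical generated flags; the real values are computed at job startup
# by images/bootstrap/create_bazel_cache_rcs.sh
build --remote_http_cache=http://bazel-cache.default:8080/kubernetes/kubernetes,<cache-key>
```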
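
### Sketch of the cache keying

A minimal shell sketch of the keying idea from the Cache Keying section, assuming a Debian-based image where `dpkg` can map binaries to packages. The real implementation is `images/bootstrap/create_bazel_cache_rcs.sh`; `get_tool_version` and `REPO_NAME` are illustrative names, not the script's actual interface:

```
#!/usr/bin/env bash
# Map a host binary to "package=version" using dpkg (Debian-based images).
get_tool_version() {
  local binary pkg
  binary="$(command -v "$1")" || return 0    # skip tools that aren't installed
  binary="$(readlink -f "${binary}")"        # resolve symlinks like /usr/bin/cc
  pkg="$(dpkg -S "${binary}" 2>/dev/null | cut -d: -f1)"
  [ -n "${pkg}" ] && dpkg-query -W -f='${Package}=${Version}\n' "${pkg}"
}

# Hash the exact installed tool versions together, then combine the digest
# with the repo under test / being built to form the primary cache key.
versions="$(get_tool_version cc; get_tool_version gcc; get_tool_version python)"
key="$(printf '%s' "${versions}" | sha256sum | cut -d' ' -f1)"
echo "cache key: ${REPO_NAME},${key}"        # REPO_NAME comes from the CI environment
```

Keying on the exact package versions of the host toolchain means that upgrading, say, the system C compiler in the image automatically starts a fresh cache instead of mixing outputs from different compilers.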