# Using Prow at Scale

If you are maintaining a Prow instance that will need to scale to handle a large
load, consider using the following best practices, features, and additional tools.
You may also be interested in ["Getting more out of Prow"](/prow/more_prow.md).

## Features and Tools

### Separate Build Cluster(s)

It is frequently not secure to run all ProwJobs in the same cluster that runs
Prow's service components (`hook`, `plank`, etc.). In particular, ProwJobs that
execute presubmit tests should typically be isolated from Prow's microservices.
This isolation prevents a malicious PR author from modifying the presubmit test
to do something evil like breaking out of the container and stealing secrets
that live in the cluster.

More than one build cluster can be used in order to isolate specific jobs from
each other, improve scalability, and offer different node shapes.
Instructions for configuring jobs to run in different clusters can be found [here](/prow/getting_started_deploy.md#Run-test-pods-in-different-clusters).

### Pull Request Merge Automation

Pull Requests can be automatically merged when they satisfy configured merge
requirements using [`tide`](/prow/cmd/tide/). Automating merges is critical for
large projects where allowing humans to click the merge button is either a
bottleneck, a security concern, or both. Tide ensures that PRs have been tested
against the most recent base branch commit before merging (retesting if
necessary), and automatically groups multiple PRs to be tested and merged as a
batch whenever possible.

### Config File Split

If your Prow config starts to grow too large, consider splitting the job config
files into more specific and easily reviewed files.
To use this pattern, simply
aggregate all job configs in a directory of files with unique base names and
supply the directory path to components via `--job-config-path`.

The [`updateconfig` plugin](/prow/plugins/updateconfig) supports this pattern by
allowing multiple files to be loaded into a single configmap under different keys
(different files once mounted to a container).

### GitHub API Cache

[`ghproxy`](/ghproxy/) is a reverse proxy HTTP cache optimized for the GitHub API.
It takes advantage of how GitHub responds to ETags in order to fulfill repeated
requests without spending additional API tokens. Check out this tool if you find
that your GitHub bot is consuming or approaching its token limit. Similarly,
re-deploying Prow components may trigger a large number of API requests to GitHub,
which may trip the abuse detection mechanisms. At scale, the `tide` deployment
itself may create enough API throughput to trigger this on its own. Deploying the
GitHub proxy cache is critical to ensuring that Prow does not trip this mechanism
when operating at scale.

### Config Driven GitHub Org Management

Managing org and repo scoped settings across multiple orgs and repos is not easy
with the mechanisms that GitHub provides. Only a few people have access to the
settings, they must be manually synced between repos, and they can easily become
inconsistent. These problems grow with the number of orgs/repos and with the number
of contributors.
We have a few tools that automate this kind of administration and integrate well
with Prow:
- [`label_sync`](/label_sync/) is a tool that synchronizes labels and their
  metadata across multiple orgs and repos in order to provide a consistent user
  experience in a multi-repo project.
- [`branch_protector`](/prow/cmd/branchprotector) is a Prow component that
  synchronizes GitHub branch requirements and restrictions based on config.
- [`peribolos`](/prow/cmd/peribolos) is a tool that synchronizes org settings,
  teams, and memberships based on config.

### Metrics

Prow exposes some [Prometheus metrics](/prow/metrics/README.md) that can be used to generate graphs and
alerts. If you are maintaining a Prow instance that handles important workloads,
you should consider using these metrics for monitoring.

## Best Practices

### Don’t share Prow’s GitHub bot token with other automation.

Some parts of Prow do not behave well if the GitHub bot token's rate limit is
exhausted. It is imperative to avoid this, so it is good practice not to use
Prow's bot token for any other purpose.

### Working around GitHub's limited ACLs.

GitHub provides an extremely limited access control system that makes it
impossible to control granular permissions like authority to add and remove
specific labels from PRs and issues. Instead, write access to the entire
repo must be granted. This problem grows as projects scale and granular
permissions become more important.

Much of the GitHub automation that Prow provides is designed to fill in the gaps
in GitHub's permission system. The core idea is to limit repo write access to
the Prow bot (and a minimal number of repo admins) and then let Prow determine
if users have the appropriate permissions before taking action on their behalf.
The following is an overview of some of the automation Prow implements to work
around GitHub's limited permission system:
- Permission to trigger presubmit tests is determined based on org membership
  as configured in the [`triggers`](https://github.com/kubernetes/test-infra/blob/526195d3e22cb90d784c1e4db1c43041a006c848/prow/plugins/plugins.go#L180) plugin config section.
- File ownership is described with OWNERS files and change approval is
  enforced with the [`approve` plugin](/prow/plugins/approve).
  See the [docs](/prow/plugins/approve/approvers/README.md) for details.
- Org member review of the most recent version of the PR is enforced with the
  [`lgtm` plugin](/prow/plugins/lgtm).
- Various other plugins manage labels, milestones, and issue state based on
  `/foo` style commands from authorized users. Authorization may be based on
  org membership, GitHub team membership, or OWNERS file membership.
- [`Tide`](/prow/cmd/tide) provides PR merge automation so that humans do not need to (and are not
  allowed to) merge PRs. Without Tide, a user either has no permission to
  merge or they have repo write access, which grants permission to merge any PR
  in the entire repo. Additionally, Tide enforces merge requirements like
  required and forbidden labels that humans may not respect if they are allowed
  to manually click the merge button.
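
The required and forbidden label check mentioned in the last item can be illustrated with a small sketch. This is not Tide's implementation or config schema. The `MergeRequirements` type, the `is_mergeable` function, and the label names are all hypothetical, chosen only to show the shape of the rule: a PR qualifies when it carries every required label and none of the forbidden ones.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MergeRequirements:
    """Hypothetical label policy for illustration; not Tide's config schema."""
    required: frozenset = field(default_factory=frozenset)
    forbidden: frozenset = field(default_factory=frozenset)


def is_mergeable(labels, reqs):
    """Return True if all required labels are present and no forbidden label is."""
    present = set(labels)
    has_all_required = reqs.required <= present
    has_no_forbidden = not (reqs.forbidden & present)
    return has_all_required and has_no_forbidden


# Label names below are assumptions made for this example.
reqs = MergeRequirements(
    required=frozenset({"lgtm", "approved"}),
    forbidden=frozenset({"do-not-merge/hold"}),
)

print(is_mergeable(["lgtm", "approved"], reqs))                       # True
print(is_mergeable(["lgtm", "approved", "do-not-merge/hold"], reqs))  # False
print(is_mergeable(["lgtm"], reqs))                                   # False
```

Because the bot evaluates the rule, it is applied the same way to every PR, whereas a human with write access could merge regardless of labels.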