github.com/munnerz/test-infra@v0.0.0-20190108210205-ce3d181dc989/prow/scaling.md (about)

     1  # Using Prow at Scale
     2  
     3  If you are maintaining a Prow instance that will need to scale to handle a large
     4  load, consider using the following best practices, features, and additional tools.
     5  You may also be interested in ["Getting more out of Prow"](/prow/more_prow.md).
     6  
     7  ## Features and Tools
     8  
     9  ### Separate Build Cluster(s)
    10  
    11  It is frequently not secure to run all ProwJobs in the same cluster that runs
    12  Prow's service components (`hook`, `plank`, etc.). In particular, ProwJobs that
    13  execute presubmit tests should typically be isolated from Prow's microservices.
    14  This isolation prevents a malicious PR author from modifying the presubmit test
    15  to do something evil like breaking out of the container and stealing secrets
    16  that live in the cluster.
    17  
    18  More than one build cluster can be used in order to isolate specific jobs from
    19  each other, improve scalability, and offer different node shapes.
    20  Instructions for configuring jobs to run in different clusters can be found [here](/prow/getting_started_deploy.md#Run-test-pods-in-different-clusters).
    21  
    22  ### Pull Request Merge Automation
    23  
    24  Pull Requests can be automatically merged when they satisfy configured merge
    25  requirements using [`tide`](/prow/cmd/tide/). Automating merge is critical for
    26  large projects where allowing humans to click the merge button is a bottleneck,
    27  a security concern, or both. Tide ensures that PRs have been tested
    28  against the most recent base branch commit before merging (retesting if
    29  necessary), and automatically groups multiple PRs to be tested and merged as a
    30  batch whenever possible.
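
The "tested against the most recent base branch commit" rule can be sketched as a small triage step. This is an illustration only, not Tide's actual algorithm: the `pullRequest` type, its fields, and the `triage` function are hypothetical names invented for this example.

```go
package main

// pullRequest is a hypothetical, simplified view of a PR for this sketch.
type pullRequest struct {
	number     int
	testedBase string // base branch commit the most recent test run used
	passed     bool   // whether that test run succeeded
}

// triage splits PRs into those whose passing results were produced against
// baseSHA (safe to merge) and those whose passing results are stale and
// must be retested against baseSHA first. Failing PRs appear in neither list.
func triage(prs []pullRequest, baseSHA string) (merge, retest []int) {
	for _, p := range prs {
		switch {
		case !p.passed:
			// Nothing to do until the author fixes the PR.
		case p.testedBase == baseSHA:
			merge = append(merge, p.number)
		default:
			retest = append(retest, p.number)
		}
	}
	return merge, retest
}
```

Calling `triage` with the current head of the base branch yields the PR numbers that could merge immediately and those that would need a fresh test run.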
    31  
    32  ### Config File Split
    33  
    34  If your Prow config starts to grow too large, consider splitting the job config
    35  files into more specific and easily reviewed files. To use this pattern, simply
    36  aggregate all job configs in a directory of files with unique base names and
    37  supply the directory path to components via `--job-config-path`.
    38  
    39  The [`updateconfig` plugin](/prow/plugins/updateconfig) supports this pattern by
    40  allowing multiple files to be loaded into a single configmap under different keys
    41  (different files once mounted to a container).
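
Because every file ends up under a key derived from its name, two files with the same base name would collide. A minimal sketch of a pre-submit check for this, assuming job configs are `.yaml` files somewhere under one directory, is shown below. `collectJobConfigs` and `demoCollect` are hypothetical helpers written for this example and are not part of Prow.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// collectJobConfigs walks root and returns a map from base name to path for
// every .yaml file found. It returns an error on the first duplicate base name.
func collectJobConfigs(root string) (map[string]string, error) {
	seen := map[string]string{}
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || filepath.Ext(path) != ".yaml" {
			return nil
		}
		base := filepath.Base(path)
		if prev, ok := seen[base]; ok {
			return fmt.Errorf("duplicate base name %q: %s and %s", base, prev, path)
		}
		seen[base] = path
		return nil
	})
	if err != nil {
		return nil, err
	}
	return seen, nil
}

// demoCollect creates the given files (slash-separated relative paths) in a
// temporary directory, runs collectJobConfigs on it, and returns the number
// of configs found.
func demoCollect(files ...string) (int, error) {
	root, err := os.MkdirTemp("", "job-config")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(root)
	for _, f := range files {
		p := filepath.Join(root, filepath.FromSlash(f))
		if err := os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
			return 0, err
		}
		if err := os.WriteFile(p, []byte("presubmits: {}\n"), 0o644); err != nil {
			return 0, err
		}
	}
	found, err := collectJobConfigs(root)
	return len(found), err
}
```

A check like this could run in a presubmit job against the job config directory so that a colliding file name is caught before the config is updated.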
    42  
    43  ### GitHub API Cache
    44  
    45  [`ghproxy`](/ghproxy/) is a reverse proxy HTTP cache optimized for the GitHub API.
    46  It takes advantage of how GitHub responds to ETags in order to fulfill repeated
    47  requests without spending additional API tokens. Check out this tool if you find
    48  that your GitHub bot is consuming or approaching its token limit. Similarly,
    49  re-deploying Prow components may trigger a large number of API requests to GitHub,
    50  which may trip the abuse detection mechanisms. At scale, the `tide` deployment
    51  itself may create enough API throughput to trigger this on its own. Deploying the
    52  GitHub proxy cache is critical to ensuring that Prow does not trip this mechanism
    53  when operating at scale.
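
The ETag mechanism can be illustrated with a small caching `http.RoundTripper`. This is a sketch of the general conditional-request technique, not `ghproxy`'s implementation: `etagCache`, `entry`, and `demoETag` are hypothetical names, the cache is unbounded and keyed only by URL, and the demo talks to a local stub server rather than GitHub.

```go
package main

import (
	"bytes"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
)

// entry is one cached response body together with the ETag it was served with.
type entry struct {
	etag string
	body []byte
}

// etagCache revalidates cached GET responses with If-None-Match and serves
// the cached body whenever the server answers 304 Not Modified.
type etagCache struct {
	next  http.RoundTripper
	mu    sync.Mutex
	byURL map[string]entry
}

func (c *etagCache) RoundTrip(req *http.Request) (*http.Response, error) {
	if req.Method != http.MethodGet {
		return c.next.RoundTrip(req)
	}
	key := req.URL.String()
	c.mu.Lock()
	cached, ok := c.byURL[key]
	c.mu.Unlock()
	if ok {
		req = req.Clone(req.Context()) // a RoundTripper must not mutate the caller's request
		req.Header.Set("If-None-Match", cached.etag)
	}
	resp, err := c.next.RoundTrip(req)
	if err != nil {
		return nil, err
	}
	if ok && resp.StatusCode == http.StatusNotModified {
		// Unchanged upstream: answer from the cache as if it were a 200.
		resp.Body.Close()
		resp.StatusCode, resp.Status = http.StatusOK, "200 OK"
		resp.Body = io.NopCloser(bytes.NewReader(cached.body))
		resp.ContentLength = int64(len(cached.body))
		return resp, nil
	}
	if etag := resp.Header.Get("ETag"); etag != "" && resp.StatusCode == http.StatusOK {
		body, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			return nil, err
		}
		c.mu.Lock()
		c.byURL[key] = entry{etag: etag, body: body}
		c.mu.Unlock()
		resp.Body = io.NopCloser(bytes.NewReader(body))
	}
	return resp, nil
}

// demoETag sends the same GET twice to a stub server and returns how many
// full responses the server produced and the bodies the client received.
func demoETag() (int, []string, error) {
	var full int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("If-None-Match") == `"v1"` {
			w.WriteHeader(http.StatusNotModified)
			return
		}
		atomic.AddInt32(&full, 1)
		w.Header().Set("ETag", `"v1"`)
		io.WriteString(w, "payload")
	}))
	defer srv.Close()
	client := &http.Client{Transport: &etagCache{next: http.DefaultTransport, byURL: map[string]entry{}}}
	var bodies []string
	for i := 0; i < 2; i++ {
		resp, err := client.Get(srv.URL)
		if err != nil {
			return 0, nil, err
		}
		b, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			return 0, nil, err
		}
		bodies = append(bodies, string(b))
	}
	return int(atomic.LoadInt32(&full)), bodies, nil
}
```

In the demo, the second request is answered with a bodiless 304 and the client still sees the original payload, which is the behavior that lets repeated requests avoid spending additional API tokens.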
    54  
    55  ### Config Driven GitHub Org Management
    56  
    57  Managing org and repo scoped settings across multiple orgs and repos is not easy
    58  with the mechanisms that GitHub provides. Only a few people have access to the
    59  settings, they must be manually synced between repos, and they can easily become
    60  inconsistent. These problems grow with the number of orgs/repos and with the number
    61  of contributors.
    62  We have a few tools that automate this kind of administration and integrate well
    63  with Prow:
    64  - [`label_sync`](/label_sync/) is a tool that synchronizes labels and their
    65  metadata across multiple orgs and repos in order to provide a consistent user
    66  experience in a multi-repo project.
    67  - [`branch_protector`](/prow/cmd/branchprotector) is a Prow component that
    68  synchronizes GitHub branch requirements and restrictions based on config.
    69  - [`peribolos`](/prow/cmd/peribolos) is a tool that synchronizes org settings,
    70  teams, and memberships based on config.
    71  
    72  ### Metrics
    73  
    74  Prow exposes some [Prometheus metrics](/prow/metrics/README.md) that can be used to generate graphs and
    75  alerts. If you are maintaining a Prow instance that handles important workloads
    76  you should consider using these metrics for monitoring.
    77  
    78  ## Best Practices
    79  
    80  ### Don’t share Prow’s GitHub bot token with other automation.
    81  
    82  Some parts of Prow do not behave well if the GitHub bot token's rate limit is
    83  exhausted. It is imperative to avoid this, so it is good practice not to use
    84  Prow's bot token for any other purpose.
    85  
    86  ### Working around GitHub's limited ACLs.
    87  
    88  GitHub provides an extremely limited access control system that makes it
    89  impossible to control granular permissions like authority to add and remove
    90  specific labels from PRs and issues. Instead, write access to the entire
    91  repo must be granted. This problem grows as projects scale and granular
    92  permissions become more important.
    93  
    94  Much of the GitHub automation that Prow provides is designed to fill in the gaps
    95  in GitHub's permission system. The core idea is to limit repo write access to
    96  the Prow bot (and a minimal number of repo admins) and then let Prow determine
    97  if users have the appropriate permissions before taking action on their behalf.
    98  The following is an overview of some of the automation Prow implements to work
    99  around GitHub's limited permission system:
   100    - Permission to trigger presubmit tests is determined based on org membership
   101    as configured in the [`triggers`](https://github.com/kubernetes/test-infra/blob/526195d3e22cb90d784c1e4db1c43041a006c848/prow/plugins/plugins.go#L180) plugin config section.
   102    - File ownership is described with OWNERS files and change approval is
   103    enforced with the [`approve` plugin](/prow/plugins/approve). See the [docs](/prow/plugins/approve/approvers/README.md) for details.
   104    - Org member review of the most recent version of the PR is enforced with the
   105    [`lgtm` plugin](/prow/plugins/lgtm).
   106    - Various other plugins manage labels, milestones, and issue state based on
   107    `/foo` style commands from authorized users. Authorization may be based on
   108    org membership, GitHub team membership, or OWNERS file membership.
   109    - [`Tide`](/prow/cmd/tide) provides PR merge automation so that humans do not need to (and are not
   110    allowed to) merge PRs. Without Tide, a user either has no permission to
   111    merge or they have repo write access which grants permission to merge any PR
   112    in the entire repo. Additionally, Tide enforces merge requirements like
   113    required and forbidden labels that humans may not respect if they are allowed
   114    to manually click the merge button.
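
The core idea above, where only the bot holds write access and checks the requesting user before acting, can be sketched as follows. This is a hypothetical illustration, not code from any Prow plugin: the `membership` interface, `staticMembers`, `handleLabelCommand`, and `errUnauthorized` are all invented for this example, and a real implementation would consult GitHub org or team membership or OWNERS files.

```go
package main

import "errors"

// membership is a hypothetical authorization lookup.
type membership interface {
	IsMember(org, user string) bool
}

// staticMembers is an in-memory membership keyed by "org/user", for the demo only.
type staticMembers map[string]bool

func (s staticMembers) IsMember(org, user string) bool { return s[org+"/"+user] }

var errUnauthorized = errors.New("user is not authorized for this command")

// handleLabelCommand applies label on behalf of user, but only after the
// bot's own authorization check passes. The apply callback stands in for the
// write operation that only the bot's token is permitted to perform.
func handleLabelCommand(m membership, org, user, label string, apply func(string) error) error {
	if !m.IsMember(org, user) {
		return errUnauthorized
	}
	return apply(label)
}
```

With this shape, users never need repo write access themselves: an unauthorized request returns `errUnauthorized` without the write callback ever being invoked.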