# Overview

Workload identity is the best practice for authenticating as a service account
when running on GKE.

See https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity for
how this works.

## Configuration
### Cluster and node pools

First enable workload identity on the cluster and all node pools in the cluster:

```bash
enable-workload-identity.sh K8S_PROJECT ZONE CLUSTER
```

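The internals of `enable-workload-identity.sh` are not reproduced here. As a rough sketch, assuming it wraps the standard `gcloud` commands for enabling workload identity, the helper below prints (rather than runs) those commands. The function names and the explicit node-pool arguments are hypothetical and exist only for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical dry-run helper (not part of test-infra): prints the gcloud
# commands rather than running them, so it is safe to try without credentials.

# The workload pool of a project is named PROJECT.svc.id.goog.
workload_pool() {
  printf '%s.svc.id.goog\n' "$1"
}

# Usage: print_enable_commands K8S_PROJECT ZONE CLUSTER [NODE_POOL...]
print_enable_commands() {
  local project=$1 zone=$2 cluster=$3
  shift 3
  local scope="--project=${project} --zone=${zone}"
  # Step 1: enable workload identity on the cluster itself.
  echo "gcloud container clusters update ${cluster} ${scope}" \
    "--workload-pool=$(workload_pool "${project}")"
  # Step 2: switch every listed node pool to the GKE metadata server.
  local pool
  for pool in "$@"; do
    echo "gcloud container node-pools update ${pool} --cluster=${cluster}" \
      "${scope} --workload-metadata=GKE_METADATA"
  done
}
```

For example, `print_enable_commands K8S_PROJECT ZONE CLUSTER default-pool` prints one `clusters update` line and one `node-pools update` line, where `default-pool` is a placeholder pool name.
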
### Kubernetes service account

Next, ensure the Kubernetes service account exists and has an
`iam.gke.io/gcp-service-account` annotation. This associates it with the desired
GCP service account.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    iam.gke.io/gcp-service-account: SOMEBODY@PROJECT.iam.gserviceaccount.com
  name: SOMETHING
  namespace: SOMEWHERE
```

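When the same manifest is needed for several namespaces or accounts, it can be generated rather than copied. The following is a minimal sketch; `render_ksa` is a hypothetical helper name, not something shipped with test-infra.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of test-infra): renders the annotated
# ServiceAccount manifest for a given namespace, name and GCP service account.

# Usage: render_ksa NAMESPACE NAME GCP_SERVICE_ACCOUNT_EMAIL
render_ksa() {
  local namespace=$1 name=$2 gsa=$3
  cat <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    iam.gke.io/gcp-service-account: ${gsa}
  name: ${name}
  namespace: ${namespace}
EOF
}
```

The output can be piped to `kubectl apply -f -`, for example `render_ksa SOMEWHERE SOMETHING SOMEBODY@PROJECT.iam.gserviceaccount.com | kubectl apply -f -`. For a service account that already exists, `kubectl annotate serviceaccount SOMETHING --namespace SOMEWHERE iam.gke.io/gcp-service-account=SOMEBODY@PROJECT.iam.gserviceaccount.com` adds the same annotation.
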
### GCP service account

Once this service account exists in the cluster, an owner of
`SOMEBODY@PROJECT.iam.gserviceaccount.com` (typically a `PROJECT` owner)
runs `bind-service-accounts.sh`:

```bash
bind-service-accounts.sh \
  K8S_PROJECT ZONE CLUSTER SOMEWHERE SOMETHING \
  SOMEBODY@PROJECT.iam.gserviceaccount.com
```

This script assumes the same person can access both `K8S_PROJECT` and
`PROJECT`. If that is not true, the `PROJECT` owner can run this
command directly:

```bash
# Note: K8S_PROJECT is the project owning the GKE cluster
#       whereas PROJECT owns the service account (may be the same)
# The member is quoted so the shell does not treat [...] as a glob pattern.
gcloud iam service-accounts add-iam-policy-binding \
  --project=PROJECT \
  --role=roles/iam.workloadIdentityUser \
  --member="serviceAccount:K8S_PROJECT.svc.id.goog[SOMEWHERE/SOMETHING]" \
  SOMEBODY@PROJECT.iam.gserviceaccount.com
```

This is what tells GCP that the `SOMEWHERE/SOMETHING` service account in
`K8S_PROJECT` is authorized to act as
`SOMEBODY@PROJECT.iam.gserviceaccount.com`.

These steps are all described in the GKE how-to doc linked at the top.

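The member string is easy to get wrong because it mixes the cluster's project, the namespace and the Kubernetes service account name. As a sketch, the hypothetical helpers below (not part of test-infra) assemble it from its parts and print the binding command without running it.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of test-infra) for the workload identity
# binding. Nothing here calls gcloud; the command is only printed.

# Usage: wi_member K8S_PROJECT NAMESPACE KSA_NAME
wi_member() {
  local k8s_project=$1 namespace=$2 ksa=$3
  # The square brackets are literal characters of the member name.
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$k8s_project" "$namespace" "$ksa"
}

# Usage: print_binding K8S_PROJECT NAMESPACE KSA_NAME GSA_PROJECT GSA_EMAIL
print_binding() {
  local member
  member=$(wi_member "$1" "$2" "$3")
  # The member is single-quoted in the output so it can be pasted into any shell.
  printf "gcloud iam service-accounts add-iam-policy-binding --project=%s --role=roles/iam.workloadIdentityUser --member='%s' %s\n" \
    "$4" "$member" "$5"
}
```

Calling `wi_member K8S_PROJECT SOMEWHERE SOMETHING` yields `serviceAccount:K8S_PROJECT.svc.id.goog[SOMEWHERE/SOMETHING]`, matching the member used in the command above.
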
### Pods

At this point any pod that:
* Runs in a `K8S_PROJECT` GKE cluster
* Runs inside the `SOMEWHERE` namespace
* Uses the `SOMETHING` service account

will authenticate as `SOMEBODY@PROJECT.iam.gserviceaccount.com` to GCP. The
`bind-service-accounts.sh` script will verify this (see the how-to doc above for
the manual command).

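How `bind-service-accounts.sh` performs its verification is not shown here. As an illustration only, a check run from inside such a pod could compare the active account against the expected one; `verify_identity` is a hypothetical name, and the `gcloud auth list` invocation in the comment assumes an image that ships `gcloud`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of test-infra): compares the account a pod is
# actually authenticated as against the GCP service account it should use.

# Usage: verify_identity EXPECTED_GSA_EMAIL ACTUAL_ACCOUNT
verify_identity() {
  local expected=$1 actual=${2:-}
  if [[ "$actual" == "$expected" ]]; then
    echo "OK: authenticated as ${actual}"
  else
    echo "MISMATCH: expected ${expected}, got ${actual:-<none>}" >&2
    return 1
  fi
}

# Inside the pod, the actual account could be obtained with:
#   gcloud auth list --filter=status:ACTIVE --format='value(account)'
```

A non-zero exit status signals that the pod is running as a different identity or as none at all, which usually points to a missing annotation or binding.
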
Whenever you want a pod to authenticate this way, just configure the
`serviceAccountName` on the `PodSpec`. See an example
[pod](https://github.com/GoogleCloudPlatform/testgrid/blob/5c7bc80b18ccf00c773c34583628091890b401ab/cluster/summarizer_deployment.yaml#L22)
and
[prowjob](https://github.com/GoogleCloudPlatform/oss-test-infra/blob/9c466c14e4b4b5fbc8c837d0c61c779194e82d56/prow/prowjobs/GoogleCloudPlatform/oss-test-infra/gcp-oss-test-infra-config.yaml#L65).

Here are a minimal deployment and prow job that use an image and args to print
the authenticated user to `STDOUT`:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo
  namespace: SOMEWHERE # from above
  labels:
    app: foo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: foo
  template:
    metadata:
      labels:
        app: foo
    spec:
      serviceAccountName: SOMETHING # from above
      containers:
      - name: whatever
        image: google/cloud-sdk:slim
        args:
        - gcloud
        - auth
        - list
```

```yaml
periodics:
- name: foo
  interval: 10m
  decorate: true
  spec:
    serviceAccountName: SOMETHING # from above, note: namespace is chosen by prow
    containers:
    - image: google/cloud-sdk:slim
      args:
      - gcloud
      - auth
      - list
```

### Migrate Prow Job to Use Workload Identity

> **Note: Workload identity works best with pod utilities**: If your job does
> not set `decorate: true`, migrate it to pod utilities first and come back to
> this doc once that's done.

#### Background

Prow jobs run in Kubernetes pods, and each pod is responsible for uploading
artifacts to a designated remote storage location (e.g. GCS). This upload is
done by a container called `sidecar`. To upload to GCS, the `sidecar` container
needs a way to authenticate to GCP. Historically this was done with a GCP
service account key (normally stored as the `service-account` secret in build
clusters), which prow mounts onto the `sidecar` container for authenticating
with GCP.

Workload identity is a keyless solution from GCP that is considered more secure
than storing keys. Prow supports it by running the entire pod with a service
account that has permission to perform GCS operations.

There are two scenarios for migrating from GCP service account keys to workload
identity, and the steps differ between them. The differentiator is whether the
prow job itself interacts directly with GCP. If either of the following configs
is in the job, the answer is very likely yes:

```yaml
volumes:
  - name: <DOES_NOT_MATTER>
    secret:
      secretName: service-account
```

```yaml
labels:
  preset-service-account: "true"
```

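To triage many job configs at once, the two markers can be searched for mechanically. The following is a rough sketch under the assumption that a plain text match is good enough: it does not parse YAML and works per file rather than per job, and `uses_sa_key` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of test-infra): flags a prow job config file
# that still references the service account key.

# Usage: uses_sa_key JOB_CONFIG.yaml   (exit status 0 means "key still in use")
uses_sa_key() {
  local file=$1
  # Either marker suggests a job in this file talks to GCP with the mounted key.
  grep -Eq 'secretName:[[:space:]]*"?service-account"?[[:space:]]*$|preset-service-account:[[:space:]]*"?true"?' "$file"
}
```

A file for which `uses_sa_key` succeeds most likely contains a job that belongs to the "interacting with GCP" scenario; the remaining files are candidates for the simpler migration.
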
#### Migration Steps when Not Interacting with GCP

Add the following sections to the prow job config:

```yaml
decoration_config:
  gcs_credentials_secret: "" # Use workload identity for uploading artifacts
spec:
  # This account exists in the "default" build cluster for now; for any other build
  # cluster it can be set up by following the steps from the top of this doc.
  serviceAccountName: prowjob-default-sa
```

See [example PR of
migration](https://github.com/kubernetes/test-infra/pull/26374).

#### Migration Steps when Interacting with GCP

1. Before migrating, inspect the test process/script invoked by the prow job and
   remove any logic that assumes the existence of a GCP service account key file
   or the `GOOGLE_APPLICATION_CREDENTIALS` environment variable.

1. Create a new GCP service account with the necessary GCP IAM permissions, and
   set up workload identity with the new service account by following
   [Kubernetes service account](#kubernetes-service-account) and
   [GCP service account](#gcp-service-account) above.

1. Modify the prow job config by adding sections similar to those above,
   replacing the service account name with the new service account name.
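
For the first step, a text search over the test scripts is often a reasonable starting point. The sketch below assumes that key usage shows up either as `GOOGLE_APPLICATION_CREDENTIALS` or as a `gcloud auth activate-service-account` call; other patterns may exist in a given repo, and `find_key_assumptions` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of test-infra): lists lines in a script tree
# that appear to assume a GCP service account key file.

# Usage: find_key_assumptions DIR
find_key_assumptions() {
  local dir=$1
  # GOOGLE_APPLICATION_CREDENTIALS and `gcloud auth activate-service-account`
  # are two common ways for scripts to consume a mounted key file.
  grep -rnE 'GOOGLE_APPLICATION_CREDENTIALS|activate-service-account' "$dir"
}
```

Each hit is printed as `path:line:text`, and the function returns a non-zero status when nothing matches.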