```yaml
title: Workload Identity Integration
authors:
  - @sonasingh46
reviewers:
  - @aramase
  - @CecileRobertMichon
  - @yastij
  - @fabriziopandini

creation-date: 2022-11-16
last-updated: 2023-05-23
status: implementable
see-also:
  - https://github.com/kubernetes-sigs/cluster-api-provider-azure/issues/2205
```

# Workload Identity Integration

## <a name='TableofContents'></a>Table of Contents

<!-- vscode-markdown-toc -->
* [Table of Contents](#TableofContents)
* [Acronyms](#Acronyms)
* [Summary](#Summary)
* [Motivation](#Motivation)
  * [Goals](#Goals)
  * [Non Goals](#NonGoals)
* [Personas](#Personas)
* [User Stories](#UserStories)
* [Proposal](#Proposal)
  * [Implementation Details/Notes/Constraints](#ImplementationDetailsNotesConstraints)
    * [Key Generation](#KeyGeneration)
    * [OIDC URL Setup](#OidcUrlSetup)
    * [Set Service Account Signing Flags](#SetServiceAccountSigningFlags)
    * [Federated Credential](#FederatedCredential)
    * [Distribute Keys To Management Cluster](#DistributeKeys)
  * [Cloud Provider Azure Integration](#CloudProviderAzureIntegration)
  * [Proposed API Changes](#ProposedApiChanges)
  * [Proposed Deployment Configuration Changes](#ProposedConfigurationChanges)
  * [Proposed Controller Changes](#ProposedControllerChanges)
    * [Identity](#Identity)
  * [Open Questions](#OpenQuestions)
    * [1. How to achieve multi-tenancy?](#Howtomultitenancy)
    * [2. How to distribute key pair to management cluster?](#Howtodistributekeys)
    * [3. User Experience](#UserExperience)
  * [Migration Plan](#MigrationPlan)
  * [Test Plan](#TestPlan)
* [Implementation History](#ImplementationHistory)

<!-- vscode-markdown-toc-config
	numbering=false
	autoSave=false
	/vscode-markdown-toc-config -->
<!-- /vscode-markdown-toc -->

## <a name='Acronyms'></a>Acronyms
| Acronym      | Full Form               |
| ------------ | ------------------------|
| AD           | Active Directory        |
| AAD          | Azure Active Directory  |
| AZWI         | Azure Workload Identity |
| OIDC         | OpenID Connect          |
| JWKS         | JSON Web Key Sets       |

## <a name='Summary'></a>Summary

Workloads deployed in a Kubernetes cluster may require Azure Active Directory application credentials or managed identities to access Azure-protected resources, e.g. Azure Key Vault or Virtual Machines. AAD Pod Identity provides access to Azure resources through Azure Managed Identities, without the need for secret management. AAD Pod Identity is now deprecated, and Azure AD Workload Identity is its successor. This design proposal defines how AZWI is integrated into CAPZ for self-managed clusters, keeping in mind other factors such as user experience, multi-tenancy, and backwards compatibility.

For more details about AAD Pod Identity, please visit this [link](https://github.com/Azure/aad-pod-identity).

## <a name='Motivation'></a>Motivation

AZWI provides the capability to federate an identity with external identity providers in a Kubernetes-native way. This approach overcomes several limitations of AAD Pod Identity, as listed below.
- Removes the scale and performance issues that existed for identity assignment.
- Supports Kubernetes clusters hosted in any cloud or on premises.
- Supports both Linux and Windows workloads and removes the need for CRDs and for pods that intercept IMDS traffic.


From the CAPZ perspective:
- The AAD pod identity pod has to be deployed on all the nodes where the CAPZ pod can potentially be scheduled.
- It depends on the CNI and requires modifying iptables on the node.
- AAD pod identity is now deprecated.

To learn more about AZWI, please visit this link: https://azure.github.io/azure-workload-identity/docs/introduction.html

### <a name='Goals'></a>Goals

- Enable the CAPZ pod to use AZWI on the management cluster to authenticate to Azure in order to create, update, and delete resources as part of workload cluster lifecycle management.

### <a name='NonGoals'></a>Non Goals

- Use workload identity for cloud provider azure once supported.

- Migrate to using workload identity in CI pipelines.

- Automation for migration from AAD pod identity to workload identity.

- Bootstrapping components (storage account, discovery document and JWKS) and enabling WI in the workload cluster.

## <a name='Personas'></a>Personas

The following personas are available when writing user stories.

- John - Cloud Admin
  - Installs, configures, and maintains management and workload clusters on Azure using CAPZ.

## <a name='UserStories'></a>User Stories

- [S1] As a cloud admin, I want to use workload identity in the management cluster in order to enhance security by not using static Azure credentials. I prefer to use CAPI pivoting to create the management cluster, which means creating a management cluster on Kind, creating a workload cluster on Azure from it, and then converting that workload cluster into a management cluster.

- [S2] As a cloud admin, I want to install CAP* on an existing Kubernetes cluster and make it a management cluster that creates and manages workload clusters by using workload identity. For example, creating a management cluster on an already existing Kubernetes cluster on Azure.

- [S3] As a cloud admin, I want to migrate my management cluster, which still uses the older AAD pod identity, to workload identity.

## <a name='Proposal'></a>Proposal

In the AZWI model, the Kubernetes cluster itself becomes the token issuer, issuing tokens to Kubernetes service accounts. These service accounts can be configured to be trusted by an Azure AD application or a user-assigned managed identity. A workload pod can use the service account token that is projected into its volume and exchange it for an Azure AD access token.

The first step in creating a Kubernetes cluster using CAPZ is creating a management cluster. At a high level, the workflow for using AZWI in the management cluster looks as follows.

**Management Cluster on Kind**

Notes:
- A management cluster created on Kind is often also termed a bootstrap cluster.
- A cloud provider azure deployment is not required in this case.
- This is also used in CAPI pivoting, i.e. creating a management cluster on Kind, later creating a workload cluster from it, converting that workload cluster into a management cluster, and decommissioning the Kind cluster. Refer to user story [S1](#UserStories).

User Workflow:

- The operator/admin generates a signing key pair or brings their own key pair.

- A Kind cluster is created with the appropriate flags on kube-apiserver and kube-controller-manager, and the key pair is mounted on the container path of the control plane node. See [this](#set-service-account-signing-flags) section for more details.
  - kube-apiserver flags
    - --service-account-issuer
    - --service-account-signing-key-file
    - --service-account-key-file
  - kube-controller-manager flags
    - --service-account-private-key-file

- The operator/admin uploads the following two documents to a blob storage container.
  These documents are publicly accessible at a URL, which is commonly referred to as the issuer URL in this context. More about this [here](https://azure.github.io/azure-workload-identity/docs/installation/self-managed-clusters/oidc-issuer.html).
  - Generate and upload the Discovery Document.
  - Generate and upload the JWKS Document.

- A federated identity credential should be created between the identity and `<Issuer URL, service account namespace, service account name>`. More on this [here](https://azure.github.io/azure-workload-identity/docs/topics/federated-identity-credential.html).

- CAPI and CAPZ are deployed on the Kind cluster that supports workload identity.

- The CAPZ pod uses the client ID and tenant ID of the Azure AD application or user-assigned identity, which are passed in the AzureClusterIdentity CR. AzureClusterIdentity can also specify the type of identity as `WorkloadIdentity` in the `type` field.

- The management cluster is now configured to use workload identity. A workload cluster can now be created by referencing the AzureClusterIdentity.


**Management Cluster on Azure Via Pivoting**

Notes:
- Cloud provider azure is required to run on the Kubernetes cluster in this case, as the management cluster is created on Azure.

User Workflow:

- All the steps described above in `Management Cluster on Kind` are followed, with the exception that a secret named `<cluster-name>-sa` is created containing the key pair generated in the previous step. This is done so that the key pair gets distributed to the control plane nodes. More details on it [here](https://cluster-api.sigs.k8s.io/tasks/certs/using-custom-certificates.html).

- A workload cluster is created using Azure static credentials or AAD pod identity.

- After the workload cluster is created, the `clusterctl init` and `clusterctl move` commands are executed to convert it into a management cluster.
  More on this [here](https://cluster-api.sigs.k8s.io/clusterctl/commands/move.html?highlight=pivot#bootstrap--pivot).

- The workload cluster is now converted into a management cluster but still uses static credentials. A migration step can be followed to migrate to workload identity.

### <a name='ImplementationDetailsNotesConstraints'></a>Implementation Details/Notes/Constraints

- AAD pod identity can coexist with AZWI.
- Migration plan to AZWI for existing clusters. Refer to the [MigrationPlan](#migration-plan) section at the bottom of the document.
- For AZWI to work, the following prerequisites must be met for a self-managed cluster. They are not required for a managed cluster; follow this [link](https://azure.github.io/azure-workload-identity/docs/installation/managed-clusters.html) to know more about managed cluster setup.
  - Key Generation
  - OIDC URL Setup
  - Set Service Account Signing Flags

#### <a name='KeyGeneration'></a>Key Generation

The admin should generate a signing key pair by using a tool such as openssl, or bring their own public and private keys.
These keys will be mounted on a path on the containers running on the control plane node. They are required for signing the service account tokens that will be used by the CAPZ pod.

#### <a name='OidcUrlSetup'></a>OIDC URL Setup

Two JSON documents, the Discovery document and the JWKS document, need to be generated and published at a public URL. The OIDC discovery document contains the metadata of the issuer. The JSON Web Key Sets (JWKS) document contains the public signing key(s) that allow AAD to verify the authenticity of the service account token.

Refer to this [link](https://azure.github.io/azure-workload-identity/docs/installation/self-managed-clusters/oidc-issuer.html) for the steps to set up an OIDC issuer URL.

At a high level, the setup steps are the following:
- Create an Azure blob storage account.
- Create a storage container.
- Generate the OIDC discovery and JWKS documents.
- Make the documents accessible at a publicly accessible URL, which will be used later as the issuer URL.

#### <a name='SetServiceAccountSigningFlags'></a>Set Service Account Signing Flags

Set up the flags on the Kind cluster. An example is shown below:

```yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
    # path on node where the public key exists
    - hostPath: ${SERVICE_ACCOUNT_KEY_FILE}
      containerPath: /etc/kubernetes/pki/sa.pub
    # path on node where the private key exists
    - hostPath: ${SERVICE_ACCOUNT_SIGNING_KEY_FILE}
      containerPath: /etc/kubernetes/pki/sa.key
  kubeadmConfigPatches:
  - |
    kind: ClusterConfiguration
    apiServer:
      extraArgs:
        # the oidc url after it has been set up
        service-account-issuer: ${SERVICE_ACCOUNT_ISSUER}
        service-account-key-file: /etc/kubernetes/pki/sa.pub
        service-account-signing-key-file: /etc/kubernetes/pki/sa.key
    controllerManager:
      extraArgs:
        service-account-private-key-file: /etc/kubernetes/pki/sa.key
```

#### <a name='FederatedCredential'></a>Federated Credential

A federated identity credential should be created using the Azure CLI (or via the Azure portal).
Please see [this](https://azure.github.io/azure-workload-identity/docs/topics/federated-identity-credential.html) for reference.

```bash
az identity federated-credential create \
  --name "kubernetes-federated-credential" \
  --identity-name "${USER_ASSIGNED_IDENTITY_NAME}" \
  --resource-group "${RESOURCE_GROUP}" \
  --issuer "${SERVICE_ACCOUNT_ISSUER}" \
  --subject "system:serviceaccount:${SERVICE_ACCOUNT_NAMESPACE}:${SERVICE_ACCOUNT_NAME}"
```

#### <a name='DistributeKeys'></a>Distribute Keys

The key pair can be distributed to a workload cluster by creating a secret named `<cluster-name>-sa` that contains the key pair.
Follow this [link](https://cluster-api.sigs.k8s.io/tasks/certs/using-custom-certificates.html) for more details.

### <a name='CloudProviderAzureIntegration'></a>Cloud Provider Azure Integration

- Cloud provider azure should be deployed with the projected service account token config in its config YAML.

- CAPZ should create the cloud config by setting the following values if workload identity is used.
  ```go
  AADFederatedTokenFile                 string `json:"aadFederatedTokenFile,omitempty" yaml:"aadFederatedTokenFile,omitempty"`
  UseFederatedWorkloadIdentityExtension bool   `json:"useFederatedWorkloadIdentityExtension,omitempty" yaml:"useFederatedWorkloadIdentityExtension,omitempty"`
  ```


### <a name='ProposedApiChanges'></a>Proposed API Changes

The AzureClusterIdentity spec has a `Type` field that defines which type of Azure identity should be used.

```go
// AzureClusterIdentitySpec defines the parameters that are used to create an AzureIdentity.
type AzureClusterIdentitySpec struct {
	// Type is the type of Azure Identity used.
	// ServicePrincipal, ServicePrincipalCertificate, UserAssignedMSI or ManualServicePrincipal.
	Type IdentityType `json:"type"`

	// ...
	// ...
}
```

- One more acceptable value for `Type` in the AzureClusterIdentity spec is proposed, for workload identity.

```go
// IdentityType represents different types of identities.
// +kubebuilder:validation:Enum=ServicePrincipal;UserAssignedMSI;ManualServicePrincipal;ServicePrincipalCertificate;WorkloadIdentity
type IdentityType string

const (
	// UserAssignedMSI represents a user-assigned managed identity.
	UserAssignedMSI IdentityType = "UserAssignedMSI"

	// ServicePrincipal represents a service principal using a client password as secret.
	ServicePrincipal IdentityType = "ServicePrincipal"

	// ManualServicePrincipal represents a manual service principal.
	ManualServicePrincipal IdentityType = "ManualServicePrincipal"

	// ServicePrincipalCertificate represents a service principal using a certificate as secret.
	ServicePrincipalCertificate IdentityType = "ServicePrincipalCertificate"

	// [Proposed Change] WorkloadIdentity represents azure workload identity.
	WorkloadIdentity IdentityType = "WorkloadIdentity"
)

```
- Making the `type` field on AzureClusterIdentity immutable is also proposed.

### <a name='ProposedConfigurationChanges'></a>Proposed Deployment Configuration Changes

- The service account token projected volume and the corresponding volume mount should be added to the CAPZ manager deployment config as described below:

```yaml
volumeMounts:
- mountPath: /var/run/secrets/azure/tokens
  name: azure-identity-token
  readOnly: true
...
volumes:
- name: azure-identity-token
  projected:
    defaultMode: 420
    sources:
    - serviceAccountToken:
        audience: api://AzureADTokenExchange
        expirationSeconds: 3600
        path: azure-identity-token
```

### <a name='ProposedControllerChanges'></a>Proposed Controller Changes

The identity code workflow in CAPZ should use the `azidentity` module to exchange the service account token for an AAD token, as shown in the next section.
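
To make the link between the projected volume and the controller code concrete, here is a minimal sketch of how the token file location follows from the `mountPath` and `path` values in the deployment configuration above. The constant and helper names (`tokenMountPath`, `tokenFileName`, `tokenFilePath`, `readToken`) are hypothetical and only serve the illustration; they are not taken from the CAPZ code base.

```go
package main

import (
	"fmt"
	"os"
	"path"
	"strings"
)

const (
	// Both values mirror the projected volume in the deployment config above.
	tokenMountPath = "/var/run/secrets/azure/tokens"
	tokenFileName  = "azure-identity-token"
)

// tokenFilePath returns where the kubelet writes the projected
// service account token inside the CAPZ manager container.
func tokenFilePath() string {
	return path.Join(tokenMountPath, tokenFileName)
}

// readToken (hypothetical helper) loads the token and rejects an empty file,
// since an empty assertion cannot be exchanged for an AAD token.
func readToken(file string) (string, error) {
	content, err := os.ReadFile(file)
	if err != nil {
		return "", fmt.Errorf("reading projected token %q: %w", file, err)
	}
	token := strings.TrimSpace(string(content))
	if token == "" {
		return "", fmt.Errorf("projected token %q is empty", file)
	}
	return token, nil
}

func main() {
	fmt.Println(tokenFilePath())
}
```

The kubelet rotates the projected token before it expires, so a consumer should re-read the file periodically rather than cache its contents indefinitely.
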


#### <a name='Identity'></a>Identity

The Azure client ID and tenant ID are injected as environment variables by the AZWI webhook. For the AZWI workflow, however, the client ID and tenant ID are fetched from AzureClusterIdentity first, and the environment variables are used only as a fallback.

The following sample code should be integrated into the CAPZ identity workflow.

```go

// see the next code section for details on this function
cred, err := newWorkloadIdentityCredential(tenantID, clientID, tokenFilePath, wiCredOptions)
if err != nil {
	return nil, errors.Wrap(err, "failed to setup workload identity")
}

client := subscriptions.NewClient()

// setCredentialsForWorkloadIdentity only sets up the
// PublicCloud env URLs
params.AzureClients.setCredentialsForWorkloadIdentity(ctx, params.AzureCluster.Spec.SubscriptionID, params.AzureCluster.Spec.AzureEnvironment)
client.Authorizer = azidext.NewTokenCredentialAdapter(cred, []string{"https://management.azure.com//.default"})
params.AzureClients.Authorizer = client.Authorizer

```

**NOTE:**
`azidext.NewTokenCredentialAdapter` adapts an `azcore.TokenCredential` type to an `autorest.Authorizer` type, so that the resulting authorizer can be plugged into the existing code workflow.

In addition, a Go file, e.g. `workload_identity.go`, should be added to the `identity` package to hold the AZWI functionality.

```go
package identity

import (
	"context"
	"os"
	"time"

	"github.com/Azure/azure-sdk-for-go/sdk/azcore"
	"github.com/Azure/azure-sdk-for-go/sdk/azcore/policy"
	"github.com/Azure/azure-sdk-for-go/sdk/azidentity"
)

// workloadIdentityCredential exchanges the projected service account
// token for an AAD access token.
type workloadIdentityCredential struct {
	assertion string // cached content of the token file
	file      string // path of the projected service account token
	cred      *azidentity.ClientAssertionCredential
	lastRead  time.Time // last time the token file was read
}

type workloadIdentityCredentialOptions struct {
	azcore.ClientOptions
}

func newWorkloadIdentityCredential(tenantID, clientID, file string, options *workloadIdentityCredentialOptions) (*workloadIdentityCredential, error) {
	// Guard against nil options so the dereference below cannot panic.
	if options == nil {
		options = &workloadIdentityCredentialOptions{}
	}
	w := &workloadIdentityCredential{file: file}
	cred, err := azidentity.NewClientAssertionCredential(tenantID, clientID, w.getAssertion, &azidentity.ClientAssertionCredentialOptions{ClientOptions: options.ClientOptions})
	if err != nil {
		return nil, err
	}
	w.cred = cred
	return w, nil
}

// GetToken implements azcore.TokenCredential.
func (w *workloadIdentityCredential) GetToken(ctx context.Context, opts policy.TokenRequestOptions) (azcore.AccessToken, error) {
	return w.cred.GetToken(ctx, opts)
}

// getAssertion returns the service account token, re-reading the file
// when the cached copy is older than five minutes.
func (w *workloadIdentityCredential) getAssertion(context.Context) (string, error) {
	if now := time.Now(); w.lastRead.Add(5 * time.Minute).Before(now) {
		content, err := os.ReadFile(w.file)
		if err != nil {
			return "", err
		}
		w.assertion = string(content)
		w.lastRead = now
	}
	return w.assertion, nil
}

```

### <a name='OpenQuestions'></a>Open Questions

#### <a name='Howtomultitenancy'></a>1. How to achieve multi-tenancy?

The identity is tied to the client ID, which can be supplied via AzureClusterIdentity.

#### <a name='Howtodistributekeys'></a>2. How to distribute key pair to management cluster?

Keys can be distributed by using the CABPK feature: create a secret named `<cluster-name>-sa` that contains the key pair. More details on it [here](https://cluster-api.sigs.k8s.io/tasks/certs/using-custom-certificates.html).

#### <a name='UserExperience'></a>3. User Experience

Although AZWI has many advantages over AAD pod identity, setting it up involves a couple of manual steps for self-managed clusters, which can impact the user experience. For managed clusters, these configuration steps are not required.

### <a name='MigrationPlan'></a>Migration Plan

Management clusters using AAD pod identity should have a seamless, well-documented migration process.

To migrate an existing cluster to AZWI, the following steps should be taken.

**Migrating Greenfield Clusters**
- The steps in this section apply to greenfield cluster creation that uses CAPI pivoting to adopt workload identity. In this case the key pairs are already distributed to the control plane nodes.

- Create a `UserAssignedIdentity` and create a new machine template that uses the user-assigned identity. This step will become optional once cloud provider azure supports workload identity.

- Patch the control plane to include the following flag on the API server.
  - `service-account-issuer: <value-is-service-account-issuer-url>`

- Perform a node rollout to propagate the changes by patching the KCP and MachineDeployment objects to reference the new AzureMachineTemplate created in the previous step.

- Create a new `AzureClusterIdentity` and specify the client ID and tenant ID, with `type` set to `WorkloadIdentity`.

- Patch AzureCluster to use the new `AzureClusterIdentity`.

**Migrating Brownfield Clusters**

- The steps in this section apply to existing clusters.

- Generate a key pair or use the key pair already present in `/etc/kubernetes/pki`.

- Perform the following prerequisites, as discussed earlier for the bootstrap cluster in the [Proposal](#proposal) section:
  - Generate and upload the JWKS and Discovery Document.
  - Install the AZWI mutating admission webhook controller.
  - Establish a federated credential for the identity and `<Issuer, service account namespace, service account name>`.

- If you are not using the existing key pair, perform a service account key rotation using this guide:
  https://azure.github.io/azure-workload-identity/docs/topics/self-managed-clusters/service-account-key-rotation.html

- If you are using the existing key pair, i.e. you configure workload identity using the keys present in `/etc/kubernetes/pki`, rotation is not required.

- Create a `UserAssignedIdentity` and create a new machine template that uses the user-assigned identity. This step will become optional once cloud provider azure supports workload identity.

- This step is not required if you are using the existing key pair. Patch the control plane to include the following flags on the API server.
  - `service-account-key-file: <public-key-path-on-controlplane-node>`
  - `service-account-signing-key-file: <private-key-path-on-controlplane-node>`
  - **Note:** The key pair is present in the `/etc/kubernetes/pki` directory if using the `<cluster-name>-sa` key pair.

- This step is not required if you are using the existing key pair. Patch the control plane to include the following flag on the kube-controller-manager.
  - `service-account-private-key-file: <private-key-path-on-controlplane-node>`

- Patch the control plane to include the following flag on the API server.
  - `service-account-issuer: <value-is-service-account-issuer-url>`

- Perform a node rollout to propagate the changes by patching the KCP and MachineDeployment objects to reference the new AzureMachineTemplate created in the previous step.

- Upgrade to the CAPZ version that supports AZWI.

- Create a new `AzureClusterIdentity` and specify the client ID and tenant ID, with `type` set to `WorkloadIdentity`.

- Patch AzureCluster to use the new `AzureClusterIdentity`.

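
The "Generate and upload the JWKS and Discovery Document" step above can be sketched as follows. This is a minimal illustration, not the tooling the proposal prescribes: the field set mirrors the discovery document shown in the Azure Workload Identity guide for self-managed clusters and should be checked against that guide, and the `discoveryDocument` type, the `newDiscoveryDocument` helper, and the example storage URL are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// discoveryDocument holds the OIDC metadata that AAD fetches from
// <issuer>/.well-known/openid-configuration.
type discoveryDocument struct {
	Issuer        string   `json:"issuer"`
	JWKSURI       string   `json:"jwks_uri"`
	ResponseTypes []string `json:"response_types_supported"`
	SubjectTypes  []string `json:"subject_types_supported"`
	SigningAlgs   []string `json:"id_token_signing_alg_values_supported"`
}

// newDiscoveryDocument (hypothetical helper) derives the document from the
// issuer URL. The issuer must match the value of the API server's
// service-account-issuer flag exactly.
func newDiscoveryDocument(issuer string) discoveryDocument {
	base := strings.TrimSuffix(issuer, "/")
	return discoveryDocument{
		Issuer:        issuer,
		JWKSURI:       base + "/openid/v1/jwks",
		ResponseTypes: []string{"id_token"},
		SubjectTypes:  []string{"public"},
		SigningAlgs:   []string{"RS256"},
	}
}

func main() {
	// Example issuer URL only; substitute the real storage account and container.
	doc := newDiscoveryDocument("https://example.blob.core.windows.net/oidc/")
	out, err := json.MarshalIndent(doc, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The printed JSON is what would be uploaded to the storage container, next to the JWKS document served at the `jwks_uri` location.
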

**Note:** Migration to AZWI is a tedious manual process and carries a risk of human error. It would be better performed through automation.

### <a name='TestPlan'></a>Test Plan

* Unit tests to validate the new workload identity functions and helper functions.
* Using AZWI for the existing e2e tests for create, upgrade, scale down/up, and delete.

## <a name='ImplementationHistory'></a>Implementation History

- Enabling workload identity feature in CAPZ
  Ref Issue: https://github.com/kubernetes-sigs/cluster-api-provider-azure/issues/3588
  Ref PR: https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/3583

- [ToDo] Integrating with cloud provider azure to use workload identity.
  Ref Issue: https://github.com/kubernetes-sigs/cluster-api-provider-azure/issues/3589

- [ToDo] Moving all the CI jobs to be using workload identity.
  Ref Issue: https://github.com/kubernetes-sigs/cluster-api-provider-azure/issues/3590