---
title: Simulate GCP faults
description: Simulate GCP faults
sidebar_position: 11
sidebar_label: Simulate GCP faults
---

# Simulate GCP faults

By creating a GCPChaos experiment, you can simulate fault scenarios on a specified GCP instance. Currently, GCPChaos supports the following fault types:

* Node Stop: stops the specified GCP instance.
* Node Reset: reboots the specified GCP instance.
* Detach Volume: detaches the storage volume from the specified instance.

## Before you start

* By default, the GCP authentication information for local code has been imported. If you have not imported the authentication information, follow the steps in [Prerequisite](./prerequisite.md#check-your-permission).

* To connect to the GCP cluster easily, you can create a Kubernetes Secret file in advance to store the authentication information. A sample `Secret` file is as follows:

    ```yaml
    apiVersion: v1
    kind: Secret
    metadata:
      name: cloud-key-secret-gcp
      namespace: default
    type: Opaque
    stringData:
      service_account: your-gcp-service-account-base64-encode
    ```

    * `name` is the name of the Kubernetes Secret object.
    * `namespace` is the namespace of the Kubernetes Secret object.
    * `service_account` stores the service account key of your GCP cluster. Remember to Base64-encode your GCP service account key. To learn more about service account keys, see [Creating and managing service account keys](https://cloud.google.com/iam/docs/keys-create-delete).

## Simulate fault injections by kbcli

### Stop

Chaos Mesh injects the `node-stop` fault into the specified GCP instance so that the instance is unavailable for 3 minutes.

```bash
kbcli fault node stop [node1] [node2] -c=gcp --region=us-central1-c --duration=3m
```

Running the command creates two resources: the Secret `cloud-key-secret-gcp` and the GCPChaos `node-chaos-w98j5`. Run `kubectl describe gcpchaos node-chaos-w98j5` to verify whether the `node-stop` fault was injected successfully.

:::caution

When you change the cluster permissions, update the key, or switch the cluster context, you must delete `cloud-key-secret-gcp` first. The next `node-stop` injection then creates a new `cloud-key-secret-gcp` from the new key.

:::

### Restart

Chaos Mesh injects a `node-reset` fault into the specified GCP instance so that the instance is restarted.

```bash
kbcli fault node restart [node1] [node2] -c=gcp --region=us-central1-c
```

### Detach volume

Chaos Mesh injects a `detach-volume` fault into the specified GCP instance so that the specified storage volume is detached from the instance within 3 minutes.

```bash
kbcli fault node detach-volume [node1] -c=gcp --region=us-central1-c --device-name=/dev/sdb
```

## Simulate fault injections by YAML file

### GCP-stop example

1. Write the experiment configuration to the `gcp-stop.yaml` file.

    In the following example, Chaos Mesh injects the `node-stop` fault into the specified GCP instance so that the instance is unavailable for the experiment duration (30 seconds here).

    ```yaml
    apiVersion: chaos-mesh.org/v1alpha1
    kind: GCPChaos
    metadata:
      creationTimestamp: null
      generateName: node-chaos-
      namespace: default
    spec:
      action: node-stop
      duration: 30s
      instance: gke-yjtest-default-pool-c2ee710b-fs5q
      project: apecloud-platform-engineering
      secretName: cloud-key-secret-gcp
      zone: us-central1-c
    ```

2. Run `kubectl` to start the experiment. The manifest uses `generateName` instead of `name`, which `kubectl apply` does not support, so use `kubectl create`.

    ```bash
    kubectl create -f ./gcp-stop.yaml
    ```

### GCP-restart example

1. Write the experiment configuration to the `gcp-restart.yaml` file.

    In the following example, Chaos Mesh injects a `node-reset` fault into the specified GCP instance so that the instance is restarted.

    ```yaml
    apiVersion: chaos-mesh.org/v1alpha1
    kind: GCPChaos
    metadata:
      creationTimestamp: null
      generateName: node-chaos-
      namespace: default
    spec:
      action: node-reset
      duration: 30s
      instance: gke-yjtest-default-pool-c2ee710b-fs5q
      project: apecloud-platform-engineering
      secretName: cloud-key-secret-gcp
      zone: us-central1-c
    ```

2. Run `kubectl` to start the experiment.

    ```bash
    kubectl create -f ./gcp-restart.yaml
    ```

### GCP-detach-volume example

1. Write the experiment configuration to the `gcp-detach-volume.yaml` file.

    In the following example, Chaos Mesh injects a `disk-loss` fault into the specified GCP instance so that the specified storage volume is detached from the instance for the experiment duration (30 seconds here).

    ```yaml
    apiVersion: chaos-mesh.org/v1alpha1
    kind: GCPChaos
    metadata:
      creationTimestamp: null
      generateName: node-chaos-
      namespace: default
    spec:
      action: disk-loss
      deviceNames:
      - /dev/sdb
      duration: 30s
      instance: gke-yjtest-default-pool-c2ee710b-fs5q
      project: apecloud-platform-engineering
      secretName: cloud-key-secret-gcp
      zone: us-central1-c
    ```

2. Run `kubectl` to start the experiment.

    ```bash
    kubectl create -f ./gcp-detach-volume.yaml
    ```

### Field description

The following table describes the fields in the YAML configuration file.

| Parameter | Type | Description | Default value | Required |
| :--- | :--- | :--- | :--- | :--- |
| action | string | It indicates the specific type of fault. The available fault types include `node-stop`, `node-reset`, and `disk-loss`. | `node-stop` | Yes |
| mode | string | It indicates the mode of the experiment. The mode options include `one` (selecting a Pod at random), `all` (selecting all eligible Pods), `fixed` (selecting a specified number of eligible Pods), `fixed-percent` (selecting a specified percentage of the eligible Pods), and `random-max-percent` (selecting the maximum percentage of the eligible Pods). | None | Yes |
| value | string | It provides parameters for the `mode` configuration, depending on `mode`. For example, when `mode` is set to `fixed-percent`, `value` specifies the percentage of Pods. | None | No |
| secretName | string | It indicates the name of the Kubernetes Secret that stores the GCP authentication information. | None | No |
| project | string | It indicates the ID of the GCP project, for example, `real-testing-project`. | None | Yes |
| zone | string | It indicates the zone of the GCP instance. | None | Yes |
| instance | string | It indicates the name of the GCP instance. | None | Yes |
| deviceNames | []string | It specifies the machine disk ID. This field is required when `action` is `disk-loss`. | None | No |
| duration | string | It indicates the duration of the experiment. | None | Yes |
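
The `secretName` field points at the Secret from "Before you start", whose `service_account` value is expected to be the Base64-encoded service account key. The following is a minimal sketch of producing that value. The helper name `encode_key` and the path `./gcp-key.json` are assumptions made for illustration and are not part of kbcli or Chaos Mesh.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the Base64 encoding of a file on a single line.
encode_key() {
  # Some base64 implementations wrap long output; tr removes the line breaks
  # so the result can be pasted as a single YAML value.
  base64 < "$1" | tr -d '\n'
}

# Example usage (the path is an assumption; replace it with your own key file):
# encode_key ./gcp-key.json
```

Paste the printed string as the `service_account` value in the Secret manifest.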