---
title: Simulate GCP faults
description: Simulate GCP faults
sidebar_position: 11
sidebar_label: Simulate GCP faults
---

# Simulate GCP faults

By creating a GCPChaos experiment, you can simulate fault scenarios on a specified GCP instance. Currently, GCPChaos supports the following fault types:

* Node Stop: stops the specified GCP instance.
* Node Reset: reboots the specified GCP instance.
* Detach Volume: detaches the storage volume from the specified instance.

## Before you start

* By default, the GCP authentication information for local code has been imported. If you have not imported the authentication information, follow the steps in [Prerequisite](./prerequisite.md#check-your-permission).

* To connect to the GCP cluster easily, you can create a Kubernetes Secret in advance to store the authentication information. A sample `Secret` file is as follows:

  ```yaml
  apiVersion: v1
  kind: Secret
  metadata:
    name: cloud-key-secret-gcp
    namespace: default
  type: Opaque
  stringData:
    service_account: your-gcp-service-account-base64-encode
  ```

  * `name` is the name of the Kubernetes Secret object.
  * `namespace` is the namespace of the Kubernetes Secret object.
  * `service_account` stores the service account key of your GCP cluster. Base64-encode the service account key before you store it here. To learn more about service account keys, see [Creating and managing service account keys](https://cloud.google.com/iam/docs/keys-create-delete).
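
The Base64 step described above can be sketched as a small shell helper. This is only an illustration: `render_gcp_secret` is a hypothetical function name, not part of kbcli or Chaos Mesh, and the key file path in the usage comment is an assumption.

```shell
#!/bin/sh
# render_gcp_secret is a hypothetical helper, not part of kbcli or Chaos Mesh.
# It prints a Secret manifest whose service_account value is the
# Base64-encoded content of the given service account key file.
render_gcp_secret() {
  key_file="$1"
  # Encode the key and strip line wraps so the value stays on one YAML line.
  encoded=$(base64 < "$key_file" | tr -d '\n')
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: cloud-key-secret-gcp
  namespace: default
type: Opaque
stringData:
  service_account: ${encoded}
EOF
}

# Usage (assumes ./key.json is a key file downloaded from GCP):
#   render_gcp_secret ./key.json | kubectl apply -f -
```

Generating the manifest on the fly keeps the encoded key out of files that might be committed by accident.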

## Simulate fault injections by kbcli

### Stop

Chaos Mesh injects the `node-stop` fault into the specified GCP instance so that the instance is unavailable for 3 minutes.

```bash
kbcli fault node stop [node1] [node2] -c=gcp --region=us-central1-c --duration=3m
```

This command creates two resources: the Secret `cloud-key-secret-gcp` and a GCPChaos object such as `node-chaos-w98j5`. You can run `kubectl describe gcpchaos node-chaos-w98j5` to verify whether the `node-stop` fault is injected successfully.
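
Because the GCPChaos name carries a generated suffix, it can help to list the experiments before describing one. The following is a minimal sketch that assumes the experiment lives in the `default` namespace; `list_gcp_chaos` and `describe_gcp_chaos` are hypothetical wrapper names around standard `kubectl` subcommands.

```shell
#!/bin/sh
# Hypothetical wrappers; only the kubectl subcommands themselves are standard.

# List all GCPChaos experiments in a namespace (defaults to "default").
list_gcp_chaos() {
  kubectl get gcpchaos -n "${1:-default}"
}

# Show the status and events of one GCPChaos experiment.
describe_gcp_chaos() {
  name="$1"
  namespace="${2:-default}"
  kubectl describe gcpchaos "$name" -n "$namespace"
}

# Usage:
#   list_gcp_chaos
#   describe_gcp_chaos node-chaos-w98j5
```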

:::caution

After you change the cluster permissions, update the key, or switch the cluster context, delete `cloud-key-secret-gcp` so that the next `node-stop` injection creates a new `cloud-key-secret-gcp` based on the new key.

:::
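
Deleting the stale Secret might look like the sketch below. It assumes the Secret sits in the `default` namespace, as in the sample manifest; `reset_gcp_secret` is a hypothetical helper name.

```shell
#!/bin/sh
# reset_gcp_secret is a hypothetical helper, not a kbcli command.
# It removes the cached GCP credentials so the next injection recreates them.
reset_gcp_secret() {
  namespace="${1:-default}"
  # --ignore-not-found keeps the command from failing if the Secret is already gone.
  kubectl delete secret cloud-key-secret-gcp -n "$namespace" --ignore-not-found
}

# Usage:
#   reset_gcp_secret
```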

### Restart

Chaos Mesh injects a `node-reset` fault into the specified GCP instance so that this instance is restarted.

```bash
kbcli fault node restart [node1] [node2] -c=gcp --region=us-central1-c
```

### Detach volume

Chaos Mesh injects a `disk-loss` fault into the specified GCP instance so that the specified storage volume is detached from this instance for the duration of the experiment.

```bash
kbcli fault node detach-volume [node1] -c=gcp --region=us-central1-c --device-name=/dev/sdb
```

## Simulate fault injections by YAML file

### GCP-stop example

1. Write the experiment configuration to the `gcp-node-stop.yaml` file.

   In the following example, Chaos Mesh injects the `node-stop` fault into the specified GCP instance so that the instance is unavailable for 30 seconds.

   ```yaml
   apiVersion: chaos-mesh.org/v1alpha1
   kind: GCPChaos
   metadata:
     creationTimestamp: null
     generateName: node-chaos-
     namespace: default
   spec:
     action: node-stop
     duration: 30s
     instance: gke-yjtest-default-pool-c2ee710b-fs5q
     project: apecloud-platform-engineering
     secretName: cloud-key-secret-gcp
     zone: us-central1-c
   ```

2. Run `kubectl` to start an experiment. Because the manifest uses `generateName`, use `kubectl create`; `kubectl apply` rejects manifests that have no fixed `metadata.name`.

   ```bash
   kubectl create -f ./gcp-node-stop.yaml
   ```

### GCP-restart example

1. Write the experiment configuration to the `gcp-node-restart.yaml` file.

   In the following example, Chaos Mesh injects a `node-reset` fault into the specified GCP instance so that this instance is restarted.

   ```yaml
   apiVersion: chaos-mesh.org/v1alpha1
   kind: GCPChaos
   metadata:
     creationTimestamp: null
     generateName: node-chaos-
     namespace: default
   spec:
     action: node-reset
     duration: 30s
     instance: gke-yjtest-default-pool-c2ee710b-fs5q
     project: apecloud-platform-engineering
     secretName: cloud-key-secret-gcp
     zone: us-central1-c
   ```

2. Run `kubectl` to start an experiment. Because the manifest uses `generateName`, use `kubectl create`; `kubectl apply` rejects manifests that have no fixed `metadata.name`.

   ```bash
   kubectl create -f ./gcp-node-restart.yaml
   ```

### GCP-detach-volume example

1. Write the experiment configuration to the `gcp-detach-volume.yaml` file.

   In the following example, Chaos Mesh injects a `disk-loss` fault into the specified GCP instance so that the specified storage volume is detached from this instance for 30 seconds.

   ```yaml
   apiVersion: chaos-mesh.org/v1alpha1
   kind: GCPChaos
   metadata:
     creationTimestamp: null
     generateName: node-chaos-
     namespace: default
   spec:
     action: disk-loss
     deviceNames:
     - /dev/sdb
     duration: 30s
     instance: gke-yjtest-default-pool-c2ee710b-fs5q
     project: apecloud-platform-engineering
     secretName: cloud-key-secret-gcp
     zone: us-central1-c
   ```

2. Run `kubectl` to start an experiment. Because the manifest uses `generateName`, use `kubectl create`; `kubectl apply` rejects manifests that have no fixed `metadata.name`.

   ```bash
   kubectl create -f ./gcp-detach-volume.yaml
   ```
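
For any of the three manifests above, the generated experiment name is only known after creation. One possible way to capture it for later cleanup is sketched below. It uses `kubectl create` because these manifests rely on `generateName`, which `kubectl apply` does not accept. `start_gcp_chaos`, `stop_gcp_chaos`, and the file name `experiment.yaml` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for starting and removing a GCPChaos experiment.

# Create the experiment and print its resource reference.
# With -o name, kubectl prints <resource>/<generated-name>.
start_gcp_chaos() {
  kubectl create -f "$1" -o name
}

# Delete an experiment by the reference returned from start_gcp_chaos.
stop_gcp_chaos() {
  kubectl delete "$1" -n "${2:-default}"
}

# Usage (experiment.yaml stands for any of the manifests above):
#   exp=$(start_gcp_chaos ./experiment.yaml)
#   stop_gcp_chaos "$exp"
```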

### Field description

The following table shows the fields in the YAML configuration file.

| Parameter | Type | Description | Default value | Required |
| :--- | :--- | :--- | :--- | :--- |
| action | string | It indicates the specific type of fault. The available fault types include `node-stop`, `node-reset`, and `disk-loss`. | `node-stop` | Yes |
| mode | string | It indicates the mode of the experiment. The mode options include `one` (selecting a Pod at random), `all` (selecting all eligible Pods), `fixed` (selecting a specified number of eligible Pods), `fixed-percent` (selecting a specified percentage of the eligible Pods), and `random-max-percent` (selecting the maximum percentage of the eligible Pods). | None | Yes |
| value | string | It provides parameters for the `mode` configuration, depending on `mode`. For example, when `mode` is set to `fixed-percent`, `value` specifies the percentage of Pods. | None | No |
| secretName | string | It indicates the name of the Kubernetes Secret that stores the GCP authentication information. | None | No |
| project | string | It indicates the ID of the GCP project, for example, `real-testing-project`. | None | Yes |
| zone | string | It indicates the zone of the GCP instance. | None | Yes |
| instance | string | It indicates the name of the GCP instance. | None | Yes |
| deviceNames | []string | This field is required when `action` is `disk-loss`. It specifies the machine disk ID. | None | No |
| duration | string | It indicates the duration of the experiment. | None | Yes |
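
The `deviceNames` rule in the table can be checked before an experiment is submitted. The sketch below is a rough, grep-based check and not a YAML parser; it assumes each field sits on its own line, as in the examples above, and `check_disk_loss_fields` is a hypothetical name.

```shell
#!/bin/sh
# check_disk_loss_fields is a hypothetical pre-flight check.
# It returns 1 when a manifest sets action: disk-loss without deviceNames,
# and 0 otherwise.
check_disk_loss_fields() {
  manifest="$1"
  if grep -Eq '^[[:space:]]*action:[[:space:]]*disk-loss[[:space:]]*$' "$manifest" &&
     ! grep -Eq '^[[:space:]]*deviceNames:' "$manifest"; then
    echo "disk-loss requires deviceNames" >&2
    return 1
  fi
  return 0
}

# Usage (experiment.yaml is a placeholder file name):
#   check_disk_loss_fields ./experiment.yaml && kubectl create -f ./experiment.yaml
```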