---
title: Simulate pod faults
description: Simulate pod faults
sidebar_position: 3
sidebar_label: Simulate pod faults
---

# Simulate pod faults

Pod faults support pod failure, pod kill, and container kill.

* Pod failure: injects a fault into a specified Pod to make the Pod unavailable for a period of time.
* Pod kill: kills a specified Pod. To ensure that the Pod restarts successfully, you need to configure a ReplicaSet or a similar mechanism.
* Container kill: kills the specified container in the target Pod.

## Usage restrictions

Chaos Mesh can inject PodChaos into any Pod, whether or not the Pod is bound to a Deployment, StatefulSet, DaemonSet, or another controller. However, injecting PodChaos into an independent Pod can lead to different results because no controller manages the Pod. For example, when you inject `pod-kill` chaos into an independent Pod, Chaos Mesh cannot guarantee that the application recovers from the failure.

## Before you start

* Make sure that the Chaos Mesh Controller Manager is not running on the target Pod.
* If the fault type is `pod-kill`, make sure that a ReplicaSet or a similar mechanism is configured so that the Pod restarts automatically.

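One way to check the second prerequisite is to look at the Pod's owner references: a Pod created by a ReplicaSet, StatefulSet, or similar controller lists its owner there, while an independent Pod lists none. The sketch below assumes `kubectl` access to the cluster; the function name `pod_owner_kinds` is hypothetical, and the Pod name in the example call is the one used elsewhere on this page.

```bash
#!/usr/bin/env bash

# Print the kinds of the controllers that own a Pod (for example "StatefulSet").
# Empty output means the Pod is independent, so nothing recreates it after pod-kill.
# Usage: pod_owner_kinds <pod-name> [namespace]
pod_owner_kinds() {
  local pod="$1"
  local namespace="${2:-default}"
  kubectl get pod "$pod" --namespace "$namespace" \
    -o jsonpath='{.metadata.ownerReferences[*].kind}'
}

# Example call (needs a reachable cluster):
#   pod_owner_kinds mysql-cluster-mysql-0 default
```

If the function prints nothing for a Pod, treat that Pod as independent and avoid `pod-kill` on it unless losing the Pod is acceptable.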
## Simulate fault injections by kbcli

The following flags are common to all types of Pod faults.

📎 Table 1. Pod faults flags description

| Option                  | Description              | Default value | Required |
| :-----------------------| :------------------------| :------------ | :------- |
| Pod name  | Add a Pod name to the command to make this Pod unavailable. For example, <br /> `kbcli fault pod kill mysql-cluster-mysql-0` | Default | No |
| `--namespace` | It specifies the namespace where the chaos experiment is created. | Current namespace | No |
| `--ns-fault` | It specifies a namespace to make all Pods in this namespace unavailable. For example, <br /> `kbcli fault pod kill --ns-fault=kb-system` | Default | No |
| `--node`   | It specifies a node to make all Pods on this node unavailable. For example, <br /> `kbcli fault pod kill --node=minikube-m02` | None | No |
| `--label`  | It specifies a label to make the Pods with this label in the default namespace unavailable. For example, <br /> `kbcli fault pod kill --label=app.kubernetes.io/component=mysql` | None | No |
| `--node-label` | It specifies a node label to make all Pods on the nodes with this label unavailable. For example, <br /> `kbcli fault pod kill --node-label=kubernetes.io/arch=arm64` | None | No |
| `--mode` | It specifies the mode of the experiment. The mode options include `one` (selecting a random Pod), `all` (selecting all eligible Pods), `fixed` (selecting a specified number of eligible Pods), `fixed-percent` (selecting a specified percentage of the eligible Pods), and `random-max-percent` (selecting a random percentage of the eligible Pods, up to a specified maximum). | `all` | No |
| `--value` | It provides the parameter for `--mode`; its meaning depends on the mode. For example, when the mode is `fixed-percent`, `--value` specifies the percentage of Pods. <br /> `kbcli fault pod kill --mode=fixed-percent --value=50` | None | No |
| `--duration` | It specifies how long the experiment lasts. | 10 seconds | No |

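The flags in Table 1 can be combined in one command. As a sketch, the wrapper below targets a fixed number of Pods that carry a given label and passes `--dry-run`, which this page describes as the way to view the generated YAML. The function name `preview_pod_kill` is hypothetical; every flag it passes is taken from this page.

```bash
#!/usr/bin/env bash

# Preview a pod-kill fault for a fixed number of Pods that carry a label.
# Usage: preview_pod_kill <label> <count>
preview_pod_kill() {
  local label="$1"
  local count="$2"
  kbcli fault pod kill \
    --label="$label" \
    --mode=fixed \
    --value="$count" \
    --dry-run
}

# Example call (needs kbcli and a reachable cluster):
#   preview_pod_kill app.kubernetes.io/component=mysql 2
```

Reviewing the printed YAML first makes it easier to confirm that the selector and mode match what you intend before running the command without `--dry-run`.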
### Pod kill

Run the command below to inject `pod-kill` into all Pods in the default namespace and make the Pods unavailable for 10 seconds, the default duration.

```bash
kbcli fault pod kill
```

### Pod failure

Run the command below to inject `pod-failure` into all Pods in the default namespace and make the Pods unavailable for 10 seconds.

```bash
kbcli fault pod failure --duration=10s
```

### Container kill

Run the command below to inject `container-kill` once into the `mysql` container of all Pods in the default namespace and make the container unavailable for 10 seconds. The `--container` flag is required.

```bash
kbcli fault pod kill-container --container=mysql
```

You can also specify multiple containers by repeating `--container`. For example, run the command below to kill the `mysql` and `config-manager` containers in the default namespace.

```bash
kbcli fault pod kill-container --container=mysql --container=config-manager
```

## Simulate fault injections by YAML

This section provides example YAML configuration files. You can view the YAML generated for any of the kbcli commands above by appending `--dry-run` to the command. For more details, refer to the [Chaos Mesh official docs](https://chaos-mesh.org/docs/next/simulate-pod-chaos-on-kubernetes/#create-experiments-using-yaml-configuration-files).

### Pod-kill example

1. Write the experiment configuration to the `pod-kill.yaml` file.

    In the following example, Chaos Mesh injects `pod-kill` into 50% of the Pods labeled `app.kubernetes.io/component: mysql` in the `default` namespace and kills each selected Pod once.

    ```yaml
    apiVersion: chaos-mesh.org/v1alpha1
    kind: PodChaos
    metadata:
      creationTimestamp: null
      generateName: pod-chaos-
      namespace: default
    spec:
      action: pod-kill
      duration: 10s
      mode: fixed-percent
      selector:
        namespaces:
        - default
        labelSelectors:
          'app.kubernetes.io/component': 'mysql'
      value: "50"
    ```

2. Run `kubectl` to start an experiment.

   The manifest sets `generateName` rather than `name`, so create it with `kubectl create`; `kubectl apply` requires a fixed name.

   ```bash
   kubectl create -f ./pod-kill.yaml
   ```

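Before starting the experiment, it can help to list the Pods that the `selector` in `pod-kill.yaml` would match. The sketch below uses the same namespace and label as the example as its defaults; `list_target_pods` is a hypothetical helper name, and `kubectl` access to the cluster is assumed.

```bash
#!/usr/bin/env bash

# List the Pods matched by a namespace and label selector.
# Defaults mirror the selector in the pod-kill example.
# Usage: list_target_pods [namespace] [label]
list_target_pods() {
  local namespace="${1:-default}"
  local label="${2:-app.kubernetes.io/component=mysql}"
  kubectl get pods --namespace "$namespace" --selector "$label"
}

# Example call (needs a reachable cluster):
#   list_target_pods default app.kubernetes.io/component=mysql
```

With `mode: fixed-percent` and `value: "50"`, roughly half of the listed Pods are selected for the fault.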
### Pod-failure example

1. Write the experiment configuration to the `pod-failure.yaml` file.

    In the following example, Chaos Mesh injects `pod-failure` into 50% of the Pods labeled `app.kubernetes.io/component: mysql` in the `default` namespace and makes the selected Pods unavailable for 10 seconds.

    ```yaml
    apiVersion: chaos-mesh.org/v1alpha1
    kind: PodChaos
    metadata:
      creationTimestamp: null
      generateName: pod-chaos-
      namespace: default
    spec:
      action: pod-failure
      duration: 10s
      mode: fixed-percent
      selector:
        namespaces:
        - default
        labelSelectors:
          'app.kubernetes.io/component': 'mysql'
      value: "50"
    ```

2. Run `kubectl` to start an experiment.

   The manifest sets `generateName` rather than `name`, so create it with `kubectl create`; `kubectl apply` requires a fixed name.

   ```bash
   kubectl create -f ./pod-failure.yaml
   ```

### Container-kill example

1. Write the experiment configuration to the `container-kill.yaml` file.

    In the following example, Chaos Mesh injects `container-kill` into 50% of the Pods labeled `app.kubernetes.io/component: mysql` in the `default` namespace and kills the `mysql` container in each selected Pod once.

    ```yaml
    apiVersion: chaos-mesh.org/v1alpha1
    kind: PodChaos
    metadata:
      creationTimestamp: null
      generateName: pod-chaos-
      namespace: default
    spec:
      action: container-kill
      containerNames:
      - mysql
      duration: 10s
      mode: fixed-percent
      selector:
        namespaces:
        - default
        labelSelectors:
          'app.kubernetes.io/component': 'mysql'
      value: "50"
    ```

2. Run `kubectl` to start an experiment.

   The manifest sets `generateName` rather than `name`, so create it with `kubectl create`; `kubectl apply` requires a fixed name.

   ```bash
   kubectl create -f ./container-kill.yaml
   ```

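Because the example manifests set `generateName`, each experiment receives a generated name that starts with `pod-chaos-`. The sketch below lists the PodChaos objects in a namespace and deletes one by name, which is the usual way to stop a Chaos Mesh experiment early. The helper names are hypothetical, `kubectl` access is assumed, and the Chaos Mesh CRDs must be installed for the `podchaos` resource to exist.

```bash
#!/usr/bin/env bash

# List PodChaos experiments in a namespace to find the generated name.
# Usage: list_pod_chaos [namespace]
list_pod_chaos() {
  kubectl get podchaos --namespace "${1:-default}"
}

# Delete one PodChaos experiment by its generated name.
# Usage: delete_pod_chaos <name> [namespace]
delete_pod_chaos() {
  kubectl delete podchaos "$1" --namespace "${2:-default}"
}

# Example calls (need a reachable cluster; the name comes from the list output):
#   list_pod_chaos default
#   delete_pod_chaos <generated-name> default
```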
### Field description

This table describes the fields in the YAML file.

| Parameter | Type  | Description | Default value | Required | Example |
| :---      | :---  | :---        | :---          | :---     | :---    |
| action | string | It specifies the fault type to inject. The supported types include `pod-failure`, `pod-kill`, and `container-kill`. | None | Yes | `pod-kill` |
| duration | string | It specifies the duration of the experiment. | None | Yes | `10s` |
| mode | string | It specifies the mode of the experiment. The mode options include `one` (selecting a random Pod), `all` (selecting all eligible Pods), `fixed` (selecting a specified number of eligible Pods), `fixed-percent` (selecting a specified percentage of the eligible Pods), and `random-max-percent` (selecting a random percentage of the eligible Pods, up to a specified maximum). | None | Yes | `fixed-percent` |
| value | string | It provides the parameter for `mode`; its meaning depends on the mode. For example, when `mode` is set to `fixed-percent`, `value` specifies the percentage of Pods. | None | No | `50` |
| selector | struct | It specifies the target Pods, for example by namespace, node, or labels. | None | Yes. <br /> If not specified, the system kills all Pods in the default namespace. |  |
| containerNames | []string | Required when `action` is `container-kill`. It specifies the names of the target containers for fault injection. | None | No | `['mysql']` |
| gracePeriod | int64 | Used when `action` is `pod-kill`. It specifies the grace period, in seconds, before the Pod is deleted. | 0 | No | `0` |

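
The interaction between `mode` and `value` is easiest to see with numbers. The sketch below estimates how many Pods `fixed-percent` targets, assuming the result is rounded down to a whole Pod; that rounding rule is an assumption to verify against the Chaos Mesh version you run, and `fixed_percent_count` is a hypothetical helper, not part of kbcli or Chaos Mesh.

```bash
#!/usr/bin/env bash

# Estimate the number of Pods targeted by mode=fixed-percent.
# Assumption: the count is rounded down to a whole Pod.
# Usage: fixed_percent_count <eligible-pods> <percent>
fixed_percent_count() {
  local eligible="$1"
  local percent="$2"
  # Bash integer division truncates, which rounds down for non-negative values.
  echo $(( eligible * percent / 100 ))
}

fixed_percent_count 4 50   # prints 2
fixed_percent_count 5 50   # prints 2 under the round-down assumption
```

Under this assumption, a small percentage applied to a small set of Pods can select none of them, so `fixed` with an explicit count may be the safer choice for clusters with only a few replicas.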