github.com/NVIDIA/aistore@v1.3.23-0.20240517131212-7df6609be51d/docs/troubleshooting.md

github.com/NVIDIA/aistore@v1.3.23-0.20240517131212-7df6609be51d/docs/troubleshooting.md (about)

     1  ---
     2  layout: post
     3  title: TROUBLESHOOTING
     4  permalink: /docs/troubleshooting
     5  redirect_from:
     6   - /troubleshooting.md/
     7   - /docs/troubleshooting.md/
     8  ---
     9  
    10  ## Introduction
    11  
    12  This text is intended to help troubleshooting AIStore operation. Easy-to-use TAB-completion based [CLI](/docs/cli.md) is one of the first tools to consider, and of the first commands would be the one that shows the state of the cluster:
    13  
    14  ```console
    15  $ ais show cluster <TAB-TAB>
    16  ...  proxy         smap          target
    17  p[202446p8082] ...
    18  t[147665t8084] ...
    19  ```
    20  
    21  Meaning, you can run `ais show cluster` (the short version), and you can also run it with any of the additional completions listed above.
    22  
    23  For example:
    24  
    25  ```console
    26  $ ais show cluster
    27  PROXY            MEM USED %    MEM AVAIL       UPTIME
    28  202446p8082      0.06%         31.28GiB        19m
    29  279128p8080      0.07%         31.28GiB        19m
    30  928059p8081[P]   0.08%         31.28GiB        19m
    31  
    32  TARGET           MEM USED %    MEM AVAIL       CAP USED %      CAP AVAIL       CPU USED %    REBALANCE      UPTIME
    33  147665t8084      0.07%         31.28GiB        14%             2.511TiB        0.00%         -              19m
    34  165274t8087      0.07%         31.28GiB        14%             2.511TiB        0.00%         -              19m
    35  198815t8088      0.08%         31.28GiB        14%             2.511TiB        0.00%         -              19m
    36  247389t8085      0.07%         31.28GiB        14%             2.511TiB        0.00%         -              19m
    37  426988t8086      0.07%         31.28GiB        14%             2.511TiB        0.00%         -              19m
    38  968103t8083      0.07%         31.28GiB        14%             2.511TiB        0.00%         -              19m
    39  ```
    40  
    41  Since at any given time there's only one primary gateway, you may also find it useful to be able to designate a different one administratively. This is easy - example:
    42  
    43  ```console
    44  $ ais cluster set-primary <TAB-TAB>
    45  p[202446p8082]  p[279128p8080]  p[928059p8081]
    46  $ ais cluster set-primary p[279128p8080]
    47  279128p8080 has been set as a new primary proxy
    48  
    49  $ ais show cluster
    50  PROXY            MEM USED %    MEM AVAIL       UPTIME
    51  202446p8082      0.08%         31.28GiB        46m
    52  279128p8080[P]   0.08%         31.28GiB        46m
    53  928059p8081      0.08%         31.28GiB        46m10s
    54  ...
    55  ```
    56  
    57  ## Cluster Integrity Errors
    58  
    59  The one category of errors that deserves special consideration is "cluster integrity". This category includes several numbered errors that may look as follows:
    60  
    61  ```
    62  cluster integrity error `cie#50`:
    63  Smaps have different origins: Smap v9[...111, t=9, p=3] vs p[232268p8080]: Smap v13[...281, t=4, p=4]
    64  ```
    65  
    66  Above is an example of an actual error - `cie#50` in this case. Generally though, a cluster integrity violation is detected when a node that previously was (and possibly currently remains) a member of a cluster `A` tries to join a different cluster `B`. "Mixing up" nodes (in particular, storage targets) between different AIStore clusters triggers automated rebalancing with further complications that may be very difficult to sort out.
    67  
    68  In many cases, the entirety of a troubleshooting step boils down to cleaning up the node's (obsolete) metadata - in particular, a copy of locally stored cluster map (aka Smap) and/or a copy of BMD. However, any type of metadata cleanup must be done with great caution after a careful review. To this end, the table below enumerates `cie` errors and provides some relevant context.
    69  
    70  | Error | When | Description |
    71  |--- | --- | --- |
    72  | `cie#10` | When a primary proxy (gateway) starts up, it will use its own local (copy of) Smap to query other nodes for cluster-wide metadata. | The error indicates that either one of the nodes, or the primary itself, belongs (or did belong) to a different cluster. |
    73  | `cie#30` | Same as above. | There are at least 2 targets in the cluster that "disagree" between themselves wrt their respective UUIDs. In other words, these two targets cannot be members of a single cluster. |
    74  | `cie#40` | At node startup, or (secondly) when bucket metadata (BMD) changes at runtime. | In both cases, the node's local instance of bucket metadata conflicts with the cluster's version. |
    75  | `cie#50` | Non-primary proxy or storage target: when receiving an updated cluster map that conflicts with the local copy. Primary proxy: when a joining node's Smap does not pass the validation. | In both cases, the node is not permitted to join (or is removed from) the cluster. |
    76  | `cie#60` | When a primary proxy (gateway) is starting up, it uses its own local Smap to query other nodes for cluster-wide metadata. | The error is specific to bucket metadata and is triggered when there are two or more versions that are mutually incompatible. |
    77  | `cie#70` | Same as above. | Same as above, except that there's a simple majority of nodes that have one of the BMD versions. |
    78  | `cie#80` | Joining existing cluster | When node tries to join a cluster we do compare the node's local copy of the cluster map with the existing one. The error, effectively, indicates that according to the node's own cluster map it must be a member of a different cluster. |
    79  | `cie#90` | Primary synchronizing cluster-wide metadata | In a AIS given cluster, the primary gateway is responsible for distributing cluster map, bucket metadata, and a few other critical pieces of cluster-wide metadata, so that all nodes have identical replicas. This process is called `metasync`. The error indicates that a split-brain like condition has been detected during `metasync`. |
    80  
    81  ## Storage Integrity Error
    82  
    83  Another category of errors is the "Storage Integrity Error (sie)," associated with mountpaths attached to the storage targets. A typical `sie` error may look as follows:
    84  
    85  ```
    86  storage integrity error: sie#30 ... :
    87  VMD is different ("/tmp/ais/7/3/.ais.vmd"): &{...} vs &{...}
    88  ```
    89  
    90  Each AIStore target maintains a list of configured mountpaths ([more here](overview.md)) and their properties. This metadata is maintained across mountpath-changing events (such as disk faults and new attachments). It is also persisted and replicated across available mountpaths.
    91  
    92  In addition, AIS target also stores its unique Node ID (aka `DaemonID`). This ID gets generated when the server joins an AIS cluster the first time and never changes during the server’s lifetime. The Node ID is recorded into each mountpath.
    93  
    94  A troubleshooting step to deal with `sie` is cleaning up persisted metadata (i.e. removing the `.ais.vmd` file), recorded Node ID on mountpaths (erasing `user.ais.daemon_id` xattr), and/or updating the node deployment config. The table below enumerates `sie` errors and provides some relevant context.
    95  
    96  **WARNING:** Caution while cleaning up the metadata on mountpaths. It could lead to loss or corruption of data.
    97  
    98  | Error | When | Description |
    99  |--- | --- | --- |
   100  | `sie#10` | When mountpaths in deployment config have different Node ID recorded on them | The error indicates that either one of the mountpaths belongs (or did belong) to a different target |
   101  | `sie#20` |  The target Node ID doesn't match persisted metadata | The Node ID assigned to the target node doesn't match Node ID recorded on a mountpath. Implies the mountpath has stale metadata and is/was part of a different target |
   102  | `sie#30` | The persisted metadata on the mountpaths doesn't match | There are at least two mountpaths disagree on the metadata |
   103  | `sie#40` | A mountpath has a corrupted metadata | At least one of the targets has a corrupted metadata file |
   104  | `sie#50` | A mountpath is present in the config file but missing in persisted metadata | The target has valid metadata persisted on mountpaths, but the mountpaths present in the metadata don't match those provided in the config |