---
layout: post
title: REBALANCE
permalink: /docs/rebalance
redirect_from:
 - /rebalance.md/
 - /docs/rebalance.md/
---

## Table of Contents

- [Global Rebalance](#global-rebalance)
- [CLI: usage examples](#cli-usage-examples)
- [Automated Resilvering](#automated-resilvering)

## Global Rebalance

To maintain [consistent distribution of user data at all times](https://en.wikipedia.org/wiki/Consistent_hashing#Examples_of_use), AIStore rebalances itself based on *new* versions of its [cluster map](/cluster/map.go).

More exactly:

* When storage targets join or leave the cluster, the current *primary* (leader) proxy transactionally creates the *next* updated version of the cluster map;
* [Synchronizes](/ais/metasync.go) the new map across the entire cluster so that each and every node gets the new version;
* Which further results in each AIS target starting to traverse its locally stored content, recomputing object locations,
* And sending at least some of the objects to their respective *new* locations
* Whereby object migration is carried out via an intra-cluster optimized [communication mechanism](/transport/README.md) and over a separate [physical or logical network](/cmn/network.go), if provisioned.

Thus, cluster-wide rebalancing is fully decentralized. When a single server joins (or leaves) a cluster of N servers, approximately 1/Nth of the entire namespace will get rebalanced via direct target-to-target transfers.

Further, cluster-wide rebalancing does not require any downtime.
Incoming GET requests for the objects that haven't yet migrated (or are being moved) are handled internally via the mechanism that we call "get-from-neighbor".
The (rebalancing) target that, according to the new cluster map, must have the object but doesn't will locate its "neighbor", get the object, and satisfy the original GET request transparently to the user.

Similar to all other AIS modules and sub-systems, global rebalance is controlled and monitored via the documented [RESTful API](http_api.md).
It might be easier and faster, though, to use [AIS CLI](/docs/cli.md) - see the next section.

## CLI: usage examples

1. Disable automated global rebalance (for instance, to perform maintenance or upgrade operations) and show the resulting config in JSON on a randomly selected target:

```console
$ ais config cluster rebalance.enabled=false
config successfully updated

$ ais show config 361179t8088 --json | grep -A 6 rebalance

    "rebalance": {
        "dest_retry_time": "2m",
        "quiescent": "20s",
        "compression": "never",
        "multiplier": 4,
        "enabled": false
    },

```

2. Re-enable automated global rebalance and show the resulting config section as a simple `name/value` list:

```console
$ ais config cluster rebalance.enabled=true
config successfully updated

$ ais show config <TAB-TAB>
125210p8082   181883t8089   249630t8087   361179t8088   477343p8081   675515t8084   70681p8080   782227p8083   840083t8086   911875t8085

$ ais show config 840083t8086 rebalance
PROPERTY                     VALUE   DEFAULT
rebalance.compression        never   -
rebalance.dest_retry_time    2m      -
rebalance.enabled            true    -
rebalance.multiplier         2       -
rebalance.quiescent          10s     -
```

3. Monitoring: notice the per-target statistics and the `EndTime` column

```console
$ ais show rebalance
DaemonID      RebID    ObjRcv   SizeRcv   ObjSent   SizeSent   StartTime        EndTime           Aborted
======        ======   ======   ======    ======    ======     ======           ======            ======
181883t8089   1        0        0B        1058      1.27MiB    04-28 16:05:35   <not completed>   false
249630t8087   1        0        0B        988       1.18MiB    04-28 16:05:35   <not completed>   false
361179t8088   1        5029     6.02MiB   0         0B         04-28 16:05:35   <not completed>   false
675515t8084   1        0        0B        989       1.18MiB    04-28 16:05:35   <not completed>   false
840083t8086   1        0        0B        974       1.17MiB    04-28 16:05:35   <not completed>   false
911875t8085   1        0        0B        1020      1.22MiB    04-28 16:05:35   <not completed>   false

$ ais show rebalance
DaemonID      RebID    ObjRcv   SizeRcv   ObjSent   SizeSent   StartTime        EndTime          Aborted
======        ======   ======   ======    ======    ======     ======           ======           ======
181883t8089   1        0        0B        1058      1.27MiB    04-28 16:05:35   04-28 16:05:53   false
249630t8087   1        0        0B        988       1.18MiB    04-28 16:05:35   04-28 16:05:53   false
361179t8088   1        5029     6.02MiB   0         0B         04-28 16:05:35   04-28 16:05:53   false
675515t8084   1        0        0B        989       1.18MiB    04-28 16:05:35   04-28 16:05:53   false
840083t8086   1        0        0B        974       1.17MiB    04-28 16:05:35   04-28 16:05:53   false
911875t8085   1        0        0B        1020      1.22MiB    04-28 16:05:35   04-28 16:05:53   false
```

4. Since global rebalance is an [extended action (xaction)](/xact/README.md), it can also be monitored via the generic `show xaction` API:

```console
$ ais show job xaction rebalance
NODE          ID   KIND        BUCKET   OBJECTS   BYTES     START            END   STATE
181883t8089   g2   rebalance   -        1058      1.27MiB   04-28 16:10:14   -     Running
...
```

5. Finally, you can always start and stop global rebalance administratively, for instance:

```console
$ ais start rebalance
```

## Automated Resilvering

While rebalance (previous section) takes care of the cluster *grow* and *shrink* events, resilver, as the name implies, is responsible for the [mountpath](overview.md#terminology) *added* and [mountpath](overview.md#terminology) *removed* events handled locally within (and by) each storage target.

In other words, global rebalance handles scaling (up and down) of the entire AIS cluster while automated *resilvering* takes care of disk attachments and disk faults within a given storage node.

* A [mountpath](overview.md#terminology) is a single disk **or** a volume (a RAID) formatted with a local filesystem of choice, **and** a local directory that AIS utilizes to store user data and AIS metadata. A mountpath can be disabled and (re)enabled, automatically or administratively, at any point during runtime. In a given cluster, the total number of mountpaths is normally the product of `(number of storage targets) x (number of disks in each target)`.

As stated, mountpath removal can be done administratively (via API) or be triggered by a disk fault (see [filesystem health checking](/health/fshc.md)).
Irrespective of the original cause, mountpath-level events activate resilver, which in many ways performs the same set of steps as the rebalance.
The one salient difference is that all object migrations are local (and, therefore, relatively fast(er)).

### CLI Usage

Resilvering can be run on a specific target node or the entire cluster (when all targets execute resilvering in parallel).

Similar to global rebalancing, resilvering is a managed *eXtended operation* or [xaction](ic.md).
All xactions execute asynchronously and support a common set of documented APIs to start and terminate the xaction, inquire about its progress, etc.
The progress of resilvering can be monitored via the `ais show job xaction` CLI.

Examples:

```console
$ ais advanced resilver # all targets will be resilvered
Started resilver "NGxmOthtE", use 'ais show job xaction NGxmOthtE' to monitor the progress

$ ais advanced resilver BUQOt8086 # resilver a single node
Started resilver "NGxmOthtE", use 'ais show job xaction NGxmOthtE' to monitor the progress
```

Automated resilvering can also be disabled.
NOTE: When automated resilvering is disabled, removing a mountpath may result in data loss.

Just like with `rebalance`, the resulting config can be viewed through the CLI:

```console
$ ais config cluster resilver.enabled=false
config successfully updated

$ ais show config 361179t8088 resilver --json | grep -A 2 resilver
    "resilver": {
        "enabled": false
    },

$ ais config cluster resilver.enabled=true
config successfully updated

$ ais show config <TAB-TAB>
125210p8082   181883t8089   249630t8087   361179t8088   477343p8081   675515t8084   70681p8080   782227p8083   840083t8086   911875t8085

$ ais show config 361179t8088 resilver
PROPERTY            VALUE
resilver.enabled    true
```

## IO Performance

During rebalancing, response latency and overall cluster throughput may substantially degrade.
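
How long that degradation lasts depends largely on how much data has to move. The "approximately 1/Nth" estimate from the Global Rebalance section can be sanity-checked with a small simulation. The sketch below is illustrative only: it uses a generic rendezvous (highest-random-weight) hashing scheme as a stand-in for the cluster's actual placement function, and the target IDs (`t0`..`t5`), object names, and helper functions are all hypothetical rather than part of AIStore.

```python
import hashlib


def owner(obj_name, targets):
    """Pick the target with the highest hash weight for this object.

    Generic rendezvous hashing; an assumption made for illustration,
    not AIStore's actual placement code.
    """
    def weight(target):
        digest = hashlib.sha256(f"{target}/{obj_name}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(targets, key=weight)


def moved_fraction(old_targets, new_targets, num_objects=20000):
    """Fraction of objects whose owner differs between two cluster maps."""
    moved = 0
    for i in range(num_objects):
        name = f"obj-{i}"  # hypothetical object names
        if owner(name, old_targets) != owner(name, new_targets):
            moved += 1
    return moved / num_objects


old = [f"t{i}" for i in range(5)]  # hypothetical 5-target cluster map
new = old + ["t5"]                 # next map version: one target joins
frac = moved_fraction(old, new)
print(f"moved fraction: {frac:.3f}; 1/(number of targets after join): {1 / len(new):.3f}")
```

Under this scheme, only objects whose new owner is the joining target change location, so the moved fraction stays close to one over the new target count and the rest of the namespace is untouched.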