github.com/NVIDIA/aistore@v1.3.23-0.20240517131212-7df6609be51d/docs/join_cluster.md

---
layout: post
title: JOIN CLUSTER
permalink: /docs/join-cluster
redirect_from:
 - /join_cluster.md/
 - /docs/join_cluster.md/
---

**Note**: for the most recent updates on the topic of cluster membership and node lifecycle, please also check:

* [Node lifecycle: maintenance mode, rebalance/rebuild, shutdown, decommission](/docs/lifecycle_node.md)

Also, see related:

* [Leaving aistore cluster](leave_cluster.md)
* [Global rebalance](rebalance.md)
* [CLI: `ais cluster` command](/docs/cli/cluster.md)
* [Scripted integration tests](https://github.com/NVIDIA/aistore/tree/main/ais/test/scripts)

## Joining a Cluster: _discovery_ URL, and more

First, some basic facts. AIStore clusters can be deployed with an arbitrary number of AIStore proxies. Each proxy/gateway implements a RESTful API and provides full access to objects stored in the cluster. Each proxy collaborates with all other proxies to perform majority-voted HA failovers (see [Highly Available Control Plane](ha.md)).

All _electable_ proxies are functionally equivalent. The one that is elected as _primary_ is, among other things, responsible for _joining_ nodes to the running cluster.

Node joining must keep working in the presence of disruptive events, such as:

* network failures, and/or
* partial or complete loss of local copies of aistore metadata (e.g., cluster maps)

So that a node can still reconnect and restore operation in those cases, the cluster configuration also provides the so-called *original* and *discovery* URLs.

The cluster configuration is versioned, replicated, protected, and distributed solely by the elected primary.

> **March 2024 UPDATE**: starting with v3.23, the *original* URL does _not_ track the "original" primary. Instead, the current (or currently elected) primary takes full responsibility for updating both URLs with a single purpose: minimizing the time it takes to join or rejoin the cluster.

For instance:

When an HA event triggers automated failover, a different proxy/gateway automatically assumes the role of the primary, and the corresponding cluster map (Smap) update is synchronized across all running nodes.

A new node, however, may fail to join an already deployed and running cluster simply because its configuration still refers to the old primary. The *original* and *discovery* URLs (see [AIStore configuration](/deploy/dev/local/aisnode_config.sh)) are intended to address precisely this scenario.
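
The fallback idea can be illustrated with a minimal Python sketch. It is not AIStore code: `ProxyConf`, `join_candidates`, `try_join`, and the `join` callback are hypothetical names, and the order in which the URLs are tried (primary, then discovery, then original) is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ProxyConf:
    """Hypothetical stand-in for the proxy URLs in a node's local configuration."""
    primary_url: str
    discovery_url: str = ""
    original_url: str = ""


def join_candidates(conf: ProxyConf) -> List[str]:
    """Return the configured URLs in the (assumed) order to try, without blanks or duplicates."""
    seen = set()
    ordered = []
    for url in (conf.primary_url, conf.discovery_url, conf.original_url):
        url = url.strip().rstrip("/")
        if url and url not in seen:
            seen.add(url)
            ordered.append(url)
    return ordered


def try_join(conf: ProxyConf, join: Callable[[str], bool]) -> Optional[str]:
    """Try each candidate in turn; return the first URL that accepted the join, else None."""
    for url in join_candidates(conf):
        try:
            if join(url):
                return url
        except OSError:
            # Endpoint unreachable (e.g., the old primary is gone): try the next URL.
            continue
    return None
```

With a `join` callback that raises `ConnectionRefusedError` for the stale primary and succeeds elsewhere, `try_join` returns the next reachable URL instead of failing on the first attempt.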