github.com/ghodss/etcd@v0.3.1-0.20140417172404-cc329bfa55cb/Documentation/clustering.md

## Clustering

### Example cluster of three machines

Let's explore the use of etcd clustering.
We use Raft as the underlying distributed protocol, which provides consistency and persistence of the data across all of the etcd instances.

Let's start by creating three new etcd instances.

We use `-peer-addr` to specify the server port, `-addr` to specify the client port, and `-data-dir` to specify the directory in which to store the log and info of the machine in the cluster:

```sh
./etcd -peer-addr 127.0.0.1:7001 -addr 127.0.0.1:4001 -data-dir machines/machine1 -name machine1
```

**Note:** If you want to run etcd on an external IP address and still have access locally, you'll need to add `-bind-addr 0.0.0.0` so that it listens on both the external and localhost addresses.
A similar argument, `-peer-bind-addr`, is used to set up the listening address for the server port.
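
For illustration, a machine1 invocation combining these flags might look like the following; the external address `203.0.113.10` is a placeholder, not taken from this guide:

```sh
# Hypothetical sketch: advertise an external IP to peers and clients while
# binding the listeners to all interfaces, so local access still works.
# 203.0.113.10 is a placeholder external address.
./etcd -peer-addr 203.0.113.10:7001 -addr 203.0.113.10:4001 \
       -bind-addr 0.0.0.0 -peer-bind-addr 0.0.0.0 \
       -data-dir machines/machine1 -name machine1
```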

Let's join two more machines to this cluster using the `-peers` argument. A single connection to any peer will allow a new machine to join, but multiple can be specified for greater resiliency.

```sh
./etcd -peer-addr 127.0.0.1:7002 -addr 127.0.0.1:4002 -peers 127.0.0.1:7001,127.0.0.1:7003 -data-dir machines/machine2 -name machine2
./etcd -peer-addr 127.0.0.1:7003 -addr 127.0.0.1:4003 -peers 127.0.0.1:7001,127.0.0.1:7002 -data-dir machines/machine3 -name machine3
```

We can retrieve a list of machines in the cluster using the HTTP API:

```sh
curl -L http://127.0.0.1:4001/v2/machines
```

We should see that there are three machines in the cluster:

```
http://127.0.0.1:4001, http://127.0.0.1:4002, http://127.0.0.1:4003
```

The machine list is also available via the main key API:

```sh
curl -L http://127.0.0.1:4001/v2/keys/_etcd/machines
```

```json
{
    "action": "get",
    "node": {
        "createdIndex": 1,
        "dir": true,
        "key": "/_etcd/machines",
        "modifiedIndex": 1,
        "nodes": [
            {
                "createdIndex": 1,
                "key": "/_etcd/machines/machine1",
                "modifiedIndex": 1,
                "value": "raft=http://127.0.0.1:7001&etcd=http://127.0.0.1:4001"
            },
            {
                "createdIndex": 2,
                "key": "/_etcd/machines/machine2",
                "modifiedIndex": 2,
                "value": "raft=http://127.0.0.1:7002&etcd=http://127.0.0.1:4002"
            },
            {
                "createdIndex": 3,
                "key": "/_etcd/machines/machine3",
                "modifiedIndex": 3,
                "value": "raft=http://127.0.0.1:7003&etcd=http://127.0.0.1:4003"
            }
        ]
    }
}
```

We can also get the current leader in the cluster:

```sh
curl -L http://127.0.0.1:4001/v2/leader
```

The first server we set up should still be the leader, unless it has died during these commands.

```
http://127.0.0.1:7001
```

Now we can do normal SET and GET operations on keys as we explored earlier.

```sh
curl -L http://127.0.0.1:4001/v2/keys/foo -XPUT -d value=bar
```

```json
{
    "action": "set",
    "node": {
        "createdIndex": 4,
        "key": "/foo",
        "modifiedIndex": 4,
        "value": "bar"
    }
}
```

### Rejoining the Cluster

If a machine disconnects from the cluster, it will rejoin automatically once communication is restored.

If a machine is killed, it can rejoin the cluster when restarted with its old name. If its peer address has changed, etcd will treat the new peer address as the current one, which is useful for instance migration or for booting a virtual machine with a different IP.

**Note:** For now, it is the user's responsibility to ensure that a machine does not join a cluster that already has a member with the same name; otherwise an unexpected error will occur. This will be improved in a future release.
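
For example, to bring machine2 back after killing it, rerun its original command from the walkthrough above; the matching `-name` and `-data-dir` let it rejoin the existing cluster:

```sh
# Restart machine2 with its original name and data directory so it rejoins
# the cluster instead of starting a fresh one.
./etcd -peer-addr 127.0.0.1:7002 -addr 127.0.0.1:4002 \
       -peers 127.0.0.1:7001,127.0.0.1:7003 \
       -data-dir machines/machine2 -name machine2
```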

### Killing Nodes in the Cluster

Now if we kill the leader of the cluster, we can get the value from one of the other two machines:

```sh
curl -L http://127.0.0.1:4002/v2/keys/foo
```

We can also see that a new leader has been elected:

```sh
curl -L http://127.0.0.1:4002/v2/leader
```

```
http://127.0.0.1:7002
```

or

```
http://127.0.0.1:7003
```

### Testing Persistence

Next we'll kill all the machines to test persistence.
Type `CTRL-C` on each terminal and then rerun the same command you used to start each machine.

Your request for the `foo` key will return the correct value:

```sh
curl -L http://127.0.0.1:4002/v2/keys/foo
```

```json
{
    "action": "get",
    "node": {
        "createdIndex": 4,
        "key": "/foo",
        "modifiedIndex": 4,
        "value": "bar"
    }
}
```

### Using HTTPS between servers

In the previous example we showed how to use SSL client certs for client-to-server communication.
etcd can also secure internal server-to-server communication with SSL client certs.
To do this, just change the `-*-file` flags to `-peer-*-file`.

If you are using SSL for server-to-server communication, you must use it on all instances of etcd.
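
As a sketch, assuming you have already generated a certificate and key for the machine (`server.crt` and `server.key` are placeholder filenames, not files created in this guide), the peer flags look like:

```sh
# Hypothetical sketch: SSL on the server-to-server (peer) port.
# server.crt and server.key are placeholder filenames; remember that every
# instance in the cluster must be started with SSL on its peer port.
./etcd -peer-addr 127.0.0.1:7001 -addr 127.0.0.1:4001 \
       -peer-cert-file=server.crt -peer-key-file=server.key \
       -data-dir machines/machine1 -name machine1
```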

### What size cluster should I use?

Every command the client sends to the leader is broadcast to all of the followers.
The command is not committed until a majority of the cluster peers have received it.

Because of this majority-voting property, the ideal cluster should be kept small, to keep speeds up, and should be made up of an odd number of peers.

Odd numbers are good because adding a peer does not always raise the majority: with 8 peers the majority is 5, and with 9 peers the majority is still 5.
The result is that an 8-peer cluster can tolerate 3 peer failures while a 9-peer cluster can tolerate 4.
And in the best case, when all 9 peers are responding, the cluster will perform at the speed of the fastest 5 machines.
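
The arithmetic above generalizes: for `n` peers the majority (quorum) is `n/2 + 1` with integer division, and the cluster tolerates `n - quorum` peer failures. A quick shell check:

```sh
# Print quorum size and tolerated failures for a few cluster sizes.
for n in 3 5 7 8 9; do
  quorum=$(( n / 2 + 1 ))       # smallest majority of n peers
  tolerated=$(( n - quorum ))   # peer failures the cluster survives
  echo "peers=$n quorum=$quorum tolerated=$tolerated"
done
# The last two lines are "peers=8 quorum=5 tolerated=3" and
# "peers=9 quorum=5 tolerated=4", matching the 8- and 9-peer cases above.
```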