// Copyright 2020 The Swarm Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

/*
Package kademlia provides an implementation of the topology.Driver interface
in which the node actively maintains kademlia connectivity.

A thorough explanation of the logic in the `manage()` forever loop:
The `manageC` channel is triggered every time there is a change in the
information about the peers we know about. This can be the result of:
(1) A peer has disconnected from us
(2) A peer has been added to the list of known peers (from discovery, the API,
the bootnode flag, or because it was persisted in the address book and the
node has been restarted)

When this information changes because of a disconnection, the depth can
become shallower as a result.
If a peer gets added through AddPeers, this does not necessarily imply
an immediate depth change, since the peer might remain in the backlog for
a long time until we actually need to connect to it.

The `manage()` forever loop connects to peers in order from shallower
to deeper bins. This is because the depth calculation prioritizes empty bins
that are shallower than depth. An in-depth look at the `recalcDepth()` method
will clarify this (more below). If we connected to peers from deeper
to shallower bins instead, all peers in all bins would qualify as peers we'd like
to connect to (see the `binSaturated` method), and we would end up connecting to
everyone we know about.

Another important notion to keep in mind while inspecting how `manage()`
works is that when we connect to peers, depth can only move in one direction,
which is deeper. This becomes our strategy and we operate with it in mind.
It is also why we iterate from shallower to deeper: additional
connections to peers, for whatever reason, can only result in increasing depth.

Empty intermediate bins should be eliminated by the `binSaturated` method reporting
that the bin is too small, which in turn means that connections should be established
within this bin. Empty bins have a special status in depth calculation:
as mentioned before, they are prioritized over deeper, non-empty bins, and
the shallowest such bin becomes the node's depth when the latter is recalculated.
For the rationale behind this please refer to the appropriate chapters in the book of Swarm.

A special case of the `manage()` functionality is that when we iterate over
peers and come across a peer whose proximity order (PO) is >= depth, we always
want to connect to that peer. This should always be enforced within the bounds of
the `binSaturated` function. It guarantees an ever-increasing kademlia depth
in an ever-growing Swarm, resulting in smaller areas of responsibility
for the nodes and maintaining a general upper bound on the assigned nominal
area of responsibility in terms of actual storage requirement. See book of Swarm for more details.

It is worth noting that `manage()` will always try to initiate connections when
a bin is not saturated; however, it currently does not try to eliminate connections
in bins which might be over-saturated. Ideally it should be very cheap to maintain a
connection to a peer in a bin, so in theory we should not aspire to eliminate connections prematurely.
It is also safe to assume that we will always have more than the lower bound of peers in a bin, for two reasons:
(1) Initially, we will always try to satisfy our own connectivity requirement to saturate the bin
(2) Later on, other peers will get notified about our advertised address and
will try to connect to us in order to satisfy their own connectivity thresholds

We should allow other nodes to dial in, in order to help them maintain a healthy topology.
It could be, however, that we would need to mark-and-sweep certain connections once a
theoretical upper bound has been reached.

Depth calculation explained:
When we calculate depth we must keep in mind the following constraints:
(1) A nearest neighborhood consists of an arbitrary lower bound of the
closest peers we know about; this is defined in `nnLowWatermark` and is currently set to `2`
(2) Empty bins which are shallower than depth are treated as part of the node's area of responsibility

As such, we calculate depth in the following manner:
(1) Iterate over all peers we know about, from deepest (closest) to shallowest, and count until we reach `nnLowWatermark`
(2) Once we reach `nnLowWatermark`, mark the current bin as the depth candidate
(3) Iterate over all bins from shallowest to deepest, and look for the shallowest empty bin
(4) If the shallowest empty bin is shallower than the depth candidate, select the shallowest empty bin as depth; otherwise select the candidate

Note: when we are connected to `nnLowWatermark` peers or fewer, the
depth is always considered `0`; a short-circuit in the `recalcDepth` method
handles this edge case explicitly.

TODO: add pseudo-code showing how to calculate depth.

A few examples of depth calculation:

1. empty kademlia

	bin | nodes
	-------------
	==DEPTH==
	0     0
	1     0
	2     0
	3     0
	4     0
	depth: 0

2. two peers or fewer (nnLowWatermark=2) (a)

	bin | nodes
	-------------
	==DEPTH==
	0     1
	1     1
	2     0
	3     0
	4     0
	depth: 0

3. two peers or fewer (nnLowWatermark=2) (b)

	bin | nodes
	-------------
	==DEPTH==
	0     1
	1     0
	2     1
	3     0
	4     0
	depth: 0

4. two peers or fewer (nnLowWatermark=2) (c)

	bin | nodes
	-------------
	==DEPTH==
	0     2
	1     0
	2     0
	3     0
	4     0
	depth: 0

5. empty shallow bin

	bin | nodes
	-------------
	0     1
	==DEPTH==
	1     0
	2     1
	3     1
	4     0
	depth: 1 (depth candidate is 2, but 1 is shallower and empty)

6. no empty shallower bin, depth after nnLowWatermark found

	bin | nodes
	-------------
	0     1
	1     1
	==DEPTH==
	2     1
	3     1
	4     0
	depth: 2 (depth candidate is 2, shallowest empty bin is 4)

7. last non-empty bin size >= nnLowWatermark

	bin | nodes
	-------------
	0     1
	1     1
	2     1
	==DEPTH==
	3     3
	4     0
	depth: 3 (depth candidate is 3, shallowest empty bin is 4)

8. all bins full

	bin | nodes
	-------------
	0     1
	1     1
	2     1
	3     3
	==DEPTH==
	4     2
	depth: 4 (depth candidate is 4, no empty bins)
*/
package kademlia
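
// The depth rules described in the package comment can be sketched in Go,
// which may also serve in place of the pseudo-code the TODO asks for. This is
// a minimal standalone sketch, not the package's `recalcDepth` implementation:
// it assumes the topology is reduced to a slice of per-bin connected-peer
// counts (index 0 is the shallowest bin), and the helper name
// `depthFromBinSizes` is hypothetical. Only `nnLowWatermark` and its value of
// 2 come from the comment itself.

```go
package main

import "fmt"

// nnLowWatermark mirrors the value quoted in the package comment.
const nnLowWatermark = 2

// depthFromBinSizes is a hypothetical helper that applies the four steps from
// the package comment to a slice of per-bin peer counts.
func depthFromBinSizes(bins []int) int {
	total := 0
	for _, n := range bins {
		total += n
	}
	// Short-circuit: with nnLowWatermark peers or fewer, depth is 0.
	if total <= nnLowWatermark {
		return 0
	}

	// Steps 1-2: walk from the deepest bin to the shallowest, counting peers,
	// and mark the bin where the count reaches nnLowWatermark as the candidate.
	candidate := 0
	count := 0
	for bin := len(bins) - 1; bin >= 0; bin-- {
		count += bins[bin]
		if count >= nnLowWatermark {
			candidate = bin
			break
		}
	}

	// Steps 3-4: an empty bin shallower than the candidate takes precedence.
	for bin := 0; bin < candidate; bin++ {
		if bins[bin] == 0 {
			return bin
		}
	}
	return candidate
}

func main() {
	// The eight examples from the package comment, in order.
	examples := [][]int{
		{0, 0, 0, 0, 0},
		{1, 1, 0, 0, 0},
		{1, 0, 1, 0, 0},
		{2, 0, 0, 0, 0},
		{1, 0, 1, 1, 0},
		{1, 1, 1, 1, 0},
		{1, 1, 1, 3, 0},
		{1, 1, 1, 3, 2},
	}
	for i, bins := range examples {
		fmt.Printf("example %d: bins=%v depth=%d\n", i+1, bins, depthFromBinSizes(bins))
	}
}
```

// Taking bin sizes rather than peer records keeps the sketch independent of
// the package's internal data structures; the real method may weigh
// additional factors that the comment does not describe.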