github.com/ethersphere/bee/v2@v2.2.0/pkg/topology/kademlia/doc.go

// Copyright 2020 The Swarm Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

/*
Package kademlia provides an implementation of the topology.Driver interface
in which the node actively maintains Kademlia connectivity.

A thorough explanation of the logic in the `manage()` forever loop:
The `manageC` channel is triggered every time the information about the
peers we know of changes. This can be the result of: (1) a peer
disconnecting from us, or (2) a peer being added to the list of
known peers (from discovery, the API, the bootnode flag, or simply because it
was persisted in the address book and the node has been restarted).

Once the information has changed, a disconnection can cause the depth
to move to a shallower value as a result.
If a peer gets added through AddPeers, this does not necessarily imply
an immediate depth change, since the peer might stay in the backlog for
a long time until we actually need to connect to it.

The `manage()` forever loop connects to peers in order from shallower
to deeper depths. This is because the depth calculation method prioritizes empty bins
that are shallower than depth. An in-depth look at the `recalcDepth()` method
will clarify this (more below). If we were to connect to peers from deeper
to shallower depths, all peers in all bins would qualify as peers we'd like
to connect to (see the `binSaturated` method), and we would end up connecting to everyone we know about.

Another important point to keep in mind when inspecting how `manage()`
works is that when we connect to peers, depth can only move in one direction,
which is deeper. This is the strategy we operate with, and
it is also why we iterate from shallower to deeper: additional
connections to peers, for whatever reason, can only move depth deeper, never shallower.
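
As a rough illustration of the shallow-to-deep ordering, the following Go sketch
walks the bins from shallowest to deepest and lists the bins it would dial into.
It is not the actual `manage()` code: `saturationPeers`, `dialOrder`, the per-bin
counters and the simplified `binSaturated` signature are assumptions made for this
example, and depth is held fixed instead of being recalculated after each connection.

```go
package main

import "fmt"

// saturationPeers is a hypothetical per-bin target used only in this sketch.
const saturationPeers = 2

// binSaturated is a simplified stand-in for the method of the same name.
// Assumption: bins at or beyond depth are never treated as saturated, and a
// shallower bin is saturated once it holds saturationPeers connections.
func binSaturated(connected, po, depth int) bool {
	if po >= depth {
		return false
	}
	return connected >= saturationPeers
}

// dialOrder returns the bins we would dial into, shallowest first.
// connected[po] is the number of connected peers in bin po and
// backlog[po] is the number of known but unconnected peers in bin po.
func dialOrder(connected, backlog []int, depth int) []int {
	var order []int
	for po := 0; po < len(connected) && po < len(backlog); po++ {
		if backlog[po] > 0 && !binSaturated(connected[po], po, depth) {
			order = append(order, po)
		}
	}
	return order
}

func main() {
	connected := []int{2, 0, 1, 1}
	backlog := []int{3, 1, 0, 2}
	fmt.Println(dialOrder(connected, backlog, 1)) // prints: [1 3]
}
```

Bin 0 is skipped because it already holds two connections, and bin 2 is skipped
because there is nobody left to dial in it.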

Empty intermediate bins should be eliminated by the `binSaturated` method reporting
that the bin is undersized, which in turn means that connections should be established
within this bin. Empty bins have special status in depth calculation:
as mentioned before, they are prioritized over deeper, non-empty bins, and
an empty bin shallower than the depth candidate becomes the node's depth when depth is recalculated.
For the rationale behind this, please refer to the appropriate chapters in the Book of Swarm.

A special case of the `manage()` functionality is that when we iterate over
peers and come across a peer that has PO >= depth, we always want
to connect to that peer. This should always be enforced within the bounds of
the `binSaturated` function. It guarantees an ever-increasing kademlia depth
as the Swarm grows, resulting in smaller areas of responsibility
for the nodes and maintaining a general upper bound on the assigned nominal
area of responsibility in terms of actual storage requirement. See the Book of Swarm for more details.
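
PO in the paragraph above stands for proximity order. Assuming the usual Swarm
definition (the number of leading bits that two overlay addresses share), the
PO >= depth check can be sketched as below. `proximity`, `inNeighborhood` and
the `maxPO` cap are names and values chosen for this illustration and are not
presented as this package's API.

```go
package main

import (
	"fmt"
	"math/bits"
)

// maxPO is an assumed cap on the proximity order for this sketch.
const maxPO = 31

// proximity counts the leading bits shared by a and b, capped at maxPO.
// Identical inputs yield maxPO.
func proximity(a, b []byte) int {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	for i := 0; i < n; i++ {
		// The first differing byte determines the result.
		if x := a[i] ^ b[i]; x != 0 {
			po := i*8 + bits.LeadingZeros8(x)
			if po > maxPO {
				return maxPO
			}
			return po
		}
	}
	return maxPO
}

// inNeighborhood reports whether a peer at the given PO falls at or beyond depth.
func inNeighborhood(po, depth int) bool {
	return po >= depth
}

func main() {
	self := []byte{0xf0, 0x00}
	peer := []byte{0xf8, 0x00}
	po := proximity(self, peer)
	fmt.Println(po, inNeighborhood(po, 3)) // prints: 4 true
}
```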

It is worth noting that `manage()` will always try to initiate connections when
a bin is not saturated; however, it currently does not try to drop connections
in bins that might be over-saturated. Ideally it should be very cheap to maintain a
connection to a peer in a bin, so in theory we should not aspire to drop connections prematurely.
It is also safe to assume we will always have more than the lower bound of peers in a bin. Why?
(1) Initially, we will always try to satisfy our own connectivity requirement to saturate the bin.
(2) Later on, other peers will be notified of our advertised address and
will try to connect to us in order to satisfy their own connectivity thresholds.

We should allow other nodes to dial in, in order to help them maintain a healthy topology.
It could be, however, that we would need to mark-and-sweep certain connections once a
theoretical upper bound has been reached.

Depth calculation explained:
When we calculate depth we must keep in mind the following constraints:
(1) A nearest neighborhood consists of an arbitrary lower bound of the
closest peers we know about; this is defined in `nnLowWatermark` and is currently set to `2`.
(2) Empty bins which are shallower than depth constitute the node's area of responsibility.

As such, we calculate depth in the following manner:
(1) Iterate over all peers we know about, from deepest (closest) to shallowest, and count until we reach `nnLowWatermark`.
(2) Once we reach `nnLowWatermark`, mark the current bin as the depth candidate.
(3) Iterate over all bins from shallowest to deepest, and look for the shallowest empty bin.
(4) If the shallowest empty bin is shallower than the depth candidate, select the shallowest empty bin as depth; otherwise select the candidate.

Note: when we are connected to `nnLowWatermark` peers or fewer, the
depth is always considered `0`, so a short-circuit handles this edge
case explicitly in the `recalcDepth` method.

TODO: add pseudo-code for how to calculate depth.
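
Pending that pseudo-code, the four steps and the short-circuit described above
can be sketched in Go roughly as follows. This is an illustration, not the
`recalcDepth` implementation: `depthOf` is a hypothetical helper, and it assumes
the input is simply the number of connected peers per bin, indexed by bin.

```go
package main

import "fmt"

// nnLowWatermark mirrors the value quoted in the text above.
const nnLowWatermark = 2

// depthOf is a hypothetical helper. bins[po] holds the peer count of bin po.
func depthOf(bins []int) int {
	total := 0
	for _, n := range bins {
		total += n
	}
	// Short-circuit: with nnLowWatermark peers or fewer, depth is 0.
	if total <= nnLowWatermark {
		return 0
	}

	// Steps (1) and (2): count from the deepest bin towards the shallowest
	// until nnLowWatermark peers are seen; that bin is the depth candidate.
	candidate, count := 0, 0
	for po := len(bins) - 1; po >= 0; po-- {
		count += bins[po]
		if count >= nnLowWatermark {
			candidate = po
			break
		}
	}

	// Step (3): find the shallowest empty bin, if there is one.
	shallowestEmpty := -1
	for po, n := range bins {
		if n == 0 {
			shallowestEmpty = po
			break
		}
	}

	// Step (4): an empty bin shallower than the candidate wins.
	if shallowestEmpty >= 0 && shallowestEmpty < candidate {
		return shallowestEmpty
	}
	return candidate
}

func main() {
	fmt.Println(depthOf([]int{1, 0, 1, 1, 0})) // prints: 1
	fmt.Println(depthOf([]int{1, 1, 1, 1, 0})) // prints: 2
	fmt.Println(depthOf([]int{1, 1, 1, 3, 0})) // prints: 3
	fmt.Println(depthOf([]int{1, 1, 1, 3, 2})) // prints: 4
}
```

The four calls in `main` use the bin counts of examples 5 to 8 below.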

A few examples of depth calculation:

1. empty kademlia
bin | nodes
-------------
==DEPTH==
0			0
1			0
2			0
3			0
4			0
depth: 0

2. fewer than or equal to two peers (nnLowWatermark=2) (a)
bin | nodes
-------------
==DEPTH==
0			1
1			1
2			0
3			0
4			0
depth: 0

3. fewer than or equal to two peers (nnLowWatermark=2) (b)
bin | nodes
-------------
==DEPTH==
0			1
1			0
2			1
3			0
4			0
depth: 0

4. fewer than or equal to two peers (nnLowWatermark=2) (c)
bin | nodes
-------------
==DEPTH==
0			2
1			0
2			0
3			0
4			0
depth: 0

5. empty shallow bin
bin | nodes
-------------
0			1
==DEPTH==
1			0
2			1
3			1
4			0
depth: 1 (depth candidate is 2, but 1 is shallower and empty)

6. no empty shallower bin, depth after nnLowWatermark found
bin | nodes
-------------
0			1
1			1
==DEPTH==
2			1
3			1
4			0
depth: 2 (depth candidate is 2, shallowest empty bin is 4)

7. last bin size >= nnLowWatermark
bin | nodes
-------------
0			1
1			1
2			1
==DEPTH==
3			3
4			0
depth: 3 (depth candidate is 3, shallowest empty bin is 4)

8. all bins full
bin | nodes
-------------
0			1
1			1
2			1
3			3
==DEPTH==
4			2
depth: 4 (depth candidate is 4, no empty bins)
*/
package kademlia