
     1  # Tree data structure  
     2  
     3  A tree is a nonlinear hierarchical data structure that consists of nodes connected by edges.  
     4  
     5  ## Why Tree Data Structure?
Other data structures such as arrays, linked lists, stacks, and queues are linear data structures that store data sequentially. Searching an unsorted linear structure takes time proportional to the number of elements, so operations get slower as the data grows. That cost becomes unacceptable for large data sets.

Tree data structures allow quicker and easier access to the data because they are non-linear: a balanced search tree, for example, finds a key in time proportional to the height of the tree rather than to the number of elements.
    11  
    12  ## Tree Terminologies
    13  
    14  ### Node
A node is an entity that contains a key or value and pointers to its child nodes.

The last nodes of each path are called leaf nodes or external nodes; they do not contain a link/pointer to child nodes.

A node that has at least one child node is called an internal node.
    20  
    21  ### Edge
    22  It is the link between any two nodes.
    23  
    24  ### Root
    25  It is the topmost node of a tree.
    26  
    27  ### Height of a Node
    28  The height of a node is the number of edges from the node to the deepest leaf (i.e. the longest path from the node to a leaf node).
    29  
    30  ### Depth of a Node
    31  The depth of a node is the number of edges from the root to the node.
    32  
    33  ### Height of a Tree
    34  The height of a Tree is the height of the root node or the depth of the deepest node.
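
The height and depth definitions above can be sketched in a few lines. This is only an illustration: the `Node` class, its `children` list, and the helper names `height` and `depth` are assumptions made for the example, not part of this repository.

```python
class Node:
    """Hypothetical general tree node used only for this illustration."""

    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)


def height(node):
    """Number of edges on the longest path from node down to a leaf."""
    if not node.children:
        return 0
    return 1 + max(height(child) for child in node.children)


def depth(root, target, d=0):
    """Number of edges from root to target, or None if target is not in the tree."""
    if root is target:
        return d
    for child in root.children:
        found = depth(child, target, d + 1)
        if found is not None:
            return found
    return None


# a
# |- b
# |  `- d
# `- c
d_node = Node("d")
b_node = Node("b", [d_node])
a_node = Node("a", [b_node, Node("c")])
print(height(a_node), height(b_node), depth(a_node, d_node))  # 2 1 2
```

The height of the tree is `height(a_node)`, which equals the depth of its deepest node `d`.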
    35  
    36  ### Degree of a Node
    37  The degree of a node is the total number of branches of that node.
    38  
    39  ### Forest
    40  A collection of disjoint trees is called a forest.
    41  
    42  # Tree Traversal
    43  Traversing a tree means visiting every node in the tree. You might, for instance, want to add all the values in the tree or find the largest one. For all these operations, you will need to visit each node of the tree.
    44  
Every non-empty binary tree is a combination of:
1. A node carrying data
2. Two subtrees (a left subtree and a right subtree), either of which may be empty
    48  
    49  ## Inorder
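1. First, visit all the nodes in the left subtree
2. Then visit the root node
3. Finally, visit all the nodes in the right subtree

```
def inorder(root):
    if root is None:   # empty subtree: nothing to visit
        return
    inorder(root.left)
    display(root)      # visit the node, e.g. print its value
    inorder(root.right)
```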
    59  
    60  ## Preorder
1. Visit the root node
2. Visit all the nodes in the left subtree
3. Visit all the nodes in the right subtree

```
def preorder(root):
    if root is None:   # empty subtree: nothing to visit
        return
    display(root)      # visit the node, e.g. print its value
    preorder(root.left)
    preorder(root.right)
```
    70  
    71  ## Postorder
1. Visit all the nodes in the left subtree
2. Visit all the nodes in the right subtree
3. Visit the root node

```
def postorder(root):
    if root is None:   # empty subtree: nothing to visit
        return
    postorder(root.left)
    postorder(root.right)
    display(root)      # visit the node, e.g. print its value
```
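
To compare the three orders side by side, here is a minimal runnable sketch. The `Node` dataclass and its field names are assumptions made for the example, and each function returns a list of values instead of displaying them so that the results are easy to inspect.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical binary tree node used only for this illustration."""
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def inorder(root):
    """Left subtree, then the root, then the right subtree."""
    if root is None:
        return []
    return inorder(root.left) + [root.value] + inorder(root.right)


def preorder(root):
    """The root, then the left subtree, then the right subtree."""
    if root is None:
        return []
    return [root.value] + preorder(root.left) + preorder(root.right)


def postorder(root):
    """Left subtree, then the right subtree, then the root."""
    if root is None:
        return []
    return postorder(root.left) + postorder(root.right) + [root.value]


#       1
#      / \
#     2   3
#    / \
#   4   5
tree = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(inorder(tree))    # [4, 2, 5, 1, 3]
print(preorder(tree))   # [1, 2, 4, 5, 3]
print(postorder(tree))  # [4, 5, 2, 3, 1]
```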
    81  
# Binary Tree
A binary tree is a tree in which each node has at most two children. Each node holds:
- a data item
- the address of its left child
- the address of its right child

## Full Binary Tree (02 tree)
Every parent node/internal node has either two or no children.
    89  
    90  
### Full Binary Tree Theorems
Let
- i = the number of internal nodes
- n = the total number of nodes
- l = the number of leaves
- lambda = the number of levels

Then:
1. The number of leaves is l = i + 1.
2. The total number of nodes is n = 2i + 1.
3. The number of internal nodes is i = (n - 1) / 2.
4. The number of leaves is l = (n + 1) / 2.
5. The total number of nodes is n = 2l - 1.
6. The number of internal nodes is i = l - 1.
7. The number of leaves is at most 2^(lambda - 1).
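
As a quick sanity check of these relations, the sketch below counts internal nodes and leaves in one small full binary tree. The nested-tuple encoding (a leaf is `()`, an internal node is a `(left, right)` pair) and the function name `count` are assumptions chosen to keep the example short.

```python
def count(node):
    """Return (internal_nodes, leaves) for a full binary tree.

    A leaf is the empty tuple (); an internal node is a (left, right) pair.
    """
    if node == ():
        return 0, 1
    left, right = node
    left_internal, left_leaves = count(left)
    right_internal, right_leaves = count(right)
    return left_internal + right_internal + 1, left_leaves + right_leaves


leaf = ()
tree = ((leaf, leaf), ((leaf, leaf), leaf))  # 4 levels deep
i, l = count(tree)
n = i + l
print(i, l, n)               # 4 5 9
print(l == i + 1)            # True
print(n == 2 * i + 1)        # True
print(l <= 2 ** (4 - 1))     # True, since the tree has lambda = 4 levels
```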
   105  
## Perfect Binary Tree (2 tree)
Every internal node has exactly two child nodes and all the leaf nodes are at the same level.

All the internal nodes have a degree of 2.

A perfect binary tree can be defined recursively:
1. A single node with no children is a perfect binary tree of height h = 0.
2. A node with height h > 0 is the root of a perfect binary tree if both of its subtrees are perfect binary trees of height h - 1 and are non-overlapping.
   113  
### Perfect Binary Tree Theorems
1. A perfect binary tree of height h has 2^(h + 1) - 1 nodes.
2. A perfect binary tree with n nodes has height log2(n + 1) - 1 = Θ(log n).
3. A perfect binary tree of height h has 2^h leaf nodes.
4. The average depth of a node in a perfect binary tree is Θ(log n).
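
The first three formulas can be tabulated directly. This is a small illustrative sketch; the function names `perfect_tree_counts` and `height_from_nodes` are made up for the example.

```python
import math


def perfect_tree_counts(h):
    """Return (nodes, leaves) of a perfect binary tree of height h."""
    return 2 ** (h + 1) - 1, 2 ** h


def height_from_nodes(n):
    """Height of a perfect binary tree with n nodes (n + 1 must be a power of two)."""
    return int(math.log2(n + 1)) - 1


for h in range(4):
    nodes, leaves = perfect_tree_counts(h)
    print(h, nodes, leaves, height_from_nodes(nodes))
# 0 1 1 0
# 1 3 2 1
# 2 7 4 2
# 3 15 8 3
```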
   119  
## Complete Binary Tree (021 tree)
A complete binary tree is just like a full binary tree, but with three major differences:
1. Every level, except possibly the last, must be completely filled.
2. All the leaf elements must lean towards the left.
3. The last leaf element might not have a right sibling (i.e. a complete binary tree doesn't have to be a full binary tree).
   125  
## Degenerate or Pathological Tree
A tree in which every parent node has a single child, either left or right.

## Skewed Binary Tree
A pathological/degenerate tree that is dominated either by the left nodes or by the right nodes. There are two types:
- left-skewed binary tree
- right-skewed binary tree
   133  
## Balanced Binary Tree (-1 tree)
A balanced (height-balanced) binary tree is one in which the difference between the heights of the left and the right subtree of every node is either 0 or 1.

df = abs(height of left subtree - height of right subtree)
   138  
### Conditions for a Height-Balanced Binary Tree
1. The difference between the heights of the left and the right subtree of any node is not more than one.
2. The left subtree is balanced.
3. The right subtree is balanced.
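
The three conditions translate almost directly into a recursive check. The sketch below is one possible way to write it; the `Node` class and the helper names `check` and `is_balanced` are assumptions for the example. It returns the height together with the verdict so that every subtree is visited only once.

```python
class Node:
    """Hypothetical binary tree node used only for this illustration."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def check(node):
    """Return (balanced, height); an empty subtree has height -1."""
    if node is None:
        return True, -1
    left_ok, left_height = check(node.left)
    right_ok, right_height = check(node.right)
    balanced = left_ok and right_ok and abs(left_height - right_height) <= 1
    return balanced, 1 + max(left_height, right_height)


def is_balanced(root):
    return check(root)[0]


balanced = Node(2, Node(1), Node(3))
skewed = Node(1, None, Node(2, None, Node(3)))  # right-skewed chain
print(is_balanced(balanced), is_balanced(skewed))  # True False
```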
   143  
### Applications
- AVL tree
- balanced binary search tree
   147  
## AVL Tree
An AVL tree is a self-balancing binary search tree in which each node maintains extra information called a balance factor, whose value is either -1, 0, or +1.

Inventors: Georgy Adelson-Velsky and Evgenii Landis
   152  
### Balance Factor
The balance factor of a node in an AVL tree is the difference between the height of the left subtree and that of the right subtree of that node.

bf = height of left subtree - height of right subtree

or

bf = height of right subtree - height of left subtree

Either convention works as long as it is applied consistently. The insertion and deletion steps in this section use the first one (left minus right).
   159  
### Rotating the Subtrees in an AVL Tree
- Left rotate

  The arrangement of the nodes on the right is transformed into the arrangement on the left node: the right child y of node x moves up to take x's place, x becomes the left child of y, and the former left subtree of y becomes the right subtree of x.

  Initial:
  - root -> x node
  - x -> left: alpha node
  - x -> right: y node
  - y -> left: beta node
  - y -> right: gamma node

  After left-rotate:
  - root -> y node
  - y -> left: x node
  - y -> right: gamma node
  - x -> left: alpha node
  - x -> right: beta node
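
The before/after listing for the left rotation corresponds to just three pointer updates. A minimal sketch, assuming a bare `Node` class with `key`, `left`, and `right` fields (names chosen for the example):

```python
class Node:
    """Hypothetical binary tree node used only for this illustration."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def left_rotate(x):
    """Rotate the subtree rooted at x to the left and return its new root."""
    y = x.right
    x.right = y.left  # beta moves from y's left to x's right
    y.left = x        # x becomes the left child of y
    return y


x = Node("x", Node("alpha"), Node("y", Node("beta"), Node("gamma")))
root = left_rotate(x)
print(root.key, root.left.key, root.right.key)      # y x gamma
print(root.left.left.key, root.left.right.key)      # alpha beta
```

The caller is responsible for attaching the returned node where `x` used to hang, which is why the function returns the new subtree root.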
   177  
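- Right rotate

  The arrangement of the nodes on the left is transformed into the arrangement on the right node. It is the mirror image of a left rotation: the left child moves up, the old root becomes its right child, and the former right subtree of the left child becomes the left subtree of the old root.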
   180    
- Left-right and right-left rotate

  In a left-right rotation, the arrangements are first shifted to the left and then to the right.

  Initial:
  - p -> z node
  - z -> left: x node
  - z -> right: delta node
  - x -> left: alpha node
  - x -> right: y node
  - y -> left: beta node
  - y -> right: gamma node

  1. Do a left rotation on x-y (the subtree rooted at z is now left-skewed):
     - p -> z node
     - z -> left: y node
     - z -> right: delta node
     - y -> left: x node
     - y -> right: gamma node
     - x -> left: alpha node
     - x -> right: beta node

  2. Do a right rotation on y-z:
     - p -> y node
     - y -> left: x node
     - y -> right: z node
     - x -> left: alpha node
     - x -> right: beta node
     - z -> left: gamma node
     - z -> right: delta node

  In a right-left rotation, the arrangements are first shifted to the right and then to the left.
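
The two-step left-right rotation can be sketched by composing the two single rotations. The `Node` class and function names below are assumptions for the example; the node labels match the listing above.

```python
class Node:
    """Hypothetical binary tree node used only for this illustration."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def left_rotate(x):
    y = x.right
    x.right = y.left
    y.left = x
    return y


def right_rotate(z):
    y = z.left
    z.left = y.right
    y.right = z
    return y


def left_right_rotate(z):
    """Left-rotate z's left child, then right-rotate z itself."""
    z.left = left_rotate(z.left)
    return right_rotate(z)


def inorder(node):
    return inorder(node.left) + [node.key] + inorder(node.right) if node else []


y = Node("y", Node("beta"), Node("gamma"))
x = Node("x", Node("alpha"), y)
z = Node("z", x, Node("delta"))

top = left_right_rotate(z)
print(top.key, top.left.key, top.right.key)  # y x z
print(inorder(top))
# ['alpha', 'x', 'beta', 'y', 'gamma', 'z', 'delta']
```

The inorder sequence is the same before and after, which is what makes rotations safe to use on a binary search tree.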
   213  
### Insert a New Node
A new node is always inserted as a leaf node with a balance factor equal to 0.

1. Go to the appropriate leaf node to insert the new node using the following recursive steps. Compare the new key with the root key of the current tree.
   - If new key < root key, call the insertion algorithm on the left subtree of the current node until the leaf node is reached.
   - Else if new key > root key, call the insertion algorithm on the right subtree of the current node until the leaf node is reached.
   - Else, return the leaf node.
2. Compare the leaf key obtained from the above steps with the new key.
   - If new key < leaf key, make the new node the left child of the leaf node.
   - Else, make the new node the right child of the leaf node.
3. Update the balance factor of the nodes.
4. If the nodes are unbalanced, rebalance them.
   - If balance factor > 1, the height of the left subtree is greater than that of the right subtree. So, do a right rotation or a left-right rotation.
     - If new node key < left child key, do a right rotation.
     - Else, do a left-right rotation.
   - If balance factor < -1, the height of the right subtree is greater than that of the left subtree. So, do a left rotation or a right-left rotation.
     - If new node key > right child key, do a left rotation.
     - Else, do a right-left rotation.
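
The insertion steps can be sketched as a recursive function. This is one possible illustration, not the implementation in this repository: the `Node` class, the stored `height` field, and all function names are assumptions. It uses the balance factor convention left minus right and silently ignores duplicate keys.

```python
class Node:
    """Hypothetical AVL node used only for this illustration."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 0  # a new leaf has height 0


def height(node):
    return node.height if node else -1  # an empty subtree counts as -1


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def balance_factor(node):
    return height(node.left) - height(node.right)


def right_rotate(z):
    y = z.left
    z.left, y.right = y.right, z
    update(z)
    update(y)
    return y


def left_rotate(z):
    y = z.right
    z.right, y.left = y.left, z
    update(z)
    update(y)
    return y


def insert(node, key):
    """Insert key into the subtree rooted at node and return the new subtree root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    else:
        return node  # duplicate key: nothing to do
    update(node)
    bf = balance_factor(node)
    if bf > 1:  # left subtree is too tall
        if key > node.left.key:  # left-right case
            node.left = left_rotate(node.left)
        return right_rotate(node)
    if bf < -1:  # right subtree is too tall
        if key < node.right.key:  # right-left case
            node.right = right_rotate(node.right)
        return left_rotate(node)
    return node


def inorder(node):
    return inorder(node.left) + [node.key] + inorder(node.right) if node else []


root = None
for key in [1, 2, 3, 4, 5, 6, 7]:
    root = insert(root, key)
print(root.key, root.height, inorder(root))  # 4 2 [1, 2, 3, 4, 5, 6, 7]
```

Inserting the keys in ascending order would produce a right-skewed chain in a plain binary search tree; here the rotations keep the height at 2.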
   232      
### Delete a Node
A node is always deleted as a leaf node. After deleting a node, the balance factors of the nodes change. To restore the balance factors, suitable rotations are performed.

1. Locate the node to be deleted (recursion is used to find it).
2. There are three cases for deleting a node:
   - If the node to be deleted is a leaf node, remove it.
   - If the node to be deleted has one child, substitute its contents with those of the child and remove the child.
   - If the node to be deleted has two children, find its inorder successor w (i.e. the node with the minimum key in the right subtree), substitute the contents of the node to be deleted with those of w, and remove w.
3. Update the balance factor of the nodes.
4. Rebalance the tree if the balance factor of any of the nodes is not equal to -1, 0, or 1.
   - If the balance factor of the current node > 1:
     - If the balance factor of the left child >= 0, do a right rotation.
     - Else, do a left-right rotation.
   - If the balance factor of the current node < -1:
     - If the balance factor of the right child <= 0, do a left rotation.
     - Else, do a right-left rotation.
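
A sketch of the deletion steps, under the same caveats as any illustration here: the `Node` class with a stored `height`, and the names `rebalance` and `delete`, are assumptions, and the balance factor is taken as left minus right. Unlike insertion, the rotation is chosen from the balance factor of the child rather than from the deleted key.

```python
def height(node):
    return node.height if node else -1  # an empty subtree counts as -1


class Node:
    """Hypothetical AVL node; the height is computed from the given children."""

    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.height = 1 + max(height(left), height(right))


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def balance_factor(node):
    return height(node.left) - height(node.right)


def right_rotate(z):
    y = z.left
    z.left, y.right = y.right, z
    update(z)
    update(y)
    return y


def left_rotate(z):
    y = z.right
    z.right, y.left = y.left, z
    update(z)
    update(y)
    return y


def rebalance(node):
    update(node)
    bf = balance_factor(node)
    if bf > 1:
        if balance_factor(node.left) < 0:  # left-right case
            node.left = left_rotate(node.left)
        return right_rotate(node)
    if bf < -1:
        if balance_factor(node.right) > 0:  # right-left case
            node.right = right_rotate(node.right)
        return left_rotate(node)
    return node


def delete(node, key):
    """Delete key from the subtree rooted at node and return the new subtree root."""
    if node is None:
        return None
    if key < node.key:
        node.left = delete(node.left, key)
    elif key > node.key:
        node.right = delete(node.right, key)
    elif node.left is None or node.right is None:
        return node.left or node.right  # leaf or single-child case
    else:
        succ = node.right  # inorder successor: minimum of the right subtree
        while succ.left is not None:
            succ = succ.left
        node.key = succ.key
        node.right = delete(node.right, succ.key)
    return rebalance(node)


#   2                3
#  / \              / \
# 1   3     ->     2   4      after deleting 1
#      \
#       4
root = Node(2, Node(1), Node(3, None, Node(4)))
root = delete(root, 1)
print(root.key, root.left.key, root.right.key)  # 3 2 4
```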
   249  
## B-tree
A B-tree is a self-balancing search tree in which each node can contain more than one key and can have more than two children. It is a generalized form of the binary search tree.

It is also known as a height-balanced m-way tree.

### Why B-tree?
The need for the B-tree arose from the need to spend less time accessing physical storage media such as hard disks. Secondary storage devices are slower but have a larger capacity, so there was a need for data structures that minimize disk accesses.

Other data structures such as the binary search tree, AVL tree, and red-black tree can store only one key per node. If you have to store a large number of keys, the height of such trees becomes very large and the access time increases.

A B-tree, however, can store many keys in a single node and can have multiple child nodes. This decreases the height significantly, allowing faster disk accesses.
   261  
### B-tree Properties
1. For each node x, the keys are stored in increasing order.
2. In each node, there is a boolean value x.leaf which is true if x is a leaf.
3. If n is the order of the tree, each internal node can contain at most n - 1 keys along with a pointer to each child.
4. Each internal node except the root can have at most n children and at least ⌈n / 2⌉ children.
5. All leaves have the same depth (i.e. the height h of the tree).
6. The root has at least 2 children (unless it is a leaf) and contains a minimum of 1 key.
7. Here n denotes the number of keys rather than the order: if n >= 1, then for any n-key B-tree of height h and minimum degree t >= 2, h <= log_t((n + 1) / 2).
   270  
### Searching
To search for a key k:
1. Starting from the root node, compare k with the first key of the node. If k = the first key of the node, return the node and the index.
2. If the node is a leaf (x.leaf = true) and k is not among its keys, return null (not found).
3. If k < the first key of the node, search the left child of this key recursively.
4. If there is more than one key in the current node and k > the first key, compare k with the next key in the node.
   - If k < the next key, search the left child of this key (i.e. k lies between the first and the second keys).
   - Else, search the right child of the key.
5. Repeat steps 1 to 4 until the leaf is reached.
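
The search steps can be sketched as follows. The `BTreeNode` class and its `keys`/`children` layout are assumptions made for this example (child `i` holds the keys smaller than `keys[i]`, and the last child holds the keys larger than every key in the node).

```python
class BTreeNode:
    """Hypothetical B-tree node used only for this illustration."""

    def __init__(self, keys, children=None):
        self.keys = keys                # sorted list of keys
        self.children = children or []  # empty for a leaf

    @property
    def leaf(self):
        return not self.children


def search(node, k):
    """Return (node, index) of key k, or None if k is not in the tree."""
    i = 0
    while i < len(node.keys) and k > node.keys[i]:
        i += 1  # skip keys smaller than k
    if i < len(node.keys) and k == node.keys[i]:
        return node, i
    if node.leaf:
        return None
    return search(node.children[i], k)  # k lies between keys[i - 1] and keys[i]


root = BTreeNode(
    [10, 20],
    [BTreeNode([5]), BTreeNode([12, 17]), BTreeNode([30])],
)
found = search(root, 17)
print(found[0].keys, found[1])  # [12, 17] 1
print(search(root, 25))         # None
```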
   279  
### Insertion
Inserting an element into a B-tree consists of two events: searching for the appropriate node to insert the element and splitting the node if required. The insertion operation always takes place bottom-up: the key goes into a leaf, and splits propagate towards the root.

1. If the tree is empty, allocate a root node and insert the key.
2. Update the allowed number of keys in the node.
3. Search for the appropriate node for insertion.
4. If the node is full:
   - Insert the element in increasing order.
   - The node now holds more elements than its limit, so split it at the median.
   - Push the median key upwards, and make the left keys a left child and the right keys a right child.
5. If the node is not full, insert the element in increasing order.
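
Below is a compact sketch of bottom-up insertion with a split at the median. It is an illustration under stated assumptions, not this repository's implementation: the `BTreeNode` class, the `MAX_KEYS = 2` limit (a tree of order 3), and the function names are made up for the example, and keys are assumed to be distinct.

```python
import bisect

MAX_KEYS = 2  # assumption: order-3 tree, so a node holds at most 2 keys


class BTreeNode:
    """Hypothetical B-tree node used only for this illustration."""

    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []  # empty for a leaf


def _insert(node, key):
    """Insert key below node. Return (median, right_node) if node was split, else None."""
    i = bisect.bisect_left(node.keys, key)
    if not node.children:  # leaf: insert in increasing order
        node.keys.insert(i, key)
    else:
        split = _insert(node.children[i], key)
        if split:  # the child overflowed: absorb its median
            median, right = split
            node.keys.insert(i, median)
            node.children.insert(i + 1, right)
    if len(node.keys) <= MAX_KEYS:
        return None
    mid = len(node.keys) // 2  # split at the median
    median = node.keys[mid]
    right = BTreeNode(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return median, right


def insert(root, key):
    """Insert key and return the (possibly new) root."""
    if root is None:
        return BTreeNode([key])
    split = _insert(root, key)
    if split:  # the root itself split: the tree grows by one level
        median, right = split
        return BTreeNode([median], [root, right])
    return root


def keys_inorder(node):
    if not node.children:
        return list(node.keys)
    out = []
    for i, key in enumerate(node.keys):
        out += keys_inorder(node.children[i]) + [key]
    return out + keys_inorder(node.children[-1])


root = None
for key in range(1, 8):
    root = insert(root, key)
print(root.keys, [child.keys for child in root.children])  # [4] [[2], [6]]
print(keys_inorder(root))  # [1, 2, 3, 4, 5, 6, 7]
```

The tree only gains height when the root splits, which is why all leaves stay at the same depth.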
   292  
### Deletion
Deleting an element from a B-tree consists of 3 main events:
1. searching for the node where the key to be deleted exists
2. deleting the key
3. balancing the tree if required

While deleting a key, a condition called underflow may occur. Underflow occurs when a node contains fewer than the minimum number of keys it should hold.

The terms to be understood before studying the deletion operation are:
1. Inorder predecessor: the largest key in the left child (subtree) of a node is called its inorder predecessor.
2. Inorder successor: the smallest key in the right child (subtree) of a node is called its inorder successor.
   306  
#### Deletion Operation
Before going through the steps below, one must know these facts about a B-tree of degree m (the values in parentheses are for m = 3, with m/2 rounded up):
1. A node can have a maximum of m children (i.e. 3).
2. A node can contain a maximum of m - 1 keys (i.e. 2).
3. A node should have a minimum of ⌈m/2⌉ children (i.e. 2).
4. A node (except the root node) should contain a minimum of ⌈m/2⌉ - 1 keys (i.e. 1).
   318  
Case 1

The key to be deleted lies in a leaf. There are 2 cases for it.
1. The deletion of the key does not violate the minimum number of keys a node should hold. The key is simply removed.
2. The deletion of the key violates the minimum number of keys a node should hold. In this case, we borrow a key from an immediate neighboring sibling node, trying the left one first and then the right one.

   First, visit the immediate left sibling. If the left sibling node has more than the minimum number of keys, borrow a key from this node.
   Else, check whether a key can be borrowed from the immediate right sibling node.

   If both immediate sibling nodes already have the minimum number of keys, merge the node with either the left sibling node or the right sibling node. The merging is done through the parent node.

Case 2

If the key to be deleted lies in an internal node, the following cases occur.
1. The deleted key is replaced by its inorder predecessor if the left child has more than the minimum number of keys.
2. The deleted key is replaced by its inorder successor if the right child has more than the minimum number of keys.
3. If both children have exactly the minimum number of keys, merge the left and the right children.

After merging, if the parent node has fewer than the minimum number of keys, look for a sibling as in Case 1.

Case 3

In this case, the height of the tree shrinks. If the target key lies in an internal node, and the deletion of the key leaves the node with fewer keys than the minimum required, look for the inorder predecessor and the inorder successor. If both children contain only the minimum number of keys, borrowing cannot take place. This leads to the merge step of Case 2, i.e. merging the children.

Again, look for a sibling to borrow a key from. But if the sibling also has only the minimum number of keys, merge the node with the sibling along with the parent, and arrange the children accordingly (in increasing order).
   340  
## Red-Black Tree
   342  2-3 tree
   343  
A red-black tree is a self-balancing binary search tree in which each node contains an extra bit denoting the color of the node, either red or black.

A red-black tree satisfies the following properties:
1. Red/black property: every node is colored either red or black.
2. Root property: the root is black.
3. Leaf property: every leaf (nil) is black.
4. Red property: if a red node has children, the children are always black.
5. Depth property: for each node, any simple path from this node to any of its descendant leaves has the same black-depth (the number of black nodes).
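
The properties can be checked mechanically on a given tree. The sketch below is an illustration only: the `Node` class, the color constants, and the function names are assumptions. `None` stands in for a black nil leaf, and the binary-search-tree ordering of the keys is deliberately not checked.

```python
RED, BLACK = "red", "black"


class Node:
    """Hypothetical red-black node used only for this illustration."""

    def __init__(self, key, color, left=None, right=None):
        self.key = key
        self.color = color
        self.left = left
        self.right = right


def black_height(node):
    """Return the black-height of the subtree; raise ValueError on a violation."""
    if node is None:
        return 1  # a nil leaf is black
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                raise ValueError("red property violated")
    left = black_height(node.left)
    right = black_height(node.right)
    if left != right:
        raise ValueError("depth property violated")
    return left + (1 if node.color == BLACK else 0)


def is_red_black(root):
    if root is not None and root.color != BLACK:
        return False  # root property violated
    try:
        black_height(root)
    except ValueError:
        return False
    return True


valid = Node(2, BLACK, Node(1, RED), Node(3, RED))
red_root = Node(2, RED, Node(1, BLACK), Node(3, BLACK))
uneven = Node(2, BLACK, Node(1, BLACK), None)
print(is_red_black(valid), is_red_black(red_root), is_red_black(uneven))
# True False False
```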
   353  
Each node has the following attributes: color, key, leftChild, rightChild, and parent (except the root node).
   356  
### How Does the Red-Black Tree Maintain the Property of Self-Balancing?
The red-black coloring is meant for balancing the tree.

The limitations put on the node colors ensure that no simple path from the root to a leaf is more than twice as long as any other such path. This keeps the tree approximately balanced.
   361  
### Operations on a Red-Black Tree
Rotating the subtrees in a red-black tree

In a rotation operation, the positions of the nodes of a subtree are interchanged. Rotation is used for maintaining the properties of a red-black tree when they are violated by other operations such as insertion and deletion.

#### Left Rotate
The arrangement of the nodes on the right is transformed into the arrangement on the left node.

#### Right Rotate
The arrangement of the nodes on the left is transformed into the arrangement on the right node.

#### Left-Right Rotate
The arrangements are first shifted to the left and then to the right.

#### Right-Left Rotate
The arrangements are first shifted to the right and then to the left.
   378  
#### Inserting an Element into a Red-Black Tree
While inserting a new node, the new node is always inserted as a red node. After the insertion, if the tree violates the properties of the red-black tree, we do the following operations:
1. recolor
2. rotation

#### Insert Algorithm
1. Let y be the leaf (i.e. nil) and x be the root of the tree.
2. Check if the tree is empty (i.e. whether x is nil). If yes, insert newNode as the root node and color it black.
3. Else, repeat the following steps until a leaf (nil) is reached.
   - Compare newKey with rootKey.
   - If newKey is greater than rootKey, traverse through the right subtree.
   - Else, traverse through the left subtree.
4. Assign the parent of the leaf as the parent of newNode.
5. If leafKey is greater than newKey, make newNode the leftChild.
6. Else, make newNode the rightChild.
7. Assign null to the leftChild and rightChild of newNode.
8. Assign the red color to newNode.
9. Call the InsertFix algorithm to maintain the properties of the red-black tree if they are violated.
   397  
Why are newly inserted nodes always red in a red-black tree?

This is because inserting a red node does not violate the depth property of a red-black tree. If you attach a red node to a red node, the red property is violated, but this problem is easier to fix than the one introduced by violating the depth property.
   401  
Algorithm to maintain the red-black property after insertion

This algorithm is used for maintaining the properties of a red-black tree if the insertion of newNode violates them.
1. Do the following while the parent p of newNode is RED.
2. If p is the left child of the grandparent gP of newNode, do the following.
   - Case 1: if the right child of gP (the uncle of newNode) is RED, set the color of both children of gP to BLACK and the color of gP to RED, then assign gP to newNode.
   - Case 2: else, if newNode is the right child of p, assign p to newNode and left-rotate newNode. Continue with Case 3, where p and gP refer to the parent and grandparent of the updated newNode.
   - Case 3: set the color of p to BLACK and the color of gP to RED, then right-rotate gP.
3. Else (p is the right child of gP), do the following.
   - If the left child of gP is RED, set the color of both children of gP to BLACK and the color of gP to RED, then assign gP to newNode.
   - Else, if newNode is the left child of p, assign p to newNode and right-rotate newNode.
   - Set the color of p to BLACK and the color of gP to RED, then left-rotate gP.
4. Set the color of the root of the tree to BLACK.
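
The insertion and fix-up steps can be sketched as below. This is an illustration under stated assumptions rather than this repository's implementation: the class and method names are made up, `None` plays the role of the black nil leaf, duplicate keys go to the right, and a single `_rotate` helper covers both the left and the right rotation so that the mirrored branch does not have to be written twice.

```python
RED, BLACK = "red", "black"


class Node:
    """Hypothetical red-black node used only for this illustration."""

    def __init__(self, key):
        self.key = key
        self.color = RED  # new nodes always start red
        self.left = self.right = self.parent = None


class RedBlackTree:
    def __init__(self):
        self.root = None

    def _rotate(self, x, side):
        """Rotate x towards side ("left" or "right"); its opposite child moves up."""
        other = "right" if side == "left" else "left"
        y = getattr(x, other)
        inner = getattr(y, side)
        setattr(x, other, inner)
        if inner is not None:
            inner.parent = x
        y.parent = x.parent
        if x.parent is None:
            self.root = y
        elif x is x.parent.left:
            x.parent.left = y
        else:
            x.parent.right = y
        setattr(y, side, x)
        x.parent = y

    def insert(self, key):
        node = Node(key)
        parent, cur = None, self.root
        while cur is not None:  # walk down to a nil leaf
            parent = cur
            cur = cur.left if key < cur.key else cur.right
        node.parent = parent
        if parent is None:
            self.root = node
        elif key < parent.key:
            parent.left = node
        else:
            parent.right = node
        self._insert_fix(node)

    def _insert_fix(self, node):
        while node.parent is not None and node.parent.color == RED:
            p = node.parent
            gp = p.parent  # exists, because a red parent is never the root
            side = "left" if p is gp.left else "right"
            other = "right" if side == "left" else "left"
            uncle = getattr(gp, other)
            if uncle is not None and uncle.color == RED:  # case 1: recolor
                p.color = uncle.color = BLACK
                gp.color = RED
                node = gp
                continue
            if node is getattr(p, other):  # case 2: straighten the zig-zag
                node = p
                self._rotate(node, side)
                p = node.parent
            p.color = BLACK  # case 3: recolor and rotate the grandparent
            gp.color = RED
            self._rotate(gp, other)
        self.root.color = BLACK


t = RedBlackTree()
for key in [10, 20, 30, 15]:
    t.insert(key)
print(t.root.key, t.root.color)                        # 20 black
print(t.root.left.key, t.root.left.color)              # 10 black
print(t.root.left.right.key, t.root.left.right.color)  # 15 red
```

Inserting 30 triggers case 3 (a left rotation around 10), and inserting 15 triggers case 1 (recoloring 10 and 30 to black).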
   422  
Algorithm to delete a node
1. Save the color of nodeToBeDeleted in originalColor.
2. If the left child of nodeToBeDeleted is null:
   - Assign the right child of nodeToBeDeleted to x.
   - Transplant nodeToBeDeleted with x.
3. Else, if the right child of nodeToBeDeleted is null:
   - Assign the left child of nodeToBeDeleted to x.
   - Transplant nodeToBeDeleted with x.
4. Else:
   - Assign the minimum of the right subtree of nodeToBeDeleted to y.
   - Save the color of y in originalColor.
   - Assign the rightChild of y to x.
   - If y is a child of nodeToBeDeleted, set the parent of x to y.
   - Else, transplant y with the rightChild of y.
   - Transplant nodeToBeDeleted with y.
   - Set the color of y to the color of nodeToBeDeleted.
5. If originalColor is BLACK, call DeleteFix(x).
   440  
Algorithm to maintain the red-black property after deletion

This algorithm is applied when a black node is deleted, because that violates the black-depth property of the red-black tree.

The violation is corrected by assuming that node x (which now occupies y's original position) has an extra black. This makes node x neither red nor black: it is either doubly black or black-and-red, which violates the red/black property.

However, the color attribute of x is not changed; rather, the extra black is represented by x pointing to the node.

The extra black can be removed if:
1. It reaches the root node.
2. x points to a red-and-black node. In this case, x is colored black.
3. Suitable rotations and recoloring are performed.
   452  
The following algorithm retains the properties of a red-black tree.
1. Do the following while x is not the root of the tree and the color of x is BLACK.
2. If x is the left child of its parent:
   - Assign the sibling of x (the right child of the parent of x) to w.
   - Case 1: if w is RED:
     - Set the color of w to BLACK.
     - Set the color of the parent of x to RED.
     - Left-rotate the parent of x.
     - Assign the rightChild of the parent of x to w.
   - Case 2: if the colors of both the rightChild and the leftChild of w are BLACK:
     - Set the color of w to RED.
     - Assign the parent of x to x.
   - Case 3: else, if the color of the rightChild of w is BLACK:
     - Set the color of the leftChild of w to BLACK.
     - Set the color of w to RED.
     - Right-rotate w.
     - Assign the rightChild of the parent of x to w, then continue with Case 4.
   - Case 4: if none of the above cases apply (or after Case 3):
     - Set the color of w to the color of the parent of x.
     - Set the color of the parent of x to BLACK.
     - Set the color of the rightChild of w to BLACK.
     - Left-rotate the parent of x.
     - Set x to the root of the tree.
3. Else, do the same as above with right changed to left and vice versa.
4. Set the color of x to BLACK.