# Research

This document contains information that has been researched during the course of
creating Simple IoT.

## Synchronization

An IoT system is inherently distributed. At a minimum, there are three
components:

1. device (Go, C, etc.)
1. cloud (Go)
1. multiple browsers (Elm, JS)

Data can be changed in any of the above locations and must be seamlessly
synchronized to the other locations. Failing to consider this simple requirement
early in building the system can make for brittle and overly complex systems.

### The moving target problem

As long as the connection between instances is solid, they will stay
synchronized as each instance will receive all points it is interested in.
Therefore, verifying synchronization by comparing Node hashes is a backup
mechanism that allows us to see what changed while disconnected. The root hash
for a downstream instance changes every time anything in that system changes.
This is very useful in that you only need to compare one value to ensure your
entire config is synchronized, but it is also a disadvantage in that the
top-level hash changes more often, so you are trying to compare two moving
targets. This is not a problem if things are changing slowly enough that a
comparison can complete between changes. However, this also limits the data
rates to which we can scale.

Some systems use a concept called Merkle clocks, where events are stored in a
Merkle DAG, existing nodes in the DAG are immutable, and new events are always
added as parents to existing events. An immutable DAG has an advantage in that
you can always work back in history, which never changes. The SIOT Node tree is
mutable by definition. Actual Budget uses a similar concept in that it
[uses a Merkle Trie](https://github.com/actualbudget/actual/discussions/257) to
represent events in time and then prunes the tree as time goes on.

We could create a separate structure to sync all events (points), but that would
require a separate structure on the server for every downstream device and seems
overly complex.

Is it critical that we see all historical data? In an IoT system, there are
essentially two sets of data -- current state/config, and historical data. The
current state is most critical for most things, but historical data may be used
for some algorithms and viewed by users. The volume of data makes it impractical
to store all data in resource-constrained edge systems. However, maybe it's a
mistake to separate these two, as synchronizing all data might simplify the
system.

One way to handle the moving target problem is to store an array of previous
hashes for the device node in both instances -- perhaps for as long as the
synchronization interval. The downstream could then fetch the upstream hash
array and see if any of the entries match an entry in the downstream array. This
would help cover the case where there may be some time difference when things
get updated, but the history should be similar. If there is a hash in history
that matches, then the two instances are probably in sync.

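The hash-history idea above can be sketched as follows. This is a minimal illustration, not Simple IoT code: the `HashHistory` type, its methods, and the use of `uint32` hashes are all assumptions made for the example.

```go
package main

import "fmt"

// HashHistory keeps the most recent hashes for a node, oldest first.
// The type and its methods are hypothetical, not part of Simple IoT.
type HashHistory struct {
	max    int
	hashes []uint32
}

// NewHashHistory returns a history that retains at most max hashes.
func NewHashHistory(max int) *HashHistory {
	return &HashHistory{max: max}
}

// Add records a new hash and drops the oldest entry once the history is full.
func (h *HashHistory) Add(hash uint32) {
	h.hashes = append(h.hashes, hash)
	if len(h.hashes) > h.max {
		h.hashes = h.hashes[1:]
	}
}

// CommonHash returns the newest hash in h that also appears in other.
// ok is false when the two histories share no hash.
func (h *HashHistory) CommonHash(other *HashHistory) (hash uint32, ok bool) {
	seen := make(map[uint32]struct{}, len(other.hashes))
	for _, v := range other.hashes {
		seen[v] = struct{}{}
	}
	for i := len(h.hashes) - 1; i >= 0; i-- {
		if _, found := seen[h.hashes[i]]; found {
			return h.hashes[i], true
		}
	}
	return 0, false
}

func main() {
	up := NewHashHistory(4)
	down := NewHashHistory(4)
	for _, v := range []uint32{10, 11, 12} {
		up.Add(v)
	}
	// The downstream has one more change that has not reached upstream yet.
	for _, v := range []uint32{10, 11, 12, 13} {
		down.Add(v)
	}
	hash, ok := down.CommonHash(up)
	fmt.Println(hash, ok) // prints "12 true"
}
```

A match only shows that both sides passed through the same state recently, so it is a heuristic rather than proof that the current states are identical.
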
Another approach would be to track metrics on how often the top-level hash is
updating -- if it is too often, then perhaps the system needs to be tuned.

There could also be some type of stop-the-world lock where both systems stop
processing new nodes during the sync operation. However, if they are not in
sync, this probably won't help and definitely hurts scalability.

### Resgate

[resgate.io](https://resgate.io) is an interesting project that solves the
problem of creating a real-time API gateway where web clients are synchronized
seamlessly. This project uses NATS.io as its backbone, which makes it
interesting as NATS is also core to Simple IoT.

The Resgate system is primarily concerned with synchronizing browser contents.

### CouchDB/PouchDB

CouchDB and PouchDB have some interesting ideas.

### Merkle Trees

- https://medium.com/@rkkautsar/synchronizing-your-hierarchical-data-with-merkle-tree-dbfe37db3ab7
- https://en.wikipedia.org/wiki/Merkle_tree
- https://jack-vanlightly.com/blog/2016/10/24/exploring-the-use-of-hash-trees-for-data-synchronization-part-1
- https://www.codementor.io/blog/merkle-trees-5h9arzd3n8
  - Version control systems like Git and Mercurial use specialized Merkle
    trees to manage versions of files and even directories. One advantage of
    using Merkle trees in version control systems is that we can simply
    compare hashes of files and directories between two commits to know if
    they've been modified or not, which is quite fast.
  - NoSQL distributed database systems like Apache Cassandra and Amazon
    DynamoDB use Merkle trees to detect inconsistencies between data replicas.
    This process of repairing the data by comparing all replicas and updating
    each one of them to the newest version is also called anti-entropy repair.
    The process is also described in
    [Cassandra's documentation](https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsRepairNodesManualRepair.html).

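The hash comparison these articles describe can be sketched in Go. This is a minimal illustration under stated assumptions: the `Node` type, the `Diff` and `childKeys` helpers, and the choice of SHA-256 are hypothetical and do not reflect how Simple IoT, Git, or Cassandra implement it.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// Node is a hypothetical tree node: a value plus named children.
type Node struct {
	Value    string
	Children map[string]*Node
}

// childKeys returns the sorted union of the child names of the given nodes.
func childKeys(nodes ...*Node) []string {
	set := map[string]struct{}{}
	for _, n := range nodes {
		for k := range n.Children {
			set[k] = struct{}{}
		}
	}
	keys := make([]string, 0, len(set))
	for k := range set {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// Hash returns the Merkle hash of n: a digest over the hash of its value
// and, for each child in sorted name order, the hash of the child's name
// followed by the child's own Merkle hash.
func (n *Node) Hash() [32]byte {
	h := sha256.New()
	vh := sha256.Sum256([]byte(n.Value))
	h.Write(vh[:])
	for _, k := range childKeys(n) {
		kh := sha256.Sum256([]byte(k))
		ch := n.Children[k].Hash()
		h.Write(kh[:])
		h.Write(ch[:])
	}
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

// Diff returns the paths of nodes whose value differs between a and b, or
// that exist on only one side. Subtrees with equal hashes are skipped.
// Hashes are recomputed on every call; a real system would cache them.
func Diff(path string, a, b *Node) []string {
	if a == nil || b == nil {
		if a == b {
			return nil
		}
		return []string{path}
	}
	if a.Hash() == b.Hash() {
		return nil
	}
	var out []string
	if a.Value != b.Value {
		out = append(out, path)
	}
	for _, k := range childKeys(a, b) {
		out = append(out, Diff(path+"/"+k, a.Children[k], b.Children[k])...)
	}
	return out
}

func main() {
	a := &Node{Value: "root", Children: map[string]*Node{
		"dev1": {Value: "on"},
		"dev2": {Value: "off", Children: map[string]*Node{"temp": {Value: "21"}}},
	}}
	b := &Node{Value: "root", Children: map[string]*Node{
		"dev1": {Value: "on"},
		"dev2": {Value: "off", Children: map[string]*Node{"temp": {Value: "22"}}},
	}}
	fmt.Println(Diff("", a, b)) // prints "[/dev2/temp]"
}
```

Only the `dev2` branch is descended into, because the `dev1` hashes match. That pruning is what makes the comparison cheap when most of the tree is unchanged.
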
#### Scaling Merkle trees

One limitation of Merkle trees is the difficulty of updating the tree
concurrently. Some information on this:

- [how to scale blockchains](https://www.forbes.com/sites/forbestechcouncil/2018/11/27/sidechains-how-to-scale-and-improve-blockchains-safely/?sh=193537e64418)
- [Angela: A Sparse, Distributed, and Highly Concurrent Merkle Tree](https://people.eecs.berkeley.edu/~kubitron/courses/cs262a-F18/projects/reports/project1_report_ver3.pdf)

### Distributed key/value databases

- etcd
- NATS
  [key/value store](https://docs.nats.io/using-nats/developer/develop_jetstream/kv)

### Distributed Hash Tables

- https://en.wikipedia.org/wiki/Distributed_hash_table

### CRDT (Conflict-free replicated data type)

- https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type
- [Yjs](https://yjs.dev/#community)
  - https://blog.kevinjahns.de/are-crdts-suitable-for-shared-editing/
- https://tantaman.com/2022-10-18-lamport-sufficient-for-lww.html

### Databases

- https://tantaman.com/2022-08-23-why-sqlite-why-now.html
  - instead of doing:
    `select comment.* from post join comment on comment.post_id = post.id where post.id = x and comment.date < cursor.date and comment.id < cursor.id order by date, id desc limit 101`
  - we do: `post.comments().last(10).after(cursor);`

### Timestamps

- [Lamport timestamp](https://en.wikipedia.org/wiki/Lamport_timestamp)
  - used by Yjs

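For reference, the Lamport timestamp rules fit in a few lines of Go. The `LamportClock` type and its method names are hypothetical, chosen for this sketch. It is not safe for concurrent use, and it is not taken from Yjs or Simple IoT.

```go
package main

import "fmt"

// LamportClock is a logical clock: its counter orders events and has no
// relation to wall-clock time.
type LamportClock struct {
	time uint64
}

// Tick advances the clock for a local event, or just before sending a
// message, and returns the new timestamp.
func (c *LamportClock) Tick() uint64 {
	c.time++
	return c.time
}

// Observe merges a timestamp received from another process: the clock
// moves to max(local, received) + 1.
func (c *LamportClock) Observe(received uint64) uint64 {
	if received > c.time {
		c.time = received
	}
	c.time++
	return c.time
}

func main() {
	var a, b LamportClock
	a.Tick()               // a: 1
	a.Tick()               // a: 2
	sent := a.Tick()       // a: 3, attached to a message to b
	b.Tick()               // b: 1
	got := b.Observe(sent) // b: max(1, 3) + 1 = 4
	fmt.Println(sent, got) // prints "3 4"
}
```

The receive event always gets a larger timestamp than the send event, which is the ordering guarantee the scheme provides.
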
## Other IoT Systems

### AWS IoT

- https://www.thingrex.com/aws_iot_thing_attributes_intro/
  - Thing properties include the following, which are analogous to SIOT node
    fields.
    - Name (Description)
    - Type (Type)
    - Attributes (Points)
    - Groups (Described by tree structure)
    - Billing Group (Can also be described by tree structure)
- https://www.thingrex.com/aws_iot_thing_type/
  - each type has a specified set of attributes -- kind of a neat idea