
     1  # Point Data Type Changes
     2  
     3  - Author: Cliff Brake Last updated: 2023-06-13
     4  - Issue at: https://github.com/simpleiot/simpleiot/issues/254
     5  - PR/Discussion:
     6    - https://github.com/simpleiot/simpleiot/pull/279
     7    - https://github.com/simpleiot/simpleiot/pull/565
     8    - https://github.com/simpleiot/simpleiot/pull/566
     9  - Status: Review
    10  
    11  **Contents**
    12  
    13  <!-- toc -->
    14  
    15  ## Problem
    16  
    17  The current point data type is fairly simple and has proven useful and flexible
    18  to date, but we may benefit from additional or changed fields to support more
scenarios. It seems that in any data store, we need to be able to easily
represent the following at the node level:
    21  
    22  1. arrays
    23  1. maps
    24  
IoT systems are distributed systems that evolve over time. If we can't easily
handle schema changes and synchronize data between systems, we don't have
anything.
    28  
    29  ## Context/Discussion
    30  
    31  Should we consider making the `point` struct more flexible?
    32  
    33  The reason for this is that it is sometimes hard to describe a
    34  sensor/configuration value with just a few fields.
    35  
    36  ### Requirements
    37  
- IoT systems are often connected by unreliable networks (cellular, etc.). All
  devices/instances in a SIOT system should be able to function autonomously
  (run rules, etc.) and then synchronize again when connected.
- All systems must converge to the same configuration state. We can probably
  tolerate some lost time-series data, but configuration and current state must
  converge. When someone is remotely looking at a device's state, we want to
  make sure they are seeing the same thing a local operator is seeing.
    45  
    46  ### evolvability
    47  
    48  From Martin Kleppmann's book:
    49  
    50  > In a database, the process that writes to the database encodes the data, and
    51  > the process that reads from the database decodes it. There may just be a
    52  > single process accessing the database, in which case the reader is simply a
    53  > later version of the same process—in that case you can think of storing
    54  > something in the database as sending a message to your future self.
    55  >
    56  > Backward compatibility is clearly necessary here; otherwise your future self
    57  > won’t be able to decode what you previously wrote.
    58  >
    59  > In general, it’s common for several different processes to be accessing a
    60  > database at the same time. Those processes might be several different
    61  > applications or services, or they may simply be several instances of the same
    62  > service (running in parallel for scalability or fault tolerance). Either way,
    63  > in an environment where the application is changing, it is likely that some
    64  > processes accessing the database will be running newer code and some will be
    65  > running older code—for example because a new version is currently being
    66  > deployed in a rolling upgrade, so some instances have been updated while
    67  > others haven’t yet.
    68  >
    69  > This means that a value in the database may be written by a newer version of
    70  > the code, and subsequently read by an older version of the code that is still
    71  > running. Thus, forward compatibility is also often required for databases.
    72  >
    73  > However, there is an additional snag. Say you add a field to a record schema,
    74  > and the newer code writes a value for that new field to the database.
    75  > Subsequently, an older version of the code (which doesn’t yet know about the
    76  > new field) reads the record, updates it, and writes it back. In this
    77  > situation, the desirable behavior is usually for the old code to keep the new
    78  > field intact, even though it couldn’t be interpreted.
    79  >
    80  > The encoding formats discussed previously support such preservation of unknown
    81  > fields, but sometimes you need to take care at an application level, as
    82  > illustrated in Figure 4-7. For example, if you decode a database value into
    83  > model objects in the application, and later re-encode those model objects, the
    84  > unknown field might be lost in that translation process. Solving this is not a
    85  > hard problem; you just need to be aware of it.
    86  
    87  Some discussion of this book:
    88  https://community.tmpdir.org/t/book-review-designing-data-intensive-applications/288/6
    89  
    90  ### CRDTs
    91  
    92  Some good talks/discussions:
    93  
    94  > I also agree CRDTs are the future, but not for any reason as specific as the
    95  > ones in the article. Distributed state is so fundamentally complex that I
    96  > think we actually need CRDTs (or something like them) to reason about it
    97  > effectively. And certainly to build reliable systems. The abstraction of a
    98  > single, global, logical truth is so nice and tidy and appealing, but it
    99  > becomes so leaky that I think all successful systems for distributed state
   100  > will abandon it beyond a certain scale. --
   101  > [Peter Bourgon](https://lobste.rs/s/9fufgr/i_was_wrong_crdts_are_future)
   102  
   103  [CRDTs, the hard parts by Martin Kleppmann](https://youtu.be/x7drE24geUw)
   104  
   105  [Infinite Parallel Universes: State at the Edge](https://www.infoq.com/presentations/architecture-global-scale/)
   106  
   107  [Wikipedia article](https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type)
   108  
   109  Properties of CRDTs:
   110  
- **Associative** (grouping of operations does not change the result)
- **Commutative** (changing the order of operands does not change the result)
- **Idempotent** (an operation can be applied multiple times without changing
  the result, tolerating over-merging)
   115  
The existing SIOT Node/Point data structures were created before I knew what a
CRDT was, but they happen to already give a node many of the properties of a
CRDT -- i.e., they can be modified independently and then later merged with a
reasonable level of conflict resolution.
   120  
   121  For reliable data synchronization in distributed systems, there has to be some
   122  metadata around data that facilitates synchronization. This can be done in two
   123  ways:
   124  
1. add metadata in parallel to the data (turn JSON into a CRDT; examples include
   [automerge](https://github.com/automerge/automerge) and
   [yjs](https://docs.yjs.dev/))
   128  2. express all data using simple primitives that facilitate synchronization
   129  
   130  Either way, you have to accept constraints in your data storage and transmission
   131  formats.
   132  
   133  To date, we have chosen to follow the 2nd path (simple data primitives).
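
As an illustration of why these simple primitives work, here is a minimal,
self-contained sketch (not the actual SIOT merge code; the `Point` struct is a
simplified stand-in) showing how a last-write-wins merge over flat points is
commutative and idempotent, matching the CRDT properties listed above:

```go
package main

import (
	"fmt"
	"time"
)

// Point is a simplified stand-in for the SIOT point type.
type Point struct {
	Type  string
	Key   string
	Time  time.Time
	Value float64
}

// merge applies incoming points to a state map keyed by Type+Key using
// last-write-wins. Re-applying the same points (idempotent) or applying them
// in a different order (commutative) converges to the same state.
func merge(state map[string]Point, incoming []Point) {
	for _, p := range incoming {
		k := p.Type + "." + p.Key
		cur, ok := state[k]
		if !ok || p.Time.After(cur.Time) {
			state[k] = p
		}
	}
}

func main() {
	t1 := time.Now()
	t2 := t1.Add(time.Second)
	pts := []Point{
		{Type: "value", Key: "0", Time: t1, Value: 10},
		{Type: "value", Key: "0", Time: t2, Value: 20},
	}

	a := map[string]Point{}
	b := map[string]Point{}
	merge(a, pts)                     // applied in order
	merge(b, []Point{pts[1], pts[0]}) // applied in reverse order
	merge(b, pts)                     // applied a second time

	fmt.Println(a["value.0"].Value, b["value.0"].Value) // both print 20
}
```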
   134  
   135  ### Operational transforms
   136  
   137  There are two fundamental schools of thought regarding data synchronization:
   138  
1. Operational transforms (OT). In this method, a central server arbitrates all
   conflicts and hands the result back to other instances. This is an older
   technique and is used in applications like Google Docs.
2. CRDTs -- this is a newer technique that works over multiple network
   connections and does not require a central server. Each instance is capable
   of resolving conflicts itself and converging to the same state.
   145  
While a classical OT arrangement could probably work in a traditional SIOT
system (where all devices talk to one cloud server), it would be nice if we were
not constrained to this architecture. This would allow us to support
peer-to-peer synchronization in the future.
   150  
   151  ### Other Standards
   152  
   153  Some reference/discussion on other standards:
   154  
   155  #### Sparkplug
   156  
   157  https://github.com/eclipse/tahu/blob/master/sparkplug_b/sparkplug_b.proto
   158  
The Sparkplug data type is huge and could be used to describe very complex data.
This standard came out of the Industry 4.0 movement, where a factory revolves
around a common MQTT messaging server. The assumption is that everything is
always connected to the MQTT server. However, with complex types, there is no
provision for intelligent synchronization if one system is disconnected for some
amount of time -- it's all or nothing. Thus it does not seem like a good fit for
SIOT.
   166  
   167  #### SenML
   168  
   169  https://datatracker.ietf.org/doc/html/draft-ietf-core-senml-08#page-9
   170  
   171  #### tstorage
   172  
   173  The tstorage Go package has
   174  [an interesting data storage type](https://community.tmpdir.org/t/the-tstorage-time-series-package-for-go/331):
   175  
   176  ```go
   177  type Row struct {
   178  	// The unique name of metric.
   179  	// This field must be set.
   180  	Metric string
   181  	// An optional key-value properties to further detailed identification.
   182  	Labels []Label
   183  	// This field must be set.
   184  	DataPoint
   185  }
   186  
   187  type DataPoint struct {
   188  	// The actual value. This field must be set.
   189  	Value float64
   190  	// Unix timestamp.
   191  	Timestamp int64
   192  }
   193  
type Label struct {
	Name  string
	Value string
}
   197  ```
   198  
   199  In this case there is one value and an array of labels, which are essentially
   200  key/value strings.
   201  
   202  #### InfluxDB
   203  
   204  InfluxDB's line protocol contains the following:
   205  
   206  ```go
   207  type Metric interface {
   208  	Time() time.Time
   209  	Name() string
   210  	TagList() []*Tag
   211  	FieldList() []*Field
   212  }
   213  
   214  type Tag struct {
   215  	Key   string
   216  	Value string
   217  }
   218  
   219  type Field struct {
   220  	Key   string
   221  	Value interface{}
   222  }
   223  ```
   224  
   225  where the Field.Value must contain one of the InfluxDB supported types (bool,
   226  uint, int, float, time, duration, string, or bytes).
   227  
   228  ### time-series storage considerations
   229  
Is it necessary to have all values in one point so they can be grouped as one
entry in a time-series database like InfluxDB? Influx has a concept of tags and
fields, and you can have as many as you want for each sample. Tags must be
strings; they are indexed and should be low cardinality. Fields can be any data
type InfluxDB supports. This is a very simple, efficient, and flexible data
structure.
   236  
   237  ### Example: location data
   238  
One system we are working with has extensive location information
(City/State/Facility/Floor/Room/Aisle) with each point. This is all stored in
InfluxDB so we can easily query information for any location in the past. With
SIOT, we currently could not store this information with each value point;
instead, we would store location information with the node as separate points.
One concern is what happens if the device changes location. However, if location
is stored in points, then we will have a history of all location changes of the
device. To query values for a location, we could run a two-pass algorithm:
   247  
   248  1. query history and find time windows when devices are in a particular
   249     location.
   250  1. query these time ranges and devices for values
   251  
This has the advantage that we don't need to store location data with every
point, but we still have a clear history of what data came from where.
   254  
   255  ### Example: file system metrics
   256  
When adding metrics, we end up with data like the following for disk
partitions:
   259  
   260  ```
   261  Filesystem     Size Used Avail Use% Mounted on
   262  tmpfs          16806068224 0 16806068224   0% /dev
   263  tmpfs          16813735936 1519616 16812216320   0% /run
   264  ext2/ext3      2953064402944 1948218814464 854814945280  70% /
   265  tmpfs          16813735936 175980544 16637755392   1% /dev/shm
   266  tmpfs          16813740032 3108966400 13704773632  18% /tmp
   267  ext2/ext3      368837799936 156350181376 193680359424  45% /old3
   268  msdos          313942016 60329984 253612032  19% /boot
   269  ext2/ext3      3561716731904 2638277668864 742441906176  78% /scratch
   270  tmpfs          3362746368 118784 3362627584   0% /run/user/1000
   271  ext2/ext3      1968874332160 418203766784 1450633895936  22% /run/media/cbrake/59b35dd4-954b-4568-9fa8-9e7df9c450fc
   272  fuseblk        3561716731904 2638277668864 742441906176  78% /media/fileserver
   273  ext2/ext3      984372027392 339508314112 594836836352  36% /run/media/cbrake/backup2
   274  ```
   275  
It would be handy if we could store the filesystem as a tag, size/used/avail/%
as fields, and the mount point as a text field.

We already have an array of points in a node -- can we just make one array work?
The size/used/avail/% could easily be stored as different points. The text field
would store the mount point, which would tie all the stats for one partition
together. Then the question is how to represent the filesystem type. With the
added `Key` field in proposal #2, we can now store the mount point as the key
and the filesystem type in the text field, as shown below.
   284  
   285  | Type           | Key   | Text  | Value   |
   286  | -------------- | ----- | ----- | ------- |
   287  | filesystemSize | /home |       | 1243234 |
   288  | filesystemUsed | /home |       | 234222  |
   289  | filesystemType | /home | ext4  |         |
| filesystemSize | /data |       | 1000000 |
| filesystemUsed | /data |       | 10000   |
| filesystemType | /data | btrfs |         |
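
As a rough sketch of how such flat points could be consumed (the `Point` and
`partitionStats` types below are simplified stand-ins for illustration, not
existing SIOT code), the points in the table above can be regrouped into
per-partition records using the `Key` field:

```go
package main

import "fmt"

// Point mirrors the fields of the proposed point type used in the table above.
type Point struct {
	Type  string
	Key   string
	Text  string
	Value float64
}

// partitionStats is a hypothetical per-mount view assembled from points.
type partitionStats struct {
	FSType string
	Size   float64
	Used   float64
}

func main() {
	points := []Point{
		{Type: "filesystemSize", Key: "/home", Value: 1243234},
		{Type: "filesystemUsed", Key: "/home", Value: 234222},
		{Type: "filesystemType", Key: "/home", Text: "ext4"},
		{Type: "filesystemSize", Key: "/data", Value: 1000000},
		{Type: "filesystemUsed", Key: "/data", Value: 10000},
		{Type: "filesystemType", Key: "/data", Text: "btrfs"},
	}

	// group points by mount point (the Key field)
	stats := map[string]*partitionStats{}
	for _, p := range points {
		s, ok := stats[p.Key]
		if !ok {
			s = &partitionStats{}
			stats[p.Key] = s
		}
		switch p.Type {
		case "filesystemType":
			s.FSType = p.Text
		case "filesystemSize":
			s.Size = p.Value
		case "filesystemUsed":
			s.Used = p.Value
		}
	}

	for mount, s := range stats {
		fmt.Printf("%s: %s, %.0f of %.0f used\n", mount, s.FSType, s.Used, s.Size)
	}
}
```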
   293  
   294  ### Representing arrays
   295  
With the `key` field, we can represent arrays as a group of points, where key
defines the position in the array. For node points to be automatically decoded
into array struct fields by the SIOT client manager, the key must be an integer
represented in string form.

One example where we do this is selecting days of the week in schedule rule
conditions. The key field is used to select the weekday, so we can have a series
of points to represent weekdays. In the example below, Sunday is the 1st point
with key "0", and Monday is the 2nd point with key "1".
   305  
   306  ```go
   307  []data.Point{
   308    {
   309      Type: "weekday",
   310      Key: "0",
   311      Value: 0,
   312    },
   313    {
   314      Type: "weekday",
    Key: "1",
   316      Value: 0,
   317    },
   318  }
   319  ```
   320  
   321  In this case, the condition node has a series of weekday points with keys 0-6 to
   322  represent the days of the week.
   323  
The SIOT
[data.Decode](https://pkg.go.dev/github.com/simpleiot/simpleiot/data#Decode)
function is used by the client manager to initialize array fields in a client
struct. The following assumptions are made:
   328  
   329  - the value in the `key` field is converted to an int and used as the index into
   330    the field array.
   331  - if there are missing array entries, they are filled with zero values.
- [data.MergePoints](https://pkg.go.dev/github.com/simpleiot/simpleiot/data#MergePoints)
  uses the same algorithm.
- if a point is inserted into the array or moved, all affected array points must
  be sent. For example, if you have an array of length 20 and you insert a new
  value at the beginning, then all 21 points must be sent. This can have
  implications for rules or any other logic that uses the Point `key` field.
   339  
This does not have perfect CRDT properties, but these arrays are typically small
and are only modified in one place.
   342  
If you need more advanced functionality, you can bypass the data Decode/Merge
functions, process the points manually, and use any algorithm you like.
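
To make the key-to-index mapping concrete, here is a small self-contained sketch
(not the actual `data.Decode` implementation; the `Point` struct is a simplified
stand-in) that fills an array from `weekday` points the way the bullets above
describe, converting the key to an index and zero-filling missing entries:

```go
package main

import (
	"fmt"
	"strconv"
)

// Point is a simplified stand-in for the SIOT point type.
type Point struct {
	Type  string
	Key   string
	Value float64
}

// decodeArray builds a slice from points of the given type, using each point's
// key as the array index and zero-filling missing entries, mirroring the
// behavior described above for data.Decode.
func decodeArray(points []Point, typ string) []float64 {
	var out []float64
	for _, p := range points {
		if p.Type != typ {
			continue
		}
		i, err := strconv.Atoi(p.Key)
		if err != nil || i < 0 {
			continue // key is not a valid array index
		}
		for len(out) <= i {
			out = append(out, 0) // zero-fill missing entries
		}
		out[i] = p.Value
	}
	return out
}

func main() {
	points := []Point{
		{Type: "weekday", Key: "0", Value: 0}, // Sunday not selected
		{Type: "weekday", Key: "1", Value: 1}, // Monday selected
		{Type: "weekday", Key: "3", Value: 1}, // Wednesday selected
	}
	fmt.Println(decodeArray(points, "weekday")) // [0 1 0 1]
}
```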
   346  
   347  ### Point deletions
   348  
   349  To date, we've had no need to delete points, but it may be useful in the future.
   350  
   351  Consider the following sequence of point changes:
   352  
   353  1. t1: we have a point
   354  1. t2: A deletes the point
1. t3: B concurrently changes the point value
   356  
   357  The below table shows the point values over time with the current point merge
   358  algorithm:
   359  
   360  | Time | Value | Tombstone |
   361  | ---- | ----- | --------- |
   362  | t1   | 10    | 0         |
   363  | t2   | 10    | 1         |
   364  | t3   | 20    | 0         |
   365  
In this case, the point becomes undeleted because the last write wins (LWW). Is
this a problem? What is the desired behavior? A likely scenario is that a device
will be continually sending value updates while a user makes a configuration
change in the portal that deletes a point. Thus it seems delete changes should
always have precedence. However, with the LWW merge algorithm, the tombstone
value could get lost. It may make sense to:
   372  
   373  - make the tombstone value an int
   374  - only increment it
   375  - when merging points, the highest tombstone value wins
- an odd tombstone value means the point is deleted
   377  
The tombstone value is thus merged independently of the timestamp and is always
preserved, even if there are concurrent modifications.
   380  
   381  The following table shows the values with the modified point merge algorithm.
   382  
   383  | Time | Value | Tombstone |
   384  | ---- | ----- | --------- |
   385  | t1   | 10    | 0         |
   386  | t2   | 10    | 1         |
   387  | t3   | 20    | 1         |
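
A minimal sketch of this modified merge (not the actual SIOT implementation; the
`Point` struct is a simplified stand-in): last write wins on the timestamp for
the value, while the highest tombstone always wins:

```go
package main

import (
	"fmt"
	"time"
)

// Point is a simplified stand-in for the proposed point type.
type Point struct {
	Time      time.Time
	Value     float64
	Tombstone int
}

// merge combines an incoming point with the current point (assumed to be for
// the same type/key). The newest timestamp wins for the value, but the highest
// tombstone always wins, so a concurrent value update cannot un-delete a point
// (an odd tombstone means deleted).
func merge(current, incoming Point) Point {
	out := current
	if incoming.Time.After(current.Time) {
		out = incoming
	}
	if current.Tombstone > incoming.Tombstone {
		out.Tombstone = current.Tombstone
	} else {
		out.Tombstone = incoming.Tombstone
	}
	return out
}

func main() {
	t1 := time.Now()
	p := Point{Time: t1, Value: 10}                                       // t1: initial point
	deleted := Point{Time: t1.Add(time.Second), Value: 10, Tombstone: 1}  // t2: A deletes
	updated := Point{Time: t1.Add(2 * time.Second), Value: 20}            // t3: B updates value

	state := merge(merge(p, deleted), updated)
	fmt.Println(state.Value, state.Tombstone) // 20 1 -- the point stays deleted
}
```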
   388  
   389  ### Duration, Min, Max
   390  
The current Point data type has Duration, Min, and Max fields. These are used
when a sensor value is averaged over some period of time and then reported. The
Duration, Min, and Max fields describe the time period over which the point was
obtained and the min/max values observed during this period.
   395  
   396  ### Representing maps
   397  
In the file system metrics example above, we would like to store a filesystem
type for a particular mount point. We have 3 pieces of information:
   400  
   401  ```go
   402  data.Point {
   403    Type: "fileSystem",
   404    Text: "/media/data/",
   405    ????: "ext4",
   406  }
   407  ```
   408  
   409  Perhaps we could add a key field:
   410  
   411  ```go
   412  data.Point {
   413    Type: "fileSystem",
   414    Key: "/media/data/",
   415    Text: "ext4",
   416  }
   417  ```
   418  
The `Key` field could also be useful for storing the mount point for the other
size/used, etc. points.
   421  
   422  ### making use of common algorithms and visualization tools
   423  
A simple point type makes it very nice to write common algorithms that take in
points and can always assume the value is in the `Value` field. If we store
multiple values in a point, then the algorithm needs to know which field to use.
   427  
If an algorithm needs multiple values, it seems we could feed in multiple point
types and discriminate by point type. For example, an algorithm that calculates
the % of a partition used could take in the total size and used values, store
each, and then divide them to output the %. The data does not necessarily need
to live in the same point. Could this be used to get rid of the min/max fields
in the point? Could these simply be separate points? A sketch of this approach
follows the list below.
   434  
- Having min/max/duration as separate points in InfluxDB should not be a problem
  for graphing -- you would simply qualify the point on a different type rather
  than selecting a different field.
- If there is a process that is doing advanced calculations (say taking the
  numerical integral of flow rate to get total flow), then this process could
  simply accumulate points, and when it has all the points for a timestamp, do
  the calculation.
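
Here is a hedged sketch of that approach (the point types and `Point` struct are
assumptions for illustration, not existing SIOT code): accumulate
`filesystemSize` and `filesystemUsed` points per mount point and emit the
percent used once both are available:

```go
package main

import "fmt"

// Point is a simplified stand-in for the SIOT point type.
type Point struct {
	Type  string
	Key   string
	Value float64
}

// percentUsed accumulates size and used points per mount point (Key) and
// computes % used once both values are available for a mount.
func percentUsed(points []Point) map[string]float64 {
	size := map[string]float64{}
	used := map[string]float64{}
	out := map[string]float64{}

	for _, p := range points {
		switch p.Type {
		case "filesystemSize":
			size[p.Key] = p.Value
		case "filesystemUsed":
			used[p.Key] = p.Value
		}
		if s, ok := size[p.Key]; ok && s > 0 {
			if u, ok := used[p.Key]; ok {
				out[p.Key] = u / s * 100
			}
		}
	}
	return out
}

func main() {
	points := []Point{
		{Type: "filesystemSize", Key: "/home", Value: 1000},
		{Type: "filesystemUsed", Key: "/home", Value: 700},
	}
	fmt.Println(percentUsed(points)) // map[/home:70]
}
```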
   442  
   443  ### Schema changes and distributed synchronization
   444  
   445  A primary consideration of Simple IoT is easy and efficient data synchronization
   446  and easy schema changes.
   447  
   448  One argument against embedded maps in a point is that adding these maps would
   449  likely increase the possibility of schema version conflicts between versions of
   450  software because points are overwritten. Adding maps now introduces a schema
   451  into the point that is not synchronized at the key level. There will also be a
   452  temptation to put more information into point maps instead of creating more
   453  points.
   454  
   455  With the current point scheme, it is very easy to synchronize data, even if
   456  there are schema changes. All points are synchronized, so one version can write
   457  one set of points, and another version another, and all points will be sync'd to
   458  all instances.
   459  
There is also a concern that if two different versions of the software use
different combinations of field/value keys, information could be lost. Merging
Points into nodes would no longer be simple. As an example:
   464  
   465  ```go
   466  Point {
   467    Type: "motorPIDConfig",
   468    Values: {
   469      {"P": 23},
   470      {"I": 0.8},
   471      {"D": 200},
   472    },
   473  }
   474  ```
   475  
   476  If an instance with an older version writes a point that only has the "P" and
   477  "I" values, then the "D" value would get lost. We could merge all maps on writes
to prevent losing information. However, consider a case where we have 3
systems:
   480  
   481  Aold -> Bnew -> Cnew
   482  
If Aold writes an update to the above point, but only has P,I values, then this
point is automatically forwarded to Bnew, and then Bnew forwards it to Cnew.
However, Bnew may have had a copy with P,I,D values, and the D is lost when the
point is forwarded from Aold through Bnew to Cnew. We could argue that Bnew has
previously synchronized this point to Cnew, but what if Cnew was offline and
Aold sent the point immediately after Cnew came back online, before Bnew
synchronized its point?
   489  
The bottom line is that there are edge cases where we don't know if the point
map data is fully synchronized, as the map data is not hashed. If we implement arrays
   492  and maps as collections of points, then we can be more sure everything is
   493  synchronized correctly because each point is a struct with fixed fields.
   494  
   495  ### Is there any scenario where we need multiple tags/labels on a point?
   496  
   497  If we don't add maps to points, the assumption is any metadata can be added as
   498  additional points to the containing node. Will this cover all cases?
   499  
   500  ### Is there any scenario where we need multiple values in a point vs multiple points?
   501  
   502  If we have points that need to be grouped together, they could all be sent with
   503  the same timestamp. Whatever process is using the points could extract them from
   504  a timeseries store and then re-associate them based on common timestamps.
   505  
   506  Could duration/min/max be sent as separate points with the same timestamp
   507  instead of extra fields in the point?
   508  
   509  The NATS APIs allow you to send multiple points with a message, so if there is
   510  ever a need to describe data with multiple values (say min/max/etc), these can
   511  simply be sent as multiple points in one message.
   512  
   513  ### Is there any advantage to flat data structures?
   514  
Flat data structures are those where the fields consist only of simple types (no
nested objects, arrays, maps, etc.). This is essentially what tables in a
relational database are. One advantage of keeping the point type flat is that it
would map better into a relational database. If we add arrays to the Point type,
then it will no longer map into a single relational database table.
   520  
   521  ## Design
   522  
   523  ### Original Point Type
   524  
   525  ```go
   526  type Point struct {
   527  	// ID of the sensor that provided the point
   528  	ID string `json:"id,omitempty"`
   529  
   530  	// Type of point (voltage, current, key, etc)
   531  	Type string `json:"type,omitempty"`
   532  
   533  	// Index is used to specify a position in an array such as
   534  	// which pump, temp sensor, etc.
   535  	Index int `json:"index,omitempty"`
   536  
   537  	// Time the point was taken
   538  	Time time.Time `json:"time,omitempty"`
   539  
   540  	// Duration over which the point was taken. This is useful
   541  	// for averaged values to know what time period the value applies
   542  	// to.
   543  	Duration time.Duration `json:"duration,omitempty"`
   544  
   545  	// Average OR
   546  	// Instantaneous analog or digital value of the point.
   547  	// 0 and 1 are used to represent digital values
   548  	Value float64 `json:"value,omitempty"`
   549  
   550  	// Optional text value of the point for data that is best represented
   551  	// as a string rather than a number.
   552  	Text string `json:"text,omitempty"`
   553  
   554  	// statistical values that may be calculated over the duration of the point
   555  	Min float64 `json:"min,omitempty"`
   556  	Max float64 `json:"max,omitempty"`
   557  }
   558  ```
   559  
   560  ### Proposal #1
   561  
   562  This proposal would move all the data into maps.
   563  
   564  ```go
   565  type Point struct {
   566      ID string
   567      Time time.Time
   568      Type string
   569      Tags map[string]string
   570      Values map[string]float64
   571      TextValues map[string]string
   572  }
   573  ```
   574  
The existing min/max would just become fields. This would map better into
InfluxDB. There would be some redundancy between Type and Field keys.
   577  
   578  ### Proposal #2
   579  
   580  ```go
   581  type Point struct {
	// The first two fields uniquely identify a point when receiving updates
   583  	Type string
   584  	Key string
   585  
   586  	// The following fields are the values for a point
   587  	Time time.Time
   588  	(removed) Index float64
   589  	Value float64
   590  	Text string
   591  	Data []byte
   592  
   593  	// Metadata
   594  	Tombstone int
   595  }
   596  ```
   597  
   598  _Updated 2023-06-13: removed the `Index` field. We will use the `Key` field for
   599  array indices._
   600  
   601  Notable changes from the first implementation:
   602  
   603  - removal of the `ID` field, as any ID information should be contained in the
   604    parent node. The `ID` field is a legacy from 1-wire setups where we
   605    represented each 1-wire sensor as a point. However, it seems now each 1-wire
   606    sensor should have its own node.
   607  - addition of the `Key` field. This allows us to represent maps in a node, as
   608    well as add extra identifying information for a point.
   609  - the `Point` is now identified in the merge algorithm using the `Type` and
   610    `Key`. Before, the `ID`, `Type`, and `Index` were used.
   611  - the `Data` field is added to give us the flexibility to store/transmit data
   612    that does not fit in a Value or Text field. This should be used sparingly, but
   613    gives us some flexibility in the future for special cases. This came out of
   614    some comments in an Industry 4.0 community -- basically types/schemas are good
   615    in a communication standard, as long as you also have the capability to send a
   616    blob of data to handle all the special cases. This seems like good advice.
- the `Tombstone` field is added as an `int` and is only ever incremented. Odd
  values of `Tombstone` mean the point was deleted. When merging points, the
  highest tombstone value always wins.
   620  
   621  ## Decision
   622  
   623  Going with proposal #2 -- we can always revisit this later if needed. This has
   624  minimal impact on the existing code base.
   625  
   626  ## Objections/concerns
   627  
(Some of these concerns apply to the node/point concept in general.)
   629  
   630  - Q: _with the point datatype, we lose types_
  - A: in a single application, this concern would perhaps be a high priority,
    but in a distributed system, data synchronization and schema migrations must
    be given priority. Typically these collections of points are translated to a
    type by the application code using the data, so any concerns can be handled
    there. At least we won't get JS `undefined` crashes, as Go will fill in zero
    values.
   637  - Q: _this will be inefficient converting points to types_
  - A: this does take processing time, but this time is short compared to the
    network transfer times from distributed instances. Additionally,
    applications can cache nodes they care about so they don't have to translate
    the entire point array every time they use a node. Even a huge IoT system
    has a finite # of devices that can easily fit into the memory of modern
    servers/machines.
- Q: _this seems crude not to have full-featured protobuf types with all the
  fields explicitly defined in protobuf. Additionally, can't protobuf handle
  type changes elegantly?_
  - A: protobuf can handle field additions and removals, but we still have the
    edge cases where a point is sent from an old version of software that does
    not contain information written by a newer version. Also, I'm not sure it
    is a good idea to have application-specific type fields defined in protobuf;
    otherwise, you have a lot of work all along the communication chain to
    rebuild everything every time anything changes. With generic types that
    rarely have to change, your core infrastructure can remain stable and new
    features only need to touch the edges of the system.
   655  - Q: _with nodes and points, we can only represent a type with a single level of
   656    fields_
  - A: this is not quite true, because with the `Key` field, we can now have
    array and map fields in a node. The point is taken that a node with its
    points cannot represent a deeply nested data structure; however, nodes can
    be nested to represent any data structure you like. This limitation is by
    design, because otherwise synchronization would be very difficult. By
    limiting the complexity of the core data structures, we are making
    synchronization and storage very simple. The tradeoff is a little more work
    to marshal/unmarshal node/point data structures into useful types in your
    application. However, marshaling code is easy compared to distributed
    systems, so we need to optimize the system for the hard parts. A little
    extra typing will not hurt anyone, and tooling could be developed if needed
    to assist in this.
   669  
Generic core data structures also open up the possibility of dynamically
extending the system at run time without type changes. For instance, the GUI
could render new nodes it has never seen before by being sent configuration
nodes with declarative instructions on how to display them. If core types needed
to change to do this kind of thing, we would have no chance at this kind of
intelligent functionality.
   676  
   677  ## Consequences
   678  
   679  Removing the Min/Max/Duration fields should not have any consequences now as I
   680  don't think we are using these fields yet.
   681  
   682  Quite a bit of code needs to change to remove ID and add Key to code using
   683  points.
   684  
   685  ## Additional Notes/Reference
   686  
   687  We also took a look at how to resolve loops in the node tree:
   688  
   689  https://github.com/simpleiot/simpleiot/issues/294
   690  
   691  This is part of the verification to confirm our basic types are robust and have
   692  adequate CRDT properties.