github.com/code-to-go/safepool.lib@v0.0.0-20221205180519-ee25e63c226e/DESIGN.md (about)

# Intro
Safepool is a distributed, secure, add-only content distribution system based on passive storage. It is available both as a Go library and as a binary library.

The pillars of the technology are:
1. Data is stored on storage, which is partitioned by domains. Users are identified by a private/public EC key pair.
2. Data is encrypted with AES256. Each domain has a different password.
3. The AES password is encrypted with the public key of each user. The encrypted password is kept in the users' file.
4. When a new user joins the domain, their identity and the related encrypted password are added to the users' file. When a user is removed from a domain, a new password is generated and shared with all the remaining users.
5. Both the users' file and the change files are valid only when they are signed by a trusted user. A change file that is not trusted is ignored, while a users' file that is not trusted prevents further operations.
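
The handling of the users' file described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `UsersFile`, `WrapFunc`, `AddUser` and `RemoveUser` are hypothetical names, and the public-key encryption of the password is left as a pluggable function because the exact scheme is not specified here.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// WrapFunc encrypts the domain password for one user's public key.
// Hypothetical hook: the actual public-key scheme is an open assumption.
type WrapFunc func(userPublicKey string, password []byte) []byte

// UsersFile is an assumed in-memory model of the users' file. It maps each
// user's public key to that user's encrypted copy of the domain password.
type UsersFile struct {
	password []byte
	Wrapped  map[string][]byte
	wrap     WrapFunc
}

// newPassword returns 32 random bytes, the size of an AES256 key.
func newPassword() []byte {
	p := make([]byte, 32)
	if _, err := rand.Read(p); err != nil {
		panic(err)
	}
	return p
}

func NewUsersFile(wrap WrapFunc) *UsersFile {
	return &UsersFile{password: newPassword(), Wrapped: map[string][]byte{}, wrap: wrap}
}

// AddUser stores the current password encrypted for the new user.
func (u *UsersFile) AddUser(pub string) {
	u.Wrapped[pub] = u.wrap(pub, u.password)
}

// RemoveUser drops the user, generates a new password and re-encrypts it
// for every user that remains, so the removed user never sees the new one.
func (u *UsersFile) RemoveUser(pub string) {
	delete(u.Wrapped, pub)
	u.password = newPassword()
	for other := range u.Wrapped {
		u.Wrapped[other] = u.wrap(other, u.password)
	}
}

func main() {
	// Placeholder wrap that only copies the password. It is NOT encryption.
	copyOnly := func(_ string, password []byte) []byte {
		return append([]byte(nil), password...)
	}
	f := NewUsersFile(copyOnly)
	f.AddUser("alice")
	f.AddUser("bob")
	f.RemoveUser("bob")
	fmt.Println("users left:", len(f.Wrapped))
}
```

In a real deployment the updated users' file would also have to be signed by a trusted user before other clients accept it.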

# Key Concepts

## User
A user is a person who intends to distribute data. A user is identified by a public/private key pair (ed25519). By extension, a user is also a software process that runs (potentially in the background) under the identity of a user.

## Local Storage
The local storage is a storage space on a device owned by a user. For performance reasons, data is usually kept in local storage.

## Exchange
An exchange is a location where data and changes are stored so that they can be shared asynchronously across users.
An exchange is implemented with existing technologies, such as SFTP, S3 and Azure Storage.


## Users
A user is identified by a private/public key pair. A user can be either an admin or a follower.
- admin: can add and remove users
- follower: can share and download data but cannot add or remove users

## Domain
A domain identifies the users who can share specific data. Each user in a domain is identified by their private/public key pair.
Domains have a hierarchical structure similar to Internet domains (e.g. public.safepool.zone). In the future this hierarchy may be used for shared access.


## Lineage
The lineage is the sequence of changes applied to a file since its creation.

## Beauty Contest
If multiple users modify the same file, the lineage forks. When a fork is present, a user must choose their favorite version by downloading it. The vote 

## Access
Data located in transport is subject to access control. Since transport cannot offer active control, access is enforced passively through encryption.

_While all clients that access an exchange can see the content, only entitled clients can decrypt specific content._

Each file must therefore be encrypted with a symmetric key (_AES256_).
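
As an illustration of passive access control, the sketch below encrypts and decrypts a file's content with a 256-bit AES key. The cipher mode is an assumption: the text only says AES256, so AES-GCM with a random nonce stored in front of the ciphertext is used here as one reasonable choice. `Encrypt` and `Decrypt` are hypothetical helper names.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-GCM cipher from a 32-byte (AES256) key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// Encrypt returns nonce || ciphertext. The layout is an assumption.
func Encrypt(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Seal appends the ciphertext to its first argument, here the nonce.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// Decrypt reverses Encrypt and fails if the key is wrong or the data was altered.
func Decrypt(key, data []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(data) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, ciphertext := data[:gcm.NonceSize()], data[gcm.NonceSize():]
	return gcm.Open(nil, nonce, ciphertext, nil)
}

func main() {
	key := make([]byte, 32) // all-zero demo key; a real key must be random
	sealed, err := Encrypt(key, []byte("hello"))
	if err != nil {
		panic(err)
	}
	plain, err := Decrypt(key, sealed)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(plain))
}
```

Any client can download `sealed` from the exchange, but only a client that holds the domain key can recover the plaintext.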

## Synchronization
Synchronization is the core operation, in which a client receives updates from the network and uploads its own changes. It is defined in multiple phases.

### 1. Local discovery
Files for each domain are checked against the information in the DB. If the DB has no information about a file, its hash is calculated to check whether the file was renamed.
In case of a rename, the record is updated with the status _UPDATE_.

If the DB already contains information about the file, the modification time and the content are checked for changes; if either has changed, the status is set to _UPDATE_.
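
The local discovery rules can be sketched as below. The DB is modelled as an in-memory map, and `Record`, `LocalFile` and `Discover` are hypothetical names. The `CREATE` status for a file that is neither known nor a rename is an assumption, since the text only defines _UPDATE_; the hash is assumed to be already computed for each local file.

```go
package main

import "fmt"

// Record is an assumed shape for what the DB stores about a file.
type Record struct {
	Name    string
	Hash    string
	ModTime int64
	Status  string
}

// LocalFile describes a file found on disk, with its hash precomputed.
type LocalFile struct {
	Name    string
	Hash    string
	ModTime int64
}

// Discover applies the local discovery rules to one file and returns the
// status it assigned, or "" when the file is unchanged.
func Discover(db map[string]*Record, f LocalFile) string {
	if rec, ok := db[f.Name]; ok {
		// Known file: compare modification time and content hash.
		if rec.ModTime != f.ModTime || rec.Hash != f.Hash {
			rec.Hash, rec.ModTime, rec.Status = f.Hash, f.ModTime, "UPDATE"
			return "UPDATE"
		}
		return ""
	}
	// Unknown name: a record with the same hash is treated as a rename.
	for oldName, rec := range db {
		if rec.Hash == f.Hash {
			delete(db, oldName)
			rec.Name, rec.ModTime, rec.Status = f.Name, f.ModTime, "UPDATE"
			db[f.Name] = rec
			return "UPDATE"
		}
	}
	// Neither known nor renamed: assumed to be a new file.
	db[f.Name] = &Record{Name: f.Name, Hash: f.Hash, ModTime: f.ModTime, Status: "CREATE"}
	return "CREATE"
}

func main() {
	db := map[string]*Record{"a.txt": {Name: "a.txt", Hash: "h1", ModTime: 1}}
	fmt.Println(Discover(db, LocalFile{Name: "b.txt", Hash: "h1", ModTime: 1}))
}
```

The sketch does not check that the old name has actually disappeared from disk before treating a matching hash as a rename; a fuller version would need that check to tell renames from copies.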

### 2. Remote discovery
A client connects to the closest available exchange (round-robin latency is used). Files are then filtered by Snowflake ID, ignoring all files that are older according to the logs in the DB.

For each change file:
- the client rebuilds the chain of changes 
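
The filtering step can be sketched as follows. It relies on the fact that snowflake ids embed a timestamp in their high bits, so a numerically larger id is newer. The `C.<id>` naming of change files and the `NewerChanges` helper are assumptions made for this example.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// NewerChanges returns the snowflake ids of the change files in names that
// are strictly newer than lastSeen, sorted from oldest to newest.
// Names that do not look like "C.<id>" are skipped.
func NewerChanges(names []string, lastSeen uint64) []uint64 {
	var ids []uint64
	for _, name := range names {
		if !strings.HasPrefix(name, "C.") {
			continue
		}
		id, err := strconv.ParseUint(strings.TrimPrefix(name, "C."), 10, 64)
		if err != nil || id <= lastSeen {
			continue // not a valid id, or already processed
		}
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	return ids
}

func main() {
	names := []string{"C.1629405603606036480", ".keys", "C.1541815603606036480"}
	fmt.Println(NewerChanges(names, 1541815603606036480))
}
```

Processing the remaining ids in ascending order lets the client rebuild the chain of changes from the oldest unseen change forward.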


# API

```
func Start()
func Join(token string) error
func AddExchange(domain string, exchange json) error
func GetPublic() string
func ListDomains() []string
func State(domain string) []string
// handler receives a string for each notification
func Watch(handler func(string))
```

# Console Protocol

Safepool 
- Helo: provide server information
- State: list domains and status
- State [domain]: list files in a domain
- Add [domain/file]: add the file to the stage for the next push
- Mon: monitor updates from all domains
- AddDomain [invite]: add a new domain
- NewDomain [invite]: create a new domain
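
A request line of this protocol could be parsed as in the sketch below. The `Request` type and `ParseRequest` function are hypothetical, and two details are assumptions: commands are matched case-insensitively, and the command and its argument are separated by whitespace.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Request is a parsed console command.
type Request struct {
	Verb string // upper-cased command name, e.g. "STATE"
	Arg  string // argument, empty when absent
}

// Argument rules per command: 0 = none, 1 = optional, 2 = required.
var verbs = map[string]int{
	"HELO": 0, "MON": 0, "STATE": 1,
	"ADD": 2, "ADDDOMAIN": 2, "NEWDOMAIN": 2,
}

// ParseRequest splits a line into command and argument and validates both.
func ParseRequest(line string) (Request, error) {
	fields := strings.Fields(line)
	if len(fields) == 0 {
		return Request{}, errors.New("empty request")
	}
	verb := strings.ToUpper(fields[0])
	rule, known := verbs[verb]
	if !known {
		return Request{}, fmt.Errorf("unknown command %q", fields[0])
	}
	if len(fields) > 2 {
		return Request{}, fmt.Errorf("%s: too many arguments", verb)
	}
	arg := ""
	if len(fields) == 2 {
		arg = fields[1]
	}
	if rule == 0 && arg != "" {
		return Request{}, fmt.Errorf("%s takes no argument", verb)
	}
	if rule == 2 && arg == "" {
		return Request{}, fmt.Errorf("%s requires an argument", verb)
	}
	return Request{Verb: verb, Arg: arg}, nil
}

func main() {
	req, err := ParseRequest("State test.safepool.zone")
	fmt.Println(req.Verb, req.Arg, err)
}
```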


## Samples
| Request | Response |
|------|----|
| HELO | WESHARE 1.0 |
| STATE | public.safepool.zone <br> test.safepool.zone |
| STATE test.safepool.zone | sample.txt C-<br>other.txt U- |
| ADD test.safepool.zone/

# Design
- Layer1: Storage
- Layer2: Access
- Layer3: Feeds


## Local
Each client keeps some information locally. Most data is stored in a SQLite DB.

### TABLE Config
Contains configuration parameters at both the global and the domain level.

| Field | Type | Constraints | Description |
|------|----|----|-----------|
| domain | VARCHAR(128) |  | Domain the configuration refers to. When the config is global, the value is NULL |
| key | VARCHAR(64) | NOT NULL | Key of the config |
| value | VARCHAR(64) | NOT NULL | Value of the config |


The following config parameters are supported:
- identity.public: public key of the user
- identity.private: private key of the user
- 

### TABLE Changes
Tracks all changes coming from the net.

| Field | Type | Constraints | Description |
|------|----|----|-----------|
| domain | VARCHAR(128) |  | Domain |
| name | VARCHAR(128) |  | Full path of the file |
| hash | CHAR(64) | NOT NULL | Hash of the file |
| change | VARCHAR(16) | NOT NULL | Change file on the network |


### TABLE Keys
Tracks all changes coming from the net.

| Field | Type | Constraints | Description |
|------|----|----|-----------|
| domain | VARCHAR(128) |  | Domain |
| thread | VARCHAR(128) |  | Full path of the file |
| id | CHAR(64) | NOT NULL | Hash of the file |
| value | VARCHAR(16) | NOT NULL | Change file on the network |

### TABLE Identities
Tracks known users and their trust level.

| Field | Type | Constraints | Description |
|------|----|----|-----------|
| nick | VARCHAR(128) |  | Nickname of the user |
| identity | VARCHAR(128) |  | Identity (public key) of the user |
| trust | INTEGER |  | Trust level of the user |


### TABLE Files
Links names on the file system to their hash value.

| Field | Type | Constraints | Description |
|------|----|----|-----------|
| domain | VARCHAR(8192) | NOT NULL | Domain |
| thread | INTEGER | NOT NULL | Thread id |
| name | VARCHAR(8192) |  | Name of the file |
| hash | CHAR(64) | NOT NULL | Hash of the file |
| id | CHAR(64) | NOT NULL | Snowflake Id |
| modtime | INTEGER | NOT NULL | Last modification time of the file |


### TABLE Merkle
Store the 


### Invite
The invite is the way to access a domain. The invite contains one or more exchange credentials and the public keys of the administrators of the group. It is encrypted with the public key of the receiver.

| Field | Type | Size (bits)| Content |
|------|----|----|-----------|
| version | uint | 16 | version of the file format, 1.0 at the moment|
| admin | byte[32] | 256 | public key of the admin that created the invite|
| config | byte[n] | variable| transport configuration in json format|
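
The invite layout can be illustrated with a small encoder and decoder. This is a sketch under assumptions the table does not settle: the version is written big-endian, the config fills the rest of the buffer with no length prefix, and the encryption with the receiver's public key is left out. `Invite`, `Marshal` and `Unmarshal` are hypothetical names.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Invite mirrors the three fields of the table.
type Invite struct {
	Version uint16   // file format version
	Admin   [32]byte // public key of the admin that created the invite
	Config  []byte   // transport configuration in JSON format
}

const headerLen = 2 + 32 // version + admin key, in bytes

// Marshal writes version, admin key and config back to back.
func (inv Invite) Marshal() []byte {
	buf := make([]byte, 2, headerLen+len(inv.Config))
	binary.BigEndian.PutUint16(buf, inv.Version) // byte order is an assumption
	buf = append(buf, inv.Admin[:]...)
	return append(buf, inv.Config...)
}

// Unmarshal reads the fixed-size fields and treats the rest as the config.
func Unmarshal(data []byte) (Invite, error) {
	if len(data) < headerLen {
		return Invite{}, errors.New("invite too short")
	}
	var inv Invite
	inv.Version = binary.BigEndian.Uint16(data[:2])
	copy(inv.Admin[:], data[2:headerLen])
	inv.Config = append([]byte(nil), data[headerLen:]...)
	return inv, nil
}

func main() {
	in := Invite{Version: 1, Config: []byte(`{"type":"sftp"}`)}
	in.Admin[0] = 0xAB
	out, err := Unmarshal(in.Marshal())
	fmt.Println(out.Version, string(out.Config), err)
}
```

The JSON content used in `main` is made up for the demo and does not reflect the real transport configuration schema.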

### Users


## Remote layout
The remote storage is a single folder named after the domain. It contains the following files.
In the description below:
- x is a snowflake id
- n is a numeric split id

All files have a version id, which is 1.0.

```
📦public.safepool/main
┣📜.keys
┃ ┣1541815603606036480
┃ ┗1629405603606036480
┗📜.safepool
```


### .keys
The _.keys_ folder contains a subfolder for each thread in the domain.
Each subfolder contains the encryption files for that thread.


The subfolder _public_ contains the key for all the users of the domain. When 
Key files are stored under the _.keys_ folder.
Each key has 


| Field | Type | Size (bits)| Content |
|------|----|----|-----------|
| version | uint | 16 | version of the file format, 1.0 at the moment|
| users | User[] | variable| list of users|

and each user consists of

| Field | Type | Size (bits)| Content |
|------|----|----|-----------|
| public | []byte | 256 | ed25519 public key|
| flags | uint | 16 | reserved, must be 0|
| aes | string | variable | symmetric encryption key used |
| name | string | variable | first name of the user|
| name2 | string | variable | second name of the user (used in case of multiple users with the same name)|


### C.x
A change file contains an update to a file. It is made of:

| Field | Type | Size (bits)| Content |
|------|----|----|-----------|
| version | uint | 16 | version of the file format, 1.0 at the moment|
| headerSize | uint | 32 | size of the header&#x00B9;, i.e. all the fields except data|
| names | string[] | variable | list of names in local |
| origin | string | variable | list of names in exchanger |
| xorHash | byte[] | 256 | xor hash of all parts hashes |
| hashes | byte[][] | variable (x256) | hashes for each part of content&#x00B2; |
| message | string | variable | optional markdown message for other users before they receive the change |
| changes | Change[] | variable | changes against the origin file |
| data | byte[] | variable | the actual data |

Each Change is made of:

| Field | Type | Size (bits)| Content |
|------|----|----|-----------|
| type | uint | 16 | type of change: create, replace, delete, insert|
| from | uint | 32 | size of the header&#x00B9;, i.e. all the fields except|
| from | uint | 32 | size of the header&#x00B9;, i.e. all the fields except|

&#x00B9; All the fields before data are the file header

&#x00B2; Parts are built with a Hashsplit algorithm



### A.x
Action file. It defines the actions each user can request of the other users. These include:
- Truncate: delete the oldest change files. This usually requires merging the oldest files with the latest changes.

### Sign and encryption

| File | Signed | Encrypted |
|------|--------|-----------|
| Group | &#10004;  | |
| C.x | &#10004; | &#10004; |
| K.x | &#10004; | |
| N.n | &#10004; | &#10004; |
| N.n | &#10004; | |
| A.x | &#10004; | &#10004; |

Signing is implemented with ed25519, where the public/private key pair is the identity of each user.
On the file system, both the signer's public key and the signature are appended after the content in binary form.

| Field | Type | Size (bits)| Content |
|------|----|----|-----------|
| public | uint | 256 | ed25519 public key of the signer|
| hash | uint | 256 | hash value|


Encryption is implemented with AES256.