# Intro
Safepool is a distributed, secure, add-only content distribution system based on passive storage. It is available both as a Go library and as a binary library.

The pillars of the technology are:
1. Data is stored on storage, which is partitioned by domains. Users are identified by a private/public EC key.
2. Data is encrypted with AES256. Each domain has a different password.
3. The AES password is encrypted with the public key of each user. The encrypted password is kept in the users' file.
4. When a new user joins the domain, his identity and the related encrypted password are added to the users' file. When a user is removed from a domain, a new password is generated and shared with all users except him.
5. Both the users' file and the change files are valid only when they are signed by a trusted user. A change file that is not trusted is ignored, while a users' file that is not trusted prevents further operations.


# Key Concepts

## User
A user is a person who intends to distribute data. A user is identified by a public/private key pair (ed25519). By extension, a user is also a software process that runs (potentially in the background) under the identity of a user.

## Local Storage
The local storage is a memory space on a device owned by a user. For performance reasons, data is usually kept in the local storage.

## Exchange
An exchange is a location where data and changes are stored so that they can be shared asynchronously across users.
An exchange is implemented with existing technologies, such as SFTP, S3 and Azure Storage.


## Users
A user is identified by a private/public key pair. A user can be either an admin or a follower.
- admin: he can add/remove users
- follower: he can share and download data but he cannot add/remove users

## Domain
A domain identifies the users who can share specific data. Each user in a domain is identified by his private/public key.
Domains have a hierarchical structure similar to Internet domains (e.g. public.safepool.zone). In the future this hierarchy may be used for shared access.


## Lineage
It is the sequence of changes applied to a file since its creation.

## Beauty Contest
If multiple users modify the same file, the lineage has a fork. When a fork is present, a user must choose his favorite version by downloading it. The vote

## Access
Data located in transport is subject to access control. Since transport cannot offer active control, access control is passive and relies on encryption.

_While all clients that access an exchanger can see the content, only entitled clients can decrypt specific content_

For this reason each file must be encrypted with a symmetric key (_AES256_).

## Synchronization
This is the core operation, in which a client receives updates from the network and uploads possible changes. It is defined in multiple phases.

### 1. Local discovery
Files for each domain are checked against the information in the DB. If the DB has no information about a file, its hash is calculated to check for rename cases.
In case of rename, the record is updated with the status _UPDATE_.

If the DB already contains information about the file, the modification time and the content are checked for changes; in case of changes, the status is set to _UPDATE_.

### 2. Remote discovery
A client connects to the closest available exchange (round-trip latency is used). Then files are filtered based on the Snowflake id, ignoring all files that are older according to the logs in the DB.
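The local discovery rules from phase 1 can be sketched as follows. This is a minimal illustration and not the library's implementation: the `Record` and `LocalFile` types, the two maps standing in for DB lookups, the `NEW` status for files that match nothing, and the use of SHA-256 as the hash are all assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Record is a hypothetical in-memory view of one row of the Files table.
type Record struct {
	Name    string
	Hash    string // 64 hex characters, assuming SHA-256
	ModTime int64
	Status  string
}

// LocalFile is a file found while scanning the local storage.
type LocalFile struct {
	Name    string
	ModTime int64
	Content []byte
}

func hashOf(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// discover applies the local discovery rules to a single file.
// byName and byHash are stand-ins for lookups against the local DB.
func discover(f LocalFile, byName, byHash map[string]*Record) *Record {
	h := hashOf(f.Content)
	if rec, ok := byName[f.Name]; ok {
		// Known file: compare modification time and content.
		if rec.ModTime != f.ModTime || rec.Hash != h {
			delete(byHash, rec.Hash)
			rec.Hash, rec.ModTime, rec.Status = h, f.ModTime, "UPDATE"
			byHash[h] = rec
		}
		return rec
	}
	if rec, ok := byHash[h]; ok {
		// Unknown name but known content: treat it as a rename.
		delete(byName, rec.Name)
		rec.Name, rec.ModTime, rec.Status = f.Name, f.ModTime, "UPDATE"
		byName[f.Name] = rec
		return rec
	}
	// Neither name nor content is known. "NEW" is an assumed status.
	rec := &Record{Name: f.Name, Hash: h, ModTime: f.ModTime, Status: "NEW"}
	byName[f.Name], byHash[h] = rec, rec
	return rec
}

func main() {
	byName, byHash := map[string]*Record{}, map[string]*Record{}
	discover(LocalFile{Name: "a.txt", ModTime: 1, Content: []byte("hello")}, byName, byHash)
	// Same content under a new name is detected as a rename.
	r := discover(LocalFile{Name: "b.txt", ModTime: 2, Content: []byte("hello")}, byName, byHash)
	fmt.Println(r.Name, r.Status)
}
```

The sketch hashes every file up front for brevity; a real scan would likely compare modification times first and hash only when needed.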

For each change file:
- the client rebuilds the chain of changes


# API

```
func Start()
func Join(token string) error
func AddExchange(domain string, exchange json) error
func GetPublic() string
func ListDomains() []string
func State(domain string) []string
func Watch(handler func(string))
```

# Console Protocol

Safepool
- Helo: provide server information
- State: list domains and status
- State [domain]: list files in a domain
- Add [domain/file]: add the file to the stage for the next push
- Mon: monitor updates from all domains
- AddDomain [invite]: add a new domain
- NewDomain [invite]: create a new domain


## Samples
| Request | Response |
|------|----|
| HELO | WESHARE 1.0 |
| STATE | public.safepool.zone <br> test.safepool.zone |
| STATE test.safepool.zone | sample.txt C-<br>other.txt U- |
| ADD test.safepool.zone/

# Design
- Layer1: Storage
- Layer2: Access
- Layer3: Feeds


## Local
Each client keeps some information locally. Most data is stored in a SQLite db.

### TABLE Config
Contains configuration parameters both at global and domain level.

| Field | Type | Constraints | Description |
|------|----|----|-----------|
| domain | VARCHAR(128) | | Domain the configuration refers to. When the config is global, the value is NULL |
| key | VARCHAR(64) | NOT NULL | Key of the config |
| value | VARCHAR(64) | NOT NULL | Value of the config |


The following config parameters are supported:
- identity.public: public key of the user
- identity.private: private key of the user

### TABLE Changes
Tracks all changes coming from the network.

| Field | Type | Constraints | Description |
|------|----|----|-----------|
| domain | VARCHAR(128) | | Domain |
| name | VARCHAR(128) | | Full path of the file |
| hash | CHAR(64) | NOT NULL | Hash of the file |
| change | VARCHAR(16) | NOT NULL | Change file on the network |


### TABLE Keys
Tracks all changes coming from the network.

| Field | Type | Constraints | Description |
|------|----|----|-----------|
| domain | VARCHAR(128) | | Domain |
| thread | VARCHAR(128) | | Full path of the file |
| id | CHAR(64) | NOT NULL | Hash of the file |
| value | VARCHAR(16) | NOT NULL | Change file on the network |

### TABLE Identities
Tracks known users and their trust level.

| Field | Type | Constraints | Description |
|------|----|----|-----------|
| nick | VARCHAR(128) | | Domain |
| identity | VARCHAR(128) | | Domain |
| trust | INTEGER | | Full path of the file |


### TABLE Files
Links names on the file system to their hash value.

| Field | Type | Constraints | Description |
|------|----|----|-----------|
| domain | VARCHAR(8192) | NOT NULL | Domain |
| thread | INTEGER | NOT NULL | Thread id |
| name | VARCHAR(8192) | | Name of the file |
| hash | CHAR(64) | NOT NULL | Hash of the file |
| id | CHAR(64) | NOT NULL | Snowflake Id |
| modtime | INTEGER | NOT NULL | Last modification time of the file |


### TABLE Merkle
Store the


### Invite
The invite is the way to access a domain.
The invite contains one or more exchange credentials and the administrators of the group (their public keys). It is encrypted with the public key of the receiver.

| Field | Type | Size (bits)| Content |
|------|----|----|-----------|
| version | uint | 16 | version of the file format, 1.0 at the moment|
| admin | byte[32] | 256 | public key of the admin that created the invite|
| config | byte[n] | variable| transport configuration in json format|

### Users


## Remote layout
The remote storage is a single folder named after the domain; it contains the following files.
In the description below:
- x is a snowflake id
- n is a numeric split id. I

All files have a version id, which is 1.0

```
public.safepool/main
┣ .keys
┃ ┣ 1541815603606036480
┃ ┗ 1629405603606036480
┗ .safepool
```


### .keys
The _.keys_ folder contains a subfolder for each thread in the domain.
Each subfolder contains the encryption files for that thread.


The subfolder _public_ contains the key for all the users of the domain. When
Key files are stored under the _.keys_ folder.
Each key has


| Field | Type | Size (bits)| Content |
|------|----|----|-----------|
| version | uint | 16 | version of the file format, 1.0 at the moment|
| users | User[] | variable| list of users|

and each user consists of

| Field | Type | Size (bits)| Content |
|------|----|----|-----------|
| public | []byte | 256 | ed25519 public key|
| flags | uint | 16 | reserved, must be 0|
| aes | string | variable | symmetric encryption key used|
| name | string | variable | first name of the user|
| name2 | string | variable | second name of the user (used in case of multiple users with the same name)|


### C.x
A change file contains an update on a file.
It is made of the following fields:

| Field | Type | Size (bits)| Content |
|------|----|----|-----------|
| version | uint | 16 | version of the file format, 1.0 at the moment|
| headerSize | uint | 32 | size of the header¹, i.e. all the fields except data|
| names | string[] | variable | list of names in local |
| origin | string | variable | list of names in exchanger |
| xorHash | byte[] | 256 | xor hash of all parts hashes |
| hashes | byte[][] | variable (x256) | hashes for each part of content² |
| message | string | variable | optional markdown message for other users before they receive the change |
| changes | Change[] | variable | changes against the origin file |
| data | byte[] | variable | the actual data |

Each Change is in turn made of

| Field | Type | Size (bits)| Content |
|------|----|----|-----------|
| type | uint | 16 | type of change: create, replace, delete, insert|
| from | uint | 32 | size of the header¹, i.e. all the fields except|
| from | uint | 32 | size of the header¹, i.e. all the fields except|

¹ All the fields before data are the file header

² Parts are built with a Hashsplit algorithm



### A.x
Action file. It defines the actions each user can request from the other users. This includes:
- Truncate: delete the oldest change files. This usually requires merging the oldest files with the latest changes

### Sign and encryption

| File | Signed | Encrypted |
|------|--------|-----------|
| Group | ✔ | |
| C.x | ✔ | ✔ |
| K.x | ✔ | |
| N.n | ✔ | ✔ |
| N.n | ✔ | |
| A.x | ✔ | ✔ |

Signing is implemented with ed25519, where the public/private keys are the identity of each user.
On the file system, both the signer's public key and the signature are appended after the content in binary form.

| Field | Type | Size (bits)| Content |
|------|----|----|-----------|
| public | uint | 256 | ed25519 public key of the signer|
| hash | uint | 256 | hash value|


Encryption is implemented with AES256.