**Table of Contents**

- [Examples](#examples)
- [Configuring Peer To Peer Download](#configuring-peer-to-peer-download)
  - [Tracker Peer TTL](#tracker-peer-ttl)
  - [Bandwidth](#bandwidth)
  - [Connection Limits](#connection-limits)
  - [Seeder TTI](#seeder-tti)
  - [Torrent TTI On Disk](#torrent-tti-on-disk)
- [Configuring Hash Ring](#configuring-hash-ring)
  - [Active Health Check](#active-health-check)
  - [Passive Health Check](#passive-health-check)
- [Configuring Storage Backend For Origin And Build-Index](#configuring-storage-backend-for-origin-and-build-index)
  - [Read-Only Registry Backend](#read-only-registry-backend)
  - [Bandwidth on Origin](#bandwidth-on-origin)

# Examples

Here are some example configuration files we used for the dev cluster (which can be started by running
`make devcluster`).

They are split into a base.yaml, which contains configs that we have been using for testing, development
and production, and a development.yaml, which contains configs specifically needed for starting the dev
cluster with Docker-for-Mac and which need to be updated for production setups.

- Origin
  - [base.yaml](../config/origin/base.yaml)
  - [development.yaml](../examples/devcluster/config/origin/development.yaml)

- Tracker
  - [base.yaml](../config/tracker/base.yaml)
  - [development.yaml](../examples/devcluster/config/tracker/development.yaml)

- Build-index
  - [base.yaml](../config/build-index/base.yaml)
  - [development.yaml](../examples/devcluster/config/build-index/development.yaml)

- Proxy
  - [base.yaml](../config/proxy/base.yaml)
  - [development.yaml](../examples/devcluster/config/proxy/development.yaml)

- Agent
  - [base.yaml](../config/agent/base.yaml)
  - [development.yaml](../examples/devcluster/config/agent/development.yaml)

More details in [examples/devcluster/README.md](../examples/devcluster/README.md).

# Configuring Peer To Peer Download

Kraken's peer-to-peer network consists of agents, origins and trackers. Origins are special dedicated peers that seed data from a storage backend (HDFS, S3, etc). Agents are peers that download from each other and from origins. Agents periodically announce each torrent they are currently downloading to the tracker and, in return, receive a list of peers that are seeding the same torrent. More details in [ARCHITECTURE.md](ARCHITECTURE.md).

## Tracker Peer TTL

>tracker.yaml
>```
>peerstore:
>  redis:
>    peer_set_window_size: 1h
>    max_peer_set_windows: 5
>```
As peers announce periodically to the tracker, the tracker stores the announce requests in several time window buckets.
Each announce request expires after `peer_set_window_size * max_peer_set_windows` (5 hours with the values above).

The tracker then returns a random set of peers selected from the `max_peer_set_windows` time buckets.

## Announce Interval `TODO(evelynl94)`

## Bandwidth

Download and upload bandwidths are configurable to prevent peers from saturating the host network.
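For example, the values in the sample config below correspond to 200 MiB/s egress and 300 MiB/s ingress. A small helper to derive such values (hypothetical; not part of Kraken):

```go
package main

import "fmt"

// mibPerSecToBits converts a rate in MiB/s into the bits-per-second
// form that scheduler.conn.bandwidth expects. Hypothetical helper for
// illustration only; not part of Kraken.
func mibPerSecToBits(mibPerSec int64) int64 {
	return mibPerSec * 1024 * 1024 * 8
}

func main() {
	fmt.Println(mibPerSecToBits(200)) // 1677721600, the egress value below
	fmt.Println(mibPerSecToBits(300)) // 2516582400, the ingress value below
}
```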
>agent.yaml/origin.yaml
>```
>scheduler:
>  conn:
>    bandwidth:
>      enable: true
>      egress_bits_per_sec: 1677721600 # 200*8 Mbit
>      ingress_bits_per_sec: 2516582400 # 300*8 Mbit
>```

## Connection Limits

The number of connections per torrent can be limited by:
>agent.yaml/origin.yaml
>```
>scheduler:
>  connstate:
>    max_open_conn: 10
>```
There is no limit on the number of torrents a peer can download simultaneously.

## Pipeline limit `TODO(evelynl94)`

## Seeder TTI

SeederTTI (time-to-idle) is the duration a completed torrent will exist without being read from before being removed from the in-memory archive.
>agent.yaml/origin.yaml
>```
>scheduler:
>  seeder_tti: 5m
>```
However, completed torrents remain on disk until they are deleted by the periodic storage purge, and they can be re-opened on another peer's request.

## Torrent TTI On Disk

Both agents and origins can be configured to periodically clean up idle torrents on disk.
>agent.yaml/origin.yaml
>```
>store:
>  cache_cleanup:
>    tti: 6h
>  download_cleanup:
>    tti: 6h
>```

For origins, the number of cached files can also be limited, since origins are dedicated seeders and hence normally keep files on disk for a longer time.
>origin.yaml
>```
>store:
>  capacity: 1000000
>```

# Configuring Hash Ring

Both origin and tracker clusters are self-healing hash rings, and each can be represented by either a DNS name or a static list of hosts.

We use rendezvous hashing for constructing ring membership.

Take an origin cluster for example:
>origin-static-hosts.yaml
>```
>hashring:
>  max_replica: 2
>cluster:
>  hosts:
>    static:
>    - origin1:15002
>    - origin2:15002
>    - origin3:15002
>```
>origin-dns.yaml
>```
>hashring:
>  max_replica: 2
>cluster:
>  hosts:
>    dns: origin.example.com:15002
>```

## Health Check For Hash Rings

When a node in the hash ring is considered unhealthy, the ring client routes requests to the next healthy node with the highest score. There are two ways to do health checks:

### Active Health Check

Origins health check each other in the ring, since an origin cluster is usually smaller.
>origin.yaml
>```
>cluster:
>  healthcheck:
>    filter:
>      fails: 3
>      passes: 2
>    monitor:
>      interval: 30s
>```
The above configures a health check ping from each origin to the others every 30 seconds. If 3 or more consecutive health checks fail for an origin, it is marked as unhealthy. Later, if 2 or more consecutive health checks succeed for the same origin, it is marked as healthy again. Initially, all hosts are healthy.

### Passive Health Check

Agents health check trackers by piggybacking on the announce requests.
>agent.yaml
>```
>tracker:
>  cluster:
>    healthcheck:
>      fails: 3
>      fail_timeout: 5m
>```
As shown in this example, if 3 announce requests to one tracker fail with a network error within 5 minutes, the host is marked as unhealthy for 5 minutes, and the agent will not send requests to this host until the timeout expires; see the sketch below.
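The policy amounts to a small failure-counting filter. A minimal sketch, assuming the `fails: 3` / `fail_timeout: 5m` values above (illustrative only; not Kraken's actual implementation):

```go
package healthcheck

import "time"

// passiveFilter sketches the passive health-check policy described
// above: if `fails` network errors occur within `failTimeout`, the
// host is considered unhealthy for `failTimeout`. Hypothetical type
// for illustration only.
type passiveFilter struct {
	fails       int           // e.g. 3
	failTimeout time.Duration // e.g. 5 * time.Minute

	count     int       // errors seen in the current window
	firstFail time.Time // start of the current window
	downUntil time.Time // host is unhealthy until this time
}

// Failed records a network error from an announce request.
func (f *passiveFilter) Failed(now time.Time) {
	if f.count == 0 || now.Sub(f.firstFail) > f.failTimeout {
		// Start a new failure window.
		f.count, f.firstFail = 0, now
	}
	f.count++
	if f.count >= f.fails {
		f.downUntil = now.Add(f.failTimeout)
		f.count = 0
	}
}

// Healthy reports whether requests may be sent to the host.
func (f *passiveFilter) Healthy(now time.Time) bool {
	return now.After(f.downUntil)
}
```

Once `fail_timeout` elapses, the host is treated as healthy again and announce requests to it resume.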
# Configuring Storage Backend For Origin And Build-Index

Storage backends are used by Origin and Build-Index for data persistence. Kraken supports S3, GCS, ECR, HDFS, HTTP (read-only), and Docker Registry (read-only) as [backends](https://github.com/uber/kraken/tree/master/lib/backend).

Multiple backends can be used at the same time, selected based on the namespace of the requested blob or tag (for Docker images, that means the part of the image name before ":").

Example origin config that uses multiple backends:

>origin.yaml
>```
>backends:
>  - namespace: library/.*
>    backend:
>      registry_blob:
>        address: index.docker.io
>        timeout: 60s
>        security:
>          basic:
>            username: ""
>            password: ""
>  - namespace: test-domain/.*
>    backend:
>      http:
>        download_url: http://test-domain:9000/download?sha256=%s
>        download_backoff:
>          enabled: true
>  - namespace: ecr-images/.*
>    backend:
>      registry_tag:
>        address: 123456789012.dkr.ecr.<region>.amazonaws.com
>        security:
>          credsStore: 'ecr-login'
>  - namespace: s3-images/.*
>    backend:
>      s3:
>        region: us-west-1
>        bucket: test-bucket
>        root_directory: /test-bucket/kraken/default/
>        name_path: sharded_docker_blob
>        username: kraken-user
>  - namespace: minio-images/.*
>    backend:
>      s3:
>        region: us-east-1
>        bucket: self-hosted-bucket
>        root_directory: /kraken/default/
>        name_path: sharded_docker_blob
>        username: minio-user
>        endpoint: http://172.17.0.1:9000
>        disable_ssl: true
>        force_path_style: true
>    bandwidth:
>      enable: true
>  - namespace: gcs-images/.*
>    backend:
>      gcs:
>        username: kraken-user
>        bucket: test-bucket
>        root_directory: /test-bucket/kraken/default/
>        name_path: sharded_docker_blob
>    bandwidth:
>      enable: true
>
>auth:
>  s3:
>    kraken-user:
>      s3:
>        aws: kraken-user
>        aws_access_key_id: <keyid>
>        aws_secret_access_key: <key>
>    minio-user:
>      s3:
>        aws: minio-user
>        aws_access_key_id: <keyid>
>        aws_secret_access_key: <key>
>  gcs:
>    kraken-user:
>      gcs:
>        access_blob: <service_account_key>
>```

## Read-Only Registry Backend

For simple local testing with an insecure registry (assuming it listens on `host.docker.internal:5000`), you can configure the backend for origin and build-index accordingly:

>origin.yaml
>```
>backends:
>  - namespace: .*
>    backend:
>      registry_blob:
>        address: host.docker.internal:5000
>        security:
>          tls:
>            client:
>              disabled: true
>```

>build-index.yaml
>```
>backends:
>  - namespace: .*
>    backend:
>      registry_tag:
>        address: host.docker.internal:5000
>        security:
>          tls:
>            client:
>              disabled: true
>```

## Bandwidth on Origin

When transferring data to and from their storage backend, origins can be configured with download and upload bandwidth limits. This is useful with cloud storage providers, to prevent origins from saturating the network link.
>origin.yaml
>```
>backends:
>  - namespace: .*
>    backend:
>      s3: <omitted>
>    bandwidth:
>      enabled: true
>      egress_bits_per_sec: 8589934592 # 8 Gbit
>      ingress_bits_per_sec: 85899345920 # 10*8 Gbit
>```
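Limits like these behave as a token bucket over bytes transferred. A minimal sketch using `golang.org/x/time/rate`, under that assumption (illustrative only; not Kraken's actual implementation):

```go
package main

import (
	"context"

	"golang.org/x/time/rate"
)

// egressBitsPerSec mirrors the sample config above: 8 Gbit/s.
const egressBitsPerSec = 8589934592

func main() {
	// Refill the bucket at the configured byte rate and allow up to
	// one second's worth of burst.
	bytesPerSec := egressBitsPerSec / 8
	limiter := rate.NewLimiter(rate.Limit(bytesPerSec), bytesPerSec)

	chunk := make([]byte, 4*1024*1024) // a 4 MiB blob chunk to upload
	// Block until the bucket holds enough tokens for this chunk.
	if err := limiter.WaitN(context.Background(), len(chunk)); err != nil {
		panic(err)
	}
	// ... upload chunk to the storage backend ...
}
```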