<p align="center"><img src="assets/kraken-logo-color.svg" width="175" title="Kraken Logo"></p>

<p align="center">
  <a href="https://travis-ci.com/uber/kraken"><img src="https://travis-ci.com/uber/kraken.svg?branch=master"></a>
  <a href="https://github.com/uber/kraken/releases"><img src="https://img.shields.io/github/release/uber/kraken.svg" /></a>
  <a href="https://godoc.org/github.com/uber/kraken"><img src="https://godoc.org/github.com/uber/kraken?status.svg"></a>
  <a href="https://goreportcard.com/badge/github.com/uber/kraken"><img src="https://goreportcard.com/badge/github.com/uber/kraken"></a>
  <a href="https://codecov.io/gh/uber/kraken"><img src="https://codecov.io/gh/uber/kraken/branch/master/graph/badge.svg"></a>
</p>

Kraken is a P2P-powered Docker registry that focuses on scalability and availability. It is
designed for Docker image management, replication, and distribution in a hybrid cloud environment.
With pluggable backend support, Kraken can easily integrate into existing Docker registry setups
as the distribution layer.

Kraken has been in production at Uber since early 2018. In our busiest cluster, Kraken distributes
more than 1 million blobs per day, including 100k 1G+ blobs. At its peak production load, Kraken
distributes 20K 100MB-1G blobs in under 30 seconds.

Below is a visualization of a small Kraken cluster at work:

<p align="center">
  <img src="assets/visualization.gif" title="Visualization">
</p>

# Table of Contents

- [Features](#features)
- [Design](#design)
- [Architecture](#architecture)
- [Benchmark](#benchmark)
- [Usage](#usage)
- [Comparison With Other Projects](#comparison-with-other-projects)
- [Limitations](#limitations)
- [Contributing](#contributing)
- [Contact](#contact)

# Features

Following are some highlights of Kraken:

- **Highly scalable**. Kraken is capable of distributing Docker images at more than 50% of the
  maximum download speed limit on every host. Cluster size and image size do not have a
  significant impact on download speed.
  - Supports at least 15k hosts per cluster.
  - Supports arbitrarily large blobs/layers. We normally limit the maximum size to 20G for best
    performance.
- **Highly available**. No component is a single point of failure.
- **Secure**. Supports uploader authentication and data integrity protection through TLS.
- **Pluggable storage options**. Instead of managing data, Kraken plugs into reliable blob storage
  options, like S3, GCS, ECR, HDFS, or another registry. The storage interface is simple, and new
  options are easy to add.
- **Lossless cross-cluster replication**. Kraken supports rule-based asynchronous replication
  between clusters.
- **Minimal dependencies**. Other than pluggable storage, Kraken only has an optional dependency
  on DNS.

# Design

The high-level idea of Kraken is to have a small number of dedicated hosts seed content to a
network of agents running on each host in the cluster.

A central component, the tracker, orchestrates all participants in the network to form a
pseudo-random regular graph.

Such a graph has high connectivity and a small diameter. As a result, even with only one seeder
and thousands of peers joining in the same second, all participants can reach at least 80% of the
maximum upload/download speed in theory (60% with the current implementation), and performance
does not degrade much as blob size and cluster size increase. For more details, see the team's
[tech talk](https://www.youtube.com/watch?v=waVtYYSXkXU) at KubeCon + CloudNativeCon.
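To make the idea concrete, below is a minimal, hypothetical sketch in Go (not Kraken's actual
tracker code; the function name and peer IDs are illustrative only) of how a tracker-like
component could hand each announcing peer a pseudo-random set of neighbors for a given blob:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/rand"
)

// neighborsFor returns up to `degree` pseudo-random neighbors for peerID among
// the peers currently announcing blobDigest. Seeding the shuffle with the
// (blob, peer) pair keeps the choice deterministic for a given request while
// spreading connections evenly across the swarm.
func neighborsFor(blobDigest, peerID string, peers []string, degree int) []string {
	h := fnv.New64a()
	h.Write([]byte(blobDigest))
	h.Write([]byte(peerID))
	rng := rand.New(rand.NewSource(int64(h.Sum64())))

	// Exclude the requesting peer itself from the candidate list.
	candidates := make([]string, 0, len(peers))
	for _, p := range peers {
		if p != peerID {
			candidates = append(candidates, p)
		}
	}

	// Shuffle deterministically and keep at most `degree` neighbors.
	rng.Shuffle(len(candidates), func(i, j int) {
		candidates[i], candidates[j] = candidates[j], candidates[i]
	})
	if len(candidates) > degree {
		candidates = candidates[:degree]
	}
	return candidates
}

func main() {
	peers := []string{"origin-1", "agent-1", "agent-2", "agent-3", "agent-4", "agent-5"}
	fmt.Println(neighborsFor("sha256:deadbeef", "agent-1", peers, 3))
}
```

Because every peer draws a fixed number of neighbors at random, the union of all neighbor sets
approximates a regular random graph with high connectivity and a small diameter, which is the
property the design above relies on. The real tracker does more than this sketch (for example,
ordering peers and treating origins as seeders), but the selection idea is the same.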
# Architecture

- Agent
  - Deployed on every host
  - Implements the Docker registry interface
  - Announces available content to the tracker
  - Connects to peers returned by the tracker to download content
- Origin
  - Dedicated seeders
  - Stores blobs as files on disk backed by pluggable storage (e.g. S3, GCS, ECR)
  - Forms a self-healing hash ring to distribute load
- Tracker
  - Tracks which peers have what content (both in progress and completed)
  - Provides ordered lists of peers to connect to for any given blob
- Proxy
  - Implements the Docker registry interface
  - Uploads each image layer to the responsible origin (remember, origins form a hash ring)
  - Uploads tags to the build-index
- Build-Index
  - Maps human-readable tags to blob digests
  - No consistency guarantees: clients should use unique tags
  - Powers image replication between clusters (simple duplicated queues with retry)
  - Stores tags as files on disk backed by pluggable storage (e.g. S3, GCS, ECR)

# Benchmark

The following data is from a test in which a 3G Docker image with 2 layers is downloaded by 2600
hosts concurrently (5200 blob downloads), with a 300MB/s speed limit on all agents, using 5
trackers and 5 origins:

- p50 = 10s (at speed limit)
- p99 = 18s
- p99.9 = 22s

# Usage

All Kraken components can be deployed as Docker containers. To build the Docker images:
```
$ make images
```
For information about how to configure and use Kraken, please refer to the
[documentation](docs/CONFIGURATION.md).

## Kraken on Kubernetes

You can use our example Helm chart to deploy Kraken (with an example HTTP fileserver backend) on
your k8s cluster:
```
$ helm install --name=kraken-demo ./helm
```
Once deployed, every node will have a Docker registry API exposed on `localhost:30081`.
For an example pod spec that pulls images from the Kraken agent, see the
[example](examples/k8s/demo.json).

For more information on k8s setup, see the [README](examples/k8s/README.md).

## Devcluster

To start a herd container (which contains the origin, tracker, build-index, and proxy) and two
agent containers with development configuration:
```
$ make devcluster
```

Docker for Mac is required for the devcluster to work on your laptop.
For more information on devcluster, please check out the devcluster
[README](examples/devcluster/README.md).
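Since the agent implements the standard Docker registry interface, you can also query it directly
over the Docker Registry v2 HTTP API. The sketch below is hypothetical: the repository name and
tag are placeholders for images in your own setup, and the address assumes the `localhost:30081`
node port from the Kubernetes example above.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	// Placeholders: adjust the address, repository, and tag for your setup.
	// localhost:30081 matches the node port used in the Kubernetes example above.
	addr := "http://localhost:30081"
	repo := "library/hello-world"
	tag := "latest"

	// Standard Docker Registry v2 manifest endpoint, served here by the Kraken agent.
	url := fmt.Sprintf("%s/v2/%s/manifests/%s", addr, repo, tag)
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		log.Fatal(err)
	}
	// Request a schema 2 manifest, which lists the layer blob digests.
	req.Header.Set("Accept", "application/vnd.docker.distribution.manifest.v2+json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(resp.Status)
	fmt.Println(string(body))
}
```

In practice, a plain `docker pull` against the same address does the same work; the snippet only
makes the underlying registry call visible.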
# Comparison With Other Projects

## Dragonfly from Alibaba

A Dragonfly cluster has one or a few "supernodes" that coordinate the transfer of every 4MB chunk
of data in the cluster.

While the supernode can make optimal decisions, the throughput of the whole cluster is limited by
the processing power of one or a few hosts, and performance degrades linearly as either blob size
or cluster size increases.

Kraken's tracker only helps orchestrate the connection graph and leaves the negotiation of actual
data transfer to individual peers, so Kraken scales better with large blobs. On top of that,
Kraken is highly available and supports cross-cluster replication, both of which are required for
a reliable hybrid cloud setup.

## BitTorrent

Kraken was initially built with a BitTorrent driver; however, we ended up implementing our own P2P
driver based on the BitTorrent protocol to allow for tighter integration with storage solutions
and more control over performance optimizations.

Kraken's problem space is slightly different from what BitTorrent was designed for. Kraken's goal
is to reduce global maximum download time and communication overhead in a stable environment,
while BitTorrent was designed for an unpredictable and adversarial environment, so it needs to
preserve more copies of scarce data and defend against malicious or badly behaving peers.

Despite the differences, we re-examine Kraken's protocol from time to time, and if it's feasible,
we hope to make it compatible with BitTorrent again.

# Limitations

- If Docker registry throughput is not the bottleneck in your deployment workflow, switching to
  Kraken will not magically speed up your `docker pull`. To speed up `docker pull`, consider
  switching to [Makisu](https://github.com/uber/makisu) to improve layer reusability at build
  time, or tweak compression ratios, as `docker pull` spends most of its time on data
  decompression.
- Mutating tags (e.g. updating a `latest` tag) is allowed; however, a few things will not work:
  tag lookups immediately afterwards will still return the old value due to Nginx caching, and
  replication probably won't trigger. We are working on supporting this functionality better. If
  you need tag mutation support right now, please reduce the cache interval of the build-index
  component. If you also need replication in a multi-cluster setup, please consider setting up
  another Docker registry as Kraken's backend.
- Theoretically, Kraken should distribute blobs of any size without significant performance
  degradation, but at Uber we enforce a 20G limit and cannot endorse the production use of
  ultra-large blobs (i.e. 100G+). Peers enforce connection limits on a per-blob basis, and new
  peers might be starved for connections if no peers become seeders relatively soon. If you have
  ultra-large blobs you'd like to distribute, we recommend breaking them into <10G chunks first.

# Contributing

Please check out our [guide](docs/CONTRIBUTING.md).

# Contact

To contact us, please join our [Slack channel](https://join.slack.com/t/uber-container-tools/shared_invite/enQtNTIxODAwMDEzNjM1LWIwYzIxNmUwOGY3MmVmM2MxYTczOTQ4ZDU0YjAxMTA0NDgyNzdlZTA4ZWVkZGNlMDUzZDA1ZTJiZTQ4ZDY0YTM).