github.com/geph-official/geph2@v0.22.6-0.20210211030601-f527cb59b0df/libs/kcp-go/README.md (about)

     1  # Geph fork of KCP
     2  
     3  This repo is a fork of KCP-go that tries to maintain compatibility, while making this into a better generic UDP streaming tool. More specifically, it seeks to avoid KCP's focus on "unfair" network flooding, decrease complexity, and improve performance. GephKCP is used to let Geph2 roam seamlessly in intermittent network conditions, NOT to abuse the network.
     4  
     5  TODO:
     6  - [ ] Remove all encryption/obfuscation features (handled at different layer)
     7  - [ ] Saner "unfair" mode (ex: token-bucket limiting to 105% of rolling goodput)
     8  - [ ] Smarter congestion control than the textbook Reno currently in use. More specifically cwnd goes up way too slow.
     9  
    10  # ORIGINAL README BELOW:
    11  
    12  <img src="kcp-go.png" alt="kcp-go" height="50px" />
    13  
    14  
    15  [![GoDoc][1]][2] [![Powered][9]][10] [![MIT licensed][11]][12] [![Build Status][3]][4] [![Go Report Card][5]][6] [![Coverage Statusd][7]][8]
    16  
    17  [1]: https://godoc.org/github.com/xtaci/kcp-go?status.svg
    18  [2]: https://godoc.org/github.com/xtaci/kcp-go
    19  [3]: https://travis-ci.org/xtaci/kcp-go.svg?branch=master
    20  [4]: https://travis-ci.org/xtaci/kcp-go
    21  [5]: https://goreportcard.com/badge/github.com/xtaci/kcp-go
    22  [6]: https://goreportcard.com/report/github.com/xtaci/kcp-go
    23  [7]: https://codecov.io/gh/xtaci/kcp-go/branch/master/graph/badge.svg
    24  [8]: https://codecov.io/gh/xtaci/kcp-go
    25  [9]: https://img.shields.io/badge/KCP-Powered-blue.svg
    26  [10]: https://github.com/skywind3000/kcp
    27  [11]: https://img.shields.io/badge/license-MIT-blue.svg
    28  [12]: LICENSE
    29  
    30  ## Introduction
    31  
    32  **kcp-go** is a **Production-Grade Reliable-UDP** library for [golang](https://golang.org/). 
    33  
    34  This library intents to provide a **smooth, resilient, ordered, error-checked and anonymous** delivery of streams over **UDP** packets, it has been battle-tested with opensource project [kcptun](https://github.com/xtaci/kcptun). Millions of devices(from low-end MIPS routers to high-end servers) have deployed **kcp-go** powered program in a variety of forms like **online games, live broadcasting, file synchronization and network acceleration**.
    35  
    36  [Lastest Release](https://github.com/xtaci/kcp-go/releases)
    37  
    38  ## Features
    39  
    40  1. Designed for **Latency-sensitive** scenarios.
    41  1. **Cache friendly** and **Memory optimized** design, offers extremely **High Performance** core.
    42  1. Handles **>5K concurrent connections** on a single commodity server.
    43  1. Compatible with [net.Conn](https://golang.org/pkg/net/#Conn) and [net.Listener](https://golang.org/pkg/net/#Listener), a drop-in replacement for [net.TCPConn](https://golang.org/pkg/net/#TCPConn).
    44  1. [FEC(Forward Error Correction)](https://en.wikipedia.org/wiki/Forward_error_correction) Support with [Reed-Solomon Codes](https://en.wikipedia.org/wiki/Reed%E2%80%93Solomon_error_correction)
    45  1. Packet level encryption support with [AES](https://en.wikipedia.org/wiki/Advanced_Encryption_Standard), [TEA](https://en.wikipedia.org/wiki/Tiny_Encryption_Algorithm), [3DES](https://en.wikipedia.org/wiki/Triple_DES), [Blowfish](https://en.wikipedia.org/wiki/Blowfish_(cipher)), [Cast5](https://en.wikipedia.org/wiki/CAST-128), [Salsa20]( https://en.wikipedia.org/wiki/Salsa20), etc. in [CFB](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Cipher_Feedback_.28CFB.29) mode, which generates completely anonymous packet.
    46  1. Only **A fixed number of goroutines** will be created for the entire server application, costs in **context switch** between goroutines have been taken into consideration.
    47  1. Compatible with [skywind3000's](https://github.com/skywind3000) C version with various improvements.
    48  1. Platform-dependent optimizations: [sendmmsg](http://man7.org/linux/man-pages/man2/sendmmsg.2.html) and [recvmmsg](http://man7.org/linux/man-pages/man2/recvmmsg.2.html) were expoloited for linux.
    49  
    50  ## Documentation
    51  
    52  For complete documentation, see the associated [Godoc](https://godoc.org/github.com/xtaci/kcp-go).
    53  
    54  ## Specification
    55  
    56  <img src="frame.png" alt="Frame Format" height="109px" />
    57  
    58  ```
    59  NONCE:
    60    16bytes cryptographically secure random number, nonce changes for every packet.
    61    
    62  CRC32:
    63    CRC-32 checksum of data using the IEEE polynomial
    64   
    65  FEC TYPE:
    66    typeData = 0xF1
    67    typeParity = 0xF2
    68    
    69  FEC SEQID:
    70    monotonically increasing in range: [0, (0xffffffff/shardSize) * shardSize - 1]
    71    
    72  SIZE:
    73    The size of KCP frame plus 2
    74  ```
    75  
    76  ```
    77  +-----------------+
    78  | SESSION         |
    79  +-----------------+
    80  | KCP(ARQ)        |
    81  +-----------------+
    82  | FEC(OPTIONAL)   |
    83  +-----------------+
    84  | CRYPTO(OPTIONAL)|
    85  +-----------------+
    86  | UDP(PACKET)     |
    87  +-----------------+
    88  | IP              |
    89  +-----------------+
    90  | LINK            |
    91  +-----------------+
    92  | PHY             |
    93  +-----------------+
    94  (LAYER MODEL OF KCP-GO)
    95  ```
    96  
    97  
    98  ## Examples
    99  
   100  1. [simple examples](https://github.com/xtaci/kcp-go/tree/master/examples)
   101  2. [kcptun client](https://github.com/xtaci/kcptun/blob/master/client/main.go)
   102  3. [kcptun server](https://github.com/xtaci/kcptun/blob/master/server/main.go)
   103  
   104  ## Benchmark
   105  ```
   106    Model Name:	MacBook Pro
   107    Model Identifier:	MacBookPro14,1
   108    Processor Name:	Intel Core i5
   109    Processor Speed:	3.1 GHz
   110    Number of Processors:	1
   111    Total Number of Cores:	2
   112    L2 Cache (per Core):	256 KB
   113    L3 Cache:	4 MB
   114    Memory:	8 GB
   115  ```
   116  ```
   117  $ go test -v -run=^$ -bench .
   118  beginning tests, encryption:salsa20, fec:10/3
   119  goos: darwin
   120  goarch: amd64
   121  pkg: github.com/xtaci/kcp-go
   122  BenchmarkSM4-4                 	   50000	     32180 ns/op	  93.23 MB/s	       0 B/op	       0 allocs/op
   123  BenchmarkAES128-4              	  500000	      3285 ns/op	 913.21 MB/s	       0 B/op	       0 allocs/op
   124  BenchmarkAES192-4              	  300000	      3623 ns/op	 827.85 MB/s	       0 B/op	       0 allocs/op
   125  BenchmarkAES256-4              	  300000	      3874 ns/op	 774.20 MB/s	       0 B/op	       0 allocs/op
   126  BenchmarkTEA-4                 	  100000	     15384 ns/op	 195.00 MB/s	       0 B/op	       0 allocs/op
   127  BenchmarkXOR-4                 	20000000	        89.9 ns/op	33372.00 MB/s	       0 B/op	       0 allocs/op
   128  BenchmarkBlowfish-4            	   50000	     26927 ns/op	 111.41 MB/s	       0 B/op	       0 allocs/op
   129  BenchmarkNone-4                	30000000	        45.7 ns/op	65597.94 MB/s	       0 B/op	       0 allocs/op
   130  BenchmarkCast5-4               	   50000	     34258 ns/op	  87.57 MB/s	       0 B/op	       0 allocs/op
   131  Benchmark3DES-4                	   10000	    117149 ns/op	  25.61 MB/s	       0 B/op	       0 allocs/op
   132  BenchmarkTwofish-4             	   50000	     33538 ns/op	  89.45 MB/s	       0 B/op	       0 allocs/op
   133  BenchmarkXTEA-4                	   30000	     45666 ns/op	  65.69 MB/s	       0 B/op	       0 allocs/op
   134  BenchmarkSalsa20-4             	  500000	      3308 ns/op	 906.76 MB/s	       0 B/op	       0 allocs/op
   135  BenchmarkCRC32-4               	20000000	        65.2 ns/op	15712.43 MB/s
   136  BenchmarkCsprngSystem-4        	 1000000	      1150 ns/op	  13.91 MB/s
   137  BenchmarkCsprngMD5-4           	10000000	       145 ns/op	 110.26 MB/s
   138  BenchmarkCsprngSHA1-4          	10000000	       158 ns/op	 126.54 MB/s
   139  BenchmarkCsprngNonceMD5-4      	10000000	       153 ns/op	 104.22 MB/s
   140  BenchmarkCsprngNonceAES128-4   	100000000	        19.1 ns/op	 837.81 MB/s
   141  BenchmarkFECDecode-4           	 1000000	      1119 ns/op	1339.61 MB/s	    1606 B/op	       2 allocs/op
   142  BenchmarkFECEncode-4           	 2000000	       832 ns/op	1801.83 MB/s	      17 B/op	       0 allocs/op
   143  BenchmarkFlush-4               	 5000000	       272 ns/op	       0 B/op	       0 allocs/op
   144  BenchmarkEchoSpeed4K-4         	    5000	    259617 ns/op	  15.78 MB/s	    5451 B/op	     149 allocs/op
   145  BenchmarkEchoSpeed64K-4        	    1000	   1706084 ns/op	  38.41 MB/s	   56002 B/op	    1604 allocs/op
   146  BenchmarkEchoSpeed512K-4       	     100	  14345505 ns/op	  36.55 MB/s	  482597 B/op	   13045 allocs/op
   147  BenchmarkEchoSpeed1M-4         	      30	  34859104 ns/op	  30.08 MB/s	 1143773 B/op	   27186 allocs/op
   148  BenchmarkSinkSpeed4K-4         	   50000	     31369 ns/op	 130.57 MB/s	    1566 B/op	      30 allocs/op
   149  BenchmarkSinkSpeed64K-4        	    5000	    329065 ns/op	 199.16 MB/s	   21529 B/op	     453 allocs/op
   150  BenchmarkSinkSpeed256K-4       	     500	   2373354 ns/op	 220.91 MB/s	  166332 B/op	    3554 allocs/op
   151  BenchmarkSinkSpeed1M-4         	     300	   5117927 ns/op	 204.88 MB/s	  310378 B/op	    6988 allocs/op
   152  PASS
   153  ok  	github.com/xtaci/kcp-go	50.349s
   154  ```
   155  
   156  
   157  ## Typical Flame Graph
   158  ![Flame Graph in kcptun](flame.png)
   159  
   160  ## Key Design Considerations
   161  
   162  1. slice vs. container/list
   163  
   164  `kcp.flush()` loops through the send queue for retransmission checking for every 20ms(interval).
   165  
   166  I've wrote a benchmark for comparing sequential loop through *slice* and *container/list* here:
   167  
   168  https://github.com/xtaci/notes/blob/master/golang/benchmark2/cachemiss_test.go
   169  
   170  ```
   171  BenchmarkLoopSlice-4   	2000000000	         0.39 ns/op
   172  BenchmarkLoopList-4    	100000000	        54.6 ns/op
   173  ```
   174  
   175  List structure introduces **heavy cache misses** compared to slice which owns better **locality**, 5000 connections with 32 window size and 20ms interval will cost 6us/0.03%(cpu) using slice, and 8.7ms/43.5%(cpu) for list for each `kcp.flush()`.
   176  
   177  2. Timing accuracy vs. syscall clock_gettime
   178  
   179  Timing is **critical** to **RTT estimator**, inaccurate timing leads to false retransmissions in KCP, but calling `time.Now()` costs 42 cycles(10.5ns on 4GHz CPU, 15.6ns on my MacBook Pro 2.7GHz). 
   180  
   181  The benchmark for time.Now() lies here:
   182  
   183  https://github.com/xtaci/notes/blob/master/golang/benchmark2/syscall_test.go
   184  
   185  ```
   186  BenchmarkNow-4         	100000000	        15.6 ns/op
   187  ```
   188  
   189  In kcp-go, after each `kcp.output()` function call, current clock time will be updated upon return, and for a single `kcp.flush()` operation, current time will be queried from system once. For most of the time, 5000 connections costs 5000 * 15.6ns = 78us(a fixed cost while no packet needs to be sent), as for 10MB/s data transfering with 1400 MTU, `kcp.output()` will be called around 7500 times and costs 117us for `time.Now()` in **every second**.
   190  
   191  3. Memory management
   192  
   193  Primary memory allocation are done from a global buffer pool xmit.Buf, in kcp-go, when we need to allocate some bytes, we can get from that pool, and a fixed-capacity 1500 bytes(mtuLimit) will be returned, the rx queue, tx queue and fec queue all receive bytes from there, and they will return the bytes to the pool after using to prevent unnecessary zer0ing of bytes. The pool mechanism maintained a high watermark for slice objects, these in-flight objects from the pool will survive from the perodical garbage collection, meanwhile the pool kept the ability to return the memory to runtime if in idle.
   194  
   195  4. Information security
   196  
   197  kcp-go is shipped with builtin packet encryption powered by various block encryption algorithms and works in [Cipher Feedback Mode](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Cipher_Feedback_(CFB)), for each packet to be sent, the encryption process will start from encrypting a [nonce](https://en.wikipedia.org/wiki/Cryptographic_nonce) from the [system entropy](https://en.wikipedia.org/wiki//dev/random), so encryption to same plaintexts never leads to a same ciphertexts thereafter.
   198  
   199  The contents of the packets are completely anonymous with encryption, including the headers(FEC,KCP), checksums and contents. Note that, no matter which encryption method you choose on you upper layer, if you disable encryption, the transmit will be insecure somehow, since the header is ***PLAINTEXT*** to everyone it would be susceptible to header tampering, such as jamming the *sliding window size*, *round-trip time*, *FEC property* and *checksums*. ```AES-128``` is suggested for minimal encryption since modern CPUs are shipped with [AES-NI](https://en.wikipedia.org/wiki/AES_instruction_set) instructions and performs even better than `salsa20`(check the table above).
   200  
   201  Other possible attacks to kcp-go includes: a) [traffic analysis](https://en.wikipedia.org/wiki/Traffic_analysis), dataflow on specific websites may have pattern while interchanging data, but this type of eavesdropping has been mitigated by adapting [smux](https://github.com/xtaci/smux) to mix data streams so as to introduce noises, perfect solution to this has not appeared yet, theroretically by shuffling/mixing messages on larger scale network may mitigate this problem.  b) [replay attack](https://en.wikipedia.org/wiki/Replay_attack), since the asymmetrical encryption has not been introduced into kcp-go for some reason, capturing the packets and replay them on a different machine is possible, (notice: hijacking the session and decrypting the contents is still *impossible*), so upper layers should contain a asymmetrical encryption system to guarantee the authenticity of each message(to process message exactly once), such as HTTPS/OpenSSL/LibreSSL, only by signing the requests with private keys can eliminate this type of attack. 
   202  
   203  ## Connection Termination
   204  
   205  Control messages like **SYN/FIN/RST** in TCP **are not defined** in KCP, you need some **keepalive/heartbeat mechanism** in the application-level. A real world example is to use some **multiplexing** protocol over session, such as [smux](https://github.com/xtaci/smux)(with embedded keepalive mechanism), see [kcptun](https://github.com/xtaci/kcptun) for example.
   206  
   207  ## FAQ
   208  
   209  Q: I'm handling >5K connections on my server, the CPU utilization is so high.
   210  
   211  A: A standalone `agent` or `gate` server for running kcp-go is suggested, not only for CPU utilization, but also important to the **precision** of RTT measurements(timing) which indirectly affects retransmission. By increasing update `interval` with `SetNoDelay` like `conn.SetNoDelay(1, 40, 1, 1)` will dramatically reduce system load, but lower the performance.
   212  
   213  Q: When should I enable FEC?
   214  
   215  A: Forward error correction is critical to long-distance transmission, because a packet loss will lead to a huge penalty in time. And for the complicated packet routing network in modern world, round-trip time based loss check will not always be efficient, the big deviation of RTT samples in the long way usually leads to a larger RTO value in typical rtt estimator, which in other words, slows down the transmission.
   216    
   217  Q: Should I enable encryption?
   218  
   219  A: Yes, for the safety of protocol, even if the upper layer has encrypted.
   220  
   221  ## Who is using this?
   222  
   223  1. https://github.com/xtaci/kcptun -- A Secure Tunnel Based On KCP over UDP.
   224  2. https://github.com/getlantern/lantern -- Lantern delivers fast access to the open Internet. 
   225  3. https://github.com/smallnest/rpcx -- A RPC service framework based on net/rpc like alibaba Dubbo and weibo Motan.
   226  4. https://github.com/gonet2/agent -- A gateway for games with stream multiplexing.
   227  5. https://github.com/syncthing/syncthing -- Open Source Continuous File Synchronization.
   228  
   229  ## Links
   230  
   231  1. https://github.com/xtaci/libkcp -- FEC enhanced KCP session library for iOS/Android in C++
   232  2. https://github.com/skywind3000/kcp -- A Fast and Reliable ARQ Protocol
   233  3. https://github.com/klauspost/reedsolomon -- Reed-Solomon Erasure Coding in Go