# TCP on UserSpace

In the era of API-based services, almost every service call fits within two TCP packets, so the many mechanisms that consume both memory (e.g. Go uses at least [2KB of stack for each goroutine](https://go.dev/doc/go1.4#runtime)) and computing resources (e.g. many [context switches](https://en.wikipedia.org/wiki/Context_switch) between kernel and user space) just to handle these few packets are no longer acceptable.

Suggestions such as [changing the net package](https://github.com/golang/go/issues/15735), or using [epoll](https://en.wikipedia.org/wiki/Epoll) or [kqueue](https://en.wikipedia.org/wiki/Kqueue) and discarding the net package entirely like [this](https://github.com/xtaci/gaio), [this](https://github.com/lesismal/nbio), [this](https://github.com/eranyanay/1m-go-websockets) or [this](https://github.com/panjf2000/gnet), have many other problems:
- Go already uses this mechanism internally in the runtime package, so it is easier to just use a worker mechanism to limit the maximum number of goroutines, like [FastHTTP](https://github.com/valyala/fasthttp). We know that in this case the first Read() on any connection causes two or more unneeded context switches on some OSs like Linux, to check whether any data is ready to read before the goroutine is scheduled again once the connection reaches the read-ready state.
- OS-dependent implementation can be a very tricky task.
- OS-dependent tuning is needed, such as changing the limit on the number of active and open files in UNIX-based OSs, known as ulimit.
- Balancing the number of events against the timeout (milliseconds) under both high and low application load isn't easy.
- Runtime adaptors are needed, because other packages, even the Go net library, are not ready for this architecture.

## Why (OS kernel-level disadvantages)
- The Linux (or any other OS) networking stack has a limit on how many packets per second it can handle. When the limit is reached, all CPUs become busy just receiving and routing packets.
## Goals
- Improve performance by reducing resource usage. e.g.
    - No context switches needed (lower layers such as L3 IP still need context switches, but not as many as kernel-based logic)
    - No separate file descriptor needed for each TCP stream (lower layers such as L3 IP still need some mechanism to take packets from the OS)
    - Just one buffer, with no more huge memory copies.
    - Just one lock mechanism for each stream
    - Just one timeout mechanism for a stream of any connection, instead of many in kernel and user space handling the same requirement.
- Mix congestion control with rate limiting
- Keep a stream alive almost for free: just store some bytes in RAM for the stream without impacting other parts of the application
- Track connection and stream metrics for any purpose, like security, ...
- Easily add or change logic, whereas upgrading the host kernel is quite challenging. e.g. add machine learning algorithms, ...
- Have the protocol implementation in user space, to build applications as unikernel images without needing a huge OS kernel.

## Non-Goals (Exclusions that can be treated as disadvantages)
- We don't want to know where TCP packets come from, so we don't consider or think about how other layers work.

## Still considering
- Should we support protocols like [PLPMTUD - Packetization Layer Path MTU Discovery](https://www.ietf.org/rfc/rfc4821.txt) for bad networks that don't serve L3 IP/ICMP services?
- Why must the [TCP checksum computation](https://en.wikipedia.org/wiki/Transmission_Control_Protocol#Checksum_computation) change depending on the layer below?
## RFCs
- https://www.iana.org/assignments/tcp-parameters/tcp-parameters.xhtml
- https://datatracker.ietf.org/doc/html/rfc7805
- https://datatracker.ietf.org/doc/html/rfc7414
- https://datatracker.ietf.org/doc/html/rfc675
- https://datatracker.ietf.org/doc/html/rfc791
- https://datatracker.ietf.org/doc/html/rfc793
- https://datatracker.ietf.org/doc/html/rfc1122
- https://datatracker.ietf.org/doc/html/rfc6298
- https://datatracker.ietf.org/doc/html/rfc1948
- https://datatracker.ietf.org/doc/html/rfc4413

## Similar Projects
- https://github.com/search?l=Go&q=tcp+userspace&type=Repositories
- https://github.com/Xilinx-CNS/onload
- https://github.com/mtcp-stack/mtcp
- https://github.com/tass-belgium/picotcp/blob/master/modules/pico_tcp.c
- https://github.com/saminiir/level-ip
- https://github.com/google/gopacket/blob/master/layers/tcp.go
- https://github.com/Samangan/go-tcp
- https://github.com/mit-pdos/biscuit/blob/master/biscuit/src/inet/

## Resources
- https://en.wikipedia.org/wiki/OSI_model
- https://en.wikipedia.org/wiki/Transmission_Control_Protocol
- https://man7.org/linux/man-pages/man7/tcp.7.html
- https://github.com/torvalds/linux/blob/master/net/ipv4/tcp.c
- https://github.com/torvalds/linux/blob/master/net/ipv6/tcp_ipv6.c

## Attacks
- https://www.akamai.com/blog/security/tcp-middlebox-reflection

## Articles
- https://ieeexplore.ieee.org/document/8672289
- https://engineering.salesforce.com/performance-analysis-of-linux-kernel-library-user-space-tcp-stack-be75fb198730
- https://tempesta-tech.com/blog/user-space-tcp
- https://blog.cloudflare.com/kernel-bypass/
- https://blog.cloudflare.com/why-we-use-the-linux-kernels-tcp-stack/
- https://blog.cloudflare.com/path-mtu-discovery-in-practice/
- https://www.fastly.com/blog/measuring-quic-vs-tcp-computational-efficiency
- https://stackoverflow.com/questions/8509152/max-number-of-goroutines
- https://developpaper.com/deep-analysis-of-source-code-for-building-native-network-model-with-go-netpol-i-o-multiplexing/

## Abbreviations
- L3 >> layer 3 OSI
- IP >> Internet Protocol