github.com/geniusesgroup/libgo@v0.0.0-20220713101832-828057a9d3d4/tcp/README.md (about)

# TCP on UserSpace
In the era of services as APIs, almost every service call fits in under two TCP packets, so the many mechanisms that consume both memory (e.g. Go uses at least [2KB for each goroutine](https://go.dev/doc/go1.4#runtime)) and computing resources (e.g. the many [context switches](https://en.wikipedia.org/wiki/Context_switch) between kernel and user space) to handle these few packets are no longer acceptable.

Suggestions such as [changing the net package](https://github.com/golang/go/issues/15735), or using [epoll](https://en.wikipedia.org/wiki/Epoll) or [kqueue](https://en.wikipedia.org/wiki/Kqueue) and discarding the net package entirely like [this](https://github.com/xtaci/gaio), [this](https://github.com/lesismal/nbio), [this](https://github.com/eranyanay/1m-go-websockets) or [this](https://github.com/panjf2000/gnet), have many other problems:
- Go already uses this mechanism internally in the runtime package, so it is easier to just use a worker mechanism to limit the maximum number of goroutines, as [FastHTTP](https://github.com/valyala/fasthttp) does. Even then, the first Read() on any connection causes two or more unneeded context switches on some OSs like Linux, to check whether any data is ready before the goroutine is scheduled again once the connection becomes read-ready.
- OS-dependent implementations can be very tricky.
- OS-dependent optimizations need changes, such as the maximum number of active and open files on UNIX-based OSs, known as ulimit.
- Balancing the number of events against the timeout (in milliseconds) under both high and low application load isn't easy.
- Runtime adaptors are needed, because other packages, even the Go net library, are not ready for this architecture.

## Why (OS kernel-level disadvantages)
- The Linux (or any other OS) networking stack has a limit on how many packets per second it can handle. When that limit is reached, all CPUs become busy just receiving and routing packets.

## Goals
- Improve performance by reducing resource usage, e.g.:
    - No context switches needed (L3 as IP, ... still needs context switches, but not as many as kernel-based logic)
    - No separate file descriptor needed for each TCP stream (L3 as IP, ... still needs some mechanism to take packets from the OS)
    - Just one buffer, with no more huge memory copies
    - Just one lock mechanism for each stream
    - Just one timeout mechanism per stream of any connection, instead of many in kernel and user space handling the same requirement
    - Mix congestion control with rate limiting
    - Keep a stream alive almost for free: just store some bytes in RAM per stream without impacting other parts of the application
- Track connection and stream metrics for any purpose, like security, ...
- Easily add or change logic, whereas upgrading the host kernel is quite challenging, e.g. to add machine-learning algorithms, ...
- Have the protocol implementation in user space to build applications as unikernel images without needing a huge OS kernel

## Non-Goals (omissions that can be treated as disadvantages)
- We don't want to know how TCP packets arrive, so we don't consider or think about how other layers work.

## Still considering
- Should we support protocols like [PLPMTUD - Packetization Layer Path MTU Discovery](https://www.ietf.org/rfc/rfc4821.txt) for bad networks that don't serve L3 IP/ICMP services?
- Why must the [TCP checksum computation](https://en.wikipedia.org/wiki/Transmission_Control_Protocol#Checksum_computation) change depending on the layer below?

## RFCs
- https://www.iana.org/assignments/tcp-parameters/tcp-parameters.xhtml
- https://datatracker.ietf.org/doc/html/rfc7805
- https://datatracker.ietf.org/doc/html/rfc7414
- https://datatracker.ietf.org/doc/html/rfc675
- https://datatracker.ietf.org/doc/html/rfc791
- https://datatracker.ietf.org/doc/html/rfc793
- https://datatracker.ietf.org/doc/html/rfc1122
- https://datatracker.ietf.org/doc/html/rfc6298
- https://datatracker.ietf.org/doc/html/rfc1948
- https://datatracker.ietf.org/doc/html/rfc4413

## Similar Projects
- https://github.com/search?l=Go&q=tcp+userspace&type=Repositories
- https://github.com/Xilinx-CNS/onload
- https://github.com/mtcp-stack/mtcp
- https://github.com/tass-belgium/picotcp/blob/master/modules/pico_tcp.c
- https://github.com/saminiir/level-ip
- https://github.com/google/gopacket/blob/master/layers/tcp.go
- https://github.com/Samangan/go-tcp
- https://github.com/mit-pdos/biscuit/blob/master/biscuit/src/inet/

## Resources
- https://en.wikipedia.org/wiki/OSI_model
- https://en.wikipedia.org/wiki/Transmission_Control_Protocol
- https://man7.org/linux/man-pages/man7/tcp.7.html
- https://github.com/torvalds/linux/blob/master/net/ipv4/tcp.c
- https://github.com/torvalds/linux/blob/master/net/ipv6/tcp_ipv6.c

## Attacks
- https://www.akamai.com/blog/security/tcp-middlebox-reflection

## Articles
- https://ieeexplore.ieee.org/document/8672289
- https://engineering.salesforce.com/performance-analysis-of-linux-kernel-library-user-space-tcp-stack-be75fb198730
- https://tempesta-tech.com/blog/user-space-tcp
- https://blog.cloudflare.com/kernel-bypass/
- https://blog.cloudflare.com/why-we-use-the-linux-kernels-tcp-stack/
- https://blog.cloudflare.com/path-mtu-discovery-in-practice/
- https://www.fastly.com/blog/measuring-quic-vs-tcp-computational-efficiency
- https://stackoverflow.com/questions/8509152/max-number-of-goroutines
- https://developpaper.com/deep-analysis-of-source-code-for-building-native-network-model-with-go-netpol-i-o-multiplexing/

## Abbreviations
- L3    >> Layer 3 of the OSI model
- IP    >> Internet Protocol