# gVisor RACK

gVisor has implemented the [RACK](https://datatracker.ietf.org/doc/html/rfc8985)
(Recent ACKnowledgement) TCP loss-detection algorithm in our network stack,
which improves throughput in the presence of packet loss and reordering.

TCP is a connection-oriented protocol that detects and recovers from loss by
retransmitting packets. [RACK](https://datatracker.ietf.org/doc/html/rfc8985) is
one of the recent loss-detection methods implemented in Linux and BSD, which
helps in identifying packet loss quickly and accurately in the presence of
packet reordering and tail losses.

## Background

The TCP congestion window indicates the number of unacknowledged packets that
can be sent at any time. When packet loss is identified, the congestion window
is reduced depending on the type of loss. The sender will recover from the loss
after all the packets sent before reducing the congestion window are
acknowledged. If the loss is identified falsely by the connection, then the
connection enters loss recovery unnecessarily, resulting in sending fewer
packets.

Packet loss is identified mainly in two ways:

1. Three duplicate acknowledgments, which will result in either
   [Fast](https://datatracker.ietf.org/doc/html/rfc2001#section-4) or
   [SACK](https://datatracker.ietf.org/doc/html/rfc6675) recovery. The
   congestion window is reduced depending on the type of congestion control
   algorithm. For example, in the
   [Reno](https://en.wikipedia.org/wiki/TCP_congestion_control#TCP_Tahoe_and_Reno)
   algorithm it is reduced to half.
2. RTO (Retransmission Timeout) which will result in Timeout recovery. The
   congestion window is reduced to one
   [MSS](https://en.wikipedia.org/wiki/Maximum_segment_size).

Both of these cases result in reducing the congestion window, with RTO being
more expensive.
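
To make the cost difference between the two cases concrete, here is a minimal
sketch in Python. It is an illustration only, not gVisor's code: the function
names are hypothetical, the window is counted in packets, and the floor of two
packets on the halved window is an assumption borrowed from the lower bound in
RFC 5681.

```python
def cwnd_after_dup_acks(cwnd: int) -> int:
    """Reno-style reduction after three duplicate ACKs: halve the window.

    The floor of 2 packets is an assumption of this sketch.
    """
    return max(cwnd // 2, 2)


def cwnd_after_rto(cwnd: int) -> int:
    """Reduction after a retransmission timeout: restart from one MSS."""
    return 1


if __name__ == "__main__":
    cwnd = 40  # hypothetical window size, in packets
    print("after duplicate ACKs:", cwnd_after_dup_acks(cwnd))
    print("after RTO:", cwnd_after_rto(cwnd))
```

With a window of 40 packets, the duplicate-ACK path keeps half of the sending
rate, while the RTO path has to rebuild the window from a single packet.
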
Most of the existing algorithms do not detect packet reordering,
which gets incorrectly identified as packet loss, resulting in an RTO.
Furthermore, the loss of an ACK at the end of a sequence (known as "tail loss")
will also trigger RTO and slow down future transmissions unnecessarily. RACK
helps us to identify loss accurately in all these scenarios, and will avoid
entering RTO.

## Implementation of RACK

Implementation of RACK requires support for:

1. Per-packet transmission timestamps: RACK detects loss depending on the
   transmission times of the packet and the timestamp at which ACK was
   received.
2. SACK and ability to detect DSACK: Selective Acknowledgement and Duplicate
   SACK are used to adjust the timer window after which a packet can be marked
   as lost.

### Packet Reordering

Packet reordering commonly occurs when different packets take different paths
through a network. The diagram below shows the transmission of four packets
which get reordered in transmission, and the resulting TCP behavior with and
without RACK.

![Figure 1](/assets/images/2021-08-31-rack-figure1.png "Packet reordering.")

In the above example, the sender sees three duplicate acknowledgments. Without
RACK, this is identified falsely as packet loss, and the congestion window will
be reduced after entering Fast/SACK recovery.

To detect packet reordering, RACK uses a reorder window, bounded between
[[RTT](https://en.wikipedia.org/wiki/Round-trip_delay)/4, RTT]. The reorder
timer is set to expire after _RTT+reorder\_window_. A packet is marked as lost
when the packets following it were acknowledged using SACK and the reorder timer
expires. The reorder window is increased when a DSACK is received (which
indicates that there is a higher degree of reordering).

### Tail Loss

Tail loss occurs when packets are lost at the end of data transmission.
The diagram below shows an example of tail loss when the last three packets are
lost, and how it is handled with and without RACK.

![Figure 2](/assets/images/2021-08-31-rack-figure2.png "Tail loss figure 2.")

For tail losses, RACK uses a Tail Loss Probe (TLP), which relies on a timer for
the last packet sent. The TLP timer is set to _2 \* RTT_, after which a probe is
sent. The probe packet will allow the connection one more chance to detect a
loss by triggering ACK feedback to avoid entering RTO. In the above example, the
loss is recovered without entering the RTO.

TLP will also help in cases where the ACK was lost but all the packets were
received by the receiver. The diagram below shows that the ACK received for the
probe packet avoided the RTO.

![Figure 3](/assets/images/2021-08-31-rack-figure3.png "Tail loss figure 3.")

If there was some loss, then the ACK for the probe packet will have the SACK
blocks, which will be used to detect and retransmit the lost packets.

In gVisor, we have support for
[NewReno](https://datatracker.ietf.org/doc/html/rfc6582) and SACK loss recovery
methods. We
[added support for RACK](https://github.com/google/gvisor/issues/5243) recently,
and it is the default when SACK is enabled. After enabling RACK, our internal
benchmarks in the presence of reordering and tail losses and the data we took
from internal users inside Google have shown a ~50% reduction in the number of
RTOs.

While RACK has improved one aspect of TCP performance by reducing the timeouts
in the presence of reordering and tail losses, in gVisor we plan to implement
the undoing of congestion windows and
[BBRv2](https://datatracker.ietf.org/doc/html/draft-cardwell-iccrg-bbr-congestion-control)
(once there is an RFC available) to further improve TCP performance in less
ideal network conditions.

If you haven’t already, try gVisor.
The instructions to get started are in our
[Quick Start](https://gvisor.dev/docs/user_guide/quick_start/docker/). You can
also get involved with the gVisor community via our
[Gitter channel](https://gitter.im/gvisor/community),
[email list](https://groups.google.com/forum/#!forum/gvisor-users),
[issue tracker](https://gvisor.dev/issue/new), and
[GitHub repository](https://github.com/google/gvisor).
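
For readers who want the timer arithmetic from the Packet Reordering and Tail
Loss sections in one place, the following is a rough Python sketch. It is not
gVisor's implementation: `SentPacket`, `reorder_window`, `is_lost`, and
`tlp_timeout` are hypothetical names, times are in milliseconds, a single RTT
value stands in for the separate RTT estimates a real stack keeps, and widening
the window by one RTT/4 step per DSACK is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SentPacket:
    seq: int            # hypothetical index of the packet
    xmit_time: float    # transmission timestamp, in milliseconds
    sacked: bool = False


def reorder_window(rtt: float, dsack_rounds: int = 0) -> float:
    """Reorder window: starts at RTT/4 and is capped at RTT.

    Growing by RTT/4 per observed DSACK is an assumption of this sketch.
    """
    return min(rtt / 4 * (1 + dsack_rounds), rtt)


def is_lost(pkt: SentPacket, packets: Iterable[SentPacket], now: float,
            rtt: float, dsack_rounds: int = 0) -> bool:
    """Mark a packet lost only if a later packet was SACKed AND the
    reorder timer (RTT + reorder window) has expired for it."""
    if pkt.sacked:
        return False
    later_sacked = any(
        p.sacked and p.xmit_time > pkt.xmit_time for p in packets
    )
    timer_expired = now - pkt.xmit_time > rtt + reorder_window(rtt, dsack_rounds)
    return later_sacked and timer_expired


def tlp_timeout(rtt: float) -> float:
    """Tail Loss Probe timer as described in the text: 2 * RTT."""
    return 2 * rtt


if __name__ == "__main__":
    rtt = 100.0
    first = SentPacket(seq=1, xmit_time=0.0)
    second = SentPacket(seq=2, xmit_time=1.0, sacked=True)
    sent = [first, second]
    print("reorder window (ms):", reorder_window(rtt))
    print("lost at t=120?", is_lost(first, sent, now=120.0, rtt=rtt))
    print("lost at t=130?", is_lost(first, sent, now=130.0, rtt=rtt))
    print("TLP timeout (ms):", tlp_timeout(rtt))
```

The point of the two-part check in `is_lost` is that a SACK for a later packet
alone does not trigger recovery; the sender waits out the reorder timer first,
which is what lets a merely reordered packet arrive without shrinking the
congestion window.
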