English | [中文](README.zh_CN.md)

## Background

When a client sends requests to a server over TCP, the cost of establishing a connection with the three-way handshake has to be taken into account. Typically, a connection is established in advance or when a request is initiated, and after use it is not closed immediately but kept around for later reuse.

A connection pool encapsulates exactly this behavior.

## Principle

The pool maintains a `sync.Map` as the connection pool: the key is encoded from `<network, address, protocol>`, and the value is the `ConnectionPool` holding the connections established to that target address, with a linked list maintaining the idle connections internally. In short-connection mode, the transport layer closes the connection after the RPC call; in connection pool mode, the used connection is returned to the pool and taken out again the next time it is needed.

To achieve this, the connection pool needs the following capabilities:

- Provide available connections, including creating new connections and reusing idle ones.
- Recycle connections released by the upper layer and manage them as idle connections.
- Manage the idle connections in the pool, including the selection policy for reusing connections and health checks on idle connections.
- Adjust the running parameters of the pool according to user configuration.

## Design Implementation

The overall code structure is shown in the figure below.

![design implementation](/.resources/pool/connpool/design_implementation.png)

### Initialize the Connection Pool

`NewConnectionPool` creates a connection pool and accepts `Option`s to modify its parameters; if none are passed, the default values are used. `Dial` is the default method for creating connections, and each `ConnectionPool` builds `DialOptions` from its `GetOptions` to establish connections to the corresponding target.

```go
func NewConnectionPool(opt ...Option) Pool {
    opts := &Options{
        MaxIdle:     defaultMaxIdle,
        IdleTimeout: defaultIdleTimeout,
        DialTimeout: defaultDialTimeout,
        Dial:        Dial,
    }
    for _, o := range opt {
        o(opts)
    }
    return &pool{
        opts:            opts,
        connectionPools: new(sync.Map),
    }
}
```

### Get a Connection

A connection can be obtained through `pool.Get`; see the implementation in `client_transport_tcp.go`:

```go
// Get
getOpts := connpool.NewGetOptions()
getOpts.WithContext(ctx)
getOpts.WithFramerBuilder(opts.FramerBuilder)
getOpts.WithDialTLS(opts.TLSCertFile, opts.TLSKeyFile, opts.CACertFile, opts.TLSServerName)
getOpts.WithLocalAddr(opts.LocalAddr)
getOpts.WithDialTimeout(opts.DialTimeout)
getOpts.WithProtocol(opts.Protocol)
conn, err = opts.Pool.Get(opts.Network, opts.Address, getOpts)
```

`ConnPool` only exposes the `Get` interface to the outside world, which ensures that the connection pool's state cannot be damaged by user error.

`Get` looks up the `ConnectionPool` by `<network, address, protocol>`; if it does not exist yet, it has to be created first. Concurrency control is applied here to prevent the same `ConnectionPool` from being created repeatedly.
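As described in the Principle section, each `ConnectionPool` is stored in the `sync.Map` under a key derived from the target. A minimal sketch of such a key encoding (the real `getNodeKey` in this package may use a different format):

```go
// Hypothetical sketch of the <network, address, protocol> key described above;
// the actual getNodeKey implementation may differ.
func getNodeKey(network, address, protocol string) string {
    return network + "_" + address + "_" + protocol
}
```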
The core code of `pool.Get` is as follows:

```go
func (p *pool) Get(network string, address string, opts GetOptions) (net.Conn, error) {
    // ...
    key := getNodeKey(network, address, opts.Protocol)
    if v, ok := p.connectionPools.Load(key); ok {
        return v.(*ConnectionPool).Get(ctx)
    }
    // create newPool...
    v, ok := p.connectionPools.LoadOrStore(key, newPool)
    if !ok {
        // init newPool...
        return newPool.Get(ctx)
    }
    return v.(*ConnectionPool).Get(ctx)
}
```

After the `ConnectionPool` is obtained, an attempt is made to get a connection from it. First, a `token` has to be acquired. The `token` is a buffered channel used for concurrency control; its capacity is `MaxActive`, the number of connections a caller may use at the same time. When an active connection is returned to the pool or closed, the `token` is released. If `Wait` is `true`, the caller blocks until a `token` becomes available or the context times out; if `Wait` is `false`, `ErrPoolLimit` is returned immediately when no `token` can be obtained.

```go
func (p *ConnectionPool) getToken(ctx context.Context) error {
    if p.MaxActive <= 0 {
        return nil
    }

    if p.Wait {
        select {
        case p.token <- struct{}{}:
            return nil
        case <-ctx.Done():
            return ctx.Err()
        }
    } else {
        select {
        case p.token <- struct{}{}:
            return nil
        default:
            return ErrPoolLimit
        }
    }
}

func (p *ConnectionPool) freeToken() {
    if p.MaxActive <= 0 {
        return
    }
    <-p.token
}
```

After the `token` is successfully acquired, an idle connection is first taken from the idle list; if none is available, a new connection is created and returned.

### Initialize the ConnectionPool

The `ConnectionPool` is initialized on first use in `Get`. Initialization mainly consists of starting the check goroutine and pre-warming idle connections up to `MinIdle`.

#### KeepMinIdles

A sudden surge in business traffic may lead to a large number of new connections being created. Creating a connection is time-consuming and may cause request timeouts, so pre-creating some idle connections has a warm-up effect: when the `ConnectionPool` is created, `MinIdle` connections are created in advance as standby.

#### Check Goroutine

The `ConnectionPool` periodically performs the following checks:

- Idle connection health check

The default health check strategy scans the idle list and directly closes any connection that fails the check. A connection is first checked for liveness, and then for whether it has exceeded `IdleTimeout` or `MaxConnLifetime`. The policy can be customized via `WithHealthChecker`, as shown in the sketch below.

Besides the periodic check, a check is also performed each time an idle connection is taken from the idle list; in that case `isFast` is set to `true` and only a liveness confirmation is performed.
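For reference, a minimal sketch of plugging in a custom policy. It assumes the checker passed to `WithHealthChecker` has the same signature as `defaultChecker` shown below; the actual type may differ.

```go
// Illustrative only: treat every idle connection as healthy, which disables both
// the periodic eviction and the liveness confirmation on take-out. Assumes the
// checker signature matches defaultChecker below.
keepAll := func(pc *connpool.PoolConn, isFast bool) bool {
    return true
}
p := connpool.NewConnectionPool(connpool.WithHealthChecker(keepAll))
```

Such a permissive policy only makes sense when the server never closes idle connections; a practical checker would keep at least a liveness confirmation.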
The default checker is implemented as follows:

```go
func (p *ConnectionPool) defaultChecker(pc *PoolConn, isFast bool) bool {
    if pc.isRemoteError(isFast) {
        return false
    }
    if isFast {
        return true
    }
    if p.IdleTimeout > 0 && pc.t.Add(p.IdleTimeout).Before(time.Now()) {
        return false
    }
    if p.MaxConnLifetime > 0 && pc.created.Add(p.MaxConnLifetime).Before(time.Now()) {
        return false
    }
    return true
}
```

The idle-connection detection time of the pool should generally be configurable so that it can be coordinated with the server side (especially when different frameworks are involved); poor coordination causes problems. For example, if both the pool and the server use a one-minute idle timeout, the server may close a batch of idle connections before the client has detected it, and a large number of sends will then fail and can only be remedied by upper-layer retries. A better approach is to make the server's idle timeout longer than the pool's, so that the client closes idle connections first and is never handed a connection that the server has already closed without the client noticing.

> One optimization idea is to perform a non-blocking read via a system call each time a connection is retrieved, which can determine whether the connection has already been closed by the peer. This works on Unix/Linux, but ran into problems on Windows, so the optimization was removed from tRPC-Go.

- Idle connection count check

Same as `KeepMinIdles`: the idle connections are periodically replenished up to `MinIdle`.

- ConnectionPool idle check

The `transport` never actively closes a `ConnectionPool`, which would leave its background check goroutine running forever. By setting `poolIdleTimeout`, the periodic check automatically closes a `ConnectionPool` that has not been used for that long and whose number of in-use connections is zero.

## Connection life cycle

`MinIdle` is the minimum number of idle connections maintained by the `ConnectionPool`; it is replenished during initialization and during the periodic checks. When a user requests a connection, an idle one is taken first; if no idle connection is available, a new one is created. After the user finishes the request, the connection is returned to the `ConnectionPool`, where one of three things happens:

- If the number of idle connections exceeds `MaxIdle`, one idle connection is closed according to the elimination strategy.
- If `forceClose` is set to `true` for the pool, the connection is closed directly instead of being returned.
- Otherwise, the connection is added to the idle list.

If a read/write error occurs while the user is using the connection, it is closed directly; a connection that fails the availability check is also closed directly.

![connection life cycle](/.resources/pool/connpool/life_cycle.png)
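For reference, a minimal sketch of this cycle from the caller's side. The option setters (`WithMaxIdle`, `WithIdleTimeout`) are assumed to mirror the `Options` fields shown in `NewConnectionPool` and may differ in the actual package; in normal use the client transport drives this cycle itself, as shown in the Get a Connection section.

```go
// Sketch only: ctx and reqData are assumed to exist, and the option setters are
// assumed to mirror the Options fields shown in NewConnectionPool.
p := connpool.NewConnectionPool(
    connpool.WithMaxIdle(16),                 // cap on idle connections kept per target
    connpool.WithIdleTimeout(50*time.Second), // evict idle connections before the server does
)

getOpts := connpool.NewGetOptions()
getOpts.WithContext(ctx)
getOpts.WithDialTimeout(200 * time.Millisecond)

conn, err := p.Get("tcp", "127.0.0.1:8000", getOpts)
if err != nil {
    return err
}
if _, err := conn.Write(reqData); err != nil {
    conn.Close() // after a read/write error the connection is discarded, not reused
    return err
}
// ... read the response ...
conn.Close() // a healthy connection goes back to the idle list (unless forceClose is set)
```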
## Idle Connection Management Policy

The connection pool supports two strategies for selecting and evicting idle connections, FIFO and LIFO, controlled by `PushIdleConnToTail`. The appropriate strategy should be chosen according to the actual characteristics of the business:

- FIFO makes every connection be used evenly. However, if the caller's request rate is low but requests keep arriving just before a connection would qualify as idle long enough to be evicted, none of the connections is ever released, and maintaining that many connections is unnecessary.
- LIFO always reuses the connection at the top of the stack first, so rarely used idle connections at the bottom are evicted first.

```go
func (p *ConnectionPool) addIdleConn(ctx context.Context) error {
    c, err := p.dial(ctx)
    if err != nil {
        return err
    }
    pc := p.newPoolConn(c)
    if !p.PushIdleConnToTail {
        p.idle.pushHead(pc)
    } else {
        p.idle.pushTail(pc)
    }
    return nil
}

func (p *ConnectionPool) getIdleConn() *PoolConn {
    for p.idle.head != nil {
        pc := p.idle.head
        p.idle.popHead()
        // health-check pc and return it if it passes, otherwise close it
        // and try the next idle connection...
    }
    return nil
}

func (p *ConnectionPool) put(pc *PoolConn, forceClose bool) error {
    if !p.closed && !forceClose {
        if !p.PushIdleConnToTail {
            p.idle.pushHead(pc)
        } else {
            p.idle.pushTail(pc)
        }
        if p.idleSize >= p.MaxIdle {
            // evict the idle connection at the tail...
            pc = p.idle.tail
            p.idle.popTail()
        }
    }
    // close pc if it was evicted or force-closed...
    return nil
}
```
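For reference, a sketch of selecting the strategy when constructing the pool, assuming an option setter named after the `PushIdleConnToTail` field (the actual name may differ):

```go
// Hypothetical setter named after the PushIdleConnToTail field above:
// true  pushes returned connections to the tail and pops from the head (FIFO);
// false pushes to the head and pops from the head (LIFO).
fifoPool := connpool.NewConnectionPool(connpool.WithPushIdleConnToTail(true))
lifoPool := connpool.NewConnectionPool(connpool.WithPushIdleConnToTail(false))
```

FIFO suits callers with steady traffic that want connections used evenly; LIFO lets a caller with bursty or declining traffic shed the idle connections it no longer needs.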