## Anti-Patterns

### Dialing in gRPC

[`grpc.Dial`](https://pkg.go.dev/google.golang.org/grpc#Dial) is a function in
the gRPC library that creates a virtual connection from the gRPC client to the
gRPC server. It takes a target URI (which can represent the name of a logical
backend service and could resolve to multiple actual addresses) and a list of
options, and returns a
[`ClientConn`](https://pkg.go.dev/google.golang.org/grpc#ClientConn) object that
represents the connection to the server. The `ClientConn` contains one or more
actual connections to real server backends and attempts to keep these
connections healthy by automatically reconnecting to them when they break.

The `Dial` function can also be configured with various options to customize the
behavior of the client connection. For example, developers could use options
such as
[`WithTransportCredentials`](https://pkg.go.dev/google.golang.org/grpc#WithTransportCredentials)
to configure the transport credentials to use.

While `Dial` is commonly referred to as a "dialing" function, it doesn't
actually perform the low-level network dialing operation like
[`net.Dial`](https://pkg.go.dev/net#Dial) would. Instead, it creates a virtual
connection from the gRPC client to the gRPC server.

`Dial` does initiate the process of connecting to the server, but it uses the
`ClientConn` object to manage and maintain that connection over time.
This is why
errors encountered during the initial connection are no different from those
that occur later on, and why it's important to handle errors from RPCs rather
than relying on options like
[`FailOnNonTempDialError`](https://pkg.go.dev/google.golang.org/grpc#FailOnNonTempDialError),
[`WithBlock`](https://pkg.go.dev/google.golang.org/grpc#WithBlock), and
[`WithReturnConnectionError`](https://pkg.go.dev/google.golang.org/grpc#WithReturnConnectionError).
In fact, `Dial` does not always establish a connection to servers by default.
The connection behavior is determined by the load balancing policy being used.
For instance, an "active" load balancing policy such as Round Robin attempts to
maintain a constant connection, while the default "pick first" policy delays
connection until an RPC is executed. Instead of using the `WithBlock` option,
which we discourage, you can call the
[`ClientConn.Connect`](https://pkg.go.dev/google.golang.org/grpc#ClientConn.Connect)
method to explicitly initiate a connection.

### Using `FailOnNonTempDialError`, `WithBlock`, and `WithReturnConnectionError`

The gRPC API provides several options that can be used to configure the behavior
of dialing and connecting to a gRPC server. Some of these options, such as
`FailOnNonTempDialError`, `WithBlock`, and `WithReturnConnectionError`, rely on
failures at dial time. However, we strongly discourage developers from using
these options, as they can introduce race conditions and result in unreliable
and difficult-to-debug code.

One of the most important reasons for avoiding these options, which is often
overlooked, is that connections can fail at any point in time. This means that
you need to handle RPC failures caused by connection issues, regardless of
whether a connection was never established in the first place, or if it was
created and then immediately lost.
Implementing proper error handling for RPCs
is crucial for maintaining the reliability and stability of your gRPC
communication.

### Why we discourage using `FailOnNonTempDialError`, `WithBlock`, and `WithReturnConnectionError`

When a client attempts to connect to a gRPC server, it can encounter a variety
of errors, including network connectivity issues, server-side errors, and
incorrect usage of the gRPC API. The options `FailOnNonTempDialError`,
`WithBlock`, and `WithReturnConnectionError` are designed to handle some of
these errors, but they do so by relying on failures at dial time. This means
that they may not provide reliable or accurate information about the status of
the connection.

For example, if a client uses `WithBlock` to wait for a connection to be
established, it may end up waiting indefinitely if the server is not responding.
Similarly, if a client uses `WithReturnConnectionError` to return a connection
error if dialing fails, it may miss opportunities to recover from transient
network issues that are resolved shortly after the initial dial attempt.

## Best practices for error handling in gRPC

Instead of relying on failures at dial time, we strongly encourage developers to
rely on errors from RPCs. When a client makes an RPC, it can receive an error
response from the server. These errors can provide valuable information about
what went wrong, including information about network issues, server-side errors,
and incorrect usage of the gRPC API.

By handling errors from RPCs correctly, developers can write more reliable and
robust gRPC applications. Here are some best practices for error handling in
gRPC:

- Always check for error responses from RPCs and handle them appropriately.
- Use the `status` package (for example, `status.FromError`) to extract the
  status code from the error and determine the type of error that occurred.
- When retrying failed RPCs, consider using the built-in retry mechanism
  provided by gRPC-Go, if available, instead of manually implementing retries.
  Refer to the [gRPC-Go retry example
  documentation](https://github.com/grpc/grpc-go/blob/master/examples/features/retry/README.md)
  for more information.
- Avoid using `FailOnNonTempDialError`, `WithBlock`, and
  `WithReturnConnectionError`, as these options can introduce race conditions and
  result in unreliable and difficult-to-debug code.
- If you make an outgoing RPC while handling an incoming RPC, be sure to
  translate the status code before returning the error from your method handler.
  For example, if the error is an `INVALID_ARGUMENT` error, that probably means
  your service has a bug (otherwise it shouldn't have triggered this error), in
  which case `INTERNAL` is more appropriate to return back to your users.

### Example: Handling errors from an RPC

The following code snippet demonstrates how to handle errors from an RPC in
gRPC:

```go
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
defer cancel()

res, err := client.MyRPC(ctx, &MyRequest{})
if err != nil {
	// Handle the error appropriately,
	// log it & return an error to the caller, etc.
	log.Printf("Error calling MyRPC: %v", err)
	return nil, err
}

// Use the response as appropriate
log.Printf("MyRPC response: %v", res)
```

To determine the type of error that occurred, you can use the `status` package
to extract the status from the error:

```go
resp, err := client.MakeRPC(context.Background(), request)
if err != nil {
	// Name the result st so that it does not shadow the status package.
	st, ok := status.FromError(err)
	if ok {
		// Handle the error based on its status code
		if st.Code() == codes.NotFound {
			log.Println("Requested resource not found")
		} else {
			log.Printf("RPC error: %v", st.Message())
		}
	} else {
		// Handle non-RPC errors
		log.Printf("Non-RPC error: %v", err)
	}
	return
}

// Use the response as needed
log.Printf("Response received: %v", resp)
```

### Example: Using a backoff strategy

When retrying failed RPCs, use a backoff strategy to avoid overwhelming the
server or exacerbating network issues:

```go
var res *MyResponse
var err error

// Retry the RPC call a maximum number of times
for i := 0; i < maxRetries; i++ {
	// Give each attempt its own deadline, so that time spent backing off
	// does not use up the deadline of later attempts
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)

	// Make the RPC call
	res, err = client.MyRPC(ctx, &MyRequest{})
	cancel()

	// Check if the RPC call was successful
	if err == nil {
		// The RPC was successful, so break out of the loop
		break
	}

	// Do not wait after the final attempt
	if i == maxRetries-1 {
		break
	}

	// The RPC failed, so wait for a backoff period before retrying.
	// The delay grows linearly: 1s, 2s, 3s, ...
	backoff := time.Duration(i+1) * time.Second
	log.Printf("Error calling MyRPC: %v; retrying in %v", err, backoff)
	time.Sleep(backoff)
}

// Check if the RPC call was successful after all retries
if err != nil {
	// All retries failed, so handle the error appropriately
	log.Printf("Error calling MyRPC: %v", err)
	return nil, err
}

// Use the response as appropriate
log.Printf("MyRPC response: %v", res)
```

## Conclusion

The
[`FailOnNonTempDialError`](https://pkg.go.dev/google.golang.org/grpc#FailOnNonTempDialError),
[`WithBlock`](https://pkg.go.dev/google.golang.org/grpc#WithBlock), and
[`WithReturnConnectionError`](https://pkg.go.dev/google.golang.org/grpc#WithReturnConnectionError)
options are designed to handle errors at dial time, but they can introduce race
conditions and result in unreliable and difficult-to-debug code. Instead of
relying on these options, we strongly encourage developers to rely on errors
from RPCs for error handling. By following best practices for error handling in
gRPC, developers can write more reliable and robust gRPC applications.