## Anti-Patterns

### Dialing in gRPC
[`grpc.Dial`](https://pkg.go.dev/google.golang.org/grpc#Dial) is a function in
the gRPC library that creates a virtual connection from the gRPC client to the
gRPC server. It takes a target URI (which can represent the name of a logical
backend service and could resolve to multiple actual addresses) and a list of
options, and returns a
[`ClientConn`](https://pkg.go.dev/google.golang.org/grpc#ClientConn) object that
represents the connection to the server. The `ClientConn` contains one or more
actual connections to real server backends and attempts to keep these
connections healthy by automatically reconnecting to them when they break.

The `Dial` function can also be configured with various options to customize the
behavior of the client connection. For example, developers could use options
such as
[`WithTransportCredentials`](https://pkg.go.dev/google.golang.org/grpc#WithTransportCredentials)
to configure the transport credentials to use.

While `Dial` is commonly referred to as a "dialing" function, it doesn't
actually perform the low-level network dialing operation like
[`net.Dial`](https://pkg.go.dev/net#Dial) would. Instead, it creates a virtual
connection from the gRPC client to the gRPC server.

`Dial` does initiate the process of connecting to the server, but it uses the
`ClientConn` object to manage and maintain that connection over time. This is why
errors encountered during the initial connection are no different from those
that occur later on, and why it's important to handle errors from RPCs rather
than relying on options like
[`FailOnNonTempDialError`](https://pkg.go.dev/google.golang.org/grpc#FailOnNonTempDialError),
[`WithBlock`](https://pkg.go.dev/google.golang.org/grpc#WithBlock), and
[`WithReturnConnectionError`](https://pkg.go.dev/google.golang.org/grpc#WithReturnConnectionError).

In fact, `Dial` does not always establish a connection to servers by default.
The connection behavior is determined by the load balancing policy being used.
For instance, an "active" load balancing policy such as Round Robin attempts to
maintain a constant connection, while the default "pick first" policy delays
connection until an RPC is executed. Instead of using the `WithBlock` option,
which is discouraged, you can call the
[`ClientConn.Connect`](https://pkg.go.dev/google.golang.org/grpc#ClientConn.Connect)
method to explicitly initiate a connection.

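As a rough illustration of the paragraph above, the following sketch dials a target without `WithBlock` and then calls `ClientConn.Connect` to start connecting eagerly. The target `localhost:50051` and the use of insecure credentials are assumptions made only to keep the example short; they do not come from this document.

```go
package main

import (
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	// "localhost:50051" is a placeholder target (an assumption for this sketch).
	// Insecure credentials are used only for brevity.
	conn, err := grpc.Dial("localhost:50051",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		// Without WithBlock, Dial does not wait for a connection, so an
		// error here is not a report of a failed network connection.
		log.Fatalf("grpc.Dial: %v", err)
	}
	defer conn.Close()

	// Ask the ClientConn to start connecting now instead of waiting for
	// the first RPC. Connect returns immediately.
	conn.Connect()

	// The state is informational only; connection failures still surface
	// as errors on individual RPCs and must be handled there.
	log.Printf("channel state after Connect: %v", conn.GetState())
}
```

Even with an explicit `Connect`, the connection can be lost at any later point, so the RPC call sites remain the place to handle connection errors.
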
### Using `FailOnNonTempDialError`, `WithBlock`, and `WithReturnConnectionError`

The gRPC API provides several options that can be used to configure the behavior
of dialing and connecting to a gRPC server. Some of these options, such as
`FailOnNonTempDialError`, `WithBlock`, and `WithReturnConnectionError`, rely on
failures at dial time. However, we strongly discourage developers from using
these options, as they can introduce race conditions and result in unreliable
and difficult-to-debug code.

One of the most important reasons for avoiding these options, which is often
overlooked, is that connections can fail at any point in time. This means that
you need to handle RPC failures caused by connection issues, regardless of
whether a connection was never established in the first place, or if it was
created and then immediately lost. Implementing proper error handling for RPCs
is crucial for maintaining the reliability and stability of your gRPC
communication.

### Why we discourage using `FailOnNonTempDialError`, `WithBlock`, and `WithReturnConnectionError`

When a client attempts to connect to a gRPC server, it can encounter a variety
of errors, including network connectivity issues, server-side errors, and
incorrect usage of the gRPC API. The options `FailOnNonTempDialError`,
`WithBlock`, and `WithReturnConnectionError` are designed to handle some of
these errors, but they do so by relying on failures at dial time. This means
that they may not provide reliable or accurate information about the status of
the connection.

For example, if a client uses `WithBlock` to wait for a connection to be
established, it may end up waiting indefinitely if the server is not responding.
Similarly, if a client uses `WithReturnConnectionError` to return a connection
error if dialing fails, it may miss opportunities to recover from transient
network issues that are resolved shortly after the initial dial attempt.

## Best practices for error handling in gRPC

Instead of relying on failures at dial time, we strongly encourage developers to
rely on errors from RPCs. When a client makes an RPC, it can receive an error
response from the server. These errors can provide valuable information about
what went wrong, including information about network issues, server-side errors,
and incorrect usage of the gRPC API.

By handling errors from RPCs correctly, developers can write more reliable and
robust gRPC applications. Here are some best practices for error handling in
gRPC:

- Always check for error responses from RPCs and handle them appropriately.
- Use the [`status`](https://pkg.go.dev/google.golang.org/grpc/status) package
  (for example, `status.FromError`) to determine the type of error that
  occurred.
- When retrying failed RPCs, consider using the built-in retry mechanism
  provided by gRPC-Go, if available, instead of manually implementing retries.
  Refer to the [gRPC-Go retry example
  documentation](https://github.com/grpc/grpc-go/blob/master/examples/features/retry/README.md)
  for more information.
- Avoid using `FailOnNonTempDialError`, `WithBlock`, and
  `WithReturnConnectionError`, as these options can introduce race conditions and
  result in unreliable and difficult-to-debug code.
- If making the outgoing RPC in order to handle an incoming RPC, be sure to
  translate the status code before returning the error from your method handler.
  For example, if the error is an `INVALID_ARGUMENT` error, that probably means
  your service has a bug (otherwise it shouldn't have triggered this error), in
  which case `INTERNAL` is more appropriate to return back to your users.

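The built-in retry mechanism mentioned in the list is driven by a retry policy in the gRPC service config. The sketch below shows what such a policy might look like, under these assumptions: the service name `example.Echo` and all numeric values are hypothetical, and the small Go program only checks that the JSON is well formed using the standard library. In a real client the string would typically be supplied through an option such as `grpc.WithDefaultServiceConfig`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// retryServiceConfig is an illustrative service config with a retry policy.
// The service name and all values are assumptions chosen for this sketch.
const retryServiceConfig = `{
  "methodConfig": [{
    "name": [{"service": "example.Echo"}],
    "retryPolicy": {
      "maxAttempts": 4,
      "initialBackoff": "0.1s",
      "maxBackoff": "1s",
      "backoffMultiplier": 2.0,
      "retryableStatusCodes": ["UNAVAILABLE"]
    }
  }]
}`

// maxAttempts decodes just enough of a service config to report the retry
// limit of its first method config. It is a sanity check for this sketch,
// not part of gRPC.
func maxAttempts(cfg string) (int, error) {
	var parsed struct {
		MethodConfig []struct {
			RetryPolicy struct {
				MaxAttempts int `json:"maxAttempts"`
			} `json:"retryPolicy"`
		} `json:"methodConfig"`
	}
	if err := json.Unmarshal([]byte(cfg), &parsed); err != nil {
		return 0, err
	}
	if len(parsed.MethodConfig) == 0 {
		return 0, fmt.Errorf("service config has no methodConfig entries")
	}
	return parsed.MethodConfig[0].RetryPolicy.MaxAttempts, nil
}

func main() {
	n, err := maxAttempts(retryServiceConfig)
	if err != nil {
		fmt.Println("invalid service config:", err)
		return
	}
	fmt.Println("max attempts:", n)
}
```

Limiting `retryableStatusCodes` to codes such as `UNAVAILABLE` keeps the client from retrying errors that would fail again for the same reason.
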
### Example: Handling errors from an RPC

The following code snippet demonstrates how to handle errors from an RPC in
gRPC:

```go
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
defer cancel()

res, err := client.MyRPC(ctx, &MyRequest{})
if err != nil {
	// Handle the error appropriately,
	// log it & return an error to the caller, etc.
	log.Printf("Error calling MyRPC: %v", err)
	return nil, err
}

// Use the response as appropriate
log.Printf("MyRPC response: %v", res)
```

To determine the type of error that occurred, extract the status from the error
with `status.FromError` and inspect its code:

```go
resp, err := client.MakeRPC(context.Background(), request)
if err != nil {
	// Name the result st so that it does not shadow the status package.
	st, ok := status.FromError(err)
	if ok {
		// Handle the error based on its status code
		if st.Code() == codes.NotFound {
			log.Println("Requested resource not found")
		} else {
			log.Printf("RPC error: %v", st.Message())
		}
	} else {
		// Handle non-RPC errors
		log.Printf("Non-RPC error: %v", err)
	}
	return
}

// Use the response as needed
log.Printf("Response received: %v", resp)
```

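The advice in the best-practices list about translating status codes before returning them from a handler can be sketched as follows. To stay self-contained, this example uses local stand-in types (`Code`, `StatusError`) that are hypothetical and are not the gRPC-Go types; in real code, `status.Code` and `status.Error` from the `status` package, together with the `codes` package, would play these roles.

```go
package main

import (
	"errors"
	"fmt"
)

// Code is a local stand-in for a gRPC status code (hypothetical type).
type Code string

const (
	InvalidArgument Code = "INVALID_ARGUMENT"
	Internal        Code = "INTERNAL"
	Unknown         Code = "UNKNOWN"
)

// StatusError is a local stand-in for an error carrying a status code.
type StatusError struct {
	Code Code
	Msg  string
}

func (e *StatusError) Error() string { return string(e.Code) + ": " + e.Msg }

// translate maps an error from an outgoing RPC to the error that the
// incoming RPC's handler should return to its own caller.
func translate(err error) error {
	if err == nil {
		return nil
	}
	var se *StatusError
	if !errors.As(err, &se) {
		// Not a status error: report it without claiming a specific code.
		return &StatusError{Code: Unknown, Msg: err.Error()}
	}
	switch se.Code {
	case InvalidArgument:
		// Our handler built the rejected request, so this is a bug in
		// our service rather than a mistake by our caller.
		return &StatusError{Code: Internal, Msg: "backend rejected our request: " + se.Msg}
	default:
		return se
	}
}

func main() {
	backendErr := &StatusError{Code: InvalidArgument, Msg: "missing field"}
	fmt.Println(translate(backendErr))
}
```

Which codes pass through unchanged is a per-service decision; the `default` branch here simply forwards them.
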
### Example: Using a backoff strategy

When retrying failed RPCs, use a backoff strategy to avoid overwhelming the
server or exacerbating network issues:

```go
var res *MyResponse
var err error

// Retry the RPC call a maximum number of times
for i := 0; i < maxRetries; i++ {
	// Give each attempt its own deadline so that time spent backing off
	// does not use up the deadline of later attempts.
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	res, err = client.MyRPC(ctx, &MyRequest{})
	cancel()

	// Stop on success, or when no attempts remain (no sleep after the last one)
	if err == nil || i == maxRetries-1 {
		break
	}

	// The RPC failed, so wait for a backoff period before retrying
	backoff := time.Duration(i+1) * time.Second
	log.Printf("Error calling MyRPC: %v; retrying in %v", err, backoff)
	time.Sleep(backoff)
}

// Check if the RPC call was successful after all retries
if err != nil {
	// All retries failed, so handle the error appropriately
	log.Printf("Error calling MyRPC: %v", err)
	return nil, err
}

// Use the response as appropriate
log.Printf("MyRPC response: %v", res)
```

## Conclusion

The
[`FailOnNonTempDialError`](https://pkg.go.dev/google.golang.org/grpc#FailOnNonTempDialError),
[`WithBlock`](https://pkg.go.dev/google.golang.org/grpc#WithBlock), and
[`WithReturnConnectionError`](https://pkg.go.dev/google.golang.org/grpc#WithReturnConnectionError)
options are designed to handle errors at dial time, but they can introduce race
conditions and result in unreliable and difficult-to-debug code. Instead of
relying on these options, we strongly encourage developers to rely on errors
from RPCs for error handling. By following best practices for error handling in
gRPC, developers can write more reliable and robust gRPC applications.
   205  from RPCs for error handling. By following best practices for error handling in
   206  gRPC, developers can write more reliable and robust gRPC applications.