Getting Error Handling right in gRPC

Pankaj Kumar
Jul 4, 2021

Handling errors well can be tricky, and it can be even trickier in gRPC. gRPC currently has only limited built-in error handling, based on simple status codes and metadata. In this article, we will look at the limitations of gRPC error handling and how to overcome them to build a robust error handling framework. In the next article, we will examine how to handle errors in RESTful APIs using Spring Boot.

Code Example

The working code example for this article is available on GitHub. To run the example, clone the repository and import grpc-spring-boot as a project in your favorite IDE.

The code example consists of two microservices:

  • Product Gateway — acts as an API Gateway (client of Product Service) and exposes REST APIs (Gradle module product-api-gateway)
  • Product Service — exposes gRPC APIs (Gradle module product-service)

There is a third Gradle module, called commons, which contains common exceptions used by both Product Gateway and Product Service.

You can start these services from your IDE by running the main methods of ProductGatewayApplication and ProductApplication respectively.

You can test the application by calling the Product Gateway API:

curl --location --request GET 'http://localhost:8080/products/32c29935-da42-4801-825a-ac410584c281' --data-raw ''

Error handling in gRPC

By default, gRPC relies heavily on status codes for error handling, but this approach has certain drawbacks. Let’s work through an example.

In our sample application, the server-side Product Service exposes a gRPC method, getProduct. This API fetches a Product from ProductRepository and returns the response to the client as:

ProductRepository fetches data from productStorage and returns the Product, or throws an error if the Product is not found, as:
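
A minimal, JDK-only sketch of such a repository follows. The original listing is not reproduced here, so the class name ProductRepositorySketch, the simplified Product fields, and the get/save method names are assumptions for illustration; only productStorage, ResourceNotFoundException, and the "Product ID not found" message come from the article.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the article's ProductRepository.
class ProductRepositorySketch {

    // Domain exception with no dependency on gRPC types.
    static class ResourceNotFoundException extends RuntimeException {
        ResourceNotFoundException(String message) {
            super(message);
        }
    }

    // Simplified product model (the fields are assumptions).
    static final class Product {
        final String id;
        final String name;

        Product(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // In-memory stand-in for productStorage.
    private final Map<String, Product> productStorage = new ConcurrentHashMap<>();

    void save(Product product) {
        productStorage.put(product.id, product);
    }

    // Returns the product, or throws the domain exception when the ID is unknown.
    Product get(String productId) {
        Optional<Product> product = Optional.ofNullable(productStorage.get(productId));
        return product.orElseThrow(() -> new ResourceNotFoundException("Product ID not found"));
    }

    public static void main(String[] args) {
        ProductRepositorySketch repository = new ProductRepositorySketch();
        repository.save(new Product("32c29935-da42-4801-825a-ac410584c281", "Sample product"));
        System.out.println(repository.get("32c29935-da42-4801-825a-ac410584c281").name);
        try {
            repository.get("missing-id");
        } catch (ResourceNotFoundException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Note that nothing in this class imports a gRPC type; the repository only knows about the domain exception.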

You may ask why we need to throw a custom exception at all. Why not throw the gRPC-specific StatusRuntimeException, as in:

product.orElseThrow(() -> Status.NOT_FOUND.withDescription("Product ID not found").asRuntimeException());

The biggest benefit is separation of concerns. You don’t want to pollute business logic with gRPC-specific code, which belongs to the transport (API) layer.

The responsibility of the client application (Product Gateway Service) is to call the server application and convert the received response to a domain object. In case of an error, it simply wraps the error in a domain-specific exception, as ServiceException(error.getCause()), and throws it to be handled upstream.

This seems pretty straightforward, but there is one problem. In case of an error, on the client side you’ll see:

io.grpc.StatusRuntimeException: UNKNOWN

Why do we see a StatusRuntimeException with status UNKNOWN?

gRPC wraps our custom ResourceNotFoundException in a StatusRuntimeException, swallows the error message, and assigns the default status code UNKNOWN.

We can improve error handling by catching ResourceNotFoundException in the server’s service implementation and calling responseObserver.onError(..) as:

On the client side, you will see:

Error while calling product service, cause NOT_FOUND: Product ID not found

You’ll notice that on the client side you don’t get the original ResourceNotFoundException thrown by the server, so error.getCause() on the client returns null.

throw new ServiceException(error.getCause()); 
//error.getCause() is null

According to the official documentation of Status.withCause(Throwable cause), the cause is not transmitted from server to client:

Create a derived instance of Status with the given cause. However, the cause is not transmitted from server to client.

grpc-java documentation

Passing error metadata using gRPC Metadata

But what if you need to pass some error metadata back to the client? For example, in our sample application, we may want to pass the ID of the Product and a standard error message when an error occurs. This can be done by using gRPC Metadata.

Fortunately, the ResourceNotFoundException class has an overloaded constructor that takes additional error metadata: ResourceNotFoundException(String message, Map<String, String> errorMetaData).

We can change the Product Service API call to catch ResourceNotFoundException and call responseObserver.onError(statusRuntimeException) with additional metadata as:
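
As a rough illustration, an exception with such an overloaded constructor could look like the sketch below. The constructor signature and the getErrorMetaData() accessor are the ones named in this article; the wrapper class ErrorMetadataSketch, the defensive copy, and the sample keys in main are assumptions, and the real class in the commons module may differ.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical wrapper so the sketch compiles as a single file.
class ErrorMetadataSketch {

    static class ResourceNotFoundException extends RuntimeException {

        private final Map<String, String> errorMetaData;

        ResourceNotFoundException(String message) {
            this(message, Collections.emptyMap());
        }

        ResourceNotFoundException(String message, Map<String, String> errorMetaData) {
            super(message);
            // Defensive copy (an assumption): later changes by the caller do not leak in.
            this.errorMetaData = Collections.unmodifiableMap(new LinkedHashMap<>(errorMetaData));
        }

        Map<String, String> getErrorMetaData() {
            return errorMetaData;
        }
    }

    public static void main(String[] args) {
        Map<String, String> details = new LinkedHashMap<>();
        details.put("resource_id", "32c29935-da42-4801-825a-ac410584c281");
        details.put("message", "Product ID not found");
        ResourceNotFoundException error =
                new ResourceNotFoundException("Product ID not found", details);
        System.out.println(error.getErrorMetaData());
    }
}
```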

Let’s understand what’s being done here.

  1. Get the error metadata from our custom ResourceNotFoundException with error.getErrorMetaData().
  2. For each key-value pair of the error metadata, create a key with Metadata.Key.of(entry.getKey(), Metadata.ASCII_STRING_MARSHALLER).
  3. Store the key-value pairs in metadata by calling metadata.put(key, value).
  4. Create a StatusRuntimeException from the Status, passing in the metadata.
  5. Call responseObserver.onError(..) to signal the error condition.

On the client side, you can catch StatusRuntimeException and get the Metadata from the error as:

In case of error, the above statement prints:

Received key Key{name='resource_id'}, with value 32c29935-da42-4801-825a-ac410584c281
Received key Key{name='content-type'}, with value application/grpc
Received key Key{name='message'}, with value Product ID not found

As you can see, it’s not clear which metadata is error-related, since metadata can also contain other information such as content-type (or trace information). Of course, you can define your own convention (for example, prefixing all error metadata keys with err_).

There is another, cleaner way to handle error metadata propagation.
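
To make such a prefix convention concrete, here is a small JDK-only sketch that separates error-related entries from everything else. It uses a plain Map as a stand-in for the key-value pairs read from io.grpc.Metadata; the class name ErrorMetadataFilter, the ERROR_PREFIX constant, and the errorEntries helper are hypothetical and not part of gRPC.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper illustrating an "err_" key convention.
class ErrorMetadataFilter {

    // Assumed convention: every error-related key starts with this prefix.
    static final String ERROR_PREFIX = "err_";

    // Keeps only the prefixed entries and strips the prefix from their keys.
    static Map<String, String> errorEntries(Map<String, String> received) {
        Map<String, String> errors = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : received.entrySet()) {
            if (entry.getKey().startsWith(ERROR_PREFIX)) {
                errors.put(entry.getKey().substring(ERROR_PREFIX.length()), entry.getValue());
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // Plain map standing in for the pairs read from io.grpc.Metadata.
        Map<String, String> received = new LinkedHashMap<>();
        received.put("err_resource_id", "32c29935-da42-4801-825a-ac410584c281");
        received.put("content-type", "application/grpc");
        received.put("err_message", "Product ID not found");
        System.out.println(errorEntries(received));
    }
}
```

The convention works, but every client and server has to agree on it by hand.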

Google Richer Error Model

Google’s google.rpc.Status provides much richer error handling capabilities. This approach is used by Google APIs, but it’s not yet part of the official gRPC error model. Internally, it still uses metadata, but in a cleaner way. google.rpc.Status has three fields: an int32 code, a string message, and repeated google.protobuf.Any details.

Be aware of the gotcha with this approach: it’s not supported by all language libraries, and implementations may not be consistent across languages.

The richness of error handling comes from the repeated google.protobuf.Any field. From the documentation:

`Any` contains an arbitrary serialized protocol buffer message along with a URL that describes the type of the serialized message.

You can use Any to pack your own custom error models or use any of the predefined messages in error_details.proto. Let’s look at both approaches.

Using custom error model

Define your own custom error model as:

On the server-side Product Service, build the ErrorInfo model and add it to com.google.rpc.Status by calling .addDetails(Any.pack(errorStatus)) as:

And on the client-side Product Gateway Service, change the catch block as:

Using pre-defined error model

Rather than defining your own error model, you can use the predefined error models from error_details.proto. For example, you can use ErrorInfo, which carries a reason string, a domain string, and a metadata map of string keys to string values.

On the server-side Product Service, you can use com.google.rpc.ErrorInfo as:

The only change on the client side is to use the compiled ErrorInfo class as:

Global Interceptor for error handling

The approach of catching and throwing exceptions in the server-side Product Service can quickly get very complex and clumsy. In the case of complex business logic, you may end up with code like catch (ResourceNotFoundException | ServiceException | OtherException error).

We can simplify this by using a gRPC interceptor. The interceptor catches such exceptions and processes them accordingly as:

Let’s understand what’s being done here:

  • First, create ExceptionHandler, which overrides onHalfClose(), by extending ForwardingServerCallListener.SimpleForwardingServerCallListener<T>.
  • The handleException(..) method first builds google.rpc.ErrorInfo and then adds it to com.google.rpc.Status, which internally builds the new metadata carrying the error details.
  • Because serverCall.close(status, newHeaders) takes io.grpc.Status, we need to convert com.google.rpc.Status by calling Status.fromThrowable(statusRuntimeException).
  • Then all we need to do is call serverCall.close(status, newHeaders) with the io.grpc.Status and the new metadata.
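
The heart of a central handler like this is one place that decides which status an exception maps to. The JDK-only sketch below shows only that lookup, not the interceptor itself. The class name ExceptionStatusMapper and the codeFor helper are hypothetical, status codes are represented by their names as plain strings (a real handler would return io.grpc.Status), and mapping ServiceException to INTERNAL is an assumption. The UNKNOWN fallback mirrors the default behavior described earlier.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lookup from exception type to a gRPC status code name.
class ExceptionStatusMapper {

    // Minimal stand-ins for the exceptions in the commons module.
    static class ResourceNotFoundException extends RuntimeException {
        ResourceNotFoundException(String message) {
            super(message);
        }
    }

    static class ServiceException extends RuntimeException {
        ServiceException(String message) {
            super(message);
        }
    }

    // Insertion-ordered, so more specific exception types can be registered first.
    private static final Map<Class<? extends Throwable>, String> CODES = new LinkedHashMap<>();

    static {
        CODES.put(ResourceNotFoundException.class, "NOT_FOUND");
        CODES.put(ServiceException.class, "INTERNAL"); // assumed mapping
    }

    // Returns the code name for the first registered type that matches the error.
    static String codeFor(Throwable error) {
        for (Map.Entry<Class<? extends Throwable>, String> entry : CODES.entrySet()) {
            if (entry.getKey().isInstance(error)) {
                return entry.getValue();
            }
        }
        return "UNKNOWN"; // same fallback gRPC applies to unmapped exceptions
    }

    public static void main(String[] args) {
        System.out.println(codeFor(new ResourceNotFoundException("Product ID not found")));
        System.out.println(codeFor(new IllegalStateException("unexpected")));
    }
}
```

Adding a new exception type then means adding one entry to the table rather than another catch clause in every service method.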

The only change needed in the server-side implementation of the Product Service API is to remove the catch block and the exception processing logic as:

On the client side, there is no change; we can still get an instance of the ErrorInfo class as errorInfo = any.unpack(ErrorInfo.class).

Using Spring Interceptor

If you can use grpc-spring-boot-starter, this greatly simplifies everything. All you need to do is create a class, annotate it with @GrpcAdvice, and provide methods to handle the individual exceptions as:

This approach is similar to Spring error handling. You just need to define a method annotated with @GrpcExceptionHandler, for example @GrpcExceptionHandler(ResourceNotFoundException.class), for the specific error condition. That’s it; no other change is needed on the server side.

Summary

Getting error handling right can be very tricky in gRPC. Officially, gRPC relies heavily on status codes and metadata to handle errors. We can use gRPC metadata to pass additional error metadata from the server application to the client application. Google’s google.rpc.Status provides much richer error handling capabilities, but it’s not fully supported in all languages. It’s possible to define a global gRPC interceptor to handle all error conditions centrally. The Spring Boot wrapper library yidongnan/grpc-spring-boot-starter provides a much cleaner approach to handling errors.

Originally published at https://techdozo.dev.
