Connect RPC vs. Google gRPC: Conformance Deep Dive

Josh Humphries and Derek Perez on May 30, 2024 / 8 min read

We’ve open sourced Connect RPC’s protocol conformance suite. Connect is a multi-protocol RPC project that includes support for the gRPC and gRPC-Web protocols. Anyone can now use it to validate the correctness of a gRPC implementation. This article explores how the test suite operates and details our findings for a selection of Connect RPC and Google’s gRPC runtimes.

Key takeaways

  • The Connect Conformance Suite validates interoperability, compatibility, and conformance across the Connect, gRPC, and gRPC-Web protocols. Connect RPC's multi-protocol nature enables us to validate any gRPC-compatible project. In addition to thoroughly testing Connect RPC, we’ve tested several of Google’s gRPC implementations.

  • Our tests identified a handful of conformance issues in Connect RPC (we’ve already fixed them) and at least 22 conformance issues across Google’s generally available/v1.0 gRPC implementations. We’ve filed numerous issues with the gRPC team about our findings and hope to collaborate with the team to resolve them for the ecosystem’s benefit. We’ve also opened issues for everything we couldn’t immediately resolve in the Connect RPC implementations. However, no outstanding conformance violations exist in any of our 1.0 releases.

  • The Connect RPC project takes specifications seriously: they are the building blocks that many organizations rely on to develop their products. Every Connect RPC implementation must pass all conformance tests before it is marked stable and generally available (v1.0).

Introducing the Connect Conformance Suite

The Connect Conformance Suite is an automated series of tests run using a client and server to validate interoperability, compatibility, and conformance across the Connect, gRPC, and gRPC-Web protocols. It’s meant to exercise various client-server interactions to ensure the results follow the protocol’s specification, even when mixing clients and servers written with different programming languages or runtimes.

Tests are divided into two types: client tests and server tests. Clients under test are run against a reference server that implements the Conformance Service and is written with connect-go. Likewise, servers under test are verified by a reference client, also written with connect-go, that calls the Conformance Service.

To verify compatibility with other protocol implementations, the conformance tests also use reference client and server implementations that use the gRPC-Go module and a reference server implementation that uses the gRPC-Web Go server.

We’ve adopted this new conformance suite for all our implementations as part of this effort. We previously relied on a “cross test” system modeled after gRPC’s interop testing. However, it had several drawbacks: it was not data-driven, lacked an authoritative client or server to identify issues, could lead to unacceptably slow CI times, and involved complexities with Docker. The new conformance suite addresses these issues by defining test cases in YAML, using reference clients and servers, reducing the need to test against multiple implementations, and providing a test runner that doesn't use Docker.

Running the tests

Most of the work in getting the conformance suite up and running is writing the client or server to test. This program uses the RPC implementation under test and implements a protocol for communicating with the test runner: it reads messages from STDIN and writes messages to STDOUT.

Once you’ve written a conformance client or server and a configuration file describing the relevant features, it’s time to run the tests. What parameters to pass to the connectconformance test runner depends on whether you are testing a client or a server. The following commands will get you started:

Testing a client

connectconformance --mode client --config <path/to/config/file> \
   -- <path/to/your/executable/client>

Testing a server

connectconformance --mode server --config <path/to/config/file> \
   -- <path/to/your/executable/server>

Anatomy of a test run

The conformance suite is provided as a self-contained test runner named connectconformance. This program includes the data for all of the test cases and the reference client and reference server implementations.

A server under test reads a single message from STDIN that provides the configuration details used to start the server. Once the server is listening on the network and ready to accept requests, it writes a single message to STDOUT that tells RPC clients how to reach it. Similarly, a client under test reads multiple messages from STDIN, each one describing an RPC to issue. After invoking each RPC and recording the results, it writes a description of the result to STDOUT.
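To make this exchange concrete, here is a minimal Python sketch of the loop a client under test might run. It is an illustration under stated assumptions, not the suite's actual wire format: the 4-byte big-endian length prefix, the JSON payloads, and the names read_frames, write_frame, run_client, and invoke_rpc are all hypothetical stand-ins. The onboarding guides describe the real message types and framing.

```python
import io
import json
import struct
from typing import BinaryIO, Callable, Iterator


def read_frames(stream: BinaryIO) -> Iterator[bytes]:
    """Yield length-prefixed payloads until the stream ends.

    Assumption: each message is preceded by a 4-byte big-endian size.
    """
    while True:
        header = stream.read(4)
        if len(header) == 0:
            return  # clean EOF: the runner closed our input
        if len(header) < 4:
            raise EOFError("truncated length prefix")
        (size,) = struct.unpack(">I", header)
        payload = stream.read(size)
        if len(payload) < size:
            raise EOFError("truncated message")
        yield payload


def write_frame(stream: BinaryIO, payload: bytes) -> None:
    """Write one payload with the same (assumed) length prefix."""
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)
    stream.flush()


def run_client(stdin: BinaryIO, stdout: BinaryIO,
               invoke_rpc: Callable[[dict], object]) -> None:
    """Read one RPC description per frame, invoke it, and report the result."""
    for frame in read_frames(stdin):
        request = json.loads(frame)  # hypothetical JSON stand-in for the real message
        try:
            result = {"test": request["test"], "response": invoke_rpc(request)}
        except Exception as exc:  # report failures instead of crashing the process
            result = {"test": request["test"], "error": str(exc)}
        write_frame(stdout, json.dumps(result).encode())
```

In a real program, stdin and stdout would be sys.stdin.buffer and sys.stdout.buffer, and invoke_rpc would call the RPC implementation under test.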

The test runner starts a single client process that will send RPCs and one server process for each distinct configuration it needs to test. It decides if an implementation is conformant by examining the results of the RPCs. When testing a server, the client it starts is a reference client; when testing a client, the client under test is paired with a reference server.

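[Sequence diagram. Participants: test runner, client, server. Steps: the test runner starts the client and server processes; sends config to the server via stdin; the server sends its result via stdout; the test runner sends RPC details to the client via stdin; the client invokes the RPC and sends request(s); the server processes the RPC and sends response(s); the client sends RPC results via stdout; the test runner assesses the RPC results; both processes terminate.]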

The reference client and server behave just like any other client or server under test, except they perform additional checks on each request or response to make sure it is well-formed and compliant. Any issues uncovered by these checks are reported to the test runner.

In addition to writing a client or server process to test, you must also provide a YAML configuration file that describes the features your implementation supports. The test runner uses it to decide which test cases to run against your implementation.
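The idea of feature-driven test selection can be sketched in a few lines of Python. This is only an illustration: the Case fields and the feature keys (protocols, compressions, stream_types) are hypothetical names chosen for the example and are not the suite's actual configuration schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Case:
    """A hypothetical test case and the features it requires."""
    name: str
    protocol: str
    compression: str
    stream_type: str


def select_cases(cases: List[Case], features: Dict[str, Set[str]]) -> List[Case]:
    """Keep only the cases whose every requirement is a declared feature."""
    return [
        c for c in cases
        if c.protocol in features["protocols"]
        and c.compression in features["compressions"]
        and c.stream_type in features["stream_types"]
    ]
```

An implementation that declares only unary support, for example, would never be handed a bidirectional streaming case, so it cannot fail a test for a feature it never claimed to support.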

For more info on how to do this, check out our onboarding guides.

Connect RPC results

We run the conformance tests across all of our existing Connect RPC projects on every commit as part of our CI process. While adding the tests to these projects, it became clear that the expected behavior for many edge cases was not adequately specified, and our implementations behaved inconsistently. We’ve opened issues against the Connect specification (GH#169) to add those details and clarify how clients and servers should act in these conditions. To get consistent behavior in our implementations, even for these unspecified situations, we’ve codified interim expectations into the conformance tests. These expectations are our proposed behavior for the missing parts of the spec until further consensus is established.

For Connect RPC, most of the issues the conformance tests caught were subtle misuses of status codes for certain error conditions. Interestingly, every implementation tested exhibited issues in the following categories:

  • Cardinality violations: These violations occur when a stream has an unexpected number of messages. For example, a unary RPC should have exactly one request message and either one response message or an error status, but the actual wire-level protocol allows for zero or more messages to be sent. The documentation for the gRPC status codes states that these kinds of issues should result in an “unimplemented” error code. Some implementations weren’t checking for or handling these conditions and some were using different error codes.

  • Classifying “trailers-only” responses in the gRPC and gRPC-Web protocols: A “trailers-only” response is a special kind of compact response that a server can use when there is no response data, only a status and metadata. Our implementations were using an incorrect heuristic to classify a response as “trailers-only”, which could lead to incorrect computation of the RPC result in rare corner cases.

  • Malformed “end stream” or “error” messages in the Connect protocol: In the face of certain kinds of malformed response messages, the implementations would return an unexpected error code.
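The cardinality rule from the first category above can be sketched as a small client-side check. This is a hedged illustration in Python rather than code from any Connect implementation; RpcError and unary_response are hypothetical names, and status codes are represented as plain strings for brevity.

```python
from typing import Sequence


class RpcError(Exception):
    """Hypothetical error type carrying an RPC status code."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code


def unary_response(messages: Sequence[bytes], status_code: str) -> bytes:
    """Return the single response of a unary RPC, or raise RpcError.

    A unary RPC must yield exactly one response message or an error status.
    Any other message count is a cardinality violation, reported here as
    "unimplemented", the code the text above cites for such violations.
    """
    if status_code != "ok":
        # An error status takes the place of the response message.
        raise RpcError(status_code, "server returned an error status")
    if len(messages) != 1:
        raise RpcError(
            "unimplemented",
            f"unary RPC expected exactly 1 response message, got {len(messages)}",
        )
    return messages[0]
```

Without a check like this, a client that receives zero or two messages on a unary call may silently return an empty value or drop data instead of surfacing an error.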

Google's gRPC results

We’ve also created conformance clients and servers to test Google’s gRPC implementations using our suite. We’ve identified similar shortcomings in gRPC’s specification, as well as a number of issues in gRPC’s generally-available/v1 implementations. The table below enumerates the issues we identified and reported to the gRPC team (some are links to pre-existing issue reports, where other users had already encountered and reported the issue).

Topic           Issues Found
Specification   GH#24007, GH#36765, GH#36766, GH#36767
Go              GH#6987, GH#7286
Java            GH#11245, GH#11246, GH#11247, GH#11248
C++             GH#36769, GH#36770
Node            GH#882, GH#2764, GH#2765, GH#2766, GH#2767, GH#2768
Web JS          GH#1399, GH#1427, GH#1428, GH#1429

The most common issues uncovered were related to cardinality violations and compression. In fact, every implementation tested had at least one issue in each of these two areas.

To reiterate: cardinality violations are when a stream has an unexpected number of messages. Just like we found in our own Connect implementations, some of Google’s implementations aren’t checking for or handling these conditions and some are using different error codes.

The compression issues uncovered had to do with: (1) what happens when a message is received that uses an unsupported compression encoding and (2) what happens when a message is received that is marked as compressed (via a protocol-level flag) but no compression encoding is actually in use. The docs for compression in gRPC state the expected behavior for these scenarios, but the implementations don’t all conform.
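Both scenarios can be illustrated with a short Python sketch of how a server might decode one gRPC message. The 5-byte prefix (a 1-byte compressed flag followed by a 4-byte big-endian length) comes from the gRPC wire format. The rest is an assumption made for illustration: decode_message, RpcError, and the SUPPORTED registry are hypothetical names, and the status codes reflect our reading of the gRPC compression documentation from the server's point of view, so they should be checked against that document.

```python
import gzip
import struct
from typing import Callable, Dict, Optional


class RpcError(Exception):
    """Hypothetical error type carrying an RPC status code."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code


# Hypothetical registry of the encodings this server supports.
SUPPORTED: Dict[str, Callable[[bytes], bytes]] = {"gzip": gzip.decompress}


def decode_message(frame: bytes, encoding: Optional[str]) -> bytes:
    """Decode one length-prefixed message received by a server.

    `encoding` is the value of the request's grpc-encoding header, if any.
    """
    if len(frame) < 5:
        raise RpcError("internal", "incomplete message prefix")
    flag, length = struct.unpack(">BI", frame[:5])
    body = frame[5:5 + length]
    if len(body) != length:
        raise RpcError("internal", "incomplete message body")
    no_encoding = encoding in (None, "identity")
    # Scenario 1: the peer declared an encoding we do not support.
    if not no_encoding and encoding not in SUPPORTED:
        raise RpcError("unimplemented", f"unsupported encoding {encoding!r}")
    if flag == 0:
        return body  # message was sent uncompressed
    # Scenario 2: compressed flag set, but no encoding is in use.
    if no_encoding:
        raise RpcError("internal", "compressed flag set without an encoding")
    return SUPPORTED[encoding](body)
```

The order of the checks matters: an unsupported encoding is rejected even when a particular message happens to be uncompressed, because the peer has already declared an encoding the receiver cannot handle.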

Conclusion

We believe specifications aren’t suggestions; they’re contracts. For Connect RPC, it’s essential to us that every implementation we release is strictly compatible with the protocols of the existing gRPC ecosystem. We are pleased with the findings that our test suite has produced, and we’re certain that it will be a critical resource for the project as we onboard new Connect RPC implementations in the coming months. This conformance suite is now available to everyone, and we hope you’ll join us in using it to grow the confidence and quality of work we’re producing as a community.
