There is no doubt that the first requirement for a composer is to be dead. — Arthur Honegger
TL;DR: Don’t use required, no matter how tempting. You won’t be able to get rid of it later when you realize it was a bad idea.
If you’re using proto2, you’re probably familiar with the two field modifiers repeated and optional. These indicate that the field can hold “zero or more” of a value, or “zero or one”. But did you know that proto2 has a third modifier? It’s called required, and it implies what you expect: the field must be present.
Unfortunately, its semantics make it a massive footgun, so you should never use it!
A little-known feature of most Protobuf runtimes is a setting like Go’s proto.UnmarshalOptions.AllowPartial. This instructs the decoder to allow “partial messages.” If you haven’t used Protobuf for long, this must seem like a category error: checking invariants on a message is application-specific! However, there is one check that Protobuf knows about: required field checks, also called “is initialized” checks. Protobuf considers any message that has unset required fields to be “uninitialized,” and many operations involving it produce errors.
Unset required fields? As much as that seems like an oxymoron, required fields don’t generate the same API as, say, a proto3 scalar field without a modifier, which has no has() function. They actually have the exact same API as optional fields. This means that when you construct a fresh Protobuf message with required fields, those fields are all unset! So what’s the point of this feature, if it doesn’t seem to enforce this invariant?
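To make the “same API as optional” point concrete, here is a minimal sketch in plain Go. It is a toy model, not the real Protobuf runtime or its generated code: the User type, its HasName and SetName accessors, and the CheckInitialized helper are hypothetical names invented for illustration, and a nil pointer stands in for an unset field.

```go
package main

import (
	"errors"
	"fmt"
)

// User is a hypothetical stand-in for a message declared with
// `required string name = 1;`. A nil pointer models an unset field.
type User struct {
	name *string
}

// HasName reports presence, exactly as it would for an optional field.
func (u *User) HasName() bool { return u.name != nil }

// SetName marks the field as present.
func (u *User) SetName(s string) { u.name = &s }

// CheckInitialized models the "is initialized" check: it fails while
// the required field is unset.
func (u *User) CheckInitialized() error {
	if !u.HasName() {
		return errors.New("required field name is not set")
	}
	return nil
}

func main() {
	u := &User{} // Constructing the message does not set the required field.
	fmt.Println(u.HasName(), u.CheckInitialized() != nil) // prints: false true

	u.SetName("Arthur")
	fmt.Println(u.HasName(), u.CheckInitialized() != nil) // prints: true false
}
```

Nothing stops you from building the “uninitialized” value; the check only fires when something asks for it.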
As mentioned before, many operations that involve missing required fields produce errors. The parser treats required fields the same as optional fields while decoding, but once parsing completes, the runtime walks the message to verify that every required field is set. If any are missing, the parser returns an error and throws away the parsed message. This is the first big footgun associated with required fields: a missing required field in any part of the message causes the whole message to get thrown away, even if the error is in some deeply nested repeated message field. This is why AllowPartial exists, so that you can parse and inspect a message that may be missing fields.
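The all-or-nothing behavior described above can be sketched with another toy model in plain Go. The Order and Item types and the finishParse function are hypothetical and only imitate the post-parse walk; they are not how any real runtime is implemented.

```go
package main

import (
	"errors"
	"fmt"
)

// Item models a message with `required int32 id = 1;`; nil means unset.
type Item struct{ id *int32 }

// Order models a message with `repeated Item items = 1;`.
type Order struct{ items []*Item }

var errUninitialized = errors.New("message is missing required fields")

// initialized walks the whole message looking for unset required fields.
func (o *Order) initialized() bool {
	for _, it := range o.items {
		if it.id == nil {
			return false
		}
	}
	return true
}

// finishParse models the step that runs after decoding: unless
// allowPartial is set, one missing required field anywhere discards
// the entire message.
func finishParse(decoded *Order, allowPartial bool) (*Order, error) {
	if !allowPartial && !decoded.initialized() {
		return nil, errUninitialized
	}
	return decoded, nil
}

func main() {
	id := int32(1)
	// Two of the three nested items are fine; one lacks its required id.
	decoded := &Order{items: []*Item{{id: &id}, {}, {id: &id}}}

	_, err := finishParse(decoded, false)
	fmt.Println("strict: ", err)

	msg, err := finishParse(decoded, true)
	fmt.Println("partial:", len(msg.items), "items, err =", err)
}
```

In the real Go runtime, the corresponding knob is the AllowPartial field on proto.UnmarshalOptions, and proto.CheckInitialized lets you run the check yourself later.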
Curiously, the serialization function also has an AllowPartial setting: if you try to serialize a message with unset required fields, you also get a runtime error. This is the only situation under which most serialization functions return errors.
All of this means that data errors that should be handled at the application layer are instead handled opaquely at the transport layer.
required is forever
All of required’s semantics conspire to make it impossible to safely:
- Change a required field to an optional field.
- Remove a required field.
Because initialized-check failures are so pervasive, trying to remove a required field requires first downgrading it to optional everywhere (but still making sure to set it everywhere), updating all users of the schema, and then finally deleting the field and replacing it with a reserved declaration. This amount of coordination is anathema to Protobuf’s core principle of making gradual rollouts safe.
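To see why the writer has to keep setting the field until every old binary is gone, here is a rough simulation of version skew in plain Go. Everything in it is made up for illustration: the wire type is a toy map from field number to value rather than real Protobuf bytes, and oldReader and newReader stand for binaries built before and after the downgrade to optional.

```go
package main

import (
	"errors"
	"fmt"
)

// wire is a toy encoding: field number -> value.
type wire map[int]string

// oldReader models a binary built while field 1 was still required.
func oldReader(w wire) (string, error) {
	v, ok := w[1]
	if !ok {
		return "", errors.New("missing required field 1")
	}
	return v, nil
}

// newReader models a binary built after field 1 became optional:
// a missing field simply reads as the empty string.
func newReader(w wire) (string, error) {
	return w[1], nil
}

func main() {
	stillSet := wire{1: "id-123"} // writer keeps setting the field
	dropped := wire{}             // writer stopped setting it too early

	_, err := oldReader(stillSet)
	fmt.Println("old reader, field set:    ", err)
	_, err = oldReader(dropped)
	fmt.Println("old reader, field dropped:", err)
	_, err = newReader(dropped)
	fmt.Println("new reader, field dropped:", err)
}
```

As long as a single old reader is still deployed somewhere, the “dropped” message is a hard failure for it, which is why each step of the migration has to reach every user of the schema before the next one starts.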
required, unsurprisingly, was part of the root cause of many outages in Google’s early use of Protobuf.
proto3 got rid of required and more or less replaced it with implicit presence (a non-message field with no modifier). Implicit presence fields are much closer to what most people want out of required: the field is not optional, so whether the field is actually set or not on the wire is irrelevant. Google’s internal version of protoc doesn’t allow for new required fields to be created; it contains a very large list of all of the existing required symbol names, and produces an error if any symbol is added that is not in the list.
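As a rough illustration of the implicit presence behavior mentioned above, the sketch below reuses the toy-map idea in plain Go. The wire type and the encode and decode functions are hypothetical; they only imitate how a field declared as int32 count = 1 with no modifier behaves, assuming the usual rule that zero values are left off the wire.

```go
package main

import "fmt"

// wire is a toy encoding: field number -> value.
type wire map[int]int32

// encode models an implicit-presence field: the zero value is simply
// left off the wire.
func encode(count int32) wire {
	w := wire{}
	if count != 0 {
		w[1] = count
	}
	return w
}

// decode reads field 1. An absent field yields the zero value, and
// there is no has-style accessor to tell the two cases apart.
func decode(w wire) int32 { return w[1] }

func main() {
	for _, c := range []int32{0, 42} {
		w := encode(c)
		fmt.Println("value:", decode(w), "fields on wire:", len(w))
	}
}
```

The reader always gets a value and never an error, so there is no initialized check to trip over.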
You can still create new required fields in proto2, but you will quickly discover that a lot of tooling either behaves unpredictably or flat-out ignores a required modifier, and it creates a potential trap for users of your message. And once you discover the problem, it’s too late: removing a required field that has escaped the scope of your team is impossible. This is hardly an exaggeration: if Google can’t do it with their mass refactoring tools, you’re not going to be able to, either.
Don’t use required. It’s really not worth it.