Throughout the documentation, there are many references to sources, images, and inputs. The various I/O options for Buf can seem daunting and overly complex, so we'll break down how this all fits together.
The default build and lint operations of Buf use your current directory. If all you are doing is linting Protobuf files, the below can be largely ignored, as Buf does what you would expect by default.
First, some basic terminology to help our discussion:
A Source is a set of .proto files that can be compiled.
An Image is a compiled set of .proto files. This is itself a Protobuf message. The exact mechanics of Images are described in the Image documentation, which we encourage you to read.
Images are created from Sources using buf image build or
An Input is either a Source or an Image.
All Inputs have a Format, which describes the type of the Input. The Format is usually derived automatically, but it can also be set explicitly.
At first glance, this may seem extremely complex for a Protobuf tool.
For most current use cases, that is accurate. Generally, your only goal is to work with
files on disk. This is how Buf works by default. However, there are cases where one wants to work
with more than just local files, some of which apply to Buf's current feature set, and some of which
are for the future.
Breaking change detection
The biggest current use case is breaking change detection. When you compare your current Protobuf schema to an old version of your schema, you have to decide where the old version is stored. Buf provides multiple options for this, including the ability to directly compile and compare against a git branch or git tag. However, it is generally preferable to store a representation of your old version in a file. Buf does this via Images, which let you store your golden state and then compare your current Protobuf schema against it. This includes support for partial comparisons, as well as storing the golden state in a remote location.
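As a rough sketch of the golden-state workflow described above, the two steps can be wrapped in shell functions. The file name golden.bin and the function names save_golden and check_against_golden are hypothetical; only the buf image build -o and buf check breaking --against-input invocations are taken from this documentation.

```shell
#!/usr/bin/env bash

# Hypothetical location of the stored golden state.
GOLDEN_IMAGE="golden.bin"

# Compile the Source in the current directory and store it as an Image.
save_golden() {
  buf image build -o "${GOLDEN_IMAGE}"
}

# Compare the current directory (the default Input) against the stored Image.
check_against_golden() {
  buf check breaking --against-input "${GOLDEN_IMAGE}"
}
```

One possible usage is to run save_golden once when a schema version is released, commit or upload the resulting file, and run check_against_golden on later changes.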
Using protoc instead of the internal compiler
Existing lint and breaking change detection tools produce an internal representation of your Protobuf schema in one of two ways:
- By using a third-party Protobuf parser, which is usually error-prone and almost never covers every edge case of the Protobuf grammar.
- By shelling out to protoc itself and parsing the result, which not only requires specific management of protoc in relation to the lint/breaking change detection tool, but can be cumbersome and error-prone itself, especially if the tool parses error output from
Buf tackles this issue by using FileDescriptorSets (which we extend into Images) internally for all operations, and allowing these FileDescriptorSets to be produced in one of two ways:
- By using a newly-developed Golang Protobuf compiler that is continuously tested against thousands of known Protobuf definitions, including all known edge cases of the Protobuf grammar.
- By allowing users to provide buf input, thereby bypassing any compiling or parsing on the part of buf entirely, and instead using protoc, the gold standard of Protobuf compilation.
In short, we don't expect you to natively trust that the internal compiler is actually equivalent to protoc - we would want to verify this claim ourselves. There are also cases (such as Bazel setups) where you may already have infrastructure around calling protoc, and may want to just use protoc as input to
Buf Schema Registry
Buf's primary functionality right now is linting and breaking change detection. These features, and the corresponding CLI tool buf, will always remain free and open source.
However, our goal is to develop this into a new way of working with Protocol Buffers. One of the first products we'll be releasing is the Buf Schema Registry, which will handle stub generation and consumption. See the future for more details.
The core primitive for Buf is the Image, which will be used for the Buf Schema Registry.
Specifying an Input
Inputs are specified using associated flags on the command line.
- For buf image build, the Source is specified with --source. The location to write the output Image is specified with --output, which is a required flag.
- For buf check lint, the Input (i.e. the Source or Image) to lint is specified with
- For buf check breaking, the Input is specified with --input, and the Input to compare against is specified with
- For buf ls-files, the Input to list is specified with
Inputs are specified as a string, which has the following structure:
The path specifies the path to the Input. The options specify options to interpret the Input at the path.
The format option can be used on any Input string to override the derived Format.
Examples (the mechanics of which are described below):
- path/to/file.data#format=bin explicitly sets the Format to bin, as by default this path would be interpreted as Format
- https://github.com/googleapis/googleapis#format=git,branch=master explicitly sets the Format to git. In this case however, note that https://github.com/googleapis/googleapis.git#branch=master has the same effect, as the latter is also a valid path (see below for derived Formats).
- -#format=json explicitly sets the Format to json, i.e. read from stdin as JSON, or in the case of buf image build --output, write to stdout as JSON.
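To make the path-plus-options shape of an Input string concrete, here is a minimal bash sketch that splits such a string at the first # and then splits the options on commas. The parse_input helper is hypothetical and is not part of buf; it only illustrates the structure shown in the examples above.

```shell
#!/usr/bin/env bash

# Hypothetical helper: split "path#key=value,key=value" into its parts.
parse_input() {
  local input="$1" path options="" pair
  local -a pairs=()
  path="${input%%#*}"               # everything before the first '#'
  if [[ "${input}" == *"#"* ]]; then
    options="${input#*#}"           # everything after the first '#'
  fi
  echo "path=${path}"
  if [[ -n "${options}" ]]; then
    # Options are comma-separated key=value pairs.
    IFS=',' read -ra pairs <<< "${options}"
    for pair in "${pairs[@]}"; do
      echo "option ${pair%%=*}=${pair#*=}"
    done
  fi
}

parse_input "https://github.com/googleapis/googleapis#format=git,branch=master"
# path=https://github.com/googleapis/googleapis
# option format=git
# option branch=master
```

An Input string without a # yields only the path line, which matches the case where no options are given.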
As of now, there are four other options, all of which are Format-specific:
- The branch option specifies the branch to clone for
- The tag option specifies the tag to clone for
- The recurse_submodules option says to clone submodules recursively for
- The strip_components option specifies the number of directories to strip for
For git Inputs, one of
tag is required.
All Sources contain a set of
.proto files that can be compiled.
A local directory. The path can be either relative or absolute.
This is the default Format. By default, Buf uses the current directory as the Input for all commands.
- path/to/dir says to compile the files in this relative directory path.
- /absolute/path/to/dir says to compile the files in this absolute directory path.
A tarball. The path to this tarball can be either a local file, a remote http/https location, or - for stdin.
The strip_components option is optional.
- foo.tar says to read the tarball at this relative path.
- foo.tar#strip_components=2 says to read the tarball at this relative path and strip the first two directories.
- -#format=tar says to read a tarball from stdin.
A gzipped tarball. The semantics are the same as the
- foo.tar.gz says to read the gzipped tarball at this relative path.
- foo.tgz says to read the gzipped tarball at this relative path.
- https://github.com/googleapis/googleapis/archive/master.tar.gz#strip_components=1 says to read the gzipped tarball at this http location, and strip one directory.
- -#format=targz says to read a gzipped tarball from stdin.
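The effect of strip_components on the file paths inside a tarball can be illustrated with a small bash sketch. It assumes the option behaves like removing the leading N directory names from each path, which is what "strip the first two directories" suggests; the strip_components function and the example path are hypothetical and only model the path rewriting, not how Buf implements it.

```shell
#!/usr/bin/env bash

# Hypothetical model: drop the first n '/'-separated components of a path.
strip_components() {
  local n="$1" path="$2"
  # cut fields are 1-indexed, so dropping n leading components means
  # keeping everything from field n+1 onward.
  echo "${path}" | cut -d/ -f"$((n + 1))"-
}

strip_components 1 "googleapis-master/google/api/http.proto"
# google/api/http.proto
```

This is why an archive that wraps everything in a single top-level directory is typically read with strip_components=1.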
A git repository. The path to the git repository can be either a local .git directory, or a remote git http, https, or ssh location.
Exactly one of the
tag options is required. If branch is specified, Buf will clone the single commit at the head of this branch and use it as the Source. If tag is specified, Buf will clone this tag.
The recurse_submodules option allows for cloning of submodules recursively.
- .git#branch=master says to clone the master branch of the git repository at the relative path .git. This is particularly useful for local breaking change detection.
- .git#tag=v1.0.0 says to clone the v1.0.0 tag of the git repository at the relative path
- .git#branch=master,recurse_submodules=true says to clone the master branch along with all recursive submodules.
- https://github.com/googleapis/googleapis.git#branch=master says to clone the master branch of the git repository at the remote location.
- https://github.com/googleapis/googleapis.git#tag=v1.0.0 says to clone the v1.0.0 tag of the git repository at the remote location.
- git://github.com/googleapis/googleapis.git#branch=master is also valid.
- ssh://email@example.com/org/private-repo.git#branch=master is also valid.
- https://github.com/googleapis/googleapis#format=git,branch=master is also valid.
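The rule that a git Input takes either a branch or a tag, but not both and not neither, can be checked before calling Buf. The following bash sketch takes the options part of an Input string (the text after the #) and reports whether the rule holds. The validate_git_options function is a hypothetical helper, not something buf provides, and its error message is made up for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical pre-flight check for the options of a git Input.
validate_git_options() {
  local options="$1" count=0
  # Surround with commas so that each key is matched only at the start
  # of an option, never in the middle of another key or value.
  if [[ ",${options}" == *",branch="* ]]; then
    count=$((count + 1))
  fi
  if [[ ",${options}" == *",tag="* ]]; then
    count=$((count + 1))
  fi
  if [[ "${count}" -ne 1 ]]; then
    echo "git Inputs need exactly one of branch or tag" >&2
    return 1
  fi
  return 0
}
```

For example, validate_git_options "branch=master,recurse_submodules=true" succeeds, while passing both branch and tag, or only recurse_submodules=true, returns a non-zero status.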
All Images are files. Files can be read from a local path, a remote http/https location, or - for stdin.
Images are created using buf image build. Examples:
buf image build -o image.bin
buf image build -o image.bin.gz
buf image build -o image.json
buf image build -o image.json.gz
buf image build -o -
buf image build -o -#format=json
-o is an alias for
Images can also be created in the bin Format using protoc. See the compiler documentation for more details.
For example, the following is a valid way to compile all Protobuf files in your current directory, produce a FileDescriptorSet (which is also an Image, as described in the Image documentation) to stdout, and read this Image as binary from stdin:
protoc -I . $(find . -name '*.proto') -o /dev/stdout | buf check lint --input -
A binary Image.
- image.bin says to read the file at this relative path.
- - says to read a binary Image from stdin.
A gzipped binary Image.
- image.bin.gz says to read the file at this relative path.
- -#format=bingz says to read the file from stdin.
A JSON Image. This creates Images that take much more space, and are slower to parse, but will result in diffs that show the actual differences between two Images in a readable format.
When combined with jq, this also allows for introspection. For example, to see a list of all packages:
$ buf image build -o -#format=json | jq '.file[] | .package' | sort | uniq | head
"google.actions.type"
"google.ads.admob.v1"
"google.ads.googleads.v1.common"
"google.ads.googleads.v1.enums"
"google.ads.googleads.v1.errors"
"google.ads.googleads.v1.resources"
"google.ads.googleads.v1.services"
"google.ads.googleads.v2.common"
"google.ads.googleads.v2.enums"
"google.ads.googleads.v2.errors"
A gzipped JSON Image.
- image.json.gz says to read the file at this relative path.
- -#format=jsongz says to read the file from stdin.
Automatically derived Formats
By default, Buf will derive the Format of an Input from the path via the file extension.
There are also two special cases:
- If the path is -, this is interpreted to mean stdin. By default, this is interpreted as the
Of note, the special value - can also be used as a value to the
buf image build, which is interpreted to mean stdout, and also interpreted by default as the
- If the path is /dev/null on Linux or Mac, or nul in the future with Windows, this is interpreted as the
If no format can be automatically derived, the dir format is assumed, i.e. Buf assumes the path is a path to a local directory.
The format of an Input can be explicitly set as described above.
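The derivation rules can be summarized as a small bash sketch. It is only an approximation under stated assumptions: the extension-to-Format mapping is inferred from the examples in this document (foo.tar, foo.tgz, image.bin.gz and so on), treating - as bin follows the binary Image example, and the /dev/null case is left out. The resolve_format function is hypothetical and does not reflect how Buf is implemented.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the Format Buf would be expected to use
# for an Input string, based on the rules described in this document.
resolve_format() {
  local input="$1" path options=""
  local re=',format=([^,]*),'
  path="${input%%#*}"
  if [[ "${input}" == *"#"* ]]; then
    options="${input#*#}"
  fi
  # An explicit format option overrides anything derived from the path.
  if [[ ",${options}," =~ ${re} ]]; then
    echo "${BASH_REMATCH[1]}"
    return 0
  fi
  case "${path}" in
    -)              echo "bin" ;;     # stdin (assumed default, see lead-in)
    *.tar.gz|*.tgz) echo "targz" ;;
    *.tar)          echo "tar" ;;
    *.git)          echo "git" ;;
    *.bin.gz)       echo "bingz" ;;
    *.bin)          echo "bin" ;;
    *.json.gz)      echo "jsongz" ;;
    *.json)         echo "json" ;;
    *)              echo "dir" ;;     # nothing derivable: local directory
  esac
}

resolve_format "path/to/file.data#format=bin"
# bin
```

The compound extensions are listed before the simple ones so that image.json.gz is not mistaken for a plain gzip of an unknown type, and the final fallback mirrors the dir default.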
Tarballs, git repositories, and image files can be read from remote locations. For remote locations that need authentication, two mechanisms exist.
Basic authentication can be specified for remote tarballs, git repositories, and image files over https with the following environment variables:
- BUF_INPUT_HTTPS_USERNAME - The username. For GitHub, this is your GitHub user.
- BUF_INPUT_HTTPS_PASSWORD - The password. For GitHub, this is a personal access token for your GitHub user.
Assuming these environment variables are set up, you can call Buf as you normally would:
$ buf check lint --input https://github.com/org/private-repo.git#branch=master
$ buf check lint --input https://github.com/org/private-repo.git#tag=v1.0.0
$ buf check lint --input https://github.com/org/private-repo/archive/master.tar.gz#strip_components=1
$ buf check breaking --against-input https://github.com/org/private-repo.git#branch=master
$ buf check breaking --against-input https://github.com/org/private-repo.git#tag=v1.0.0
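In a CI script it can help to fail early when the two basic authentication variables are not set, rather than waiting for a remote fetch to be rejected. The following bash sketch does that. The require_https_auth function and its messages are hypothetical; only the two variable names come from this documentation.

```shell
#!/usr/bin/env bash

# Hypothetical pre-flight check: succeed only if both variables are non-empty.
require_https_auth() {
  local name missing=0
  for name in BUF_INPUT_HTTPS_USERNAME BUF_INPUT_HTTPS_PASSWORD; do
    # ${!name:-} reads the variable whose name is stored in "name".
    if [[ -z "${!name:-}" ]]; then
      echo "missing environment variable: ${name}" >&2
      missing=1
    fi
  done
  return "${missing}"
}
```

A script could then run require_https_auth before any buf command that reads from a private https location, and stop if it returns a non-zero status.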
Public key authentication can be used for remote git repositories over ssh. By default, Buf will look for:
- A private key file with no passphrase at
- A known hosts file at either
The following environment variables can be used to customize these locations:
- BUF_INPUT_SSH_KEY_FILE - The path to the private key file.
- BUF_INPUT_SSH_KEY_PASSPHRASE - The passphrase for the private key.
- BUF_INPUT_SSH_KNOWN_HOSTS_FILES - A colon-separated list of known hosts file paths.
Assuming these default files exist or these environment variables are set up, you can call Buf as you normally would:
$ buf check lint --input ssh://firstname.lastname@example.org/org/private-repo.git#branch=master
$ buf check lint --input ssh://email@example.com/org/private-repo.git#tag=v1.0.0
$ buf check breaking --against-input ssh://firstname.lastname@example.org/org/private-repo.git#branch=master
$ buf check breaking --against-input ssh://email@example.com/org/private-repo.git#tag=v1.0.0
Note that CI services such as CircleCI have a private key and known hosts file pre-installed, so this should work out of the box.
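When the known hosts locations are customized, a quick check that every entry of the colon-separated list actually exists can save a debugging round trip. This bash sketch splits BUF_INPUT_SSH_KNOWN_HOSTS_FILES on colons and reports each path. The check_known_hosts_files function and its output format are hypothetical; only the variable name and its colon-separated shape come from this documentation.

```shell
#!/usr/bin/env bash

# Hypothetical helper: report which configured known hosts files exist.
check_known_hosts_files() {
  local file status=0
  local -a files=()
  # Split the colon-separated list into an array.
  IFS=':' read -ra files <<< "${BUF_INPUT_SSH_KNOWN_HOSTS_FILES:-}"
  for file in "${files[@]}"; do
    if [[ -f "${file}" ]]; then
      echo "found: ${file}"
    else
      echo "missing: ${file}"
      status=1
    fi
  done
  return "${status}"
}
```

The function returns a non-zero status if any listed file is missing, so it can gate an ssh-based buf invocation in a script.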
By default, Buf will look for a configuration file for an Input in the following manner:
- For dir, bin, bingz, json, jsongz Inputs, Buf will look at your current directory for a
- For tar, targz Inputs, Buf will look at the root of the tarball for a
- For git Inputs, Buf will look at the root of the cloned repository at the head of the cloned branch.
The configuration can be overridden by command line flags. See the configuration documentation for more details.