Deploy with Docker#

Docker is the fastest way to deploy Bufstream, whether you need an in-memory broker for testing and development or a persistent environment using an existing storage bucket and Postgres database.

For production Kubernetes deployments with Helm, we provide deployment guides for AWS, Google Cloud, and Azure. For a full-stack local environment, we also provide a Docker Compose example.

In-memory deployment#

You can run a Bufstream broker suitable for local testing and development with one line:

$ docker run --network host bufbuild/bufstream serve --inmemory

This creates an ephemeral broker listening for Kafka connections on port 9092 and admin API requests on port 9089. Once it's stopped, all data is lost.
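
To confirm the broker is up, you can list its metadata with any Kafka client. Here's a quick check using kcat (an assumption on our part that it's installed; any Kafka CLI works):

# List brokers and topics on the local Bufstream instance
$ kcat -b localhost:9092 -L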

Deploying with existing resources#

For a long-lived broker, Bufstream requires the following:

  1. Object storage such as AWS S3, Google Cloud Storage, or Azure Blob Storage.
  2. A metadata storage service such as PostgreSQL.

Follow the instructions below to run Bufstream with an existing storage bucket and Postgres database, choosing the configuration that matches your object storage provider: AWS S3, Google Cloud Storage, Azure Blob Storage, or a local MinIO container.

For AWS S3, create a bufstream.yaml file providing bucket configuration and Postgres connection information:

bufstream.yaml
storage:
  provider: S3
  region: <region>
  bucket: <bucket-name>
  access_key_id:
    string: <S3 access key id>
  secret_access_key:
    string: <S3 secret access key>
postgres:
  dsn:
    string: postgresql://<postgres-user>:<postgres-password>@<postgres-host>:<postgres-port>/<database-name>
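
Before starting the broker, you can optionally confirm that these credentials can reach the bucket. A minimal check, assuming the AWS CLI is installed and configured with the same access key:

# List the bucket to verify the credentials and region are correct
$ aws s3 ls s3://<bucket-name> --region <region>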

For Google Cloud Storage, start by creating a service account that has the appropriate bucket permissions. Once it's created, add a key; creating the key downloads a JSON key file.
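
If you prefer the command line, the following is a rough sketch of those steps using the gcloud CLI. The service account name bufstream is a placeholder, and roles/storage.objectAdmin is one predefined role that covers the object permissions listed later on this page; substitute your own project ID and naming conventions:

# Create a service account for Bufstream
$ gcloud iam service-accounts create bufstream

# Grant it object access on the bucket
$ gcloud storage buckets add-iam-policy-binding gs://<bucket-name> \
    --member="serviceAccount:bufstream@<project-id>.iam.gserviceaccount.com" \
    --role="roles/storage.objectAdmin"

# Create and download a JSON key for the service account
$ gcloud iam service-accounts keys create service-account-key.json \
    --iam-account="bufstream@<project-id>.iam.gserviceaccount.com"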

Next, create a bufstream.yaml file providing bucket configuration and Postgres connection information:

bufstream.yaml
storage:
  provider: GCS
  bucket: <bucket-name>
postgres:
  dsn:
    string: postgresql://<postgres-user>:<postgres-password>@<postgres-host>:<postgres-port>/<database-name>

For Azure Blob Storage, create a bufstream.yaml file providing blob container configuration and Postgres connection information:

bufstream.yaml
storage:
  provider: AZURE
  bucket: <blob-container-name>
  endpoint: https://<storage-account-name>.blob.core.windows.net
  access_key_id:
    string: <storage-account-name>
  secret_access_key:
    string: <access-key>
postgres:
  dsn:
    string: postgresql://<postgres-user>:<postgres-password>@<postgres-host>:<postgres-port>/<database-name>
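
If you need to look up the storage account access key referenced above, the Azure CLI can print it. A sketch, assuming the Azure CLI is installed and <resource-group> is the resource group containing the storage account:

# Print the first access key for the storage account
$ az storage account keys list \
    --resource-group <resource-group> \
    --account-name <storage-account-name> \
    --query "[0].value" \
    --output tsv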

For a local MinIO container, create a bufstream.yaml file providing bucket configuration and Postgres connection information:

bufstream.yaml
storage:
  provider: S3
  region: <region>
  bucket: <bucket-name>
  endpoint: http://localhost:<minio-port>
  force_path_style: true
  access_key_id:
    string: <minio-username>
  secret_access_key:
    string: <minio-password>
postgres:
  dsn:
    string: postgresql://<postgres-user>:<postgres-password>@<postgres-host>:<postgres-port>/<database-name>
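
If you don't already have MinIO running, the following is a minimal sketch of starting it with Docker. The credentials and data path are placeholders, and you still need to create <bucket-name> afterward, for example through the MinIO console on port 9001 or with the mc CLI:

# Start a local MinIO server (S3 API on port 9000, console on port 9001)
$ docker run -d \
    --name minio \
    --network host \
    -e MINIO_ROOT_USER=<minio-username> \
    -e MINIO_ROOT_PASSWORD=<minio-password> \
    minio/minio server /data --console-address ":9001"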

It's never a good idea to commit credentials, so be sure to follow your organization's policies before adding configuration files like bufstream.yaml to version control.

Now that you have a configuration file, use Docker to start Bufstream. The commands below use -v to mount the bufstream.yaml file and the --config flag to point bufstream serve at it. For AWS S3, Azure Blob Storage, and MinIO, mounting the configuration file is all that's needed:

$ docker run \
    -v $PWD/bufstream.yaml:/bufstream.yaml \
    --network host \
    bufbuild/bufstream serve \
    --config /bufstream.yaml

If you're using Google Cloud Storage, also mount the service account key file and set GOOGLE_APPLICATION_CREDENTIALS, replacing <service-account-key-file> with the path to your service account key file:

$ docker run \
    -v $PWD/bufstream.yaml:/bufstream.yaml \
    -v $PWD/<service-account-key-file>:/service-account-key.json \
    -e GOOGLE_APPLICATION_CREDENTIALS=/service-account-key.json \
    --network host \
    bufbuild/bufstream serve \
    --config /bufstream.yaml

This creates a broker listening for Kafka connections on port 9092 and admin API requests on port 9089. It's safe to stop this instance: all of its topic data is stored in the bucket you configured, and its metadata state is stored in Postgres.
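
As a quick end-to-end check, you can produce and consume a message against the running broker. A sketch using kcat (assuming it's installed and that automatic topic creation is enabled; otherwise create the topic with your usual Kafka tooling first):

# Produce a single message to a test topic
$ echo "hello bufstream" | kcat -b localhost:9092 -t bufstream-test -P

# Consume one message from the same topic and exit
$ kcat -b localhost:9092 -t bufstream-test -C -c 1 -e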

Bucket permissions#

Bufstream needs the following permissions to work with objects in its storage bucket.

Bufstream uses an S3 bucket for object storage, and needs to perform the following operations:

  • s3:GetObject: Read existing objects
  • s3:PutObject: Create new objects
  • s3:DeleteObject: Remove old objects according to retention and compaction rules
  • s3:ListBucket: List objects in the bucket
  • s3:AbortMultipartUpload: Abort multi-part uploads that can't complete

For more information about S3 bucket permissions and actions, consult the AWS S3 documentation.
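
As an illustration, a minimal IAM policy granting these actions on a single bucket might look like the following. It's a sketch to adapt to your own IAM setup, and the filename is arbitrary:

bufstream-s3-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload"
      ],
      "Resource": "arn:aws:s3:::<bucket-name>/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::<bucket-name>"
    }
  ]
}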

Bufstream uses a Google Cloud Storage bucket for object storage, and needs to perform the following operations:

  • storage.objects.create: Create new storage objects
  • storage.objects.get: Retrieve existing storage objects
  • storage.objects.delete: Remove old storage objects according to retention and compaction rules
  • storage.objects.list: View all storage objects to enforce retention and compaction rules
  • storage.multipartUploads.*: Allow multi-part uploads

Bufstream uses Azure Blob Storage for object storage and, at minimum, requires the Storage Blob Data Contributor RBAC role on the storage account's blob container.
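
For example, you could assign that role with the Azure CLI. A sketch, assuming <principal-id> is the object ID of the identity Bufstream uses and the other placeholders match your subscription and resource group:

# Grant Storage Blob Data Contributor on the blob container
$ az role assignment create \
    --assignee <principal-id> \
    --role "Storage Blob Data Contributor" \
    --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account-name>/blobServices/default/containers/<blob-container-name>"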

If you're using a local MinIO container and connecting as an administrator, you don't need to configure these. Otherwise, make sure you've granted the same S3 permissions listed above.

Postgres role#

Bufstream needs full access to the database in Postgres so that it can manage its metadata schema.
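
For example, you might create a dedicated database and role for Bufstream and grant it full access. A sketch, assuming you're connecting with psql as a superuser; the final grant on the public schema is only needed on PostgreSQL 15 and later:

# Create the database and a dedicated role
$ psql -h <postgres-host> -U postgres -c "CREATE DATABASE <database-name>;"
$ psql -h <postgres-host> -U postgres -c "CREATE ROLE <postgres-user> WITH LOGIN PASSWORD '<postgres-password>';"

# Grant the role full access to the database and its default schema
$ psql -h <postgres-host> -U postgres -c "GRANT ALL PRIVILEGES ON DATABASE <database-name> TO <postgres-user>;"
$ psql -h <postgres-host> -U postgres -d <database-name> -c "GRANT ALL ON SCHEMA public TO <postgres-user>;"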

Network ports#

If you're not running Bufstream locally, the following ports need to be open to allow Kafka clients and admin API requests to connect:

  • Kafka traffic: Defaults to 9092. Change this by setting kafka.address.port in bufstream.yaml.
  • Admin API traffic: Defaults to 9089. Change this by setting kafka.admin_address.port in bufstream.yaml.
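
As a sketch based on those two keys (check the configuration reference for the full address structure), overriding both ports in bufstream.yaml would look something like:

bufstream.yaml
kafka:
  address:
    port: 9093
  admin_address:
    port: 9090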

Other considerations#

For additional configuration topics like instance types and sizes, metadata storage configuration, and autoscaling, see Cluster recommendations.

When running in Kubernetes, Bufstream supports workload identity federation within AWS, GCP, or Azure, as well as GCP Cloud SQL IAM users. Refer to the cloud provider deployment guides for more information.

Deploying with Docker Compose#

We also provide a full-stack Docker Compose example that sets up MinIO, PostgreSQL, and Bufstream for you.