Deploy with Docker#

Docker is the fastest way to deploy Bufstream, whether you need an in-memory broker for testing and development or a persistent environment using an existing storage bucket and Postgres database.

For production Kubernetes deployments with Helm, we provide deployment guides for AWS, Google Cloud, and Azure. For a full-stack local environment, we also provide a Docker Compose example.

In-memory deployment#

You can run a Bufstream broker suitable for local testing and development with one line:

$ docker run --network host bufbuild/bufstream serve --inmemory

This creates an ephemeral broker listening for Kafka connections on port 9092 and admin API requests on port 9089. Once it's stopped, all data is lost.
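
To confirm the broker is up, you can list its metadata with any Kafka client. Here's a quick check using kcat (an assumption on our part that it's installed; any Kafka CLI works):

# List brokers and topics on the local Bufstream instance
$ kcat -b localhost:9092 -L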

Deploying with existing resources#

For a long-lived broker, Bufstream requires the following:

  1. Object storage such as AWS S3, Google Cloud Storage, or Azure Blob Storage.
  2. A metadata storage service such as PostgreSQL.

Follow the instructions below to run Bufstream with an existing storage bucket and Postgres database, choosing the configuration that matches your object storage provider: AWS S3, Google Cloud Storage, Azure Blob Storage, or a local MinIO container.

For AWS S3, create a bufstream.yaml file providing bucket configuration and Postgres connection information:

bufstream.yaml
storage:
  provider: S3
  region: <region>
  bucket: <bucket-name>
  access_key_id:
    string: <S3 access key id>
  secret_access_key:
    string: <S3 secret access key>
postgres:
  dsn:
    string: postgresql://<postgres-user>:<postgres-password>@<postgres-host>:<postgres-port>/<database-name>
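
Before starting the broker, you can optionally confirm that these credentials can reach the bucket. A minimal check, assuming the AWS CLI is installed and configured with the same access key:

# List the bucket to verify the credentials and region are correct
$ aws s3 ls s3://<bucket-name> --region <region>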

For Google Cloud Storage, start by creating a service account that has the appropriate bucket permissions. Once it's created, add a key; creating the key downloads a JSON key file.
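
If you prefer the command line, the following is a rough sketch of those steps using the gcloud CLI. The service account name bufstream is a placeholder, and roles/storage.objectAdmin is one predefined role that covers the object permissions listed later on this page; substitute your own project ID and naming conventions:

# Create a service account for Bufstream
$ gcloud iam service-accounts create bufstream

# Grant it object access on the bucket
$ gcloud storage buckets add-iam-policy-binding gs://<bucket-name> \
    --member="serviceAccount:bufstream@<project-id>.iam.gserviceaccount.com" \
    --role="roles/storage.objectAdmin"

# Create and download a JSON key for the service account
$ gcloud iam service-accounts keys create service-account-key.json \
    --iam-account="bufstream@<project-id>.iam.gserviceaccount.com"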

Next, create a bufstream.yaml file providing bucket configuration and Postgres connection information:

bufstream.yaml
storage:
  provider: GCS
  bucket: <bucket-name>
postgres:
  dsn:
    string: postgresql://<postgres-user>:<postgres-password>@<postgres-host>:<postgres-port>/<database-name>

For Azure Blob Storage, create a bufstream.yaml file providing blob container configuration and Postgres connection information:

bufstream.yaml
storage:
  provider: AZURE
  bucket: <blob-container-name>
  endpoint: https://<storage-account-name>.blob.core.windows.net
  access_key_id:
    string: <storage-account-name>
  secret_access_key:
    string: <access-key>
postgres:
  dsn:
    string: postgresql://<postgres-user>:<postgres-password>@<postgres-host>:<postgres-port>/<database-name>
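
If you need to look up the storage account access key referenced above, the Azure CLI can print it. A sketch, assuming the Azure CLI is installed and <resource-group> is the resource group containing the storage account:

# Print the first access key for the storage account
$ az storage account keys list \
    --resource-group <resource-group> \
    --account-name <storage-account-name> \
    --query "[0].value" \
    --output tsv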

For a local MinIO container, create a bufstream.yaml file providing bucket configuration and Postgres connection information:

bufstream.yaml
storage:
  provider: S3
  region: <region>
  bucket: <bucket-name>
  endpoint: http://localhost:<minio-port>
  force_path_style: true
  access_key_id:
    string: <minio-username>
  secret_access_key:
    string: <minio-password>
postgres:
  dsn:
    string: postgresql://<postgres-user>:<postgres-password>@<postgres-host>:<postgres-port>/<database-name>
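
If you don't already have MinIO running, the following is a minimal sketch of starting it with Docker. The credentials and data path are placeholders, and you still need to create <bucket-name> afterward, for example through the MinIO console on port 9001 or with the mc CLI:

# Start a local MinIO server (S3 API on port 9000, console on port 9001)
$ docker run -d \
    --name minio \
    --network host \
    -e MINIO_ROOT_USER=<minio-username> \
    -e MINIO_ROOT_PASSWORD=<minio-password> \
    minio/minio server /data --console-address ":9001"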

It's never a good idea to commit credentials, so be sure to follow your organization's policies before adding configuration files like bufstream.yaml to version control.

Now that you have a configuration file, use Docker to start Bufstream. The commands below use -v to mount the bufstream.yaml file and the --config flag to point bufstream serve at it. For AWS S3, Azure Blob Storage, and MinIO, mounting the configuration file is all that's needed:

$ docker run \
    -v $PWD/bufstream.yaml:/bufstream.yaml \
    --network host \
    bufbuild/bufstream serve \
    --config /bufstream.yaml

If you're using Google Cloud Storage, also mount the service account key file and set GOOGLE_APPLICATION_CREDENTIALS, replacing <service-account-key-file> with the path to your service account key file:

$ docker run \
    -v $PWD/bufstream.yaml:/bufstream.yaml \
    -v $PWD/<service-account-key-file>:/service-account-key.json \
    -e GOOGLE_APPLICATION_CREDENTIALS=/service-account-key.json \
    --network host \
    bufbuild/bufstream serve \
    --config /bufstream.yaml

This creates a broker listening for Kafka connections on port 9092 and admin API requests on port 9089. It's safe to stop this instance: all of its topic data is stored in the bucket you configured, and its metadata state is stored in Postgres.
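
As a quick end-to-end check, you can produce and consume a message against the running broker. A sketch using kcat (assuming it's installed and that automatic topic creation is enabled; otherwise create the topic with your usual Kafka tooling first):

# Produce a single message to a test topic
$ echo "hello bufstream" | kcat -b localhost:9092 -t bufstream-test -P

# Consume one message from the same topic and exit
$ kcat -b localhost:9092 -t bufstream-test -C -c 1 -e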

Bucket permissions#

Bufstream needs the following permissions to work with objects in its storage bucket.

Bufstream uses an S3 bucket for object storage, and needs to perform the following operations:

  • s3:GetObject: Read existing objects
  • s3:PutObject: Create new objects
  • s3:DeleteObject: Remove old objects according to retention and compaction rules
  • s3:ListBucket: List objects in the bucket
  • s3:AbortMultipartUpload: Abort multi-part uploads that can't complete

For more information about S3 bucket permissions and actions, consult the AWS S3 documentation.
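
As an illustration, a minimal IAM policy granting these actions on a single bucket might look like the following. It's a sketch to adapt to your own IAM setup, and the filename is arbitrary:

bufstream-s3-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload"
      ],
      "Resource": "arn:aws:s3:::<bucket-name>/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::<bucket-name>"
    }
  ]
}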

Bufstream uses a Google Cloud Storage bucket for object storage, and needs to perform the following operations:

  • storage.objects.create: Create new storage objects
  • storage.objects.get: Retrieve existing storage objects
  • storage.objects.delete: Remove old storage objects according to retention and compaction rules
  • storage.objects.list: View all storage objects to enforce retention and compaction rules
  • storage.multipartUploads.*: Allow multi-part uploads

Bufstream uses Azure Blob Storage for object storage and, at minimum, requires the Storage Blob Data Contributor RBAC role on the storage account's blob container.
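
For example, you could assign that role with the Azure CLI. A sketch, assuming <principal-id> is the object ID of the identity Bufstream uses and the other placeholders match your subscription and resource group:

# Grant Storage Blob Data Contributor on the blob container
$ az role assignment create \
    --assignee <principal-id> \
    --role "Storage Blob Data Contributor" \
    --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account-name>/blobServices/default/containers/<blob-container-name>"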

If you're using a local MinIO container and connecting as an administrator, you don't need to configure these. Otherwise, make sure you've granted the same S3 permissions listed above.

Postgres role#

Bufstream needs full access to the database in Postgres so that it can manage its metadata schema.
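
For example, you might create a dedicated database and role for Bufstream and grant it full access. A sketch, assuming you're connecting with psql as a superuser; the final grant on the public schema is only needed on PostgreSQL 15 and later:

# Create the database and a dedicated role
$ psql -h <postgres-host> -U postgres -c "CREATE DATABASE <database-name>;"
$ psql -h <postgres-host> -U postgres -c "CREATE ROLE <postgres-user> WITH LOGIN PASSWORD '<postgres-password>';"

# Grant the role full access to the database and its default schema
$ psql -h <postgres-host> -U postgres -c "GRANT ALL PRIVILEGES ON DATABASE <database-name> TO <postgres-user>;"
$ psql -h <postgres-host> -U postgres -d <database-name> -c "GRANT ALL ON SCHEMA public TO <postgres-user>;"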

Network ports#

If you're not running Bufstream locally, the following ports need to be open to allow Kafka clients and admin API requests to connect:

  • Kafka traffic: Defaults to 9092. Change this by setting kafka.address.port in bufstream.yaml.
  • Admin API traffic: Defaults to 9089. Change this by setting kafka.admin_address.port in bufstream.yaml.
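
As a sketch based on those two keys (check the configuration reference for the full address structure), overriding both ports in bufstream.yaml would look something like:

bufstream.yaml
kafka:
  address:
    port: 9093
  admin_address:
    port: 9090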

Other considerations#

For additional configuration topics like instance types and sizes, metadata storage configuration, and autoscaling, see Cluster recommendations.

When running in Kubernetes, Bufstream supports workload identity federation within AWS, GCP, or Azure, as well as GCP Cloud SQL IAM users. Refer to the cloud provider deployment guides for more information.

Deploying with Docker Compose#

We also provide a full-stack Docker Compose example that sets up MinIO, PostgreSQL, and Bufstream for you.