Deploy with Docker#
Docker is the fastest way to deploy Bufstream, whether you need an in-memory broker for testing and development or a persistent environment using an existing storage bucket and Postgres database.
For production Kubernetes deployments with Helm, we provide deployment guides for AWS, Google Cloud, and Azure. For a full-stack local environment, we also provide a Docker Compose example.
In-memory deployment#
You can run a Bufstream broker suitable for local testing and development with one line:
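For example, here's a minimal sketch; the bufbuild/bufstream image name and the --inmemory flag on bufstream serve are assumptions, so check the image's documentation for the exact invocation:

# Ephemeral broker: no bucket or Postgres required; data is lost when the container stops.
docker run --rm \
  -p 9092:9092 \
  -p 9089:9089 \
  bufbuild/bufstream serve --inmemory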
This creates an ephemeral instance listening for Kafka connections on port 9092 and admin API requests on port 9089. Once it's stopped, all data is lost.
Deploying with existing resources#
For a long-lived broker, Bufstream requires the following:
- Object storage such as AWS S3, Google Cloud Storage, or Azure Blob Storage.
- A metadata storage service such as PostgreSQL.
Follow the instructions below to run Bufstream with an existing storage bucket and Postgres database.
If you're using AWS S3, create a bufstream.yaml file providing bucket configuration and Postgres connection information:
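The following is a sketch modeled on the MinIO example later in this section; it assumes static access keys, so adjust it to match how your broker obtains AWS credentials:

storage:
  provider: S3
  region: <region>
  bucket: <bucket-name>
  access_key_id:
    string: <access-key-id>
  secret_access_key:
    string: <secret-access-key>
postgres:
  dsn:
    string: postgresql://<postgres-user>:<postgres-password>@<postgres-host>:<postgres-port>/<database-name>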
If you're using Google Cloud Storage, start by creating a service account that has the appropriate bucket permissions. Once it's created, add a key. Creating the key results in a downloaded JSON file. Next, create a bufstream.yaml file providing bucket configuration and Postgres connection information:
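A minimal sketch, assuming the provider name GCS and that credentials come from the mounted service account key file rather than from bufstream.yaml itself:

storage:
  provider: GCS
  bucket: <bucket-name>
postgres:
  dsn:
    string: postgresql://<postgres-user>:<postgres-password>@<postgres-host>:<postgres-port>/<database-name>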
If you're using Azure Blob Storage, create a bufstream.yaml file providing bucket configuration and Postgres connection information:
storage:
  provider: AZURE
  bucket: <blob-container-name>
  endpoint: https://<storage-account-name>.blob.core.windows.net
  access_key_id:
    string: <storage-account-name>
  secret_access_key:
    string: <access-key>
postgres:
  dsn:
    string: postgresql://<postgres-user>:<postgres-password>@<postgres-host>:<postgres-port>/<database-name>
If you're using a local MinIO container, create a bufstream.yaml file providing bucket configuration and Postgres connection information:
storage:
  provider: S3
  region: <region>
  bucket: <bucket-name>
  endpoint: http://localhost:<minio-port>
  force_path_style: true
  access_key_id:
    string: <minio-username>
  secret_access_key:
    string: <minio-password>
postgres:
  dsn:
    string: postgresql://<postgres-user>:<postgres-password>@<postgres-host>:<postgres-port>/<database-name>
It's never a good idea to commit credentials, so be sure to follow your organization's policies before adding configuration files like bufstream.yaml to version control.
Now that you have a configuration file, use Docker to start Bufstream. Note that this command uses -v to mount the bufstream.yaml file and the --config flag to specify the file for bufstream serve.
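Here's a minimal sketch; the image name, the in-container paths, and the GOOGLE_APPLICATION_CREDENTIALS convention for Google Cloud Storage are assumptions to adapt to your setup:

# Mount bufstream.yaml into the container and point bufstream serve at it.
# The key-file mount and environment variable apply only to Google Cloud Storage.
docker run --rm \
  -p 9092:9092 \
  -p 9089:9089 \
  -v "$(pwd)/bufstream.yaml:/bufstream.yaml" \
  -v "<service-account-key-file>:/credentials.json" \
  -e GOOGLE_APPLICATION_CREDENTIALS=/credentials.json \
  bufbuild/bufstream serve --config /bufstream.yaml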
If you're using Google Cloud Storage, replace <service-account-key-file> with the path to your service account key file.
This creates a broker listening for Kafka connections on port 9092 and admin API requests on port 9089. It's safe to stop this instance: all of its topic data is stored in the bucket you configured, and its metadata state is stored in Postgres.
Bucket permissions#
Bufstream needs the following permissions to work with objects in its storage bucket.
Bufstream uses an S3 bucket for object storage, and needs to perform the following operations:
- s3:GetObject: Read existing objects
- s3:PutObject: Create new objects
- s3:DeleteObject: Remove old objects according to retention and compaction rules
- s3:ListBucket: List objects in the bucket
- s3:AbortMultipartUpload: Allow failing of multi-part uploads that won't succeed
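For example, here's a minimal IAM policy sketch granting those actions on a single bucket; tailor the resources and any conditions to your environment:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload"
      ],
      "Resource": "arn:aws:s3:::<bucket-name>/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::<bucket-name>"
    }
  ]
}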
For more information about S3 bucket permissions and actions, consult the AWS S3 documentation.
Bufstream uses a Google Cloud Storage bucket for object storage, and needs to perform the following operations:
- storage.objects.create: Create new storage objects
- storage.objects.get: Retrieve existing storage objects
- storage.objects.delete: Remove old storage objects according to retention and compaction rules
- storage.objects.list: View all storage objects to enforce retention and compaction rules
- storage.multipartUploads.*: Allow multi-part uploads
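One way to cover these permissions is to grant the service account the roles/storage.objectAdmin role on the bucket, which is broader than the minimum above; a sketch using the gcloud CLI:

# Grant the Bufstream service account object-level access to the bucket.
gcloud storage buckets add-iam-policy-binding gs://<bucket-name> \
  --member="serviceAccount:<service-account-email>" \
  --role="roles/storage.objectAdmin"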
Bufstream uses Azure Blob Storage for object storage, and, at minimum, requires the Storage Blob Data Contributor RBAC role on the Storage account container.
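For example, a sketch of assigning that role with the Azure CLI; the principal ID and resource path are placeholders to adjust for your subscription:

# Assign Storage Blob Data Contributor on the blob container Bufstream uses.
az role assignment create \
  --assignee "<principal-id>" \
  --role "Storage Blob Data Contributor" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account-name>/blobServices/default/containers/<blob-container-name>"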
If you're using a local MinIO container and connecting as an administrator, you don't need to configure these. Otherwise, make sure you've granted the following:
- s3:GetObject: Read existing objects
- s3:PutObject: Create new objects
- s3:DeleteObject: Remove old objects according to retention and compaction rules
- s3:ListBucket: List objects in the bucket
- s3:AbortMultipartUpload: Allow failing of multi-part uploads that won't succeed
Postgres role#
Bufstream needs full access to the database in Postgres so that it can manage its metadata schema.
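For example, a sketch using the standard PostgreSQL client tools; the role and database names are placeholders:

# Create a dedicated role that owns (and therefore has full access to) Bufstream's metadata database.
createuser --pwprompt bufstream
createdb --owner bufstream bufstream_metadata

The dsn in bufstream.yaml then references this role and database.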
Network ports#
If you're not running Bufstream locally, the following ports need to be open to allow Kafka clients and admin API requests to connect:
- Kafka traffic: Defaults to 9092. Change this by setting kafka.address.port in bufstream.yaml.
- Admin API traffic: Defaults to 9089. Change this by setting kafka.admin_address.port in bufstream.yaml.
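For example, a sketch that follows those key paths and moves both listeners to non-default ports:

kafka:
  # Example non-default ports for the Kafka and admin API listeners.
  address:
    port: 19092
  admin_address:
    port: 19089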
Other considerations#
For additional configuration topics like instance types and sizes, metadata storage configuration, and autoscaling, see Cluster recommendations.
When running in Kubernetes, Bufstream supports workload identity federation within AWS, GCP, or Azure. It also supports GCP Cloud SQL IAM users. Refer to the cloud provider deployment guides for more information.
Deploying with Docker Compose#
We also provide a full-stack Docker Compose example that sets up MinIO, PostgreSQL, and Bufstream for you.