Deploy Bufstream to Google Cloud

This page walks you through installing Bufstream into your Google Cloud Platform (GCP) deployment by setting your Helm values and installing the provided Helm chart. See the GCP configuration page for defaults and recommendations about resources, replicas, storage, and scaling.

Data from your Bufstream cluster will never leave your network or report back to Buf.

Prerequisites

To deploy Bufstream on GCP, you need the following before you start:

  • A Kubernetes cluster (v1.27 or newer)
  • A Google Cloud Storage bucket
  • A Bufstream service account, with read/write permission to the GCS bucket above.
  • Helm (v3.12.0 or newer)
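The version minimums above can be checked with a small helper before you continue. This is a minimal sketch, not part of Bufstream: the version_ge function name is hypothetical, and it assumes a sort that supports -V (GNU coreutils and recent macOS). The Helm check only runs if helm is on your PATH.

```shell
#!/usr/bin/env bash
# version_ge ACTUAL MINIMUM: succeeds when ACTUAL is at least MINIMUM.
# Assumes `sort -V` (version sort) is available.
version_ge() {
  # Strip a leading "v" so "v3.12.0" and "3.12.0" compare the same way.
  local actual="${1#v}"
  local minimum="${2#v}"
  local lowest
  lowest="$(printf '%s\n%s\n' "$actual" "$minimum" | sort -V | head -n 1)"
  # If the minimum sorts first (or both are equal), the actual version is new enough.
  [ "$lowest" = "$minimum" ]
}

# Only query Helm if it is installed. The minimum comes from the list above.
if command -v helm >/dev/null 2>&1; then
  helm_version="$(helm version --template '{{.Version}}')"
  if version_ge "$helm_version" "3.12.0"; then
    echo "helm ${helm_version} meets the v3.12.0 minimum"
  else
    echo "helm ${helm_version} is older than v3.12.0" >&2
  fi
fi
```

The same helper can compare the Kubernetes server version reported by your cluster against 1.27.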

If you haven't set up your GCP environment yet, you need at least the following IAM roles to create these resources:

  • Kubernetes Engine Admin role (roles/container.admin)
  • Storage Admin role (roles/storage.admin)
  • Service Account Admin role (roles/iam.serviceAccountAdmin)
  • Optionally, you may also have either of these roles, but neither is required:
    • Role Administrator role (roles/iam.roleAdmin)
    • Service Account Key Admin role (roles/iam.serviceAccountKeyAdmin) (not recommended)

Create a GKE cluster

Create a GKE standard cluster if you don't already have one. A GKE cluster involves many settings that vary depending on your use case. See the official documentation for details, but you'll need at least these settings:

  • [Optional, but recommended] Workload identity federation:
    • Toggle Enable Workload Identity in the console under the Security tab when creating the cluster; or
    • Include --workload-pool=<gcp-project-name>.svc.id.goog on the gcloud command.
    • See the official documentation
  • Bufstream brokers by default use 2 CPUs and 8 GiB of memory, so you'll need a node pool with machine types at least as big as n2d-standard-4. Learn more about configuring resources in Resources and replicas.

Create a GCS bucket

If you don't already have a GCS bucket, create one. This requires the Storage Admin role (roles/storage.admin).

$ gcloud storage buckets create gs://<bucket-name> --project <gcp-project-name>

Create a Bufstream Service Account

Bufstream needs a dedicated service account. If you don't have one yet, make sure you have the Service Account Admin role (roles/iam.serviceAccountAdmin) and create a service account:

$ gcloud iam service-accounts create bufstream-service-account --project <gcp-project-name>
$ gcloud iam service-accounts add-iam-policy-binding bufstream-service-account@<gcp-project-name>.iam.gserviceaccount.com \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:<gcp-project-name>.svc.id.goog[bufstream/bufstream-service-account]"
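The --member value in the binding above follows a fixed pattern: the workload identity pool (<gcp-project-name>.svc.id.goog), followed by the Kubernetes namespace and Kubernetes service account name in square brackets. The sketch below only assembles those strings so you can check them before running gcloud. The function names wi_member and gsa_email and the project ID my-project are hypothetical.

```shell
# wi_member PROJECT NAMESPACE KSA_NAME
# Prints the Workload Identity member string for a Kubernetes service account.
wi_member() {
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$1" "$2" "$3"
}

# gsa_email NAME PROJECT
# Prints the email address of a GCP service account.
gsa_email() {
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

# "my-project" is a placeholder project ID used only for illustration.
wi_member "my-project" "bufstream" "bufstream-service-account"
gsa_email "bufstream-service-account" "my-project"
```

The first argument to each gcloud command above is the value printed by gsa_email, and the --member flag takes the value printed by wi_member.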

If you have the Storage Admin role, you can add permissions directly on the bucket:

$ gcloud storage buckets add-iam-policy-binding gs://<bucket-name> --member=serviceAccount:bufstream-service-account@<gcp-project-name>.iam.gserviceaccount.com --role=roles/storage.objectAdmin

If you have the Project IAM Admin role (roles/resourcemanager.projectIamAdmin), you can also set the permission on the entire project:

$ gcloud projects add-iam-policy-binding <gcp-project-name> --member=serviceAccount:bufstream-service-account@<gcp-project-name>.iam.gserviceaccount.com --role=roles/storage.objectAdmin

Using Custom Object Storage permissions

If you have the Role Administrator role (roles/iam.roleAdmin), you can also create a role with the minimal set of permissions required:

$ gcloud iam roles create 'bufstream.gcs' \
  --project <gcp-project-name> \
  --permissions=storage.objects.create,storage.objects.get,storage.objects.delete,storage.objects.list,storage.multipartUploads.abort,storage.multipartUploads.create,storage.multipartUploads.list,storage.multipartUploads.listParts

Then replace --role=roles/storage.objectAdmin with --role=projects/<gcp-project-name>/roles/bufstream.gcs in the preceding commands.

Create a namespace

Create a Kubernetes namespace in the k8s cluster for the bufstream deployment to use:

$ kubectl create namespace bufstream

Deploy etcd

Bufstream requires an etcd cluster. To set up an example deployment of etcd on Kubernetes, use the Bitnami etcd Helm chart with the following values:

$ helm install \
  --namespace bufstream \
  bufstream-etcd \
  oci://registry-1.docker.io/bitnamicharts/etcd \
  -f - <<EOF
replicaCount: 3
persistence:
  enabled: true
  size: 10Gi
  storageClass: "premium-rwo"
autoCompactionMode: periodic
autoCompactionRetention: 30s
removeMemberOnContainerTermination: false
resourcesPreset: none
auth:
  rbac:
    create: false
    enabled: false
  token:
    enabled: false
metrics:
  useSeparateEndpoint: true
customLivenessProbe:
  httpGet:
    port: 9090
    path: /livez
    scheme: "HTTP"
  initialDelaySeconds: 10
  periodSeconds: 30
  timeoutSeconds: 15
  failureThreshold: 10
customReadinessProbe:
  httpGet:
    port: 9090
    path: /readyz
    scheme: "HTTP"
  initialDelaySeconds: 20
  timeoutSeconds: 10
extraEnvVars:
  - name: ETCD_LISTEN_CLIENT_HTTP_URLS
    value: "http://0.0.0.0:8080"
EOF

Warning

etcd is sensitive to disk performance, so we recommend using SSD-backed disks, such as the premium-rwo storage class used in the example above.

Deploy Bufstream

1. Authenticate helm

To get started, authenticate helm with the Bufstream OCI registry using the keyfile that was sent alongside this documentation. The keyfile should contain a base64-encoded string.

$ cat keyfile | helm registry login -u _json_key_base64 --password-stdin \
  https://us-docker.pkg.dev/buf-images-1/bufstream
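If the login fails, it can help to confirm that the keyfile is non-empty and decodes as base64. The following is a minimal sketch under stated assumptions: the keyfile_looks_valid function name is hypothetical, the file is assumed to be named keyfile as in the command above, and base64 -d is assumed to be supported (older macOS releases use -D instead). It checks the encoding only, not whether the registry accepts the credentials.

```shell
# keyfile_looks_valid PATH
# Succeeds when PATH is a non-empty file whose contents decode as base64.
# It does not verify the credentials themselves.
keyfile_looks_valid() {
  # -s is true only for files that exist and are non-empty.
  [ -s "$1" ] || return 1
  # Remove whitespace so that line-wrapped files decode consistently.
  tr -d '[:space:]' < "$1" | base64 -d > /dev/null 2>&1
}

# Only run the check if a file named "keyfile" is present in this directory.
if [ -e keyfile ]; then
  if keyfile_looks_valid keyfile; then
    echo "keyfile decodes as base64"
  else
    echo "keyfile is empty or is not valid base64" >&2
  fi
fi
```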

2. Configure Bufstream's Helm values

Bufstream is configured using Helm values that are passed to the bufstream Helm chart. To configure the values:

  1. Create a Helm values file named bufstream-values.yaml, which is required by the helm install command in step 3 (Install the Helm chart). This file can be in any location, but we recommend creating it in the same directory where you run the helm commands.

  2. Put the values from the steps below in the bufstream-values.yaml file. Skip to Install the Helm chart for a full example chart.

Configure object storage

Bufstream requires GCS object storage. See bucket permissions for a minimal set of permissions required.

Bufstream attempts to acquire credentials from the environment using GKE Workload Identity Federation. To configure storage, set the following Helm values, filling in your GCS variables and service account annotations for the service account binding:

bufstream-values.yaml
storage:
  use: gcs
  gcs:
    bucket: <bucket-name>
bufstream:
  serviceAccount:
    annotations:
      iam.gke.io/gcp-service-account: bufstream-service-account@<gcp-project-name>.iam.gserviceaccount.com

The k8s service account to be bound to the GCP service account is named bufstream-service-account.

Alternatively, you can use service account credentials. You'll need the Service Account Key Admin role (roles/iam.serviceAccountKeyAdmin) for this.

  1. Create a key credential for the service account:
$ gcloud iam service-accounts keys create credentials.json --iam-account=bufstream-service-account@<gcp-project-name>.iam.gserviceaccount.com --key-file-type=json
  2. Create a k8s secret containing the service account credentials:
$ kubectl create secret --namespace bufstream generic bufstream-service-account-credentials \
  --from-file=credentials.json=credentials.json
  3. Set the secretName in the configuration:
bufstream-values.yaml
storage:
  use: gcs
  gcs:
    bucket: <bucket-name>
    secretName: "bufstream-service-account-credentials"

Configure etcd

Then, configure Bufstream to connect to the etcd cluster that you created before:

bufstream-values.yaml
metadata:
  use: etcd
  etcd:
    # etcd addresses to connect to
    addresses:
    - host: "bufstream-etcd.bufstream.svc.cluster.local"
      port: 2379
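The host value above is the in-cluster DNS name of the etcd Service, which follows the pattern <service>.<namespace>.svc.<cluster-domain>. The sketch below assembles that name from the Helm release name and namespace used earlier on this page. The k8s_service_fqdn function name is hypothetical, and cluster.local is assumed as the cluster domain because it is the Kubernetes default; adjust it if your cluster uses a different domain.

```shell
# k8s_service_fqdn SERVICE NAMESPACE [CLUSTER_DOMAIN]
# Prints the in-cluster DNS name of a Kubernetes Service.
# CLUSTER_DOMAIN defaults to "cluster.local" (an assumption; the Kubernetes default).
k8s_service_fqdn() {
  service="$1"
  namespace="$2"
  domain="${3:-cluster.local}"
  printf '%s.%s.svc.%s\n' "$service" "$namespace" "$domain"
}

# The etcd release installed above is named bufstream-etcd in the bufstream namespace.
k8s_service_fqdn "bufstream-etcd" "bufstream"
# prints: bufstream-etcd.bufstream.svc.cluster.local
```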

Configure observability

The observability block is used to configure the collection and exporting of metrics and traces from your application, using Prometheus or OTLP:

bufstream-values.yaml
observability:
  # Optional, set the log level
  # logLevel: INFO
  # otlpEndpoint: "" # Optional, OTLP endpoint to send traces and metrics to.
  metrics:
    # Optional, can be either "NONE", "STDOUT", "HTTP", "HTTPS" or "PROMETHEUS"
    # When set to HTTP or HTTPS, will send OTLP metrics
    # When set to PROMETHEUS, will expose prometheus metrics for scraping on port 9090 under /metrics
    exporter: "NONE"
  tracing:
    # Optional, can be either "NONE", "STDOUT", "HTTP", or "HTTPS"
    # When set to HTTP or HTTPS, will send OTLP traces
    exporter: "NONE"
    # Optional, trace sampling ratio, defaults to 0.1
    # traceRatio: 0.1

3. Install the Helm chart

If you want to deploy Bufstream with zone-aware routing, go to the zonal deployment steps. If not, follow the instructions below to deploy the basic Helm chart.

After following the steps above, the set of Helm values should be similar to the example below:

bufstream-values.yaml
storage:
  use: gcs
  gcs:
    bucket: <bucket-name>
metadata:
  use: etcd
  etcd:
    # etcd addresses to connect to
    addresses:
    - host: "bufstream-etcd.bufstream.svc.cluster.local"
      port: 2379
observability:
  metrics:
    exporter: "PROMETHEUS"

Using the bufstream-values.yaml Helm values file, install the Helm chart for the cluster and set the target Bufstream version:

$ helm install bufstream oci://us-docker.pkg.dev/buf-images-1/bufstream/charts/bufstream \
  --version "<version>" \
  --namespace=bufstream \
  --values bufstream-values.yaml

If you change any configuration in the bufstream-values.yaml file, apply the changes by running helm upgrade with the same release name, chart, and flags; helm install fails for a release that already exists.

Deploy Bufstream with zone-aware routing

1. Specify a list of target zones

First, specify a list of target zones in a ZONES variable, which will be used for future commands.

$ ZONES=(<zone1> <zone2> <zone3>)
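ZONES is a shell array, and how it expands depends on the shell. In bash, a bare $ZONES expands to the first element only, while "${ZONES[@]}" expands to every element in both bash and zsh. The sketch below illustrates the portable form. The zone names are examples only, so substitute the zones your cluster runs in.

```shell
#!/usr/bin/env bash
# Example zones only; replace them with the zones your cluster uses.
ZONES=(us-central1-a us-central1-b us-central1-c)

# ${#ZONES[@]} is the number of elements in the array.
echo "zone count: ${#ZONES[@]}"

# "${ZONES[@]}" expands to one word per zone in both bash and zsh.
for ZONE in "${ZONES[@]}"; do
  echo "bufstream-service-account-${ZONE}"
done
```

With the example values, the loop prints one service account name per zone, matching the names used in the steps that follow.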

2. Create GCP service account association for all zones

Bind the GCP service account to the zone-specific Kubernetes service account for each zone:

$ for ZONE in "${ZONES[@]}"; do
gcloud iam service-accounts add-iam-policy-binding bufstream-service-account@<gcp-project-name>.iam.gserviceaccount.com \
    --role roles/iam.workloadIdentityUser \
    --member "serviceAccount:<gcp-project-name>.svc.id.goog[bufstream/bufstream-service-account-${ZONE}]"
done

3. Create Helm values files for each zone

Then, use this script to iterate through the availability zones saved in the ZONES variable and create a Helm values file for each zone:

$ for ZONE in "${ZONES[@]}"; do
  cat <<EOF > "bufstream-${ZONE}-values.yaml"
nameOverride: bufstream-${ZONE}
name: bufstream-${ZONE}
bufstream:
  serviceAccount:
    name: bufstream-service-account-${ZONE}
  deployment:
    replicaCount: 2
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
              - ${ZONE}
kafka:
  publicAddress:
    host: bufstream-${ZONE}.bufstream.svc.cluster.local
    port: 9092
EOF
done
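Before installing, you may want to confirm that the loop produced one file per zone and that each file was rendered with its own zone name. This is a minimal sketch: the check_zone_values function name is hypothetical, and it assumes the files are in the current directory and use the bufstream-<zone>-values.yaml naming from the script above.

```shell
# check_zone_values ZONE...
# Succeeds when every zone has a values file in the current directory
# whose nameOverride line was rendered with that zone's name.
check_zone_values() {
  for zone in "$@"; do
    file="bufstream-${zone}-values.yaml"
    if [ ! -f "$file" ]; then
      echo "missing values file: $file" >&2
      return 1
    fi
    # -x matches the whole line and -F treats the pattern as a literal string.
    if ! grep -qxF -- "nameOverride: bufstream-${zone}" "$file"; then
      echo "unexpected nameOverride in: $file" >&2
      return 1
    fi
  done
}
```

Call it with the same zones, for example check_zone_values "${ZONES[@]}", and continue only if it succeeds.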

Using the example ZONES variable above creates three values files: bufstream-<zone1>-values.yaml, bufstream-<zone2>-values.yaml, and bufstream-<zone3>-values.yaml. Bufstream is available in all GCP regions, so the variable can list zones from any region, such as us-central1 or europe-central2.

4. Install the Helm chart for each zone

After following the steps above and creating the zone-specific values files, the collection of Helm values should be structurally similar to the example below:

bufstream-values.yaml
storage:
  use: gcs
  gcs:
    bucket: <bucket-name>
metadata:
  use: etcd
  etcd:
    # etcd addresses to connect to
    addresses:
    - host: "bufstream-etcd.bufstream.svc.cluster.local"
      port: 2379
observability:
  metrics:
    exporter: "PROMETHEUS"

bufstream-<zone1>-values.yaml
nameOverride: bufstream-<zone1>
name: bufstream-<zone1>
bufstream:
  serviceAccount:
    name: bufstream-service-account-<zone1>
  deployment:
    replicaCount: 2
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
              - <zone1>
kafka:
  publicAddress:
    host: bufstream-<zone1>.bufstream.svc.cluster.local
    port: 9092

To deploy a zone-aware Bufstream using the bufstream-values.yaml Helm values file, install the Helm chart for the cluster, set the target Bufstream version, and supply the ZONES variable:

$ for ZONE in "${ZONES[@]}"; do
  helm install "bufstream-${ZONE}" oci://us-docker.pkg.dev/buf-images-1/bufstream/charts/bufstream \
    --version "<version>" \
    --namespace=bufstream \
    --values bufstream-values.yaml \
    --values "bufstream-${ZONE}-values.yaml"
done
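
If you change any configuration in the bufstream-values.yaml file or in a zone-specific values file, apply the changes by running helm upgrade in place of helm install in the loop above; helm install fails for a release that already exists.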

5. Create a regional service for the cluster

Create a regional service that provides a single bootstrap address for Bufstream across all zones:

$ cat <<EOF | kubectl apply -f -
---
apiVersion: v1
kind: Service
metadata:
  labels:
    bufstream.buf.build/cluster: bufstream
  name: bufstream
  namespace: bufstream
spec:
  type: ClusterIP
  ports:
  - name: connect
    port: 8080
    protocol: TCP
    targetPort: 8080
  - name: admin
    port: 9089
    protocol: TCP
    targetPort: 9089
  - name: kafka
    port: 9092
    protocol: TCP
    targetPort: 9092
  selector:
    bufstream.buf.build/cluster: bufstream
EOF