Deploy Bufstream to Azure with Azure Database for PostgreSQL

This page walks you through installing Bufstream into your Azure deployment, using PostgreSQL for metadata storage. See the Azure configuration page for defaults and recommendations about resources, replicas, storage, and scaling.

Data from your Bufstream cluster never leaves your network or reports back to Buf.

Prerequisites

To deploy Bufstream on Azure, you need the following before you start:

  • A Kubernetes cluster (v1.27 or newer)
  • An Azure Storage account and Blob Storage container
  • An Azure Database for PostgreSQL flexible server (version 14 or higher)
  • A managed identity for Bufstream with read/write permission to the storage container above
  • Helm (v3.12.0 or newer)
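
To confirm your local tooling meets these minimums, you can check the versions before you start:

$ kubectl version
$ helm version
$ az --version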

Create an Azure Kubernetes Service (AKS) cluster

Create an AKS cluster if you don't already have one. An AKS cluster involves many settings that vary depending on your use case. See the official documentation for details.
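
As a minimal, illustrative sketch (the name, node count, and zones are placeholders, not recommendations), a cluster with the OIDC issuer and workload identity features used by the WIF setup below could be created like this:

$ az aks create \
  --name <aks-cluster-name> \
  --resource-group <group-name> \
  --location <region> \
  --node-count 3 \
  --zones 1 2 3 \
  --enable-oidc-issuer \
  --enable-workload-identity \
  --generate-ssh-keys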

Set up Workload Identity Federation (WIF) for Bufstream

You can authenticate to Azure Blob Storage with storage account shared access keys, or use Kubernetes WIF via Microsoft Entra Workload ID with Azure Kubernetes Service. See the official documentation for details; the steps below walk through the WIF approach.

Create a storage account

If you don't already have one, create a new resource group:

$ az group create \
  --name <group-name> \
  --location <region>

Then, create a new storage account within the group:

$ az storage account create \
  --name <account-name> \
  --resource-group <group-name> \
  --location <region> \
  --sku Standard_RAGRS \
  --kind StorageV2 \
  --min-tls-version TLS1_2 \
  --allow-blob-public-access false
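
The account's Blob Storage endpoint is needed for the Helm values later; you can retrieve it with:

$ az storage account show \
  --name <account-name> \
  --resource-group <group-name> \
  --query "primaryEndpoints.blob" \
  --output tsv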

Create a storage container

Create a storage container inside the storage account created above:

$ az storage container create \
    --name <container-name> \
    --account-name <account-name> \
    --auth-mode login

Create an Azure Database for PostgreSQL flexible server

Create a new Azure Database for PostgreSQL flexible server:

$ az postgres flexible-server create \
  --name <server-name> \
  --resource-group <group-name> \
  --location <region> \
  --version 16 \
  --password-auth enabled \
  --admin-user postgres \
  --admin-password <postgres-user-password> \
  --tier generalpurpose \
  --sku-name Standard_D4ds_v5 \
  --storage-type premium_lrs \
  --storage-size 32 \
  --performance-tier p20 \
  --storage-auto-grow enabled \
  --high-availability zoneredundant \
  --public-access all \
  --create-default-database enabled \
  --database-name bufstream

For more details about instance creation, see the official docs. For a more secure setup, use private access instead of a public IP.
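
To verify connectivity from a machine with psql installed (assuming public access is enabled and your client IP is allowed through the server firewall; the fully qualified domain name is typically <server-name>.postgres.database.azure.com):

$ psql "postgresql://postgres:<postgres-user-password>@<server-name>.postgres.database.azure.com:5432/bufstream?sslmode=require" \
  -c 'SELECT version();'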

Create a managed identity and assign role to Microsoft Entra Workload ID for WIF

The managed identity must be given the Storage Blob Data Contributor role with access to the target container.

$ az identity create \
  --name <identity-name> \
  --resource-group <group-name> \
  --location <region>

$ export MANAGED_IDENTITY_CLIENT_ID="$(az identity show --resource-group <group-name> --name <identity-name> --query 'clientId' --output tsv)"

$ az role assignment create \
    --role "Storage Blob Data Contributor" \
    --assignee "${MANAGED_IDENTITY_CLIENT_ID}" \
    --scope "/subscriptions/<azure-subscription-id>/resourceGroups/<group-name>/providers/Microsoft.Storage/storageAccounts/<account-name>/blobServices/default/containers/<container-name>"

$ export AKS_OIDC_ISSUER="$(az aks show --name <aks-cluster-name> --resource-group <aks-cluster-resource-group> --query "oidcIssuerProfile.issuerUrl" --output tsv)"

$ az identity federated-credential create \
  --name bufstream \
  --identity-name <identity-name> \
  --resource-group <group-name> \
  --issuer "${AKS_OIDC_ISSUER}" \
  --subject "system:serviceaccount:bufstream:bufstream-service-account" \
  --audience api://AzureADTokenExchange

$ echo $MANAGED_IDENTITY_CLIENT_ID # Save and use for helm values below
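
To confirm the federated credential targets the expected issuer and subject:

$ az identity federated-credential list \
  --identity-name <identity-name> \
  --resource-group <group-name> \
  --output table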

Create a namespace

Create a Kubernetes namespace in the cluster for the Bufstream deployment to use:

$ kubectl create namespace bufstream

Deploy Bufstream

1. Authenticate Helm

To get started, authenticate Helm with the Bufstream OCI registry using the keyfile that was sent alongside this documentation. The keyfile should contain a base64-encoded string.

$ cat keyfile | helm registry login -u _json_key_base64 --password-stdin \
  https://us-docker.pkg.dev/buf-images-1/bufstream

2. Configure Bufstream's Helm values

Bufstream is configured using Helm values that are passed to the bufstream Helm chart. To configure the values:

  1. Create a Helm values file named bufstream-values.yaml, which is required by the helm install command in step 3. This file can be in any location, but we recommend creating it in the same directory where you run the helm commands.

  2. Add the values from the steps below to the bufstream-values.yaml file. Skip to Install the Helm chart for a full example chart.

Configure object storage

Bufstream attempts to acquire credentials from the environment using WIF.

To configure storage, set the following Helm values, filling in your Blob Storage variables:

bufstream-values.yaml
storage:
  use: azure
  azure:
    # Azure storage account container name.
    bucket: <container name>
    # Azure storage account endpoint to use—for example, https://<account-name>.blob.core.windows.net
    endpoint: <endpoint>
bufstream:
  deployment:
    podLabels:
      azure.workload.identity/use: "true"
  serviceAccount:
    annotations:
      azure.workload.identity/client-id: <managed identity client id>

The Kubernetes service account to create the federated identity credential association for is named bufstream-service-account.
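
After the chart is installed (step 3 below), you can confirm that this service account carries the expected client ID annotation:

$ kubectl get serviceaccount bufstream-service-account \
  --namespace bufstream \
  --output yaml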

Alternatively, you can use a shared key pair.

  1. Fetch the shared keys for the storage account. Use only the first key returned; the second key should only be used when you're rotating keys.

    $ az storage account keys list \
      --resource-group <group-name> \
      --account-name <account-name>
    
  2. Create a k8s secret containing the storage account's shared key:

    $ kubectl create secret --namespace bufstream generic bufstream-storage \
      --from-literal=secret_access_key=<Azure storage account key>
    
  3. Add the accessKeyId to the configuration:

    storage:
      use: azure
      azure:
        # Azure storage account container name.
        bucket: <container name>
        # Azure storage account endpoint to use—for example, https://<account-name>.blob.core.windows.net
        endpoint: <endpoint>
        # Azure storage account name to use for auth instead of the metadata server.
        accessKeyId: <account-name>
        # Kubernetes secret containing a `secret_access_key` (as the Azure storage account key) to use instead of the metadata server.
        secretName: bufstream-storage
    

Configure PostgreSQL

Get the endpoint address of the PostgreSQL instance:

$ az postgres flexible-server show \
  --name <server-name> \
  --resource-group <resource-group> \
  --query "{endpoint:fullyQualifiedDomainName}" \
  --output table

Create a secret with the DSN to connect to the PostgreSQL instance:

$ kubectl create secret --namespace bufstream generic bufstream-postgres \
  --from-literal=dsn='postgresql://postgres:<postgres-user-password>@<endpoint-address>:5432/bufstream?sslmode=require'
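
To confirm the secret holds the expected DSN:

$ kubectl get secret bufstream-postgres --namespace bufstream \
  --output jsonpath='{.data.dsn}' | base64 --decode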

Then, configure Bufstream to connect to PostgreSQL:

bufstream-values.yaml
metadata:
  use: postgres
  postgres:
    secretName: bufstream-postgres

3. Install the Helm chart

Proceed to the zonal deployment steps if you want to deploy Bufstream with zone-aware routing. If not, follow the instructions below to deploy the basic Helm chart.

Add the following to the bufstream-values.yaml Helm values file to make Bufstream brokers automatically detect their zone:

discoverZoneFromNode: true

After following the steps above, the set of Helm values should be similar to the example below:

bufstream-values.yaml
storage:
  use: azure
  azure:
    # Azure storage account container name.
    bucket: my-container
    endpoint: https://mystorageaccount.blob.core.windows.net
bufstream:
  deployment:
    podLabels:
      azure.workload.identity/use: "true"
  serviceAccount:
    annotations:
      azure.workload.identity/client-id: <managed identity client id>
metadata:
  use: postgres
  postgres:
    secretName: bufstream-postgres
discoverZoneFromNode: true

Or, if you're using a shared key pair instead of WIF:

bufstream-values.yaml
storage:
  use: azure
  azure:
    bucket: my-container
    endpoint: https://mystorageaccount.blob.core.windows.net
    accessKeyId: mystorageaccount
    secretName: bufstream-storage
metadata:
  use: postgres
  postgres:
    secretName: bufstream-postgres
discoverZoneFromNode: true

Using the bufstream-values.yaml Helm values file, install the Helm chart for the cluster and set the correct Bufstream version:

$ helm install bufstream oci://us-docker.pkg.dev/buf-images-1/bufstream/charts/bufstream \
  --version "<version>" \
  --namespace=bufstream \
  --values bufstream-values.yaml

If you change any configuration in the bufstream-values.yaml file, apply the changes by re-running the command above with helm upgrade in place of helm install.
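
To check that the release deployed and the brokers are running:

$ helm list --namespace bufstream
$ kubectl get pods --namespace bufstream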

Deploy Bufstream with zone-aware routing

1. Specify a list of target zones

First, specify a list of target zones in a ZONES variable, which is used by the commands that follow.

$ ZONES=(<zone1> <zone2> <zone3>)
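
The zone names must match the topology.kubernetes.io/zone labels on your nodes; on AKS these typically look like <region>-1, <region>-2, and <region>-3. You can list them with:

$ kubectl get nodes -L topology.kubernetes.io/zone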

2. Create WIF Association for all zones

If you're using WIF, you'll need to create a federated identity credential for each service account in each zone.

$ export AKS_OIDC_ISSUER="$(az aks show --name <aks cluster name> --resource-group <group name> --query "oidcIssuerProfile.issuerUrl" -otsv)"

$ for ZONE in "${ZONES[@]}"; do
  az identity federated-credential create \
    --name bufstream-${ZONE} \
    --identity-name <identity name> \
    --resource-group <group name> \
    --issuer "${AKS_OIDC_ISSUER}" \
    --subject system:serviceaccount:bufstream:bufstream-service-account-${ZONE} \
    --audience api://AzureADTokenExchange
done

3. Create Helm values files for each zone

Then, use this script to iterate through the availability zones saved in the ZONES variable and create a Helm values file for each zone:

$ for ZONE in "${ZONES[@]}"; do
  cat <<EOF > "bufstream-${ZONE}-values.yaml"
nameOverride: bufstream-${ZONE}
name: bufstream-${ZONE}
zone: ${ZONE}
bufstream:
  serviceAccount:
    name: bufstream-service-account-${ZONE}
  deployment:
    replicaCount: 2
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
              - ${ZONE}
kafka:
  publicAddress:
    host: bufstream-${ZONE}.bufstream.svc.cluster.local
    port: 9092
EOF
done

Using the example ZONES variable above creates three values files: bufstream-<zone1>-values.yaml, bufstream-<zone2>-values.yaml, and bufstream-<zone3>-values.yaml.

4. Install the Helm chart for each zone

After following the steps above and creating the zone-specific values files, the collection of Helm values should be structurally similar to the example below:

bufstream-values.yaml
storage:
  use: azure
  azure:
    # Azure storage account container name.
    bucket: my-container
    endpoint: https://mystorageaccount.blob.core.windows.net
bufstream:
  deployment:
    podLabels:
      azure.workload.identity/use: "true"
  serviceAccount:
    annotations:
      azure.workload.identity/client-id: <managed identity client id>
metadata:
  use: postgres
  postgres:
    secretName: bufstream-postgres
bufstream-<zone1>-values.yaml
nameOverride: bufstream-<zone1>
name: bufstream-<zone1>
zone: <zone1>
bufstream:
  serviceAccount:
    name: bufstream-service-account-<zone1>
  deployment:
    replicaCount: 2
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
              - <zone1>
kafka:
  publicAddress:
    host: bufstream-<zone1>.bufstream.svc.cluster.local
    port: 9092

To deploy a zone-aware Bufstream using the bufstream-values.yaml Helm values file, install the Helm chart for the cluster, set the target Bufstream version, and supply the ZONES variable:

$ for ZONE in "${ZONES[@]}"; do
  helm install "bufstream-${ZONE}" oci://us-docker.pkg.dev/buf-images-1/bufstream/charts/bufstream \
    --version "<version>" \
    --namespace=bufstream \
    --values bufstream-values.yaml \
    --values "bufstream-${ZONE}-values.yaml"
done

If you change any configuration in the bufstream-values.yaml file or the zone-specific values files, apply the changes by re-running the command above with helm upgrade in place of helm install.

5. Create a regional service for the cluster

Create a regional service that provides a single bootstrap address for Bufstream across all zones.

$ cat <<EOF | kubectl apply -f -
---
apiVersion: v1
kind: Service
metadata:
  labels:
    bufstream.buf.build/cluster: bufstream
  name: bufstream
  namespace: bufstream
spec:
  type: ClusterIP
  ports:
  - name: connect
    port: 8080
    protocol: TCP
    targetPort: 8080
  - name: admin
    port: 9089
    protocol: TCP
    targetPort: 9089
  - name: kafka
    port: 9092
    protocol: TCP
    targetPort: 9092
  selector:
    bufstream.buf.build/cluster: bufstream
EOF
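
To verify that the service exists and selects brokers in every zone:

$ kubectl get service bufstream --namespace bufstream
$ kubectl get endpoints bufstream --namespace bufstream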

Running CLI commands

Once you've deployed, you can run Bufstream CLI commands directly on the running Bufstream pods with kubectl exec. You don't need to install anything else.
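
For example, to find a broker pod and invoke the CLI in it (the pod name is a placeholder, and bufstream --help assumes the CLI's standard help flag):

$ kubectl get pods --namespace bufstream
$ kubectl exec --namespace bufstream <pod-name> -- bufstream --help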