Deploy Bufstream to AWS with etcd#
This page walks you through installing Bufstream into your AWS deployment, using etcd for metadata storage.
Data from your Bufstream cluster never leaves your network or reports back to Buf.
Prerequisites#
To deploy Bufstream on AWS, you need the following before you start:
- A Kubernetes cluster (v1.27 or newer)
- An S3 bucket
- A Bufstream IAM role with read/write permission to the S3 bucket above
- Helm (v3.12.0 or newer)
If you don't yet have your AWS environment, you'll need the following:
- For the EKS cluster
  - AmazonEC2FullAccess
  - A custom policy including eks:*, since there's no default AWS EKS policy that includes it
- For the S3 bucket
  - AmazonS3FullAccess
- For the Bufstream role
  - IAMFullAccess
Terraform module#
We also provide a Terraform module at https://github.com/bufbuild/terraform-modules-bufstream. It can set up all necessary components in an empty AWS account, or create only the components missing from an existing environment.
If you're setting up from an empty account, the following IAM policy should contain the minimum permissions needed:
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "EKSFullAccess",
"Effect": "Allow",
"Action": [
"eks:*"
],
"Resource": "*"
},
{
"Sid": "VPCManagement",
"Effect": "Allow",
"Action": [
"ec2:CreateVpc",
"ec2:DeleteVpc",
"ec2:DescribeVpcs",
"ec2:CreateSubnet",
"ec2:DeleteSubnet",
"ec2:DescribeSubnets",
"ec2:CreateRouteTable",
"ec2:DeleteRouteTable",
"ec2:DescribeRouteTables",
"ec2:CreateRoute",
"ec2:DeleteRoute",
"ec2:DescribeRoutes",
"ec2:CreateInternetGateway",
"ec2:AttachInternetGateway",
"ec2:DetachInternetGateway",
"ec2:DeleteInternetGateway",
"ec2:CreateNatGateway",
"ec2:DeleteNatGateway",
"ec2:DescribeNatGateways",
"ec2:CreateSecurityGroup",
"ec2:DeleteSecurityGroup",
"ec2:DescribeSecurityGroups",
"ec2:AuthorizeSecurityGroupIngress",
"ec2:RevokeSecurityGroupIngress",
"ec2:AuthorizeSecurityGroupEgress",
"ec2:RevokeSecurityGroupEgress",
"ec2:CreateTags",
"ec2:DeleteTags",
"ec2:CreateVpcEndpoint",
"ec2:DeleteVpcEndpoint",
"ec2:DescribeVpcEndpoints",
"ec2:ModifyVpcEndpoint",
"ec2:DescribeVpcEndpointServices"
],
"Resource": "*"
},
{
"Sid": "S3Management",
"Effect": "Allow",
"Action": [
"s3:CreateBucket",
"s3:DeleteBucket",
"s3:ListAllMyBuckets",
"s3:ListBucket",
"s3:GetBucketAcl",
"s3:PutBucketAcl",
"s3:GetBucketPolicy",
"s3:PutBucketPolicy",
"s3:DeleteBucketPolicy",
"s3:GetBucketCors",
"s3:PutBucketCors",
"s3:DeleteBucketCors"
],
"Resource": "*"
},
{
"Sid": "IAMPassRole",
"Effect": "Allow",
"Action": [
"iam:PassRole"
],
"Resource": "*",
"Condition": {
"StringEquals": {
"iam:PassedToService": [
"eks.amazonaws.com",
"ec2.amazonaws.com"
]
}
}
}
]
}
Create an EKS cluster#
Create an EKS cluster if you don't already have one. An EKS cluster involves many settings that vary depending on your use case. See the official documentation for details.
Set up Workload Identity Federation for Bufstream role#
You can authenticate to S3 with access keys, or you can use Kubernetes Workload Identity Federation with either OIDC or EKS Pod Identity. We recommend using EKS Pod Identity. See the official documentation:
- EKS Pod Identity (recommended)
- OIDC provider
- Access keys
Create an S3 bucket#
If you don't already have an S3 bucket, create one. You need the AmazonS3FullAccess policy:
$ aws s3api create-bucket \
--bucket <bucket-name> \
--region <region> \
--create-bucket-configuration LocationConstraint=<region>
Create a Bufstream role and policy#
You'll need the IAMFullAccess policy.
$ aws iam create-role \
--role-name BufstreamRole \
--assume-role-policy-document file://<(echo '{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "pods.eks.amazonaws.com"
},
"Action": [
"sts:AssumeRole",
"sts:TagSession"
],
"Condition": {
"StringEquals": {
"aws:SourceAccount": "<aws-account-id>"
},
"ArnEquals": {
"aws:SourceArn": "<eks-cluster-arn>"
}
}
}
]
}')
$ aws eks create-pod-identity-association \
--cluster-name <cluster-name> \
--namespace bufstream \
--service-account bufstream-service-account \
--role-arn <role-arn>
Refer to the OIDC provider guide for details.
$ aws iam create-role \
--role-name BufstreamRole \
--assume-role-policy-document file://<(echo '{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "<oidc-provider-arn>"
},
"Action": [
"sts:AssumeRoleWithWebIdentity"
],
"Condition": {
"StringEquals": {
"<oidc-url>:aud" : "sts.amazonaws.com",
"<oidc-url>:sub" : "system:serviceaccount:bufstream:bufstream-service-account"
}
}
}
]
}')
Refer to the guide on creating and managing user access keys.
$ aws iam create-policy \
--policy-name BufstreamS3 \
--policy-document file://<(echo '{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject",
"s3:DeleteObject",
"s3:ListBucket",
"s3:AbortMultipartUpload"
],
"Resource": [
"arn:aws:s3:::<bucket-name>",
"arn:aws:s3:::<bucket-name>/*"
]
}
]
}')
$ aws iam attach-role-policy \
--policy-arn arn:aws:iam::<aws-account-id>:policy/BufstreamS3 \
--role-name BufstreamRole
Create a namespace#
Create a Kubernetes namespace in the k8s cluster for the bufstream deployment to use:
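For example, using the bufstream namespace that the rest of this guide's commands assume:

```shell
$ kubectl create namespace bufstream
```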
Deploy etcd#
Bufstream requires an etcd cluster. To set up an example deployment of etcd on Kubernetes, use the Bitnami etcd Helm chart with the following values:
$ helm install \
--namespace bufstream \
bufstream-etcd \
oci://registry-1.docker.io/bitnamicharts/etcd \
-f - <<EOF
replicaCount: 3
persistence:
enabled: true
size: 10Gi
storageClass: ""
autoCompactionMode: periodic
autoCompactionRetention: 30s
removeMemberOnContainerTermination: false
resourcesPreset: none
auth:
rbac:
create: false
enabled: false
token:
enabled: false
metrics:
useSeparateEndpoint: true
customLivenessProbe:
httpGet:
port: 9090
path: /livez
scheme: "HTTP"
initialDelaySeconds: 10
periodSeconds: 30
timeoutSeconds: 15
failureThreshold: 10
customReadinessProbe:
httpGet:
port: 9090
path: /readyz
scheme: "HTTP"
initialDelaySeconds: 20
timeoutSeconds: 10
extraEnvVars:
- name: ETCD_LISTEN_CLIENT_HTTP_URLS
value: "http://0.0.0.0:8080"
EOF
Check that etcd is running after installation.
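One way to check, assuming the Bitnami chart's default labels:

```shell
# All three etcd replicas should eventually report READY 1/1.
$ kubectl -n bufstream get pods -l app.kubernetes.io/name=etcd
```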
etcd is sensitive to disk performance, so we recommend using the AWS EBS CSI Driver with gp3 or io1/io2 disks instead of the default gp2 disks EKS uses. The storage class in the example above can be changed by setting the persistence.storageClass value to a custom storage class that uses those disks.
Deploy Bufstream#
1. Authenticate helm#
To get started, authenticate helm with the Bufstream OCI registry using the keyfile that was sent alongside this documentation. The keyfile should contain a base64-encoded string.
$ cat keyfile | helm registry login -u _json_key_base64 --password-stdin \
https://us-docker.pkg.dev/buf-images-1/bufstream
2. Configure Bufstream's Helm values#
Bufstream is configured using Helm values that are passed to the bufstream Helm chart. To configure the values:
- Create a Helm values file named bufstream-values.yaml, which is required by the helm install command in step 3. This file can be in any location, but we recommend creating it in the same directory where you run the helm commands.
- Add the values from the steps below to the bufstream-values.yaml file. Skip to Install the Helm chart for a full example chart.
Configure object storage#
Bufstream requires S3-compatible object storage.
Bufstream attempts to acquire credentials from the environment using EKS Pod Identity.
To configure storage, set the following Helm values, filling in your S3 variables:
storage:
use: s3
s3:
bucket: <bucket-name>
region: <region>
# forcePathStyle: false # Optional, use path-style bucket URLs (http://s3.amazonaws.com/BUCKET/KEY)
# endpoint: "https://example.com" # Optional
The k8s service account to create the pod identity association for is named bufstream-service-account.
Bufstream attempts to acquire credentials from the environment using the OIDC provider.
To configure storage, set the following Helm values, filling in your S3 variables:
bufstream:
serviceAccount:
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::<aws-account-id>:role/BufstreamRole
storage:
use: s3
s3:
bucket: <bucket-name>
region: <region>
# forcePathStyle: false # Optional, use path-style bucket URLs (http://s3.amazonaws.com/BUCKET/KEY)
# endpoint: "https://example.com" # Optional
The k8s service account to create the pod identity association for is named bufstream-service-account.
Alternatively, you can use an access key pair:
- Create a k8s secret containing the S3 secret access key.
- Add the accessKeyId to the configuration.
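As a sketch: the secret name bufstream-storage matches the secretName used in the full example later in this guide, but the key name inside the secret is an assumption — check the chart's documentation for the exact key it expects:

```shell
$ kubectl create secret generic bufstream-storage \
    --namespace bufstream \
    --from-literal=secret_access_key=<s3-secret-access-key>
```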
Configure etcd#
Then, configure Bufstream to connect to the etcd cluster:
metadata:
use: etcd
etcd:
# etcd addresses to connect to
addresses:
- host: "bufstream-etcd.bufstream.svc.cluster.local"
port: 2379
3. Install the Helm chart#
Proceed to the zonal deployment steps if you want to deploy Bufstream with zone-aware routing. If not, follow the instructions below to deploy the basic Helm chart.
After following the steps above, the set of Helm values should be similar to the example below:
bufstream:
serviceAccount:
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::<aws-account-id>:role/BufstreamRole
storage:
use: s3
s3:
bucket: <bucket-name>
region: <region>
metadata:
use: etcd
etcd:
# etcd addresses to connect to
addresses:
- host: "bufstream-etcd.bufstream.svc.cluster.local"
port: 2379
observability:
metrics:
exporter: "PROMETHEUS"
The k8s service account to create the pod identity association for is named bufstream-service-account.
If you use an access key pair instead, the values look like this:
storage:
use: s3
s3:
bucket: <bucket-name>
region: <region>
accessKeyId: "AKIAIOSFODNN7EXAMPLE"
secretName: bufstream-storage
metadata:
use: etcd
etcd:
# etcd addresses to connect to
addresses:
- host: "bufstream-etcd.bufstream.svc.cluster.local"
port: 2379
observability:
metrics:
exporter: "PROMETHEUS"
Using the bufstream-values.yaml Helm values file, install the Helm chart for the cluster and set the correct Bufstream version:
$ helm install bufstream oci://us-docker.pkg.dev/buf-images-1/bufstream/charts/bufstream \
--version "<version>" \
--namespace=bufstream \
--values bufstream-values.yaml
If you change any configuration in the bufstream-values.yaml file, re-run the Helm install command to apply the changes.
Network load balancer#
To access the Bufstream cluster from outside the Kubernetes cluster, create an AWS Network Load Balancer (NLB). The easiest way to create an NLB is to use the AWS Load Balancer Controller. Once the controller is successfully installed in the EKS cluster, add the following configuration to the bufstream-values.yaml file:
bufstream:
service:
type: LoadBalancer
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: "external"
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
service.beta.kubernetes.io/aws-load-balancer-scheme: "internal"
Run the helm upgrade command for Bufstream:
$ helm upgrade bufstream oci://us-docker.pkg.dev/buf-images-1/bufstream/charts/bufstream \
--version "<version>" \
--namespace=bufstream \
--values bufstream-values.yaml
Check the progress of the NLB creation using the following command:
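A minimal way to watch the service until the NLB's hostname appears (plain kubectl, nothing Bufstream-specific):

```shell
# The EXTERNAL-IP column is populated once the NLB is provisioned.
$ kubectl -n bufstream get service bufstream --watch
```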
Once the NLB is created, use the following commands to get its DNS name and more details:
# save the dns name into a variable
$ NLBDNSNAME=$(kubectl -n bufstream get service bufstream -o "jsonpath={.status.loadBalancer.ingress[*].hostname}")
# print the dns name
$ echo "$NLBDNSNAME"
# show more details like arn of the created NLB:
$ aws elbv2 describe-load-balancers --query "LoadBalancers[?DNSName=='${NLBDNSNAME}']"
The Bufstream cluster needs to advertise its public address to connecting clients. Make sure the advertised address is the same as the one clients connect to by adding the following configuration to the bufstream-values.yaml file:
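A sketch of that configuration, assuming the chart's kafka.publicAddress value (the same setting used in the zone-aware example later in this guide — depending on your chart version it may need to nest under bufstream:), with the NLB DNS name retrieved above:

```yaml
kafka:
  publicAddress:
    # Use the NLB's DNS name so clients outside the cluster
    # receive a reachable address from the brokers.
    host: <nlb-dns-name>
    port: 9092
```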
Run the helm upgrade command for Bufstream to update the public address:
$ helm upgrade bufstream oci://us-docker.pkg.dev/buf-images-1/bufstream/charts/bufstream \
--version "<version>" \
--namespace=bufstream \
--values bufstream-values.yaml
Deploy Bufstream with zone-aware routing#
1. Specify a list of target zones#
First, specify a list of target zones in a ZONES variable, which is used by later commands.
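For example (hypothetical zone names — substitute the AZs your node groups actually span):

```shell
# Availability zones to deploy Bufstream into.
ZONES="us-east-1a us-east-1b us-east-1c"
```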
2. Create EKS Pod Association for all zones#
If you're using EKS Pod Identity, you'll need to create a pod identity association for each zone's service account.
$ for ZONE in $ZONES; do
aws eks create-pod-identity-association \
--cluster-name <cluster-name> \
--namespace bufstream \
--service-account bufstream-service-account-${ZONE} \
--role-arn <role-arn>
done
3. Create Helm values files for each zone#
Then, use this script to iterate through the availability zones saved in the ZONES variable and create a Helm values file for each zone:
$ for ZONE in $ZONES; do
cat <<EOF > "bufstream-${ZONE}-values.yaml"
nameOverride: bufstream-${ZONE}
name: bufstream-${ZONE}
zone: ${ZONE}
bufstream:
serviceAccount:
name: bufstream-service-account-${ZONE}
deployment:
replicaCount: 2
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: topology.kubernetes.io/zone
operator: In
values:
- ${ZONE}
kafka:
publicAddress:
host: bufstream-${ZONE}.bufstream.svc.cluster.local
port: 9092
EOF
done
Using the example ZONES variable above creates three values files: bufstream-<zone1>-values.yaml, bufstream-<zone2>-values.yaml, and bufstream-<zone3>-values.yaml. However, Bufstream is available in all AWS regions, so you can specify AZs in any region, such as us-east-1, us-west-2, or eu-central-2, in the variable.
4. Install the Helm chart for each zone#
After following the steps above and creating the zone-specific values files, the collection of Helm values should be structurally similar to the example below:
storage:
use: s3
s3:
bucket: "<bucket-name>"
region: "<region>"
metadata:
use: etcd
etcd:
# etcd addresses to connect to
addresses:
- host: "bufstream-etcd.bufstream.svc.cluster.local"
port: 2379
observability:
metrics:
exporter: "PROMETHEUS"
nameOverride: bufstream-<zone1>
name: bufstream-<zone1>
bufstream:
serviceAccount:
name: bufstream-service-account-<zone1>
deployment:
replicaCount: 2
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: topology.kubernetes.io/zone
operator: In
values:
- <zone1>
kafka:
publicAddress:
host: bufstream-<zone1>.bufstream.svc.cluster.local
port: 9092
To deploy a zone-aware Bufstream using the bufstream-values.yaml Helm values file, install the Helm chart for the cluster, set the target Bufstream version, and supply the ZONES variable:
$ for ZONE in $ZONES; do
helm install "bufstream-${ZONE}" oci://us-docker.pkg.dev/buf-images-1/bufstream/charts/bufstream \
--version "<version>" \
--namespace=bufstream \
--values bufstream-values.yaml \
--values "bufstream-${ZONE}-values.yaml"
done
If you change any configuration in the bufstream-values.yaml file, re-run the Helm install command to apply the changes.
5. Create a regional service for the cluster#
Create a regional service, which provides a bootstrap address for Bufstream across all the zones.
$ cat <<EOF | kubectl apply -f -
---
apiVersion: v1
kind: Service
metadata:
labels:
bufstream.buf.build/cluster: bufstream
name: bufstream
namespace: bufstream
spec:
type: ClusterIP
ports:
- name: connect
port: 8080
protocol: TCP
targetPort: 8080
- name: admin
port: 9089
protocol: TCP
targetPort: 9089
- name: kafka
port: 9092
protocol: TCP
targetPort: 9092
selector:
bufstream.buf.build/cluster: bufstream
EOF
Running CLI commands#
Once you've deployed, you can run the Bufstream CLI commands directly using kubectl exec bufstream <command> on the running Bufstream pods. You don't need to install anything else.