Release notes
v0.3.1
Release Date: 2024-12-03 | Status: latest
Bug Fixes
- Fix etcd garbage collection bug which could lead to large database sizes
Features and Improvements
- Update metrics units to match OTEL recommendations
- Allow configuring OTLP temporality preference
- Expose Bufstream health status via the bufstream.status metric
v0.3.0
Release Date: 2024-12-03
Bug Fixes
- Fix routing bug that caused inconsistent load distribution among agents
- Fix regression in the DescribeCluster API that returned non-unique agent hostnames, affecting certain clients
- Fix DescribeProducers API to be strongly consistent, making some clients more reliable
- Make consumer group offset processing more robust when groups are abruptly deleted
- Redact sensitive record data in debug logs
Features and Improvements
- Improve compatibility with Kafka 3.9.0 clients
- Improve consumer group performance
- Improve Fetch API performance
- Add idempotent producer memory to match Apache Kafka behavior
- Bind Kafka listeners to all resolved addresses of a hostname, instead of just the first IPv4 address
- Improve metadata storage write scalability by ~100x by grouping partition sequencing together
- Add admin clean-topics, admin get, and admin set CLI commands to flush the Bufstream intake and sequencing system to safely migrate towards grouping partition sequencing (see migration guide below)
- Add admin usage CLI command for computing write statistics cluster-wide and by-topic
- Add timeout and quiet flags for admin commands
- Retry startup checks to prevent spurious crashes when cluster initializes
- Improve reliability of inter-cluster RPCs
- Improve observability metrics, disabling internal debugging metrics by default
- Add probes for etcd and OTLP endpoints
- Use Kubernetes StatefulSet for deployments by default for more stable scaling behavior
- Other miscellaneous performance improvements
Upgrading to v0.3.x
New clusters automatically opt in to the new partition sequencing groups. Existing clusters must be migrated manually after upgrading to v0.3.x:
- Read these instructions completely before beginning the migration process
- Upgrade Bufstream cluster to 0.3.x
- Identify your admin URL (default: http://localhost:9089)
- Run bufstream admin clean-topics --url=<admin URL> and check results
- Optionally, disable Kafka traffic to the cluster to reduce noise
- Save an etcd snapshot as backup
- Run bufstream admin set sequence-shard-count 64 --url=<admin URL> and check results
- Re-enable Kafka traffic to the cluster if disabled
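The migration steps above can be sketched as a small shell script. This is a minimal sketch, not an official tool: the run helper, the DRY_RUN and ADMIN_URL variables, and the snapshot filename are illustrative assumptions, and it presumes etcdctl is already configured to reach the cluster's etcd endpoints. It defaults to a dry run that only prints the commands. Pausing and resuming Kafka traffic remains a manual step.

```shell
#!/bin/sh
# Sketch of the v0.3.x sequencing migration. ADMIN_URL, DRY_RUN, and the
# snapshot filename are illustrative names, not Bufstream settings.
ADMIN_URL="${ADMIN_URL:-http://localhost:9089}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf 'would run: %s\n' "$*"
  else
    "$@"
  fi
}

migrate() {
  # Flush the intake and sequencing system; inspect the output before continuing.
  run bufstream admin clean-topics "--url=${ADMIN_URL}" || return 1

  # (Optional, manual) pause Kafka traffic to the cluster here.

  # Back up etcd before changing the shard count. Assumes etcdctl endpoints
  # and credentials are supplied through its usual environment variables.
  run etcdctl snapshot save bufstream-pre-migration.db || return 1

  # Opt the existing cluster in to grouped partition sequencing.
  run bufstream admin set sequence-shard-count 64 "--url=${ADMIN_URL}" || return 1

  # (Manual) re-enable Kafka traffic if it was paused.
}

migrate
```

Review the dry-run output first, then rerun with DRY_RUN=0 to execute. The function stops at the first failing command so that each result can be checked before moving on.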
v0.2.0
Release Date: 2024-11-08
Bug Fixes
- Fix a data race when flushing the intake cache
- Log the correct Kafka address on startup
- Reduce compaction errors by improving etcd locking
- Fix crash when cluster is shut down unexpectedly
- Wait for DNS resolution before registering new nodes
- Fix a bug in epoch calculation that erroneously invalidated in-progress offset updates
- Improve archiving behavior during cluster auto-scaling
Features and Improvements
- Support TLS for all cluster communications, including agent-to-agent and among etcd nodes
- Implement KIP-394
- Improve etcd performance when reading from archives
- Improve memory utilization and cluster performance by increasing default cache sizes
- Add a virtual broker configuration to client IDs, so that Bufstream can present itself as a single broker when necessary
- Improve error output for bufstream serve failures
- Support deploying Bufstream as a Kubernetes Stateful Set
- Expose configurable liveness and readiness probe timeouts
- Output human-readable cluster UUIDs for debugging
- Update configuration defaults for improved read performance
- Shut down gracefully if Kafka or HTTP listeners fail, to avoid cluster panics
- Reduce startup and auto-scaling log verbosity
- Various improvements to CLI reference documentation
- Other miscellaneous performance optimizations
v0.1.3
Release Date: 2024-10-30
Bug Fixes
- Fix a bug in v0.1.2's transaction numbering that silently discarded some commits and aborts
- Return errors when clients attempt to change the outcome of a committed or aborted transaction
- Fix stuck producers and consumers by polling etcd, rather than relying exclusively on leases
- Fix a race that led to serving stale high watermarks and last stable offsets on startup
- Fix compatibility with Java clients by always setting offset to -1 when returning produce errors
- Improve AKHQ reliability by limiting the size of archive chunks
- Various fixes to data management for low-throughput partitions
- Fix archiving of internal usage-tracking topic
- Miscellaneous fixes to log and metrics output
Features and Improvements
- Order transaction-related RPCs with logical clocks, preventing re-ordering within Bufstream
- Support more concurrent producers by decreasing etcd heartbeat frequency
- Support L4 load balancers by defaulting to advertising only public hosts
- Add support for zone-local load balancers
- Improve graceful shutdown logic
- Improve produce reliability by retrying more transient errors
- Improve cluster throughput by increasing default cache size
- Increase cluster throughput and reduce object storage costs by optimizing hedging
- Guard against overlapping storage between clusters with a fingerprint check on startup
- Reduce metrics cardinality by decreasing number of histogram buckets
- Assorted improvements to logging and internal tracing
v0.1.2
Release Date: 2024-08-19 | Status: archived
This release has been archived due to a regression in the transaction processing system. All production workloads should continue to use version 0.1.1.
Bug Fixes
- Fixes error-handling bug in topic auto-creation
- Reduces error probability when Bufstream attempts to calculate the last stable or next unstable offset
- Resolves error when Bufstream attempts to read the kafka.public_address value in the Helm chart
- Prevents Bufstream from sending empty values for CPU memory limits by setting a reasonable default in the Helm chart
- Assigns all transactions a monotonic number so that concurrent complete operations no longer result in transactions completing multiple times for a topic partition -- Bufstream now applies only the first completion for a given transaction number
- Addresses checkpoint error when Bufstream attempts to archive internal topics
Features and Improvements
- Expands Bufstream's Kafka conformance testing suite
- Exposes Kafka configs in the Helm chart so that they can be set directly
- Adds documentation for the Helm chart and recommended defaults
- Improves debug log output for transaction state changes
- Uncaches transactional producers to expose state transition errors
- Allows topic replication factor to be set to -1 -- in cases where the topic replication factor is not set to -1 or 1, Bufstream will return an error
- Improves compatibility with Redpanda Console when displaying topics and offsets
v0.1.1
Release Date: 2024-08-14
Bug Fixes
- Fixes off-by-one error in archive requests
Features and Improvements
- Adds config option kafka.exact_log_offsets that when set to true will always return the exact offset for fetch requests
- Updates and documents recommended default values in the helm chart
- Improves error handling for produce requests and transactions
v0.1.0
Release Date: 2024-08-09
Bug Fixes
- Fixes panic when coercing a message payload to Confluent Schema Registry format
- Respects acks=0 setting on produce and does not wait for or guarantee the success of the produce request
Features and Improvements
- Adds helm value exact_log_sizes that determines whether exact log sizes should be fetched for all topics and partitions
- Documents dynamic configuration options
- Adds configuration options for consumer group session timeout: group.consumer.session.timeout.ms, group.consumer.min.session.timeout.ms, group.consumer.max.session.timeout.ms
v0.0.4
Release Date: 2024-08-06
Bug Fixes
- Fixed memory leak when uploading objects to S3 storage
- Removed redundant zone lookups when resolving metadata requests
- Fixed Fetch response to work with librdkafka-based clients (including the confluent-kafka Python client)
- Amended various API responses to match expectations of the segmentio/kafka-go and IBM/sarama Go clients
Features and Improvements
- Allow command-line flags to override YAML configuration
- Support deleting topics by name with DeleteTopics
- Expose additional Bufstream-specific broker and topic configuration options
- Reduce debug log volume
- Improve cache throughput
v0.0.3
Release Date: 2024-07-25
Bug Fixes
- Change dataEnforcement key in helm chart to an empty object so that it does not emit a warning when coalescing values
Features and Improvements
- Enable retries with backoff by default
- Return an error if Bufstream cannot resolve the producer ID
- Set etcd connection wait time to 2 minutes -- Bufstream will return an error and shut down if it cannot establish a connection to etcd within the 2 minute interval
- Provide traces for all etcd storage errors
- Improve topic metadata management
- Change shutdown behavior so that Bufstream waits for the archiver to finish before shutting down
v0.0.2
Release Date: 2024-07-10
Features and Improvements
- Emit build version in helm chart logs
v0.0.1
Release Date: 2024-07-09
- Initial release