Topic-level configuration parameters
This page describes Bufstream-specific topic-level configuration parameters.
Bufstream topic configurations work similarly to Kafka's: each parameter has a cluster-wide default that can be overridden at the topic level. When you create or modify a topic without specifying a particular configuration value, Bufstream applies the cluster default.
Bufstream supports reading and updating topic configuration values from any Kafka API-compatible tool, including browser-based interfaces like AKHQ and Redpanda Console.
Archival
Archival parameters control how Bufstream archives recent topic data from intake files to long-term storage formats.
bufstream.archive.fetch.sync
bool
Whether fetching intake data for archival should be synchronized, improving cache hit rates. Defaults to true.
bufstream.archive.kind
string
The kind of archive to use when creating new archives. Defaults to FLAT.
Must be one of the following values:
- FLAT: Default
- PARQUET: Parquet data files
- ICEBERG: Parquet data and Iceberg table files
Note:
- Compacted topics may only use FLAT.
- Using ICEBERG requires setting bufstream.archive.iceberg.catalog and bufstream.archive.iceberg.table.
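The two constraints in the note can be checked before a configuration change is submitted. The following is a minimal sketch, not Bufstream's own validation: the function name validate_archive_config is hypothetical, and it assumes a topic counts as compacted when Kafka's standard cleanup.policy setting includes compact.

```python
def validate_archive_config(config: dict[str, str]) -> list[str]:
    """Return a list of problems found in a topic's archive settings."""
    problems = []
    # FLAT is the documented default when the kind is not set.
    kind = config.get("bufstream.archive.kind", "FLAT")
    if kind not in ("FLAT", "PARQUET", "ICEBERG"):
        problems.append(f"unknown archive kind: {kind}")

    # Assumption: "compacted" means cleanup.policy includes "compact".
    policies = [p.strip() for p in config.get("cleanup.policy", "delete").split(",")]
    if "compact" in policies and kind != "FLAT":
        problems.append("compacted topics may only use FLAT")

    # ICEBERG needs both a catalog name and a table name.
    if kind == "ICEBERG":
        for key in (
            "bufstream.archive.iceberg.catalog",
            "bufstream.archive.iceberg.table",
        ):
            if not config.get(key):
                problems.append(f"ICEBERG requires {key}")
    return problems
```

An empty list means the settings satisfy the rules described in the note. Bufstream may enforce additional checks that this sketch does not cover.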
bufstream.archive.min.bytes
int64
The threshold for starting a new archive. Set to -1 to disable archival or to 0 for active archiving. Defaults to 0.
Archive limits
An archive file is completed (closed) on any of these conditions:
- It reaches the size determined by bufstream.archive.max.bytes.
- It has been written to for the duration specified in bufstream.archive.complete.delay.max.ms.
- It has been idle (waiting for newly produced records) for the duration specified in bufstream.archive.idle.max.ms.
When any of these conditions is met, Bufstream closes the current archive file. It then checks every 30 seconds for newly produced data and starts a new archive file if data is available.
For more information about archival, see Archive in Bufstream's Kafka data flow documentation.
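The three completion conditions can be modeled as a simple predicate. This sketch is illustrative only and does not reflect Bufstream's internals: ArchiveLimits and completion_reasons are hypothetical names, the comparisons are assumed to be inclusive, and the random variation applied to the completion delay is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveLimits:
    # Defaults mirror the documented values for each parameter.
    max_bytes: int = 4 * 1024**3            # bufstream.archive.max.bytes (4 GiB)
    complete_delay_max_ms: int = 3_600_000  # bufstream.archive.complete.delay.max.ms
    idle_max_ms: int = 300_000              # bufstream.archive.idle.max.ms


def completion_reasons(size_bytes, open_ms, idle_ms, limits=ArchiveLimits()):
    """Return which limits (if any) would close the current archive file."""
    reasons = []
    if size_bytes >= limits.max_bytes:
        reasons.append("max.bytes")
    if open_ms >= limits.complete_delay_max_ms:
        reasons.append("complete.delay.max.ms")
    if idle_ms >= limits.idle_max_ms:
        reasons.append("idle.max.ms")
    return reasons
```

Any non-empty result corresponds to the archive being completed, since meeting a single condition is sufficient.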
bufstream.archive.max.bytes
int64
The maximum size of each archive. Defaults to 4294967296 (4 GiB).
bufstream.archive.complete.delay.max.ms
int64
The maximum delay (in milliseconds) between starting an archive and completing it. Defaults to 3600000 (one hour).
To prevent multiple, coincidentally synchronized jobs from continually firing at the same time, the actual value used randomly varies by ±12.5%.
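To make the effect of the variation concrete, the sketch below computes the range of effective delays for a configured value. The helper names are hypothetical, and drawing uniformly from the range is an assumption, since the distribution Bufstream uses is not stated here.

```python
import random


def jitter_bounds(base_ms: int, fraction: float = 0.125) -> tuple[int, int]:
    """Return the (low, high) range of delays for a +/- fraction variation."""
    delta = int(base_ms * fraction)
    return base_ms - delta, base_ms + delta


def jittered_delay_ms(base_ms: int, fraction: float = 0.125, rng=random) -> int:
    """Pick one delay from the range (assumed uniform for illustration)."""
    low, high = jitter_bounds(base_ms, fraction)
    return rng.randint(low, high)
```

With the default of 3600000 ms, jitter_bounds returns (3150000, 4050000), so an archive is completed somewhere between 52.5 and 67.5 minutes after it starts.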
bufstream.archive.idle.max.ms
int64
The maximum delay (in milliseconds) to wait for new data to archive before completing an archive. Defaults to 300000 (five minutes).
Iceberg and Parquet
See Apache Iceberg™ configuration for complete instructions on configuring Bufstream to store data as Iceberg tables.
bufstream.archive.iceberg.catalog
string
The name of the Iceberg catalog to use when bufstream.archive.kind is set to ICEBERG.
This refers to a named catalog configuration in Bufstream's configuration.
This must be set before or when changing the archive kind—changing the kind fails if the topic doesn't already have valid catalog and table names.
bufstream.archive.iceberg.table
string
The name of the Iceberg table that reflects the data published to this topic.
This is only used when bufstream.archive.kind is set to ICEBERG.
This name must contain at least two dot-separated components, where all components except the last indicate the table's namespace (which can't be empty).
This must be set before or when changing the archive kind—changing the kind fails if the topic doesn't already have valid catalog and table names.
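The table name rule can be illustrated with a small parser that separates the namespace from the table. This is a sketch rather than Bufstream's validation logic: split_table_name is a hypothetical helper, and rejecting empty components (as in "a..b") is an assumption that goes slightly beyond the stated rule.

```python
def split_table_name(name: str) -> tuple[tuple[str, ...], str]:
    """Split 'ns1.ns2.table' into (('ns1', 'ns2'), 'table')."""
    parts = name.split(".")
    # At least one namespace component plus the table itself is required.
    if len(parts) < 2:
        raise ValueError(f"{name!r} needs a namespace and a table, separated by dots")
    # Assumption: empty components are treated as invalid.
    if any(not part for part in parts):
        raise ValueError(f"{name!r} contains an empty component")
    return tuple(parts[:-1]), parts[-1]
```

For example, the made-up name analytics.events.orders yields the namespace ("analytics", "events") and the table orders, while a bare orders is rejected because its namespace would be empty.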
bufstream.archive.parquet.granularity
string
The granularity of partitions for Parquet archive files. Used when bufstream.archive.kind is ICEBERG or PARQUET.
Parquet files will be organized into a directory hierarchy to enable efficient pruning for time-based queries (compatible with Hive-style partitioning).
The partitions are based on the ingestion timestamp.
If unspecified, the default granularity is defined by Bufstream's archive configuration.
Must be one of the following values:
- MONTHLY: Files will be partitioned into yearly and monthly directories. For example, year=2025/month=1.
- DAILY: Files will be partitioned into yearly, monthly, and daily directories. For example, year=2025/month=1/day=15.
- HOURLY: Files will be partitioned into yearly, monthly, daily, and hourly directories. For example, year=2025/month=1/day=15/hour=23.
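The directory layouts above can be reproduced from an ingestion timestamp, which is useful when building path filters for time-based queries. This is a minimal sketch: partition_path is a hypothetical helper, and interpreting timestamps in UTC is an assumption, since the time zone used for partitioning is not stated here.

```python
from datetime import datetime, timezone

# Directory components for each granularity, in path order.
_FIELDS = {
    "MONTHLY": ("year", "month"),
    "DAILY": ("year", "month", "day"),
    "HOURLY": ("year", "month", "day", "hour"),
}


def partition_path(ts: datetime, granularity: str) -> str:
    """Build a Hive-style partition path for a timezone-aware timestamp."""
    if granularity not in _FIELDS:
        raise ValueError(f"unknown granularity: {granularity}")
    ts = ts.astimezone(timezone.utc)  # Assumption: partitions follow UTC.
    # Values are not zero-padded, matching the documented examples.
    return "/".join(f"{name}={getattr(ts, name)}" for name in _FIELDS[granularity])
```

Calling partition_path with a timestamp of 2025-01-15 23:30 UTC and HOURLY produces year=2025/month=1/day=15/hour=23, matching the documented example.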