Topic-level configuration parameters#

This page describes Bufstream-specific topic-level configuration parameters.

Bufstream topic configurations work similarly to Kafka, with each parameter having a cluster-wide default that can be overridden at the topic level. When you create or modify a topic without specifying a particular configuration value, Bufstream applies the cluster default.
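The override behavior can be pictured as a layered lookup. The following is a minimal sketch, not Bufstream's implementation: it assumes the cluster uses the defaults documented on this page for three archival parameters, and the helper name effective_config is hypothetical.

```python
from collections import ChainMap

# Assumed cluster-wide defaults, taken from the defaults documented on this page.
CLUSTER_DEFAULTS = {
    "bufstream.archive.kind": "FLAT",
    "bufstream.archive.min.bytes": "0",
    "bufstream.archive.idle.max.ms": "300000",
}


def effective_config(topic_overrides, cluster_defaults=CLUSTER_DEFAULTS):
    """Return the values a topic would see: topic overrides first, then defaults."""
    return dict(ChainMap(topic_overrides, cluster_defaults))


# Hypothetical topic that overrides only the archive kind.
overrides = {"bufstream.archive.kind": "PARQUET"}
config = effective_config(overrides)
print(config["bufstream.archive.kind"])         # value set at the topic level
print(config["bufstream.archive.idle.max.ms"])  # value inherited from the default
```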

Bufstream supports reading and updating topic configuration values from any Kafka API-compatible tool, including browser-based interfaces like AKHQ and Redpanda Console.

Archival#

Archival parameters control how Bufstream archives recent topic data from intake files to long-term storage formats.

bufstream.archive.fetch.sync#

bool

Whether fetching intake data for archival should be synchronized, improving cache hit rates. Defaults to true.

bufstream.archive.kind#

string

The kind of archive to use when creating new archives. Defaults to FLAT.

Must be one of the following values:

  • FLAT: Default
  • PARQUET: Parquet data files
  • ICEBERG: Parquet data and Iceberg table files

Note:

  • Compacted topics may only use FLAT.
  • Using ICEBERG requires setting bufstream.archive.iceberg.catalog and bufstream.archive.iceberg.table.
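The constraints above can be checked on the client side before a configuration change is submitted. This is an illustrative sketch only: check_archive_kind is a hypothetical helper, not part of Bufstream or any Kafka client, and Bufstream performs its own validation regardless.

```python
ARCHIVE_KINDS = {"FLAT", "PARQUET", "ICEBERG"}


def check_archive_kind(configs, compacted=False):
    """Return a list of problems found in a proposed topic configuration."""
    problems = []
    kind = configs.get("bufstream.archive.kind", "FLAT")  # documented default
    if kind not in ARCHIVE_KINDS:
        problems.append(f"unknown archive kind: {kind}")
    if compacted and kind != "FLAT":
        problems.append("compacted topics may only use FLAT")
    if kind == "ICEBERG":
        # Both Iceberg parameters must be present and non-empty.
        for key in ("bufstream.archive.iceberg.catalog",
                    "bufstream.archive.iceberg.table"):
            if not configs.get(key):
                problems.append(f"ICEBERG requires {key}")
    return problems


# Hypothetical proposal: ICEBERG with a catalog but no table name.
proposal = {
    "bufstream.archive.kind": "ICEBERG",
    "bufstream.archive.iceberg.catalog": "example_catalog",
}
print(check_archive_kind(proposal))
```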

bufstream.archive.min.bytes#

int64

The threshold, in bytes, for starting a new archive. Set to -1 to disable archival, or to 0 for active archiving. Defaults to 0.

Archive limits#

An archive file is completed (closed) when any of the following occurs:

  1. It reaches the size determined by bufstream.archive.max.bytes.
  2. It has been written to for the duration specified in bufstream.archive.complete.delay.max.ms.
  3. It has been idle (waiting for newly produced records) for the duration specified in bufstream.archive.idle.max.ms.

When any of these conditions is met, Bufstream closes the current archive file. It then checks every 30 seconds for newly produced data and starts a new archive file if data is available.
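The three completion conditions can be modeled as a simple decision function. This is a sketch for reasoning about the settings, not Bufstream's code: the function name, its arguments, and the order in which the conditions are evaluated are assumptions.

```python
def completion_reason(size_bytes, open_ms, idle_ms,
                      max_bytes, complete_delay_max_ms, idle_max_ms):
    """Return the parameter that would close the archive, or None if it stays open."""
    if size_bytes >= max_bytes:
        return "bufstream.archive.max.bytes"
    if open_ms >= complete_delay_max_ms:
        return "bufstream.archive.complete.delay.max.ms"
    if idle_ms >= idle_max_ms:
        return "bufstream.archive.idle.max.ms"
    return None


limits = {
    "max_bytes": 4 * 1024**3,            # 4 GiB
    "complete_delay_max_ms": 3_600_000,  # one hour
    "idle_max_ms": 300_000,              # five minutes
}

# A small archive that has been open for ten minutes and idle for six.
print(completion_reason(size_bytes=1_000_000, open_ms=600_000,
                        idle_ms=360_000, **limits))
```

In this example the idle limit is the one that applies, because neither the size nor the maximum completion delay has been reached.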

For more information about archival, see Archive in Bufstream's Kafka data flow documentation.

bufstream.archive.max.bytes#

int64

The maximum size of each archive, in bytes. Defaults to 4294967296 (4 GiB).

bufstream.archive.complete.delay.max.ms#

int64

The maximum delay (in milliseconds) between starting an archive and completing it. Defaults to 3600000 (one hour).

To prevent multiple, coincidentally synchronized jobs from continually firing at the same time, the actual value used randomly varies by ±12.5%.
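To see what the ±12.5% variation means in practice, the bounds can be computed directly. The helper below is a hypothetical sketch; it only computes the range and makes no assumption about how Bufstream picks a value within it.

```python
def jitter_bounds(base_ms, fraction=0.125):
    """Return the (lowest, highest) delay in milliseconds for a ±fraction variation."""
    delta = base_ms * fraction
    return int(base_ms - delta), int(base_ms + delta)


low, high = jitter_bounds(3_600_000)  # the documented default of one hour
print(low, high)                      # 3150000 4050000
print(low / 60_000, high / 60_000)    # 52.5 67.5 (minutes)
```

With the default setting, an archive is therefore completed somewhere between 52.5 and 67.5 minutes after it was started, unless another limit applies first.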

bufstream.archive.idle.max.ms#

int64

The maximum delay (in milliseconds) to wait for new data to archive before completing an archive. Defaults to 300000 (five minutes).

Iceberg and Parquet#

See Apache Iceberg™ configuration for complete instructions on configuring Bufstream to store data as Iceberg tables.

bufstream.archive.iceberg.catalog#

string

The name of the Iceberg catalog to use when bufstream.archive.kind is set to ICEBERG. The name refers to a catalog defined in Bufstream's configuration. This must be set before or when changing the archive kind; changing the kind fails if the topic doesn't already have valid catalog and table names.

bufstream.archive.iceberg.table#

string

The name of the Iceberg table that reflects the data published to this topic. This is only used when bufstream.archive.kind is set to ICEBERG. This name must contain at least two dot-separated components, where all components except the last indicate the table's namespace (which can't be empty). This must be set before or when changing the archive kind—changing the kind fails if the topic doesn't already have valid catalog and table names.
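The naming rule can be illustrated with a small parser. This is a sketch under assumptions: split_table_name is a hypothetical helper, it treats every empty component as invalid, and it doesn't handle any quoting or escaping of dots that a catalog might support.

```python
def split_table_name(name):
    """Split a dotted table name into (namespace components, table name)."""
    parts = name.split(".")
    if len(parts) < 2:
        raise ValueError("table name needs a namespace, for example 'namespace.table'")
    if any(part == "" for part in parts):
        raise ValueError("table name components must not be empty")
    return parts[:-1], parts[-1]


# Hypothetical table name with a two-level namespace.
namespace, table = split_table_name("analytics.events.orders")
print(namespace, table)  # ['analytics', 'events'] orders
```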

bufstream.archive.parquet.granularity#

string

The granularity of partitions for Parquet archive files. Used when bufstream.archive.kind is ICEBERG or PARQUET. Parquet files will be organized into a directory hierarchy to enable efficient pruning for time-based queries (compatible with Hive-style partitioning). The partitions are based on the ingestion timestamp. If unspecified, the default granularity is defined by Bufstream's archive configuration.

Must be one of the following values:

  • MONTHLY: Files will be partitioned into yearly and monthly directories. For example, year=2025/month=1.
  • DAILY: Files will be partitioned into yearly, monthly, and daily directories. For example, year=2025/month=1/day=15.
  • HOURLY: Files will be partitioned into yearly, monthly, daily, and hourly directories. For example, year=2025/month=1/day=15/hour=23.
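The directory layout for each granularity can be reproduced as follows. This is an illustrative sketch: partition_path is a hypothetical helper, and the use of UTC for the ingestion timestamp is an assumption, since this page doesn't state the time zone.

```python
from datetime import datetime, timezone

# Timestamp fields that make up the directory hierarchy for each granularity.
GRANULARITY_FIELDS = {
    "MONTHLY": ("year", "month"),
    "DAILY": ("year", "month", "day"),
    "HOURLY": ("year", "month", "day", "hour"),
}


def partition_path(ingested_at, granularity):
    """Build a Hive-style partition path such as year=2025/month=1/day=15."""
    fields = GRANULARITY_FIELDS[granularity]
    return "/".join(f"{field}={getattr(ingested_at, field)}" for field in fields)


ts = datetime(2025, 1, 15, 23, 30, tzinfo=timezone.utc)  # assumed UTC
for granularity in GRANULARITY_FIELDS:
    print(granularity, partition_path(ts, granularity))
```

A query engine that understands Hive-style partitioning can then skip whole directories when a query filters on year, month, day, or hour.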