Session Details: Current London 2025

Breakout Session (45 minutes)

Name

Towards an Open Apache Kafka Performance Model

Date

Wednesday, May 21, 2025

Time

3:00 PM - 3:45 PM

Location Name

Breakout Room 7

Description

Imagine having a powerful, customizable model that brings the end-to-end flow of records through your Kafka applications and clusters to life. Picture a tool that allows you to swiftly and affordably understand and predict the performance, scalability, and resource demands of your entire system. With this model, you can explore “what if” scenarios, such as changes to workloads, application and cluster hardware, Kafka configurations, and even dependencies on external systems.

This vision is closer than you think. In this talk, we’ll introduce a simple Kafka performance model and demonstrate its application to Kafka tiered storage sizing. Whether you’re using SSD or EBS local storage or S3 remote storage, this model can predict I/O and network requirements, the size and number of brokers, and storage space needs.
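To make the sizing idea concrete, here is a minimal back-of-envelope sketch in Python. It is not the model presented in the talk; the formulas, field names, and example numbers are all illustrative assumptions. It assumes local storage holds the local retention window on every replica, remote storage holds the full retention window once, and network load consists of produce traffic, follower replication, consumer reads, and uploads to remote storage.

```python
import math
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical workload description; all fields are illustrative."""
    ingress_mb_s: float        # producer write rate into the cluster
    replication_factor: int    # copies of each partition kept on brokers
    consumer_fanout: int       # number of consumer groups reading every record
    local_retention_h: float   # hours of data kept on broker-local storage
    total_retention_h: float   # hours of data kept overall (remote tier)


def size_cluster(w: Workload, broker_disk_gb: float, broker_net_mb_s: float) -> dict:
    """Estimate storage, network, and broker count under the stated assumptions."""
    secs_per_hour = 3600
    # Local tier: every replica stores the local retention window.
    local_gb = w.ingress_mb_s * secs_per_hour * w.local_retention_h * w.replication_factor / 1000
    # Remote tier: assumed to store a single copy of the full retention window.
    remote_gb = w.ingress_mb_s * secs_per_hour * w.total_retention_h / 1000
    # Cluster-wide inbound: produce traffic plus follower replication traffic.
    net_in = w.ingress_mb_s * w.replication_factor
    # Cluster-wide outbound: replication to followers, consumer reads, one remote upload.
    net_out = w.ingress_mb_s * ((w.replication_factor - 1) + w.consumer_fanout + 1)
    brokers = max(
        math.ceil(local_gb / broker_disk_gb),
        math.ceil(max(net_in, net_out) / broker_net_mb_s),
        w.replication_factor,  # need at least one broker per replica
    )
    return {
        "local_gb": local_gb,
        "remote_gb": remote_gb,
        "net_in_mb_s": net_in,
        "net_out_mb_s": net_out,
        "brokers": brokers,
    }


if __name__ == "__main__":
    example = Workload(ingress_mb_s=100, replication_factor=3, consumer_fanout=2,
                       local_retention_h=4, total_retention_h=168)
    print(size_cluster(example, broker_disk_gb=1000, broker_net_mb_s=250))
```

A “what if” question then becomes a matter of changing one input, such as the local retention window or the consumer fan-out, and comparing the resulting estimates. A real model would also need to account for disk I/O limits, compression, and consumers reading older data from the remote tier, none of which this sketch covers.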

But this is just the beginning. We’ll unveil the potential of a fully featured open Kafka performance model. Discover how it could work, what it could do, and the approaches we’re investigating to build and parameterize it. These include benchmarking workloads separately, applying multivariate regression over metrics from our largest managed Kafka clusters, leveraging Kafka client metrics (KIP-714), and utilizing OpenTelemetry traces. For visualization, we’re exploring Sankey diagrams and integrating OpenTelemetry data into an open-source GUI.
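As a rough illustration of what parameterizing a model by multivariate regression could look like, the sketch below fits a linear model of broker CPU utilization against two throughput metrics using ordinary least squares. The metric names, the linear form, and the sample values are synthetic assumptions made for this example; they are not measurements from any cluster, and the approach used in the talk may differ.

```python
def fit_linear(rows, targets):
    """Ordinary least squares with an intercept, solved via the normal equations."""
    X = [[1.0, *r] for r in rows]  # prepend the intercept column
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("metrics are collinear; cannot fit")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(coefs, row):
    """Apply the fitted coefficients (intercept first) to one metrics sample."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))


# Synthetic samples: (bytes_in_mb_s, bytes_out_mb_s) -> cpu_percent.
# Generated from cpu = 5 + 0.4 * in + 0.1 * out, purely for illustration.
SAMPLES = [(10, 20), (20, 25), (30, 70), (40, 50), (50, 120)]
CPU = [11.0, 15.5, 24.0, 26.0, 37.0]

if __name__ == "__main__":
    coefs = fit_linear(SAMPLES, CPU)
    print("intercept and per-metric coefficients:", coefs)
    print("predicted CPU at 60 MB/s in, 100 MB/s out:", predict(coefs, (60, 100)))
```

The fitted coefficients can be read as the estimated CPU cost per MB/s of each traffic type, which is the kind of parameter a performance model needs. In practice a numerical library would replace the hand-written solver, and real metrics would be noisy and possibly nonlinear.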

Our goal is to democratize access to an open Kafka performance model, empowering anyone using, developing, or running Apache Kafka clusters and applications. This model will help predict end-to-end application performance, client and cluster resources, and performance SLAs. It will also aid in capacity planning, cluster sizing and resizing, and understanding the impact of dynamic changes such as variable workloads, elastic cluster resizing, cluster failures, and maintenance operations. The scope could even expand to include Kafka stream processing, multiple clusters, and heterogeneous integration scenarios with Kafka Connect.

Speakers

Paul Brebner, NetApp

Level

Intermediate

Audience

Architect, Data Engineer/Scientist, Developer, Executive (Technical), Operator/Administrator

Industry

IT, Technology

Tags

Apache Kafka, Architecture, Integration, Operations, Storage, Systems