Session Type
Breakout Session
Name
Kafka Tiered Storage in Production?
Date
Wednesday, May 21, 2025
Time
1:00 PM - 1:45 PM
Location Name
Breakout Room 1
Description

One third of the cost of a typical Kafka cluster is storage. Beyond the cost itself, fluctuating usage means storage space has to be monitored, and it has been a source of on-call pain for us. Tiered storage for Kafka is a newly released feature that promises to dramatically reduce storage costs by offloading most data to cheap storage (e.g., S3) rather than expensive local or network-attached disks (e.g., EBS). It's marked as production-ready, but it's not widely adopted yet. Stripe is currently migrating to tiered storage across our fleet of more than 50 Kafka clusters. We've already encountered problems, such as JVM crashes and metadata calls that occasionally time out only for tiered storage topics, and we're still early in the migration (though we'll be done one way or the other by the time this conference takes place!). In this talk, you'll learn about the problems we encountered, which either made us abandon tiered storage or had to be solved before we could run it successfully in production.

Donny Nadolny
Level
Advanced
Target Audience
Developer, Operator/Administrator
Industry
Banking/Finance, Retail/E-Commerce, Technology
Tags
Apache Flink, Cloud, Operations, Storage, Systems, Tales from the trenches