Session Type
Breakout Session
Name
Cost-Effective Logging at Scale: ShareChat’s Journey to WarpStream
Date
Wednesday, March 19, 2025
Time
2:00 PM - 2:45 PM
Location Name
Scarlet 1
Description

In August 2023, WarpStream introduced itself as a Kafka-compatible, S3-native streaming solution offering powerful features such as a BYOC-native approach, decoupling of storage from compute and of data from metadata, offset-preserving replication, and direct-to-S3 writes. It shines in a specific niche: logging, observability, and data lake feeding, where a slight increase in latency is a fair trade-off for substantial cloud cost savings and simplified operations.

In this session, we'll walk through ShareChat's journey of migrating our logging systems from managed Kafka-compatible solutions to WarpStream. At ShareChat, logging suffered from two issues: highly unpredictable workloads and high inter-zone fees for replicating data across brokers. Logging volume could spike to five times the normal rate for brief periods before returning to baseline, so we had to over-provision our Kafka clusters to avoid costly rebalancing and scaling issues, resulting in unnecessary expense. WarpStream addresses both problems: its stateless, autoscaling agents eliminate the need to manage local disks or rebalance brokers, and by leveraging S3 for replication it removes inter-zone fees.

We'll cover setting up WarpStream in your cloud, best practices for agents (brokers) and clients, fine-tuning your cluster's latency, and advice for local testing. You'll see a detailed cost comparison between WarpStream and both multi-zone and single-zone Kafka-compatible solutions. We'll also demonstrate how to set up comprehensive monitoring for your WarpStream cluster at various levels of granularity, including agent, topic, and zone. Finally, we'll cover the essential alerts you should configure for your agents, share our experience consuming from WarpStream inside Spark jobs, and present the Spark configs that worked best for us.
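As a taste of the Spark integration mentioned above, the sketch below shows how a Spark Structured Streaming job can read from a WarpStream cluster through the standard Kafka source, since WarpStream agents speak the Kafka protocol. The agent endpoint, topic name, sink paths, and option values are illustrative assumptions, not the configuration presented in the session.

```python
# Minimal sketch: consuming a logging topic from WarpStream via Spark's
# Kafka-compatible source. Endpoint, topic, and option values are
# illustrative assumptions, not ShareChat's production settings.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("warpstream-logs-consumer")
    .getOrCreate()
)

logs = (
    spark.readStream
    .format("kafka")
    # WarpStream agents can be used as Kafka bootstrap servers directly
    # (hypothetical address shown).
    .option("kafka.bootstrap.servers", "warpstream-agent.internal:9092")
    .option("subscribe", "app-logs")
    .option("startingOffsets", "latest")
    # Cap per-trigger read size to smooth out bursty logging volume
    # (value is an assumption).
    .option("maxOffsetsPerTrigger", 1_000_000)
    .load()
)

query = (
    logs.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
    .writeStream
    .format("parquet")  # e.g., feeding a data lake
    .option("path", "s3a://example-bucket/logs/")  # assumed sink path
    .option("checkpointLocation", "s3a://example-bucket/checkpoints/logs/")
    .start()
)

query.awaitTermination()
```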

Speakers
Vivek Chandela, Shubham Dhal
Level
Intermediate
Target Audience
Architect, Data Engineer/Scientist, Developer, Operator/Administrator
Industry
IT, Technology