Session Type
Lightning Talk
Name
How We Replaced Node.js with Apache Flink for Real-Time Deduplication and Cut Costs by 7x
Date
Wednesday, May 21, 2025
Time
12:30 PM - 12:45 PM
Location Name
Breakout Room 7
Description
ShareChat is one of the largest social media platforms in India, with over 180 million monthly active users. We had a high-throughput real-time stream processing pipeline (>200K RPS) that used Node.js + Redis-based deduplication with a 24-hour window. In this talk, I'll walk you through how we transitioned to an Apache Flink-based solution, the challenges we faced, and the strategies that led to a 7x cost reduction.

Topics Covered:
1. State Management at Scale:
   - Our early attempts to structure Flink state efficiently to handle massive-scale deduplication.
   - Lessons learned in making the job manageable and performant despite the huge state size.
2. Autoscaling Challenges:
   - How we leveraged the Flink Kubernetes Operator to enable autoscaling.
   - Why autoscaling initially increased duplication, and how we solved it.
3. When Async API Matters in Apache Flink:
   - Understanding the role of Async I/O in Flink.
   - How it impacts performance and resource efficiency in real-time streaming.
4. How We Achieved 7x Cost Savings
Andrei Manakov
Level
Intermediate
Target Audience
Architect, Developer
Industry
Advertising/Media, Technology, IT
Tags
Apache Flink, Apache Kafka, Stream Processing