Session Type
Lightning Talk
Name
How We Replaced Node.js with Apache Flink for Real-Time Deduplication and Cut Costs by 7x
Date
Wednesday, May 21, 2025
Time
12:30 PM - 12:45 PM
Location Name
Breakout Room 7
Description
ShareChat is one of the largest social media platforms in India, with over 180 million monthly active users.
We ran a high-throughput real-time stream-processing pipeline (>200K RPS) that used Node.js and Redis to deduplicate events over a 24-hour window.
In this talk, I'll walk you through how we transitioned to an Apache Flink-based solution, the challenges we faced, and the strategies that led to a 7x cost reduction.
Topics Covered:
1. State Management at Scale:
- Our early attempts to structure Flink state efficiently to handle massive-scale deduplication.
- Lessons learned in making the job manageable and performant despite the huge state size.
2. Autoscaling Challenges:
- How we leveraged the Flink Kubernetes Operator to enable autoscaling.
- Why autoscaling initially increased duplication, and how we solved it.
3. When Async API Matters in Apache Flink:
- Understanding the role of Async I/O in Flink.
- How it impacts performance and resource efficiency in real-time streaming.
4. How We Achieved 7x Cost Savings
Speakers

Level
Intermediate
Target Audience
Architect, Developer
Industry
Advertising/Media, Technology, IT
Tags
Apache Flink, Apache Kafka, Stream Processing