Session Type
Lightning Talk
Name
From Events to Insights: Kafka’s Role in Myntra’s Real-Time Data Revolution
Date
Wednesday, March 19, 2025
Time
12:30 PM - 12:45 PM
Location Name
Scarlet 2
Description

In today’s fast-paced world, where actionable business insights drive competitive advantage,
tapping into dynamic real-time streams marks the evolution of data-driven decision-making and
is revolutionizing business intelligence.
Traditional batch-based data pipelines slowed down decision-making, delayed business
insights, and limited our ability to respond in real time.
Join this session to learn how, at Myntra, we revamped our data infrastructure by transforming
batch-based pipelines into a robust, real-time streaming architecture, reducing latency from hours
to mere minutes.
This session will also delve into how we leveraged Kafka, Spark Structured Streaming, and
Delta Lake to create a scalable, low-latency ingestion pipeline. By implementing exactly-once
semantics and optimizing data flows, we achieved the reliability and scalability needed to power
mission-critical use cases. We’ll also explore how this transformation addressed the inherent
limitations of traditional batch systems, enabling data freshness, operational agility, and the
delivery of actionable near real-time business insights. These advancements have redefined how
Myntra supports its dynamic ecosystem, driving unprecedented agility.
The audience will gain actionable strategies for building real-time streaming pipelines,
overcoming data freshness challenges, and unlocking the potential of near real-time
insights to fuel innovation and growth at scale.
Key highlights:
1. Kafka-Centric Streaming Architecture: Delve into the architectural design where Kafka
powers seamless integration between streaming and batch workflows, efficiently handling
millions of events per minute.
2. Data Freshness & Completeness Challenges: Understand how Myntra ensures data freshness
and completeness using write-ahead logs and micro-batch freshness propagation.
3. Operational Innovations with Delta and Spark: Explore how Apache Spark enabled efficient
real-time ingestion, exactly-once semantics, and fault tolerance at high throughput.
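
For readers new to the idea, the exactly-once behaviour mentioned in the highlights can be illustrated with a small, self-contained sketch. It is not Myntra's implementation and uses no Kafka, Spark, or Delta Lake APIs. It is a pure-Python stand-in for the general pattern that Spark Structured Streaming documents: record each micro-batch's offset range in a write-ahead log before processing it, and make the sink idempotent per batch id so that a replay after a failure does not duplicate data. All class and method names here (IdempotentSink, MicroBatchRunner, run_next_batch) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


class SimulatedCrash(Exception):
    """Raised to imitate a failure between the sink write and the commit log."""


@dataclass
class IdempotentSink:
    """Stand-in for a transactional table that remembers committed batch ids."""
    rows: list = field(default_factory=list)
    seen_batches: set = field(default_factory=set)

    def write(self, batch_id, records):
        # A replayed batch id is ignored, so a retry cannot duplicate rows.
        if batch_id in self.seen_batches:
            return False
        self.rows.extend(records)
        self.seen_batches.add(batch_id)
        return True


@dataclass
class MicroBatchRunner:
    """Plans offset ranges in a write-ahead log before processing them."""
    source: list                                    # stand-in for one partition
    sink: IdempotentSink
    batch_size: int = 3
    offset_log: dict = field(default_factory=dict)  # batch_id -> (start, end)
    commit_log: set = field(default_factory=set)    # batch ids fully finished

    def run_next_batch(self, crash_before_commit_log=False):
        batch_id = len(self.commit_log)
        if batch_id not in self.offset_log:
            # New batch: plan its range and log it BEFORE touching the sink.
            start = self.offset_log[batch_id - 1][1] if batch_id > 0 else 0
            end = min(start + self.batch_size, len(self.source))
            if start == end:
                return False  # no new events to process
            self.offset_log[batch_id] = (start, end)
        # A planned batch always replays exactly the range that was logged.
        start, end = self.offset_log[batch_id]
        self.sink.write(batch_id, self.source[start:end])
        if crash_before_commit_log:
            raise SimulatedCrash(f"crashed after writing batch {batch_id}")
        self.commit_log.add(batch_id)
        return True


events = [f"order-{i}" for i in range(7)]
sink = IdempotentSink()
runner = MicroBatchRunner(source=events, sink=sink)

runner.run_next_batch()  # batch 0 completes normally
try:
    # Batch 1 reaches the sink, but the job "dies" before recording completion.
    runner.run_next_batch(crash_before_commit_log=True)
except SimulatedCrash:
    pass

# After restart, batch 1 is replayed with the same range and the sink skips it.
while runner.run_next_batch():
    pass

print(len(sink.rows), sink.rows == events)
```

In this sketch the source is replayable, the planned range is fixed once logged, and the sink deduplicates by batch id; dropping any one of the three would allow either loss or duplication after the simulated crash.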

Shrvan Warke