In this session, the Dream11 engineering team will share the innovations behind its Apache Kafka consumers, which process tens of millions of events using a re-engineered Kafka consumer library. Dream11 is one of the largest fantasy sports platforms in the world, handling peak user concurrency of over 15 million during IPL 2024, with edge traffic surpassing 300 million requests per minute (RPM). The business is highly time-sensitive, with hockey-stick traffic surges just before matches start. To deliver real-time updates to users, the Dream11 platform relies heavily on Apache Kafka in the critical pipelines of end-user services. As scale grew, the legacy Kafka consumer (simple, high-level) began to suffer from delays and data loss, severely impacting user trust. To address these issues, the team developed a low-level Kafka consumer. In this consumer, polling is decoupled from processing and the two run in parallel, so slow processing no longer delays the next poll; this fixed the frequent rebalancing problem. Messages are processed by a dedicated worker pool, which improved processing speed significantly. Auto-commit is disabled and offsets are committed in batches, giving at-least-once processing and ensuring no data loss. As the microservices ecosystem grew, Kafka pipelines became integral to many services. Building on the success of the low-level consumer, the team turned the Kafka consumer library into a platform that abstracts away the complexities of Kafka integration. The library provides simple interfaces that let developers implement business logic seamlessly. Over time, it matured with features such as backpressure, enabling developers to process messages locally during incidents or to scale across a consumer pool with varied core counts. Join this session to learn strategies for optimizing Kafka consumers for low latency and high reliability at massive scale.
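
The consumer design described above (polling decoupled from processing, a dedicated worker pool, batched manual commits, and backpressure) can be illustrated with a minimal Python sketch. This is not Dream11's library and it does not use a Kafka client: an in-memory list stands in for a single partition, and the names `consume`, `handler`, `max_in_flight`, and `commit_batch` are hypothetical. The bounded queue is one simple way to model backpressure; a real Kafka consumer would more likely pause the affected partitions and keep polling rather than block.

```python
import queue
import threading

_STOP = object()  # sentinel telling a worker to exit


def consume(records, handler, workers=4, max_in_flight=8, commit_batch=5):
    """Process (offset, value) records from one simulated partition.

    Offsets are assumed contiguous and ascending. Returns the list of
    committed positions, where each entry is the next offset to read.
    """
    buffer = queue.Queue(maxsize=max_in_flight)  # bounded buffer models backpressure
    lock = threading.Lock()
    done = set()   # offsets finished out of order, above the watermark
    commits = []   # stand-in for calls to a real commit API
    start = records[0][0] if records else 0
    state = {"next": start, "committed": start}

    def poll_loop():
        # Stand-in for the poll thread: it only fetches and hands off records.
        # In this simulation it blocks when the buffer is full.
        for record in records:
            buffer.put(record)
        for _ in range(workers):
            buffer.put(_STOP)

    def work_loop():
        while True:
            item = buffer.get()
            if item is _STOP:
                return
            offset, value = item
            try:
                handler(value)
            except Exception:
                # Never marked done: the watermark stalls at this offset, so the
                # record would be redelivered after a restart (at-least-once).
                continue
            with lock:
                done.add(offset)
                # Advance the watermark over the contiguous finished prefix.
                while state["next"] in done:
                    done.remove(state["next"])
                    state["next"] += 1
                # Commit in batches rather than once per record.
                if state["next"] - state["committed"] >= commit_batch:
                    state["committed"] = state["next"]
                    commits.append(state["next"])

    threads = [threading.Thread(target=poll_loop)]
    threads += [threading.Thread(target=work_loop) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Final commit for any remainder smaller than one batch.
    if state["next"] > state["committed"]:
        commits.append(state["next"])
    return commits
```

Committing only the contiguous finished prefix is what keeps this sketch at-least-once even though workers finish out of order: a failed or unfinished record holds the committed position back, so nothing after it is acknowledged prematurely.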
