Session Type
Lightning Talk
Name
The Silent Migration: How Kafka Streams Became Our Safety Net
Date
Tuesday, May 20, 2025
Time
12:30 PM - 12:45 PM
Location Name
Breakout Room 6
Description
Migrating from a monolithic Postgres system to a distributed architecture is a high-stakes balancing act. Over five years, we transformed our legacy infrastructure, with Kafka Streams emerging as the backbone bridging old and modern systems, ensuring uninterrupted compliance, real-time reporting, and ML-driven insights.

This talk details how we collaborated across legacy teams, new service developers, external partners, and ML engineers to build a resilient streaming platform. Our layered Kafka Streams topologies served as a universal abstraction layer, addressing key challenges:
- Orchestrating Cross-Team Workflows: Legacy monoliths (using CDC with Debezium), Kafka-based new services, and external systems often produced conflicting schemas. We unified these data streams, enabling downstream innovation without tight coupling to source systems.
- Simplifying Operations: To manage tens of complex topologies, we developed internal tools for automated topology validation, state store monitoring, simplified replays, and efficient debugging, significantly reducing new engineer onboarding time.
- Compliance at Streaming Speed: Processing every transaction through Kafka Streams allowed us to implement real-time compliance checks with sub-100ms latency. This stream-first approach cut regulatory implementation time from weeks to days without altering legacy systems.
- Reporting & Machine Learning: Integrating with Databricks, we converted real-time streams into batch-compatible datasets using Spark Structured Streaming and Delta tables for sub-minute processing. Our pipeline also enabled real-time feature engineering, enhancing ML model performance for recommendations and risk scoring.

The target audience is data engineers, architects, and team leads tackling legacy modernization, cross-team collaboration, and real-time analytics. Attendees will learn strategies to align priorities, accelerate compliance, and unify real-time and batch pipelines for reporting and ML.
Nemanja Milicevic
Level
Intermediate
Target Audience
Architect, Data Engineer/Scientist
Industry
Entertainment, Technology, Gaming
Tags
Kafka Streams, Kafka Connect, Stream Processing, CDC, ML/AI Application, Integration