Session Details: Current London 2025

Breakout Session (45 minutes)

Name

Unlocking Next-Gen Stateful Streaming: Harnessing transformWithState in Apache Spark with Kafka

Date

Wednesday, May 21, 2025

Time

3:00 PM - 3:45 PM

Location Name

Breakout Room 1

Description

As event-driven architectures powered by Apache Kafka™ continue to redefine real-time data processing, the demand for flexible, scalable, and efficient stateful streaming solutions has never been higher.

Enter transformWithState, Apache Spark™’s groundbreaking new operator for Structured Streaming, designed to tackle the complexities of stateful processing head-on. In this session, we’ll dive into how transformWithState empowers developers to build sophisticated, low-latency streaming applications with Kafka as the backbone. From flexible state management and timer-driven logic to seamless schema evolution and integration with Kafka, we’ll explore real-world use cases—like real-time fraud detection and session-based analytics—that showcase its power.

Attendees will leave with a clear understanding of how to leverage transformWithState to supercharge their Kafka-powered Spark pipelines, complete with practical examples, performance insights, and best practices for production deployment. Whether you’re optimizing stateful aggregations or chaining complex event-driven workflows, this talk will equip you to push the boundaries of what’s possible with Kafka and Spark.

Speakers

Holly Smith, Databricks
Craig Lukasik, Databricks

Intermediate

Audience

Architect, Data Engineer/Scientist, Developer, Executive (Technical), Operator/Administrator

Tags

Analytics, Architecture, Cloud, Integration, ML Platform, ML/AI Application, Stream Processing