Ever wondered how OpenAI keeps Kafka running smoothly while scaling, upgrading, or replacing clusters? Join us for an inside look at the strategies and tools we use for seamless Kafka migrations at massive scale, without ever missing a message. We'll also explore best practices for Kafka consumers, patterns for high availability and disaster recovery, and lessons learned from real-world incidents and edge cases. Attendees will come away with a new set of tools and tactics for making infrastructure changes safely and transparently. We'll cover how these apply to specific technologies, including Apache Kafka, Apache Flink for stateful stream processing, Apache Spark (Structured Streaming) for streaming ELT, and Uber uForwarder as a platform for managed Kafka consumers.

