To offer its customers state-of-the-art digital services, Daimler Truck manages anonymized data from more than 12,000 connected buses operating in Europe. The data comes from the CTP, a piece of technology installed in each vehicle that streams telemetry such as vehicle speed, GPS position, acceleration values, and braking force. The system handles around 500k messages per second, with an average latency of around 5 seconds between a vehicle sending the data and the data becoming available for consumption. Follow our three-year journey of developing self-managed, stateful Apache Flink applications on top of this treasure trove of near-real-time data, with the ultimate goal of delivering business-critical products like Driver Performance Analysis, Geofencing, EV Battery Health, and Signal Visualization. Starting with a team completely new to Flink, we learned through trial, error, and iteration, and eventually built a modern, resilient data processing setup. In this session, we'll share our victories, setbacks, and key lessons learned, focusing on practical tips for managing self-hosted Flink clusters. Topics will include working with Flink operators, understanding load distribution, scaling pipelines, and achieving operational reliability. We'll also delve into the mindset shifts required to build robust, real-time data systems. Whether you're new to Flink, transitioning from batch to streaming, or scaling existing pipelines, this talk offers actionable insights to help you architect, deploy, and optimize your self-managed Flink environment with confidence.
