At Wix, our Feature Store processes billions of events every day to power data-driven experiences, from real-time personalization to machine learning model inference. Our initial Apache Storm–based design struggled under massive event volumes, resulting in significant data loss and complex maintenance challenges that limited our ability to scale. In this session, we'll share how we re-architected our online feature store with Apache Flink. You'll learn about the limitations of our previous design, the challenges we faced, and the principles that guided our shift to a high-performance online feature store. We'll illustrate how we combined Apache Spark, Apache Kafka, Aerospike, and Apache Flink to achieve high-throughput, low-latency feature computation and seamless real-time updates to over 2,500 features, without data loss. Expect a direct, architecture-focused session in which we compare our old and new designs and share the lessons learned along the way, without the philosophical debates.

