The day started with one problem: how do we get content from a CMS to be reflected in multiple systems, especially our global search (knauf.com), “right away”? That’s easy if you can wait (welcome back to the 1920s). Today, milliseconds can mean the difference between a happy customer (in our case, an editor) and one lost to frustration. Why is that impressive? ~200 editors at work and ~3.3 million connections a day!
This is where streaming helps us with (near) real-time data processing. Our system integrates Contentful CMS, Confluent Kafka, and Apache Flink into a real-time data pipeline that captures, processes, and analyzes content updates with speed and precision.
Our goal was clear: create a reactive data processing system that ensures any content changes in the Contentful CMS are immediately reflected in other applications, especially our global search functionality via Algolia indexes. We wanted users to have access to the most up-to-date information, no delays allowed.
How did we achieve this? We leverage Contentful webhooks to detect content modifications and trigger a Spring Boot application that acts as a Kafka producer, transforming the events and channeling them into our Kafka topics. From there, Confluent Flink does the heavy lifting, performing real-time data transformation, aggregation, and analysis. This lets us compute exactly what an Algolia index update needs, so the indexes receive small, targeted changes and our global website stays in sync across all systems. The sketches below show the shape of each stage.
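First, the webhook-to-Kafka hop. Here is a minimal sketch, assuming Spring for Apache Kafka with a String-serialized KafkaTemplate; the endpoint path and topic name are hypothetical placeholders, not our production values:

```java
import org.springframework.http.ResponseEntity;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestHeader;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ContentfulWebhookController {

    // Topic name is a hypothetical placeholder.
    private static final String TOPIC = "contentful-content-events";

    private final KafkaTemplate<String, String> kafkaTemplate;

    public ContentfulWebhookController(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    // Contentful calls this endpoint on publish/unpublish/delete events.
    // The X-Contentful-Topic header carries the event type,
    // e.g. "ContentManagement.Entry.publish".
    @PostMapping("/webhooks/contentful")
    public ResponseEntity<Void> onContentEvent(
            @RequestHeader("X-Contentful-Topic") String eventType,
            @RequestBody String payload) {
        // Simplified: keyed by event type here; a production setup would
        // key by entry ID so updates for the same entry stay ordered.
        kafkaTemplate.send(TOPIC, eventType, payload);
        return ResponseEntity.accepted().build();
    }
}
```

In production you would also verify the webhook secret before producing, so only genuine Contentful calls reach the topic.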
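Next, the Flink stage. Confluent’s managed Flink is driven by Flink SQL; the sketch below expresses the same idea with the open-source Flink Table API in Java so it can run locally. The table schemas, topic names, and content-type filter are all illustrative assumptions, not our actual job:

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ContentEventEnrichmentJob {

    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Source: raw Contentful events produced by the webhook application.
        tEnv.executeSql(
                "CREATE TABLE content_events (" +
                "  entry_id STRING," +
                "  content_type STRING," +
                "  locale STRING," +
                "  fields STRING," +
                "  updated_at TIMESTAMP(3)" +
                ") WITH (" +
                "  'connector' = 'kafka'," +
                "  'topic' = 'contentful-content-events'," +
                "  'properties.bootstrap.servers' = 'localhost:9092'," +
                "  'format' = 'json'," +
                "  'scan.startup.mode' = 'latest-offset'" +
                ")");

        // Sink: compact records the Algolia updater consumes downstream.
        tEnv.executeSql(
                "CREATE TABLE algolia_updates (" +
                "  objectID STRING," +
                "  locale STRING," +
                "  searchDocument STRING" +
                ") WITH (" +
                "  'connector' = 'kafka'," +
                "  'topic' = 'algolia-index-updates'," +
                "  'properties.bootstrap.servers' = 'localhost:9092'," +
                "  'format' = 'json'" +
                ")");

        // Keep only searchable content types and pass the fields through;
        // a real job would aggregate and reshape the payload here.
        tEnv.executeSql(
                "INSERT INTO algolia_updates " +
                "SELECT entry_id, locale, fields " +
                "FROM content_events " +
                "WHERE content_type IN ('page', 'article')");
    }
}
```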
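Finally, applying the Flink output to the search index. This sketch assumes the v3 Algolia Java client and Spring’s @KafkaListener; the credentials, index name, and record shape are placeholders. partialUpdateObject only touches the attributes present on the record, which is what keeps the index updates minimal:

```java
import com.algolia.search.DefaultSearchClient;
import com.algolia.search.SearchClient;
import com.algolia.search.SearchIndex;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class AlgoliaIndexUpdater {

    // Minimal search record; the Algolia v3 Java client resolves the
    // record's identity from the getObjectID() accessor.
    public static class SearchRecord {
        private String objectID;
        private String locale;
        private String searchDocument;

        public String getObjectID() { return objectID; }
        public void setObjectID(String objectID) { this.objectID = objectID; }
        public String getLocale() { return locale; }
        public void setLocale(String locale) { this.locale = locale; }
        public String getSearchDocument() { return searchDocument; }
        public void setSearchDocument(String s) { this.searchDocument = s; }
    }

    private final ObjectMapper mapper = new ObjectMapper();

    // App ID, API key, and index name are hypothetical placeholders.
    private final SearchClient client =
            DefaultSearchClient.create("YOUR_APP_ID", "YOUR_ADMIN_API_KEY");
    private final SearchIndex<SearchRecord> index =
            client.initIndex("global_search", SearchRecord.class);

    // Consume the compact updates emitted by the Flink job and apply them
    // as partial updates, leaving untouched attributes as they are.
    @KafkaListener(topics = "algolia-index-updates", groupId = "algolia-updater")
    public void onUpdate(String message) throws Exception {
        SearchRecord record = mapper.readValue(message, SearchRecord.class);
        index.partialUpdateObject(record);
    }
}
```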
In short, we’ve turned waiting into a thing of the past, transforming what was once a cumbersome process into a seamless, real-time experience. Our system not only keeps pace with the demands of modern technology but also sets a new standard for efficiency and speed.
