Session Type
Breakout Session
Name
Melting Icebergs: Enabling Analytical Access to Kafka Data through Iceberg Projections
Date
Wednesday, May 21, 2025
Time
3:00 PM - 3:45 PM
Location Name
Breakout Room 4
Virtual Session Link
Description

An organisation's data has traditionally been split between the operational estate, for daily business operations, and the analytical estate, for after-the-fact analysis and reporting. Today, the journey from one side to the other is long and tortuous. But does it have to be?

In the modern data stack, Apache Kafka is the de facto standard operational platform, and Apache Iceberg has emerged as the champion of table formats for powering analytical applications. Can we leverage the best of Iceberg and Kafka to create a solution greater than the sum of its parts?

Yes, we can, and we did!

This isn't a typical story of connectors, ELT, and separate data stores. We've developed an advanced projection of Kafka data in an Iceberg-compatible format, allowing direct access from warehouses and analytical tools.

In this talk, we'll cover:

* How we presented Kafka data for Iceberg processors without moving or transforming data upfront—no hidden ETL!
* Integrating Kafka's ecosystem into Iceberg, leveraging Schema Registry, consumer groups, and more.
* Meeting Iceberg's performance and cost reduction expectations while sourcing data directly from Kafka.

Expect a technical deep dive into the protocols, formats, and services we used, all while staying true to our core principles:

* Kafka as the single source of truth—no separate stores.
* Analytical processors shouldn't need Kafka-specific adjustments.
* Operational performance must remain uncompromised.
* Kafka's mature ecosystem features, like ACLs and quotas, should be reused, not reinvented.

Join us for a thrilling account of the highs and lows of merging two data giants, and stay tuned for the surprise twist at the end!

Tom Scott, Roman Kolesnev
Level
Intermediate
Target Audience
Architect, Data Engineer/Scientist, Developer, Executive (Technical)
Industry
Advertising/Media, Banking/Finance, Education, Energy/Utilities, Entertainment, Gaming, Government, Healthcare, Hospitality, Insurance, IT, Manufacturing, Retail/E-Commerce, Technology, Telecommunications, Transportation
Tags
Analytics, Apache Flink, Apache Iceberg, Architecture, Data Catalog, Integration, Storage