How Yelp modernized its data infrastructure with a streaming lakehouse on AWS
Original URL: https://aws.amazon.com/blogs/big-data/how-yelp-modernized-its-data-infrastructure-with-a-streaming-lakehouse-on-aws/
Article Written: November 13, 2025
Added: November 13, 2025
Type: project
Summary
The article discusses Yelp's transformation of its data infrastructure through the adoption of a streaming lakehouse architecture on AWS. This modernization aimed to address challenges related to data processing latency, operational complexity, and compliance with regulations like GDPR. By migrating from self-managed Apache Kafka to Amazon MSK and implementing Apache Paimon for storage, Yelp achieved significant improvements, reducing analytics data latencies from 18 hours to minutes and cutting storage costs by over 80%. The article outlines the architectural shifts and technologies involved in this transformation.
💠Your Thoughts
Agreed: using SQL as the primary interface is much simpler and friendlier for data analysts (DA) and data scientists (DS).