Apache Iceberg

Apache Iceberg is an open table format for huge analytic datasets. Iceberg adds tables to compute engines including Spark, Trino, PrestoDB, Flink and Hive using a high-performance table format that works just like a SQL table.

Related shared contents:

Iceberg REST Catalog Now Supported in BigLake Metastore for Open Data Interoperability

product

2025-11-20

Google Cloud has announced the general availability of Iceberg REST Catalog support in BigLake metastore, enhancing open data interoperability across various data engines. This fully-managed, serverless metastore allows users to query data using their preferred engines, including Apache Spark and BigQuery, without the need for data duplication. The integration with Dataplex Universal Catalog provides comprehensive governance and lineage capabilities. Organizations like Spotify are already leveraging this technology to build modern lakehouse platforms.
Expressive Time Travel and Data Validation for Financial Workloads

product

2024-12-10

The validation and remediation are interesting.
Introducing Impressions at Netflix (part 1)

project

2025-02-14
Jumia builds a next-generation data platform with metadata-driven specification frameworks

vision

2024-12-20
Apache Iceberg: The Hadoop of the Modern Data Stack?

vision

2024-12-12

A good summary of the current problems with running an Iceberg system, though the new S3 Tables appear to address all of these pain points.
Turbocharging Efficiency & Slashing Costs: Mastering Spark & Iceberg Joins with Storage-Partitioned

spike

2024-12-03

With Iceberg tables, data is partitioned and stored in a way that aligns with the join keys, enabling highly efficient joins with minimal data movement in Spark jobs.
A First Look at S3 (Iceberg) Tables

tech1

2024-12-04

S3 table buckets handle Iceberg compaction and catalog maintenance tasks for you.
How Amazon Ads uses Iceberg optimizations to accelerate their Spark workload on Amazon S3

project

2024-11-22

Improves data processing efficiency by implementing Apache Iceberg's base-2 file layout for S3.
Build Write-Audit-Publish pattern with Apache Iceberg branching and AWS Glue Data Quality

product

2024-12-09

Without Iceberg, implementing the WAP pattern takes a lot of overhead work.
Implement historical record lookup and Slowly Changing Dimensions Type-2 using Apache Iceberg

spike

2024-12-09
