Filter articles by tags or search for specific topics:
Added Date: January 2, 2025
Memo: Automated granting of S3 data access to data consumers.
Original URL: https://blog.det.life/apache-iceberg-the-hadoop-of-the-modern-data-stack-c83f63a4ebb9
Added Date: December 31, 2024
Memo: A good summary of the current problems with using Iceberg, though the new S3 Tables looks like it addresses all of these pain points.
Added Date: December 30, 2024
Memo:
Added Date: December 29, 2024
Memo:
Original URL: https://www.linkedin.com/blog/engineering/ai/practical-text-to-sql-for-data-analytics
Added Date: December 28, 2024
Memo: A very good blog post, with a lot of tips for building a modern LLM RAG app.
Original URL: https://dropbox.tech/machine-learning/selecting-model-semantic-search-dropbox-ai
Added Date: December 27, 2024
Memo:
Added Date: December 26, 2024
Memo: Leverages Iceberg tables: data is partitioned and stored in a way that aligns with the join keys, enabling highly efficient joins with minimal data movement in Spark jobs.
Original URL: https://meltware.com/2024/12/04/s3-tables
Added Date: December 25, 2024
Memo: S3 table buckets handle Iceberg compaction and catalog maintenance tasks for you.
Original URL: https://blog.twitch.tv/en/2024/12/05/views-pwn-tables-as-data-interfaces/
Added Date: December 24, 2024
Memo: Twitch has leveraged Views in their Data Lake to enhance data agility, minimize downtime, and streamline development workflows. By utilizing Views as interfaces to underlying data tables, they've enabled seamless schema modifications, such as column renames and VARCHAR resizing, without necessitating data reprocessing. This approach has facilitated rapid responses to data quality issues and supported efficient ETL processes, contributing to a scalable and adaptable data infrastructure.
Added Date: December 23, 2024
Memo: