Efficient Data Aggregation Techniques for Big Data
Description
Explore advanced techniques for aggregating data in big data environments, with a focus on improving speed and efficiency when working with large datasets.
Level: tech2
Articles Addressing This Problem (2):
650GB of Data (Delta Lake on S3). Polars vs DuckDB vs Daft vs Spark.
The article discusses the challenges of processing large datasets using single-node frameworks like Polars, DuckDB, and Daft compared to traditional...
tech1
Added: Nov 24, 2025
Simple Queries in Spark Catalyst Optimisation (2) Join and Aggregation
This article explores the join and aggregation operations in Spark's Catalyst optimization engine. It discusses how Spark generates execution plans...
tech2
Added: Nov 23, 2025