DuckDB

DuckDB is an in-process SQL OLAP database management system.

Web site

GitHub repository

Tech tags:

Related shared contents:

  • tech1
    2025-11-12

    The article compares single-node frameworks such as Polars, DuckDB, and Daft with traditional Spark clusters for processing large datasets. It highlights the concept of 'cluster fatigue' and the emotional and financial costs of running distributed systems. The author benchmarks these frameworks on a 650 GB dataset stored in Delta Lake on S3 and shows that single-node frameworks can handle large datasets without extensive resources. The findings suggest that modern Lakehouse architectures can benefit from these lightweight alternatives.

  • poc
    2025-02-04

    Sounds very fast, but if we work with a big dataset, how do we handle the data transformation in memory? If we work with small data, we can rewrite it into Parquet format and performance is not an issue.

In production with: