Apache Airflow

Apache Airflow (or simply Airflow) is a platform to programmatically author, schedule, and monitor workflows.

Web site

GitHub repository

Tech tags:

Related shared content:

  • project
    2026-03-06

    This article discusses Pinterest's evolution from basic Text-to-SQL systems to a sophisticated Analytics Agent that leverages unified context-intent embeddings for improved SQL generation and table discovery. The system addresses the challenges of understanding analytical intent and provides a structured approach to data governance and documentation. By encoding historical query patterns and utilizing AI-generated documentation, the agent enhances the efficiency and reliability of data analytics at Pinterest. The article outlines the architecture and operational principles behind the agent's design, emphasizing the importance of context and governance in AI-driven analytics.

  • project
    2025-12-29

    The article discusses Vinted's journey in standardizing large-scale decentralized data pipelines as they migrated their data infrastructure to the cloud. Initially, teams operated independently, but as dependencies grew, coordination became challenging. To address this, they developed a DAG generator that abstracts pipeline creation and standardizes dependency interactions, allowing teams to focus on data models rather than orchestration details. This approach improved visibility and reduced operational complexity across decentralized teams.

  • product
    2025-06-24

    I had not come across the OpenLineage standard before. I guess DataHub should be able to support it as well.

  • project
    2025-01-10

    Very classic MDS

  • vision
    2024-12-20
  • project
    2024-10-07
  • product
    2024-09-05

    The "dataset" feature, available since Airflow 2.4.0 (released on September 19, 2022), is a good one.

  • project
    2024-10-17

    QuintoAndar's DAG Builder allows scalable management of 10,000+ Apache Airflow DAGs by using YAML configurations to generate DAGs, minimizing code duplication and standardizing data pipeline creation. By separating DAG structures from workflow-specific parameters, QuintoAndar enables data engineers to create new pipelines through declarative YAML files, streamlining the process and ensuring quality across pipelines. This system improves team productivity, simplifies code maintenance, and reduces the learning curve for new team members.

  • vision
    2022-02-15

    This article covers the idea behind fal dbt, which extends dbt's capabilities on the Airflow platform. It also discusses many other popular tools used with Airflow.

In production at:

Airbnb, Astrafy, QuintoAndar, Funding Circle