How I Structure My Data Pipelines

Original URL: https://loglevelinfo.substack.com/p/how-i-structure-my-data-pipelines

Article Written: December 30, 2025

Added: March 22, 2026

Type: tech1

Summary

The article discusses the author's approach to structuring data pipelines by integrating the medallion architecture, Kimball dimensional modeling, and semantic layers. It emphasizes the importance of defining clear roles and outputs for each layer—Bronze, Silver, and Gold—to cater to different user needs. The author argues for making the semantic layer a first-class priority in data architecture, highlighting its role in providing governed metrics for self-service analytics. The article concludes with a concrete example of how marketing attribution data flows through this architecture.
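
As a rough illustration of the layer boundaries summarized above, here is a minimal Python sketch of marketing attribution data moving from Bronze to Silver to Gold. It is not taken from the article: the sample rows, the function names (`to_silver_touches`, `to_silver_orders`, `to_gold_attribution`), and the choice of last-touch attribution are all assumptions made for illustration, and a real pipeline would more likely express these steps as dbt models over warehouse tables.

```python
from collections import defaultdict
from datetime import date

# Bronze: raw rows as landed, all strings, duplicates kept (hypothetical sample data).
bronze_touches = [
    {"touch_id": "t1", "order_id": "o1", "channel": " Email ", "ts": "2025-12-01"},
    {"touch_id": "t1", "order_id": "o1", "channel": " Email ", "ts": "2025-12-01"},  # duplicate load
    {"touch_id": "t2", "order_id": "o1", "channel": "paid_search", "ts": "2025-12-03"},
    {"touch_id": "t3", "order_id": "o2", "channel": "EMAIL", "ts": "2025-12-04"},
]
bronze_orders = [
    {"order_id": "o1", "revenue": "100.00"},
    {"order_id": "o2", "revenue": "40.00"},
]


def to_silver_touches(rows):
    """Silver: deduplicate on touch_id, normalise channel names, parse dates."""
    seen, cleaned = set(), []
    for row in rows:
        if row["touch_id"] in seen:
            continue
        seen.add(row["touch_id"])
        cleaned.append({
            "touch_id": row["touch_id"],
            "order_id": row["order_id"],
            "channel": row["channel"].strip().lower(),
            "ts": date.fromisoformat(row["ts"]),
        })
    return cleaned


def to_silver_orders(rows):
    """Silver: typed order revenue keyed by order_id."""
    return {row["order_id"]: float(row["revenue"]) for row in rows}


def to_gold_attribution(touches, orders):
    """Gold: revenue per channel under last-touch attribution (an assumed rule)."""
    last_touch = {}
    for touch in touches:
        current = last_touch.get(touch["order_id"])
        if current is None or touch["ts"] > current["ts"]:
            last_touch[touch["order_id"]] = touch
    revenue_by_channel = defaultdict(float)
    for order_id, touch in last_touch.items():
        revenue_by_channel[touch["channel"]] += orders[order_id]
    return dict(revenue_by_channel)


silver_touches = to_silver_touches(bronze_touches)
silver_orders = to_silver_orders(bronze_orders)
gold_attribution = to_gold_attribution(silver_touches, silver_orders)
print(gold_attribution)
```

In this sketch the attribution rule lives in exactly one place (`to_gold_attribution`), which is the role the summary assigns to governed metrics in the semantic layer: consumers read the result rather than re-deriving it from Silver.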

💭 Your Thoughts

Very clear blog post explaining the boundaries of each layer. One important aspect that is often overlooked at the beginning, however, is ownership. In a data mesh approach, should each department own its domain data across all three layers? Also, with the rise of AI tools, end users who would normally consume the “gold” layer can now access data more directly, often from the “silver” layer through an AI agent. This might reduce the need for heavily pre-calculated metrics over time.

Data Problems Addressed

Establishing Clear Definitions for Data Layers

Technologies Referenced

Databricks Unity Catalog, AWS Database Migration Service, dbt