The Missing Piece in Many Data Pipelines

  • Published on Jul 9, 2024
  • ►► The Starter Guide for Modern Data → bit.ly/starter-mds
    Simplify “modern” architectures + better understand common tools & components
    All data teams (large & small) have at least one thing in common.
    Source data.
    But not everyone handles it the same way in their pipelines.
    Some reference raw source tables directly in many queries.
    Others create ad hoc custom tables to address subtle formatting changes,
    but without any overarching strategy or consistent naming behind them.
    Data modeling (e.g., Kimball, One Big Table) is the more popular topic.
    But I believe an equally important area to consider is what you do BEFORE you start creating those core data models.
    For many, this "before" layer doesn't exist at all.
    In previous videos I've talked about a 3-Layered Data Model.
    And today I want to focus solely on Layer 1, which addresses this concept.
    It's called a "Staging" layer.
    When done right, it can help you establish reliable pipelines from the very start.
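As a rough illustration of the idea (a sketch, not taken from the video), a staging layer can be as small as one view per source table that only renames columns and casts types, so downstream queries never touch the raw table. The table and column names here (`raw_orders`, `stg_orders`, `OrderID`, and so on) are hypothetical, and SQLite is used only so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical raw source table: loaded as-is, with inconsistent
# column names and everything stored as text.
conn.execute(
    'CREATE TABLE raw_orders ("OrderID" TEXT, "cust_id" TEXT, "Amt" TEXT, "createdAt" TEXT)'
)
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?, ?)",
    [
        ("1001", "7", "19.90", "2024-07-01 10:15:00"),
        ("1002", "8", "5.00", "2024-07-02 08:30:00"),
    ],
)

# Staging view: one-to-one with the source table. It only renames
# columns to a consistent convention and casts types; no joins,
# no business logic.
conn.execute(
    """
    CREATE VIEW stg_orders AS
    SELECT
        CAST("OrderID" AS INTEGER) AS order_id,
        CAST("cust_id" AS INTEGER) AS customer_id,
        CAST("Amt" AS REAL)        AS amount,
        DATE("createdAt")          AS order_date
    FROM raw_orders
    """
)

# Downstream queries reference the staging view, not the raw table.
rows = conn.execute(
    "SELECT order_id, customer_id, amount, order_date FROM stg_orders ORDER BY order_id"
).fetchall()
for row in rows:
    print(row)
```

If the source later renames a column, only the view definition has to change; every query built on `stg_orders` keeps working.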
    Timestamps:
    00:00 - Intro
    00:52 - What is a Staging Layer?
    03:23 - Reason #1: Modularity
    05:03 - Reason #2: Consistency
    07:21 - Reason #3: Clarity
    Tags:
    #kahandatasolutions #dataengineering #datamodeling
