Data Warehouse vs Data Lake | Explained (non-technical)

  • Published on Jan 20, 2025

Comments • 21

  • @KahanDataSolutions
    @KahanDataSolutions  2 years ago +3

    Want to build a reliable, modern data architecture without the mess?
    Here’s a free checklist to help you → bit.ly/kds-checklist

  • @Day925
    @Day925 1 year ago +3

    Your channel is now my new bible when it comes to Data Engineering

  • @beepboop1237
    @beepboop1237 2 years ago +1

    Great video! Perfect for anyone looking to understand some of the key first steps in setting up a solid data architecture.

  • @davidk4682
    @davidk4682 2 years ago +4

    You rock bro. Real clean and concise.

  • @BlakeC341
    @BlakeC341 1 year ago

    Great. Straightforward and simple.

  • @Reinales
    @Reinales 2 years ago +1

    I loved those animation parts, nice video! 😎😎

  • @nlopedebarrios
    @nlopedebarrios 1 year ago

    How do you know if you need a data lake? Suppose all your data sources are databases in the cloud, except maybe 2 or 3 files uploaded to an S3 bucket periodically. I don't see why data would be sent from the operational databases to cloud storage, instead of doing a traditional ETL into the data warehouse.

  • @kkindahouse7153
    @kkindahouse7153 2 years ago +2

    Very easy to understand!
    Could you explain more about facts/dimensions and slowly changing dimensions, please?

    • @KahanDataSolutions
      @KahanDataSolutions  2 years ago +2

      Thanks! This is a complex topic in itself, but here are the short answers to your questions.
      Facts/Dimensions - These terms come from "Dimensional Modeling", a strategy for designing tables in a data warehouse. They are still just database tables, but these terms indicate the role each table plays in the overall design.
      Fact Tables - Typically represent an activity (ex. sales or comments) and include quantitative data (ex. price, quantity, etc.) along with foreign keys to dimensions.
      Dimensions - Provide qualitative context around fact tables. These are descriptions (ex. color, type, name, etc.). The goal is to join facts to dimensions and create various views of the underlying business activity (the fact) for reporting. When designed properly, this type of relationship makes slicing up data really straightforward.
      Slowly Changing Dimensions - Dimensions with attributes that may change over time (ex. the location of an employee). There are various strategies for handling the change (ex. overwriting the old value vs. adding a new row and attaching a time frame to each row).
      Again, this topic could be an entire video in itself but ultimately it revolves around a strategy for organizing a data warehouse. I suggest looking into Kimball Data Modeling to learn more as well. Hope that helps!
      Here's a wiki link - en.wikipedia.org/wiki/Dimensional_modeling
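
To make the reply above concrete, here is a minimal Python sketch of a fact table joined to a slowly changing dimension, using the "add a new row and attach time frames" strategy (commonly called SCD Type 2). Everything in it is an illustrative assumption rather than something from the video: the `EmployeeDimRow` class, the `apply_location_change` and `sales_by_location` helpers, the column names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class EmployeeDimRow:
    """One row of a hypothetical employee dimension table."""
    employee_key: int          # surrogate key referenced by fact rows
    employee_id: str           # business identifier, stable across versions
    location: str              # descriptive attribute that may change
    valid_from: date
    valid_to: Optional[date]   # None marks the current version


def apply_location_change(rows, employee_id, new_location, change_date):
    """Add-a-new-row strategy: close the current version, append a new one."""
    current = next(
        (r for r in rows if r.employee_id == employee_id and r.valid_to is None),
        None,
    )
    if current is not None and current.location == new_location:
        return rows  # nothing changed, keep history as-is
    if current is not None:
        current.valid_to = change_date  # close out the old version
    next_key = max((r.employee_key for r in rows), default=0) + 1
    rows.append(EmployeeDimRow(next_key, employee_id, new_location, change_date, None))
    return rows


def sales_by_location(fact_sales, dim_rows):
    """Join fact rows to the dimension by surrogate key and total revenue."""
    by_key = {r.employee_key: r for r in dim_rows}
    totals = {}
    for sale in fact_sales:
        location = by_key[sale["employee_key"]].location
        totals[location] = totals.get(location, 0.0) + sale["price"] * sale["quantity"]
    return totals


# Illustrative data: one employee who moves from Chicago to Denver.
dim_employee = [EmployeeDimRow(1, "E100", "Chicago", date(2022, 1, 1), None)]
apply_location_change(dim_employee, "E100", "Denver", date(2023, 6, 1))

# Each fact row stores quantities plus a foreign key to the dimension version
# that was current when the sale happened.
fact_sales = [
    {"employee_key": 1, "price": 10.0, "quantity": 2},
    {"employee_key": 2, "price": 5.0, "quantity": 3},
]
totals = sales_by_location(fact_sales, dim_employee)
```

Because each sale points at the dimension row that was current at the time, older sales stay attributed to the employee's old location. The overwrite strategy would instead update `location` in place and lose that history.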

    • @kkindahouse7153
      @kkindahouse7153 2 years ago

      @KahanDataSolutions Love how you explain things in your own way, pretty clear for me as always.
      Hope you can share more data architecture topics in a simple way :)

  • @pooshpoosh9232
    @pooshpoosh9232 1 year ago

    So if I use a data lake and a data warehouse, does that mean I'm necessarily using ELT? Since I'm getting the data, loading it into the lake, then structuring it better in the warehouse.

  • @nomenetasaili8598
    @nomenetasaili8598 1 year ago +1

    When, or in what situation, would you need a data lake? Wouldn't transforming the various data directly into the data warehouse be more efficient?

  • @ashishlimaye2408
    @ashishlimaye2408 1 year ago

    Great videos!!

  • @josephojo1313
    @josephojo1313 2 years ago +3

    Thank you so much. love your content!

  • @AlexKashie
    @AlexKashie 1 year ago

    A more technical video on the differences would be appreciated, please.

  • @InnovativeBeautifulWorld
    @InnovativeBeautifulWorld 4 months ago

    Thanks a lot.