Data Modeling in the Modern Data Stack

  • Published on 19 Nov 2024

Comments • 49

  • @KahanDataSolutions
    @KahanDataSolutions  2 years ago +11

    Deliver more impact w/ modern data tools, without getting overwhelmed
    See how in The Starter Guide for Modern Data → www.kahandatasolutions.com/guide

  • @firefoxmetzger9063
    @firefoxmetzger9063 1 year ago +4

    Just because storage is cheap doesn't mean it's unimportant. I/O is often the slowest part of OLAP and ETL queries and directly dictates compute cost. The two current OLAP pricing models (pay by minute or pay by bytes "scanned") both depend on how much data you load per query. The former because you pay for an idle cluster that is waiting for data, the latter because it literally bills you for how much data you load. Data modeling helps you get the compression and partitioning you need to control and minimize that cost.
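
To make the bytes-scanned point concrete, here is a minimal sketch of how partition pruning could change the bill under a pay-per-bytes-scanned model. Everything in it is an assumption for illustration: the `Partition` class, the helper functions, the 10 GiB daily partition size, and the price per TiB are hypothetical and not taken from any vendor's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    day: str          # partition key, e.g. "2024-01-30" (hypothetical layout)
    size_bytes: int   # compressed size of the partition on disk


def bytes_scanned(partitions, wanted_days=None):
    """Sum the bytes a query reads.

    With wanted_days=None every partition is read (a full scan);
    otherwise only partitions whose key is in wanted_days are read.
    """
    return sum(
        p.size_bytes
        for p in partitions
        if wanted_days is None or p.day in wanted_days
    )


def scan_cost(n_bytes, price_per_tib):
    """Cost of a query billed purely on bytes scanned."""
    return n_bytes / 2**40 * price_per_tib


PRICE_PER_TIB = 5.0  # hypothetical price, for illustration only

# Assumed table: 30 daily partitions of 10 GiB each.
partitions = [Partition(f"2024-01-{d:02d}", 10 * 2**30) for d in range(1, 31)]

full = bytes_scanned(partitions)                    # no partition filter
pruned = bytes_scanned(partitions, {"2024-01-30"})  # filter on one day

print(f"full scan:   {full / 2**30:.0f} GiB, cost {scan_cost(full, PRICE_PER_TIB):.4f}")
print(f"pruned scan: {pruned / 2**30:.0f} GiB, cost {scan_cost(pruned, PRICE_PER_TIB):.4f}")
```

Under these assumed numbers the query that filters on the partition key reads one thirtieth of the data, and the cost scales down by the same factor.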

  • @johncosgrove7727
    @johncosgrove7727 1 year ago +9

    Hey Mike, this video was straight 🔥🔥🔥 - This is such an important topic for all businesses taking the journey right now and it really blows my mind how much understanding of modelling has been lost. Your overview of the pros and cons of each was a well balanced and to-the-point summary - I especially love that you correctly call out the source data and scale risks of OBT. I think too many peeps in the MDS are working for SaaS companies where they don't realise that 1) source systems are way less static in enterprise and 2) most enterprises outside of SaaS aren't just okay with massive pipeline code spaghettis. Great video!

  • @shreshti82
    @shreshti82 1 year ago +1

    Addresses the challenges and considerations when moving to the cloud, but there is not enough detail on modelling the data itself.

  • @nlopedebarrios
    @nlopedebarrios 1 year ago +3

    Great summary, thank you! Personally, I'm more comfortable with the Kimball approach, but the "modern data stack" part in the title caught my attention, and I think it's important to know what trends align better with the modern tech. Also, I couldn't agree more that design is a key part of the process, and if done correctly is going to prevent a lot of headaches. It would be nice to get deeper into the hybrid model you presented.

  • @nlishivha1005
    @nlishivha1005 6 months ago

    Just started out in my career. I watched the video 4 months back and it gave me a high-level overview. Came back after a session with my seniors and now I see the granular details. Thanks!

  • @nicky_rads
    @nicky_rads 2 years ago +12

    Solid overview! I’ve mainly used dimensional modeling with fact/dim tables.
    A good data model goes a long way within the analytics pipeline, important stuff. Thanks

    • @KahanDataSolutions
      @KahanDataSolutions  2 years ago +1

      Definitely. Appreciate you sharing your experience

    • @datasleek7950
      @datasleek7950 2 years ago +2

      I second that. As a DBA originally, I can say proper DB modeling (either in OLTP or OLAP) will provide peace of mind in the future.

    • @summer_xo
      @summer_xo 2 years ago +1

      Get the data model right and the rest falls into place IMO 👍

  • @Joeymbryan
    @Joeymbryan 1 year ago

    Wow this is spot on. I've been using the modern data stack for the last ~8 years or so. Starting out, I was definitely focusing on a star schema/denormalized approach with the MPP databases, but as I learned that they do best with fewer joins and can handle wide tables, I started striving for the OBT approach. In practice, the hybrid is what typically happens: there are so many dimensional tables, which vary in need from team to team, that especially in a larger enterprise the hybrid is almost guaranteed.
    Great video! Love the work you did here.

    • @KahanDataSolutions
      @KahanDataSolutions  1 year ago

      Appreciate the feedback! What you describe is really similar to my journey as well.

  • @davemeech
    @davemeech 1 year ago +1

    Oh man, it's so nice to get something substantial on YouTube for data modeling! Awesome awesome stuff.
    What do you think of Unistore? I have yet to dive in, and I'm also very fresh into the industry, but the prospect of combining OLTP and OLAP capabilities is certainly compelling!

  • @theconfusedchannel6365
    @theconfusedchannel6365 1 year ago +1

    Good one. I think taking an actual dataset and walking it through a model would be great.

  • @Rex_793
    @Rex_793 2 years ago +2

    great stuff - when are you going on Joe Reis and Matt Housley's podcast?

  • @mrcool4uall
    @mrcool4uall 1 year ago

    Good one Michael. Well summarized and to the point. I also think the hybrid approach is the best for the MDS.

  • @ArmstrongNigere
    @ArmstrongNigere 2 years ago +2

    Awesome video, really enjoyed it. Very clear and straight to the point.

  • @wingnut29
    @wingnut29 1 year ago +1

    Great overview! Thanks for educating us on the newer approaches. We currently are using Denormalized Modeling. This is due to the fact that all our current needs are around our ERP. The Marketing Dept wants to start collecting more website and estore analytics, which I believe will lead us to a Hybrid model.

    • @KahanDataSolutions
      @KahanDataSolutions  1 year ago

      Glad it was helpful! I still think denormalized is a great strategy.

  • @datasleek7950
    @datasleek7950 2 years ago +2

    Great Job. Great Presentation.

  • @amitsaha7756
    @amitsaha7756 1 year ago +1

    In some cases, we observe another pattern where an industry-standard data model is used after the raw layer.

  • @bradleymiller437
    @bradleymiller437 2 years ago +1

    I literally hit this video so fast sincerely thinking I was going to learn "How to date a model." My brain went faster than my eyes and lost the game.

  • @zulkhaireesulaiman8575
    @zulkhaireesulaiman8575 2 years ago +2

    thank you, incredibly helpful.

  • @johnytheripper
    @johnytheripper 2 years ago +2

    Thanks for addressing this often overlooked but important topic!
    I'm looking for some good sources on dimensional data modeling. Of course I have the Kimball books, but something more practical (books, courses...) and hands-on, perhaps with exercises on various business scenarios / sources would be great. Any ideas?

    • @deltagamma1442
      @deltagamma1442 1 year ago

      Any luck?

    • @johnytheripper
      @johnytheripper 1 year ago

      @deltagamma1442 not really :/. Took some inspiration from the GitLab data handbook as I'm mostly looking for SaaS use cases

  • @sunil-de
    @sunil-de 1 year ago

    Hey, the video was top tier. Can you suggest any good course to get an in-depth understanding of DM?

  • @Billbillbillhahagdvdve
    @Billbillbillhahagdvdve 3 months ago

    Excellent video!

  • @Lima3578user
    @Lima3578user 1 year ago

    great video. thank you. can you upload a vlog on Speech to Text transcripts using AI

  • @drewwolin3162
    @drewwolin3162 2 months ago

    This was great

  • @SnowFlake-h4y
    @SnowFlake-h4y 1 year ago

    Could you please explain the differences between the different data models (Inmon, Kimball, 3NF, Dimensional Modelling, Data Vault)?

  • @renvils
    @renvils 8 months ago

    Hey, your explanation is really calm and clear! Do you have a Udemy course?

  • @rpelegrini
    @rpelegrini 1 year ago +2

    Hello, really nice video about data modeling. I was looking for this.
    It's very difficult to define the right approach for data modeling; each case is different. In my experience I built a lot of star schemas during my career, but nowadays I see a trend toward one big table in modern data warehouses like BigQuery, Redshift or Synapse.
    Do you have the same impression?

    • @KahanDataSolutions
      @KahanDataSolutions  1 year ago +1

      Yep I'm seeing the same thing. Truly case by case. I think the concept of star schema/data models are still very applicable today mainly b/c of the organization and structure it brings rather than for any performance gains.

  • @nlopedebarrios
    @nlopedebarrios 1 year ago

    Hi Michael, the hybrid model is an interesting approach. What could be used to transition from the star schema to the OBT data marts? For example, views or materialized views? Or are they separate schemas? Also, in what scenario would this make sense?
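
One possible reading of the views option raised in this question, sketched with Python's built-in `sqlite3` module: the star schema stays as the modeled layer and a view flattens it into a one-big-table mart. This is only an illustration under assumptions, not the video author's method. The table, column, and view names (`fact_sales`, `dim_customer`, `dim_product`, `mart_sales_obt`) and the sample rows are hypothetical. SQLite has no materialized views, so a plain view stands in here; in a warehouse the same SELECT could back a materialized view or a physical table instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star schema: one fact table and two dimensions.
conn.executescript("""
CREATE TABLE dim_customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT,
    region        TEXT
);
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
CREATE TABLE fact_sales (
    sale_id     INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES dim_customer(customer_id),
    product_id  INTEGER REFERENCES dim_product(product_id),
    amount      REAL
);

INSERT INTO dim_customer VALUES (1, 'Acme', 'EU'), (2, 'Globex', 'US');
INSERT INTO dim_product  VALUES (10, 'Widget', 'Hardware'), (11, 'Gadget', 'Hardware');
INSERT INTO fact_sales   VALUES (100, 1, 10, 25.0), (101, 2, 11, 40.0), (102, 1, 11, 15.0);

-- The OBT-style mart: every fact row with its dimension attributes inlined.
CREATE VIEW mart_sales_obt AS
SELECT f.sale_id,
       c.customer_name,
       c.region,
       p.product_name,
       p.category,
       f.amount
FROM fact_sales f
JOIN dim_customer c ON c.customer_id = f.customer_id
JOIN dim_product  p ON p.product_id  = f.product_id;
""")

# Consumers query the wide mart without writing any joins themselves.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM mart_sales_obt GROUP BY region ORDER BY region"
).fetchall()
print(rows)
```

The trade-off in this sketch is that a plain view re-runs the joins on every query, while a materialized view or table pays the join cost once at build time in exchange for extra storage and a refresh step.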

  • @okj4521
    @okj4521 1 year ago +1

    Next: How to date a model!

  • @venkatvaddula6343
    @venkatvaddula6343 1 month ago

    Can someone please tell me what the website/document shown at the 13-second mark is?

  • @theukulelegod
    @theukulelegod 8 months ago

    What are the downsides of doing all three in one? Pull the raw data from all source systems (Inmon), then model fact and dim tables (Kimball), then build data marts? If storage is getting cheaper, wouldn’t this be the best way?

  • @S_B_S1
    @S_B_S1 7 months ago

    How do Data Products in their various guises fit with these data modelling concepts?

  • @jimgillespie3540
    @jimgillespie3540 1 year ago +1

    *slow clap* thank you, fantastic.

  • @medhatatef7737
    @medhatatef7737 2 years ago

    Thank you very much :))

  • @SujitA-h9d
    @SujitA-h9d 1 year ago

    How do I create an LDM in MagicDraw?

  • @largpack
    @largpack 9 months ago +4

    Just theoretical blah blah in my opinion... great for COOs to talk about stuff they have no idea about.

    • @moonfire5069
      @moonfire5069 7 months ago

      Until you get grilled about these in an interview at companies like Airbnb, Netflix and Facebook, look like a completely clueless idiot, and get shown the door.