Data Modeling Challenges - The Issues Data Engineers & Architects Face When Implementing Data Models

  • Published on Oct 2, 2024

Comments • 34

  • @SeattleDataGuy
    @SeattleDataGuy  1 year ago +2

    If you guys want to learn more about data engineering, then sign up for my newsletter here seattledataguy.substack.com/ or join the discord here discord.gg/2yRJq7Eg3k

  • @William-B
    @William-B 1 year ago +16

    Could you do a demonstration video? Here's what we want to model -> Here's one way to model it

    • @chrisformoso
      @chrisformoso 1 year ago +2

      this and the logic of microdecisions that goes into building the model would be gold!

  • @EcZachly_
    @EcZachly_ 1 year ago +6

    That data modeling live will be so good!

  • @RodrigoBocanegraCruz
    @RodrigoBocanegraCruz 1 year ago +15

    Different data models were defined to respond to different scenarios:
    - 3NF for transactional applications
    - Dimensional (star or snowflake schemas) or flat for reporting
    - Datavault, Anchor, Hook for integration
    Sometimes you could even decide not to model the data at all, as many practitioners are doing, which could work for specific scenarios. Now we also have hybrid tables, so models are also being challenged for those previous use cases. Perhaps Anchor with its 6NF could be an interesting approach when having hybrid tables? It sounds appealing to me!
    But regardless of the technology challenges and innovations, when we consider the overall enterprise and overarching data needs, I would say that the most important models for any scenario are the enterprise and conceptual models (not technical models), where you define the key business elements (data domains) and the relationships relevant for the business. After that, I would even suggest you can use whatever you want on every layer, as long as the physical model is aligned to the business model.
    I think that's why datavault, with automation in place, is very popular in many industries: it forces you to understand the conceptual model and focus on the business rather than the technical implementation. Certainly, datavault without automation is just a pain in the integration, and it certainly doesn't consider lakehouse architectures where you could physically design your lake folders around data domains and define guidelines for relationships (the devil is in the details).
    But sadly, as you suggest, people usually think conceptual models are easy to create or unnecessary, and that's a misstep on the way to proper scalability. Even if you follow a federated approach (data mesh or equivalent), you'll eventually have to integrate data from multiple domains, and only enterprise and conceptual modeling can soften the burden.
    This is where business and data architecture play a crucial role in defining proper data domains and linking that with data development, delivery, management, and governance.
    Thanks for the video!

  • @JoeG2324
    @JoeG2324 1 year ago +11

    My team certainly doesn't follow data modeling standards, and it works perfectly fine for us. One of the main reasons we don't is that we simply don't have enough time or people to properly put together a data model. We handle data ingestion, transformations, and reporting for many, many departments, and it would be impractical to build data models. What we do instead is have processes that build tables for different subject areas. So it's technically one table that fits a subject area, and these tables get joined with other datasets, etc. I'm simplifying, but in general this is how we work. If my team covered only one subject area and had specific questions that needed to be reported on, then yeah, a data model might work in our case.

    • @SeattleDataGuy
      @SeattleDataGuy  1 year ago +1

      Yeah, I think a lot of teams take it on in different ways, one of which is to have looser practices. There are always tradeoffs, of course.

    • @moona5454
      @moona5454 9 months ago

      Basically a gold layer for each domain, and then dbt to create OBTs (one big tables) joining multiple domains.
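
The pattern in the reply above can be sketched in a few lines. This is only an illustration, not the commenter's actual setup: it uses Python's built-in `sqlite3` in place of a warehouse and dbt, and the table and column names (`gold_customers`, `gold_orders`, `obt_orders`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical subject-area ("gold") tables, one per domain.
    CREATE TABLE gold_customers (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE gold_orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO gold_customers VALUES (1, 'West'), (2, 'East');
    INSERT INTO gold_orders VALUES (10, 1, 25.0), (11, 1, 5.0), (12, 2, 40.0);

    -- One wide table joining both domains, kept at order grain.
    CREATE TABLE obt_orders AS
    SELECT o.order_id, o.amount, c.customer_id, c.region
    FROM gold_orders AS o
    LEFT JOIN gold_customers AS c ON o.customer_id = c.customer_id;
""")

# Reporting queries can now read the wide table without further joins.
rows = conn.execute(
    "SELECT order_id, region, amount FROM obt_orders ORDER BY order_id"
).fetchall()
print(rows)
```

In dbt, the `SELECT` inside `CREATE TABLE ... AS` would typically live in its own model file; the join logic is the same idea.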

  • @BJTangerine
    @BJTangerine 1 year ago +4

    Congrats on the new home

  • @BudaVlogss
    @BudaVlogss 1 year ago

    @SeattleDataGuy Hello, I had a question. I just finished my BS degree in Software Development, and I was wondering whether that is enough, as far as the degree aspect goes, to get into data engineering. When I search around for jobs, I see a lot of Data Engineer positions that require a master's degree, which I do not have.

  • @hwy9nightkid
    @hwy9nightkid 5 months ago

    What is the best open source data modelling option?

  • @JuanBojaca-e5m
    @JuanBojaca-e5m 9 months ago

    What do you think about a graph data model?

  • @sarvesht7299
    @sarvesht7299 1 year ago +1

    In brief, when the requirements are clear, what may go wrong while designing a data model? Like the challenges we might encounter.. could you please share those?

    • @jppbkm
      @jppbkm 1 year ago +1

      The requirements will turn out to be wrong 😂

    • @cnaeuspompeus3188
      @cnaeuspompeus3188 1 year ago +4

      Here is one: a model with 3 fact tables and 45 dimensions(!), and the model itself was almost perfect... cool, yeah!
      However, 25%, 33%, and 45% of the FKs in the 3 fact tables were zeros or completely missing (without enforced constraints, of course), due to a "successful" migration years ago from an old DB to the OLTP DB that this star schema was reading from... The modeling work was done, the PMs were happy, and then it was time for me to come in and do the "reporting"... the biggest joke of my career.
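
A check for the kind of problem described above might look like the following. It is a minimal sketch, assuming a fact table that references a dimension by a surrogate key and no enforced constraints. The helper `orphan_fk_rate` and the tables `fact_sales` and `dim_customer` are hypothetical, and Python's built-in `sqlite3` stands in for the real database.

```python
import sqlite3

def orphan_fk_rate(conn, fact, fk_col, dim, dim_key):
    """Share of fact rows whose FK is NULL, zero, or has no matching dimension row.

    Identifiers are interpolated directly, so pass trusted names only.
    """
    sql = f"""
        SELECT COUNT(*) AS total,
               SUM(CASE WHEN f.{fk_col} IS NULL
                         OR f.{fk_col} = 0
                         OR d.{dim_key} IS NULL
                        THEN 1 ELSE 0 END) AS bad
        FROM {fact} AS f
        LEFT JOIN {dim} AS d ON f.{fk_col} = d.{dim_key}
    """
    total, bad = conn.execute(sql).fetchone()
    return (bad or 0) / total if total else 0.0

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_sales (sale_id INTEGER PRIMARY KEY, customer_key INTEGER, amount REAL);
    INSERT INTO dim_customer VALUES (1, 'Acme'), (2, 'Globex');
    -- Rows 2, 3, and 5 carry a zero, a NULL, and an unknown key respectively.
    INSERT INTO fact_sales VALUES
        (1, 1, 10.0), (2, 0, 5.0), (3, NULL, 7.5), (4, 2, 3.0), (5, 99, 1.0);
""")

rate = orphan_fk_rate(conn, "fact_sales", "customer_key", "dim_customer", "customer_key")
print(f"{rate:.0%} of fact_sales rows have an unusable customer_key")
```

Running a check like this per fact table and per foreign key before any reporting work starts would surface the gap early, whatever the model looks like on paper.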

    • @sarvesht7299
      @sarvesht7299 1 year ago

      @cnaeuspompeus3188 Ha... if it was a successful migration, then how were those tables empty? And how did you overcome this issue? Did you reload the tables?

    • @ivani3237
      @ivani3237 10 months ago

      @sarvesht7299 It's a typical situation in many DBs: at some point you add a new property (ex. "') to your fact table, the old data isn't updated (it has NULL in that column), and new data arrives with it populated.
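
The situation described above is easy to reproduce and to profile. The sketch below assumes a hypothetical `fact_orders` table that gains a `channel` column after some rows already exist; it uses Python's built-in `sqlite3`, and the helper `null_counts_by_year` is illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fact_orders (order_id INTEGER PRIMARY KEY, order_date TEXT, amount REAL);
    INSERT INTO fact_orders VALUES (1, '2022-03-01', 10.0), (2, '2022-09-15', 20.0);
    -- Later schema change: existing rows get NULL in the new column.
    ALTER TABLE fact_orders ADD COLUMN channel TEXT;
    INSERT INTO fact_orders VALUES (3, '2023-02-01', 15.0, 'web'), (4, '2023-06-10', 8.0, 'store');
""")

def null_counts_by_year(conn, table, date_col, col):
    """Return (year, total_rows, null_rows) tuples for one column, ordered by year."""
    sql = f"""
        SELECT substr({date_col}, 1, 4) AS year,
               COUNT(*) AS total,
               SUM(CASE WHEN {col} IS NULL THEN 1 ELSE 0 END) AS nulls
        FROM {table}
        GROUP BY year
        ORDER BY year
    """
    return conn.execute(sql).fetchall()

profile = null_counts_by_year(conn, "fact_orders", "order_date", "channel")
for year, total, nulls in profile:
    print(year, f"{nulls}/{total} rows missing channel")
```

A sharp drop in the NULL count at a specific period usually marks when the column was introduced, which helps decide whether to backfill the older rows or to treat them as "unknown" in reports.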

    • @ivani3237
      @ivani3237 10 months ago +1

      The requirements are constantly changing over time, plus new data keeps coming in.

  • @Kondaranjith3
    @Kondaranjith3 1 year ago +3

    THANK YOU VERY USEFUL

  • @santoshbhamidipati8455
    @santoshbhamidipati8455 11 months ago +1

    Can you give a demo instead of explaining it theoretically? Many of us really need help with a real-world example of data modelling for cracking interviews.

  • @datamandy8975
    @datamandy8975 1 year ago +2

    Why does the camera angle always wiggle between zooming in and zooming out? It's making my head burst... Your content is always good 🤗 Can you please fix just this one thing?

    • @SeattleDataGuy
      @SeattleDataGuy  1 year ago +1

      Ok, I can see if we can do that less!!

  • @thedailyepochs338
    @thedailyepochs338 9 months ago +1

    Loved this video. Thank you

  • @calypso5629
    @calypso5629 1 year ago +1

    There is some background noise.

  • @emonymph6911
    @emonymph6911 11 months ago +1

    congrats on the new house

  • @mirlamontano6640
    @mirlamontano6640 10 months ago +1

    awesome video!