Learn Database Denormalization

แชร์
ฝัง
  • เผยแพร่เมื่อ 31 พ.ค. 2024
  • What is RDBMS denormalization all about? This video will help you to recognize situations in which it is appropriate to denormalize a relational database table - or avoid normalizing it in the first place. Featuring lots of examples and a focus on the design process.

ความคิดเห็น • 66

  • @decomplexify
    @decomplexify  ปีที่แล้ว +14

    CORRECTIONS: Expand this comment to see corrections.
    * CORRECTIONS *
    At 12:00, a couple of the columns on the Subtask table are oddly named. "Task_Assignment_Datetime" and "Task_Type_Code" should really have been called "Subtask_Assignment_Datetime" and "Subtask_Type_Code", respectively. In addition, one of the assignment datetimes (November 12th at 204pm) is unrealistic, as we'd expect it to be later than the creation datetime of the parent Task.

  • @misc_things
    @misc_things ปีที่แล้ว +28

    The return of the king (2022)

  • @newmonengineering
    @newmonengineering ปีที่แล้ว +20

    As a database administrator, I approve of your explanations. I work with tons of data, and I wish every database was normalized and had solid loads for denormalized reporting databases. But in the real world it is rare. When we get feeds from outside and then need to report on them it always makes the process ridden with issues. Unfortunately having 10 different data sources outside of your control and then trying to join them for reporting is why database administration is still a solid job.

    • @snakhokonkebuthelezi2959
      @snakhokonkebuthelezi2959 ปีที่แล้ว

      Hi, I also aspire to be a database administrator, currently I work in admin for a bank but I'm currently doing my second year in BSc IT. I decided to take on this journey cause I realized how important IT is in the banking sector and how most processes are being automated. My main goal with pursing this Bsc IT degree is to become a database administrator and my question to you sir, if you don't mind answering is, how did you get there? What qualifications did you have to acquire for example, and what experience or path did you take to get there? I'm 29 years old currently and I just need clarity on how to get to that position as quickly as possible. My aim is to finish my Bsc IT hopefully in record time (3 years) and then do a COMPTia in Data to specialize in data and after that hope for the best in terms of getting the relevant experience.

    • @newmonengineering
      @newmonengineering ปีที่แล้ว +4

      @snakhokonkebuthelezi2959 to be honest I got to the Database Admin job by working in a support position that required help from the DBA team regularly. I made friends with the team and their Boss and asked to be a part of it. He asked me some questions you can't find the answers to on Google, and told me I have some number of hours to think about them and answer. I answered them close enough so he said I could join the team. Even to this day that boss was the smartest DBA I had ever met, fortunately I learned from him and quickly also became an extremely knowledgeable DBA, even more so than most of the team. Often in life, it's getting into a position where you can talk to a supervisor of the position you want. If you are a reliable smart person, the best thing to do is get to be friends with the boss you want. I have done exactly this at least 5 times in my life. Who you know often pulls just as much weight as your resume. I have been a DBA for 15 years. I have worked with DB2, Oracle, SQL server and many other smaller systems. Also learned Unix/Linux along the way. If you can find a support job that works with DBA team you can get to know the team and eventually probably be transfered. You can get a job directly but you will probably have a bit harder time because so many need experience now. If you can backup, restore, migrate data, copy and alter tables, reindex and change the file storage of the database as well as write SQL you can probably get most jobs. There are more in depth things like mirroring, replication, and so many others but not every company uses such things.

    • @fabio.1
      @fabio.1 5 หลายเดือนก่อน

      👍

  • @ngoctandang9307
    @ngoctandang9307 5 หลายเดือนก่อน +2

    As a junior developer who didn’t have enough knowledge about databases, I stumbled upon this video because I was curious about ‘What is Denormalization?’ Your video introduced me to many new concepts that I now need to contemplate and learn. Thank you very much.

  • @mikefochtman7164
    @mikefochtman7164 ปีที่แล้ว +3

    One project, we deliberately did a 'denormalizing' thing. We had data from instruments that took readings every 5 minutes and recorded this raw data to a table. Then, we had to get an accumulated calculation that basically summed these readings for an entire month, quarter, and year (well, it was more complex than that, but that's the general idea). These 'sums' had to be accessed often for various reasons. So we made a table of 'cached sums' where we saved the results of the 'expensive, time-consuming' calculation. If there was no table entry for a particular instrument, we ran the 'slow' calculation that queried every instrument reading for the past year, saved the result in the 'cached sums' table and returned the result. Next time we needed that particular sum, it would be quickly returned from the 'cached sums' table.
    So the 'cached sums' data was, technically speaking, redundant data (the original readings being the 'source data'). We did consider that editing the raw readings causes the 'cached sums' to be invalidated, but we compromised.

  • @user-ps3gr3ed1z
    @user-ps3gr3ed1z ปีที่แล้ว +9

    Love the content! Helped me get a job. Great explanations and practical examples! Please upload more - I think this channel could be huge over the long run!

  • @dmytroshchotkin2939
    @dmytroshchotkin2939 ปีที่แล้ว +1

    That was a very nice example with the 'UnitPrice'. Ultimately it's about knowing your domain and distinguishing , what Uncle Bob calls, 'critical business rules' from something rather temporary and not so 'critical'.

  • @nirshadnijam2291
    @nirshadnijam2291 ปีที่แล้ว +1

    This is one of a simplest explanation for a complicated topic. Good work, Decomplexify! Living up to the name.

  • @martinedelius
    @martinedelius ปีที่แล้ว

    Always happy to see new content from you!

  • @MeinDeutschkurs
    @MeinDeutschkurs ปีที่แล้ว +1

    I misunderstood the topic, I thought it‘s about „reverse-engeneering“ a given (older) database. But it was great to watch. Vocabulary is everything! Thank you so much for your efforts!!! ❤️❤️❤️

  • @AlexLuthore
    @AlexLuthore ปีที่แล้ว +2

    Your channel is solid and informative! Please keep making content!

  • @SumTingWong886
    @SumTingWong886 ปีที่แล้ว +2

    Please keep making videos! They are so good!

  • @eladiomendez8226
    @eladiomendez8226 ปีที่แล้ว +3

    Thank you for your amazing videos !

  • @sang1s160
    @sang1s160 ปีที่แล้ว +1

    Superb examples and explanation in practical terms!

  • @denioduarte6782
    @denioduarte6782 8 หลายเดือนก่อน

    Amazing explanation covering all the denormalization facets. Congrats

  • @shrek769
    @shrek769 ปีที่แล้ว +2

    You have THE BEST database videos. Like... Ever.

  • @rthdv
    @rthdv 9 หลายเดือนก่อน

    Thank you SO MUCH for such neat, organized and really SIMPLE to understand explanations! Not just this but I learned a lot from most of your videos. Keep up the great work and thanks a lot (Y)

  • @tenthlegionstudios1343
    @tenthlegionstudios1343 ปีที่แล้ว

    Thanks for making this. I think I requested this on your normalization video - appreciate it!

  • @ShadowHakaiSuminoe
    @ShadowHakaiSuminoe ปีที่แล้ว

    New decomplexify video!! Great stuff

  • @someshmusunuri4233
    @someshmusunuri4233 7 หลายเดือนก่อน

    Thankyou so much, your content is nothing less than gold to me for my learning curiosity.

  • @mohammadalaaelghamry8010
    @mohammadalaaelghamry8010 8 หลายเดือนก่อน

    Of course it's helpful. Great content thank you.

  • @itdataandprocessanalysis3202
    @itdataandprocessanalysis3202 ปีที่แล้ว

    Very happy about that video!

  • @kirle5455
    @kirle5455 ปีที่แล้ว +1

    Great video - thanks! It would be great to see a video on how to load denormalized data into normalized db

  • @allone258
    @allone258 ปีที่แล้ว

    wow thanks my friend

  • @haisonphan5700
    @haisonphan5700 ปีที่แล้ว

    Banger video, i love it!

  • @iliyangermanov5003
    @iliyangermanov5003 ปีที่แล้ว +1

    Excellent explanation! I'd like to see a video on "Views" in relational databases.

  • @smoothbeak
    @smoothbeak ปีที่แล้ว

    Very helpful video

  • @snakhokonkebuthelezi2959
    @snakhokonkebuthelezi2959 ปีที่แล้ว

    I should be paying my tuition to you instead, but all I can thank you with in this instance is a thank you and a like button. Thank you.

  • @Blomma761
    @Blomma761 ปีที่แล้ว

    Just in time for my exam, thank you

  • @donnyrides
    @donnyrides ปีที่แล้ว

    great video. Suggestion for a new video: indexes. What they are, how they work, and their importance to efficient queries.

  • @MichalionWhite
    @MichalionWhite ปีที่แล้ว +8

    Hi @Decomplexify, would you cover how relational databases are redesigned into non-relational ones? This is typically done during system/product migration to cloud.

    • @ThomasBaxterSoutar
      @ThomasBaxterSoutar ปีที่แล้ว

      Do you mean moving access to dataverse and SharePoint?
      And PowerPages etc?

    • @KaiHenningsen
      @KaiHenningsen ปีที่แล้ว +1

      @@ThomasBaxterSoutar I'm presuming stuff like NoSQL databases such as Couchbase, MongoDB, and Redis.

  • @ThomasBaxterSoutar
    @ThomasBaxterSoutar ปีที่แล้ว

    This Dude is a Fantastic Teacher.
    All of his Videos I've loved.
    Learnt so much in 5 videos...wow!
    Off to implement it in Microsoft Acces in Office 365
    You gotta download Acces though....it doest run on the cloud.
    PowerApps you create will though...
    Use your learning ;)
    Can't believe this series of Tutorials has such a low uptake?

    • @ThomasBaxterSoutar
      @ThomasBaxterSoutar ปีที่แล้ว

      I've been trying to untangle a real world problem described here (huge user base) recently...
      You provided my solution
      I can't implement
      But understand
      I'm the victim of multiple legacy RDBMS
      where no one has the SOURCE CODE...
      only the FRONT END...
      Frustrating but solvable
      Reverse Engineering?

  • @rthdv
    @rthdv 9 หลายเดือนก่อน

    Thanks!

  • @adib4361
    @adib4361 ปีที่แล้ว

    Hi, could you please create a video on Data Modelling. Your videos are so awesome!!

  • @deez_dev
    @deez_dev 9 หลายเดือนก่อน

    keep going bro

  • @pamelacastro32
    @pamelacastro32 10 หลายเดือนก่อน

    I’d like to see some videos on various data model types and when to use them on structure or non structured data (e.g. relational, dimensional, object, etc)

  • @ErickBuildsStuff
    @ErickBuildsStuff ปีที่แล้ว

    @Decomplexify How does this example be exported to a NOSQL design ideas? I personally would use a graph DB and use a high level languages to manage the data. I always use 2x copies of same graph and original DB graph data doesn’t get manipulated. Since I‘m a beginner and small projects, it is easy to manage but not sure how to manage it for really large graphs

    • @albertosimeoni7215
      @albertosimeoni7215 7 หลายเดือนก่อน

      Fun fact...
      All modern rdbms (oracle, sql server, sap hana) have graph execution Engines...
      No sql / non relational databases have a meaning only for data Lakes...where you have unstructured data...
      Otherwise only rdbms with in Memory columnstore table is the modern option

  • @gunahawk6893
    @gunahawk6893 ปีที่แล้ว

    He iis back

  • @user-oe3gv9df5x
    @user-oe3gv9df5x 2 หลายเดือนก่อน

    do you have a video on subqueries?

  • @justgivemethetruth
    @justgivemethetruth ปีที่แล้ว

    Whoa ... I no sooner watch your normatization video and get suitably impressed than I am confronted with de-normalization. Does that put me back to the point I was before I watched the normalization video? ;-)

  • @iShaymus
    @iShaymus ปีที่แล้ว

    Aargh I hate it when Data Warehousing is called Data Mart! Nevertheless a fantastic video. I was thinking the whole time that dimensional modelling was a perfect example of demoralisation.
    Great video!

    • @decomplexify
      @decomplexify  ปีที่แล้ว

      Thanks!

    • @Milhouse77BS
      @Milhouse77BS 4 หลายเดือนก่อน

      The fact tables in dimensional models are in 3NF. Only dimensions are 2NF, 3NF if snowflakes.

  • @vicsteiner
    @vicsteiner ปีที่แล้ว

    I found the discussion on denormalization for performance very interesting and sad at the same time. It is kind of a social marketing issue to me more than a technical issue. I mean we try so optimize queries (whixh contradicts completely the very reason for which thos were envisioned by Codd) instead of demanding from vendors better optimization to happens in the phisical (or what you called process) layer. Being a relational query language a description of the expected form/structure of the data and not the steps to get there it should not matter if the language lets you to express it in multiple ways. If thise are equivalent we should expect same performance. But I do know that in practice we as consumers tend to have no choice other than adapt and bound to vendors while finding our hacks for improving SQL performance from the logical layer leading us to unwanted things such as denirmalization for performance.

    • @decomplexify
      @decomplexify  ปีที่แล้ว

      Yes, I entirely agree with that!

  • @digvijaysingh325
    @digvijaysingh325 ปีที่แล้ว

    can you make one video on indexing as well

  • @vicsteiner
    @vicsteiner ปีที่แล้ว +1

    In any way I have the opinion that if you are modelling data using the relational data model and databases that are SQL based (therefore that do not support the relational data model properly) event if for whatever reason one decides to denormalize a relation one should be completely aware of the decision and its implications. Possibly, the deviations from the ideal model and the flawed one should be documented and given reasons so at least it becomes easier to interpret the data and build data products on top of it.

    • @austin4855
      @austin4855 2 หลายเดือนก่อน

      100% this is the way. My approach is always strive for 5th normal form first, and then work backwards and denormalize from there if I see a benefit. It's common in high-read/low-write tables. I document my reasoning and test any DML procedures - it then becomes important to enforce that the proven, transactional procedures are the only interface that developers have for DML in order to avoid corruption.

  • @thorstenl.4928
    @thorstenl.4928 ปีที่แล้ว

    Thanks for the good content.
    I have my issues with the first example.
    To me this is a table that documents things that happened in the real world as they happened.
    That often changes the contexts.
    The price is no longer a property of the item, it actually refers to the transaction itself. Sure there is still a price table with an item to price relationship, but the actually paid price is dependent to the transaction, even if it is derived in some way from the item price. There can be a typing mistake, a calculation error, a discount because of goodwill… all not displayable in other tables with joins.
    Do you get my point?

    • @decomplexify
      @decomplexify  ปีที่แล้ว

      You're welcome! Regarding the first example, I think I covered this at 5:54, where it says we can imagine changes being made to the conventions around pricing, e.g. the conventions change in such a way as to support volume discounts, tiered pricing based on membership level, etc. The denormalized design anticipates the possibility of these kinds of changes, and continues to work if such changes get introduced.

    • @thorstenl.4928
      @thorstenl.4928 ปีที่แล้ว +1

      @@decomplexify Ok I get your point. But I was thinking about, if the relationship is to the transaction (the PK) and not to the item. Isn‘t it in 3NF?
      I‘m not 100% sure, because the item price is surely the source, but in my opinion no longer related from the moment on where it is about real world documentation.
      And it‘s not abou a future decision how pricing works. To me it‘s from the moment on, when its about documentation what happened.
      When there is a calculation mistake, I actually paid 10$ instead of 2•5,01$=10,02$
      Then I have to log the actual booked transaction values not the theoretical correct values by joining…
      Do you get what I‘m about?

    • @decomplexify
      @decomplexify  ปีที่แล้ว

      @@thorstenl.4928 Sure, I understand exactly what you're saying, and when it comes to many transaction tables that I've encountered in real life, I would agree with you. In many cases, a transaction table constitutes your only source of truth about "what really happened". But that's not true in all cases.
      For the purposes of my example from the video, I'm going to stipulate that "what really happens" is that when a player decides to purchase a certain quantity of a certain type of item, the system looks up the item type in the Item_Type_Daily_Prices table where Date = today, so as to obtain the Unit Price. The system then checks to see whether the player has enough game currency to go through with the purchase. If so, the system debits the appropriate amount of game currency from the player's balance, and adds the purchased items to the player's inventory. All of this constitutes "what really happened".
      Only once all this has been done does the system write out an Item_Purchase_Transaction, for informational purposes.
      If the Item_Purchase_Transaction is written out wrongly, for example if it's written out with a Unit Price that differs from the price that was actually used as the basis for debiting the player's balance, then, with regard to the price that really was charged, the Item_Purchase_Transaction will fail to reflect "what really happened".
      This can be mitigated by normalizing Item_Purchase_Transaction i.e. removing the Unit_Price column. If you don't explicity record the Unit_Price on the transaction then the transaction can never be in conflict with the unit price that really was charged.

    • @thorstenl.4928
      @thorstenl.4928 ปีที่แล้ว

      @@decomplexify I get you.
      Thanks for your efford. ;-)
      Great Channel by the way. Keep going…

  • @pauliusnarkevicius9959
    @pauliusnarkevicius9959 ปีที่แล้ว

    I clicked "Like" not for content, only for the headline.

  • @YasinNabi
    @YasinNabi ปีที่แล้ว

    “Successful people make money. It’s not that people who make money become successful, but that successful people attract money. They bring success to what they do.” - Wayne Dyer..

  • @vicsteiner
    @vicsteiner ปีที่แล้ว +1

    Does the so called advantage of a denormalized relation (which I understand in this case is put a being normalized by possible future states of the data model) wiuld persist if the original data model was actually in C. J. Date 6N form. I mean the price is initially dependent on the date, but then in the future it coul be dependent on timestamp and later on timestamp plus a state of the user given one or multiple business rules. So denormalization sounds like you are actually trying to normalize by changes in the data model in in the future and in a way I believe that's the reason to normalize up to 6N (I know the discussion is not so simple if you also read the work of David McGoveran). ButI also u derstand that if you are actually collecting the data you want it to be normalized even if in the future you have to change your model. On the other hand maybe if you are maintaining a data warehouse since you are not writing ñ, updating or deleting having denirmalized relations could be a bit less dangerous although I persobally think it makes the interpretation if the data (I mean its meaning or semantics) harder, specially considering that proper documentation on the data model and its relation to the data collection mechanisms and the reality it tries to represent are often inexistent. Anyway, nice discussion, and thanks for the video!

    • @decomplexify
      @decomplexify  ปีที่แล้ว +1

      Hi Victor, glad you liked the video, and thanks for your question. In the normalized design shown in the video, the Item_Type_Daily_Prices table actually is in 6NF. So 6NF isn't really helping us here with the issue I've raised, i.e. that we'd have to change our model if the conventions around pricing changed. In many respects, 6NF ensures flexibility. But not in this particular respect!

    • @vicsteiner
      @vicsteiner ปีที่แล้ว

      @@decomplexify I see. I've just finished reading C. J. Date book "Database Design and Relational Theory: Normal Forms and All That Jazz" so now I have it much more clear that normalisation and orthogonality can only reduce redundancy but not entirely eliminate it. The issue is much more complex than I thought initially. Even the definition of redundancy given might not be ideal and need further research it seems. For instance the use of quantifiers in predicates as an element for the existence of redundancy in a design is still a bit confusing in my mind. I really wish there was a study group on this but it is very odd as people are in general reactive to these discussions and just dismiss it as not being practical, similar to the reasons people go the easy "non-structured" ways and say that the relational model does not perform well.