Agreed! Conversational tone, appropriate visuals, and well-paced. I'm in a $4k EDW graduate school class that teaches from bulleted PowerPoint slides. Doesn't compare 🙆😟
The most understandable, detailed & overall introduction to Dimensional Modeling, with clear key words explanation and logically sequential arrangement on the slide content. Thanks Bryan!
This is the most informative session I have ever seen! 54 minutes of pure knowledge. If only everything I had an interest in learning was taught by Bryan!
I come from an OLTP software data architect side of the house and only had to hand off streaming/replication to an ODS. I just took a role that will require dimensional modeling. So glad I found your video. This clicks.
My current favorite channel on TH-cam being a senior data engineer. I would like to have a video created by you telling more about what cubes are, OLAP, ROLAP, etc. This kind of nomenclature is getting more and more rare and people joining data engineering in this modern data stack area should understand them well so that they can understand how we got here and also talk to people familiar with this data engineering nomenclature.
ROLAP and MOLAP are only used in the legacy SSAS Multidimensional Models. They were cool but the Tabular model does not use them b/c everything is in memory. Thanks for the suggestion. Are you using SSAS Multidimensional Models?
Hi Bryan, many thanks to provide this tutorial. The most solid foundation tutorial on data warehouse & dimensional modelling I have even seen. For those who want to build their career path as a data analyst, data architect etc. This is the best start point. I really regret that I did not see this video 2 years ago when I start off my career.
These are the best 53 minutes I've spent recently. Crisp and Clear. I feel much more confident and I request more videos on Dimensional Modeling. Thank you for your effort.
Thanks. I did do a video on Slowly Changing Dimensions which is a big area for interview questions. What topics are you most interested in regarding dimensional modeling?
I'm grateful for the time and efforts you put in explaining one of the key concepts anyone deals with enterrpise data needs. Clear and comprehensive. Thanks a lot
I dont usually spend time watching a 53 min learning video most of the time unless the course and the orator is really worth it. Trust me sir, you kept me glued every minute. God bless you Sir !
A very comprehensive and fully understandable video during my very first viewing itself. I felt like having read a high quality book on the topic within an hour. Bryan never wastes time on unnecessary talks and very effectively concentrates on the core through and through. Thank you!!
@@BryanCafferky My statement was true. Not just a passing compliment. I got a full perspective on what is dimension modelling and how it is different from Data warehousing. Leave alone warehousing, it actually helped me in designing various reports using data from Database tables. I was able to see the common threads that run through them all.
I have been working in data warehousing for the last 5 years and this video gave me the answer I have looking for so long- why is there so much normalisation in data warehouses we see today? Nobody ever gave me a satisfactory answer but you did sir. Big thanks !! This video deserves to be on the billboard.
I wish I had watched it earlier when i was struggling to understand Datawarehousing and Dimension Modeling. Very informative video. Every second was worth spending.
Good sir, I typically never like or comment TH-cam videos. This was a must. You are masterful, and could quite possibly mint some people into data engineers off of this video. God bless you. I hope to become a teacher like you one day
A very useful introduction to data warehousing and the common terminologies, presented in an interesting and easy to understand manner. IT was a very quick 53 minutes - I only meant to get an idea of the content and then watch it later - but got completely absorbed.
This is an excellent primer. The best one I've seen. I'll recommend it if anyone asks for advice on a good starting point. One point however: USER STORY. In 99% of Agile Software Development, the example from that book is NOT a User Story. A user story actually describes something that needs to be done and why. e.g. As a business manager I need to be able to drill down into the types of sales act my book stores to see the types of books being purchased, how they are being purchased, how they are being paid for, down to the level of the book title, so that I can understand what is being sold and better stock my stores. What the book you mention is actually doing is closer to a Use Case. There is semantic impedance going on. But it's just data. :) Anyway, I really do like this video and think the approach of using use cases is very good. I think though that there will be some need to translate between different groups kind of like what is a location to different systems. :)
Thank you for this excellent presentation on Dimensional Modeling. I'm a student and it was so easy to follow because you made it interesting and you provided some excellent examples to support your slides.
This is a must watch video for anyone having a hard time understanding Dimensional Modeling. Wish you could do a full series on Database systems and Warehousing.
Hi Bryan, This video is one of the best videos have watched and has every information required, to the point and well described...Hats off...Thank You. Highly recommend must watch for everyone who is working in DWH domain.
Thank you so much!! Struggling through my data modeling/structuring course and your video was incredibly helpful in understanding dimensional modeling.
Excellent Video and loved it. Being an OLTP modeler, this gave me a very nice idea about dimension modeling. The only one thing which I wanted to bring it to your attention that, When you talked about 7W's of dimensional modeling, it only had 6 Ws and I was searching for the 7th one :)
Hi Sathya, Great observation! Actually, the 'How Many?' is one of the 7 but yeah, it is an H not a W. The book 'Agile Data Warehouse Design' presents it that way so I did to. It should be the 6 W's and 1 H of Data Warehouse Design but that is harder to say. :-) Actually, it helps me remember the How because it is different. Thanks!
I listed out some topics (after my failed interview )to gain clearcut understanding and this video answered all my questions in detail,Sir big thumbs to you ,if possible please do a video on interview questions and how to answer them (Dw,DBMS concepts).
Hi Sai, That's a great idea for a video. Do you have any specific questions in mind? BTW: Interviews always have questions to stump you. But I'd like to help with a video that helps. Thanks! Bryan.
Thanks, This is very helpful. When you discussed the scenarios regarding a person getting married, it triggered a bunch of other questions I can ask for my project. I enjoyed the descriptions of what is fact vs what is dimension.
This is a phenomenal presentation on dimensional modeling but i don’t understand the implementation of surrogate keys. I feel like I’m missing an obvious and low compute way of maintaining all the surrogate keys on your facts. No videos I’ve seen discuss this. But it seems every time a new fact record is generated you have to join every related dim on the foreign natural key and update the fact with the dim’s related surrogate key. So that you can later perform joins using the surrogate key. Am I thinking through this correctly?
Yeah. It does add complexity but you have the gist of it. Surrogate keys are particuarly important when you want SCD history since natural keys would result in duplicate keys on the dim tables. Also, they isolate changes from the backend systems to the dw. But they do add some extra work.
Excellent video. I do have a question at 15:57 . I was unable to understand how the design of the SalesFact table differs from what the OLTP table for Sales would have been. My OLTP Sales table would have been almost identical to SalesFact shown in this presentation, with the exception of a SalesLineItem.
Yeah. I had a hard time seeing the difference in the beginning too. Good question. In the OLTP table at about 23:09, notice the sample table has descriptive columns like CustomerFirstName, CustomerLastName, Product, SalesDate, Order Number, and OrderLineNumber. These are Dimension attributes. It also has Quantity and Price which are Facts or Measures. In a Star Schema, these cannot be in the same table. The Dimension attributes are stored in a separate table that has its own Primary Key. The Facts are stored in a Fact table with a foreign key to the Dimension table. The Dimension table Primary Key is called a Surrogate key and is usually an auto-generated identity column. There is no effort to reduce data redundancy for Dimensions, i.e. you could have Product Category values in the same table with Product Model values. It is not efficient to maintain the data that way which is why OLTP design would not do this. But it is fine for a Star Schema, i.e. Dimensional Modeled design. Make sense?
@BryanCafferky Thank you for taking the time to put this together. This is a great foundational video for anyone getting started and presents the subject in a very relatable way.
I learned so much from this one video. Thank you! Also, 23:51 "snowflake is something you may get questioned in an interview, so wake up" I feel personally attacked lol. I wasn't sleeping (your video was long but not boring at all) but I DID get asked about this in a recent interview and I totally flopped. At least now I can answer that question :)
Hi Bryan, great tutorial! I've been dimensional modeling for quite some time but it's always good to review the basics again. I was wondering if you might have some thoughts on a fact at the atomic grain for a order header/detail fact where your order lines have different dimension attributes. For example, imagine a fact for vehicle maintenance where 2 or 3 line numbers might be for labor with 3 different employees, and 5 line numbers for different parts. This is all easy to rollup, but getting it to the atomic grain leaves a lot of zero keys for the dimensions that aren't applicable. Would be curious to see your thoughts on this. Thanks!
My thought is to consolidate the data (employye, parts, etc.) into a single dimension table and add a column, charge type with values like 'Emp', 'Part'. Assign a surrogate key and point the fact table to this. The line detail points to the related charges which is now in one table.
@@BryanCafferky Hi Bryan, thanks for the reply. Interesting. So if you had 1 line item with 1 employee and 3 parts, then another line item with the same employee and 2 parts, I'm guessing that you'd have to calculate ratios of hours/costs for the labor to each row that's got a part...otherwise, you'd have repeating values (i.e., double-counting) of the labor. In my original thought, it avoids that by having a row for each record type, but I get why that's a problem. Another approach is to just create a fact for each type of line item detail, but then a work order (similar to a sales order) would be broken up among multiple line item detail facts and make drill down impossible.
@@craigdubin6325 If feasible, creating the DW at the lowest grain you may ever need gives you a lot of future flexibility. Your second reply indicate each line item has both an employee(s) and part(s) so you could just aggregate for reporting. Maybe aggregate to the employee/part level for drill down or whatever the business needs.
@@BryanCafferky Yup, I always go down to the atomic grain if possible. And I think I probably made it more confusing. A work order is comprised of basically 2 different types of records, 1) labor, 2) parts. So imagine a quote or invoice from a mechanic. You might have the summary costs at the top with total labor and total parts. But there will be n number of lines with only labor charges, and n number of lines with only parts. They're never on the same line. So quantity on a labor line might be hours for the employee on that labor line, unit cost would be dollars per hour, and total cost would be hours x unit cost. On the parts lines, the quantity would be the number of the specific part number, the unit cost would be the cost of the particular part, and the total cost would be quantity x unit cost. My concern with this method was that the dim_employee_key will always be 0 (or possibly could make it -1 for N/A) on the parts line...while the dim_part_key will always be 0 (or -1 as above) for the part number on the labor lines. Hopefully that's making more sense!
I love this video, the facts, and color commentary you present with it. But what is the relevance of star schema (vs wide flat) for consumption by analysis tools such as Tableau which implicitly and automatically creates a high performing dimensional model from a flat view without a human needing to do any dimensional data development? I would love to hear your thoughts on this.
Thanks. Sure. First, a Star Schema is arranged for efficient and easy data analysis for Tableau, Power BI, or any other tool. There is more to the world than Tableau. The use of surrogate keys supports inclusion of dimension history, i.e. slowly changing dimensions. Thinking things out like conformed dimensions builds in extensibility. The dimensional model creates the foundation of your data which will sustain your organization long term through many changes in the reporting tools that consume the data. The second reason to use a Star Schema is performance when used by reporting tools. You need to load the data and that process can be slow if a lot of joins are needed to get the data together. The Star Schema connects the fact to dimension tabless in one join. No need to join dimension to dimension tables. See medium.com/data-ops/why-do-i-need-a-star-schema-338c1b029430 Many organizations don't want to spend the effort to build a star schema but then run into problems and build hacks to solve them. It's pay now or pay later.
I've seen hours of videos on data warehousing. These are the most valuable 56 minutes.
Well said
53 minutes*
100% agreed
Could not agree more, this is the best video on data warehousing I have watched so far!!
Agreed! Conversational tone, appropriate visuals, and well-paced. I'm in a $4k EDW graduate school class that teaches from bulleted PowerPoint slides. Doesn't compare 🙆😟
This is by far the best guide to dimensional modeling I've found on the internet!
The most understandable, detailed & overall introduction to Dimensional Modeling, with clear key words explanation and logically sequential arrangement on the slide content. Thanks Bryan!
This is the most informative session I have ever seen! 54 minutes of pure knowledge. If only everything I had an interest in learning was taught by Bryan!
Thanks. What are your tech learning interests?
I come from an OLTP software data architect side of the house and only had to hand off streaming/replication to an ODS. I just took a role that will require dimensional modeling. So glad I found your video. This clicks.
Glad it is helpful.
My current favorite channel on TH-cam being a senior data engineer. I would like to have a video created by you telling more about what cubes are, OLAP, ROLAP, etc. This kind of nomenclature is getting more and more rare and people joining data engineering in this modern data stack area should understand them well so that they can understand how we got here and also talk to people familiar with this data engineering nomenclature.
ROLAP and MOLAP are only used in the legacy SSAS Multidimensional Models. They were cool but the Tabular model does not use them b/c everything is in memory. Thanks for the suggestion. Are you using SSAS Multidimensional Models?
Using for an interview prep. Great refresher. Perfect length and depth of content...
Hi Bryan, many thanks to provide this tutorial. The most solid foundation tutorial on data warehouse & dimensional modelling I have even seen. For those who want to build their career path as a data analyst, data architect etc. This is the best start point. I really regret that I did not see this video 2 years ago when I start off my career.
You're very welcome!
By far the MOST lucid and practical no-nonsense explanation of key terms. Loved it!
Glad it is helpful! Thank you for the kind words.
These are the best 53 minutes I've spent recently. Crisp and Clear. I feel much more confident and I request more videos on Dimensional Modeling. Thank you for your effort.
Thanks. I did do a video on Slowly Changing Dimensions which is a big area for interview questions. What topics are you most interested in regarding dimensional modeling?
Perfect combination of experience and theoretical handle on the subject. Thank you for the time it took to record it.
You're welcome. Glad it was helpful.
I'm grateful for the time and efforts you put in explaining one of the key concepts anyone deals with enterrpise data needs. Clear and comprehensive. Thanks a lot
I'm about 4 or 5 years late to this party. You've still done a great job in this video compared to many other sources. Thanks, and well done!
YW and thanks for watching.
I dont usually spend time watching a 53 min learning video most of the time unless the course and the orator is really worth it. Trust me sir, you kept me glued every minute. God bless you Sir
!
Thank you!
A very comprehensive and fully understandable video during my very first viewing itself. I felt like having read a high quality book on the topic within an hour. Bryan never wastes time on unnecessary talks and very effectively concentrates on the core through and through. Thank you!!
Thanks
@@BryanCafferky My statement was true. Not just a passing compliment. I got a full perspective on what is dimension modelling and how it is different from Data warehousing.
Leave alone warehousing, it actually helped me in designing various reports using data from Database tables. I was able to see the common threads that run through them all.
I have been working in data warehousing for the last 5 years and this video gave me the answer I have looking for so long- why is there so much normalisation in data warehouses we see today? Nobody ever gave me a satisfactory answer but you did sir. Big thanks !! This video deserves to be on the billboard.
Great! Glad it helped!
Amazing video. Well explained. Uploaded 3 years ago, still valid for educational and professional training for 2020. Thanks Bryan!
I wish I had watched it earlier when i was struggling to understand Datawarehousing and Dimension Modeling. Very informative video. Every second was worth spending.
Thanks. Yeah. I struggled understanding this too for some time. Glad it helped.
Good sir, I typically never like or comment TH-cam videos. This was a must. You are masterful, and could quite possibly mint some people into data engineers off of this video. God bless you. I hope to become a teacher like you one day
Thank You
A very useful introduction to data warehousing and the common terminologies, presented in an interesting and easy to understand manner. IT was a very quick 53 minutes - I only meant to get an idea of the content and then watch it later - but got completely absorbed.
Thanks. Glad you liked it.
Lovely way of teaching.
Looking up for more of your material about data warehouse on the web.
Many Thanks !! This is the best video I have even seen on this subject. Simple explanation of all complicated areas.
This is premium content on the topic. Simple yet effective explanation shows your understanding of the subject. Thank you Bryan!
Glad it was helpful! Thanks
The concept has been explained thoroughly with real time examples. Thanks Bryan.
YW. Thanks for watching.
An excellent video on data warehousing, easily the best I've seen.
Quality. Seeing your videos, I realise that this is the subsection of my role that I enjoy the most and need to learn more about. Great video.
Great presentation!👍 Thank you Bryan ! 👏
YW. Glad it is helpful.
this is great Dimensional Modeling tutorial ever.. !! Thank you so much, sir..
This is what I subscribe internet for, it's beautiful piece of 53 mins straight
Thanks. Hope you check out my other videos like the ones on Databricks and Python too.
The ideal video for a dba trying to reach the dw world (using mssql server and also Azure) . Thanks a lot for the video !
Glad it helped!
This is so helpful. I'm modeling a data warehouse for my Org and only have experience with OLTP. This has saved so much headache
Great! Yeah. Dimensional Modeling is a very different mindset. Good luck!
This is an excellent primer. The best one I've seen. I'll recommend it if anyone asks for advice on a good starting point. One point however: USER STORY. In 99% of Agile Software Development, the example from that book is NOT a User Story. A user story actually describes something that needs to be done and why. e.g. As a business manager I need to be able to drill down into the types of sales act my book stores to see the types of books being purchased, how they are being purchased, how they are being paid for, down to the level of the book title, so that I can understand what is being sold and better stock my stores. What the book you mention is actually doing is closer to a Use Case. There is semantic impedance going on. But it's just data. :) Anyway, I really do like this video and think the approach of using use cases is very good. I think though that there will be some need to translate between different groups kind of like what is a location to different systems. :)
Thanks!
Thank you for this excellent presentation on Dimensional Modeling. I'm a student and it was so easy to follow because you made it interesting and you provided some excellent examples to support your slides.
Thanks. Glad it helped. Hope you find my other videos helpful too.
Great explanation of dimensional modelling. Highly insightful.
Thank you for taking the time to put this together. I found it very educational and helpful!
You're welcome and thanks for the kind words.
Amazing simple and focused explanations.
Thanks Bryan
You're Welcome! Glad it was helpful.
among several videos i watched on dimensional modeling, this is the one with more insight and experience sharing!
This was an excellent primer. I was alert and it really grounded me on a number of key points. Thank you so much for this contribution.
This is a must watch video for anyone having a hard time understanding Dimensional Modeling. Wish you could do a full series on Database systems and Warehousing.
Thanks. What specific topics are you thinking of?
AMAZING lecture, Bryan - thanks so much! Exactly what I was looking for and an extremely well articulated 56 minutes.
Great Job Bryan, great content .Thanks for sharing
Thanks!
Very well presented! Clear and concise with real-world use cases!
The best video I have ever watched till date. Very well explained and neatly presented. This video helped a lot!!!
Glad it was helpful!
Thank you Bryan for this video. You did an excellent job of explaining the concepts data warehouse design.
Thanks!
A nice and concise presentation of dimensional modeling for data warehouse.
I am listening you for the first time and I found out that you are a great teacher.
Thanks!
@@BryanCafferky After an year I am listening again to refresh concepts.
Great tutorial sir , Thank you so much for such relevent information .
You are most welcome
You are amazing!! Thank you so much for this. Best summary you can get and which can make you talk like a pro..
Thanks.
For someone just getting started, this was amazing thank you so much!
YW!
Hi Bryan, This video is one of the best videos have watched and has every information required, to the point and well described...Hats off...Thank You.
Highly recommend must watch for everyone who is working in DWH domain.
Thanks! Appreciate the kind words.
Excellent video! Thanks for sharing
YW
Very well done and explained. Thank you for sharing your knowledge.
Thanks for watching.
Excellent, specially focus on process model. Thank you
YW!
This is the first time I loved blue screen on my computer. :) Very good advice.
Great content Bryan. Great level of detail and insights (from actual experience). Please keep it up !
Excellent explanation Bryan!!
Thanks Bryan, great video.
Thank you so much!! Struggling through my data modeling/structuring course and your video was incredibly helpful in understanding dimensional modeling.
Wow! Really glad to hear that. Thanks for letting me know.
Excellent Video and loved it. Being an OLTP modeler, this gave me a very nice idea about dimension modeling. The only one thing which I wanted to bring it to your attention that, When you talked about 7W's of dimensional modeling, it only had 6 Ws and I was searching for the 7th one :)
Hi Sathya, Great observation! Actually, the 'How Many?' is one of the 7 but yeah, it is an H not a W. The book 'Agile Data Warehouse Design' presents it that way so I did to. It should be the 6 W's and 1 H of Data Warehouse Design but that is harder to say. :-) Actually, it helps me remember the How because it is different. Thanks!
@@BryanCafferky
hoW?
What?
When?
Where?
Who?
hoW many?
Why?
makes 7.
Thanks for the video. Very informative!!!👍
I listed out some topics (after my failed interview )to gain clearcut understanding and this video answered all my questions in detail,Sir big thumbs to you ,if possible please do a video on interview questions and how to answer them (Dw,DBMS concepts).
Hi Sai, That's a great idea for a video. Do you have any specific questions in mind? BTW: Interviews always have questions to stump you. But I'd like to help with a video that helps.
Thanks! Bryan.
It's a great content and presentation. Thank you very much for this wonderful work!
YW!
Amazing video, very detail level. Thanks so much.
Amazing explanation
Thanks!
Thanks, This is very helpful. When you discussed the scenarios regarding a person getting married, it triggered a bunch of other questions I can ask for my project. I enjoyed the descriptions of what is fact vs what is dimension.
Great video Bryan.
This is a phenomenal presentation on dimensional modeling but i don’t understand the implementation of surrogate keys. I feel like I’m missing an obvious and low compute way of maintaining all the surrogate keys on your facts. No videos I’ve seen discuss this. But it seems every time a new fact record is generated you have to join every related dim on the foreign natural key and update the fact with the dim’s related surrogate key. So that you can later perform joins using the surrogate key. Am I thinking through this correctly?
Yeah. It does add complexity but you have the gist of it. Surrogate keys are particuarly important when you want SCD history since natural keys would result in duplicate keys on the dim tables. Also, they isolate changes from the backend systems to the dw. But they do add some extra work.
Thank you for confirming my understanding and great presentation!
@Bryan Cafferky - thank you for creating this great video. Its really a marvel , simple, realistic approach to understand.
Great presenation. Very clear and to the point!
One of the best videos I’ve seen on DM!!
Thank you, loved the content and how well it was structured and presented. Looking forward to your other tutorials!
Absolutely amazing video, thanks Bryan!
YW. Glad it was helpful.
Great video, great explanation!
Excellent job! Thank you for this wonderful video!
Great lecture, definitely worth the watch!
best video on data warehousing on youtube..
Thank you so much for sharing your knowledge and your skills to teach them in a so clean, so comprehensible.
Thanks. So glad it is helpful!
This video was perfect to answer my questions! Thank you!
Glad to hear that!
Excellent video. I do have a question at 15:57 . I was unable to understand how the design of the SalesFact table differs from what the OLTP table for Sales would have been. My OLTP Sales table would have been almost identical to SalesFact shown in this presentation, with the exception of a SalesLineItem.
Yeah. I had a hard time seeing the difference in the beginning too. Good question. In the OLTP table at about 23:09, notice the sample table has descriptive columns like CustomerFirstName, CustomerLastName, Product, SalesDate, Order Number, and OrderLineNumber. These are Dimension attributes. It also has Quantity and Price which are Facts or Measures. In a Star Schema, these cannot be in the same table. The Dimension attributes are stored in a separate table that has its own Primary Key. The Facts are stored in a Fact table with a foreign key to the Dimension table. The Dimension table Primary Key is called a Surrogate key and is usually an auto-generated identity column. There is no effort to reduce data redundancy for Dimensions, i.e. you could have Product Category values in the same table with Product Model values. It is not efficient to maintain the data that way which is why OLTP design would not do this. But it is fine for a Star Schema, i.e. Dimensional Modeled design. Make sense?
Absolutely amazing explanations.
@BryanCafferky Thank you for taking the time to put this together. This is a great foundational video for anyone getting started and presents the subject in a very relatable way.
Thanks.
I'm getting into data engineering and I really enjoyed this content.
Amazing lecture. Thanks Sir. 🙏🙏🙏
Thank you so much for this! Very organized lecture, and I love how you included the time stamps.
Hey Bryan! This is amazing, thank you for the great video. Could we get a download link for the slide deck?
I learned so much from this one video. Thank you! Also, 23:51 "snowflake is something you may get questioned in an interview, so wake up" I feel personally attacked lol. I wasn't sleeping (your video was long but not boring at all) but I DID get asked about this in a recent interview and I totally flopped. At least now I can answer that question :)
Yes. I find I always remember answers to interview questions I missed.
Thank you sir, you saved my life.
Wow. well detail and explainable. Thanks Bryan
YW!
Hi Bryan, great tutorial! I've been dimensional modeling for quite some time but it's always good to review the basics again. I was wondering if you might have some thoughts on a fact at the atomic grain for a order header/detail fact where your order lines have different dimension attributes. For example, imagine a fact for vehicle maintenance where 2 or 3 line numbers might be for labor with 3 different employees, and 5 line numbers for different parts. This is all easy to rollup, but getting it to the atomic grain leaves a lot of zero keys for the dimensions that aren't applicable. Would be curious to see your thoughts on this. Thanks!
My thought is to consolidate the data (employye, parts, etc.) into a single dimension table and add a column, charge type with values like 'Emp', 'Part'. Assign a surrogate key and point the fact table to this. The line detail points to the related charges which is now in one table.
@@BryanCafferky Hi Bryan, thanks for the reply. Interesting. So if you had 1 line item with 1 employee and 3 parts, then another line item with the same employee and 2 parts, I'm guessing that you'd have to calculate ratios of hours/costs for the labor to each row that's got a part...otherwise, you'd have repeating values (i.e., double-counting) of the labor. In my original thought, it avoids that by having a row for each record type, but I get why that's a problem. Another approach is to just create a fact for each type of line item detail, but then a work order (similar to a sales order) would be broken up among multiple line item detail facts and make drill down impossible.
@@craigdubin6325 If feasible, creating the DW at the lowest grain you may ever need gives you a lot of future flexibility. Your second reply indicate each line item has both an employee(s) and part(s) so you could just aggregate for reporting. Maybe aggregate to the employee/part level for drill down or whatever the business needs.
@@BryanCafferky Yup, I always go down to the atomic grain if possible. And I think I probably made it more confusing. A work order is comprised of basically 2 different types of records, 1) labor, 2) parts. So imagine a quote or invoice from a mechanic. You might have the summary costs at the top with total labor and total parts. But there will be n number of lines with only labor charges, and n number of lines with only parts. They're never on the same line. So quantity on a labor line might be hours for the employee on that labor line, unit cost would be dollars per hour, and total cost would be hours x unit cost. On the parts lines, the quantity would be the number of the specific part number, the unit cost would be the cost of the particular part, and the total cost would be quantity x unit cost. My concern with this method was that the dim_employee_key will always be 0 (or possibly could make it -1 for N/A) on the parts line...while the dim_part_key will always be 0 (or -1 as above) for the part number on the labor lines. Hopefully that's making more sense!
Priceless resource. Much appreciated!
Excellent and very informative video. Thank you very much.
Thank you very much for this GREAT presentation!!!!
Excellent explanation! Thank you for this.
This was very informative and also has few drawbacks, I want to know if you have done video on Data Vault 2.0 with Dimensional Modeling ???
Thanks. I have not done any videos on Data Vault yet.
Best video on DW design ever!
I love this video, the facts, and color commentary you present with it. But what is the relevance of star schema (vs wide flat) for consumption by analysis tools such as Tableau which implicitly and automatically creates a high performing dimensional model from a flat view without a human needing to do any dimensional data development? I would love to hear your thoughts on this.
Thanks. Sure.
First, a Star Schema is arranged for efficient and easy data analysis for Tableau, Power BI, or any other tool. There is more to the world than Tableau. The use of surrogate keys supports inclusion of dimension history, i.e. slowly changing dimensions. Thinking things out like conformed dimensions builds in extensibility. The dimensional model creates the foundation of your data which will sustain your organization long term through many changes in the reporting tools that consume the data.
The second reason to use a Star Schema is performance when used by reporting tools. You need to load the data and that process can be slow if a lot of joins are needed to get the data together. The Star Schema connects the fact to dimension tabless in one join. No need to join dimension to dimension tables. See medium.com/data-ops/why-do-i-need-a-star-schema-338c1b029430
Many organizations don't want to spend the effort to build a star schema but then run into problems and build hacks to solve them. It's pay now or pay later.
Thank you for this worderful session.Very clear and informative.Really enjoyed it.
Thanks!
You're a savior, thank you Bryan.
YW. Glad it helps.
Great video. Well explained !
Great video, and very helpful. I've been struggling with the dryness of the Kimball book.
Thanks. Yeah. The book can be a bit dry.
outstanding overview, well done