As a lay person I always found the idea of a restaurant the best way to understand applications.
Waiter : Web Server
Chef : Application
Store Manager : DBMS
Storage Racks : SSD Library
Loading dock example was a great way to illustrate the concept, thanks!
In a future episode, can you cover a comparison between Data Lake & Data Mesh?
In a nutshell, data lakes store all kinds of data coming into the organization in a cost-effective manner, as they use cloud object storage, which scales almost without limit. They can become data swamps, because the data stored inside can also be inaccurate or duplicated, and such data cannot be used for querying or for Business Intelligence.
In order to use this data, it is cleaned first and then loaded into a data warehouse through an ETL process. The warehouse is easily queryable and can be used for BI and report generation. But it has two disadvantages:
1. The cost of a data warehouse is high.
2. Apps that want to consume fresh data may not get it from the data warehouse, as the ETL process takes time to load data into the warehouse.
Hence, to solve the shortcomings of both the data lake and the data warehouse, the concept of the data lakehouse was introduced.
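For anyone who wants to see the lake-to-warehouse flow above in code, here is a minimal sketch. It is only an illustration under assumptions: the `raw_lake` records, the `orders` table, and the `clean`/`load` helpers are all hypothetical, and an in-memory SQLite database stands in for a real warehouse.

```python
import sqlite3

# Hypothetical raw "lake" records: everything lands here as-is,
# including duplicates and values that cannot be parsed.
raw_lake = [
    {"order_id": 1, "amount": "19.99"},
    {"order_id": 1, "amount": "19.99"},         # duplicate
    {"order_id": 2, "amount": "not-a-number"},  # inaccurate
    {"order_id": 3, "amount": "5.00"},
]

def clean(records):
    """Transform step: drop unparseable rows and duplicate order ids."""
    seen = set()
    rows = []
    for record in records:
        try:
            amount = float(record["amount"])
        except (KeyError, ValueError):
            continue  # skip rows that fail validation
        if record["order_id"] in seen:
            continue  # skip duplicates
        seen.add(record["order_id"])
        rows.append((record["order_id"], amount))
    return rows

def load(rows):
    """Load step: write cleaned rows into a structured, queryable table."""
    conn = sqlite3.connect(":memory:")  # stand-in for a data warehouse
    conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    conn.commit()
    return conn

warehouse = load(clean(raw_lake))
total = warehouse.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
print(f"Rows loaded: {len(clean(raw_lake))}, total amount: {total:.2f}")
```

The freshness problem from point 2 shows up here too: a record added to `raw_lake` after `load` has run is not visible in `warehouse` until the whole clean-and-load step runs again.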
nice explanation, not too technical but really clear
OK, used Bard to help: Data Lakehouse:
Unifies the advantages of both data lakes and data warehouses, creating a single platform for all data needs.
Stores all data, structured and unstructured, in low-cost object storage like a data lake.
Applies metadata and schema to the data, like a data warehouse, enabling efficient querying and analysis.
Offers cost-effective storage, flexibility for exploration, and structured data for analysis.
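A rough sketch of the "metadata and schema on top of object storage" idea, in case it helps. Everything here is an assumption made for illustration: `object_store` is a plain dictionary standing in for cloud object storage, and `catalog` and `scan` are hypothetical names, not the API of any real lakehouse product.

```python
import json

# Hypothetical stand-in for cloud object storage: key -> raw file contents.
# Structured and unstructured files sit side by side, as in a data lake.
object_store = {
    "sales/2024-01.jsonl": '{"order_id": 1, "amount": 19.99}\n{"order_id": 2, "amount": 5.0}\n',
    "sales/2024-02.jsonl": '{"order_id": 3, "amount": 7.5}\n',
    "images/logo.png": "<binary>",
}

# Hypothetical metadata layer: maps a table name to the files that belong
# to it and to the schema applied when those files are read.
catalog = {
    "sales": {
        "prefix": "sales/",
        "schema": {"order_id": int, "amount": float},
    }
}

def scan(table):
    """Read a table's files in place and apply its schema to every row."""
    entry = catalog[table]
    for key in sorted(object_store):
        if not key.startswith(entry["prefix"]):
            continue  # this file is not part of the table
        for line in object_store[key].splitlines():
            row = json.loads(line)
            # Keep only the declared columns, cast to the declared types.
            yield {col: cast(row[col]) for col, cast in entry["schema"].items()}

rows = list(scan("sales"))
total = sum(row["amount"] for row in rows)
print(f"{len(rows)} rows, total amount {total:.2f}")
```

The data never leaves the cheap storage; only the catalog entry makes `sales` look like a table, which is the part borrowed from the warehouse side.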
Amazing video explaining the data structure using a simple method
Great video Luv. I like the analogy of food service prep that you used also.
The video is very clear in explaining the concepts. But one question that comes to my mind is in which situation a data warehouse would still be viable as a final destination for some of the tables built. Could a use case be optimized query performance that the lakehouse may lack?
brilliant video. best explained data lakehouse in almost 8 minutes. Thank you :)
Brilliant analogy! Invaluable info. Thank you.
Excellent presentation about DataLakeHouse
Great vid - would love to know how a data lakehouse works though
Data lakehouse architecture explainer coming soon!
any future videos showing real life examples?
Hey,
very clear and simple explanation. One big question from me: until now I used both terms synonymously, and I think that is not really correct. Is the staging area equal to the data lake? If not, what is the main difference between the two?
Thanks
Great video Luv! Amazing explanation!
Awesome and very clear video! By the way, how can you write backwards? 😅
- 5:43 the data doesn't lose its value per se (at least not in the same way as food does when it expires). E.g. data that is not "found" (not labelled, so nobody knows what it is) and data that is recognized as a duplicate of something else are not the same thing. In the first case you don't know what the value is, and in the second case the actual/original data has the same value as before and the copy of it has no value.
- Well, when it comes to having a lakehouse, the restaurant could force the supplier to dock at a special place to unload ONLY vegetables or ONLY meat, reducing the amount of "labeling" (obviously it costs extra to build different docks, and certain restaurants (small ones) may not be able to afford that). In the same way, a data lake could apply some data warehouse "principles" to increase the structured-ness and the possibility of "governance".
- It reminds me of the sci-fi writer Stanisław Lem's novel where he describes how wireless communication was "invented": "the engineers made the diameter of the wire by which the communication was done smaller... and then even smaller... and then a bit more... and at one point... there was no wire..." 🙂
Excellent video.. thanks
Great analogy, thanks Luv!!
Can you please explain about data mesh??
Excellent video. Thanks!
It was a wonderful explanation!! Thanks!
Great way to explain simply the use we can make of data
Good analogy, thx for explaining it!
Am I the only one mesmerized by how he can write backwards, while talking about complex concepts?
Haha, he doesn't. The video is flipped in post processing.
He works at IBM and learned that feat in an effective human communication crash course
Absolutely loved this!
You are the Best Luv!
Good video. Keep 'em coming!
Summary:
We encounter various types of data (unstructured, semi-structured, and structured) in our data lakes, sourced from different databases and various channels.
Our needs extend to powerful dashboards, business intelligence, and reports. So we establish an ETL path to transform this data into our enterprise warehouses, which contain domain-specific data tailored for particular use cases.
However, two critical issues arise concerning data governance and data quality, creating what can be likened to data swamps.
To address these challenges, developers came up with a solution that combines both aspects, known as a lakehouse. This approach provides a cost-effective, flexible, and high-performance structure, bundling everything into one cohesive system. This integrated system can be used for both business intelligence and machine learning processes.
great video *^^* thank you!
Great metaphor! Well done!
To the man in the mirror speaking to the outside world... Writing all flipped letters for us to understand -- thank you
Thank u very much❤❤
REAL-GOOD VIDEO❗😃
If the data is coming from an API and I want to store it in the database, how would I handle things like validation and access during the data load?
I would like to focus on my meal really :D
JK. Amazing video. Keep up the good work.
Great vid!
Thanks
Great!
❤
Is he writing backwards? How is this filmed??
See ibm.biz/write-backwards
God bless you
The whole time I am wondering how he is writing like this. It's better to watch it at 1.5x speed
It would be more efficient to use graphics instead of the painfully slow drawings
No -- it reflects human thought process.
I wish these videos went straight to the topic.... meals? Trucks?.... I'm out
data lake is such a useless term. what does this mean in tech terms ?