28:32 How does Delta Lake work?
28:50 Delta On Disk
29:59 Table = result of a set of actions
31:31 Implementing Atomicity
32:48 Ensuring Serializability
33:33 Solving Conflicts Optimistically
35:08 Handling Massive Metadata
36:32 Roadmap
38:20 QnA
Hi Kim, thanks for the support. Where are you from? I'm from India.
I see this whole "Hierarchical Data Pipeline" strategy being talked about quite a bit these days. We built this as part of a ready-made solution for a manufacturing use case using Confluent Kafka + KSQL. But I believe the data lake will remain as a depot for long-term retention of data, with AI/DA platforms pulling from it for batch processing. I see this story from Databricks as a data-warehouse convergence toward data lakes!
The architecture comes with a nice VLDB 2020 paper (which the presenter did not mention).
Please give all employees a better audio recording microphone.
Or use some AI audio cleaner :D
27:28 on automating data quality... isn't it the same as the quality checks we already do with custom code before saving? Will there be any additional benefits?
It's probably the same, but I'm not sure how you could do that consistently on a data lake. As described here, Delta appears to make it easier and makes it possible to do it as if you were working on a relational database.
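For anyone comparing this to custom validation code, here's a minimal sketch of the declarative version using an open-source Delta CHECK constraint. The table, column, and constraint names are made up for illustration; the point is that the rule lives with the table, so every writer is checked, not just the jobs that remembered to run the validation code.

```python
# Sketch: declarative data quality via a Delta Lake CHECK constraint.
# Table name "events" and column "amount" are hypothetical.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("delta-quality-sketch")
    # These two settings are needed to use Delta Lake outside Databricks.
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

spark.sql("CREATE TABLE IF NOT EXISTS events (id BIGINT, amount DOUBLE) USING DELTA")

# The constraint is stored in the table metadata and enforced on every write.
spark.sql("ALTER TABLE events ADD CONSTRAINT amount_non_negative CHECK (amount >= 0)")

# A transaction containing a violating row fails atomically -- nothing is committed.
try:
    spark.sql("INSERT INTO events VALUES (1, -5.0)")
except Exception as e:
    print("rejected:", e)
```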
Excellent features!!
Wait, people still use Comcast and watch TV?
What is the best way to load data from SQL Server to Delta Lake every 5 seconds?
Debezium
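To expand on the Debezium suggestion, a rough sketch of one common setup: SQL Server -> Debezium CDC -> Kafka -> Spark Structured Streaming with a 5-second trigger writing to Delta. The broker address, topic name, record schema, and paths below are placeholders, and the Debezium envelope is simplified (it assumes the JSON converter without the schema wrapper).

```python
# Sketch: stream Debezium CDC records from Kafka into a Delta table every 5 seconds.
# Assumes the spark-sql-kafka and delta-spark packages are on the classpath
# and the Delta SQL extension/catalog configs are set on the session.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StructField, LongType, StringType

spark = SparkSession.builder.appName("cdc-to-delta-sketch").getOrCreate()

# Shape of the Debezium "after" image for a hypothetical dbo.orders table.
after_schema = StructType([
    StructField("order_id", LongType()),
    StructField("status", StringType()),
])
envelope = StructType([StructField("after", after_schema)])

cdc = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")        # placeholder broker
    .option("subscribe", "sqlserver.dbo.orders")              # placeholder topic
    .load()
    .select(from_json(col("value").cast("string"), envelope).alias("msg"))
    .select("msg.after.*")
)

(
    cdc.writeStream.format("delta")
    .option("checkpointLocation", "/tmp/checkpoints/orders")  # placeholder path
    .trigger(processingTime="5 seconds")                      # the "every 5 seconds" part
    .start("/tmp/delta/orders")                               # placeholder table path
)
```

This just appends the "after" images; to apply updates and deletes you would typically use foreachBatch with a Delta MERGE instead of a plain append.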
Where can I download the slides? Thanks!
Thank you, and can you please share the PPT?
www.slideshare.net/databricks/making-apache-spark-better-with-delta-lake
@張博凱-p7z Many thanks!