It's a very nice talk with a lot of energy. That energy is contagious; it draws a smile from me. Thanks, guys!
We have been using a Data Lakehouse, mostly with SQL serverless, for around a year now, and it works very well for our customers. We have to think differently when designing the ETL/ELT, but we basically still end up with a snowflake schema and great performance. However, it requires a tool like Power BI Premium for caching data, because of the rather slow and unpredictable latency of the serverless pool.
Yup, we lost the fast data retrieval of a SQL Server and instead have to deal with Parquet file conversions..
We are looking to implement a DW, DL, and/or DLH where I work, so I am EXTREMELY looking forward to those videos. Thank you!!!
Great overview guys. Love it!
Yep, we're now building out a company-wide lakehouse; early days at the moment, but this thing is absolutely fantastic. Especially loving Delta tables.
I hate Delta parquet.. 😂
Convert files upon staging, build delta logic and shit... then everything sucks when handling upserts and deletes.
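The pain with upserts and deletes on immutable file formats is that you can't update a row in place; the engine has to rewrite the affected files. A toy copy-on-write merge in plain Python (dicts standing in for Parquet files, keyed by row id — purely an illustration, not how Delta is implemented internally) shows the rewrite:

```python
def merge_upsert(existing: dict, updates: dict, deletes: set) -> dict:
    """Copy-on-write merge: produce a brand-new 'file' (dict) with
    updates applied and deleted keys dropped. The old version is
    never mutated, mirroring how Delta rewrites Parquet files
    rather than editing them in place."""
    merged = {k: v for k, v in existing.items() if k not in deletes}
    merged.update(updates)
    return merged

old = {1: "a", 2: "b", 3: "c"}
new = merge_upsert(old, updates={2: "B", 4: "d"}, deletes={3})
# old is untouched; new reflects the upserts and the delete
```

Every small change pays the cost of writing a whole new file version, which is exactly why frequent upserts feel so much heavier than a `MERGE` against a relational table.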
Hi, please help me. I have created an external table in the Synapse lake database. Now I would like to load the records from the external table into a dedicated SQL pool table. Please advise on the procedure.
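One common route (a sketch, not an official procedure) is to run a CTAS (`CREATE TABLE AS SELECT`) statement on the dedicated SQL pool that selects from the external table. A minimal Python helper that builds such a statement might look like this; the table names and the distribution option are placeholder assumptions:

```python
def build_ctas(target: str, source: str, distribution: str = "ROUND_ROBIN") -> str:
    """Build a T-SQL CTAS statement that copies an external table's rows
    into a new dedicated SQL pool table. Names are assumed, not validated."""
    return (
        f"CREATE TABLE {target} "
        f"WITH (DISTRIBUTION = {distribution}) "
        f"AS SELECT * FROM {source};"
    )

# Hypothetical names: dbo.FactSales is the new pool table,
# ext.FactSales is the external table defined over the lake.
sql = build_ctas("dbo.FactSales", "ext.FactSales")
print(sql)
```

You would then execute the generated statement against the dedicated pool (e.g. via SSMS, Synapse Studio, or pyodbc); CTAS materializes the rows in the pool's own storage rather than reading the lake at query time.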
Great content and engaging, worth every minute.
Where does the concept of data marts (not Power BI datamarts) fit into this data lakehouse combo?
Great video, but using the Azure Machine Learning icon for the data lake is confusing for me 😅
This is exactly what we use. It is great for ad hoc analysis as well.
When the Data Warehouse and the Data Lake went out for a few beers... was it in Belgium?
Where else? 😅
Great content as always! Awaiting the slowed-down version with some hands-on!!
Chris Wagner is pointing us toward Synapse Analytics, I got a book and would love to also watch GIAC videos!
Amazing information, Thank You
Great video! At my company we are currently trying to identify which architecture is best to unify the many data silos we have. Looks like I can show them a possible solution ;)
Waiting for the slowed-down version, Patrick!!
Great video, thank you!!!
I wonder if the increase in remote work impacted naming conventions? …data lakes…data streams…lakehouse…I am starting to feel like someone has a better view while they work.
Neat. We’ve been taking this approach for a couple years now but didn’t give it a fancy name.
Yeah, Fabric is nothing new to me...
Data Factory and Synapse existed 3 years ago.
Semi-structured data storage alternatives as well, plus Apache Spark and Cosmos DB.
Was that you Patrick on the "Power BI Update - October 2022" ? :)
I would have loved to hear what problem the DW has that the DL and DLH solve. I only heard "large" data, in contrast to the DW's historical data.
Personally, I don't like the DLH all that much, due to the slow loading of new data into Parquet and the fact that you read files instead of SQL tables with history.
Who would've known merging 10,000 Parquet files would be slower than a relational database?
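The small-file problem behind that complaint is real: opening thousands of tiny files costs far more than scanning one large one, which is why lakehouse engines run compaction jobs (e.g. Delta's OPTIMIZE). A toy sketch with plain text files standing in for Parquet parts (paths and counts are made up for illustration):

```python
import tempfile
from pathlib import Path

def compact(small_files: list, target: Path) -> None:
    """Merge many small files into one large file: the essence of a
    compaction pass (toy version: plain text, no Parquet involved)."""
    with target.open("w") as out:
        for f in sorted(small_files):
            out.write(f.read_text())

# Create 100 tiny "files" (stand-ins for small Parquet parts).
tmp = Path(tempfile.mkdtemp())
parts = []
for i in range(100):
    p = tmp / f"part-{i:05d}.txt"
    p.write_text(f"row-{i}\n")
    parts.append(p)

compact(parts, tmp / "compacted.txt")
```

After compaction a reader touches one file handle instead of a hundred; scale that to tens of thousands of parts and the per-file open/footer-read overhead is where the "slower than an RDBMS" feeling comes from.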
Q: What happens to all those BI reports with their virtual schemas when the "Silver Layer" data structure changes?
Could your audio be clearer, with less bass and noise?
Interesting! Thanks!
Yes, we are building a data lakehouse with the transformed structured data pointing to Redshift and all of the unstructured data pointing to S3 buckets. But the data scientists can't query the data in Athena and point it to Power BI for analytics and data visualizations.
What is a Databrick? TIA
Azure Databricks is Microsoft's hosted version of the Databricks Apache Spark platform. It connects to the data lake etc., but for in-memory, clustered computing. Geared toward big-data analytics.
Patrick really likes data LMAOOO
More like data swamp
Yeah. I liked Storage Blobs and Data Factory into Synapse or an Azure SQL Database.
The lakehouse and Parquet are very dependent on a coded Scala/Python framework, which makes every step 10x more confusing and code-dependent.
We are back to SSIS 2005, having to code an ETL framework in C#, and instead of working in BI we're moving into full-stack coding instead..
But... what about...a...: "Lake Database" :)
Funny, this is called a data constellation; someone added "lakehouse" just to sell training courses, I bet.
Data warehouses are dead, aren't they?