Who knew you could learn so much in 6 mins? Great lesson, giving the history and the overall concept. Big up 👍🏾
The best explanation of what Databricks is on YouTube. Thanks a lot!
Best explanation of Databricks. Started from Hadoop, then Spark, and then came to DB in 5 mins.
Such a straightforward explanation. Thanks!
Excellent intro - thank you
Looking forward to the rest of the series! I had wanted to learn Databricks before, but was a little terrified of setting up a cluster whose costs might spiral. If there are any hints on how to run Databricks as cheaply as possible while learning, that would be awesome! 😉👍
Hi Tom. Take a look at the community edition. Performance is not amazing and it technically runs on AWS, but it is free and has most of the functionality we cover in this series! community.cloud.databricks.com/login.html# Thanks for watching
Fantastic! That is what I needed.
Hi, thanks for this great intro! Where are the other videos in this series? Couldn't see a playlist for them.
It's the month of Azure Databricks playlist
awesome. and in 6 minutes. thanks
Great explanation
Could you please make a video where you create an ML model in Azure Databricks and deploy it on Azure Kubernetes Service?
Wow!!! So much knowledge packed into a video of less than 6 minutes, and at such an easy pace too.
The only thing I don't understand is what "processing in memory" is. What is the advantage of it, and what would the alternative be?
An analogy: rather than putting something in a drawer, you keep it on your table. It's much quicker to pick it up and do something with it. Storing something 'in-memory' means not saving it to disk, which is always slower. It's the same as RAM on your computer, which is super quick for your computer to use in comparison to disk.
That would be easy to understand if we still had HDDs, but now that we have NVMe drives I'm not sure processing in memory would make a significant impact. Today we also have much larger volumes of data, so processing everything in memory is unlikely to work.
Really clearly explained. While searching I had the feeling that there are too many videos and articles that go too far into the details, and a lack of sources that give the big picture of what Spark, MapReduce, Databricks etc. are.
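For anyone who wants to try the drawer-versus-table idea from this thread, here is a minimal sketch in plain Python (not Spark). It assumes a small temporary text file of numbers that several passes need to read; the function names are made up for illustration. One version goes back to the file on every pass, the other loads it once and reuses the in-memory copy. On a modern machine part of any gap will come from parsing and the OS file cache rather than raw disk speed, so treat the timings as a rough illustration only.

```python
import os
import tempfile
import time


def write_numbers(path, count):
    """Write `count` integers to a text file, one per line."""
    with open(path, "w") as f:
        for i in range(count):
            f.write(f"{i}\n")


def sum_from_disk(path, passes):
    """Re-read and re-parse the file on every pass (the 'drawer')."""
    total = 0
    for _ in range(passes):
        with open(path) as f:
            total += sum(int(line) for line in f)
    return total


def sum_from_memory(path, passes):
    """Read the file once, keep it in a list, and reuse it (the 'table')."""
    with open(path) as f:
        numbers = [int(line) for line in f]
    total = 0
    for _ in range(passes):
        total += sum(numbers)
    return total


def compare(count=200_000, passes=5):
    """Run both approaches; return both totals and both elapsed times."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "numbers.txt")
        write_numbers(path, count)

        start = time.perf_counter()
        disk_total = sum_from_disk(path, passes)
        disk_seconds = time.perf_counter() - start

        start = time.perf_counter()
        memory_total = sum_from_memory(path, passes)
        memory_seconds = time.perf_counter() - start

    return disk_total, memory_total, disk_seconds, memory_seconds


if __name__ == "__main__":
    disk_total, memory_total, disk_s, memory_s = compare()
    print(f"same answer: {disk_total == memory_total}")
    print(f"re-reading each pass: {disk_s:.3f}s, held in memory: {memory_s:.3f}s")
```

Both versions give the same answer; the only difference is where the data lives between passes. The more passes a job makes over the same data, the more the repeated trips to storage add up.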
Yeah baby, yeah!
Hello, is Azure Databricks a relational database? Does Azure Databricks support incremental refresh in Power BI? Does Azure Databricks support query folding?
If there are Microsoft documents which answer these questions, that would be of great help.
Anyone please help.
Hi!
So technically no, Databricks is not a relational database. It is a distributed query engine that can present data so that it behaves like a relational engine, despite the data being stored in a lake.
It accepts SQL commands and does allow some level of query folding. This means that incremental refresh is also possible, but it does require some careful configuration around the RangeStart/RangeEnd parameters etc. Column names are case sensitive, so it can be a bit awkward!
I've not seen any official docs about setting up IR for Databricks specifically, but you can follow the usual setup notes: docs.microsoft.com/en-us/power-bi/connect-data/incremental-refresh-overview
Simon
Thank you. I didn't expect such a quick response to my query. Very much appreciated. Thank you again.
What is the difference between "Azure Databricks" and "Databricks"? Is there an "AWS Databricks"?
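On the incremental refresh point in the reply above, a rough sketch of the idea may help. When the RangeStart/RangeEnd filter folds, each refresh partition ends up as a date-range filter that the source runs, instead of Power BI pulling the whole table and filtering afterwards. The Python below only builds illustrative SQL strings; the table `sales.orders`, the column `order_date` and both function names are hypothetical, and the SQL the real connector generates will look different.

```python
from datetime import datetime


def month_starts(year, month, count):
    """Return count + 1 consecutive month-start datetimes, beginning at year/month."""
    starts = []
    for _ in range(count + 1):
        starts.append(datetime(year, month, 1))
        month += 1
        if month > 12:
            month, year = 1, year + 1
    return starts


def folded_query(table, date_column, range_start, range_end):
    """Build the kind of range-filtered SQL a folded refresh partition maps to.

    The start is inclusive and the end exclusive, so neighbouring
    partitions never pick up the same row twice.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    return (
        f"SELECT * FROM {table} "
        f"WHERE {date_column} >= '{range_start.strftime(fmt)}' "
        f"AND {date_column} < '{range_end.strftime(fmt)}'"
    )


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    starts = month_starts(2023, 12, 2)
    for range_start, range_end in zip(starts, starts[1:]):
        print(folded_query("sales.orders", "order_date", range_start, range_end))
```

Filtering on only one inclusive boundary is the part worth copying into the real setup: if both RangeStart and RangeEnd are inclusive, rows that fall exactly on a boundary can land in two partitions.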
Yep, there is an "Azure Databricks", "AWS Databricks" and "GCP Databricks". They all use the same core Databricks engine, but each has some nuance around integration with the cloud provider, identity management etc!
I desperately need help writing a JSON file to Azure Table Storage from Databricks.
It was an introduction to Spark... I expected much more on Databricks, not Spark.