Advancing Spark - Databricks SQL Serverless First Look

  • Published on 7 Sep 2024
  • There's a growing battle for the crown of serving layers in the Lakehouse Architecture - what is the best tool to give your business users direct querying power over the data in your lake? One of the major flaws on the Databricks side historically has been the need for a running cluster. With Databricks SQL Serverless, we are getting a real speed boost in this area.
    In this video, Simon switches on Databricks SQL Serverless and shows how the underlying cluster powers up in under 10 seconds, after which it behaves just like a normal Spark cluster!
    To learn more about Databricks SQL Serverless, check out the docs here: learn.microsof...
    As always, if you need help getting your Lakehouse data into the hands of the organisation, give Advancing Analytics a call.

Comments • 18

  • @jim-i-am a year ago +9

    We've used this for a fair while now on AWS. It's a great innovation that meets a lot of use cases. It CAN be much more expensive than leaving an endpoint always on (depending on your EC2 pricing and how much you're using it). Also, if you have a self-managed VPC, make sure you have your egress set up properly or that will also give you some bill shock. Since it takes ~10 minutes after your last query to shut down, it's not what you'd normally consider 'serverless' (like Lambda). None of these are deal breakers... they are just things you'll want to consider, plan for, and potentially remediate when you're implementing this. Aside from that, it's getting VERY good, starting very quickly, and giving us great customer experiences!
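
The cost trade-off described in the comment above can be made concrete with a little arithmetic. The following is only a rough sketch: the hourly rates and usage patterns are made-up placeholders (not real Databricks or EC2 prices), the function names are hypothetical, and it assumes each burst of queries is billed for its own duration plus the roughly 10-minute idle window before shutdown.

```python
# Hypothetical break-even sketch: every rate and usage figure here is made up.

MINUTES_PER_DAY = 24 * 60


def always_on_cost(hourly_rate: float, days: int = 30) -> float:
    """Cost of an endpoint that runs 24 hours a day for `days` days."""
    return hourly_rate * 24 * days


def serverless_cost(
    hourly_rate: float,
    sessions_per_day: int,
    session_minutes: float,
    idle_minutes: float = 10.0,
    days: int = 30,
) -> float:
    """Cost when each burst of queries is billed for its duration plus the
    idle window before auto-stop (about 10 minutes in the comment above).

    Assumes sessions do not overlap; billed time is capped at 24 hours a day.
    """
    billed_per_day = min(
        sessions_per_day * (session_minutes + idle_minutes), MINUTES_PER_DAY
    )
    return hourly_rate * billed_per_day * days / 60


if __name__ == "__main__":
    classic_rate = 4.0     # hypothetical currency units per hour
    serverless_rate = 7.0  # hypothetical, assumed to be higher per hour
    print("always on:", always_on_cost(classic_rate))
    print("light use:", serverless_cost(serverless_rate, sessions_per_day=8, session_minutes=20))
    print("heavy use:", serverless_cost(serverless_rate, sessions_per_day=40, session_minutes=20))
```

With these placeholder numbers, the light-usage case comes out cheaper than the always-on endpoint and the heavy-usage case comes out more expensive, which is the pattern the comment warns about. Plug in your own rates before drawing any conclusion.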

  • @istvanmeszaros4112 a year ago +4

    We have been using it for almost 6 months now. For us it was a game-changer. Not needing to wait for cluster startup, plus the autoscaling, enabled the whole company to easily query our DWH.
    Before that, the Data Engineering team was constantly pinged about slow-starting clusters, etc.

    • @majdi_saadani a year ago

      We did the same for our analysts, and they are 😁. It was just embarrassing when they couldn't extract data with more than 1,000 rows, so we did it manually via clusters, put the file in S3, and downloaded it for them afterwards.

  • @valentinloghin4004 a year ago +1

    Hi, thank you for the video! You can query the Delta tables from Azure Synapse using the built-in serverless SQL pool.

  • @leoafurlongiv a year ago +6

    Auto Termination of 5 minutes via the UI and 1 minute via API creation!
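
To illustrate the API route mentioned in the comment above, here is a minimal sketch that only builds (and never sends) a warehouse-creation request with a 1-minute auto-stop. The endpoint path and the field names (`auto_stop_mins`, `enable_serverless_compute`, `cluster_size`) reflect my understanding of the Databricks SQL Warehouses REST API and should be checked against the current docs; the host, token, warehouse name, and the helper `build_create_warehouse_request` are hypothetical.

```python
import json
import urllib.request


def build_create_warehouse_request(
    host: str, token: str, name: str, auto_stop_mins: int = 1
) -> urllib.request.Request:
    """Build a POST request for creating a serverless SQL warehouse.

    Field names are assumptions based on the SQL Warehouses REST API;
    verify them against the documentation before use.
    """
    payload = {
        "name": name,
        "cluster_size": "Small",
        "auto_stop_mins": auto_stop_mins,   # 1 minute, as claimed in the comment
        "enable_serverless_compute": True,
    }
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/sql/warehouses",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical workspace URL and token; nothing is sent over the network.
    req = build_create_warehouse_request(
        "https://example-workspace.cloud.databricks.com", "<token>", "demo-serverless"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Sending the request would be a matter of passing it to `urllib.request.urlopen`; it is left out here so the sketch can run without credentials.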

  • @werner6816 a year ago +1

    Nice video. You said it already, but it's worth emphasizing: you might save a lot of money compared to using a 24x7 standard cluster.

  • @dataisfun4964 a year ago

    Beautiful and well explained!!!

  • @andersbergmaal a year ago +1

    As always, very nice explanation! Thanks for making these videos ❤
    Suggestions for future videos:
    1. VNet injection. It would be very nice to see an end-to-end implementation, key considerations, etc.!
    2. Unity Catalog dev/test/prod environments. How to solve this when you can only set up a single metastore in 1 region?

    • @andrewli7542 a year ago

      "Unity Catalog dev/test/prod environments. How to solve this when you can only setup a single metastore in 1 region?" UC dev here. We will roll out catalog, storage and workspace binding so you can isolate your environments using catalogs, not metastores. Basically, at the catalog level you would have everything you have today with metastores (assigning a catalog to a workspace, giving a catalog a storage location, etc.).

    • @andersbergmaal a year ago

      @andrewli7542 That sounds great! Do you have an ETA on when you will be rolling this out?

    • @andrewli7542 a year ago

      @andersbergmaal Can't give an exact date for now, but we are actively working on it and expect to wrap up early next year. This is our top priority now that we are taking away multiple metastores per region.

    • @andersbergmaal a year ago +1

      @andrewli7542 Awesome. Looking forward to seeing the result. And thank you for taking the time to answer!

  • @majdi_saadani a year ago

    And with serverless, we will avoid the extra costs in our own AWS account and everything will be billed by Databricks, right? Thank you for all your videos, very helpful.

  • @rhambo5554 a year ago

    Now all we need is serverless compute for the DE&S workspaces that is just as fast ... 🤫

  • @StannyGoffin 3 months ago

    Could you do an update now that it is out of public preview? :)

  • @UstOldfield a year ago

    Where's the party hat?? 🥳

  • @hellhax a year ago +1

    Awesome. All of your videos should be like that - quick and concise. You generally talk too much, so more condensed content like that, please!

  • @RodrigoBocanegraCruz a year ago

    So, in reality it is a faster-starting cluster, and eventually they will drop the "serverless" word and that will be the default. I got caught by the marketing anyway.