Azure Synapse Analytics | Data Distribution Strategy and Best Practices

  • Published on 19 Oct 2024

Comments • 46

  • @hoanglieuit
    @hoanglieuit 1 year ago

    This is the first time I have ever subscribed to a channel.

  • @VK-ln9vk
    @VK-ln9vk 1 year ago

    I wish there were 100000 LIKE buttons. THE BEST VIDEO on Azure Synapse distribution. I clearly understood the distributions from the demo. Thank you so much 🙏

  • @VirtusRex48
    @VirtusRex48 1 year ago

    One of the best Synapse videos out there; highly recommend!!!

  • @husnabanu4370
    @husnabanu4370 1 year ago

    Wow, such a detailed explanation; the visuals and query examples make it so easy to understand...

  • @jubershikalgar4205
    @jubershikalgar4205 2 years ago

    Thank you very much for this video.
    It was very helpful and I learnt a lot about Synapse.

  • @goelnikhils
    @goelnikhils 2 years ago

    What hard work went into creating this video. Very good content.

  • @orxanbabashov
    @orxanbabashov 8 months ago

    This is the first time I have ever subscribed to a channel as well. Huge thanks!!!!

  • @Zaf567
    @Zaf567 2 years ago

    Have watched many videos related to this but yours is awesome.

  • @donanuradha2162
    @donanuradha2162 3 years ago +1

    Very well explained how data is distributed in Synapse SQL DW

  • @vinayak6685
    @vinayak6685 2 years ago +1

    Really happy to find this video. Loved the practical demo on how the distributions happened. Subscribed(500th subscriber😁). Waiting for more such awesome content🤩

  • @julianromero3359
    @julianromero3359 1 year ago

    Amazing explanation, thanks; the concepts are very clear and practical to understand. I hope to find more content from you. 🤗

  • @danielveraec
    @danielveraec 2 years ago

    Thanks for sharing this knowledge. Really helpful!!

  • @gvgnaidu6526
    @gvgnaidu6526 2 years ago

    Amazing explanation and nice representation of all the aspects. Thank you so much Arshad

  • @vaibhavvaidya1442
    @vaibhavvaidya1442 3 years ago

    Never saw an explanation like this on Azure Synapse. Amazing :)

  • @Ali-q4d4c
    @Ali-q4d4c 1 year ago +1

    👍🏻👍🏻👍🏻

  • @Farisito
    @Farisito 1 year ago

    Thank you a lot ALI, very useful in my case

  • @abc_987
    @abc_987 2 months ago

    JUST GOLD

  • @Ali-q4d4c
    @Ali-q4d4c 1 year ago +1

    👍👍👍👍

  • @peaceneeded
    @peaceneeded 2 years ago

    Simply Amazing Explanation !

  • @MohammedKhan-np7dn
    @MohammedKhan-np7dn 3 years ago

    Looking forward to the next session

    • @ArshadAliAasTrailblazers
      @ArshadAliAasTrailblazers 3 years ago

      Thanks Mohammed, I just posted a video on CI/CD and am planning to post a few more in the next couple of weeks.

  • @SQLTalk
    @SQLTalk 2 years ago

    This is a very well done and helpful video. Thank you for making it.

  • @MohammedKhan-np7dn
    @MohammedKhan-np7dn 3 years ago

    Very Good session to understand the concepts in Synapse Analytics

  • @vivekvishal2500
    @vivekvishal2500 2 years ago +1

    Great Sir 👌

  • @amittyagi9171
    @amittyagi9171 2 years ago

    Thank you so much. You are amazing.

  • @MohammedKhan-np7dn
    @MohammedKhan-np7dn 3 years ago

    Thank you for explaining the concepts in detail.

  • @upendarjakkula2561
    @upendarjakkula2561 2 years ago

    Extraordinary 👌

  • @kuldeepgawande9550
    @kuldeepgawande9550 3 years ago

    Excellent explanation. Thank you.

  • @shuaibpantnagar
    @shuaibpantnagar 2 years ago

    Very nicely explained Azure Synapse, especially the SQL pool. I have a question here. Both Synapse and Azure Databricks have a Spark engine. How would I choose between them for my project work?

  • @HGoIchetan09
    @HGoIchetan09 3 years ago

    Excellent explanation.. Thanks..

  • @samuelrocha9079
    @samuelrocha9079 2 years ago

    Thank you for the video, one of the best I have ever watched for learning about data.
    Just a quick question: for the round-robin table, you said the data will be shuffled when you run a query that groups by ProductKey, and the distribution will be organized by that field. What if I then execute the same query but group by a different field? Will the shuffle happen again, with the distribution now based on the other field I am grouping by?

  • @TiffanyMorris123
    @TiffanyMorris123 3 years ago +1

    Thanks for this video! Question: you touched briefly on creating statistics in Synapse before running queries, based on the query patterns. In my case I have a large group of users, from admins to analysts to developers, and I cannot predict the types of queries they will run. Is there a best practice that I can pass on to the users for planning which stats to create before running their queries? Do you plan future tutorials on this topic? Thanks!

    • @ArshadAliAasTrailblazers
      @ArshadAliAasTrailblazers 3 years ago

      Thanks Tiffany! While creating stats in advance is a proactive way to optimize performance, the engine also learns from queries submitted for the first time and optimizes future submissions when the AUTO_CREATE_STATISTICS setting is ON (it is ON by default). You can find more details about it here: docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-statistics
      To shorten statistics maintenance time, be selective about which columns have statistics, or need the most frequent updating. For example, you might want to update date columns where new values may be added daily. Focus on having statistics for columns involved in joins, columns used in the WHERE clause, and columns found in GROUP BY. docs.microsoft.com/en-us/azure/synapse-analytics/sql/best-practices-dedicated-sql-pool#maintain-statistics
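
      A minimal sketch of the advice above: a small Python helper that builds one single-column CREATE STATISTICS statement for each column used in joins, WHERE clauses, or GROUP BY. The helper name, the stats naming pattern, and the table and column names (dbo.FactSales, ProductKey, OrderDateKey) are assumptions for illustration only; they are not from the video or the linked docs.

```python
def create_statistics_sql(table: str, columns: list[str]) -> list[str]:
    """Build one single-column CREATE STATISTICS statement per column.

    The stats naming pattern (stats_<table>_<column>) is an assumption,
    not a convention required by Synapse.
    """
    statements = []
    for column in columns:
        stats_name = f"stats_{table.replace('.', '_')}_{column}"
        statements.append(
            f"CREATE STATISTICS {stats_name} ON {table} ({column});"
        )
    return statements


# Hypothetical table and columns: pick the ones your joins, WHERE clauses,
# and GROUP BY clauses actually use.
for statement in create_statistics_sql("dbo.FactSales", ["ProductKey", "OrderDateKey"]):
    print(statement)
```

      The generated statements still have to be run against the dedicated SQL pool; the helper only keeps the column list in one place so it can be reviewed and kept short.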

  • @SSingh-lr2ue
    @SSingh-lr2ue 3 years ago

    Thank you for the clear explanation. However, I am not clear about where the 60 buckets or 60 distributions get stored. Is it in Azure Storage? In short, I don't understand the purpose of, or the difference between, Azure Storage and the SQL Database instance attached to the compute node. Could you please explain more about it?

    • @ArshadAliAasTrailblazers
      @ArshadAliAasTrailblazers 3 years ago

      For developers, I think the important thing to consider is how it scales out. For example, if you have 2 nodes, each node will have 30 distributions attached to it; likewise, if you have 4 nodes, each node will have 15 distributions. By scaling out from 2 to 4 nodes, each node now has roughly half of the data (assuming no data skew) and will take roughly half the time to complete processing. docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/memory-concurrency-limits#service-levels
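
      The arithmetic in the reply above can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and the "relative time" figure assumes evenly spread data and perfectly linear scaling, which real workloads will only approximate.

```python
TOTAL_DISTRIBUTIONS = 60  # a dedicated SQL pool always uses 60 distributions


def distributions_per_node(compute_nodes: int) -> int:
    """Return how many of the 60 distributions each compute node serves."""
    if compute_nodes <= 0 or TOTAL_DISTRIBUTIONS % compute_nodes != 0:
        raise ValueError("node count must be a positive divisor of 60")
    return TOTAL_DISTRIBUTIONS // compute_nodes


def relative_processing_time(compute_nodes: int, baseline_nodes: int = 2) -> float:
    """Idealized per-node workload relative to a baseline node count.

    Assumes no data skew and linear scaling (a simplification).
    """
    return distributions_per_node(compute_nodes) / distributions_per_node(baseline_nodes)


for nodes in (1, 2, 4, 60):
    print(nodes, "node(s):", distributions_per_node(nodes), "distributions each")
```

      Going from 2 to 4 nodes takes each node from 30 distributions to 15, which is where the "roughly half the time" estimate comes from.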

    • @BabatundeAdeleye-mw5ce
      @BabatundeAdeleye-mw5ce 11 months ago

      The 60 distributions are stored in the SQL database instance in the SQL pool. Data from Azure Storage is distributed to the distributions in different patterns, depending on the distribution type defined on the SQL pool table at creation. The SQL engine then reads this data from the distributions as your query instructs, which may or may not require moving data around before it executes the aggregate function and sends the output to the control node, which in turn sends it to the user for viewing.
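
      To make the "different patterns" concrete, here is a rough Python simulation of round-robin versus hash distribution across 60 distributions. It is a sketch under stated assumptions: the real engine's row assignment and hash function are internal, so the row-by-row round-robin and the MD5-based hash below are stand-ins, and the sample rows with ProductKey and Amount columns are invented for the demo.

```python
import hashlib
from collections import Counter

NUM_DISTRIBUTIONS = 60


def round_robin(rows):
    """Assign rows to distributions in turn (simplified stand-in)."""
    return [index % NUM_DISTRIBUTIONS for index, _ in enumerate(rows)]


def hash_distribute(rows, key):
    """Assign each row by hashing its distribution column.

    MD5 is only a deterministic stand-in for the engine's internal hash.
    """
    def bucket(value):
        digest = hashlib.md5(str(value).encode()).digest()
        return int.from_bytes(digest[:8], "big") % NUM_DISTRIBUTIONS

    return [bucket(row[key]) for row in rows]


# Invented sample data: 600 rows over 7 distinct ProductKey values.
rows = [{"ProductKey": i % 7, "Amount": i} for i in range(600)]

rr = round_robin(rows)
hd = hash_distribute(rows, "ProductKey")

print("round-robin rows per distribution:", set(Counter(rr).values()))
print("distributions used by hash on ProductKey:", len(set(hd)))
```

      In the simulation, round-robin spreads rows evenly but scatters each ProductKey across many distributions, so a GROUP BY ProductKey needs data movement first. Hashing on ProductKey keeps every row for a given key in one distribution, at the cost of using only a few distributions when the column has few distinct values.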

  • @sumitrauniyar7347
    @sumitrauniyar7347 2 years ago

    How does replicated distribution work when we have 1 compute node?

  • @Mohammad.aarif_222
    @Mohammad.aarif_222 6 months ago

    Where do I need to store the files in Blob Storage?

  • @Mohammad.aarif_222
    @Mohammad.aarif_222 6 months ago

    How do I make an external table?