Database Sharding and Partitioning

  • Published on 21 Dec 2024

Comments • 172

  • @shishirchaurasiya7374
    @shishirchaurasiya7374 a year ago +17

    I was literally confused and struggling to get clarity until you reached the point where you translated the theory into understanding through tables and references to SQL queries. Thanks a lot for your efforts and for this lovely, beautiful explanation, Arpit sir.

  • @ranjithpals
    @ranjithpals 2 years ago +2

    Thanks!

  • @AlokMehta24
    @AlokMehta24 a year ago +6

    Excellent video, Arpit. Coming from no software or systems engineering background, I found this the best video explaining data sharding and partitioning. I am a Tech PM for AWS Supply Chain, and data partitioning and sharding are a real deal for us. Thanks for making this extremely easy-to-understand video.

  • @codecspy3479
    @codecspy3479 a year ago +6

    Two important points which I felt could be discussed more: 1) When you said the choice of partitioning depends on the load, use case, and access patterns, can you please give an example of each case? 2) When you were talking about the advantages and disadvantages of sharding, did you write these points considering only sharding without partitioning, or considering both sharding and partitioning?

  • @kahleryasla
    @kahleryasla 4 months ago +4

    I am just stunned by how easily you describe every concept, even when the concept is so complicated. I love you and your work, man; please continue doing it. I promise I will join this channel as a member as soon as I become an earning developer. :D

  • @nikhilrajput8696
    @nikhilrajput8696 9 months ago

    Wow... really nice. Nowadays a lot of people are selling and talking about system design, and they always try to build some optimistic solution straightaway without going into the internals; in fact, they have not even worked on a lot of systems. I strongly feel your way of explaining is very, very nice, and I am going to buy your system design plan to improve mine.

    • @AsliEngineering
      @AsliEngineering  9 months ago

      Thanks. Looking forward to having you enrolled 🙌

  • @jaskiratwalia
    @jaskiratwalia 10 months ago +2

    Wonderfully explained! Cleared all my doubts. Please keep making such videos. These are also well timed, neither too short nor too long.

  • @najimali32
    @najimali32 a month ago

    This is excellent, high-quality content! I always had questions about sharding and partitioning, but now I understand: sharding pertains to the database server level, while partitioning is about organizing the data itself.

  • @visheshjindal1073
    @visheshjindal1073 6 months ago

    I have just started learning System Design and am not from a backend background. I was still able to understand the concepts. Thanks for creating such content.

  • @virendersingh9377
    @virendersingh9377 3 months ago

    You have a brilliant way of explaining; only someone who has gone through the journey would be able to teach it this way.

  • @anandahs6078
    @anandahs6078 9 months ago

    Very good explanation with the right examples. Hats off to you. Thanks for the great content. I always thought shards and partitions were the same, but you clarified it very well.

  • @ranjithpals
    @ranjithpals 2 years ago +5

    Thanks a lot! That was a clear and concise explanation. Looking forward to enrolling in your complete system design course.

  • @AqibJavaid-zl7vc
    @AqibJavaid-zl7vc 8 months ago

    Excellent video ❤. Finally, I got a good grasp of the whole concept.

  • @chaitanyawaikar382
    @chaitanyawaikar382 2 years ago +4

    One of the best videos explaining the nuances between partitioning and sharding. Thank you @ArpitBhayani

  • @DevendraSingh-y5f
    @DevendraSingh-y5f 4 months ago

    Partitioning can be done at the table level as well. Say we have a table named X that holds a huge amount of data; if we want to increase performance and reduce latency, we can partition the table into Y partitions.

  • @nuclearniraj
    @nuclearniraj a year ago

    One video and all the clutter around sharding and partitioning is cleared up. Thank you so much, Arpit.

  • @aksharcharagi996
    @aksharcharagi996 5 months ago

    When I finished watching this video, the concepts looked super easy. Thanks

  • @ishwarhingorani9825
    @ishwarhingorani9825 6 months ago +1

    Great clarity and in-depth explanation. Thank you

  • @yuvrajchauhan9410
    @yuvrajchauhan9410 3 months ago

    Amazing explanation! The diagrams were really helpful in helping me solidify my understanding of the difference between sharding and partitioning.

  • @shreyanshsinha37
    @shreyanshsinha37 2 years ago +1

    When we say Shard1 or Shard2, do we mean the SQL server together with the EC2 instance hosting it, taken as one shard?

  • @mystiqkc
    @mystiqkc 6 months ago

    I looove this kind of explanation, i.e., stepping back and discussing the scenarios behind why something came to be. Thanks a lot for these videos, man!

  • @Jamsessions0
    @Jamsessions0 7 months ago

    One of the best explanations on the internet, well done sir

  • @sameer1571
    @sameer1571 a year ago

    Bro, your diagram example made my day. Such a clear and concise explanation of this topic. Bro, love you from the heart ❤❤ for making this video.

    • @ClarifyDeCode
      @ClarifyDeCode 3 months ago

      All software performance enthusiasts 😊, please also watch our playlist on software performance concepts!

  • @varshard0
    @varshard0 11 months ago

    Thank you. I always assumed that they were the same thing. This cleared things up for me.

  • @___vandanagupta___
    @___vandanagupta___ a year ago +1

    The amount of knowledge in this video is tremendous!!! Extremely helpful 👍👍👍 Thank you, sir!!

  • @paragtyagi3713
    @paragtyagi3713 2 months ago

    Bhai, I have watched a lot of videos.. this one was lit 🔥

  • @TechSpot56
    @TechSpot56 9 months ago

    Nice explanation, Arpit.

  • @vipulsharma5140
    @vipulsharma5140 6 months ago

    This is such a great and simple explanation of partitioning and sharding of a database. Would love to watch the video on partitioning strategies when it is uploaded.

  • @vamsidharvemuluri3817
    @vamsidharvemuluri3817 9 months ago

    Best explanation so far. Thanks, brother

  • @aditigupta6870
    @aditigupta6870 11 months ago

    Hello Arpit, at 5:49, why did you mention that the new resources are being allocated to the EC2 machine? I think they should be allocated to the DB server running on the EC2 machine, right?

    • @AsliEngineering
      @AsliEngineering  11 months ago

      I meant the server running the database. The database is eventually running on some VM.

    • @aditigupta6870
      @aditigupta6870 11 months ago

      @AsliEngineering thanks, Arpit

  • @sumit13agarwal
    @sumit13agarwal a month ago

    In the case of replicas of data on a sharded server, we have the same data duplicated on the physical disks of the database servers. How is data consistency maintained across the database servers?

  • @CSKnowledge007
    @CSKnowledge007 a month ago

    What do you mean by "one node" at 12:48? Is it one EC2 instance?

  • @movies2watchify
    @movies2watchify 3 months ago

    Liked the way the concepts are explained like a story.

  • @kamalbahadur007
    @kamalbahadur007 2 months ago

    This is a really great video. Subscribed, and exploring the other videos you have uploaded.

  • @rohitsk6793
    @rohitsk6793 4 months ago

    Thanks for these practical examples and overall explanation

  • @letsexplorewithanika2642
    @letsexplorewithanika2642 2 years ago +1

    Very clear explanation

  • @InvincibleMan99
    @InvincibleMan99 2 months ago

    Looking for a practical example of partitioning

  • @joshir8500
    @joshir8500 6 months ago

    What should the strategy be for partitioning a MySQL database with multiple tables into multiple shards? There are some tables which do not contain the shard key.

    • @AsliEngineering
      @AsliEngineering  6 months ago

      1. Either the table should contain the shard key
      2. Or, if it is small enough, replicate the entire table across all the shards (e.g. a config table or a meta table)
      3. Or, third, bear the cost of cross-shard fan-outs.
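
The three options in this reply can be sketched roughly as follows. This is only an illustration under stated assumptions: in-memory SQLite databases stand in for the MySQL shards, and the table names (`orders`, `config`), the helper names, and the `user_id % len(shards)` routing are all hypothetical choices made for the example, not something from the video.

```python
import sqlite3


def make_shards(n):
    """Create n independent in-memory databases that stand in for shards."""
    shards = []
    for _ in range(n):
        conn = sqlite3.connect(":memory:")
        # `orders` carries the shard key (user_id); `config` does not.
        conn.execute("CREATE TABLE orders (user_id INTEGER, item TEXT)")
        conn.execute("CREATE TABLE config (key TEXT PRIMARY KEY, value TEXT)")
        shards.append(conn)
    return shards


def put_order(shards, user_id, item):
    # Option 1: the table contains the shard key, so each row lives on one shard.
    shard = shards[user_id % len(shards)]
    shard.execute("INSERT INTO orders VALUES (?, ?)", (user_id, item))


def put_config(shards, key, value):
    # Option 2: a small table without the shard key is copied to every shard.
    for shard in shards:
        shard.execute("INSERT OR REPLACE INTO config VALUES (?, ?)", (key, value))


def count_orders_for_item(shards, item):
    # Option 3: the query does not filter on the shard key, so it must
    # fan out to every shard and combine the partial results.
    total = 0
    for shard in shards:
        row = shard.execute(
            "SELECT COUNT(*) FROM orders WHERE item = ?", (item,)
        ).fetchone()
        total += row[0]
    return total
```

The fan-out in `count_orders_for_item` touches every shard on every call, which is the cost the third option refers to.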

  • @iMakeYoutubeConfused
    @iMakeYoutubeConfused 10 months ago

    Very clear explanation, thanks!

  • @birajasahoo7
    @birajasahoo7 24 days ago

    Amazing Explanation!

  • @sumeetsingh1729
    @sumeetsingh1729 10 months ago

    How is it decided which shard a request hits? Is there a router in front that handles the routing of requests?

  • @dhaanaanjaay
    @dhaanaanjaay a year ago

    One question: at 21:00 the matrix shows what it looks like when we have both sharding and partitioning. How is that different from having two databases on two different EC2 instances for two applications?

  • @kritibindra4232
    @kritibindra4232 2 years ago +1

    Wow, this was really, really helpful! Thank you for posting this.✨

  • @shrad6611
    @shrad6611 a year ago

    finally I understand what sharding is, thanks a ton

  • @DEEPAKKUMAR-wk5pk
    @DEEPAKKUMAR-wk5pk 2 years ago +1

    Wow great explanation

  • @VenkateshwaranP-u8b
    @VenkateshwaranP-u8b 3 months ago

    Thanks for the explanation

  • @ryan-bo2xi
    @ryan-bo2xi a year ago

    Very good, bhai.. outstanding

  • @shintojoseph9166
    @shintojoseph9166 a year ago +1

    Clear explanation

  • @KishoreThatavarthi
    @KishoreThatavarthi 11 months ago

    Thanks a lot, Arpit sir. Really enjoyed it and got full clarity

  • @KetanNabera
    @KetanNabera a month ago

    Great video! Nicely explained. :)
    But I have one query about something towards the end of the video:
    The example you shared for "Sharding NO and partitioning YES" about the Airline Check-in System and the Ticket Booking System: how is that classified as partitioning if the databases are altogether different?
    Any thoughts?

  • @shyama5612
    @shyama5612 2 months ago

    Great explainer, thanks!

  • @amogu_07
    @amogu_07 9 months ago

    Thank you so much, clearly understood!!

  • @hanzalasiddique6313
    @hanzalasiddique6313 a year ago +1

    Mind Blowing ❤

  • @mahendratonape27
    @mahendratonape27 5 months ago

    Thanks a lot for clearing up a long-standing confusion

  • @akshayrahangdale8511
    @akshayrahangdale8511 a year ago

    Very Nice Video, I just loved the explanation.

  • @jasper5016
    @jasper5016 10 months ago

    Thanks so much Arpit!!

  • @vijaymunavalli335
    @vijaymunavalli335 2 years ago +1

    It's a very practical explanation... cool one

  • @GaganJain2508
    @GaganJain2508 a year ago +1

    Does it mean Sharding and replication are the same? 22:16

  • @santanuhalder9306
    @santanuhalder9306 a month ago

    What iPad note application are you using?

  • @amananurag07
    @amananurag07 8 months ago

    @arpit Thanks for such dense information in such a short and simple video.
    However, I have a query about a corner case:
    - How can one have replicas when there are multiple shards with partitioning?
    - In this case, is the replication local to the shard, or can data also be replicated onto other shards for high availability across availability zones or for DR (like the Kafka architecture)?

  • @aditiagarwal7081
    @aditiagarwal7081 7 months ago

    When running two databases on the same machine, are we not still sharing the same underlying resources such as CPU, memory, and disk I/O?

    • @likith1337
      @likith1337 6 months ago

      How can you run two SQL daemons on the same machine?

  • @magicpotato1707
    @magicpotato1707 28 days ago

    Hi Arpit, after sharding or partitioning, how do requests get routed to the respective shards? Do we need to write logic at the application level to track shards, or is there another approach?

    • @AsliEngineering
      @AsliEngineering  28 days ago +1

      Either your backend servers know the topology, or you add a proxy that knows the topology.

  • @zeyuli53
    @zeyuli53 2 years ago +1

    well explained, thank you

  • @nimitkanani1691
    @nimitkanani1691 2 years ago

    Very beautifully and simply explained. The content of the video flowed so smoothly. Thank You @ArpitBhayani

  • @timamet
    @timamet 2 years ago +1

    amazing explanations, thank you

  • @ShivangGoyal-n2n
    @ShivangGoyal-n2n a year ago

    Literally one of the best videos I have ever seen on this topic.

  • @vikasbhutra9400
    @vikasbhutra9400 2 years ago +1

    Thanks a lot, Arpit, for explaining in such a simple way. One request: can you please make a video on sharding strategies and also on how composite indexes are stored on disk?

    • @AsliEngineering
      @AsliEngineering  2 years ago

      Soon.

    • @hc90919
      @hc90919 a year ago

      @asli engineering - Bhai, any update on the sharding strategies?
      Also, one more request: examples of scenarios to explain shard key selection.
      And how is the data replicated behind the scenes, please?

  • @ankitmaheshwari2341
    @ankitmaheshwari2341 a year ago

    Do we use sharding when we have better options available, like Oracle RAC, where the database can be scaled horizontally?

  • @imperfecto7734
    @imperfecto7734 a year ago +1

    @arpit what's the benefit of partitioning the data but not sharding it? Can you give me a use case, please?

    • @AsliEngineering
      @AsliEngineering  a year ago +4

      Partitioning allows your database to read/access/move the required subset of data easily and efficiently.
      1. Imagine you partition data by time and create one partition for every hour. If someone queries how many events happened in the last 10 hours, you would just need to access the last 10 partitions to fulfil this query. The others do not even need to be read.
      2. In a distributed setup, instead of moving individual rows/elements, we can easily and efficiently move whole partitions across the cluster to balance the load.
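
The first point in this reply (hourly partitions, reading only the most recent ones) can be sketched as below. This is a toy in-memory model, not how any particular database implements partition pruning. The class and method names are hypothetical, and "the last N hours" is simplified to mean the current hour bucket plus the N-1 buckets before it.

```python
from collections import defaultdict
from datetime import datetime, timedelta


class HourlyPartitionedEvents:
    """Toy event store with one partition (a list) per clock hour."""

    def __init__(self):
        # Maps the start of an hour to the event timestamps in that hour.
        self.partitions = defaultdict(list)

    @staticmethod
    def _hour(ts):
        # Truncate a timestamp to the start of its hour: the partition key.
        return ts.replace(minute=0, second=0, microsecond=0)

    def insert(self, ts):
        self.partitions[self._hour(ts)].append(ts)

    def count_last_hours(self, now, hours):
        """Count events in the `hours` most recent hourly partitions.

        Returns (event_count, partitions_read). Older partitions are never
        touched, no matter how many of them exist.
        """
        current = self._hour(now)
        count = 0
        partitions_read = 0
        for i in range(hours):
            key = current - timedelta(hours=i)
            if key in self.partitions:  # membership test creates no entry
                partitions_read += 1
                count += len(self.partitions[key])
        return count, partitions_read
```

With 48 hours of data loaded, a `count_last_hours(now, 10)` call reads at most 10 partitions; the other 38 are skipped entirely.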

    • @imperfecto7734
      @imperfecto7734 a year ago

      Understood! Thanks 🙏

  • @TarunKumarSaraswat
    @TarunKumarSaraswat 3 months ago

    Thanks, really detailed

  • @anshujaiswal5622
    @anshujaiswal5622 7 months ago

    Simple and to the point explanation .. Thanks Arpit, Liked & Subscribed :)

  • @mohitkumartoshniwal
    @mohitkumartoshniwal 2 years ago +1

    A very clear and detailed explanation. ♥️

  • @KriszSch
    @KriszSch 9 months ago

    Great explanation!

  • @neerajdixit7102
    @neerajdixit7102 2 years ago

    Awesome, Arpit. Thanks, I truly admire your way of teaching

  • @DurgaShiva7574
    @DurgaShiva7574 3 months ago

    GOD of explanation !

  • @tawseefbhat977
    @tawseefbhat977 2 years ago

    How do we know which partition or shard our data is located in when we make a query? Any detailed explanation?

  • @pramodpatil-ue8sm
    @pramodpatil-ue8sm a year ago

    Great explanation, as always. Please post a link if you have recorded a video on partitioning strategies

  • @kaal_bhairav_24
    @kaal_bhairav_24 9 months ago

    thanks a lot arpit for an awesome explanation as always

  • @aneksingh4496
    @aneksingh4496 a year ago

    super video Arpit

  • @iHariPatel
    @iHariPatel a year ago +2

    In my view, partitioning is more complex because you have to work with the partition key! With the wrong query, you can accidentally scan all partitions.

  • @aditijalaj5036
    @aditijalaj5036 a year ago

    This is an amazing video and your explanations are very clear

  • @prashantkamble898
    @prashantkamble898 a year ago

    Greatly explained

  • @Bluesky-rn1mc
    @Bluesky-rn1mc 2 years ago +1

    How are foreign key constraints managed when two tables are on different shards?

    • @AsliEngineering
      @AsliEngineering  2 years ago +6

      Foreign keys are dropped when you adopt sharding. You cannot maintain FK when data is partitioned across multiple shards.

    • @Bluesky-rn1mc
      @Bluesky-rn1mc 2 years ago

      @AsliEngineering thanks

  • @hemsagarpatel8992
    @hemsagarpatel8992 a year ago

    If we have horizontal partitioning and one partition is getting a lot of traffic in real time, how can we load balance the traffic? Is it possible?

  • @kalinduabeysinghe8917
    @kalinduabeysinghe8917 a year ago

    Such a clean explanation🙌

  • @aditigupta6870
    @aditigupta6870 11 months ago

    Each shard must also have replicas, right? I mean, if a shard is handling the first 2 partitions, then all data from those 2 partitions will go to this shard, but what if the shard goes down?

    • @AsliEngineering
      @AsliEngineering  11 months ago

      A shard can have replicas to scale reads. If the shard goes down, then either you auto-promote a replica to take over, or you take the downtime.
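
The two outcomes in this reply (promote a replica, or take the downtime) can be modelled very roughly as below. This is a sketch under heavy assumptions: `Node`, `Shard`, and `failover` are hypothetical names, health is a plain boolean, and real failover also has to deal with replication lag, split-brain, and client reconnection, none of which is shown.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    healthy: bool = True


@dataclass
class Shard:
    primary: Node
    replicas: list = field(default_factory=list)

    def read_targets(self):
        # Reads can be spread over the primary and its healthy replicas.
        return [n for n in [self.primary, *self.replicas] if n.healthy]

    def failover(self):
        """Promote the first healthy replica if the primary is down.

        Returns True if the shard has a healthy primary afterwards and
        False if no replica could take over (the shard takes downtime).
        The failed primary is simply dropped in this toy model.
        """
        if self.primary.healthy:
            return True
        for i, replica in enumerate(self.replicas):
            if replica.healthy:
                self.primary = self.replicas.pop(i)
                return True
        return False
```

A shard created without replicas can only return `False` from `failover()` once its primary is unhealthy, which is the "take the downtime" branch.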

  • @ronakshah725
    @ronakshah725 5 months ago

    Hey, great video, Arpit! I was wondering how sorting and especially filtering would work across shards?
    Or is that an anti-pattern (just like cross-shard joins)?

  • @gigachad400
    @gigachad400 a year ago +1

    One of the biggest disadvantages of sharding a SQL server is that you lose ACID guarantees across shards, so you have to be careful when doing it with SQL databases

  • @the_angry_developer
    @the_angry_developer 6 months ago

    How does the API server know which database server should be given a particular share of the load?

    • @AsliEngineering
      @AsliEngineering  6 months ago

      That is your routing strategy: range-based, hash-based, or static routing.
      For example, all requests for a user go to a particular database, and the ownership is determined by taking a hash of the user ID, i.e. f(user_id) % num_databases

    • @the_angry_developer
      @the_angry_developer 6 months ago

      @AsliEngineering how can this be implemented? Also, if you can, could you please make a system design video with an implementation of microservices in something like Node.js? I have understood the theoretical part, but the implementation is where I am getting stuck; I don't understand how to do that.

    • @AsliEngineering
      @AsliEngineering  6 months ago

      @the_angry_developer you can get the user ID from your auth token. Have an array of database connections in your API code and apply the function mentioned above.
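
A minimal sketch of the `f(user_id) % num_databases` routing described in this thread might look like the following. The shard names are hypothetical placeholders for real database connections, and SHA-256 is just one reasonable choice for `f`. Python's built-in `hash()` is avoided because string hashes are randomized per process, so it would not route the same user to the same shard across restarts.

```python
import hashlib

# Hypothetical shard names standing in for an array of database connections.
SHARDS = ["db-0", "db-1", "db-2", "db-3"]


def shard_index(user_id: str, num_databases: int) -> int:
    """Compute f(user_id) % num_databases with a hash that is stable across runs."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_databases


def pick_shard(user_id: str) -> str:
    """Return the shard that owns all data for the given user."""
    return SHARDS[shard_index(user_id, len(SHARDS))]
```

One caveat of plain modulo routing: changing the number of databases changes the owner of most keys, so adding a shard means moving a lot of data.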

    • @the_angry_developer
      @the_angry_developer 6 months ago

      @AsliEngineering oh okay, thank you

  • @lazry1773
    @lazry1773 a year ago

    Dude this was amazing

  • @tanyasingh6435
    @tanyasingh6435 3 months ago

    Start from 2:50

  • @RohanBoopathiraj
    @RohanBoopathiraj 7 months ago

    How do we know which shard our data resides in?

    • @AsliEngineering
      @AsliEngineering  7 months ago +1

      That depends on your routing strategy - Range/Hash/Static. In any case, you pick a partitioning key and depending on the approach you deduce which shard to go to.
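
For the range-based variant mentioned in this reply, one possible sketch uses a sorted list of boundaries and a binary search. The boundaries, shard names, and the use of a numeric user ID as the partitioning key are all assumptions made for illustration.

```python
from bisect import bisect_right

# Hypothetical layout: each shard owns a contiguous range of user IDs.
# UPPER_BOUNDS[i] is the first ID that no longer belongs to SHARD_NAMES[i];
# the last shard takes everything at or above the final bound.
UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]
SHARD_NAMES = ["shard-a", "shard-b", "shard-c", "shard-d"]


def shard_for(user_id: int) -> str:
    """Deduce the owning shard from the partitioning key with a binary search."""
    return SHARD_NAMES[bisect_right(UPPER_BOUNDS, user_id)]
```

Range routing keeps neighbouring keys together, which helps range scans, but it can create hot shards if new IDs always land in the last range.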

  • @jithinb7047
    @jithinb7047 a year ago

    Awesome content, Arpit! Thanks a lot, and please continue posting more on concepts like these as well as analyses of real use cases.

  • @GaneshSrivatsavaGottipati
    @GaneshSrivatsavaGottipati 8 months ago

    What if we have read replicas and still have partitioning?

  • @kritibindra4232
    @kritibindra4232 2 years ago

    Also which software did you use in this video to create pictures and write content?

  • @coderkashif
    @coderkashif 4 months ago

    As usual, amazing

  • @KrishnaSagar02
    @KrishnaSagar02 5 months ago

    I have a doubt, sir. I think an increase in overall storage capacity and higher availability don't go hand in hand.
    If we shard and partition the data, then the capacity will be available, but the data is not the same on both shards, i.e., if one shard goes offline, the other shard cannot serve the requested data if it lives on the offline shard. So there is no higher availability.
    And if we only do sharding, then the same data will be on both servers, and as you said, if we assume 100 TB, then the other will also have only 100 TB? We can't get 200 TB.
    This is what I thought. Correct me if I'm wrong.

    • @bharadwajsai179
      @bharadwajsai179 4 months ago

      I had the same doubt. I came to know that in this case our system will still be available, but the data might be inconsistent: the server still accepts requests but returns an empty response.

  • @abhigujjar7439
    @abhigujjar7439 a year ago

    Can you please share the notes?