[ MongoDB 9 ] Sharding a MongoDB Collection

  • Published on 30 Sep 2024

Comments • 157

  • @ifeoluwaodewale460 1 year ago +7

    I have been looking for a detailed video on MongoDB sharding for more than 3 months, but none of them do justice to it like your series. You really made my day.

    • @justmeandopensource 1 year ago

      I am so glad that you found what you wanted. Thanks for watching.

  • @rajendrapoojari9696 1 year ago +2

    Great, well explained.

  • @richardwang3438 4 years ago +1

    Another question: the movies collection already had data before it was sharded. Why does db.movies.getShardDistribution() return just one shard (the primary shard)? All later insertions go to this one shard as well.

    • @justmeandopensource 4 years ago +1

      Hi Richard, thanks for watching. The data you insert into MongoDB is stored in chunks. There is a default chunk size, I think it is 64MB. So the first chunk of 64MB will be stored in one shard and then it will go to the second shard. Since the movies collection is very small, it only uses the first shard. If you load big data into the collection, you can see the sharding effects. Or you can change the chunk size to a lower value if you like. Cheers.
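
To see why a small collection stays on one shard, here is a rough back-of-the-envelope sketch. It assumes the 64MB chunk size mentioned in the reply and treats the chunk count as data size divided by chunk size. The function name `estimateChunks` and the sizes are made up for illustration and are not a MongoDB API; a real cluster splits chunks by shard key range, so this is only an approximation.

```javascript
// Rough estimate of how many chunks a collection occupies.
// Assumption: a chunk fills up to chunkSizeMB before it is split.
function estimateChunks(dataSizeMB, chunkSizeMB = 64) {
  if (dataSizeMB <= 0) return 1; // a sharded collection always has at least one chunk
  return Math.ceil(dataSizeMB / chunkSizeMB);
}

console.log(estimateChunks(5));    // 5MB of data -> 1 chunk, so only one shard holds it
console.log(estimateChunks(200));  // 200MB -> 4 chunks that the balancer can spread out
console.log(estimateChunks(5, 1)); // same 5MB with a 1MB chunk size -> 5 chunks
```

With a single chunk there is nothing for the balancer to move, which matches the one-shard output of getShardDistribution() described in the question.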

  • @NiteshBV 3 years ago +1

    Thanks Venkat for sharing. What will happen if I try to add a replica set that already has databases in it to an existing sharded cluster? How will the primary shard be assigned for these databases if I enable sharding?

    • @justmeandopensource 3 years ago +2

      Hi Nitesh, thanks for watching. I haven't tried it myself but when you add or convert a replicaset to a shard (or part of shard) and enable sharding for a particular database, it will balance the data across available shards based on chunk size. You don't have to worry about spreading data across shards yourself. If you want, you can also trigger a manual operation that will shard the data to other shards as well but that is not necessary.

  • @solaire_of_the_east 9 months ago +1

    The video has been most helpful, thanks. BTW, which distro are you using?

    • @justmeandopensource 8 months ago +1

      Hi, thanks for watching. I think I used Arch Linux with the i3 tiling window manager set up for this video, if I remember correctly :)

    • @solaire_of_the_east 8 months ago +1

      @justmeandopensource Thanks for replying. I like your setup.

  • @preethamumarani7363 1 year ago +1

    Awesome content, Venkat. Loved it and practised it. You're awesome :)
    However, I have one question. The movies collection is sharded after inserting documents, and all the documents are in one particular shard. Even when you run another loop to insert another bunch of documents, they go to the same shard. The other shard doesn't have any data.
    How do I fix this?

    • @justmeandopensource 1 year ago

      Hi, Thanks for watching. Shards are split into chunks of storage units. Only when a chunk gets filled, the data goes to the next chunk possibly on a different shard. Chunk size can be configured. I can’t remember the default chunk size. You may have to push more data to see it in action.

    • @preethamumarani7363 1 year ago +1

      @justmeandopensource Thank you for the quick response. Let me say this again, in case you missed it: you're awesome.
      64MB is the default. I'm figuring out how to configure this. Thanks again, mate.

    • @justmeandopensource 1 year ago +1

      @preethamumarani7363 I heard that already, and thanks again for reiterating ☺️

  • @sreeb2522 5 years ago +2

    Awesome walkthrough, Venkat. I imported your MongoDB GitHub repo into my own git for learning purposes. Please let me know if that's okay.

    • @justmeandopensource 5 years ago +1

      GitHub is for storing and sharing code. Feel free to fork my repo. Thanks for watching this video and taking the time to comment/appreciate. Cheers.

  • @ammarkhan4544 4 years ago +2

    Hi Venkat,
    I would appreciate it if you added a video on shard keys and explained them in more detail.
    Thanks so much for the series, by the way.

    • @justmeandopensource 4 years ago +1

      Hi Ammar, many thanks for watching. I would love to do that video. I am struggling to find time these days. I will see if I can do it. Cheers.

  • @shozopat1730 4 years ago +1

    Suppose a document has a shard key value of 123 and it goes to shardRs 1. In future, will all documents with the same shard key value go to the same shard, shardRs 1?

    • @justmeandopensource 4 years ago +1

      Hi Shozo, thanks for watching. Shard keys are not document specific; they are specific to a collection.

    • @shozopat1730 4 years ago +1

      @justmeandopensource Hey Venkat, thanks for your reply. Yes, I got that. Let me reframe my question. Let's say there is a sharded employee collection with a field companyId, which we have selected as a hash-based shard key. For example, 10 employees have companyId 1. Will all 10 of these employee documents be in the same shard, since they have the same shard key value?

    • @justmeandopensource 4 years ago +1

      @shozopat1730 It's not that simple in logic.
      docs.mongodb.com/manual/sharding/#shard-keys
      You should also remember about chunks. There is a default chunk size, and you can change it. Data is sharded in chunks. MongoDB will achieve an even distribution of data across all the shards as the data grows. You shouldn't need to balance it. But sometimes, if you are adding a new shard instance to the cluster, you may want to manually trigger the balancer to redistribute the data evenly.
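
On the companyId question: a hash function is deterministic, so equal shard key values always produce the same hash and therefore fall into the same chunk at any given time. The sketch below illustrates only that property. It uses a stand-in hash (32-bit FNV-1a, not MongoDB's real hash function), picks a shard by modulo rather than by chunk range, and the shard names are hypothetical.

```javascript
// Stand-in hash (32-bit FNV-1a); MongoDB's hashed index uses a different function.
function hashKey(value) {
  let h = 0x811c9dc5;
  for (const ch of String(value)) {
    h ^= ch.charCodeAt(0);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Simplification: choose a shard by modulo instead of by chunk ranges.
function shardFor(companyId, shards) {
  return shards[hashKey(companyId) % shards.length];
}

const shards = ["shard1rs", "shard2rs", "shard3rs"]; // hypothetical shard names
const employees = Array.from({ length: 10 }, (_, i) => ({ name: `emp${i}`, companyId: 1 }));

// All ten documents share companyId 1, so they hash to the same value and the same shard.
const placements = new Set(employees.map((e) => shardFor(e.companyId, shards)));
console.log(placements.size); // 1
```

Which shard that chunk lives on can change when the balancer migrates it, but documents with the same shard key value move together.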

  • @kannedakanneda2128 7 months ago

    Can you explain how to reshard a collection with a different key?

  • @estebankolmaier1756 2 years ago

    Bro, I'm trying to create a sharded cluster where a document is inserted into its zone, but within that zone it lands on a random shard of that zone. Is that possible?

  • @dineshdevaraj1844 3 years ago

    In your design, the mongos service (port 60000) seems to be a single point of failure. Please correct me if I am wrong.

  • @vineetkumar9371 2 years ago

    I have a pre-existing database in the mongod shell. When I connect to the mongos shell, I can't see my database. How do I enable sharding on the mongod database?

  • @majidkarimizadeh234 4 years ago +1

    Hello, thanks for the awesome tutorial. I have a question: can we shard an existing collection that already has data? After running mongos I could not find any data. Thanks.

    • @justmeandopensource 4 years ago +1

      Hi Majid, thanks for watching. If you are running a sharded cluster, then you can selectively shard collections. You will have to enable sharding at the database level. But under the database you can choose which collections to shard. If you don't shard a collection, it will be stored in primary shard of your cluster. Cheers.
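
The two steps described in this reply (enable sharding on the database, then shard selected collections) correspond to two admin commands. The sketch below only builds the command documents in plain JavaScript. `enableSharding` and `shardCollection` are MongoDB admin commands, while the helper names, the `mydb` database name and the `title` key field are assumptions for illustration.

```javascript
// Build the admin command documents for the two sharding steps.
// Helper names, "mydb" and the "title" field are illustrative assumptions.
function enableShardingCmd(dbName) {
  return { enableSharding: dbName };
}

function shardCollectionCmd(dbName, collName, key) {
  return { shardCollection: `${dbName}.${collName}`, key };
}

const commands = [
  enableShardingCmd("mydb"),
  shardCollectionCmd("mydb", "movies", { title: "hashed" }),
];

// In a mongos shell, each document would be passed to db.adminCommand(...).
console.log(JSON.stringify(commands, null, 2));
```

Collections in `mydb` that are never passed to `shardCollection` stay unsharded on the database's primary shard, as the reply says.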

  • @kumarashish2607 2 years ago

    You could have explained the reason for creating a hashed index.

  • @vineetkumar9371 2 years ago

    My database is not there in mongos. It is in mongod. How do I access the mongod database from the mongos instance? Any idea?

  • @alexanderlinders4999 3 years ago +2

    Really appreciate this video! It helped me out a lot, have a great weekend.

  • @skmahaboobbasha6059 3 years ago

    Please make a video on log rotation on production servers.

  • @skmahaboobbasha6059 3 years ago

    Hi, could you please make a video on sharding using Ops Manager?

  • @fadygamilmahrousmasoud5863 9 months ago +1

    I think you have the best explanation of this topic. I have an interview for a backend position 2 days from now and I hope I get asked about this topic, because I really feel confident about it right now.

    • @justmeandopensource 8 months ago

      Great! Glad it was helpful and hope you did well in the interview. Thanks for watching.

  • @vineetkumar9371 2 years ago

    Hi, please tell me how to shard a pre-existing database.

  • @TheRemarkableImages 4 years ago +1

    Hi Mate! Thanks for this tutorial. I just have one problem now. Do you have any idea how I can persist my config on docker-machine?

    • @justmeandopensource 4 years ago +1

      Hi Cal, thanks for watching. You can use docker volumes or bind a local directory to your container for persisting data. Cheers.

  • @a.yashwanth 4 years ago +1

    How did you install the bottom bar on your computer (where you have CPU temp, IP, etc.)?

    • @justmeandopensource 4 years ago +1

      Hi Ande, thanks for watching. This is the i3 tiling window manager with i3bar as the status bar. I used i3blocks, and each item you see in the bar is a script.
      I have done a few videos on i3 tiling window manager setup using Ansible. You can check them here: th-cam.com/play/PL34sAs7_26wOgqJAHey16337dkqahonNX.html. You may need to modify the Ansible playbook if it doesn't work for you, or you will at least get an idea of how to achieve this customization. Cheers.

  • @danieljohnson8304 3 years ago +2

    Thank you, man! You helped me pass the exam.

  • @gaspardbaye 4 years ago +2

    Really great stuff! It helped me with my lab exercises. The explanation, configs, etc. are all clear and work well. Kudos, pal ))

    • @justmeandopensource 4 years ago +2

      Hi Baye, many thanks for watching this video. Cheers.

  • @user-mb7qe6ro9m 4 years ago +1

    Best tutorials for Kubernetes and MongoDB. Eagerly waiting for your upcoming videos.

    • @justmeandopensource 4 years ago +2

      Hi R, thanks for watching and following my series. Cheers.

  • @rajeshkunda6704 4 years ago +1

    Hi Venkat, please make a video on point-in-time recovery for a MongoDB database. Thanks.

    • @justmeandopensource 4 years ago +2

      Hi Rajesh, thanks for watching. I will add it to my list. Cheers.

  • @ryanyu3487 3 years ago +1

    Hi! Thank you so much for your tutorial. But when I shard the collection, the whole data goes to one shard. Or half of the data goes to one shard, this is good. However, the second shard contains the whole data set again. This is strange. Do you have any suggestions for this strange scenario? Thank you so much.

    • @ryanyu3487 3 years ago

      By the way, I tried many shard keys. My situation is that no single attribute can make a document unique. I tried two methods: one key combined with the default _id, and just one attribute. Both led to the scenarios I mentioned above.

  • @phoneix24886 4 years ago +2

    NaN%. Cool NaN bro.

  • @mdsiddiqaamiri3019 3 years ago +1

    Excellent explanation, thank you so much, sir.

  • @premballabh5540 4 years ago +1

    How do I use GraphQL with MongoDB?
    Can you make a series on GraphQL?

    • @justmeandopensource 4 years ago +1

      Thanks for watching. I haven't used GraphQL before. That's something I need to learn first. Cheers.

  • @a.yashwanth 4 years ago +1

    Whenever I mess something up or want to experiment, I delete all the Docker containers and create them again. But having to copy and paste each command takes time. Is there any way I could do all that by executing a simple script?

    • @justmeandopensource 4 years ago +4

      Hi Ande, thanks for watching. Well, you could create a Docker Compose file. This is exactly what a Docker Compose file is for. Cheers.

  • @anilg7915 5 years ago +1

    Many thanks for sharing the video. Thank you so much for your time, Venkat.

  • @sameettelang8277 3 years ago +1

    You have beautifully explained the sharding concept.

  • @yuomtheara 5 years ago +1

    How do I set which collection field to use for sharding?
    For example, my collection has many fields, like this:
    ```
    Invoices = {
    _id
    date
    employee
    product
    price
    discount
    amount
    .....
    }
    ```
    Which field should I shard on?
    sh.shardCollection("mydb.invoices", {.................})

    • @justmeandopensource 5 years ago +1

      Hi Theara, thanks for watching this video. It is very important that you choose your shard key wisely. Once you have sharded a collection on a shard key, it will be difficult to change it. The following articles might help you understand how to choose your keys.
      www.bugsnag.com/blog/mongo-shard-key
      docs.mongodb.com/manual/core/sharding-shard-key
      Thanks.
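
One property the shard key guidance looks at is cardinality, the number of distinct values a field can take. As a minimal sketch of checking that on a sample, the code below counts distinct values per candidate field. The sample invoices are invented, the `cardinality` helper is hypothetical, and cardinality is only one of several factors, so this does not pick a key on its own.

```javascript
// Count distinct values per candidate field in a sample of documents.
// A field with very few distinct values limits how many chunks can exist.
function cardinality(docs, field) {
  return new Set(docs.map((d) => d[field])).size;
}

// Made-up sample invoices using some of the fields from the question.
const invoices = [
  { _id: 1, date: "2019-01-01", employee: "A", product: "pen" },
  { _id: 2, date: "2019-01-01", employee: "B", product: "pen" },
  { _id: 3, date: "2019-01-02", employee: "A", product: "ink" },
  { _id: 4, date: "2019-01-03", employee: "A", product: "pen" },
];

for (const field of ["_id", "date", "employee", "product"]) {
  console.log(field, cardinality(invoices, field));
}
```

On a real collection the same idea would be applied to a much larger sample before deciding on a key.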

    • @yuomtheara 5 years ago +1

      @justmeandopensource
      Thanks for your time.

    • @justmeandopensource 5 years ago +1

      You are welcome.

  • @GauravSharma-ui4yd 4 years ago +1

    Awesome series.
    Please carry on with this series and add a video on Atlas to it.
    Or start a new series on Cassandra.

    • @justmeandopensource 4 years ago +1

      Hi Gaurav, thanks for watching these videos. I will see if I can continue this series. Thanks for your interest.

  • @templuismunoz 5 years ago +1

    Thanks, great content as always. I hope you make a k8s version of a sharded MongoDB setup so we can see what strategy you follow.

    • @justmeandopensource 5 years ago +1

      Hi Luis, thanks for watching this video. I played with MongoDB in a Kubernetes cluster using a Helm chart as a StatefulSet. It's all working fine. I haven't tried sharding in a k8s cluster. There are certain things you don't have to worry about when running MongoDB inside a k8s cluster. The k8s cluster will handle them differently. I will record a video and release it later.
      Thanks.

  • @kennethsarfo 5 years ago +2

    Please don't stop these MongoDB videos. I'm just getting into it and I find your tutorials very informative and helpful.

    • @justmeandopensource 5 years ago +3

      Thanks for your interest in the MongoDB series. I am very interested in this as well, but not a lot of people are watching this series. The effort I am putting into preparing and recording these videos is too much compared to the number of people watching them.
      But anyway, I will try my best to continue this.

    • @kennethsarfo 5 years ago +3

      @justmeandopensource I think people are just getting to know you and your videos. I bumped into your video yesterday and I find it to be the best out there so far. Don't stop, please. Thank you.

    • @justmeandopensource 5 years ago +3

      @kennethsarfo That's good to hear. Thanks, and I will continue. Cheers.

    • @TheRemarkableImages 4 years ago

      Yeah, I agree with you, @kennethsarfo. He's the best.

  • @hamidullahmuslih6301 4 years ago +1

    Well done, bro. I did my lab with your vids.

    • @justmeandopensource 4 years ago +1

      Hi Hamid, thanks for watching. Glad that it helped you. Cheers.

  • @richardwang3438 4 years ago +1

    Nice series, Venkat, thank you.
    1. When we shard a collection, do we always use a replica set?
    2. You show us a standalone cluster, but in prod, I believe each replica set will be on a separate machine. Any ideas on how to arrange the replica sets for different shards among the machines?

    • @justmeandopensource 4 years ago +1

      Hi Richard, each shard in a sharded cluster must be a replica set. And each member of the replica set should be on a separate machine. This is a best practice suggestion. Cheers.

  • @sabapathy27 3 years ago +1

    Very good stuff! - simple and clean. Much appreciated.

  • @geelemo 4 years ago +1

    What language is used in the terminal here?

    • @justmeandopensource 4 years ago +1

      Hi Geelemo, thanks for watching. What do you mean by language?

    • @geelemo 4 years ago +1

      Hi, thanks for replying. You wrote a for loop in the terminal. What language is that?

    • @justmeandopensource 4 years ago +1

      @geelemo That's just bash syntax. But I use Zsh.

    • @geelemo 4 years ago +1

      @justmeandopensource Thank you!

    • @justmeandopensource 4 years ago +2

      @geelemo You are welcome.

  • @sumedhasaran7648 3 years ago +1

    Thank you so much... really, really helpful.

  • @arulprakasan1697 4 years ago +1

    Fabulous! I don't know why the view count is so low. Best demo ever! Thank you very much, bro, for a fantastic demo and explanation!

    • @justmeandopensource 4 years ago +1

      Hi Arul, many thanks for watching this video. Glad you liked it. Cheers.

    • @arulprakasan1697 4 years ago +1

      @justmeandopensource Thanks, bro! Please share the video on some other medium too, it would be more useful! Thanks again, bro!

    • @justmeandopensource 4 years ago +2

      @arulprakasan1697 I only do it out of passion. I think I am already busy with YouTube amidst my primary job. Hopefully this channel will grow as the days pass.

  • @NehaGupta-lf7sk 5 years ago +1

    Can we manually insert data into a specific shard according to our requirements? For example, a collection has 2 shards, Shard1 and Shard2, and I want to insert the data into Shard1. Is that possible?

    • @justmeandopensource 5 years ago +2

      Hi Neha, thanks for watching this video. You shouldn't be doing that anyway. There is a cluster balancer component that makes sure the data is evenly spread across the shards. You should only connect via the mongos router so that the config servers get the right metadata about the shards. If you try to connect and do a write operation on a specific shard, the config servers won't be aware of it, and any other clients coming through the mongos router won't be able to find that data.
      From docs.mongodb.com/manual/core/sharded-cluster-shards/
      """Users, clients, or applications should only directly connect to a shard to perform local administrative and maintenance operations.
      Performing queries on a single shard only returns a subset of data. Connect to the mongos to perform cluster level operations, including read or write operations."""
      Thanks.

    • @NehaGupta-lf7sk 5 years ago +1

      Just me and Opensource, thanks for the quick response. :) Nice explanation. I hope you will continue to make videos on MongoDB; it will be very helpful.

    • @justmeandopensource 5 years ago +2

      I would love to do more videos. Let's see. Currently busy with the Kubernetes and AWS series. Thanks for your interest. Cheers.

  • @vanmuonha 4 years ago

    Thanks very much! Your video content is useful and clear. I'm having problems sharding a collection across shards. I did what you did, but I cannot shard the database across the shards. My cluster includes 2 shards (shard1 and shard2). When I create a database from mongos and run enableSharding, the database is available on only 1 shard (shard1 or shard2).
    What did I do wrong? Please help me. Best regards.

  • @demulupusarla9248 2 years ago

    Hi Venkat, this is Venkatesh. Your explanation is great and I understood it very well. A small request from my side: can you make a video on a MongoDB sharded cluster across different servers on Ubuntu? It would help me. Can you give me your email ID, please?

  • @mdarian 2 years ago

    Hi, thank you for this, great video!
    A suggestion for another video would be to add another mongos instance.

  • @yuomtheara 5 years ago +1

    Should we shard all collections, or only the specific collections that have large data?

    • @justmeandopensource 5 years ago +2

      Hi Theara, you don't have to shard all the collections in your database. But you have to enable sharding at the database level to be able to shard the collections within it. For a relatively small collection, sharding doesn't bring much benefit. It will be beneficial to shard a large collection as it will improve read performance. Thanks.

    • @yuomtheara 5 years ago +1

      @justmeandopensource
      Haha, very quick reply.
      The best answer.

    • @justmeandopensource 5 years ago +1

      You are welcome.

  • @lukasmaci 4 years ago +1

    Great work man! Thanks so much!

  • @EzequielRegaldo 4 years ago +1

    Hi! Thank you so much for your videos. I have a question:
    Is it necessary to always have an arbiter? Or can another primary be chosen automatically without an arbiter?

    • @justmeandopensource 4 years ago +3

      Hi Ezequiel, thanks for watching. You don't need to have an arbiter node. The primary purpose of an arbiter node is to participate in an election. It can't become a primary. It doesn't have the data. The replica set should have an odd number of members. So if you have a primary and a secondary in your replica set, and don't want to add another secondary, you can add an arbiter to make the total 3.
      docs.mongodb.com/manual/core/replica-set-arbiter/
      Cheers.
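
The odd-number advice follows from majority voting: a primary can only be elected while a strict majority of voting members is reachable. A minimal arithmetic sketch, with made-up helper names, shows why 2 members tolerate no failures while 3 tolerate one.

```javascript
// Votes needed to elect a primary: a strict majority of voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many members can be down while a primary can still be elected.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [2, 3, 4, 5]) {
  console.log(`${n} members: majority ${majority(n)}, tolerates ${faultTolerance(n)} down`);
}
```

Going from 3 to 4 members does not raise the number of failures tolerated, which is why an arbiter is added to a 2-member set rather than leaving the count even.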

    • @EzequielRegaldo 4 years ago +1

      @justmeandopensource Thank you so much! I did some testing on my workstation and I had problems with 2 nodes. Reading the MongoDB docs, I found that they recommend using at least 1 primary, 1 secondary and 1 arbiter (to save resources) for better fault tolerance. It's awesome :D Thank you again for your answer.

    • @justmeandopensource 4 years ago +1

      @EzequielRegaldo No worries. You are welcome. Cheers.

  • @zaqraburu6893 3 years ago

    What are you coding? We're not seeing anything.

  • @atharvapegasus7816 4 years ago +1

    Thanks for your help

    • @justmeandopensource 4 years ago +1

      Hi Atharva, thanks for watching this video. Cheers.

  • @filat239 3 years ago +1

    Really a valuable video!

  • @yuomtheara 5 years ago +1

    How do I use the $lookup aggregation stage on sharded collections?
    I get the error '"message" : "db.mycollection cannot be sharded"'.

    • @justmeandopensource 5 years ago +1

      Hi Theara, I will test how to use the lookup command and give you some example later today possibly. Thanks.

    • @yuomtheara 5 years ago +1

      @justmeandopensource
      Many thanks.

    • @justmeandopensource 5 years ago +1

      @yuomtheara You are welcome.

    • @yuomtheara 5 years ago

      @justmeandopensource
      Could you help me with this ($lookup)?
      Now I would like to try migrating my current DB to sharding!

    • @yuomtheara 5 years ago +1

      @justmeandopensource
      Hi dear, how are you?
      Excuse me, could you help me with the `$lookup` aggregation on a sharded collection?
      Thanks for your help :>

  • @dream11tatyabichoo92 4 years ago

    Two questions: what is the use of the config server? And when I shard a collection, how does it get sharded into only 2 shards, and where did we configure that?

  • @ubiquicomubiquicom7545 3 years ago

    Thank you, very detailed and clear video!
    I have a question regarding collection sharding: let's say I am sharding my collection over a datetime index. How can I delete old chunks that I do not need anymore (for example, like switching a partition in SQL Server)?

  • @oktaarifcahyawan6823 4 years ago

    Hello, thanks for the tutorial. It really helped me understand MongoDB. How do I restore a DB that I have backed up into this sharded MongoDB setup? Is it the same as a normal restore or not?

    • @justmeandopensource 4 years ago

      Hi Okta, thanks for watching. Once you have set up your MongoDB cluster, it's like any other MongoDB cluster. You can dump/restore/export/import by connecting to the mongos router. Cheers.

  • @quirkyquester 4 years ago +1

    Amazing video, thank you!

    • @justmeandopensource 4 years ago +2

      Hi George, thanks for watching.

    • @quirkyquester 4 years ago +1

      @justmeandopensource Hi Venkat, I have a problem sharding an existing collection. I tried many different shard keys, and I tried hosting the servers both locally and in Docker containers. None of these worked. In the last part of this video, sharding the existing movies collection also did not seem to work. I wonder if you know how this could be done? Thank you so much!

    • @justmeandopensource 4 years ago +2

      @quirkyquester Hmm. What error do you see when you shard a collection? Have you made sure sharding is enabled at the database level? Is your shard set up properly?

    • @quirkyquester 4 years ago +1

      @justmeandopensource Thank you for your help! Here are more details. No error occurred at all, and the shard setup is also good. I was able to shard an empty collection and then add data to it, just like you did in the tutorial. However, when I shard an existing collection with data in it, I create an index and then shard it. It tells me it was sharded successfully, but only the primary shard replica set has the data. The other shard replica set does not hold any of that collection (when I check the sharding statistics of that collection). The movies collection in the tutorial also ended up the same way, right? When we run 'db.movies.getShardDistribution()', only the primary shard replica set holds 100% of that collection and the other one holds none of it.

    • @justmeandopensource 4 years ago +2

      @quirkyquester I think that's kind of expected. Data is stored as chunks in the shards. There is a specific size for the chunk. I believe it is 64MB or 128MB. As you write more data into the collection, the data will get sharded onto other nodes as well.

  • @tushargoel5522 5 years ago

    Hi, thanks for the video. It's helpful. I have one doubt related to the config server. How does mongos communicate with the config server, and how does the config server store metadata? I guess it's not mentioned in the videos. Could you please explain it?

    • @justmeandopensource 5 years ago

      Hi Tushar,
      Thanks for watching this video.
      You can find the cluster information through various commands by logging into one of the mongos routers.
      Check the below link for that.
      docs.mongodb.com/manual/tutorial/view-sharded-cluster-configuration/
      During the set up process, we logged into the mongos router and added the shards. So you know how mongos and shards are connected.
      Config servers are used to store the metadata about the shards. So when you query mongos router, it checks the metadata from the config server to find out which shard contains the data requested.
      So now the puzzle is how mongos is connected to the config servers replicaset? Or how to find out what the config servers are for the given mongos, right?
      If you followed my video step by step, you would have noticed that we specified the config-servers replicaset while starting the mongos instance.
      github.com/justmeandopensource/learn-mongodb/blob/master/sharding/mongos/docker-compose.yaml
      The --configdb option is passed to mongos so that it can connect to the config-servers replica set.
      Hope this makes sense.
      Thanks.
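
For reference, the value given to `--configdb` has the form `<replSetName>/<host:port>,<host:port>,...`. The sketch below only assembles that string. The replica set name and host names are placeholders rather than the values in the linked docker-compose file, and the `configDbArg` helper is hypothetical.

```javascript
// Build the value for mongos --configdb: <replSetName>/<host:port>,<host:port>,...
// The replica set name and hosts are placeholders, not the repo's actual values.
function configDbArg(replSetName, members) {
  return `${replSetName}/${members.join(",")}`;
}

const arg = configDbArg("cfgrs", ["cfg1:27017", "cfg2:27017", "cfg3:27017"]);
console.log(`mongos --configdb ${arg} --port 60000`);
```

The mongos process reads the cluster metadata from the members named in that string, which is how it learns which shard holds which data.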

    • @tushargoel5522 5 years ago

      @justmeandopensource Thanks. I missed this point. So you have specified it in the mongos docker-compose file, but I guess my second question about metadata is still unanswered. We have not stored any metadata on the config server yet.

    • @justmeandopensource 5 years ago +1

      I think the below documentation has answers to your second question.
      docs.mongodb.com/manual/core/sharded-cluster-config-servers/
      Cheers.

    • @tushargoel5522 5 years ago +1

      @justmeandopensource Thanks. I have one more doubt. Say I have an existing sharded architecture and, due to heavy load, I need to add new shards so that capacity and performance can be improved. But I already have old data, so I can't reshard it because it requires so much work, and we may need to add more shards in future. What would our strategy be in such cases?

    • @justmeandopensource 5 years ago +2

      @tushargoel5522 Extending or adding new shards to an existing sharded architecture is common if you haven't done the capacity planning properly in the first place. So you have done some capacity planning and, based on that, set up a sharded architecture. Later you realize that the planning wasn't correct and you need more shards to improve performance. This is a very common practice.
      When you have a sharded cluster, data is split between the shards in chunks. For example, with a chunk size of 64MB, two shards and 128MB of data, you will have two chunks spread between the two shards. Now you can add a 3rd shard to this cluster. The cluster balancer will then make sure that the chunks are equally split between the 3 shards. This involves moving the chunks around so that they are equally split. This requires some resources, which you need to plan for.
      docs.mongodb.com/manual/core/sharding-balancer-administration/#sharding-internals-balancing
      Thanks.
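
The rebalancing described above can be pictured with a toy model. The sketch below moves one chunk at a time from the fullest shard to the emptiest until the counts differ by at most one. It is an illustration of the end state only: the real balancer uses migration thresholds and runs in the background, and the shard names and chunk counts here are invented.

```javascript
// Toy balancer: move one chunk at a time from the fullest shard to the emptiest
// until the chunk counts differ by at most one.
function rebalance(chunkCounts) {
  const counts = { ...chunkCounts };
  let migrations = 0;
  for (;;) {
    const names = Object.keys(counts).sort((a, b) => counts[a] - counts[b]);
    const emptiest = names[0];
    const fullest = names[names.length - 1];
    if (counts[fullest] - counts[emptiest] <= 1) break;
    counts[fullest] -= 1;
    counts[emptiest] += 1;
    migrations += 1;
  }
  return { counts, migrations };
}

// Two shards holding 6 chunks each, then an empty third shard is added.
const result = rebalance({ shard1rs: 6, shard2rs: 6, shard3rs: 0 });
console.log(result.counts, result.migrations); // 4 chunks per shard after 4 migrations
```

Each migration copies a chunk's data between shards, which is the resource cost the reply suggests planning for.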

  • @yuomtheara 5 years ago +1

    Sharding Collection Problem with Database Collection Restoring from Standalone DB.
    - I created the Mongo sharding: OK
    - Created sharding database: myDB (Enable shard): OK
    - Restore database into myDB: OK
    - Create collection sharding by create index first: OK
    sh.shardCollection("myDB.app_journals", {
    journalType: 1, // Change to `hashed` , still don't work
    });
    - Get collection sharding status: only one shard is used
    ```
    mongos> db.app_journals.getShardDistribution()
    Shard shard3rs at shard3rs/..........
    data : 3.05MiB docs : 11022 chunks : 1
    estimated data per chunk : 3.05MiB
    estimated docs per chunk : 11022
    Totals
    data : 3.05MiB docs : 11022 chunks : 1
    Shard shard3rs contains 100% data, 100% docs in cluster, avg obj size on shard : 290B
    ```

    • @justmeandopensource 5 years ago +1

      Hi Theara,
      This is expected behaviour. The default chunk size is 64MB. The data are split into 64MB chunks and are stored in a shard. So the first 64MB chunk will go to your first shard and then next 64MB will go to your second shard. At the moment you have only 3.05MB data in your chunk.
      64MB is a reasonable chunk size in production.
      If you want to test, you can lower your chunk size to 1MB and restore the data.
      Check the below documentation.
      docs.mongodb.com/manual/tutorial/modify-chunk-size-in-sharded-cluster/
      Basically, you need to connect to your mongos router and issue the following two commands.
      > use config
      > db.settings.updateOne( { _id: "chunksize" }, { $set: { value: 1 } }, { upsert: true } )
      Thanks.

    • @yuomtheara 5 years ago +1

      @justmeandopensource
      I tried sharding a new, empty collection and inserted some data (1000 docs).
      I got this sharding status (my data split across 3 shards):
      `
      // Example
      Sharding1
      ............ (500)
      Sharding2
      ........ (300)
      Sharding3
      ...........(200)
      `
      Why did it split my data when the size is less than 64MB?
      (Is there any performance problem with sharding an existing data collection, as in my options above?)
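
A likely reason the small insert spread over three shards, assuming the empty collection was sharded on a hashed key: MongoDB pre-creates initial chunks for an empty hashed collection and distributes them across the shards, so documents are routed by hash range from the first insert instead of waiting for a chunk to fill up. The sketch below imitates that idea with a stand-in 32-bit hash (FNV-1a, not MongoDB's), two initial chunks per shard as an assumed default, and invented shard names.

```javascript
// Split a 32-bit hash space into equal ranges (one initial chunk per range)
// and hand the chunks out to the shards round-robin.
function initialChunks(shardNames, chunksPerShard = 2) {
  const total = shardNames.length * chunksPerShard;
  const width = Math.ceil(2 ** 32 / total);
  return Array.from({ length: total }, (_, i) => ({
    min: i * width,
    max: Math.min((i + 1) * width, 2 ** 32),
    shard: shardNames[i % shardNames.length],
  }));
}

// Stand-in hash (32-bit FNV-1a), not MongoDB's real hash function.
function fnv1a(value) {
  let h = 0x811c9dc5;
  for (const ch of String(value)) {
    h ^= ch.charCodeAt(0);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Route a key to the shard that owns the chunk covering its hash.
function shardOf(value, chunkList) {
  const h = fnv1a(value);
  return chunkList.find((c) => h >= c.min && h < c.max).shard;
}

const chunks = initialChunks(["shard1rs", "shard2rs", "shard3rs"]);
const perShard = {};
for (let i = 0; i < 1000; i++) {
  const s = shardOf(`doc-${i}`, chunks);
  perShard[s] = (perShard[s] || 0) + 1;
}
console.log(perShard); // the exact split depends on the hash values
```

An existing collection sharded after the data is loaded starts with its data in a single chunk instead, which fits the one-shard result reported earlier in this thread.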

    • @yuomtheara 5 years ago +1

      @justmeandopensource
      Now I tried creating all the sharded collections before restoring the database (without --drop).
      It works fine (split across 3 shards).

    • @justmeandopensource 5 years ago +2

      Cool. I am very glad that you are making good progress in learning and understanding MongoDB, which is what I want. Cheers.