Dropbox system design | Google drive system design | System design file share and upload

  • Published on 17 Jan 2025

Comments • 306

  • @sumonmal009
    @sumonmal009 3 years ago +43

    idea scope 1:38
    scale 2:10
    HLD 2:41
    problem to solve 4:55 6:57
    solution 10:41
    metadata file 15:26
    HLD 17:38
    messaging service detail 25:01 device sync feature
    metadata handling 28:40
    metadata schema 31:48
    edge store usage to serve metadata 36:16
    search feature 40:01

  • @roopaschannel9731
    @roopaschannel9731 4 years ago +5

    Thanks for your channel Naren! Brings back my love for computer science. We need more such teachers that can break things down and explain it as simply as you have done here.

  • @RakeshGajjar
    @RakeshGajjar 5 years ago +106

    Give this man the credit he deserves 👏🏼👏🏼👏🏼

  • @eugenekim6937
    @eugenekim6937 2 years ago +4

    Great system design. I really wish he had explained why file change sets need to be ordered and consistent, which is what led him to use a relational database for the metadata.
    If you look at his design for google docs, it doesn't even use a relational database for massively concurrently updated files.

    • @rabindrapatra7151
      @rabindrapatra7151 1 year ago

      Yes. He explained google docs using operational transformation.

  • @stevemew6955
    @stevemew6955 5 years ago +2

    Great work Narendra. This is the best video I have found so far on YouTube on the Dropbox architecture.

  • @simpleurbanliving
    @simpleurbanliving 2 years ago +2

    Enjoyed this video more than others because of the cute doggo interruptions. :) Thank you!

  • @deepakzworld
    @deepakzworld 4 years ago +6

    The best part I like about your videos is you do a lot of research to put the information from various sources about a topic into one place. You are our Edgestore ;)

  • @vaibhavsingh9x
    @vaibhavsingh9x 5 years ago +1

    Another reason to use async queues: one cannot assume that only a single file will be uploaded. There could be a case in which multiple files could be uploaded and a queue ensures that chunks do not get mixed with each other. I guess one can also talk about failover (what happens when a chunk gets lost during transmission/gets corrupted) but that might not be required.
    Edit: NVM he covers this case as well LOL. Love the depth he goes into when covering different components.

  • @RandomShowerThoughts
    @RandomShowerThoughts 5 years ago +13

    15:39 LMFAO! great video man, you are my go to for system design prep

  • @akinkanju9653
    @akinkanju9653 6 years ago +31

    Hello Naren! your channel is a goldmine. I've learned quite a lot. Please consider creating content that dives deep into data models/schemas/datasets. Thanks 🙏

  • @samahome
    @samahome 2 years ago

    Your explanations and approaches in explaining these System Design Problems are absolutely phenomenal.

  • @chepaiytrath
    @chepaiytrath 4 years ago +3

    Clients described at 18:26, taking an example of Google Drive, refer to the various "Backup and Sync" desktop clients which you might have active on multiple devices. All these clients keep listening to a messaging queue. In case one device makes changes to a file, the change is propagated to S3 and all clients are notified of this by publishing the change to the messaging queue which they are listening to. The client which is the originator of the change doesn't care but other clients do and when they know of a change they update their local copies (download the whole file if not present).
    Update:
    It's not just one Q2. Each client will have its own queue on which the change is broadcasted. This is to have an asynchronous behaviour wherein the client can be offline for a period and then when it is online it starts listening to the queue for any changes
    This is my understanding. Correct me if I'm wrong
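
A minimal sketch of the per-client queue idea described in this comment, assuming an in-memory stand-in for the messaging service. The class name `SyncFanout` and its methods are hypothetical and are not taken from the video or from any real queueing library; the point is only to show that the originating device is skipped and that an offline device finds its pending changes when it reconnects.

```python
from collections import deque


class SyncFanout:
    """Toy in-memory stand-in for one response queue per device (hypothetical)."""

    def __init__(self):
        self.queues = {}           # device_id -> deque of pending change messages
        self.devices_by_user = {}  # user_id -> set of device_ids

    def register(self, user_id, device_id):
        """Give a device its own queue the first time it is seen."""
        self.devices_by_user.setdefault(user_id, set()).add(device_id)
        self.queues.setdefault(device_id, deque())

    def publish(self, user_id, origin_device, change):
        """Copy a change to every device of the user except the one that made it."""
        for device_id in self.devices_by_user.get(user_id, set()):
            if device_id != origin_device:
                self.queues[device_id].append(change)

    def drain(self, device_id):
        """Return and clear everything queued while the device was away."""
        pending = list(self.queues[device_id])
        self.queues[device_id].clear()
        return pending


fanout = SyncFanout()
for device in ("laptop", "phone", "tablet"):
    fanout.register("alice", device)

# The laptop edits a file while the phone is offline.
fanout.publish("alice", "laptop", {"file": "notes.txt", "chunk": 5})

print(fanout.drain("laptop"))  # the originator has nothing to apply
print(fanout.drain("phone"))   # the phone picks up the change when it reconnects
```

The sketch also makes the scaling concern raised elsewhere in the thread visible: the number of queues grows with the number of registered devices.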

  • @helloworld7313
    @helloworld7313 4 years ago +29

    honestly as a swe working at dropbox, i don't feel like this is the answer i am looking for. It misses a lot of important stuff, like how do you design your database schema for storing the metadata, and what would your sync protocol look like? what if there are write conflicts during sync, how do you deal with that? and the search engine part is i guess the least likely bonus question i'll ask in an interview (probably makes more sense in design twitter)
    no offense to Narendra, i think you put in a lot of effort/research into this and even referenced dropbox's blog post on network edge infra.
    but i think this is a problem with almost all of these youtube system design videos: yes, you will learn a little bit here and there, but it's not the same as a real interview, so don't expect to memorize some sys design solution and pass the interview.
    better ways to learn system design:
    read DDIA, web scalability for startup engineers, take a distributed system class
    listen to real mock interviews if you somehow can(or some faang engineer does these mock interviews and post them somewhere i guess)
    design and implement projects at your job if you have the opportunity

    • @Doug-rv3nr
      @Doug-rv3nr 5 months ago

      It depends on the system design interview you attend.

    • @princeroshan4105
      @princeroshan4105 8 hours ago

      This should be pinned

  • @rajkrishna8294
    @rajkrishna8294 2 years ago

    You don't have studio but you are delivering better content than those who have studio.

  • @saiprajeeth
    @saiprajeeth 5 years ago +4

    WTF. only 633 likes out of 38,663 views for this gold? Come on viewers, you are beholden to this guy who is putting enormous effort into sharing knowledge beyond his boundaries.

  • @dhhsncnd6107
    @dhhsncnd6107 5 years ago +1

    Awesome video that comes down to details for real design not just for interviews 😄

  • @ragingpahadi
    @ragingpahadi 4 years ago

    Give this man Bharat Anmol Ratna : ]. Thanks for SD series it helps us broaden our thinking and not just defect fixing and small CR.

  • @viktorartemov2361
    @viktorartemov2361 4 years ago +11

    Needs an explanation of how exactly one detects which chunk was changed. Your applications (a video editor, for example) don't know anything about chunks; they don't change a chunk, they change your file. It's up to your Dropbox client to figure out which chunk the change corresponds to, and that is not immediately obvious, especially for huge binary files.

    • @jyotsnamadhavi6203
      @jyotsnamadhavi6203 1 year ago +1

      Hash computation can help

    • @Doug-rv3nr
      @Doug-rv3nr 5 months ago

      @jyotsnamadhavi6203 Yes, correlation ID is used in Kafka.

  • @amoghasoda
    @amoghasoda 4 years ago +5

    Hey Naren. Great job! Few questions for you.
    1. Why can't we expose a single service which takes chunks of data and make metadata entry into database and also stores chunks to S3 instead of client calling both services?
    2. From your design, if sync service pushes notifications to a topic are we maintaining dedicated topics/partitions for different clients? Or are we pushing notifications via Websockets/HTTP Polling?
    Few comments:
    1. If clients go offline they can still come back and establish connections via Websockets?
    2. We can't have 'n' number of topics because creating Kafka topics/JMS queues need infrastructure support and is a costly operation. Also creating partitions in a live system is a costly affair. Pls let me know if I'm missing anything.

    • @jamesneesham70
      @jamesneesham70 1 year ago

      Though this video is a good starter, it gets things wrong in multiple places

  • @rohittiwarirvt
    @rohittiwarirvt 3 years ago

    A great video for understanding the design of a file storage service like Dropbox. I'm preparing for an interview and this content is helpful

  • @druidclash9161
    @druidclash9161 6 years ago +8

    Shit, it's fucking perfect explanation. Thanks for all these stuff.

  • @yog2915
    @yog2915 3 years ago

    Amazing no nonsense serious designs which are really good hatsoff bro 👍 keep doing good work

  • @yawar110
    @yawar110 4 years ago +1

    Salaams and respect from Pakistan for you sir! You are a hard working and a smart individual who is helping the IT community across the world using whatever best resources you have. Keep up the good work - Keep posting them system design videos. God Bless!

  • @Yan-rv8mi
    @Yan-rv8mi 4 years ago +9

    33:56 Here you threw the problem that we need to rebalance/re-shard as we get more and more data in one shard, but the subsequent mentioned approach "edgestore" does not seem to solve this, does it? It seems like the edge wrapper simply provides a better interface for developers to read/write data. How does the "edgestore" help in regards to the data sharding parts?

    • @ishanchopra7468
      @ishanchopra7468 3 years ago

      Yeah Naren, would like to know the answer to this - how is the cost of denormalization required due to sharding reduced by edgestore?

  • @sananirajabov3
    @sananirajabov3 6 years ago +7

    Great system design and clear explanation, thank you !

  • @archfitness2399
    @archfitness2399 2 years ago

    Excellent way of explaining the concept.
    And I really enjoyed the pictures of the dogs barking in the middle of the presentation. 🙂👍

  • @pavankumaruppuluri4097
    @pavankumaruppuluri4097 4 years ago +4

    While explaining why we need a queue instead of an HTTP call to the sync service, you mentioned we need it because the client may not always be connected. My question is: if the client isn't connected to the internet, that message can't be transmitted to the queue either, right?

  • @codetolive27
    @codetolive27 6 years ago

    Very informative. You have covered each layer like front end, Middle tier and database layer effectively. Thanks

  • @chilamakoorugangadevi9208
    @chilamakoorugangadevi9208 4 years ago

    Really you done a good & great job annaiah.....Awsome explanation,tq☺️

  • @nribackpacker
    @nribackpacker 4 years ago

    Sirji excellent video

  • @dimei4170
    @dimei4170 6 years ago +6

    Very nice video! Please do an Instagram system design for the next one! Thank you!

    • @raywu9685
      @raywu9685 5 years ago

      Without reference to original paper “Designing a Dropbox-like File Storage Service” by Alejandro Ramirez, Fariborz Khanzadeh, Hassaan Bukhari. this is unfair.

  • @sweetyb3287
    @sweetyb3287 6 years ago +2

    Awesome! Loved the explanation and learned a lot. Last part of the search design for this service could be expanded into another video.

  • @joeyyu133
    @joeyyu133 4 years ago +2

    I am not quite clear about the response queue. Is it necessary? If each client maps to a response queue, and what if the client never comes back? Are we still posting messages to its queue? Meanwhile, why not just let each client periodically check the diff between the local metadata vs. the latest metadata? By doing this, we can get rid of the response queues, right?

  • @derpina615
    @derpina615 3 years ago +1

    Is the logic same for JPG files (images or videos)? Is it different to stitch together a text file vs video file?

    • @derpina615
      @derpina615 3 years ago +1

      oh, its in bytes is it. chunks are in 1/0 format. ok got it. facepalm :)

    • @RandomShowerThoughts
      @RandomShowerThoughts 2 years ago

      videos should be encoded and broken into chunks

  • @Maw0822
    @Maw0822 4 years ago +6

    What happens to the chunks when I add data to the file that would be contained in the first chunk, causing it to go over its limit? Wouldn't that cause a cascading effect where every chunk spills over into the next chunk? Our small change in one chunk would cause changes in every chunk, no?

    • @sudhasravan92
      @sudhasravan92 2 years ago +1

      Exactly!! I have been struggling with the same question for the last few days but could not find an answer anywhere!

    • @mahee96
      @mahee96 2 years ago

      @sudhasravan92 Haha, at least on this point it was not just me scratching my head. Jokes apart, I once had a discussion with my coworker about why git scm was not being used, and he reminded me how git works,
      which is by storing the delta/diff between two files, so that when a file is modified, only the delta info is uploaded or downloaded.
      BUT, he explained to me that this is exclusive to TEXT ENCODED files, and not for BINARY files, because git can in no way know what the delta is when the actual data is binary (such as .exe, .obj, .dat, .class etc).
      He confirmed that in the case of binary files, git actually stores the new file completely, so this is equivalent to storing old file + new file, which doubles the storage required.
      HENCE git is not intended to store BINARY files where delta info can't be determined.
      Considering this theory, you can see that chunking the file to be uploaded can save you in terms of network errors, so that you can re-upload an erroneous chunk, but it is not helpful at all as delta information.
      Because when the file is modified, the whole file can't be chunked again the way the previous version was chunked and compared with the previous version's chunks in a 1:1 manner,
      nor can it be variably chunked such that we can deduce the exact chunk that has changed, considering the file is binary, where the data could be machine code (exe) for a processor.
      If someone can point me to "THE OBVIOUSNESS" of the chunker design shown here and its purpose/usefulness, I would be very thankful!

  • @xuemingzhang8456
    @xuemingzhang8456 3 years ago

    The content is always great from this channel, but if you can use a microphone while talking that will bring the video to the next level.

  • @AmdJunaid
    @AmdJunaid 6 years ago +2

    Truly amazing. Hats off to you. 🙏😍 Request you to upload more of such videos. It would be too awesome if we can have a system design tutorial for beginners and how to improve.

  • @leprofesseurshen
    @leprofesseurshen 4 years ago

    Man, I wish I discovered your channel sooner. I recently failed on a system design interview, Dropbox system design particularly. Thanks for your work. I will study every single of your video and prepare myself for my next interviews.

  • @chenx3838
    @chenx3838 5 years ago +2

    So clear and easy to understand, keep going!

  • @manojbgm
    @manojbgm 3 years ago

    Nice explanation. Insightful

  • @dhruv4u9
    @dhruv4u9 6 years ago +11

    Nice explanation Naren. Why is the messaging service sending information to clients? It should rather be a pull model on the client's side, where clients periodically pull the chunks from the server. It should be based on the last synced chunk_id. So, even if a client is unavailable for a while, it can sync based on the last synced chunk_id.

    • @armharish
      @armharish 4 years ago

      I agree

    • @psn999100
      @psn999100 4 years ago

      How does this work? Let's say some client had last synced chunk_id = 10 of a particular file (this file has 10 chunks). Now let's say chunk_id = 5 has changed for this particular file. How will the other clients know that they need to get chunk_id = 5 from the object store? Your idea only works if the file increases in size, thereby increasing the chunk_id. Please correct me if I am wrong

    • @oscarmvl
      @oscarmvl 3 years ago

      @psn999100 you store the time of the last modification seen by client #1. Let's say another client #2 updates a chunk; then you update the latest modification time on the server. When client #1 asks whether there have been any changes since its last time, the server compares that to its own last modification time (set by client #2) and lets client #1 know that there are changes it is missing, namely the changes made by client #2.

    • @Anoopchaudhary36
      @Anoopchaudhary36 3 years ago

      We may need to pull only when client connects then for further updates queue can be used that can help us avoid constant polling to server
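
The pull model discussed in this thread can be sketched roughly as follows. This is an illustration under stated assumptions, not the design from the video: `ChangeLog` is a hypothetical name, and a server-side sequence number stands in for the "last modification time" so that the example does not depend on clocks. Because the log records which chunk changed, a client that is behind learns that chunk 5 was modified even though the file did not grow.

```python
class ChangeLog:
    """Hypothetical server-side log of chunk changes, ordered by sequence number."""

    def __init__(self):
        self._entries = []  # list of (seq, file_id, chunk_id)
        self._seq = 0       # stand-in for "last modification time"

    def record(self, file_id, chunk_id):
        """Called when any client uploads a changed chunk."""
        self._seq += 1
        self._entries.append((self._seq, file_id, chunk_id))
        return self._seq

    def changes_since(self, cursor):
        """Return the chunks changed after `cursor`, plus the new cursor to store."""
        changed = [(f, c) for seq, f, c in self._entries if seq > cursor]
        return changed, self._seq


log = ChangeLog()
client1_cursor = 0                 # client #1 has seen nothing yet

log.record("report.doc", 5)        # client #2 rewrites chunk 5 of a 10-chunk file

missing, client1_cursor = log.changes_since(client1_cursor)
print(missing)                     # client #1 now knows to fetch chunk 5

missing, client1_cursor = log.changes_since(client1_cursor)
print(missing)                     # nothing new on the next poll
```

A client could call `changes_since` once on reconnect and then rely on push notifications, which is the hybrid suggested in the last reply.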

  • @karthiyogi93
    @karthiyogi93 6 years ago +6

    Wow. Amazing. U r doing a grt job.

  • @mohammedmohideen1756
    @mohammedmohideen1756 3 years ago

    Wonderful Explanation...!! Thanks for the work Naren.

  • @abhishekkapoor7955
    @abhishekkapoor7955 2 years ago +1

    A separate queue for each client doesn't sound good. Additionally, we are using the queue as persistent storage, which should be avoided because a large number of messages can pile up in the queue without any proper ordering. Instead, the client side can call the sync service to fetch the latest file index for the user.

  • @T-Sparks208
    @T-Sparks208 4 years ago

    Amazing .. I am new in system design and I've learned a lot.. Thankyou so much

  • @ashleyspianoprogress1341
    @ashleyspianoprogress1341 2 years ago +1

    I have a question about the response queues. At the beginning you mention dropbox has 500 million users. If we assume a user has 3 devices on average, that leaves us maintaining 1.5 billion response queues. Am I misunderstanding something?

    • @yusufsipahi3916
      @yusufsipahi3916 1 year ago

      I think this solution is not scalable.

  • @tacowilco7515
    @tacowilco7515 4 years ago

    thank you for the video
    it gets the very general idea about how it works
    but without important details though
    once again thanks

  • @madhuj3683
    @madhuj3683 2 months ago

    Love your videos...Thank you so much for sharing

  • @codinga-cx1nn
    @codinga-cx1nn 1 year ago

    THE BEST OF THE BEST -> PLEASE, CONTINUE YOUR CHANNEL!

  • @arjun.s5112
    @arjun.s5112 4 years ago

    Thank you so much. The best system design video on this topic.

  • @srikanth26mar
    @srikanth26mar 5 years ago +3

    Firstly, thanks for the video. it would have been interesting to know how the Edge Wrapper achieves transaction isolation level without explicit locking/transaction.

  • @TheDibyendusarkar
    @TheDibyendusarkar 4 years ago +5

    What if we send the diff only, what git does. Storing a tree like structure of changes.

  • @Icix1
    @Icix1 3 years ago

    just fyi, cassandra consistency model provides a higher chance of reads being consistent, but doesn't provide true linearizability. This is why it's better to use terms like linearizability and not consistency as DB providers can play games with their definition of "consistency". Cassandra and similar nosql variants are basically partitioned key value stores in disguise and cannot ever compete with a true relational database. Also, even within relational databases, configuring isolation levels is pretty important, and it's easy to get tripped up there.

  • @harshakada3374
    @harshakada3374 5 years ago +1

    Those are great videos that u r doing. Can you please start a course about system design basics n how to build from scratch to advanced level. Please do that course I would love to buy. Thank you 😀

  • @vrushangdesai2813
    @vrushangdesai2813 6 years ago +4

    excellent video, thanks a ton.
    pls make a video on system design for decentralized applications on ethereum and ipfs (like decentralized uber)

  • @sadihassan8407
    @sadihassan8407 5 years ago

    You are the best! Thank you so much for explaining this so nicely!!!

  • @veereshvik3521
    @veereshvik3521 4 years ago

    Doing great job Naren, keep up the spirit 👍🏻

  • @Amin-wd4du
    @Amin-wd4du 5 years ago +4

    Very good content. I loved the dog barking.

  • @AnonYmous-yu6hv
    @AnonYmous-yu6hv 3 years ago +1

    11:25 You can't just break a file like that, what if one of the parts' size increased 100 times? what if it's a binary file and you just synced something that broke it?

    • @sniGGandBaShoR
      @sniGGandBaShoR 3 years ago +1

      all chunks are going to be the same size. If the file gets bigger you get more chunks.
      now if it's not a text file and you cannot do good diffs, the client needs to split the whole file into chunks again with the same algorithm. Then the client compares the "new meta data" with the "old meta data" and just sends the changed chunks to the server

    • @mahee96
      @mahee96 2 years ago

      @sniGGandBaShoR so basically what you are saying is: if 10kb of data is added after the first 2kb of a 20GB binary file, then all 40 chunks of 500mb are now obsolete, as the first chunk is modified and the rest of the chunks are displaced by 10kb, so they wouldn't match the previous version either.
      Eventually you are simply uploading the full 20GB :(
      I am eager to know how you propose to solve the "chunker problem".
      What I can think of is:
      A binary diff tool which can find the diff in bytes from the previous version to the current version, like a patch.
      Even if such a tool is available, you still need to consider that the original file cannot be modified by any means.
      So does the uploader software which has this "chunker" component need to physically create chunks of data on the disk (i.e. use another 20GB), even if it processes chunk by chunk?
      Or is it going to build an in-memory chunk marker structure (like seek pointer data, without actually chunking the file) so that the chunk data of each file is processed on the client system?
      How is it going to respond to 10 more 20GB files waiting to be uploaded while it computes this chunk data for, say, the next 20 mins and then uploads it in the next 2 mins?

    • @mahee96
      @mahee96 2 years ago

      @sniGGandBaShoR At least from what I have seen of gdrive and dropbox, I don't think they do this chunk transfer for "obvious reasons", because when I upload, say, a 2 GB video, then do a small edit to cut out the last 1 min and reupload the new 1.9GB video, it still takes the full time that was required for the initial 2GB.
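
Both sides of this exchange can be checked with a small experiment. The sketch below assumes fixed-size chunks compared by SHA-256 digest, as described in the first reply; the helper names `manifest` and `changed_chunks` and the 4-byte chunk size are illustrative choices and not anything stated in the video. An in-place overwrite touches one chunk, while a one-byte insertion shifts every later chunk, which is exactly the objection raised above.

```python
import hashlib

CHUNK_SIZE = 4  # deliberately tiny for illustration; a real client would use far larger chunks


def manifest(data, chunk_size=CHUNK_SIZE):
    """Return the SHA-256 digest of each fixed-size chunk, in file order."""
    return [hashlib.sha256(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)]


def changed_chunks(old, new):
    """Indices in `new` whose digest is missing from or differs in `old`."""
    return [i for i, digest in enumerate(new)
            if i >= len(old) or old[i] != digest]


original = b"AAAABBBBCCCCDDDD"
overwrite = b"AAAABxBBCCCCDDDD"   # same length, one byte changed inside chunk 1
insertion = b"AAAAxBBBBCCCCDDDD"  # one byte inserted at the start of chunk 1

print(changed_chunks(manifest(original), manifest(overwrite)))  # [1]
print(changed_chunks(manifest(original), manifest(insertion)))  # [1, 2, 3, 4]
```

So fixed-size chunking saves bandwidth for overwrites and appends, but not for insertions or deletions in the middle of a file.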

  • @kristhiantiu4317
    @kristhiantiu4317 3 years ago

    for the length of video, i learned a ton

  • @deepakmahtohan
    @deepakmahtohan 6 years ago

    believe me, ur channel will gonna have 50K+ subscribers within 3 months, keep up the good work

  • @ameyapatil1139
    @ameyapatil1139 4 years ago

    Fabulous videos, excellent information and lots to learn ! Dogs were hilarious.

  • @bephrem
    @bephrem 5 years ago +3

    15:38 hahahahaha, I do that sometimes too haha. I pause then forget to clip the video or I just can’t clip the video since noise is in a critical segment.

    • @karthikmucheli7930
      @karthikmucheli7930 5 years ago +1

      Hey back to back SWE. :D

    • @bephrem
      @bephrem 5 years ago

      @karthikmucheli7930 hey man, wassup

    • @karthikmucheli7930
      @karthikmucheli7930 5 years ago +1

      @bephrem yo, you are great. Like all your videos. :)

  • @vallimcts
    @vallimcts 4 years ago

    Thanks, you are doing a great job. Also, It would be really helpful if you could run the whole flow once at the end. So that we don't have to watch the full video when revisiting the video for the second time.

  • @amyzeng3816
    @amyzeng3816 5 years ago +2

    Do you mean there will be a queue maintained for each client? Actually two, one request and one response queue. Is this design efficient?

    • @ddtoledo
      @ddtoledo 5 years ago

      Amy Zeng I thought the same thing. Waiting on a comment from him about it

  • @bridgetp3733
    @bridgetp3733 9 months ago

    Thank you so much. This was fascinating!

  • @xinma7914
    @xinma7914 3 years ago

    you look really good and confident

  • @JohnGummadi
    @JohnGummadi 2 years ago

    Just curious about the chunks, are you assuming the client to have the knowledge of and be able to handle various file formats?

  • @avinashbole4827
    @avinashbole4827 6 years ago

    Amazing video, Very detailed and to the point!! If possible, please add Fault tolerance and Security related usecases to be incorporated in the design

  • @chabhishyam
    @chabhishyam 5 years ago

    Great Video Narendra. I have a question. Shouldn't the data also be compressed and encrypted? I think this would help with both bandwidth and security.

  • @Amandeep-bt4kl
    @Amandeep-bt4kl 4 years ago +2

    Thanks Narendra for this great resource on designing Dropbox. Could you please help me with the following doubts:
    1. How can we implement this application so that it resolves conflicts? For example, I made a few changes in a document and they were synced to the server, but my other clients were offline for quite some time. Then I started using another client which was offline, made significant changes on it, and only then realised that I was offline. What would happen then? I don't want to lose any of my changes, and the document is not a simple text file.
    2. Should the chunk size be determined dynamically depending on the individual file size, or should it be fixed for the entire application? Suppose it is fixed and we change a chunk in the middle of a document (say the 2nd chunk out of 10). Now the size of the 2nd chunk has tripled. Should we break it up and reorder all the chunks (my point is that with 1000 chunks, re-ordering would be an issue) or keep it as it is? What are your comments on this?
    Thanks for your inputs in advance.

    • @yakshitjain3048
      @yakshitjain3048 4 years ago

      There is a version control system called git which can help to accomplish the doubts you have.

  • @ZeeshanAmber
    @ZeeshanAmber 4 years ago

    Great work Narendra. I'm learning a lot from your videos. I have gone through almost all your system design videos. Just checking if you can create one on a Saas product like Salesforce. I didn't find any good video on Salesforce / Shopify like services.

  • @experience-engineering
    @experience-engineering 3 years ago

    Hello Narendra,
    Could you please make a video to design "google photos" like app? Or what architectural changes you would do in this existing design of drop box to limit it to "google photos"? By the way, your video has been real source of knowledge!

  • @MaheshR2021
    @MaheshR2021 6 years ago

    Great job, Naren! Love your work. Keep it up!

  • @amitkabraiiit
    @amitkabraiiit 6 years ago +3

    Can you create videos on data models/schemas/datasets as has been asked some comment below as well.

  • @OmprakashYadav-nq8uj
    @OmprakashYadav-nq8uj 5 years ago +1

    Hey, I really like the explanation and the concept of the solution you provide. Can you make a video on YouTube system design? There is no video on YouTube yet.

  • @damluar
    @damluar 5 years ago +1

    Chunks idea is good, but your statement that if we change only some bytes we will only have to upload that chunk has a problem. If you add X characters to 4th chunk, all the following chunks change as well, they all will be shifted by X characters. So when your script checks the hash sum of a chunk, it will have to upload [4;n] chunks. Unless we can adjust how we split a file into chunks.

    • @chabhishyam
      @chabhishyam 5 years ago

      I think you just need to know how HDFS systems chunk the data into its default size. It won't separate the data from
      (L = line, C = column)
      1L-1C to 10L-12C: chunk 1
      10L-13C to 25L-30C: chunk 2
      .
      .
      so........
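
One way to "adjust how we split a file into chunks", as the parent comment puts it, is content-defined chunking: boundaries are placed where a rolling hash of the last few bytes hits a chosen value, so they move together with the content when bytes are inserted. This is a technique swapped in here for illustration; nothing in the video or in this thread says that Dropbox or Google Drive works this way. The function name `cdc_chunks`, the toy rolling sum, and the `WINDOW` and `DIVISOR` values are all assumptions chosen to keep the sketch short.

```python
import hashlib
import random

WINDOW = 16   # number of trailing bytes that decide a boundary (assumed value)
DIVISOR = 64  # average chunk length is roughly this many bytes (assumed value)


def cdc_chunks(data):
    """Cut after any byte where the sum of the last WINDOW bytes hits a target residue."""
    chunks = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]   # drop the byte that left the window
        if rolling % DIVISOR == DIVISOR - 1:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])       # trailing partial chunk
    return chunks


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(20_000))
edited = original[:100] + b"0123456789" + original[100:]  # insert 10 bytes near the start

old_digests = {hashlib.sha256(c).hexdigest() for c in cdc_chunks(original)}
new_chunks = cdc_chunks(edited)
to_upload = [c for c in new_chunks
             if hashlib.sha256(c).hexdigest() not in old_digests]

print(len(new_chunks), "chunks in the edited file,", len(to_upload), "need uploading")
```

Because each boundary decision depends only on the preceding `WINDOW` bytes, chunks that lie entirely after the edited region hash the same as before, and only the chunks around the insertion need to be uploaded. A production chunker would use a stronger rolling hash and enforce minimum and maximum chunk sizes.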

  • @deepaknyool
    @deepaknyool 6 years ago +2

    Great job Narendra, look forward to seeing more interesting content from you. Apart from system design, it would also be nice if you could do a couple of class design and DB design examples. Design a chess game (all the classes and design patterns) or design the database schema for Instagram would be good examples.

    • @TechDummiesNarendraL
      @TechDummiesNarendraL 6 years ago

      Sure I will as soon as I get more time to work on videos.

  • @arnab_speaking
    @arnab_speaking 2 years ago

    sweetest part of the video at 15th Min

  • @VenkeeN17
    @VenkeeN17 4 years ago +1

    Great system design video. Thank you !!!!

  • @Tigerjz32
    @Tigerjz32 4 years ago +1

    I really like the idea of breaking the file into chunks but why not use document diff libraries to be sure to only capture the changes that have taken place?

    • @rabindrapatra7151
      @rabindrapatra7151 1 year ago

      Calculating a diff is complicated in a collaborative environment. He explained operational transformation in one of his videos. We can use libraries, but we also have to know how they work.

  • @ilyanaoumov5425
    @ilyanaoumov5425 4 years ago +1

    I'm not convinced by the argument for using queues in the design. If clients need to obtain the latest changes, they will need to establish a connection of some sort to some service. You could make a REST call asking for the latest data which could do a search against the metadata DB, or you could call a service that reads from the queue. I think the main argument for using a queue is latency. Your read/write path might take a long time, so could stand to gain by doing append only writes to a queue and having latest metadata responses pre-populated in multiple queues.

  • @rahulchudasama
    @rahulchudasama 5 years ago +2

    First it was awesome explanation, one question clicked in my mind how git ver works with file history and lines changes detected?

  • @yishanlu3644
    @yishanlu3644 4 years ago

    The most handsome tech guy I have found in youtube! Thanks a lot !

  • @clintonrego2765
    @clintonrego2765 5 years ago

    Great video!! Wanted to cross check is S3 block storage? I thought S3 is object storage and EBS is block storage.
    I didn't know what the 2 were but luckily stumbled upon 19:00 and started googling. Thanks!

  • @adilsheikh9916
    @adilsheikh9916 4 months ago

    @26:14 : If the devices are offline, for that I think, we are using Clients in our services...I think queues will be used for scaling.
    @33:19 : Seems statement got incomplete during editing?

  • @yogidalal90
    @yogidalal90 4 years ago

    @34:00 - @36:00 how ORM wrapper solves the issue of re-sharding or addition of more DBs?

  • @nomadsoul466
    @nomadsoul466 3 years ago

    Are you saying there is a response queue for every client?

  • @shivaprasad.v.g7526
    @shivaprasad.v.g7526 4 years ago

    This is amazing video with lots of details. If you could add more details on which part runs where , it will be complete .

  • @ramane2900
    @ramane2900 3 years ago

    Hey Narendra,
    Great video. I am learning a lot. In the beginning of the video you talked about designing the system for 10 million users. Is this coming in another video? How is the sizing for required resources done. I am curious. Thanks mate.

  • @partrivedi1122
    @partrivedi1122 2 years ago

    Truly one of the best system design videos on YouTube. Well done!

  • @bethmonka8741
    @bethmonka8741 5 years ago

    Trying to understand what is reading the Database? It seems the sync service is the only one interacting with the database and then pushing the same message it received back to the response queue?

  • @palashmaran
    @palashmaran 2 years ago

    Narendra L, the design looks good, but I doubt the current design will work with browsers as clients. Could you please update the documentation on how to optimize upload/download using browsers such as Google Chrome? Is it possible to add client-side logic to split and merge chunks in the Google Chrome browser?

  • @adityamanjrekar7675
    @adityamanjrekar7675 6 years ago +1

    The videos are amazing, Very helpful. I have seen all your videos. Thank you so much. Can you please make a video on Designing Amazon Lockers?

  • @vishalmahavratayajula9658
    @vishalmahavratayajula9658 4 years ago

    Naren, how does the message queue send http protocol to other clients is it polling? I think it should be a different protocol. I think it should be a websocket right?

  • @PrasanjeetMohapatra
    @PrasanjeetMohapatra 6 years ago

    You deserve a million subs. please make a system design on Inshorts and Instagram.

  • @vishwastyagi2770
    @vishwastyagi2770 5 years ago

    Hi Narendra,
    What is that first client mentioned which does chunking, indexing and pushing messages to other clients?
    Where this client resides?

  • @mogomotsiseiphemo1681
    @mogomotsiseiphemo1681 5 years ago

    Great work! I think we should have a block on the client side to reconstruct the document!

  • @325venkat1
    @325venkat1 2 years ago

    Would be helpful if you also include the sql schema for important tables such as entity table, chunk table etc. At 31:43 you only show chunk table json but since this is mysql, not fully clear. How would entity (file/folder) table look like?

  • @parulsaxena1136
    @parulsaxena1136 5 years ago

    Wonderful videos! Learned a lot from your videos.