System Design Interview: Design Dropbox or Google Drive w/ a Ex-Meta Staff Engineer

  • Published on Dec 30, 2024

Comments • 278

  • @abhijeet8710
    @abhijeet8710 6 months ago +143

    "Have you done any System Design course? How are you so good with this subject?" - These were the words of my interviewer. I had a High Level + Low Level system design with a start-up recently. Surprisingly the question was to design a file sharing system such as Google Drive as described in this video with some additional features. I explained the HLD with the diagram as I had learned from the concepts of this video. After the HLD was over, the interviewer told me that I had created a very robust & elegant system. He further said, he was so satisfied with the HLD, that he no longer wants to go into the LLD.
    Folks, these videos are absolutely everything you will ever need to ace a system design interview. Do remember to learn the fundamentals used in the system. A huge thanks to #Hello Interview for putting out the best content out there.

    • @JohnVandivier
      @JohnVandivier 6 months ago +13

      "he was so satisfied with the HLD, that he no longer wants to go into the LLD. "
      GOALS! kudos and congrats

    • @hello_interview
      @hello_interview 6 months ago +16

      This is epic!

    • @charan775
      @charan775 2 months ago

      which startup bro?

    • @abhijit-sarkar
      @abhijit-sarkar 29 days ago +1

      These videos are undoubtedly great, but your interviewing experience at some start up doesn't prove that. Interviewing is taught at FAANG companies, and some dude at a company that opened 6 months ago wouldn't even come within 9 miles of a FAANG interviewer.

  • @KiritiSai93
    @KiritiSai93 2 months ago +12

    You guys remind me of the "Acquired" podcast hosts. No click-baits or cringe posts, just sheer passion about the subject and high-quality in-depth analysis of things. Kudos and hope you continue the great work!

    • @hello_interview
      @hello_interview 2 months ago +2

      That’s the idea. Pure value no BS 🫡

    • @draugno7
      @draugno7 1 month ago

      I also loved the jokes and an occasional reassurance in the Uber video, looking forward to more! Ddinngdding (that driver's phone after Taylor Swift concert in a badly designed system). This channel is simply amazing because it ties together all of the concepts I learned and even elaborates on different DSs and DBs. Someone said 'no shade to other youtubers' but I say 'yes shade' because they usually confuse and frustrate people who watch with incomplete diagrams and explanations.

  • @YeetYeetYe
    @YeetYeetYe 4 months ago +33

    Simply amazing. I don't mean to throw shade to other channels, but this is by FAR the best system design interview prep. So many other channels are just people with a couple of months of experience at FAANG and it really shows the difference between junior FAANG engineers and Staff FAANG engineers. Extremely high quality work.

    • @hello_interview
      @hello_interview 4 months ago +2

      So glad you like them!

  • @parashar1505
    @parashar1505 1 day ago

    There are many system design courses - both paid and free - and I have bought and seen many. I have rarely seen someone so organised, so methodical, so all-encompassing in the way they create a flow in the design. This just shows what a great thinker one needs to be to be able to create such a framework and flow. You would make everyone a bit of a better thinker than they are with your videos. Many thanks!

  • @EamonLinskey
    @EamonLinskey 7 months ago +39

    These are the best System Design videos I have found. Great framework for approaching problems, clear explanations, helpful diagrams. And I really appreciate the notes about how different seniority levels might approach specific parts

  • @Wololowizz
    @Wololowizz 4 months ago +7

    I must say that this is the best system design video I've seen so far. You covered the problem and solution step-by-step while other videos just throw a bunch of ideas at you right away. Sometimes I feel overwhelmed watching other videos, thinking that it's impossible to know all of that, but watching this video we can know what the expectation is for each level and the most important thought: you don't need to know everything. And that's gold

    • @hello_interview
      @hello_interview 4 months ago +1

      Glad you liked it! Check out our others if you haven’t already. Same format :)

  • @andjelaarsic9217
    @andjelaarsic9217 6 months ago +12

    My mind is absolutely blown by how beautifully everything is explained. I love how you understand what would be possible questions/confusions from people watching and you address them by explaining pros and cons.
    Thank you so much for the content! Your walkthroughs are by far the most useful and interesting.

    • @hello_interview
      @hello_interview 6 months ago +1

      High praise! Appreciate you taking the time to share this 😊

  • @GauravGupta-op8ol
    @GauravGupta-op8ol 8 months ago +18

    With my systems design interview coming up, I was looking forward to your video.
    It's great as always.

  • @madhurnsit
    @madhurnsit 4 months ago +3

    This is the best content I have come across on System Design interviews. Wish I had landed here sooner. Thank you so much!

  • @levimatheri7682
    @levimatheri7682 6 months ago +2

    Wow, by far the best system design videos anywhere. I love how simple you make it, and the invaluable tips!

  • @md_dm490
    @md_dm490 8 months ago +5

    This channel has the best system design content on youtube. Keep up the good work.

  • @Gamble396
    @Gamble396 2 months ago

    One of the best System Design channels. Please keep uploading.

  • @jagrit07
    @jagrit07 2 months ago +1

    Watched 20 minutes of the video so far, and this is the 3rd resource I am watching regarding Dropbox design. I have read Alex's book, read the Grokking book, and now I'm watching this just for fun, and I think Evan King is actually the King lol. Amazing video, please keep on adding more content.
    Yesterday, I commented on Tinder's Design video and now here. I think I might have to comment on all the videos once I watch those because this is really good stuff and we viewers should appreciate it and hence I will keep adding comments lol :D

  • @JShaker
    @JShaker 5 months ago +1

    I'm so grateful for all of your videos. I've been practicing using the Hello Interview AI interviews, booked one mock with one of your interviewers, watched all the videos.
    The quality is so far beyond any other content out there, and I've successfully passed 5 system design interviews. Keep up the good content, your YouTube channel deserves to blow up and your website too #wouldinvest

  • @alexandergordon9286
    @alexandergordon9286 7 months ago +2

    It's pure gold! Especially the parts where you stop the debates about what DB to choose or whether the calculations are needed.
    The deep dives are the best part. No one goes that deep, and that's actually what matters in an interview

  • @Jadeish01
    @Jadeish01 3 days ago

    Thank you for breaking it down so elegantly, this was super helpful

  • @lorddel
    @lorddel 8 months ago +17

    One more comment on this: compared to the written content on hellointerview, this one seems more rounded and well thought out (mainly regarding using S3 notifications on chunk upload completion, which won't work). Would be cool to see it reflected there on the platform! Good job

    • @hello_interview
      @hello_interview 8 months ago +5

      Good feedback! I'll try to get that updated, particularly by adding sync which I just last minute decided to throw into the video.

  • @batusun717
    @batusun717 3 months ago

    Please upload more stuff like this. This is literally the BEST on YouTube. Very much appreciate all the great efforts!

  • @tushargoyal554
    @tushargoyal554 3 months ago +1

    This is the best channel for learning system design. I've gone through a lot of explanations but found them discussing things in isolation, making it very hard to connect them into a full picture. The popular system design interview book also doesn't help much due to very discrete and sometimes inconsistent sharing of knowledge.

  • @cidwiththreeeyes
    @cidwiththreeeyes 2 months ago

    Thank you for another great video! Honestly, I don’t have any constructive criticism, it’s pretty much a perfect format for these videos-practical, concise, insightful. Other creators’ videos like this are good, but they feel like they’re just going through memorized recipes. Your videos are actually teaching system design theory. Really hope you have more of these as I make my way through your catalog.

  • @AncientArtist7
    @AncientArtist7 2 months ago

    Your content is great and really easy to follow through each step of the process. Please continue to make more system design videos. It is extremely helpful !

  • @anuragtiwari3032
    @anuragtiwari3032 7 months ago +1

    I don't comment much, but for this kind of explanation I gotta give it to you. Hands down the best explanation on YouTube. Please continue making these kinds of videos. This channel will blow up

  • @prasidmitra6859
    @prasidmitra6859 6 months ago +1

    These are like a gift from God. The best SD resources I've found in the last 3 years.

  • @yourssachin
    @yourssachin 8 months ago +8

    Love the content and explanation. I have watched hundreds of videos on system design over the last 4-5 years and also have paid subscriptions to a few. I don't have any doubt that your channel can become the premier system design platform in no time if you keep the content quality high (just like the last 3 videos).
    For the next video, I'd recommend covering a messaging platform like WhatsApp or FB Messenger. There are so many videos on this topic, but I didn't find any that explain the details and really help in the interview.

  • @chongxiaocao5737
    @chongxiaocao5737 7 months ago +1

    One of the best system design preparation videos I have seen online.

  • @yankomirov4290
    @yankomirov4290 2 months ago

    You added a systematic (pardon the pun) approach to such an open-ended kind of interview. This was a game changer for me! I really appreciate it. I went ahead and bought the Guided Practice, which is also amazing and is my main practice tool. Thank you so much!

  • @adeeshacharya7520
    @adeeshacharya7520 7 months ago +1

    This is really good. Irrespective of whether we are interviewing or not, any person looking at this level of explanation and detail would try to picture software differently. Thanks for making such videos, would love to see some more

  • @mehdisaffar
    @mehdisaffar 6 months ago +1

    I love the content. It has been frustrating to watch some other system design videos where they just brush over important details and act like everything is straightforward and easy, and just make tens of services and never really explain the nitty-gritty details of how those things would work and IF they would actually work/be efficient etc. Thank you!

    • @mehdisaffar
      @mehdisaffar 6 months ago

      I wish you had mentioned the challenges of 2-way syncing in this context. Because this is akin to master-master replication, in case of network partition (for example user makes changes to remote, hops on another offline device, makes changes, then comes back online) there is a chance of inconsistencies (user makes different changes on device 1 vs device 2). There would probably need to be a way to offer merging changes together or have the user choose between version 1 or 2.

    • @mehdisaffar
      @mehdisaffar 6 months ago

      I think I talked too fast! You did mention reconciliation

  • @anmolgangwal9236
    @anmolgangwal9236 2 months ago

    bro we are ready to pay just enable the join icon in your channel, this content is too good to be free

  • @OneSanddman
    @OneSanddman 17 days ago

    I really love your video series. Just a slight problem to point out here. 50gbs uploaded with 100 mbs should take less than 10 minutes, not an hour 12 minutes.

  • @noobu
    @noobu 8 months ago +1

    Great stuff again!
    Not only good for interviews but also for daily work
    1) Clear and concise structure
    2) Weighs trade-offs rigorously and explains the final decision clearly. Every single component is well thought out with real-world considerations

  • @pradeepbhat1363
    @pradeepbhat1363 1 day ago

    Great video, man! Very useful for preparing for a system design interview.

  • @JyotiKundani05
    @JyotiKundani05 1 month ago

    This video was really helpful. Amazing work of putting this together and your explanation was on point. Much appreciated!

  • @crackITTechieTalks
    @crackITTechieTalks 6 months ago +1

    This is the best system design video I have watched!! Especially the deep dives, you nailed it!! Looking forward to watching your videos.

  • @pragatimodi950
    @pragatimodi950 6 months ago

    Hi Evan, this is my first time giving system design interviews. Really glad I found this channel to learn from. Most of my prior feedback from mocks and system design interviews has been framework related, about how I explain my design. This really helps with that and I think even at work, this is a really good approach to follow for most things. Awesome content, thanks a lot!!!

  • @aldogutierrezalcala3047
    @aldogutierrezalcala3047 4 months ago +1

    Bro, again me, just had a system design interview using your framework, still don't have the result but definitely this framework is basically pure gold to lead a conversation that i would keep using even in a daily job.

    • @hello_interview
      @hello_interview 4 months ago +1

      Hell yes!! So glad it went well 💪

    • @Ptbcpr
      @Ptbcpr 1 month ago

      did you end up getting the job?

  • @allenputich4192
    @allenputich4192 4 months ago

    You do an amazing job of explaining the thought process, technical details, and growth opportunities!

  • @DMA-I
    @DMA-I 1 month ago

    I believe there is a slight flaw in the sync-files-from-remote-server feature (24:16). I believe we need to keep a record in the DB of what each client device has synced so far (up to which updated time or which version); otherwise get-changes will loop endlessly (getchange will always return files that need to be updated, even though they might just have been updated)
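
A minimal sketch of the per-device record this comment asks for, assuming the server keeps an append-only change log and each device remembers the last sequence number it applied. `ChangeLog`, `Device`, and the `cursor` field are hypothetical names for illustration, not part of the design shown in the video; the cursor could equally be stored server-side per device.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeLog:
    """Append-only list of (sequence_number, file_id) entries (hypothetical server state)."""
    entries: list[tuple[int, str]] = field(default_factory=list)

    def append(self, file_id: str) -> int:
        seq = len(self.entries) + 1
        self.entries.append((seq, file_id))
        return seq

    def changes_since(self, cursor: int) -> list[tuple[int, str]]:
        # Only entries the caller has not seen yet are returned.
        return [entry for entry in self.entries if entry[0] > cursor]


@dataclass
class Device:
    name: str
    cursor: int = 0  # last sequence number this device has applied

    def sync(self, log: ChangeLog) -> list[str]:
        pending = log.changes_since(self.cursor)
        if pending:
            # Advance the cursor so the same changes are not fetched again.
            self.cursor = pending[-1][0]
        return [file_id for _, file_id in pending]


log = ChangeLog()
log.append("a.txt")
log.append("b.txt")

laptop = Device("laptop")
first = laptop.sync(log)    # both files are new to this device
second = laptop.sync(log)   # nothing new, so nothing is returned
log.append("a.txt")
third = laptop.sync(log)    # only the latest change is returned
print(first, second, third)
```

Because the cursor advances after every successful sync, a second call with no new changes returns an empty list instead of re-fetching files that were just updated.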

  • @ashutoshrana9998
    @ashutoshrana9998 7 months ago

    Will be the best system design interview channel for sure. Neat content. Keep up with the quality Man!

  • @EngineeringBootCamp
    @EngineeringBootCamp 11 days ago

    Another great video. Some questions that came to mind after watching this video are: 1) How does local chunking work? Do I literally break the files into parts, keep them in some other system or temp folder, and upload the files from there? 2) After I have uploaded the file, do I get rid of the chunks? 3) If we had a delta change in a remote file, you talked about comparing the fingerprints of all chunks with the local ones to download only the ones that changed, implying we still keep these chunks locally somewhere? And even if I downloaded a modified chunk, how do I go ahead and stitch the chunks together to create the unified file in the main folder? [A little more clarity on those questions would be really beneficial.]

    • @TechieTech-gx2kd
      @TechieTech-gx2kd 9 days ago +2

      1. Chunking is a virtual concept rather than a physical one. The files are still stored as bits in physical storage, but in the database Dropbox maintains a table on the client side, known as chunks, which keeps the byte ranges of the physical file that represent each chunk.
      Here is the schema for the chunks table:
      Column Name | Data Type          | Description
      chunk_hash  | TEXT (Primary Key) | The unique hash of the chunk (e.g., SHA-256).
      ref_count   | INTEGER            | Number of files referencing this chunk.
      file_path   | TEXT               | File path where this chunk resides.
      start_byte  | INTEGER            | Start byte position of the chunk in the file.
      end_byte    | INTEGER            | End byte position of the chunk in the file.
      Similarly, Dropbox has a file table, which tracks metadata about files, including their chunk composition:
      Column Name  | Data Type          | Description
      file_id      | TEXT (Primary Key) | A unique identifier for the file (e.g., UUID).
      file_name    | TEXT               | The name of the file.
      file_path    | TEXT               | Full path to the file on the local disk.
      chunk_hashes | TEXT               | Comma-separated list of chunk hashes in order.
      Now when you add a new file, in the application layer you create chunks and calculate the hash of each of them, then try to commit those chunks in the Dropbox metaService. The metadata service will inform you if a chunk is already available and won't ask you to upload it to the BlobService.
      2. As there are no physical chunks, there is no need to get rid of chunks. On local storage we always deal with files and not chunks.
      3. No, you are not keeping any chunks; instead you deal with hashes (chunk hashes, to be precise). As soon as you receive a notification that there is a remote change, you ask about the chunks and their hashes.
      To dive a little deeper, the MetaService maintains the Server_file_journal, which keeps append-only logs for each namespace and lets you know, for a particular namespace, which changes are available on the server. You download only those chunks which you don't have locally, based on their hashes.
      Once you have the chunks available, you directly replace the bytes of the modified file on disk without needing to re-create the file, so you are dealing with bits here via start and end offsets.
      Do let me know if you need more detail

    • @VarunVermaUSC
      @VarunVermaUSC 9 days ago

      @@TechieTech-gx2kd Thank you so much, for taking the time out and sharing those details!

    • @pradeepbhat1363
      @pradeepbhat1363 1 day ago

      @@TechieTech-gx2kd Thanks for the details. So, if a new byte is added to the beginning of the file, the fingerprints will change for all the chunks and will it trigger a full file upload ?
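
The question above can be tested directly. The sketch below assumes fixed-size chunks fingerprinted with SHA-256 (the chunk size and sample data are arbitrary) and compares an in-place edit with a one-byte insert at the front of the file. `chunk_fingerprints` and `changed_chunks` are hypothetical helper names.

```python
import hashlib


def chunk_fingerprints(data: bytes, chunk_size: int) -> list[tuple[int, int, str]]:
    """Return (start_byte, end_byte, sha256_hex) for each fixed-size chunk."""
    result = []
    for start in range(0, len(data), chunk_size):
        piece = data[start:start + chunk_size]
        result.append((start, start + len(piece) - 1, hashlib.sha256(piece).hexdigest()))
    return result


def changed_chunks(old: list[tuple[int, int, str]], new: list[tuple[int, int, str]]) -> int:
    """Count chunks in `new` whose hash differs from the chunk at the same index in `old`."""
    old_hashes = [h for _, _, h in old]
    return sum(
        1
        for i, (_, _, h) in enumerate(new)
        if i >= len(old_hashes) or old_hashes[i] != h
    )


CHUNK_SIZE = 256  # arbitrary small size for the demo

# Deterministic sample data whose four chunks all differ from one another.
original = bytes((i * 7 + i // 256) % 256 for i in range(1024))

edited = bytearray(original)
edited[300] ^= 0xFF                 # flip one byte inside the second chunk
prepended = b"\x00" + original      # insert one byte at the very beginning

base = chunk_fingerprints(original, CHUNK_SIZE)
in_place_changes = changed_chunks(base, chunk_fingerprints(bytes(edited), CHUNK_SIZE))
prepend_changes = changed_chunks(base, chunk_fingerprints(prepended, CHUNK_SIZE))
print(in_place_changes, prepend_changes)
```

With fixed-size boundaries, the in-place edit changes one fingerprint, while the prepend shifts every boundary and changes all of them. Content-defined chunking (boundaries picked by a rolling hash) is one common way to limit that effect; whether Dropbox or the video's design uses it is not stated in this thread.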

  • @krishnabirla16
    @krishnabirla16 3 months ago +2

    You did not talk about version inconsistency? If two clients keep changing their local folders, they will be in a loop of pushing their own sync and pulling the other client's sync. There has to be a timestamp/version based conflict resolution. Maybe a follow up please?

  • @VahidOnTheMove
    @VahidOnTheMove 6 months ago +1

    Thanks for the videos. 47:45 I would like to know your opinion on push approach? By push approach I meant when the File service knows there is a change in a chunk, Sync service will let the client know. And, then the client will send a request to sync/download the chunk.

  • @indreshgahoi7103
    @indreshgahoi7103 8 months ago +2

    Hey Evan, thank you so much for providing the great content. I really love the way you organize and put content across the board. ❤

  • @galashrenik3404
    @galashrenik3404 3 months ago

    One suggestion I have is that when designing APIs, your videos often highlight the importance of handling partial data, which is typically expected of senior or staff engineers. In my view, API versioning carries a similar level of significance.

  • @Marcus-yc3ib
    @Marcus-yc3ib 2 months ago

    Please keep uploading these kinds of videos. Thank you very much.

  • @3rd_iimpact
    @3rd_iimpact 8 months ago +3

    I just finished reading the article on this lol. I’ll check out the video as well.

  • @god_of_blunder
    @god_of_blunder 4 months ago

    these are the best Design videos i ever found, Thanks and Kudos.

  • @groovymidnight
    @groovymidnight 7 months ago

    I really like the 5-step structure, it's the best I've seen and it effectively helps me think through the designs in a methodical way.

    • @hello_interview
      @hello_interview 7 months ago

      Right on! So glad it’s useful

  • @smalladi78
    @smalladi78 6 months ago

    Thanks for posting these! Great interview as always! I am learning a lot from these interviews.
    I found it interesting that you jumped ahead in order for the non-functional requirements since you knew the large file upload requirement would impact the design enough that doing the other ones first was not beneficial since they would become irrelevant. Obviously, this comes with actual experience of working on the job.
    May I suggest doing a follow up that uses the final design from this interview and consider how it may change if you piled on a more advanced feature like syncing only a partial set of folders or sharing folders with other people.

  • @viveksharma-tt5nj
    @viveksharma-tt5nj 1 month ago

    Simply amazing !!
    Thanks a lot for such clear and concise explanation !

  • @satyajeetkumar2588
    @satyajeetkumar2588 3 months ago

    Awesome, so simple and elegant.
    It would have been great if you had mentioned checksums for maintaining data integrity, since you listed that in the non-functional requirements; just a mention, not the actual implementation.

  • @adityaagarwal5348
    @adityaagarwal5348 2 months ago

    At 50:08, the delta sync approach might work for downloading an updated chunk from S3 using a byte-range request and then updating the file on the local system, but it won't work the other way around, specifically because of S3. S3 objects are immutable, so there will never be a case where a chunk is updated in place. So if this question comes up in the interview, should we just mention that we won't sync files > some GBs, or should we further divide the storage into blob and file-system (S3 and EFS) based on file size and handle the complexity on the server?

  • @jimitshah7636
    @jimitshah7636 7 months ago

    Great video for system design preparation.
    Methodology, the way he approached the question was good. 5 steps. Pretty good

  • @ahmedkhan25
    @ahmedkhan25 6 months ago

    Excellent sys design interviews - I like the informative tone and clear approach - thanks

  • @venkatamunnangi1287
    @venkatamunnangi1287 8 months ago +3

    Thanks for the effort and videos. Easily one of the best in business for mocks and educational material.

  • @AlbaraaAlHiyari
    @AlbaraaAlHiyari 7 months ago

    I truly appreciate all the effort you've put into making these amazing videos. Please keep them coming. One insignificant (not important) nitpick. 50 GB @ 100Mbps = ~ 1hr 7min. I think you just forgot to convert the decimal to minutes. You have it correct in the write up, as in 1.11 hours (0.11 * 60 = 6.6 minutes).

    • @hello_interview
      @hello_interview 7 months ago +2

      Mental math is hard 😛

    • @AlbaraaAlHiyari
      @AlbaraaAlHiyari 7 months ago

      @@hello_interview tell me about it... Also not fun under the pressure of an interview 🤣
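
The arithmetic in this thread can be checked with a few lines. This is a sketch that assumes decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits per second) and ignores protocol overhead; `transfer_seconds` is a hypothetical helper name. It also shows how much the answer changes if the rate is read as megabytes per second instead of megabits per second.

```python
def transfer_seconds(size_bytes: int, rate_bits_per_second: int) -> float:
    """Seconds to move size_bytes over a link of the given bit rate (no protocol overhead)."""
    return size_bytes * 8 / rate_bits_per_second


GB = 10**9    # decimal gigabyte, in bytes (assumption)
MBPS = 10**6  # one megabit per second, in bits per second

seconds = transfer_seconds(50 * GB, 100 * MBPS)
whole_hours, remainder = divmod(round(seconds), 3600)
minutes = round(remainder / 60)
print(f"100 Mbps: {seconds:.0f} s, about {whole_hours} h {minutes} min")

# Reading the rate as 100 megabytes per second gives a link eight times faster.
fast_seconds = transfer_seconds(50 * GB, 100 * 8 * MBPS)
print(f"100 MB/s: {fast_seconds:.0f} s, about {fast_seconds / 60:.1f} min")
```

At 100 megabits per second the upload takes 4000 seconds, which is the 1.11 hours (about 1 h 7 min) quoted above; at 100 megabytes per second it would take 500 seconds.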

  • @phavelar
    @phavelar 7 months ago

    one can argue that "supporting 50gb upload file size" is a functional requirement (you placed it under non-functional requirement) - just a call out. great video!

  • @vaibhavsharma1653
    @vaibhavsharma1653 6 months ago

    Amazing.
    Some Notes:
    DeepDive:
    Chunking
    CDNs
    Adaptive Polling with only updated chunks
    Compression.

  • @jherreria
    @jherreria 6 months ago

    I really appreciate your help in this topic. I'm learning a lot! Keep the videos coming!

  • @evangeloskostopoulos8173
    @evangeloskostopoulos8173 8 months ago +2

    This is really awesome, thank you. Please keep them coming!

  • @jeremyklein953
    @jeremyklein953 7 months ago

    Really good approach. I love how you build up to the full solution. It makes a lot of sense to me and helps me reason these complex systems as well

  • @vijaykhurana8766
    @vijaykhurana8766 8 months ago +1

    Great content. Thank you for posting. One of the best system design videos I have come across for this design.

  • @kojcelkelesh
    @kojcelkelesh 1 month ago

    Very good video, very straight to the point!

  • @MrSnackysmorez
    @MrSnackysmorez 4 months ago

    I love the videos and these are some of the best explanations. I love the flow and how everything builds on each other. It makes it much more manageable to do these problems. However you are driving and dictating this and this is so much harder to do when the interviewer wants to constantly interrupt and ask questions while you are doing these steps without first letting you explain what you are doing. I have this happen pretty often. How can you tell them to just chill and let you proceed?
    Appreciate these videos!

  • @guitarMartial
    @guitarMartial 1 month ago +1

    49:09 - time is a weird commodity in distributed systems with clock drift et al
    wouldn't vector clocks be a better solution instead? this way we can detect write conflicts pretty well too

    • @hello_interview
      @hello_interview 1 month ago

      Yes :)

    • @guitarMartial
      @guitarMartial 1 month ago

      @@hello_interview Come to think of it - maybe even a Merkle tree here might be powerful.
      You are storing all the hashes already just build a local merkle tree and use anti-entropy to figure out delta periodically.
      Really wild thought - merkle tree + version vectors.
      One helps quickly figure out anti entropy as we can compare hashes the other helps with write conflict detection. Couple this with Kafka as you showed and you have a pretty amazing scaling solution.

    • @guitarMartial
      @guitarMartial 1 month ago

      55:31 - Merkle trees et al are giving me flashbacks to Torrenting days. Indeed the files were broken up in different chunks whose shas were used to perform comparisons for the sake of completion.
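
A minimal sketch of the version-vector comparison suggested in this thread, assuming each replica of a file carries a map from device name to an edit counter. The device names and the `compare` helper are hypothetical; the video's design uses timestamps, so this is an alternative, not a description of it.

```python
def compare(a: dict[str, int], b: dict[str, int]) -> str:
    """Classify two version vectors as equal, ordered, or concurrent (a write conflict)."""
    devices = a.keys() | b.keys()
    a_ahead = any(a.get(d, 0) > b.get(d, 0) for d in devices)
    b_ahead = any(b.get(d, 0) > a.get(d, 0) for d in devices)
    if a_ahead and b_ahead:
        return "concurrent"  # each side has edits the other has not seen
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


# The laptop edited once more after both devices were in sync: no conflict.
ordered = compare({"laptop": 2, "phone": 1}, {"laptop": 1, "phone": 1})

# Both devices edited while offline: neither version contains the other.
conflict = compare({"laptop": 2, "phone": 1}, {"laptop": 1, "phone": 2})

print(ordered, conflict)
```

Unlike wall-clock timestamps, the comparison does not depend on device clocks agreeing, which is the clock-drift concern raised at 49:09.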

  • @KITTU1623
    @KITTU1623 8 months ago +2

    Thank you very much for the videos. One small nitpick: DynamoDB supports a maximum of 400 KB per item, and if we are storing all the chunk metadata in the item, then for a 50 GB file with a 5 MB chunk size, assuming we need 100 bytes per chunk of metadata, our item size would be around 1 MB.
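
The estimate in this comment works out as follows. The sketch uses binary units and takes the 100 bytes of metadata per chunk from the comment itself as an assumption; the constant names are made up for illustration.

```python
FILE_SIZE = 50 * 1024**3           # 50 GB file, in bytes
CHUNK_SIZE = 5 * 1024**2           # 5 MB chunks, in bytes
METADATA_PER_CHUNK = 100           # bytes per chunk entry (the commenter's assumption)
ITEM_LIMIT = 400 * 1024            # DynamoDB's 400 KB maximum item size, in bytes

chunk_count = -(-FILE_SIZE // CHUNK_SIZE)          # ceiling division
metadata_bytes = chunk_count * METADATA_PER_CHUNK
max_chunks_per_item = ITEM_LIMIT // METADATA_PER_CHUNK

print(f"{chunk_count} chunks need {metadata_bytes} bytes of metadata")
print(f"one item fits at most {max_chunks_per_item} chunk entries")
print("fits in one item:", metadata_bytes <= ITEM_LIMIT)
```

Under these assumptions the chunk list is roughly 2.5 times the item limit, so it would have to be split across several items or stored elsewhere.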

  • @surojitsantra7627
    @surojitsantra7627 7 months ago

    One of the best and most detailed explanations.
    Thank you so much for this content. Please upload more such videos.

  • @GabrielAnyaele
    @GabrielAnyaele 17 days ago

    I really love your videos. I have a question though: are the chunk IDs constant (most likely so)? You mentioned that the chunk IDs are a hash of the bytes of the chunks. What happens when the chunks are updated? Do we still keep the initial IDs?
    You put out amazing content, I appreciate it once again

  • @BlunderMunchkin
    @BlunderMunchkin 5 months ago +1

    Huh. I would have prioritized consistency over availability. So much so, in fact, that I didn't even think it was a question. Some of the biggest headaches I've experienced as a developer have been caused by having an out-of-date file. I would much rather be temporarily unable to retrieve a file than to be fooled into thinking that the file I retrieved is the correct version.

  • @suri4Musiq
    @suri4Musiq 8 months ago +1

    Loved this resource, thank you so much! But I just wanted to point out that in my interview I was asked about sharing files with other users, and I feel like this design concentrated more on just syncing files across multiple devices. For the former, I think we can talk a little more about CDN/other approaches which were hand-waved here.

    • @hello_interview
      @hello_interview 8 months ago +3

      Check out the write-up I linked! I go into sharing there.

  • @bit_ty4-w4p
    @bit_ty4-w4p 5 months ago +1

    Hey Stefan, awesome video, congrats! I've got a quick question though. Around the 49:46 mark, you mention adding an "updatedAt" to a chunk at a specific id/fingerprint. If a chunk changes, its fingerprint/hash/checksum would change too, right? So that id wouldn't really match the changed chunk anymore, would it? Doesn't that mean the old chunk gets "invalidated" and a new chunk id appears? Sorry if I'm missing something obvious here.

    • @hello_interview
      @hello_interview 5 months ago +1

      No this is spot on, good call out. I was loose here. If the fingerprint is the ID, then an updatedAt does not make sense. If the fingerprint is not the ID, then it of course does. Trade off here of whether you want to keep old chunks around for versioning.
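
A minimal sketch of the "fingerprint is the ID" case discussed in this thread, assuming chunks are stored immutably under their SHA-256 hash and a file is an ordered list of chunk IDs. `ChunkStore` is a hypothetical in-memory stand-in for a blob store, not an API from the video.

```python
import hashlib


class ChunkStore:
    """In-memory stand-in for an immutable blob store keyed by content hash."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        chunk_id = hashlib.sha256(data).hexdigest()
        # An existing key is never overwritten: same ID always means same bytes.
        self.blobs.setdefault(chunk_id, data)
        return chunk_id


store = ChunkStore()

# Version 1 of a file is an ordered list of chunk IDs.
manifest_v1 = [store.put(b"hello "), store.put(b"world")]

# "Updating" the second chunk stores new bytes under a new ID; the old chunk is untouched.
manifest_v2 = [manifest_v1[0], store.put(b"there")]

print(manifest_v1[1] != manifest_v2[1], len(store.blobs))
```

In this model the change shows up as a new manifest rather than as an `updatedAt` on a chunk, and the old chunk stays available for versioning until something garbage-collects it.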

  • @nobodyknows228
    @nobodyknows228 6 months ago

    1. How can we handle write conflicts when we have a folder which is supposed to be consistent across multiple devices?
    2. Also, when two devices are disconnected from the internet and users update some files, how does the sync happen when they come back online and both try to write changes to the same file path at the same time?
    I am not sure if these solutions work, but I think:
    1. We can use a Redis lock for writes with a TTL equal to, or a little longer than, the timeout of the pre-signed URL. If the connection fails in between, we can just resume the upload when connected again. But this might be a problem when a user is uploading big files with long timeouts, since other users might have to wait until the current upload is done.
    2. When the user comes back online we should probably first fetch all the changes that were made on the device, raise conflicts with the user asking what action to perform (similar to git), and acquire a lock to write if required.

  • @jmms49
    @jmms49 8 months ago

    Great videos, thanks for uploading these. Easily the best content about system design interviews I've found.
    I would probably suggest using Merkle trees for the sync functionality; it seems like a natural way to diff and sync large file systems
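
A minimal sketch of the Merkle-tree diff this comment suggests, assuming both sides hold the same number of chunks and build the tree the same way. `build` and `diff` are hypothetical helpers; an odd-sized level is padded by repeating its last hash, which is one convention among several.

```python
import hashlib


def sha(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build(chunks: list[bytes]) -> list[list[bytes]]:
    """Return the tree as a list of levels: leaf hashes first, root hash last."""
    level = [sha(chunk) for chunk in chunks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # pad odd levels by repeating the last hash
        level = [sha(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def diff(a: list[list[bytes]], b: list[list[bytes]]) -> list[int]:
    """Indices of differing leaves, descending only into subtrees whose hashes differ."""
    if a[-1] == b[-1]:
        return []  # equal roots: nothing to compare below
    suspects = [0]
    for depth in range(len(a) - 2, -1, -1):
        next_suspects = []
        for parent in suspects:
            for child in (2 * parent, 2 * parent + 1):
                if child < len(a[depth]) and a[depth][child] != b[depth][child]:
                    next_suspects.append(child)
        suspects = next_suspects
    return suspects


local = build([b"chunk-0", b"chunk-1", b"chunk-2", b"chunk-3"])
remote = build([b"chunk-0", b"chunk-1", b"chunk-2 edited", b"chunk-3"])

print(diff(local, remote))
```

When the roots match, the two sides agree after a single comparison; when they differ, only the mismatching branches are walked, which is what makes the structure attractive for reconciling large folders.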

  • @VyasaVaniGranth
    @VyasaVaniGranth 6 months ago

    First - please continue making and sharing these videos, this is incredible. Very few high quality sources available out there and this is probably the best one in my eyes.
    Second - how realistic is it that the download and upload happen directly b/w client and S3?
    Are there security concerns with this approach that should be considered? For reference, there's a Dropbox engineer's talk where uploads go through an intermediate service - this does mean additional copies of the data meaning more memory / compute but seems more realistic.
    In general, for any design that has media upload (eg. newsfeed), would you recommend direct upload to S3?

    • @hello_interview
      @hello_interview 5 months ago

      Yeah, it's a good point; most major systems don't do this for a number of reasons. While it is largely academically correct and optimal, at YouTube/Dropbox/etc. scale they prefer more control, so they're rolling their own systems here.

  • @dark-knight494
    @dark-knight494 7 months ago

    Big fan of this channel and Evan. Please solve whatsapp/messenger type chat system next if you get some time.

  • @TatianaRacheva
    @TatianaRacheva 4 months ago

    IIRC, low latency was specifically low priority for Dropbox because they (like email) rely on the client syncing the data and user accessing the local copy when it is ready. Also, I question whether consistency is less important than availability. I don’t know, but I’m curious how the answer would be different if latency could be high and consistency had to be strong.

  • @ramannanda
    @ramannanda 6 months ago

    For the delta sync bit, probably should go a bit deeper into rechunking for an existing file, to perform the delta sync.

  • @IshaZaka
    @IshaZaka 8 months ago

    Hi Evan, thank you so much for providing this type of content. Please make a system design video on a payment system.

  • @pujamishra1475
    @pujamishra1475 8 months ago

    I have a product architecture interview coming up. I was really looking for some good product architecture/design examples and then came across this. This is very helpful because you talk about the client, user experience, malicious users and relate it to the design decisions made. Thank you!
    One question: for a product architecture interview, should we go into more detail about the APIs, like explicitly writing out requests, responses, and failure/success codes, or is the amount of discussion you did on APIs enough for senior level?
    Can you also tell me what topics/points you would add to the discussion in this video if this was asked in a product architecture design round? Thanks again!

  • @danielkling4647
    @danielkling4647 3 months ago

    First I would like to say that this content is excellent. Why though would you implement chunking yourself instead of using S3's multipart upload?

  • @fragrancias972
    @fragrancias972 3 months ago

    Excellent content. Please tell me if I’m mistaken, but I believe GET /files/:fileid would return a list of chunk S3 links, not the file itself.
    Also, I don’t think merely filtering chunks by update time would work for syncing. You would need a tombstone for when chunks are removed. You didn’t quite specify how “polling the DB” / update-time filtering works with delta sync.
    Merkle trees could be used to optimize the reconciliation you mentioned, right?
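
To illustrate the tombstone point above: a plain "updated since T" filter never reports a removed chunk, because the row is simply gone. A minimal Python sketch, assuming a change feed where deletions are kept as marker records (`ChangeRecord` and the integer logical timestamps are assumptions for the example):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRecord:
    chunk_id: str
    updated_at: int        # logical timestamp, assumed for the example
    deleted: bool = False  # True marks a tombstone


def changes_since(records: list[ChangeRecord], cursor: int) -> list[ChangeRecord]:
    """Everything that changed after the client's last sync point, oldest first."""
    newer = [r for r in records if r.updated_at > cursor]
    return sorted(newer, key=lambda r: r.updated_at)


def apply_changes(local_chunks: set[str], changes: list[ChangeRecord]) -> set[str]:
    """Replay the changes on the client's view of which chunks exist."""
    result = set(local_chunks)
    for change in changes:
        if change.deleted:
            result.discard(change.chunk_id)  # tombstone: remove locally
        else:
            result.add(change.chunk_id)
    return result
```

Tombstones can be garbage-collected once every client is known to have synced past them; how to track that is outside this sketch.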

  • @59sharmanalin
    @59sharmanalin 2 months ago +1

    We didn't outline the file sharing feature. Is that because of time constraints?

    • @hello_interview
      @hello_interview  2 months ago

      Went with syncing in the video instead since people asked for that in the comments

  • @mindrust203
    @mindrust203 8 months ago +2

    Hey Evan, this content is fantastic, thank you!
    I have a question regarding your solution to chunking around the 39 minute mark
    When we ask S3 to fetch us a pre-signed URL, do we do that for all our chunks as well? Does this happen on initial request to upload the file (metadata)?
    The way the File Metadata entity schema is described, it looks like we have a top-level S3Link, but also chunk-level S3 links embedded in the file metadata, so the upload flow is a little unclear to me

    • @hello_interview
      @hello_interview  8 months ago +6

      Good question, you're right to be a little confused here. As I alluded to, S3 offers an API called multipart upload. It requires just 1 presigned URL, but multipart upload re-stitches the chunks back into a single file in S3, so it does not allow us to send over chunk deltas for syncing.
      As a result, we have to upload the chunks manually without relying on multipart upload. So, long answer, but yes, you'd actually need to request a presigned URL for each chunk. I should have made that clearer, but tbh I was not sure in the moment whether multipart upload could be configured to not re-stitch the file, so I omitted it :)
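
A rough sketch of the "one presigned URL per chunk" flow described above, in Python. The `presign` callback, the `chunks/<fingerprint>` key layout, and `fake_presign` are all hypothetical stand-ins; in a real backend the callback would wrap whatever signing call the storage SDK provides (for S3 with boto3, `generate_presigned_url` is the usual one).

```python
import hashlib
from typing import Callable


def plan_chunk_uploads(
    data: bytes,
    chunk_size: int,
    presign: Callable[[str], str],
) -> list[dict]:
    """Build one upload instruction per chunk: its index, fingerprint, and a URL to PUT it to."""
    plan = []
    for index, start in enumerate(range(0, len(data), chunk_size)):
        fingerprint = hashlib.sha256(data[start:start + chunk_size]).hexdigest()
        plan.append(
            {
                "index": index,
                "fingerprint": fingerprint,
                # Hypothetical key layout: each chunk is stored under its fingerprint.
                "url": presign(f"chunks/{fingerprint}"),
            }
        )
    return plan


def fake_presign(key: str) -> str:
    """Stand-in for a real signing call; returns a placeholder URL."""
    return f"https://example.invalid/{key}?signature=placeholder"
```

Because each chunk lands as its own object, a later sync only needs URLs for the chunks whose fingerprints changed.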

  • @B-Billy
    @B-Billy 2 months ago

    Pure gold content!!! Thank you so much.

  • @theoshow5426
    @theoshow5426 5 months ago

    Keep going man! This is great!

  • @adityaagarwal5348
    @adityaagarwal5348 2 months ago

    At 27:24: to determine which files are already available on the local system, can we store a client-to-files mapping on the server, keyed by client ID, and have the getChanges API use that data plus the file metadata to calculate which files need to be transferred to the client? I know there can be issues when there is a sync gap between local and remote, e.g. a file is deleted locally, but the system is eventually consistent anyway. Keeping lots of data on the client will grow the app size.

    • @TechieTech-gx2kd
      @TechieTech-gx2kd 9 days ago

      What Dropbox implements is pretty amazing: it maintains a server_file_journal, an append-only log per namespace_id, which keeps storing any changes made to a particular file. Imagine a text file you do CRUD on; all of those operations are stored in that server_file_journal.
      The client simply asks, for this nsId, what is the latest after a specific checkpoint, which is a pointer named journalId (each client maintains one for its namespace). When it asks what happened after this journal ID, the server returns the chunk details (probably a different hash) and the client simply downloads them.
      "Keeping lots of data on the client will grow the app size." It's not the app size, it's the user data: it's what you want to keep on your machine for quick access while still being able to access it on remote machines. What you are referring to is something different, which iCloud offers: optimizing storage by keeping only a bare-minimum photo/video thumbnail on the iPhone and fetching the high-definition file when the user requests it.
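
The journal-and-cursor idea described in this reply can be sketched in a few lines of Python. This only models the description above (an append-only log per namespace, with clients asking for entries after their last seen journal ID); the class and method names are invented for the example and it makes no claim about how Dropbox actually implements it.

```python
class FileJournal:
    """In-memory, append-only change log kept separately for each namespace."""

    def __init__(self) -> None:
        # namespace_id -> list of (journal_id, path, chunk_hashes)
        self._logs: dict[str, list[tuple[int, str, tuple[str, ...]]]] = {}

    def append(self, namespace_id: str, path: str, chunk_hashes: list[str]) -> int:
        """Record a change and return its journal ID (1, 2, 3, ... per namespace)."""
        log = self._logs.setdefault(namespace_id, [])
        journal_id = len(log) + 1
        log.append((journal_id, path, tuple(chunk_hashes)))
        return journal_id

    def changes_after(self, namespace_id: str, cursor: int):
        """All entries newer than the client's cursor, oldest first."""
        log = self._logs.get(namespace_id, [])
        return [entry for entry in log if entry[0] > cursor]
```

A client would persist the highest journal ID it has applied and send it as the cursor on the next poll, so the server never has to track what each client already holds.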

  • @tvmanikandan835
    @tvmanikandan835 8 months ago

    The content is good, keep up the good work. Looking forward to more SD videos in more detail.

  • @puppy851226
    @puppy851226 3 months ago

    Amazing content! Thank you hello interview!

  • @amitb2921
    @amitb2921 6 months ago +1

    Thanks for the great content, especially the Deep Dive part, which people generally do not discuss.
    I have one question about storing the chunks as a list in the DB. For a 50 GB file and 5 MB chunks, 10K chunks will be created, so the chunk list will have 10K entries. Updating that one chunk-list column on every chunk status change could be quite challenging. Would it be better to have a separate table for chunks instead? Also, when you match chunks by fingerprint, you need to check 10K entries from the local DB (a separate, indexed table) vs. 10K entries in the chunk list (a single table column), where the former is more efficient.
    Kindly let me know your thoughts on the above points.
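
For a feel of the separate-table option raised above, here is a small Python sketch using the standard library's `sqlite3`. The table and column names are assumptions for the example, and the chunk count uses decimal units (50 GB = 50 × 10^9 bytes, 5 MB = 5 × 10^6 bytes), which gives the 10K figure in the comment.

```python
import sqlite3


def chunk_count(file_bytes: int, chunk_bytes: int) -> int:
    """Number of chunks needed, rounding up for a partial last chunk."""
    return -(-file_bytes // chunk_bytes)


def create_schema(conn: sqlite3.Connection) -> None:
    """One row per chunk instead of one list-valued column on the file row."""
    conn.execute(
        """
        CREATE TABLE chunks (
            file_id     TEXT    NOT NULL,
            chunk_index INTEGER NOT NULL,
            fingerprint TEXT    NOT NULL,
            status      TEXT    NOT NULL DEFAULT 'pending',
            PRIMARY KEY (file_id, chunk_index)
        )
        """
    )
    # Index so that fingerprint lookups do not scan every chunk of the file.
    conn.execute("CREATE INDEX idx_chunks_fp ON chunks (file_id, fingerprint)")


def mark_uploaded(conn: sqlite3.Connection, file_id: str, fingerprint: str) -> int:
    """Flip the matching chunk rows to 'uploaded'; returns the number of rows changed."""
    cursor = conn.execute(
        "UPDATE chunks SET status = 'uploaded' WHERE file_id = ? AND fingerprint = ?",
        (file_id, fingerprint),
    )
    return cursor.rowcount


def pending_count(conn: sqlite3.Connection, file_id: str) -> int:
    row = conn.execute(
        "SELECT COUNT(*) FROM chunks WHERE file_id = ? AND status = 'pending'",
        (file_id,),
    ).fetchone()
    return row[0]
```

Each status change becomes a single-row update rather than a rewrite of a 10K-entry list, which is the main benefit the comment is pointing at.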

    • @hello_interview
      @hello_interview  6 months ago

      Sounds reasonable to me! Good call out

    • @amitb2921
      @amitb2921 6 months ago

      @@hello_interview Thanks a ton for the response. I have modified my comment above to be a bit clearer.

  • @Ynno2
    @Ynno2 8 months ago +3

    Do you suggest a different delivery framework for system design interviews which aren't necessarily "product"?

    • @hello_interview
      @hello_interview  8 months ago +1

      Topical! Was chatting about updating the site with that soon. I’d recommend something very similar, but core entities and API are what may change, as they could be less relevant. Instead I’d frame it as focusing on the inputs and outputs of the system more generally, and then still thinking about the data persisted.

    • @hello_interview
      @hello_interview  8 months ago +1

      I’ll do a pure infra question next

  • @ndubuezeprecious391
    @ndubuezeprecious391 2 months ago

    Great stuff. This is the best I’ve seen so far. Can I ask which app you are using for the whiteboarding? It looks really sleek.

  • @deathbombs
    @deathbombs 7 months ago

    45:45 I wonder how syncing would change if instead of folder status, it's for database writes with many writers

  • @bqrkhn
    @bqrkhn 4 months ago +1

    Very nice video.
    A question: you added an updatedAt to each chunk, but chunks are identified by their ID, which is calculated from a fingerprint. When the file changes, the fingerprint changes, so how do we update the updatedAt?
    Possible answer: from the client we send both the old and new chunk IDs and then update both the ID and updatedAt. Is this the correct strategy?

    • @fragrancias972
      @fragrancias972 3 months ago

      Same question here.

    • @bqrkhn
      @bqrkhn 3 months ago

      @@fragrancias972 What do you think about my possible answer?

    • @insofcury
      @insofcury months ago

      @@bqrkhn +1 I think this definitely solves the problem.

  • @deathbombs
    @deathbombs 7 months ago

    Voted on your website for payment system! Banks love these

  • @dashofdope
    @dashofdope months ago

    For the chunking, how many parallel calls would we do? Maybe it doesn't matter?
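
One way to think about the parallelism question is to make the degree of concurrency a tunable cap rather than a fixed answer. A minimal Python sketch, assuming a caller-supplied `upload_one` function (hypothetical) that uploads a single chunk; the default of 4 is arbitrary and would be tuned against available bandwidth and any rate limits.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, TypeVar

T = TypeVar("T")


def upload_all(
    chunks: list[bytes],
    upload_one: Callable[[bytes], T],
    max_parallel: int = 4,
) -> list[T]:
    """Upload chunks with at most max_parallel in flight; results keep chunk order."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(upload_one, chunks))
```

Past the point where the network link is saturated, more parallel calls mostly add contention, so the cap matters more on slow or metered connections than on fast ones.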

  • @castulo
    @castulo 8 months ago

    👏Bravo, on point as always. Thanks Evan, keep up the good work man!

  • @ediancomachio2783
    @ediancomachio2783 8 months ago +1

    This is pure gold, thank you so much.

  • @charan775
    @charan775 2 months ago

    How do you handle nested folders in your schema?
    Also, chunks could be kept in a separate table at the user ID level, so that we can reuse chunks across different files.

  • @dannyryngler6425
    @dannyryngler6425 6 months ago

    Question - what should the file id be? It can't be based on the file name, as names can change. It also couldn't be a hash of the whole file, as the file itself can obviously change. Amazing content, thank you!!

    • @hello_interview
      @hello_interview  5 months ago +1

      Depends on whether you want versioning or not. It can be the fingerprint or a random UUID, depending on requirements.

  • @ezwalduzumaki3161
    @ezwalduzumaki3161 months ago

    Begging you to answer... much love. One question regarding non-functional requirements: how do you decide which one to pick? You started with uploading large files rather than working through your non-functional requirements from top to bottom. Why? What's the intuition behind that?