System design interviews seem totally phony. They probably originated at Google or somewhere like that, and then all the other companies started copying them (humans aren't smart, so they just go with the flow; that's how human culture works). No one ever does system design solo in their software engineering jobs; there are always teams of people doing this for months. Whatever we can come up with in 30 minutes is absolutely useless (was YouTube really designed in 30 minutes by a single person? no), so what's the point of testing something that is useless and that we will never use in our real jobs? It makes ZERO sense. And if something requires GRE/SAT-style preparation, it's not practical knowledge; people are "training" for something that just isn't used in their real jobs, which is another red flag. If they're trying to test intelligence or problem solving, they should stick to those kinds of questions. This system design stuff is just total BS.
Considering they get most of the information from previous assessments, problem solving and technical rounds, this is the place where one can have a true SWE to SWE conversation. What you mentioned about how products are being ideated over weeks or months is true, and that's the point of system design interviews. Your idea will never become an actual product, but rather, this is a good way to test the real-time problem solving ability of a candidate. In theory, there are no wrong answers in system design interviews. Your solution might be flawless or it could have different flaws, but just the fact that you're able to adjust to changing micro-requirements and continue building is a good sign of a promising candidate.
It's not about coming up with the ideal design. It's more about seeing whether the candidate understands various concepts and technologies, and can communicate them well.
Hello! Thanks for the videos; they are very helpful, with great explanations and questions. At 11:04 you mention a load balancer, and this confused me, because the diagram is more of an architectural design than an application design (microservices, BFF, Kubernetes). Is the LB meant as a BFF that manages authentication and requests, or is a Kubernetes LB more a part of the microservices layer? Thank you!
I made an observation: when you reply to a comment on YouTube, the person's name is added to the input by default and it is highlighted. But when we type the person's name ourselves, it is not highlighted and is treated as plain text. Does anyone know how this functionality works? P.S. I'm trying this out on the YouTube phone app.
I have a question, people: why do we have to update the videos collection when the user updates a profile picture? The videos reference the user via an id, so if the user updates their profile picture, only that user document should need updating. Since the video collection has the user id, the next time a request for a video comes in, it will fetch the user by id and the updated profile picture will already be there in the user document. Why update all of that user's video documents asynchronously? Isn't this bad practice? Can anyone explain? I think I am missing something here.
@NeetCode Thank you so much for such great content! Could you please create a system design video on how to design a feature flag/toggle system similar to the CloudBees or GitLab ones?
The one thing that stumps me, and that gets asked all the time, is how to update the table of videos that need to be processed without losing the queue of videos waiting to be processed. Any resources on improving the reliability of the encoding pipeline?
oh wait just got a bit farther, and you're saying sending the chunks via HTTP is streaming. I see. I thought streaming required using a different protocol.
YouTube employs HTTP-based adaptive streaming technologies like DASH (Dynamic Adaptive Streaming over HTTP) and HLS (HTTP Live Streaming) rather than traditional streaming protocols!
Is it really that much better to use NoSQL and potentially need to update an unlimited number of records, instead of having one user record and updating it? I don't see how it is an improvement over a relational DB. Yeah, it avoids a join, but I don't think the join is a big performance hit.
Yes, NoSQL is a bad idea here. A join is the better way to get all the user data when needed; rewrites are actually somewhat frequent if you look at how users make immediate changes multiple times right after upload; the metadata content fields are going to change constantly, and with NoSQL that could be a very expensive mass update vs. SQL; storing profile pic URLs in the video metadata is strange when a join could very easily solve this; and it's also potentially a many-to-many relationship… SQL is just better here.
Neetcode - why did we decide to use NoSQL for storing metadata rather than a relational DB? And how can we handle the case you briefly touched on where we don't want to upload the data again after a network partition during file upload?
NoSQL was used since uploads are frequent, mandating sharding. But I don't think MongoDB is suitable for YouTube, since it isn't great for availability: there is downtime when the master goes down while the next master is elected from the slaves. I would prefer Cassandra for its high availability (leaderless). But Cassandra isn't good when there are read queries on multiple attributes, so I would couple Cassandra with Elasticsearch, which also brings in fuzzy search. If this were Netflix, I would have stuck with a simple MySQL, since writes are not frequent and it provides enough querying power; if fuzzy search were required, I would still couple it with ES. (I would never use ES as a primary DB, since it's unreliable.)
What's up with this NoSQL scales and SQL doesn't scale? I know those interviews are weird, but I can't say something like that even if that's required to pass...
Starting off with the denormalized video collection is just bad. Obviously you would not store user data in the video document. User_id only. To help performance on the read side you would have another layer above videos and users that is denormalized. Having the correct abstractions is everything.
Let me know if you guys want more system design content - I've been wanting to do more and have a lot of ideas, so hopefully you all are interested! Let me know, and feel free to drop a like 👍 =)
🚀 neetcode.io/ - Get lifetime access to every course I ever create!
Design a system that pushes notifications with image and/or video to a mobile device using local storage only. No AWS or other external servers allowed. Once you have that done, let Eufy know, bc they f’d up big time 😂
Thanks for the video! Could you recommend any good resources for learning System Design at a level like this?
Thanks again
More systems design content sounds great and if you want to do it, then you should!
Commented this on an old video, but the API for neetcode is not working. I am unable to sign in with either Google or GitHub.
GIVE ME MORE
Neetcode! I've been applying to jobs for 7 months now since my graduation and struggled to get interviews as my resume was fairly unimpressive. I kept telling myself that I just needed to make sure that I killed any interview I DID get and your channel is what helped me to do just that. Got an offer today that blew my expectations out of the water. Thanks for all you do for all of us and keep it up!
Congratulations, I'm really happy for you! 🎉
are u fired
Hello Alex, what company you got the offer from??
@@VY-zt3ph Infineon Technologies, a German semiconductor company.
@@aelam02 u live in Germany brother??
Thank you Mr beast for system design 😊
Awesome video! Thank you for going into detail and explaining what every part of an example system architecture accomplishes and why it's useful. One little nitpick: at 22:00 you say YouTube is using TCP, which (for quite some time now) is no longer true. If you look closely at the client protocol used to send the video/audio data (you show it at 18:29), you will see that YouTube is using QUIC (HTTP/3), which is built on top of UDP instead of TCP. It includes most of the functionality of the TCP-based HTTP protocol but is technically not using TCP.
Just wanted to point that out in case anyone was wondering...
Great video though, keep up the neet work!
First video recently that I haven't skipped for one second. Great work!
I think you've missed most of the points in the watching part. A user can change the resolution, so we need multiple versions of each video. Also, different OSes (iOS, Android) need different file formats, so with 4 resolutions and 3 formats, 4*3 = 12, each video should be stored as 12 different videos. We also need to store the videos in chunks in cloud storage, with a DASH manifest file that lists all the video's information, like each chunk's number and its URL. So if the user skips to the 4th chunk, we check the manifest to get the URL for the 4th chunk and request that particular chunk in the specific resolution and format.
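The manifest-driven chunk lookup described above can be sketched roughly like this. It's a toy, dictionary-shaped manifest; real DASH manifests are XML MPD files, and every URL and name here is made up for illustration:

```python
# Toy stand-in for a DASH manifest: one list of chunk URLs per
# (resolution, format) rendition. Purely illustrative.
MANIFEST = {
    ("720p", "mp4"): [
        f"https://cdn.example.com/v123/720p/mp4/chunk_{i}.m4s" for i in range(10)
    ],
    ("1080p", "webm"): [
        f"https://cdn.example.com/v123/1080p/webm/chunk_{i}.m4s" for i in range(10)
    ],
}

def chunk_url(resolution: str, fmt: str, chunk_index: int) -> str:
    """Resolve the URL of a single chunk for one rendition.

    Seeking to chunk N is just an index into the manifest; earlier
    chunks never need to be downloaded.
    """
    return MANIFEST[(resolution, fmt)][chunk_index]
```

Skipping to the 4th chunk at 720p/mp4 is then just `chunk_url("720p", "mp4", 4)`, which the player requests over plain HTTP.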
Bro! Genuinely such eye-opening content, thank you! I am nowhere near solving problems like this yet, but it's really helpful getting exposed to all the realistic considerations involved in designing large-scale systems.
seriously your explanations of complex topics are amazing, i’m so glad i found you! I was going cross eyed looking at graphs on Leetcode and their explanations were not helpful at all. Officially subscribed, thank you so so so much!
We want more system design video...I love all your videos ❤️
These types of videos are such a blessing, since they go deep into a particular use case and the design decisions behind it, which would otherwise be difficult to gain experience with. I can try to learn as many frameworks as possible and do as many projects as possible, but without real-world (or close to real-world) lessons like these I might be making mistakes that aren't visible to the naked eye.
💯
Yes, we need it. Hope to have other content appear in the course, such as machine learning, data analysis.
2:14 - Reliability is defined as the ability of a system to work as specified in spite of external factors (high load) or internal ones (faults). If our non-functional requirement is that videos must not disappear after upload, we are essentially asking for the system to be durable (not reliable).
good point
Reliability entails building software that consistently delivers correct results. Your point, I think, concerns performance, not reliability.
such beauty looking at all the different high level trade offs and considerations going into designing an application that I have been using daily for the past 15 years
Getting into system design and one of the best vids I've seen so far, great in depth detail
You were born for this really. Somehow this complex system explanation holds my attention and usually none do
The fact that YT isn't streaming a data source whole, but making HTTP requests to load chunks of video/audio and stitching them together on the client end,
BLEW MY MIND!!! My whole life seems like a lie, lol.
Great video, thanks mate.
Read about HLS (HTTP Live Streaming).
Even the Internet handles data in chunks through a concept called packetization, which is a fundamental principle of data transmission in modern networks.
I have a final interview for swe full time coming up in 30 mins. I was told it’s going to have a system design question most likely. I don’t know anything about system design, but after watching this video I feel way more confident now. You explain things so well. You’re the best man.
How did it go, bro?
You can easily have a user service and store all of the user's information there, including the uploader's, so you don't need to store it alongside the video; just keep a userId as a reference. This way you decouple your systems and follow a more domain-driven design, and if the user data changes you don't need to go through thousands of videos just to update a profile picture. At the beginning of every system design interview/document, it's best to write down your domains so you can see which services you need before jumping into the design.
Yes, it can be cached as well.
i would prefer a user system as well
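A minimal sketch of the normalization suggested in this thread, with in-memory dicts standing in for the user and video collections (all field names and IDs are illustrative):

```python
# Video documents hold only a userId reference; uploader details live in a
# single user document, so a profile picture change is exactly one write.
users = {
    "u42": {"name": "Alice", "profilePicUrl": "https://cdn.example.com/pics/u42_v1.jpg"}
}
videos = {"v1": {"title": "My vlog", "userId": "u42"}}

def video_with_uploader(video_id: str) -> dict:
    """Join the uploader's data in at read time via the userId reference."""
    video = videos[video_id]
    uploader = users[video["userId"]]
    return {**video, "uploader": uploader}

# Changing the profile picture touches one record, not every video document:
users["u42"]["profilePicUrl"] = "https://cdn.example.com/pics/u42_v2.jpg"
```

The trade-off is one extra lookup (the "join") per read, which a cache in front of the user service can largely absorb.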
For things like the user profile photo, I wouldn't think you'd store that in each video document. The document should contain a reference to the profile photo. Each user can only have one profile photo, so this is fine.
hasn't the author said that? profile picture is stored in S3, the DB just references it🤔
First I thought: why is MrBeast teaching system design lmaooo
Although normalisation isn't expected in NoSQL, it's expected for you to normalize at a collection level in document databases like Mongo. It just makes it easy to deal with points you mentioned, like profile pictures and what not
On the profile picture updating asynchronously: you could always solve that by referencing a latest.png and only updating the object store so that latest.png points to the newest image. The change would then propagate much faster than updating 1000 video documents.
The only video on YouTube that actually treats "Design YouTube" as a fresh question just thrown at you in an interview, and follows a top-down approach to present a possible design.
You're definitely killing it! Love this content!
If a video takes 1 minute to encode, a worker can encode 60*24 = 1440 (call it ~1500) videos a day. 50 million divided by 1500 is about 35,000 workers. You're making the math much more complicated than it needs to be in the section on the encoding workers.
you can scale the number of encoding workers based on queue backlog length
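The back-of-the-envelope math above, written out (rounded inputs: roughly 1 minute of encoding per video and ~50M uploads per day, as assumed in the thread):

```python
import math

videos_per_day = 50_000_000
encode_minutes_per_video = 1

# One worker encoding around the clock: 24 h * 60 min = 1440 videos/day.
videos_per_worker_per_day = (24 * 60) // encode_minutes_per_video

# Workers needed to keep up with the daily upload volume.
workers_needed = math.ceil(videos_per_day / videos_per_worker_per_day)
print(workers_needed)  # 34723, i.e. roughly 35,000 workers
```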
Putting huge video files in a queue is too much: very costly and slow. Instead, you should store the file in S3 and just publish an event to Kafka/RabbitMQ/etc., which is consumed by the encoding service, which in turn knows it has to fetch the video with id XYZ and start encoding it.
What if the file is fed into a pipeline, split into chunks in the first step, and then processed in parallel?
@@vyshnavramesh9305 You can use multiple threads and split the file into chunks; however, putting video files into a message queue is still bad practice and uses the technology for the wrong reasons, where it doesn't fit. Message brokers are not there to transfer BLOBs.
On the NoSQL user profile picture topic: instead of updating the picture in every document, wouldn't it make sense to store the profile picture as userid.jpg and overwrite that userid.jpg on changes? The reference would always point to the same destination, so updating the picture would never require updating references in multiple documents.
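A tiny sketch of the stable-key idea in the question above, with a dict standing in for an S3-style object store (the key naming scheme is invented for illustration):

```python
object_store: dict[str, bytes] = {}  # stand-in for an S3-style bucket

def profile_pic_key(user_id: str) -> str:
    # Every video document references this one stable key, never a versioned one.
    return f"profile-pics/{user_id}.jpg"

def upload_profile_pic(user_id: str, image_bytes: bytes) -> None:
    # Overwrite in place; no document references need to change.
    object_store[profile_pic_key(user_id)] = image_bytes

upload_profile_pic("u42", b"old-image-bytes")
upload_profile_pic("u42", b"new-image-bytes")
```

The catch in practice is caching: a CDN will happily keep serving the old bytes for a stable URL, so you'd likely pair this with short TTLs or explicit cache invalidation.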
The justification for using NoSQL was kind of lacking. Relational databases do just fine with high read throughput. I don't think it's necessarily a bad choice; I'd say either SQL or NoSQL would be fine.
16:35 Lesson learned: If you have demonstrated you know your stuff, saying "I don't know" actually makes you seem more competent than trying to sneak past addressing something.
You can't use UDP in the browser. You can use websockets which are built on TCP
Awesome video! Your explanation is easy to understand and detailed enough for a video like this. Would be great if you did more videos about system design.
Why don't we skip the first object storage? Like, the app server directly adds the video to the queue?
Once a video is encoded and stored in the (encoded) object storage, does the original video need to remain in the raw object storage? If it can be removed, then what is the use case of the raw object storage? Could the raw object storage be removed from the design, with uploaded videos going directly to the message queue?
Also, I thought sharding could only be implemented for NoSQL databases. If SQL databases can implement sharding, then they can also scale horizontally, and I don't see a downside to using them over NoSQL databases.
Message queues aren't generally optimised for holding large blobs of binary data like that - a raw video could easily be 1GB+. Also it would probably have to be placed in the queue in a single operation, while object stores usually offer multipart upload.
The queue would probably just hold metadata like videoId, timestamp, videoType etc.
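A sketch of the pattern these replies describe, with Python's in-process `queue` standing in for Kafka/RabbitMQ and a dict for the object store (the message shape and key names are invented for illustration, not any specific broker's API):

```python
import json
import queue

# Raw bytes land in the object store first, typically via multipart upload.
object_store = {"raw/xyz.mp4": b"<gigabytes of raw video>"}

events: "queue.Queue[str]" = queue.Queue()

# Producer: once the upload completes, enqueue only a small metadata event.
events.put(json.dumps({"videoId": "xyz", "key": "raw/xyz.mp4", "videoType": "mp4"}))

# Consumer (encoding worker): read the event, then fetch the blob separately.
event = json.loads(events.get())
raw_bytes = object_store[event["key"]]  # the heavy bytes never touch the broker
```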
10:34 Why do we need to update all the video documents if we just updated the user image? The user id stays the same.
Please more system design content.🙇
amazing explanations! thank you so much for your time
Even in live sports streaming, delivery to end users is mostly over TCP. TCP gets you congestion control, error checking, and guaranteed ordered delivery. With a live stream, sure, it sucks when you are 10 seconds behind, or when you see the "10x speed up" or a fuzzy screen on a slow network, but ultimately as viewers we still want to see most of the live video played. The client side can throw an error when there is a consistent connection failure, or skip ahead when it finds itself way behind. With UDP you really don't get those benefits. TCP connection reuse also helps. In practice, almost no one will use raw UDP; there are other proprietary protocols out there, but most of them depend on TCP.
21:55 Can we use WebSockets instead of TCP to fetch real-time data from S3?
don't really see the point of using websockets in this case and it's harder to scale websockets as well
Hey there! Thanks for the awesome video, just one question: wouldn't it be better to send to the queue, encode, and then store, so we only store the encoded video instead of the raw footage? At least in this case, where we don't need the raw video afterwards.
Durability, my friend.
Hi Neetcode. Thank you for your work. Would you mind sharing/telling us what whiteboarding app you use please? Thank you!
i was abt to ask same thing! Great vid!
Loved the conciseness of the video
For a split second I thought Mr. Beast was actually a software engineer.
FYI - Vitess is also used by Slack. Great throughput and amazing latency!
What tool/software do you use for writing/drawing in your videos?
this is awesome!!
btw, what app/website are you using to write this notes?
Since we have to favor availability over consistency, should we use some other database like Cassandra or DynamoDB than MongoDB?
Hi NC, please what tools do you use for making your videos?
Hardware and Software tools
Very helpful. Interesting details about YouTube using MySQL with Vitess.
Thx bro, only have a few hours before my SD interview, very useful :)
How did it go
There’s a lot of issues in this video. Durability instead of reliability, no explanation for how videos get from the object store to the CDN, NoSQL not being the best option for metadata, caching videos at the app server being a bad idea in general when you have a CDN, apparently saving profile picture URLs in the video metadata instead of just making a separate request for user data, etc.
the profile picture thing kinda ticked me off too. it didn't make sense to me. stopped watching the video after this because it left me rather confused.
We could store the user id with the video entry, then use that user id to fetch all the user-related information, including the profile picture, instead of updating it for all the video records.
Hi, thanks for the content. One question: if in a real interview you were asked exactly this question and your answer was the content of this video, how well would the candidate have done?
10:57 What if we store only the video metadata in the NoSQL database and the other data in SQL?
What's the point of having two object stores: one for the raw file and one for the encoded file?
Hey, can't we use two DBs here to tackle the problem where you said that after uploading a video you don't immediately see the newest video?
I think we could use two DBs, NoSQL and SQL: use SQL only for reads and implement caching on top of the SQL DB.
@neetcode, any plans to launch PPP-based pricing so that it becomes comparably affordable across all countries, not just the US? I tried emailing about this as well but got no response.
Sorry, I might've missed your email - feel free to send it again.
Yes. We really like your videos and content, but the purchasing power difference is the only thing stopping me from buying Pro. For a fresh grad, it's a ton of money in other countries.
@@sarathchandrareddy3033 Feel free to email me!
@@NeetCode Here in India, its a significant amount of money when converted into Indian Rupees. I really like your videos.
Why would you use in-memory over distributed caching for your LRU cache?
I can't wrap my head around that choice.
Thanks for the amazing design throughout
11:00 If a user updates his profile picture, you don't need to update it anywhere in the video metadata if you choose to name the profile picture [uid].jpg; this way you will always have the current profile picture.
Hello Neet. I have a question: in your object-oriented design interview course, do you have a roadmap planned for which future case studies you will be covering?
Hi, thanks, amazing video!
Seems like the write into the DB should be done only (or also) after the video is ready to watch, that is, after the encoder has finished.
No?
Optimization:
Instead of updating the video data to include references to the new profile picture, couldn't we theoretically use object versioning and re-upload the same key with the new pfp? Maybe there are potential consistency issues there, but are we that worried?
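A minimal sketch of the stable-key idea from the comments above: keep the profile picture at a fixed key derived from the user id and overwrite that key on update, so video records never need to change. (Toy in-memory "object store" with hypothetical names; a real store like S3 would also let you version the key.)

```python
object_store = {}

def pfp_key(user_id: str) -> str:
    """Stable key: the URL for a user's picture never changes."""
    return f"profile_pics/{user_id}.jpg"

def upload_profile_pic(user_id: str, image_bytes: bytes) -> str:
    key = pfp_key(user_id)
    object_store[key] = image_bytes  # overwrite in place; key stays stable
    return key

upload_profile_pic("u1", b"old-image")
upload_profile_pic("u1", b"new-image")
# Only one object exists, and any record pointing at the key sees the latest image.
```

One caveat worth raising in an interview: if that key is served through a CDN, you also need cache invalidation or short TTLs, or viewers keep seeing the cached old image.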
System design interviews seem totally phony. They look like they originated at Google or something and then all the other companies started copying them (humans aren't smart, so they just go with the flow; that's how human culture works). No one ever does system design like this in their software engineering jobs. There are always teams of people doing it for months. Whatever we can come up with in 30 minutes is absolutely useless (was TH-cam really designed in 30 minutes by a single person? No), so what's the point of testing something that is useless and that we will never use in our real jobs? It makes ZERO sense. If something requires GRE/SAT-style preparation, it is not practical knowledge, and people "training" for something that is just not used in their real jobs is another red flag. If they're trying to test intelligence or problem solving, they should stick to those kinds of questions. This system design stuff is just total BS.
It's a fun conversation starter though
Considering they get most of the information from previous assessments, problem solving and technical rounds, this is the place where one can have a true SWE to SWE conversation.
What you mentioned about how products are being ideated over weeks or months is true, and that's the point of system design interviews. Your idea will never become an actual product, but rather, this is a good way to test the real-time problem solving ability of a candidate. In theory, there are no wrong answers in system design interviews. Your solution might be flawless or it could have different flaws, but just the fact that you're able to adjust to changing micro-requirements and continue building is a good sign of a promising candidate.
You can't design anything in 30 minutes; it is a way to check how people think, how much context they can provide, and how they deal with trade-offs.
It's not about coming up with the ideal design. It's more about seeing whether the candidate understands various concepts and technologies and can communicate them well.
It's definitely a skill you must acquire outside of real-life software engineering. Almost like studying for an exam.
You are a great teacher. Thanks!
Hi, thank you so much for sharing amazing contents!
Hello! Thanks for the videos, they are very helpful, with very good explanations and questions. 11:04 You talk about a load balancer, and this was confusing for me, because the diagram is more of an architectural design than an application design (microservices, BFF, Kubernetes). Is the LB mentioned a BFF that manages authentication and requests, or is it more like a Kubernetes LB that is part of the microservices? Thank you!
Great video. I like the high level stuff.
So the queue is a place where raw videos wait their turn to be encoded by the encoding service?
I made an observation: when you reply to a comment on TH-cam, the person's name is added to the input by default and it is highlighted. But when we type the person's name ourselves, it is not highlighted and is treated as plain text. Does anyone know how this functionality works?
P.S. I'm trying this out on the TH-cam phone app.
I have a question, people:
Why do we have to update the videos collection when the user updates a profile picture?
Since the videos reference the user via an id, if the user updates their profile picture, only that user document should be updated.
And since the video collection has the user id, the next time a request for a video comes in, it will fetch the user by id, and the updated profile picture will already be there in the user's document.
So why update all the video documents of that user asynchronously? Isn't this a bad practice? Can anyone explain? I think I am missing something here.
Awesome content ❤
nice thank you so much, thanks for sharing the knowledge
With the cache, can you store references instead of the videos themselves?
Why are there no arrows from the metadata store to the encoded object stores? How does metadata get to the CDN?
Can we ask for more system design stuff from you? Please!
First comment, lots of love and respect. I solved lots of questions just from your explanations.
@NeetCode Thank you so much for such great content! Could you please create a system design video on designing a feature flag/toggle system similar to the CloudBees or GitLab ones?
Why are you using NoSQL and not SQL?
What books do I need to read for this? Any suggestions? What exactly is the name of the topic I need to research?
How does youtube handle the quality control of a video?
The one thing that stumps me, and it gets asked all the time, is how to update the table of videos to be processed without losing the queue of videos waiting to be processed. Any resources on improving the reliability of the encoding pipeline?
You can use a distributed message queue with a persistence feature, like Apache Kafka.
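A toy sketch of the at-least-once delivery semantics a persistent queue like Kafka gives the encoding pipeline: a job is only removed once the worker acknowledges it, so a crash mid-encode just means the job is redelivered. (This is an in-memory stand-in to show the idea, not the Kafka API.)

```python
from collections import deque

class AckQueue:
    """Minimal ack-based queue: unacked messages are never lost."""

    def __init__(self):
        self._pending = deque()   # published but not yet delivered
        self._in_flight = {}      # delivered but not acked: id -> payload

    def publish(self, msg_id, payload):
        self._pending.append((msg_id, payload))

    def deliver(self):
        msg_id, payload = self._pending.popleft()
        self._in_flight[msg_id] = payload
        return msg_id, payload

    def ack(self, msg_id):
        del self._in_flight[msg_id]

    def requeue_unacked(self):
        """Called when a worker is detected dead: redeliver its messages."""
        for msg_id, payload in self._in_flight.items():
            self._pending.append((msg_id, payload))
        self._in_flight.clear()

q = AckQueue()
q.publish("video-1", "raw_v1.mp4")
msg_id, _ = q.deliver()   # an encoding worker takes the job...
q.requeue_unacked()       # ...and crashes before acking: the job is not lost
```

In Kafka the same effect comes from persisted, replicated partitions plus consumer offsets that are only committed after processing succeeds.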
Interesting, why does youtube send chunks of data via HTTP requests instead of using streaming?
Oh wait, I just got a bit farther, and you're saying sending the chunks via HTTP is streaming. I see. I thought streaming required using a different protocol.
TH-cam employs HTTP-based adaptive streaming technologies like DASH (Dynamic Adaptive Streaming over HTTP) and HLS (HTTP Live Streaming) rather than traditional streaming protocols!
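A minimal sketch of the adaptive-streaming idea mentioned above: the client fetches a manifest listing chunk URLs per quality, then requests whichever chunk it needs over plain HTTP, switching quality as bandwidth changes. (Hypothetical manifest shape and threshold, loosely modeled on a DASH MPD; real manifests are XML.)

```python
manifest = {
    "video_id": "v1",
    "chunks": {
        "1080p": ["/v1/1080p/000.mp4", "/v1/1080p/001.mp4", "/v1/1080p/002.mp4"],
        "480p":  ["/v1/480p/000.mp4",  "/v1/480p/001.mp4",  "/v1/480p/002.mp4"],
    },
}

def next_chunk_url(manifest: dict, chunk_index: int, bandwidth_kbps: int) -> str:
    """Pick a quality for the current bandwidth, then index into its chunk list."""
    quality = "1080p" if bandwidth_kbps >= 5000 else "480p"
    return manifest["chunks"][quality][chunk_index]

# Seeking straight to the 3rd chunk on a slow connection is just another GET:
url = next_chunk_url(manifest, 2, bandwidth_kbps=1200)
```

This is also why seeking works without downloading the whole file: the client looks up the chunk in the manifest and issues an ordinary HTTP request for just that segment.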
Is it really that much better to use NoSQL and need to update a potentially unlimited number of records, instead of having one user record and updating it? I don't see how it is an improvement over using a relational DB. Yeah, it avoids a join, but I don't think the join is a big performance hit.
Yes, NoSQL is a bad idea here. Joins are a better way to get all the user data when needed; rewrites are actually somewhat frequent if you look at how users make immediate changes multiple times on upload; the metadata content fields are going to change constantly, and in NoSQL that could be a very expensive mass update vs SQL; storing profile pic URLs in the video metadata is strange when a join could very easily solve the issue; and it's also potentially a many-to-many relationship... SQL is just better here.
Great video, thank you!
Should you, in a design interview, cover how many app servers you might need based on the DAU?
Great Video!
Can you explain the workers math?
Great video 👌
Neetcode - why did we decide to use NoSQL for storing metadata rather than a relational DB? And how can we handle the use case you briefly touched upon, where we don't want to upload the data again after a network partition during a file upload?
NoSQL is used since uploads are frequent, mandating sharding. But I don't think MongoDB is suitable for TH-cam since it isn't good for availability: there is downtime when the master goes down while the next master is elected from the slaves. I would prefer Cassandra due to its high availability (leaderless). But Cassandra isn't good when there are read queries on multiple attributes, so I would couple Cassandra with Elasticsearch, which in turn brings in fuzzy search.
If it were Netflix, I would have stuck with simple MySQL, since writes are not frequent and it provides enough querying power. If fuzzy search were required, I would still couple it with ES. (I would never use ES as a primary DB since it's unreliable.)
Finally an english tutorial
What's up with this "NoSQL scales and SQL doesn't scale"? I know these interviews are weird, but I can't say something like that even if that's required to pass...
Could you please design a notification system and an ad click event generation system?
More sys design! Thanks!
What's up, bro?
Just chilling, you?
nice video!
Could've just used NoSQL to store the data schema of the user model.
I first thought Mr. Beast was going to teach us the youtube system
But then I read the channel name 😅
Why would you use an LRU cache when the most viewed videos are the most recent? You'd want the opposite of that.
LRU means the least-recently used will be the first to be evicted - which I think is what you're also saying, unless I'm misunderstanding?
Where is the API gateway in relation to all this? Specifically, in relation to the CDN?
Starting off with the denormalized video collection is just bad. Obviously you would not store user data in the video document; user_id only. To help performance on the read side, you would have another denormalized layer above videos and users. Having the correct abstractions is everything.
Very informative!