9:30 I am confused: 10^7 min / 60 to convert into hours, right? Then dividing by 3 is wrong, because 10^4 * 1000 / 60 is what you want to compute, and 1000/60 is far from 1/3, so you should get ~1000 processors, not 20
Damn, I think this is it. Couldn't even find the bug during editing.
Thanks for this Prashant!
@@gkcs Pinned Comment? Can I apply for the job now? Gaurav Sen Pvt Ltd :D
@@prashantgupta6885 Hahaha 😁
@@prashantgupta6885 now you can't, your comment is now unpinned lol 😂
@@4n81t Dunno how that happened. Pinned it again 😁
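Since the thread only names the final figure, here is a quick sanity check of the corrected math as a Python sketch; the inputs are the video's assumptions (10^7 minutes uploaded per day, ~1 GB per raw hour of video, a 10x multiplier for redundancy and extra resolutions, ~20 MB/s handled per processor):

```python
# Sanity check of the corrected processor estimate (all inputs are the
# video's assumptions, not measured figures).
minutes_per_day = 10**7
hours_of_video = minutes_per_day / 60          # ~1.67e5 hours/day
gb_per_day = hours_of_video * 1                # 1 GB per hour of raw video
mb_per_sec = gb_per_day * 1000 / 86_400        # ~1930 MB/s of raw upload
mb_per_sec *= 10                               # redundancy + resolutions
processors = mb_per_sec / 20                   # 20 MB/s per processor
print(round(mb_per_sec), round(processors))    # -> 19290 965
```

So roughly a thousand processors, matching the correction above rather than the 20 in the video.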
I was interviewed for YouTube recently, and this was the exact same question I was asked. Gave a similar reply. Love your solution, and the fact that you uploaded this video! Subscribed!
Thanks 😁
Did you clear the interview?
@@sar3388 not for youtube but got into another faang
@@sar3388 no but for other reasons. I got into two other faangs though if that matters
[1:30, 2:00] Hi Gaurav - the part where you account for a multiplier on the storage requirement due to replication across data centers is really smart! I haven't seen this mentioned in many books.
Here in the above situation we are taking 10^7 minutes, which means each of these videos is played once in a day. But multiple users play a video at the same time, so we have to take a multiple on average. Let's say each video is played by 1000 users simultaneously; then the time will be 10^7 x 1000, and the processor count will change.
By the way, great explanation. Thanks
Bang on, this capacity estimation is very accurate and detailed. However, I would avoid it in an actual system design interview, since this estimation will take almost 10-15 minutes of your time.
But I will say it's important to go through the whole video to capture the essence and use the required details in your interview.
True, I would only estimate the capacity if I had to justify my architecture or if the interviewer specifically asked me to.
Please upload more such videos! It's much better to calculate, make mistakes and reach the answer than cramming and googling for the answers to questions like this!
😁
The computational power and storage you estimated are just for uploading. If you take into account delivering the videos, serving ads and providing recommendations, the calculation is off by several orders of magnitude.
Yes. I've kept it simple for the interview. There's a lot more than we can talk about in an hour (recommendations, trending tab, analytics etc...)
Your brainstorming videos on designing systems and infrastructures are really helpful.
Thanks!
I was really looking for a way to calculate number of processors based on the bandwidth estimation. And there you have it. Thanks man! Love it. :)
Great video Gaurav!!
And yeah, it would be ~1000 processors, as 10^7 minutes is ~166,666 hours and not 10^4/3 hours
Thanks Nishit!
So nicely he explains concepts..!!
Thank you so much for gr8 info..!!
Seeing u after so long..!!
Thank you 😁
I can see that your Math is spot on.
This is one of the best videos 😍
in terms of system design 🙏
It is really a good conceptual video. I always like the concepts you pick and showcase.
Thanks 😁
One mistake that I could find in the description:
What's the size of each video?
Assume the average length of a video to be 10 minutes.
Assume a 10 minute video to be of size 1 GB. Or...
A video is a bunch of images. 10 minutes is 600 seconds. Each second has 24 frames. So a video has 25*600 = 150,000 frames.
Each frame is of size 1 MB. Which means (1.5 * 10^5) * (10^6) bytes = 150 GB.
Here, a video will have 25*600 = 15,000 frames, and not 150,000. Hence, the total size would come around to 15GB.
Moreover, you failed to take compression into account.
I believe compressing images and videos can greatly help save storage space, plus YouTube will definitely have figured out an optimized way of compressing, storing and extracting the original file at lower cost.
That could change the whole scenario.
For interviews, it can be safe to assume a compression ratio of 0.7.
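A quick worked version of this commenter's corrected numbers as a Python sketch (25 fps and 1 MB/frame come from the description being quoted; reading the suggested "compression ratio of 0.7" as compressed size = 0.7 x raw is my assumption):

```python
# Corrected size estimate for a 10-minute video (25 fps, 1 MB per frame;
# the 0.7 compression ratio is the commenter's suggested assumption).
seconds = 10 * 60
frames = 25 * seconds                        # 15,000 frames, not 150,000
raw_gb = frames * 1 / 1000                   # at 1 MB/frame -> 15 GB raw
compressed_gb = raw_gb * 0.7                 # assumed: compressed = 0.7 x raw
print(frames, raw_gb, round(compressed_gb, 1))   # -> 15000 15.0 10.5
```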
Gaurav, thank you for your elaborate work! Cheers 😌
This is great stuff...!!! So good to see these videos being accessible easily here.
Looks like answering questions for an interview for a job with YouTube.
Without any writing down I estimated 1PB before watching your solution, seems to be roughly in the correct order of magnitude.
Nice :)
Love you bro, you are always there with something new and different from other YouTubers. You are real. ❤❤❤❤
That was an incredible explanation. GJ
Hello sir,
Your videos are really very informative for interviews. I would request you to make one on an end-to-end pipeline with the big data ecosystem, data storage issues, handling streaming data, breaking up microservices etc., to get a clear concept of where in reality we can use all this stuff.
Thanks a ton for such amazing videos.
Good idea, I'll add this to my list 😁
I was wondering when a creator would upload this video... thank you
Got lots in progress :)
You know what, you should launch a full-fledged course on system design on Udemy or any other platform. How many agree?
It's here 😛
get.interviewready.io/courses/system-design-interview-prep
Is it for absolute beginners? If not, could you suggest a great course for beginners?
Wow! Very impressive explanation
Google is coming out with a website for the coronavirus crisis! Please, please do a system design of that! It will be a super hot topic, I am predicting!
Great estimation Gaurav Sir
It gives some estimate of how many resources are required. Excellent, I am thinking of calculating Instagram's or Facebook's resources 🤓
Great 👍
Dishant Kapoor are you a professional developer?
Awesome video! Binge watching all your videos gaurav bhaiya. Can you please do a video on creating a good resume for students studying in Tier 1 and Tier 2 engineering colleges who want to join product-based companies... I mean what type of projects we need to have on a resume, etc.
Great. Keep up. I like your way of expressing things
Thanks for the upload gaurav :) , thanks for the tips to approach such problems
9:30
1. Total video uploads per day = 10^7 minutes
2. Convert minutes to hours: since there are 60 minutes in an hour, the total duration of videos uploaded in a day is
10^7 minutes / 60 ≈ 1.67 x 10^5 hours
3. Calculate data size: if we assume that 1 hour of raw video is about 1 GB, then the total data uploaded in a day is
1.67 x 10^5 x 1 GB
4. Calculate data rate: there are 86,400 seconds in a day, so the rate of data upload per second is
(1.67 x 10^5 x 1 GB) / 86,400 ≈ 1.93 GB/sec = 1930 MB/sec
5. Account for redundancy and resolution processing: in real scenarios we need to process more data due to redundancy and multiple resolutions, so we multiply the data rate by 10. This gives us
19,300 MB/sec
6. Calculate number of processors: if one processor can process 20 MB of data per second, then to process 19,300 MB/sec
we would need 19,300 MB/sec / 20 MB/sec ≈ 965 processors
So, to handle one day's worth of YouTube video uploads we would need about 965 processors.
Thanks!
90% savings is a lot to assume, I think; on average for video files this number should be around 50-75%.
Your uploads are informative, good job man.
😁
About TikTok system design -> next video please
TikTok...I have to install the app first :)
Don't do it @ Gaurav Sen
Very well explained.
Hello Gaurav,
For the third part:
When we had already estimated 30 TB per day to be stored in the first section of the video, why do we again estimate the data to be processed per second?
It can just be 30 * 10^6 MB / (24*60*60) ≈ 350 MB/sec
That would have been a faster method, good catch 😁
Also would have avoided me making the mistake, probably.
@@gkcs @Manish But I noticed one difference: the 30 TB of storage he calculated for storing videos assumes processed videos at 200 MB/hour, while for counting processors he is dealing with unprocessed videos at an assumed 1 GB/hour. So you would see a difference of 5 times.
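A small Python sketch of this shortcut, including the 5x gap noted in the reply above (30 TB/day, 200 MB/hour processed and 1 GB/hour raw are the video's assumptions):

```python
# Commenter's shortcut: derive the processing rate from the 30 TB/day
# storage estimate, then scale by 5x since processors see raw uploads
# (1 GB/hour) rather than processed videos (200 MB/hour).
processed_mb_per_day = 30 * 10**6                 # 30 TB/day, processed
processed_rate = processed_mb_per_day / 86_400    # ~347 MB/s
raw_rate = processed_rate * 5                     # 1000 MB / 200 MB = 5x
print(round(processed_rate), round(raw_rate))     # -> 347 1736
```

The 1736 MB/s lands close to the ~1930 MB/s derived directly from 10^7 minutes/day; the gap is just rounding in the assumptions.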
Great work. Keep up doing good work like this.
Thanks for video!
You are Welcome!
assumptions: key to progress further. Really helpful.
😁
Hi Gaurav, must say these are amazing video tutorials. In this estimation case I have a doubt: I feel that while calculating estimates we should consider formats (MOV, MPEG4, AVI, WMV, MPEG-PS, FLV, 3GPP and WebM) * resolutions (1028, 520, ...). My assumption is that for each format and resolution YouTube will store one video: no. of videos = no. of formats * no. of resolutions... let me know if you feel it's a right assumption.
That's a good point.
I have mentioned the different resolutions, but there may also be different formats similar to how Netflix processes videos.
Legend!🙇♂️🙇♂️, You are inspirational gaurav, Thank you for the Amazing content!!❤
Congratulations on 200K gaurav....
your 100K to 200K journey was pretty fast don't you think?..
Your videos are fun and informative and they help us a lot. Love your content.
Keep going like this...All the best!!
Thoughtful .. great
U r a great teacher
Wooww.. I thought that going from 720p to 480p is done on the video at runtime, but I was not aware that the backend changes the resolution by swapping in another stored low-resolution video, which of course happens at runtime.
7:17 Why did you multiply by 1M here?
Edit: From 9:31 it was so confusing, I didn't get anything
Not sure if I am first, because availability over consistency for the comment section
Hahaha!
Although, I was the first one; watched on LinkedIn.
The way you think is superb..... How can I build thinking skills like yours...... Plz give some tips
Needed this so badly, Thanks again for the awesome video..:)
How do u stay motivated all the time?
And energetic!
Hi @Gaurav Sen,
Your system design content is really amazing. Can you please create a video on game system design (e.g. PUBG, Clash of Clans, Pokemon GO) and how they manage millions of users at the same time?
I am in awe!
9:30: What were you thinking while dividing 10^7 by 3? Just want to know your thought process, though that's wrong.
Hey, big fan here, thanks for such amazing tech concept videos. Just wanted to know how you gain so much in-depth knowledge of every technology in a short span of time. Do you go with books or some other resources? If possible, can you share some resources, links or anything? It will be much appreciated. Thanks
They are based on my experience and highscalability blogs 😁
I've mentioned my sources here: th-cam.com/video/bBPHpH8aKjw/w-d-xo.html
Please create one on CAP theorem and explain some non relational db design like Mongo including its drawbacks.
I have one on CAP theorem coming up soon :)
5:00 Using image size for estimating video size is wrong, as the compression algorithms don't store each image completely; rather, they store something like the difference between consecutive images.
At 9:30 why does the 10^7 get split up into 1000 and 10^4, and then you seem to just drop the 1000 portion? I understand that 1000 * 10^4 = 10^7, so why was the 1000 (or 10^3) dropped?
Awesome insight! You should easily get a job in Silicon Valley.
Thanks for all the system design videos. Wanted to suggest a next topic if you get the chance: how to design Zoom or Facebook/YouTube Live video.
Little confused. The video started with the estimation of "how much storage per day" and you calculated 10^7 mins per day, but at 5:37 you mention the actual number is 10^7 mins of video per minute. That's off by a factor of ~1500, so 600 hrs of new videos per minute comes to almost 180 PB (1500 * 120 GB/min), which seems way off from the 1 PB assumption. Am I missing something?
Thanks for the great video. I think it could be improved if you went a bit slower. Also, the changes in video scenes (which you have done to shorten the video) are a bit distracting. Never mind :)
Thanks for the tips!
Little confused at 5:36. We had assumed 10^7 min for 1 day and not every minute, right?
Hello Gaurav,
Thanks for the great explanation, very clear and informative. 👍
Though I have one question: when we want to process 20 seconds of video per second, in the current case we moved directly to 20 processors (actually it will be in the thousands 😉), but CPU cores and hard disk type are not considered, and those will also impact this count, right?
Let's say CPU cores will help us have concurrent connections, threads or processing power, and with multiple HDDs/SSDs we can read data in parallel.
Can you share your thoughts on the impact of CPU cores, HDD and SSD on the number of processors?
Yes, CPU cores will have an effect on the system. If we use 4-core processors, we could use 20/4 = 5 processors.
A GPU would also have a similar effect on the calculations.
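A generalized form of that adjustment, as an illustrative Python sketch (the numbers in the example call are assumptions, not figures from the video):

```python
import math

# Cores (or a GPU's effective parallelism) divide the number of
# physical processors needed for a given throughput target.
def processors_needed(required_mb_per_s, per_core_mb_per_s, cores_per_cpu):
    per_cpu_mb_per_s = per_core_mb_per_s * cores_per_cpu
    return math.ceil(required_mb_per_s / per_cpu_mb_per_s)

# e.g. 400 MB/s of work, 20 MB/s per core, 4-core CPUs -> 5 processors
print(processors_needed(400, 20, 4))
```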
Taking into account compression formats like H.264 and H.265 could improve the estimation too
To be honest, the video sizes I have after editing are about 700 MB for a 10 minute video.
The 400 MB per hour estimate is a bit risky, but passable in an interview where we are estimating everything anyway :P
I think the interviewer should fix the maximum resolution at least
Naah, defeats the purpose of estimating in the real world.
Hi Gaurav, the content is really informative. I have a small confusion on the cache requirement calculation for each thumbnail @7:12: you multiplied by 1M (10 KB * 90 * 1M); what is the 1M signifying?
1B users / 1000, as 1 in every thousand uploads a video, as explained in the first vid
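Spelled out as a Python sketch (1B users, the 1-in-1000 daily upload rate, 10 KB per thumbnail and the 90-day window are all the video's assumptions):

```python
# Thumbnail cache estimate: 1B users, 1 in 1000 uploads per day -> 1M
# new videos/day; 10 KB per thumbnail, cached for the last 90 days.
users = 10**9
videos_per_day = users // 1000        # 1M uploads per day
cache_bytes = 10_000 * videos_per_day * 90
print(cache_bytes / 10**12, "TB")     # -> 0.9 TB (i.e. ~900 GB)
```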
YouTube stores only the best quality available of any video; YouTube uses some software which reduces the quality of that data when a user selects a lower resolution. YouTube will never store the same video in different resolutions unless you upload them manually
That's incorrect.
If you think that cutting down the resolution on the fly is easier or computationally cheaper, you need to read up on image and video processing. Also, try uploading a video on YouTube: it first processes the 480p version, then heads to full HD and higher resolutions.
They store multiple resolutions. Please read before making general statements.
Good one 👍 Do we really need to store every quality of video? Can't we store the high quality and convert it on read while processing? Though I am not sure if it is possible, I was curious to know
Doing that would require lots of CPU or GPU ad hoc for transcoding the video to the requested quality, so it's only feasible to have all the different resolution versions transcoded and ready beforehand.
What I am thinking is that YouTube doesn't create separate resolution videos from the original resolution.
Maybe it loads the original video in memory, and every user watching the same video at a different resolution gets a different process assigned; then all those processes read from the same memory location at different qualities and transmit to the user's device.
This would save RAM in most cases and also save storage and maintenance cost.
This is just my speculation 😬.
Transcoding is incredibly expensive in GPU allocation time, far more than hard disk.
When Gaurav is making an estimate of the cache, I am confused as to what 90 refers to here. He mentions that it's 90 days' worth of YouTube video thumbnails, and since 1M videos are uploaded each day, the total number of thumbnails goes up to 90M. In a basic sense it just means that Gaurav is assuming the cache stores 90M video thumbnails (whether we take the last 90 days into account or not). However, when sizing a cache for thumbnails there are various factors that come in: maybe I want to know the actual number of popular videos in the last 90 days (a video being popular if it gets at least 5K hits) and then store only the thumbnails of those videos, and that number can be more or less than 90M, right? So why is 90M the number?
Hey Gaurav, at 7:17, what are you multiplying by 1M for? Aren't we getting 10 KB times 90 days of videos for the thumbnails?
The million is for the number of videos per day. 0:15
Thanks for this video Gaurav. Could you also help us understand why SQL was chosen as the DB for YouTube, considering this large scale of data and the performance requirements?
@10:19 - You mentioned the processor has to read data from somewhere and write it back to some place, right? Reading happens from the same storage (30 TB without HA) into cache and then back to the same storage; if not, would it require more storage than the number you came up with before? I could be assuming wrong here.
There must be a finite amount of storage space and processing power. Power consumption, physical space of the servers, material consumed in the construction of the necessary equipment and of course financial outlay. It will cost money for someone to store multiple copies of a 15 year old cat video that nobody ever watches.
I didn't get where that 1M came from at 7:38; can anyone please help me understand?
The total space requirement for the thumbnail cache should be equal to the videos uploaded in the last 90 days + evergreen videos, and we are assuming 1 thumbnail to be 10 KB, so it should be 10 KB * (number of videos in the last 90 days). Is it because we assumed 10^6 videos are uploaded per day?
same here @GauravKumar-xz9uk
@@manupathria1073 Yes, it is the number of users uploading a video per day, as explained in the first video
Same doubt. It should be taken as 10 KB * 90 days * #thumbnails per day, instead of 10 KB * 90 days * #thumbnails in 90 days
I think each second has 30 or 60 frames. I never heard 24 🤔. Btw, love your videos; correct me if I am wrong
Thanks!
You should Google this instead of commenting :P
You haven't heard that most movies are shot at 24 fps by default?
You are probably thinking of 30-60 fps in terms of games. In terms of movies/video it is usually 24 fps.
Why do we need to store the lower resolutions explicitly, separate from the high resolution? Can't they be generated or sampled down during streaming using a filter?
Well, not yet. Currently, multiple resolutions is the way to go. Variable scale encoding is advancing fast though.
Could you please turn on the auto-generated caption functionality? Much appreciated
Hey Ken, which language are you comfortable with?
@@gkcs English. I'm sorry, I just sometimes can't understand what you say because of your accent. I don't mean to discriminate, and I tried hard to understand what you said. No offense, sorry, I have to be honest
How did you come up with 1 billion users at the start?
Another thing we should also learn is how to estimate the number of users for the software you are developing, or for the question you are designing in an interview.
Your calculation (for storage requirements) is for daily uploads, while the actual report is per minute. :p
@Gaurav Sen... Could you please post the correct calculation for 9:30 onwards in the video?
I'll leave that as an exercise to you. The answer is in the pinned comment btw
Sir is this series for freshers?
I got an estimate of 550 hours of new content every minute. Daym.
Yeah, it's about 550 to 600 per minute. But there are times like New Years and Christmas where it explodes :)
Could you make a video about Uber Eats system design, please?
I would like to watch a system design of Gmail. Will u do it?
At 7:23, why is 10*90 multiplied by 1M?
Also confused. 90 days of metadata for the most popular videos * 1M?
2:39 why 2*X? Like, I got that by combining all the possible quality sizes we get 'X', but as we are keeping 3 copies, shouldn't it be 3*X?
I don't know, I got confused here 0_0, rest was good
3:54 it's written as the daily video limit (per day)... at 5:30 it's mentioned per minute... am I understanding correctly, or am I incorrect?
8:49: Why are we multiplying 64 by 3 * 2?
Can someone explain the 10 KB * 90 * 1M at 7:18 for the thumbnail estimation? I understand the 10 KB, which is a compressed image, but why multiply by 90 days and then by 1M? Appreciate it
If YT had just as many videos being uploaded 10 years ago as today: 3 PB/day * 365 days/year * 10 years ≈ 11,000 petabytes of storage. I can see why YT is so quick to be critical of video content and so quick to delete videos... they can hardly store just any video. How much money would have to be spent on ~11,000 petabytes of hard drives?
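Roughly, in Python (the 3 PB/day comes from the video; the $/TB figure below is purely a hypothetical ballpark for raw drives, not a quoted price):

```python
# Ten-year storage back-of-the-envelope from the comment above.
pb_per_day = 3
total_pb = pb_per_day * 365 * 10              # ~10,950 PB in 10 years
assumed_usd_per_tb = 20                       # hypothetical raw-HDD cost
raw_disk_cost = total_pb * 1000 * assumed_usd_per_tb
print(total_pb, f"${raw_disk_cost:,}")        # -> 10950 $219,000,000
```

And that is before the 3x replication discussed elsewhere in the thread multiplies the bill.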
Maybe this is covered in one of your videos, but what's the most efficient way to check which cache among the 160 nodes of 16 GB actually has the cached data? Can there be sharding or something similar inside the cache, or something like a load balancer for the cache?
Horizontal partitioning on caches is a good idea. Have a look at consistent hashing: th-cam.com/video/zaRkONvyGr8/w-d-xo.html
@@gkcs Got it! Thanks! And thanks a lot for the quick response!!
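For anyone curious what that looks like in practice, a minimal consistent-hashing sketch in Python (node names, virtual-node count and the hash choice are all illustrative):

```python
import bisect
import hashlib

# Minimal consistent-hash ring: nodes are hashed onto a ring, and a key
# is served by the first node at or after the key's position on the ring.
class HashRing:
    def __init__(self, nodes, vnodes=100):
        self.ring = []                              # sorted (hash, node) pairs
        for node in nodes:
            for i in range(vnodes):                 # virtual nodes smooth load
                bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key):
        idx = bisect.bisect(self.ring, (self._hash(key), ""))
        return self.ring[idx % len(self.ring)][1]   # wrap around the ring

ring = HashRing([f"cache-{i}" for i in range(160)])
print(ring.node_for("thumbnail:video123"))          # one of the 160 nodes
```

Only the keys on a departing node's ring segment move when nodes are added or removed, which is why this suits a cache fleet.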
Do they store different quality videos separately? Don't they have any technique like: if I have to send an image frame, I store the highest quality, and when the user needs low quality (in case of low internet speed), I reduce the resolution of a copy of that image and send it (2:34)? I don't have much knowledge in this field
They store different qualities and resolutions, although Zoom works similarly to your idea. Have a look at "scalable video encoding".
@@gkcs Ok, thank you for replying :) This course is really very good
People don't say MB... they say megabyte or megabit
I didn't understand the 500 nodes thing. What was that 64x(3x2) about? Can someone explain it to me?
I think u forgot to add ur own video storage 😁😁
Haha :P
Could you please upload a video about string matching algorithms?
I have: th-cam.com/video/XJ6e4BQYJ24/w-d-xo.html
Which books do u read related to computer science?
7:14 What's the 1M multiplier for?
Represents 1 million uploads per day. Assume 1 billion YT users, and 1 in 1000 of them uploads a video daily: 1 billion x (1/1000) = 1M.
Each thumbnail is 10 KB, hence 1 million videos' thumbnails are 10 KB x 1 million, and then the 90-day rule => 10 KB/image x 1M videos/day x 90 days
The first 2 are gold; the editing in the 3rd part is a little janky.