Go from this basic level to advanced level in 5-Week LIVE HLD Course - www.educosys.com
Please give discount code di extra discount code
For whom you have designed the course?
harkirat X 😂harikirat V
The video started with Keerti interviewing Harkirat; and ended with a 180-degree role reversal. Harkirat is a smooth operator. 😂
He is not a smooth operator. He lacks system design experience and doesn't know many basic things. He comes across as overconfident and doesn't even behave like an interviewee. Hence the interviewer took on the responsibility of continuing the design, because she needed content for the YouTube video.
@@dakshayagarwal2560 +1 It all started with him not listing the watch feature in the functional requirements and then saying "no idea" for the non-functional requirements. He failed the interview in the first 2 minutes.
Harikirat is god
😄😄
This was never an interview, this is just a discussion
The way Keerti explains the concepts and logic shows she has more theoretical knowledge, while Harkirat is being more practical here, e.g. pointing out that the user's profile picture will need updating if it is stored in the metadata. I think that should be the approach to any system design problem, which is way more than just drawing boxes.
Harkirat has now become a common face in Indian Youtech community ❤
This guy has very good real world work experience which is very clear from his questions and suggestions while Keerti has a lot of theoretical knowledge, just an observation.
Harkirat's channel has grown massively over the last few months because he has this original element in his content and people can relate to what he says. As someone who's pivoting from non-tech to tech, I'm truly inspired by Harkirat.
16:46 A user having lots of videos would mean a column in the user table holding an array of video IDs / video URLs. This would make the table unnormalised (breaking 1NF). I think a better way would be to have userId as a foreign key in the video info table (videoId, videoURL, userId).
People often choose denormalized data just to avoid joins. If you are trying to build a highly scalable database, you need to shard the data, and with normalized data you might need to join data from two shards. That is very expensive, because the data from the different shards has to be brought to a single machine before the join can be processed.
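To make the normalized layout from this thread concrete, here is a minimal sketch using Python's built-in `sqlite3`. The table and column names (`users`, `videos`, `user_id`, `video_url`) and the sample rows are assumptions for illustration, not what the video or YouTube actually uses.

```python
import sqlite3

def build_schema(conn):
    # One row per video; the owner is referenced by a foreign key
    # instead of storing an array of video IDs on the user row.
    conn.executescript("""
        CREATE TABLE users (
            user_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL
        );
        CREATE TABLE videos (
            video_id  INTEGER PRIMARY KEY,
            video_url TEXT NOT NULL,
            user_id   INTEGER NOT NULL REFERENCES users(user_id)
        );
        CREATE INDEX idx_videos_user ON videos(user_id);
    """)

def videos_for_user(conn, user_id):
    # "All videos of a user" is a single indexed lookup on the videos table.
    rows = conn.execute(
        "SELECT video_id, video_url FROM videos WHERE user_id = ? ORDER BY video_id",
        (user_id,),
    )
    return rows.fetchall()

conn = sqlite3.connect(":memory:")
build_schema(conn)
conn.execute("INSERT INTO users VALUES (1, 'alice')")  # hypothetical sample data
conn.executemany(
    "INSERT INTO videos VALUES (?, ?, ?)",
    [(10, "s3://bucket/10", 1), (11, "s3://bucket/11", 1)],
)
print(videos_for_user(conn, 1))
```

The reply's point still stands: once `users` and `videos` live on different shards, anything that needs both tables becomes a cross-shard join, which is where denormalizing starts to look attractive.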
Great informative video! Just to add to the TCP/UDP part: YouTube and other social media platforms that offer live streaming use the Dynamic Adaptive Streaming over HTTP (DASH) protocol, which runs over TCP. It gives adaptive bitrate streaming (different resolutions) and ships video metadata alongside the chunks, and most importantly, because it uses TCP, delivery of every chunk is guaranteed. WebRTC, on the other hand, carries media mostly over UDP (with TCP as a fallback), which can result in poor quality at times due to lost packets. WebRTC is better suited to real-time communication like video calling, peer-to-peer communication, etc.
From this awesome system design discussion I could clearly see that these two folks have very different thought processes. Keerti likes to discuss the high-level system design, while the dude likes to get into the nitty-gritty of the problem and visualize how it will actually look physically.
When you say that the splitter service will split a video using a queuing service like RabbitMQ, what do you really mean? Just saying that splitting will happen using RabbitMQ doesn't make sense. RabbitMQ is a message queuing service, i.e. something similar to AWS SQS, where you send messages from one service to another. The input to such a splitting service can't be an entire video. Rather, the video needs to be uploaded to a storage service, for instance an S3 bucket, and the path to the video must be passed to the splitting service in a RabbitMQ message, after which the splitting service starts to chunk the video and do whatever other processing it needs.
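A rough sketch of that flow, with the standard library's `queue.Queue` standing in for RabbitMQ (no real broker or S3 client is used here). The message fields, the bucket and key names, and the 10-second chunk length are all assumptions made up for the example.

```python
import json
import queue

CHUNK_SECONDS = 10  # hypothetical chunk length

def publish_upload_event(q, bucket, key, duration_s):
    # Uploader side: the video itself is already in object storage;
    # only a small JSON pointer goes onto the queue.
    q.put(json.dumps({"bucket": bucket, "key": key, "duration_s": duration_s}))

def plan_chunks(q):
    # Splitter side: read one message, resolve the object path,
    # and plan the (start, end) second ranges it would cut.
    msg = json.loads(q.get())
    ranges = []
    start = 0
    while start < msg["duration_s"]:
        end = min(start + CHUNK_SECONDS, msg["duration_s"])
        ranges.append((start, end))
        start = end
    return f"s3://{msg['bucket']}/{msg['key']}", ranges

q = queue.Queue()  # stand-in for a RabbitMQ queue
publish_upload_event(q, "raw-videos", "abc123.mp4", 25)
source, ranges = plan_chunks(q)
print(source, ranges)
```

The queue only ever carries a few bytes of metadata, so its throughput does not depend on video size.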
That tcp and udp discussion was superb..
Thanks keerti di for this wonderful video ❤❤
YouTube uses HTTP/3 over QUIC, and QUIC uses UDP. Also, for video uploads you would typically use a multipart upload to a pre-signed URL provided by the blob store where the file actually gets stored. Routing the file data through internal microservices to eventually reach the blob store would eat up network bandwidth and would be catastrophic for a system like YouTube.
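To picture the multipart side of this, here is a small sketch of how a client might plan the parts before uploading each one to its own pre-signed URL. It does not call any blob-store SDK. The 8 MiB part size and the dictionary shape are assumptions (S3, for instance, requires at least 5 MiB for every part except the last).

```python
PART_SIZE = 8 * 1024 * 1024  # assumed part size: 8 MiB

def plan_parts(file_size, part_size=PART_SIZE):
    # Split a file of file_size bytes into consecutive byte ranges.
    # Each range would be uploaded independently (and retried independently).
    parts = []
    offset = 0
    number = 1
    while offset < file_size:
        length = min(part_size, file_size - offset)
        parts.append({"part_number": number, "offset": offset, "length": length})
        offset += length
        number += 1
    return parts

for part in plan_parts(20 * 1024 * 1024):
    print(part)
```

Because each part goes straight from the client to the blob store, the application servers only hand out URLs and record completion.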
I believe the user ID should not be sent by the client, as it creates a security gap. Using an API key or cookie to look up the user ID in the backend would be the right approach.
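One way to read this suggestion: the client sends an opaque signed token, and the backend derives the user ID from it instead of trusting a `userId` field in the request. Below is a minimal sketch with the standard library's `hmac`. The secret, the token format and the function names are hypothetical, and a real system would more likely use an existing session or JWT library.

```python
import hashlib
import hmac

SECRET = b"demo-secret"  # hypothetical server-side secret, never sent to clients

def issue_token(user_id):
    # Called at login: bind the user ID to a signature only the server can produce.
    sig = hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()
    return f"{user_id}.{sig}"

def user_id_from_token(token):
    # Called on every request: recompute the signature and compare in constant time.
    user_id, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()
    if not user_id or not hmac.compare_digest(sig, expected):
        return None
    return user_id

token = issue_token("user-42")
print(user_id_from_token(token))
```

A client that swaps in someone else's ID cannot produce a matching signature, so the lookup returns `None`.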
Thanks Keerti and Harkirat. I liked the format of the video. It was more like a discussion rather than an interview. This led to a very free flowing conversation. Looking forward to more! :)
I am posting this comment now and I'm going to recheck it after 3 years. Let's see what position I am working in at that time and at what salary. Currently I am a student doing an MCA at NITK....
27:56 The server sends the next *chunk* at a higher bitrate, not the next *packet*. It has nothing to do with packet management.
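A toy sketch of chunk-level bitrate switching: before requesting each chunk, the player picks the highest rendition that fits the throughput it just measured. The bitrate ladder and the 0.8 safety factor are invented for illustration, and real players use more elaborate heuristics.

```python
# Hypothetical bitrate ladder (kbps), as it might be listed in a manifest.
LADDER_KBPS = {"240p": 400, "480p": 1000, "720p": 2500, "1080p": 5000}

def pick_rendition(measured_kbps, ladder=LADDER_KBPS, safety=0.8):
    # Spend only a fraction of the measured throughput, to leave headroom.
    budget = measured_kbps * safety
    best = min(ladder, key=ladder.get)  # fall back to the lowest rendition
    for name, kbps in sorted(ladder.items(), key=lambda kv: kv[1]):
        if kbps <= budget:
            best = name
    return best

def plan_playback(throughput_samples_kbps):
    # One decision per chunk, not per packet.
    return [pick_rendition(sample) for sample in throughput_samples_kbps]

print(plan_playback([6500, 3000, 100]))
```

The decision granularity is the chunk, which is exactly the distinction this comment is making.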
Designing YouTube is a very large example. Designing such a system in one go can only produce a very vague solution, which might not be that helpful. I would love to see design videos for the individual components of YouTube (for example, the uploader and splitter services involve a lot of details and tradeoffs) instead of the whole of YouTube, going into a bit more detail. There are a lot of creators who make such vague system design videos, but I would highly appreciate it if an Indian creator made system design videos that are actually practical and go into much more depth. This would greatly benefit the tech community. BTW, love the content of this channel. Hope to see such good content in future too. 😊
Hey, you see so many educators teaching like this because this is what is expected in interviews and people are scared of. But it’s a good feedback and an interesting challenge, I assure you that I will come up with such videos 😇
I love your attitude of accepting such challenges ❤❤
@@KeertiPurswani bro youtube has 2.3 billion user per month and not per day , also they only have 122 million daily active users.
The way you explain YouTube system design is truly commendable. We also believe in making these concepts accessible, by making a video on the system design of Microsoft Teams.
Upload to view ratio might be a lot lower than 1:100, I guess.
About TCP vs UDP, apart from consistency and quality, TCP allows the user to seek back. WebRTC is also an expensive choice btw, and is used for real-time use cases only.
Overall feedback:
1. The information about ABR and the manifest file was interesting.
2. Designing YouTube is a huge topic, but nice that you covered the basics well. It will be amazing to see more granular and detailed videos for each sub-topic.
3. Didn't feel like an interview, but was a good discussion.
4. Good video overall. Thanks. 👍
These discussions are refreshing because there are two different ways of thinking about the same system: one is a coder's and the other is a system designer's.
As a coder I can say she's amazing at explaining it theoretically, but with a practical coder in the conversation it becomes even clearer how to understand and design a better system altogether ❤❤
20:20 This is how you should think, even when the others already know it: just by using logical thinking.
very nice walkthrough of the system design, kudos harkirat for opening up the network tab
A little correction: 2.7 billion is the monthly active user count, not daily. Daily active users are about 127 million. Great work.
22:05 When does it check for plagiarism?
There are about 120 million daily active users on youtube. And 2.7 billion monthly active users. I think she got confused between daily and monthly.. since 2.5 billion per day would just be insane
@13:44 Why don't we encode the video first and then divide it into chunks? Why would I want the splitter service to handle the raw video data?
@@codedesigner8839 Encoding is a heavy process. Let's say you have a 20-minute video and your encoder fails at the 18th minute: you would have to retry encoding the whole video.
How could encoding fail? You might be using an AWS EC2 Spot Fleet, whose instances can die at any time.
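The argument in this reply can be sketched as a retry loop that works per chunk: when one chunk's encode fails, only that chunk is redone. The `encode` callable, the chunk names and the flaky encoder below are all made up to simulate a worker dying once, and no real encoder is invoked.

```python
def encode_in_chunks(chunks, encode, max_attempts=3):
    # Encode each chunk independently; a failure costs one chunk, not the whole video.
    encoded = []
    attempts = 0
    for chunk in chunks:
        for attempt in range(1, max_attempts + 1):
            attempts += 1
            try:
                encoded.append(encode(chunk))
                break
            except RuntimeError:
                if attempt == max_attempts:
                    raise  # give up on this chunk after max_attempts tries
    return encoded, attempts

def make_flaky_encoder(fail_once_on):
    # Hypothetical encoder that fails exactly once on one chunk,
    # e.g. because a spot instance was reclaimed mid-job.
    failed = set()
    def encode(chunk):
        if chunk == fail_once_on and chunk not in failed:
            failed.add(chunk)
            raise RuntimeError("worker lost")
        return f"{chunk}.h264"
    return encode

chunks = [f"chunk-{i:02d}" for i in range(20)]  # e.g. 20 one-minute chunks
encoded, attempts = encode_in_chunks(chunks, make_flaky_encoder("chunk-18"))
print(len(encoded), attempts)
```

With 20 chunks and one failure, the job costs 21 chunk-encodes instead of re-running all 20 minutes.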
Don't mess with Harkirat, he knows everything
Yessss
Which note taking app is this ?
All wanted to help students😂
bro joined goldman and literally became a gold man 😮😂
Nice to see you back with the interview video. I love your interview video and it's very helpful, keep shining.
Thank you so much! Means a lot 😇
@KeertiPurswani
Just to clarify, 2.1 billion are monthly active users, not daily. Daily it's approximately 122 million.
Sorry if I'm wrong, I'm quite new to system design, but I'm curious about this aspect: storing video URLs in the user data table. Wouldn't this practice potentially violate SOLID principles? 18:00
Again, there is one more flaw. Chunking should be done on the client itself. The entire use of chunking is eliminated if we need to do it at the service
Thank You Mam 👏👏👏 ,
I'm really learning more about system design. Please keep it up, Mam, and do more videos like this 🙏.
Ok who is the interviewer here 😂
Hari so smoothly reversed the role
I have my system design interview today and I was tense, as I can build real systems but fumble in these interviews because I relate everything to the actual implementation, just like Harkirat here... But this relieved me, as I'm not the only one.
CAP = in the event of a network partition you can have either C or A.
It's 2.5 billion / month.
yt has 122 million users daily
As much as I enjoy your videos and content, and appreciate that you take the time to do all this great work sharing your knowledge and experience, it makes me wonder whether I overvalue your work. As an ex-flight engineer who understands nearly everything you discuss, I think you need to up your game. Much more depth is expected from you. Sorry for turning up the heat, but if it improves the overall experience of coming to your channel and viewing your content, then it is worth sharing an honest thought. And yes, were it your dad and you, I would have been totally happy. But two icons talking so shallowly... I demand more.
It's a very high-level design. If you can directly stream your data to S3, then why do we need the splitter service? Also, your encoding service is going to create the HLS chunks anyway.
For beginners who come from different backgrounds, can you please make a roadmap on how to get into an SDE role?
who is interviewing who here? I could not say.
The first study video I have enjoyed this much. I think this video is going to matter in my life, because now I have developed an interest in system design.
Hi, I think it's the wrong choice to store videoId in the user table. Instead we should store the user ID for each video in the video table, so that whenever we query the video table with a user ID we get all the videos belonging to that user.
MongoDB is not meant for write-heavy workloads. Why Cassandra and why not MongoDB?
How are chunks handled, how does the actual storage work?
I wish the real interviews are as interactive as this. But no, this will not happen because in real life the interviewers think they designed all of the greatest applications in the world and have a ton of ego driving them.
Who is interviewer and who is interviewee
"I agree with you I know about it" LMAO that was funny
He is Damn honest dev god🔥
@KeerthiPurswani another great discussion on designing YouTube with @harkirat1. I have a suggestion and a question.
Suggestion - On the DB design, how about having a channels table with a unique channelID that is referenced in the users table as a foreign key? The channels table would hold the S3 bucket ID containing all the video uploads of a user.
Question - Will there be multiple manifest files corresponding to the different bitrates, or will one file contain all the chunks of the different bitrates? How does the switch between bitrates happen depending on the bandwidth?
Why not refer to the user in the channels table with the user ID?
Genuine question*
Hey Keerti, loved the video. But please go more in-depth into why and how, and the tradeoffs.
Start simple, go in-depth, then expand the functionalities. Thanks a ton. Get to learn a lot.
Very useful discussion and excellent piece of content❤👌. Sharing it with my team for learning. Thank you for your contributions to the community.. 😊
Around 20:00 I had the same question: why would we store the image name when we can store the ID instead? Harkirat pointed it out immediately, he thinks like me 😂
TCP/UDP discussion was great.
We shouldn't be sending userId in any post or get request. This will be a flaw in security.
I easily watch 300-400 Shorts a day. On weekends it is double or triple that.
Nice content, really looking forward to your upcoming component-wise YouTube videos, which will explain the smaller components in more detail.
Keep up the good work!
Pushing the algorithm ❤
The discussion in this video is pure gold 🌟
Thank you for the system design video. What is the name of the tool that you are using as a whiteboard for designing and writing the requirements?
One more question, sorry: I take it the upload service is classed as a microservice and not a monolith, as the upload is just one service we are demonstrating here. There will be other services as well (I don't know which, but I have a feeling there are).
Yes yes! If the entire logic of upload, watch and other things were in one service then it would have been monolithic. These are microservices 😇
@@KeertiPurswani Hi Keerti, so will they have separate load balancers?
Or do different services share a common load balancer in a microservices architecture?
Great Discussion, Subscribed🤟
How do we handle uploads of corrupt or malicious files?
Humble K meets a bit of a know-it-all H. I am subscribed to both for the knowledge, but I would say K is next level in terms of humility. Hope H learnt a thing or two. Nice video in terms of content.
H earns in crores sitting in India from the US. Pride will be there.
Explain protocols also
Why does no one talk about protocols like HLS, DASH, etc.? Hearing about them for the first time 😢
Can anyone elaborate a bit on what the S3 or object storage is that they mentioned for storing video content while discussing the database? Is it Amazon S3? Sorry, I am kind of a newbie.
Amazon S3 is a blob (object) storage service where we store images, videos and other data. It returns a URL which can be used to fetch the stored data.
Mam, the upload service gets the whole video from the user before sending it to the splitter, right? Why do we need the splitter? The only preprocessing required is encoding the video, right? What other preprocessing requires the video to be split into chunks?
Also, about the CDN: does every CDN server distributed across the world have all the database content cached on it? How does security work in the CDN? Does the client communicate directly with the CDN? But the auth occurs on the server, right?
Can we use it and extend it for e-learning system design ?
so the interviewer always writes on screen? I don't have a digital pen
When did I say this is mock interview
Great discussion !! Learned a lot.
What is the best way to differentiate a functional requirement from an NFR? Is there a rule of thumb? How does one make the split and say this is an FR versus an NFR?
Generally, if you notice, NFRs are the qualities (scalability, security, reliability, etc.) that the system should have, not the actual functions.
For FRs, think of the functions that the system has to support. For NFRs, think of the quality attributes or behaviour. Latency, consistency and so on describe how the system behaves, whereas upload and watch are its functions. Hope that makes sense?
@@prasannaagnihotri430 Got it , so Security comes under NFR --like designing Access Control and defence in depth solutions are NFRs..
Very nicely explained
Awesome video❤
bro was not allowed to speak in his interview :/
How do we figure out the number of chunks a video needs to be broken down into? And the size of each chunk?
I think the chunks have a predefined size limit, like how MongoDB breaks a blob into chunks of 255 kB by default when using GridFS... that's my assumption for YT.
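If the chunk size is fixed, the chunk count is just a ceiling division. A quick sketch, using the 255 kB GridFS default as the example size. Whether YouTube uses anything like this is purely an assumption, since streaming segments are usually cut by duration rather than by bytes.

```python
def chunk_count(total, chunk):
    # Ceiling division with integers: any remainder needs one extra chunk.
    return -(-total // chunk)

def last_chunk_size(total, chunk):
    # Size of the final, possibly shorter, chunk.
    remainder = total % chunk
    return remainder if remainder else chunk

GRIDFS_CHUNK = 255 * 1024   # GridFS default chunk size in bytes
ONE_GIB = 1024 ** 3

print(chunk_count(ONE_GIB, GRIDFS_CHUNK), last_chunk_size(ONE_GIB, GRIDFS_CHUNK))

# The same arithmetic works for time-based segments:
# a 20-minute video cut into 4-second segments (segment length is hypothetical).
print(chunk_count(20 * 60, 4))
```

The same helper covers both cases; only the unit (bytes vs. seconds) changes.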
Is this an interview or a masterclass for the interviewee?? Whatever it is, loved the process..
makes sense ...❤
What a lovey way to explain
Commenting so I get this more of these on my feed.
For Likes/Dislike, shares Graph DB would have been better choice
Great going. keep up the good work
Why are we storing video packets in the database? Why not S3?
I don't think video packets are being stored in the DB. Their locations (URLs) might be, but all the chunks would be stored in S3.
Just finished watching. Great video!! 👍
Thanks! 😇
this is amazing!
Thank you di! Would this also be important for people who want to work as ML engineers?
Hari kirat🤣🤣
beautifully done
This was awesome video.
Can you make system design for book my show and how to prevent double booking?
It really help us.
Thanks !!
Hi Keerthi maam. is your HLD course taught in english or Hindi?
Hey, it’s taught in english. All details mentioned on the site. Do check it out! 😇
What an idea !
These guys are masters of their respective fields. Anyone thinking of getting to their level while coming from a low-tier college, a non-tech background or different work experience, without many skills, needs at least 10+ years of experience, and even then it will be difficult. These guys are really smart, high-IQ, hard-working folks of our country.
Interesting content!!!
Glad you like it! 😇
Before 2018, system design was not a thing… what?
I also didn't get how you classed likes/dislikes and comments as metadata (is that because it is data about data, meaning data related to the YouTube video?).
Yup, that was my thought process - data about data
All those estimations went in vain; they are nowhere justified in the design.
Very informative
First comment! Great content!
Thank you! 😇
24:10
Will the video be a multipart file? What format do we use to store it in S3?
Did you watch the video? 🫢
@@KeertiPurswani yupp