Take a break from daily routine work. And watch #GauravSen's design videos... You'll get both the idea and chull (read Motivation ) to work on your own projects. Kudos !! Great work Gaurav 🙌👏😊
About sending notifications to millions of users: if we are talking about pub/sub notifications, just send the notification to a topic that all users are subscribed to.
Regarding the hybrid approach: in practice User1 follows both an ordinary user and a celebrity. When the ordinary user posts, the post is pushed to User1, but when the celebrity posts, the system/client has to pull. How does the client know when it has to pull? @gaurav sen sir, can you please explain, or correct me if I misunderstood something...
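One possible answer to the question above, sketched under assumptions rather than taken from the video: the client does not need to know anything, because the server can merge the pushed (precomputed) feed with celebrity posts pulled at read time. The names `CELEBRITY_THRESHOLD`, `publish` and `read_feed`, the follower data, and the in-memory dictionaries are all hypothetical.

```python
# Hypothetical hybrid feed: push for ordinary authors, pull for celebrities.
CELEBRITY_THRESHOLD = 3  # assumed cutoff, kept tiny for illustration

followers = {"alice": {"u1"}, "star": {"u1", "u2", "u3", "u4"}}
following = {"u1": {"alice", "star"}}

precomputed_feed = {}  # user -> list of (timestamp, author, text), filled on push
celebrity_posts = {}   # celebrity -> list of (timestamp, author, text)


def is_celebrity(author):
    return len(followers.get(author, ())) >= CELEBRITY_THRESHOLD


def publish(author, timestamp, text):
    post = (timestamp, author, text)
    if is_celebrity(author):
        # Too many followers to fan out: store once, readers pull later.
        celebrity_posts.setdefault(author, []).append(post)
    else:
        # Ordinary author: push into every follower's precomputed feed.
        for follower in followers.get(author, ()):
            precomputed_feed.setdefault(follower, []).append(post)


def read_feed(user):
    # Start from the pushed posts, then pull from followed celebrities.
    posts = list(precomputed_feed.get(user, []))
    for author in following.get(user, ()):
        if is_celebrity(author):
            posts.extend(celebrity_posts.get(author, []))
    return sorted(posts, reverse=True)  # newest first
```

In this sketch the pull happens whenever the feed is requested, so the "when" is simply "on every feed read"; a real system would likely cache the merged result as well.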
Hi Gaurav, first of all, excellent work on the videos :) I have a doubt about the DB selection. What I understand is that when we need to store information about users we may use MySQL because of the strong relationships etc., but since the content [activity] of the user is kind of unstructured, wouldn't it be better to use NoSQL? By unstructured I mean we may or may not have a caption, may or may not have images, could instead have videos, or comments that can be recursively long. Please correct me if I'm not going in the right direction! Once again, awesome work :)
imo it'll be better to have a combination of NoSQL and RDBMS. For example, tables which need to be regularly updated, such as the no. of likes, should be kept in a NoSQL DB, whereas things like the content of a post, which don't change so frequently, are better stored in an RDBMS.
@Gaurav at 20:55 you said regular polling/HTTP requests (which I assume would use some connection pool, not establish a TCP connection for each poll request) from clients/cell phones asking the server for updates will have bandwidth/battery issues. But long polling and WebSockets also keep a connection open between client and server for real-time data transfer, either always (WebSockets) or for a longer period (long polling), so won't they have battery and bandwidth issues too? So how do we weigh the pros and cons of long polling vs WebSockets for this specific Instagram case?
From the mobile system design perspective, pull model is not suited for reasons like battery consumption, drop in network connectivity but a nice explanation of various possibilities.
Awesomely explained 😊 I guess it's just not that simple as it seems after your explanation. Great work is being done behind the scene. Thanks for all the awesome videos.
Awesome video series on this channel 💚. One request Gaurav- Could you please share insights on how the video&audio based systems are designed,built and the kind of algorithms/libraries that go into transcoding on scale, as they are computationally intensive tasks. Thanks.
Hi Gaurav, Your explanations are very clear and relevant. I love your accent. Most of the guys use fake accents just for videos which irritates me a lot. One request I wanna put here for the system design of Metro system, stack overflow system as mostly I saw them in the tread.
Great Video. For use case user-follower scenario, I see technical solution as Graphical DS problem (considering time). Reason is fetching feed both side will become simple and fast. Looks like an interesting problem to analyze/compare various SD approaches against above use case.
I believe we can store the user feed on the client as well, since we can recompute it in case the app is reinstalled. The other thing is that the server cache will be updated even when the user is not using the app, but if we store the feed on the client we will only be updating it (a hybrid model will be better) when the user is up and running.
@@gkcs I would agree. Some users like me use Instagram in the browser only. :) As far as the app is concerned, we can keep the feed on the mobile, but I see challenges as well: let's say I did not open the app for 4 hours and the whole 20-post feed needs to be updated, then we need to do a big recalculation step. For fast response times, as long as the user is on the app we can cache the feed in the app, and a push will go directly to the mobile without one extra hop in between. But I am still not sure what will be better; maybe somewhere in between server + client would be ideal. I may be completely wrong. Thoughts?
Hi Gaurav Great video it is. Thanks for this. Had a query. How efficient it would be when a celebrity having 50 million followers(or may be more) posts something and we need to add the post in cache for all of the followers?
Hi Gaurav, I love the way you explain things. This video actually sums up all the major components, including the DB structures and high-level architecture. I have one question regarding the design, which is more about low-level design; it would be great if you could create a video on that. Q. If I need to design the data storage in memory, which data structure should we use to store the posts, likes and follower data, such that we can fulfill the given features efficiently? Thanks,
In the celebrity use case, the decision also has to be made on the basis of whether the polling is affecting the user's device battery or not. WorkManager on Android is a great way to achieve that. It optimises resources and decides when to give a CPU chunk to a particular application, only when the device has decent resources to spare. This could surely save a lot of CPU clocks on our server end for sending PUSH.
@gaurav sen Great video......it literally gave me an insight into how to use the theoretical knowledge we have gained as CS engineers in real practical designing
It would be great if you made a video that delved deep into the concepts of load balancer vs. proxy servers (forward and reverse proxy) vs gateways. TIA! :)
18:39 best part, I got this follow-up question in a Walmart interview and I was not able to answer as I didn't mention the trigger to the news feed service on posting :(
Great video! I would add around 7:30 concerns around concurrency when updating the likes table. Let's say there's a celebrity; what happens when thousands of people are hitting like at the same time? Do we lock the item while an update is happening? Do we use optimistic locking and retry on errors? Do we accept the numbers going up and down and trust that overall the number will consistently increase due to the high number of people?
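To make the "optimistic locking and retry" option from the comment above concrete, here is a minimal single-process sketch. It is not from the video: `LikeCounterStore`, `VersionConflict` and `add_like` are hypothetical names, and the in-memory dictionary only stands in for a row carrying a version column. A real database would have to perform the compare-and-set atomically (for example with a conditional update on the version).

```python
class VersionConflict(Exception):
    """Raised when the row changed between our read and our write."""


class LikeCounterStore:
    """In-memory stand-in for rows of the form (count, version)."""

    def __init__(self):
        self._rows = {}  # post_id -> (count, version)

    def read(self, post_id):
        return self._rows.get(post_id, (0, 0))

    def compare_and_set(self, post_id, expected_version, new_count):
        # Write only if nobody else bumped the version since we read it.
        _, version = self.read(post_id)
        if version != expected_version:
            raise VersionConflict(post_id)
        self._rows[post_id] = (new_count, version + 1)


def add_like(store, post_id, max_retries=5):
    """Read, increment, and write back; retry if another writer got in first."""
    for _ in range(max_retries):
        count, version = store.read(post_id)
        try:
            store.compare_and_set(post_id, version, count + 1)
            return count + 1
        except VersionConflict:
            continue  # someone else updated the row; re-read and try again
    raise RuntimeError("gave up after too many conflicts")
```

Under heavy contention on one celebrity post most attempts would conflict and retry, which is one reason the append-only likes table discussed elsewhere in this thread can be attractive.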
I interviewed for PayPal a few weeks ago, and this was the exact systems design question they asked. They said to me, “Do a system design for Instagram”. I smiled, because I had just watched this video a few days prior, and so I knew exactly how to answer. Thank you for this video, you helped me get a job, for real :)
Congratulations!
wow awesome Elli May got job in Paypal 😁
What's there to be proud of when you've seen the answer to an interview question beforehand?
@@brandonzheng1092 so one should not feel proud anyways, cuz he/she had studied that in books b4... lame perception
@@brandonzheng1092 I sense happiness rather than pride, none of that this person said implies proudness, and even if so, why not? I'd be proud.
Damn this kid is good. Better than most of the "veteran" system architects I've worked with.
You must have worked with some really crappy architects if that is indeed the case.
As a backend developer who struggles to do projects due to thinking too much and taking a lot of time to make dumb things... I think I have found one awesome channel for me.
Thanks for your videos man!
😁
Hi Sen, for the database design I think you should go from Logical ERD first then derive Physical Tables from there, it is more natural approach
Well, I wasn't even looking for this. Just got a random recommendation and now I'm watching this with full focus at 3AM.
Never thought this day would come😂
Awesome video.
Probably the first video of the system design I have seen. Being a front end developer, I had a fair idea about the things but the way you explained just wow
I'm not even a CSE student. But watching your videos, Gaurav, actually intrigues a lot and motivates me to actually learn more about programming and design some my own scalable system one day. Thanks Gaurav.
Try the Coding Train channel. You can watch most of his videos and have fun. It's like watching a movie, and he does real coding. For example, watch this video; even if you don't know programming, you will still understand it, and it's so much fun.
Coding the snake game
th-cam.com/video/AaGK-fj-BAM/w-d-xo.html
Explains a lot why there are so many well paid people behind each successful online service, so complex, wow
Been addicted to this channel recently and binge watching it even though I have my exams ongoing xD. Man you are the best. (saying this from my experience of having watched more than 100 "Real" Coding YouTubers) My Systems Design knowledge is growing leaps and bounds by watching these and I plan to implement these good designs just after my tests are over. I have worked on several Applications as a backend developer, and always stressed heavily on scalability, flexibility, ACID properties. But, your channel has taught me a lot more good techniques and design concepts.
Thank you 😁
Seriously man,
It is a great help
Thanks!
Gaurav, first of all, thank you for a fantastic, simple and clear explanation. Second of all, I can imagine the work that went into putting this video together; it must be a humongous task preparing the right content, taping, editing, etc. Great work!
Thank you!
All the important concepts are explained very simply and this is what makes this video amazing.
7:16 I think we do need a "type" on the Activity table. For example, suppose there's a postID "123" and a commentID that is also "123". Since both postID and commentID can be interpreted as activityID on the Activity table, if there is a row on the Activity table with activityID "123", we don't know whether it's for the post or for the comment unless we have a "type" column to distinguish between them.
True in this case; it depends on your system though. If the id is a UUID, then there won't be a case where postId and commentId are equal.
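A small sketch of the two options discussed in this exchange, under assumptions: the `Activity` class, its field names and the `lookup` helper are hypothetical and only illustrate the idea, not the schema from the video.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    activity_id: str  # id of the post or comment this row refers to
    type: str         # "post" or "comment" (hypothetical column)


# Option 1: per-table ids can collide, so the lookup key is (type, id).
activities = {
    ("post", "123"): Activity("123", "post"),
    ("comment", "123"): Activity("123", "comment"),
}


def lookup(activity_type, activity_id):
    return activities[(activity_type, activity_id)]


# Option 2: with UUIDs a clash between a post id and a comment id is
# practically impossible, so the id alone can identify the row.
post_id = str(uuid.uuid4())
comment_id = str(uuid.uuid4())
```

Even with UUIDs, a type column can still be handy for knowing which table to join against, so the two options are not mutually exclusive.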
Wow his content is really at next level
I love this guy and respect his efforts and the amount of hard work he puts in each and every video
Great video man, really appreciate the fact that you've been posting such in-detail conceptual content for free.
hi Bro.. Actually the way you explained the stuff is very simple and clear.. Thanks for your time for making such videos..
Thank you!
Fantastic work it is because of people like you skills of general masses are also rising
Hey Gaurav great videos bro.. Every software engineer should know system designs to build scalable, robust applications.. keep rocking!
Thank you!
@@gkcs by the way, you're a software engineer with at least 3-4 years of experience by now. Do you still practice algorithms? I'm in a dilemma about whether to practice or take it light
@@praveen3123 Never stop learning !
Life is incomplete without a Gkcs design video
Hahaha!
Would love to see a system design of notifications (activity feed) in twitter/IG etc. Aggregate etc.
Thanks for explaining the practical use of all we study in our syllabus..Your videos are superb!
Glad to hear that!
What you call a load balancer is the work of service discovery or a metadata service like zookeeper/consul/etcd, and what you described as a gateway is the work of a load balancer or reverse proxy.
Looks like you haven't used these systems practically (I am not blaming) and are trying to inform others based on (your interpretation of) what you read online.
good point.
Can I ask a question about your comment? I totally agree that what he's describing as a load balancer is actually more like Zookeeper. But I'm confused about the gateway comment. My understanding, which I may be wrong about, is that a gateway will handle authentication and authorization, and then route the incoming HTTP request to one or more services to accomplish the task at hand, depending on the configuration. So yes, it sort of acts as a reverse proxy with the addition of authentication logic and the possibility of making synchronous service calls. But I don't see how the gateway is a load balancer. It doesn't distribute API calls based on load. It distributes them based on function. If you wanted load balancing between a service and the gateway, or between a service and another service, you would still need a load balancer. Is this correct?
Hi Gaurav,
Greetings. I love your work, I am a subscriber and a frequent liker. However, I find an implicit assumption in your system which considers the Instagram mobile app as thin clients. The process of storing the posts in the cache in the server would result in an unscalable system. I believe the posts are cached in the user's app memory(cache and physical storage), considering that these apps have a considerable chunk of internal storage used. An added proof for this would be if you try to open up Instagram in offline mode you can still see past posts and a toast message which says "couldn't refresh feed". I would like to have the cache on the user's system and then an identifier that is stored in a place where you are storing the cache of the post on the server.
(considering the news feed functionality. This can be applied for other uses too. )
Thanks.
This is a very good point. Thanks for posting 😁
@@gkcs can you explain system design for telegram
@@gkcs the posts ARE stored in application cache... but it doesn't invalidate the fact that mobile apps aren't still thin clients. a user can delete an app, or visit from a third party integration (not built by instagram) -- in which case these timeline feeds are still stored in horizontal caches. you wouldn't believe the amount of money instagram/twitter/etc spend on memcache to make this happen.
@@dustindiaz so which way do I have to follow? Do I need to cache posts on the client side?
I'm confused about the like section: whenever the user clicks like, should the client side make a request?
admin wadidaw caching on the client is helpful when revisiting an application. This way a user can be presented with information immediately.
Caching on the server, on the other hand, is necessary for large scale services to deliver things like timelines since raw sql queries based on this system design would cause the system to fall over with just a decent amount of traffic
@Gaurav I think it should be "postId" in place of "activityId" in the "comment table" when you were explaining feature no. 2,
as, let's say, if we want to find all the comments for a particular post, then we will look into the comment table for the column postId.
Correct me if I am wrong
I didn't know the DBMS subject was so exciting....
I'm building a reddit clone, and your way of designing the news feed gave me a lot of ideas, thank you!!
me too! reddit clone from ben awad's tutorial? :D
@@blinkkeebs No, I'm just going over the reddit site and reverse engineering the features, but I've also changed some things. BTW I'm building it in MEAN stack.
@@Eduardo-fk7ft Oh! I'm also building reddit clone but I use React and Next.js for the server-side rendering. The db I use is postgresql. I think I should add some extra features for the project and I found this video :)
@@blinkkeebs Very nice tech stack!!
Yes, the best way to learn anything is to make it your way or change it; that is something that works best for me.
Good luck, and may Google be your best friend!! :P
Great video! That said, in your descriptions of the database schema, you should mention hotspotting as a justification for certain decisions as well. Namely, a very good reason to not add a "likes" column to posts is that it creates a lot of contention on rows in a single table, especially because single posts can get hundreds of thousands of likes. You arrived at the same conclusion - building tables that allow for writes to avoid contention and thus reads to be aggregations (which can then utilize caching) - but I think focusing on the larger problem of hotspotting motivates your design decisions better.
Noob question, please what is hotspotting
@@rujotheone some records are getting queried more than others. the specific instance that contains the record will be much busier than the rest of the system. you're not balancing the load ideally uniformly.
Hi Gaurav,
Your energy is just unmatched!
Audience Request: Please consider doing a video on how would one architect IRCTC Tatkal Booking scenario - with hundreds of thousands of tickets sold in 2 to 3 minutes time duration. Thanks
I'll try to work on this 😁
I'm so glad I found your channel. Keep up the good work! Nice videos:)
Thanks 😁
Amazing stuff - not only informative - but interesting!
Thanks!
From Designing Tinder to Instagram, in a very short time :D
Hahaha, just 6 months 😉
I always wait for your videos!very good content
Thankyou!! :)
Hi Gaurav, thanks for this great post. You look so young, how could you be so knowledgeable?
He is actually 45 years old. He designed a system that removes aging signs from his youtube uploads...
@@vikaspizza 🤣😂🤣😂
Thanks for posting this, the part on how to handle the news feed helped me out a lot. Originally I could only think of the first method, in which the administrative overhead is way too high; precomputing the news feed is an option I didn't even think about. thanks :)!
Glad it helped 😁
05:05 select count(*) from likes where parent_id = 'xyxyx' AND type = 'post' (parent_id being the post's id).
Mentioning this in case anybody gets confused
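The query in the comment above can be tried out with Python's built-in `sqlite3` module. This is only a sketch: the `likes` table layout, the column names `user_id`, `parent_id` and `type`, and the sample rows are assumptions for illustration, not the exact schema from the video.

```python
import sqlite3

# Throwaway in-memory database with an assumed likes table:
# one row per (user, liked item), where type says what parent_id points at.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE likes ("
    " user_id TEXT, parent_id TEXT, type TEXT,"
    " PRIMARY KEY (user_id, parent_id, type))"
)
conn.executemany(
    "INSERT INTO likes VALUES (?, ?, ?)",
    [
        ("u1", "xyxyx", "post"),
        ("u2", "xyxyx", "post"),
        ("u3", "xyxyx", "comment"),  # same id, but a comment, not the post
    ],
)


def count_likes(parent_id, parent_type):
    """Count likes for one post or comment using a parameterized query."""
    row = conn.execute(
        "SELECT COUNT(*) FROM likes WHERE parent_id = ? AND type = ?",
        (parent_id, parent_type),
    ).fetchone()
    return row[0]
```

Without the `type` condition, the like on the comment that happens to share the id would be counted as a like on the post.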
Awesome man, im glad to see your channel, subscribed immediately! Very helpful!
Thank you very much for this. Excellent explanation through such a complicated topic. Really helped me think through a follower service I have been struggling to commit to.
great video! the animation part is awesome. I like all your system design videos.
Thank you 😁
I think we could have Likes in Posts/Comments table. The reason being they don't violate NF rules + it is going to be more efficient in terms of space. And if we think whether it's a characteristic of a post, then I can't think of why not?
Also, since RDs support indexing, you could also include any suggestions on which all keys to index or anything that saves the world.
You mean the "count of likes for this post", or "who has liked which content"?
@@gkcs The count of likes.
@@PrashantMarshal Will you update this record on every like?
Gaurav Sen wouldn’t the strategy for updating the Activity table (be it batched queries or point queries) be valid for the Comments table too?
Hi Gaurav! Very thankful to you for sharing your knowledge with the rest of the world!
I have 3 questions about the GATEWAY: 1) Is it a Micro-Service? If not, what exactly is it (i.e what does it contain)? 2) It seems like a single point of failure, looking at the diagram. 3) If we have multiple instances of a Gateway, then would the Load balancers be needed in between Client and the Gateway Service ?
Love you system design videos. Love from Nepal 👍
hi Gaurav bro, it was amazing .waiting for more such videos
Thanks!
Your system design implementation is going over my head; I think I need to study the basics, and only then will I get what you want to say
Great Explanation. Clear and concise.
😁
Awesome content.. what I am looking for always get from your videos.
Keep it up.
Great video, ty. I'm building an app that does something tangential to this, really helpful for real-world work!
Amazing Video, Thanks Gaurav :)
After watching this video, Instagram would never be the same for me, ever again.
Hey gourav sir😊🙏
Nice overview and well explained.
You r really great person who share our personal experience. 👍
Thanks Sandip!
Lots of learning from the video, and bro, one request: can you make a video on your Uber interview, about the questions asked round-wise and the HR round, which was pretty tough as you mentioned in the video (Got job in Uber)?
Thanks Abhishek!
I won't be mentioning the questions asked, because we aren't allowed to. "Got hired" will turn to "Got fired". 😝
You can go through the content on the channel, it's more extensive than an interview set 😁
Loved this playlist, thank you brother
Nice explanation. Excellent work
at 20:20
Why store the feed in the cache as LRU?
I mean, if a feed has been used, it should be deleted or replaced by a new one, right?
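One possible reading of the LRU cache, offered as an assumption rather than as what the video intends: the cache is keyed by user, and LRU decides whose feed to evict when memory runs out, so feeds of users who have not opened the app recently go first. A feed that was just read stays cached and keeps receiving new posts. The class `FeedCache` and its methods are hypothetical names.

```python
from collections import OrderedDict


class FeedCache:
    """Keeps the feeds of the most recently active users, up to a capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._feeds = OrderedDict()  # user_id -> list of post ids

    def get(self, user_id):
        if user_id not in self._feeds:
            return None  # cache miss: caller would recompute the feed
        self._feeds.move_to_end(user_id)  # mark this user as recently active
        return self._feeds[user_id]

    def put(self, user_id, feed):
        self._feeds[user_id] = feed
        self._feeds.move_to_end(user_id)
        if len(self._feeds) > self.capacity:
            # Evict the feed of the least recently active user.
            self._feeds.popitem(last=False)
```

For example, with a capacity of 2, storing feeds for users a and b, reading a, and then storing a feed for c evicts b, because b was the least recently touched.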
Absolutely loved your explanation Gaurav. Thank you :)
9:45 Why does the Post table not have an activityId? Also, should the comment table have a parentId to document which post/comment the comment belongs to?
Great video!
When code_report says something, you better believe it 😎
What a wonderful channel!!! just subscribed
Actually a lot of considerations and thinking in multiple angles is required while doing a System Design.
Sometimes, it's just like 'hey, where would that service get the data from? would it need any authentication? etc. etc.' Prepare well!!!
Hi Gaurav,
First of all, a big thanks for all the videos that you are making. I am preparing for my interviews and these videos are helping me a lot. I would request you to please make a video explaining designing of a game, maybe a little complex game such as FIFA, which can give us an idea as to how to implement real-time occurrences. Since all the moves in a game need to be processed in real time it would be different from a system like Instagram as they can afford to have a delay of a few seconds but these games can not.
Thanks :)
Thanks Ayush!
I am working on tic tac toe currently. It'll be progressing towards more complicated games soon. 😁
Very good explanation Gaurav. The way Instagram generates feed today has changed drastically with their new graph API which focuses more on relationships. It would be great to see a video on that.
I'll look into it :)
Your English is awesome bro!!
Concise, at the same time; broad and easy to understand.
Thanks 😁
Just a test message to see whether you read it or not.
BTW very good system design playlist.
Hi Gaurav, I really like your videos; they are very clear and to the point. I would like you to share in one of your videos what the Amazon Marketplace design would look like and how it would work.
13:30 Hi Gaurav - thanks for pointing out the need for a load balancer with the snapshot technique stored onto Gateway for network routing when we horizontally scale the server-side. But why is communicating with the load balancer inefficient? Is this to avoid constant network calls ( which are slow ) and to utilize the SS, which can be stored into memory-side on the Gateway application?
The design before 11:19 seems mostly logical and doesn't seem to require too much knowledge on cs infra.
Hi Gaurav, thank you for amazing content. Can you please share your thoughts on why you chose SQL database for all these data instead of NoSQL? Since the volume is high and eventual consistency seems to be ok, can we use NoSQL database for this kind of data?
Thanks
I love your videos :) Thanks for sharing!
Take a break from daily routine work. And watch #GauravSen's design videos... You'll get both the idea and chull (read Motivation ) to work on your own projects.
Kudos !! Great work Gaurav 🙌👏😊
About the sending of notifications to millions of users, if we are talking about pub/sub notifications, just send the notification to a topic where all users are subscribed to.
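A rough sketch of the topic idea, using a hypothetical in-memory `Broker` class (the class, the topic name, and the inboxes are made up for illustration; a real system would use a managed pub/sub service). The publisher makes a single call, and the broker takes care of delivering to each subscriber.

```python
from collections import defaultdict


class Broker:
    """Hypothetical in-memory pub/sub broker, for illustration only."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # One publish call from the sender; the broker delivers to every subscriber.
        for callback in self._subscribers[topic]:
            callback(message)
        return len(self._subscribers[topic])


broker = Broker()
inboxes = {"u1": [], "u2": [], "u3": []}
for user in ("u1", "u2"):
    broker.subscribe("celeb-posts", inboxes[user].append)
delivered = broker.publish("celeb-posts", "new post from celeb")
```

Note that the fan-out work does not disappear here; it moves from the posting service into the broker.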
Regarding the hybrid approach: in practice, User1 follows both an ordinary user and a celebrity. When the ordinary user posts, the post is pushed to User1, but when the celebrity posts, the system/client has to pull. How does the client know when it has to pull? @gaurav sen sir, can you please explain, or correct me if I misunderstood something?
Hi Gaurav, first of all, excellent work on the videos :) I have a doubt about the DB selection. What I understand is that when we need to store information about a user, we may use MySQL because of the strong relationships etc., but since the content [activity] of the user is kind of unstructured, wouldn't it be better to use NoSQL? By unstructured I mean we may or may not have a caption, may or may not have images, could instead have videos, or comments, which in that case can be recursively long. Please correct me if I'm not going in the right direction! Once again, awesome work :)
IMO it would be better to have a combination of NoSQL and RDBMS. For example, tables which need to be regularly updated, such as the number of likes, should be kept in a NoSQL DB, whereas things like the content of a post, which do not change so frequently, are better stored in an RDBMS.
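One possible way to picture it, as a sketch only: the pull does not need a signal from the server if it happens whenever the user opens or refreshes the feed. That timing is an assumption of this sketch, not something stated in the video, and every name below (`CELEBRITY_THRESHOLD`, `publish`, `read_feed`, the user IDs) is hypothetical.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff; a real system would use a far larger number

followers = defaultdict(set)     # author -> set of users following them
following = defaultdict(set)     # user -> set of authors they follow
pushed_feed = defaultdict(list)  # user -> [(timestamp, post)] fanned out on write
own_posts = defaultdict(list)    # author -> [(timestamp, post)]


def follow(user, author):
    followers[author].add(user)
    following[user].add(author)


def is_celebrity(author):
    return len(followers[author]) >= CELEBRITY_THRESHOLD


def publish(author, timestamp, post):
    own_posts[author].append((timestamp, post))
    # Ordinary users: push to every follower's feed. Celebrities: skip the fan-out.
    if not is_celebrity(author):
        for user in followers[author]:
            pushed_feed[user].append((timestamp, post))


def read_feed(user):
    # The pull happens here, when the user opens or refreshes the feed.
    # A set removes duplicates if a post was both pushed and pulled.
    items = set(pushed_feed[user])
    for author in following[user]:
        if is_celebrity(author):
            items.update(own_posts[author])
    return [post for _, post in sorted(items, reverse=True)]


for fan in ("u1", "u2", "u3"):
    follow(fan, "celeb")
follow("u1", "friend")
publish("friend", 1, "friend-post")
publish("celeb", 2, "celeb-post")
feed = read_feed("u1")
```

In this sketch the client never has to be told to pull: the server merges the pushed posts with the celebrity posts each time the feed is requested.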
Much appreciated. Love it ❤🔥
@Gaurav at 20:55
You said regular polling/HTTP requests (I assume these would use some connection pool, not establish a TCP connection for each poll request) from clients/cell phones asking the server for updates will have bandwidth/battery issues.
But even when we go with long polling/WebSockets, we are either keeping the connection always open (WebSockets) or maintaining it for a longer period (long polling) between client and server for real-time data transfer, and this will also have battery and bandwidth issues, right? So how do we weigh the pros and cons of long polling vs. WebSockets for this specific Instagram case?
This was such a good video. Thanks!
From the mobile system design perspective, pull model is not suited for reasons like battery consumption, drop in network connectivity but a nice explanation of various possibilities.
Awesomely explained 😊
I guess it's just not as simple as it seems after your explanation.
Great work is being done behind the scene.
Thanks for all the awesome videos.
Thanks Ashutosh!
Very elaborate, thank you! 🤩
Awesome video series on this channel 💚. One request, Gaurav: could you please share insights on how video and audio based systems are designed and built, and the kind of algorithms/libraries that go into transcoding at scale, as they are computationally intensive tasks? Thanks.
Thank you!
I'll get to designing Netflix/YouTube in a while. That will be fun! 😁
Hi Gaurav, your explanations are very clear and relevant. I love your accent. Most of the guys use fake accents just for videos, which irritates me a lot. One request I want to put here is for the system design of a metro system and of Stack Overflow, as I mostly saw them in the thread.
StackOverflow is interesting, I'll try working on its design video soon :)
Loved the explanation :) Nice work.
Thank you!
Great Video.
For the user-follower use case, I see the technical solution as a graph data structure problem (considering time). The reason is that fetching the feed from both sides becomes simple and fast.
Looks like an interesting problem to analyze/compare various SD approaches against above use case.
You could try...
Very good explanation bro!
I believe we can store the user feed on the client as well, since we can recompute it in case the app is reinstalled. The other thing is that the cache will be updated even when the user is not using the app, but if we store it on the client we will only be updating (the Hybrid Model will be better) when the user is up and running.
You can cache it on the client. But it's better to keep such responsibilities on the server.
@@gkcs I would agree, Some users like me use Instagram on browser only. :)
As far as the app is concerned, we can keep it on the mobile, but I see challenges as well. Let's say I did not open the app for 4 hours and the whole 20-post feed needs to be updated; then we need to do a big recalculation step.
For fast response times, as long as the user is on the app we can cache the feed on the app, and a push will go directly to the mobile without one extra hop in between.
But I am still not sure what will be better; maybe somewhere in between server + client would be ideal. I may be completely wrong.
Thoughts?
Hi Gaurav
Great video it is. Thanks for this.
Had a query. How efficient would it be when a celebrity having 50 million followers (or maybe more) posts something and we need to add the post to the cache for all of the followers?
Hi Gaurav,
I love the way you explain things. This video actually sums up all the major components, including the DB structures and High-level architecture.
I have one question regarding the design, which is more on low-level design, it would be great if you can create a video on that.
Q. If I need to design the data storage in-memory, which data structure should we use to store the posts, likes, and follower data, such that we can fulfill the given features efficiently?
Thanks,
In the celebrity use case, the decision also has to be made on the basis of whether the polling is affecting the user's device battery or not. WorkManager on Android is a great way to achieve that. It optimises the resources and decides when to give a CPU chunk to a particular application, only when the device has decent resources to spare. This could surely save a lot of CPU clocks on our server end for sending PUSH.
Awesome explanation. Your videos are really helping. Keep it up bro :) :)
Thank you sir.... a Great Teacher.
4:40 why not two separate tables: likes_for_images and likes_for_comments? Isn't it much simpler for querying as well?
@gaurav sen
Great video... it literally gave me an insight into how to use the theoretical knowledge we have gained as CS engineers in real practical design.
It would be great if you made a video that delved deep into the concepts of load balancer vs. proxy servers (forward and reverse proxy) vs gateways. TIA! :)
I'll look into it, thanks!
awesome explanation of fan-out scenario
Subscribed... you are just amazing.
18:39 best part. I got this follow-up question in a Walmart interview and was not able to answer, as I didn't mention the trigger to the news feed service on posting :(
Good and nice explanation.
Keep going and keep improving.
Great video! I would add, around 7:30, concerns around concurrency when updating the likes table. Let's say a celebrity posts: what happens when thousands of people are hitting like at the same time? Do we lock the item while an update is happening? Do we use optimistic locking and retry on errors? Do we accept the numbers going up and down and trust that overall the number will consistently increase due to the high number of people?
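To make the optimistic locking option concrete, here is a minimal sketch, assuming a likes row that carries a version column. `LikeCounterStore` is a hypothetical in-memory stand-in for the database; its internal lock only models the atomicity of a single conditional UPDATE and is not a lock held across the read and the write.

```python
import threading


class LikeCounterStore:
    """Hypothetical stand-in for a database row with a version column."""

    def __init__(self):
        self._row = {"likes": 0, "version": 0}
        self._lock = threading.Lock()  # models one atomic conditional UPDATE

    def read(self):
        with self._lock:
            return dict(self._row)

    def update_if_version(self, expected_version, new_likes):
        # In spirit: UPDATE ... SET likes = ?, version = version + 1 WHERE version = ?
        with self._lock:
            if self._row["version"] != expected_version:
                return False  # someone else updated first; the caller must retry
            self._row = {"likes": new_likes, "version": expected_version + 1}
            return True


def like(store, max_retries=1000):
    # Read, compute, attempt a conditional write; retry on conflict.
    for _ in range(max_retries):
        row = store.read()
        if store.update_if_version(row["version"], row["likes"] + 1):
            return True
    return False


store = LikeCounterStore()
results = []  # outcome of every like() call


def worker():
    for _ in range(50):
        results.append(like(store))


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
final = store.read()
```

Every successful like bumps the version exactly once, so no update is lost. Under very heavy contention the retries themselves become the cost, which is where the other options in the question come in.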
Have a look at the rate limiting video for this :)
Fabulous video. Looking forward to more videos. Can you elaborate on how to empirically optimize a ranked feed?
Dude who follows Stephen Hawking on Instagram and why would he even be on Instagram..😂😂😂 JK
P.S. Great Video btw 👍