This is the best Instagram system design video, without any fluff. One correction: CDNs typically cache content based on its popularity and request frequency, not just for celebrities. So a picture posted by an ordinary user that goes viral is also cached by the CDN.
This is by far one of the most comprehensive and concise Instagram system design videos I've ever seen. Well done!
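A minimal sketch of the point above, assuming a simple least-recently-used edge cache (real CDNs use more elaborate policies). `EdgeCache` and the `origin` function are hypothetical names for illustration. The cache never looks at who the author is: a frequently requested item stays cached because it keeps getting hit.

```python
from collections import OrderedDict

class EdgeCache:
    """Toy LRU cache standing in for one CDN edge node (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # key -> cached bytes, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, key, fetch_from_origin):
        if key in self._items:
            # Cache hit: mark the item as most recently used.
            self._items.move_to_end(key)
            self.hits += 1
            return self._items[key]
        # Cache miss: fetch from origin, store, evict the least recently used.
        self.misses += 1
        value = fetch_from_origin(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)
        return value

    def contains(self, key):
        return key in self._items

# Hypothetical origin fetch; a real one would read from blob storage.
origin = lambda key: f"bytes-of-{key}"

cache = EdgeCache(capacity=2)
for key in ["viral", "a", "viral", "b", "viral", "c", "viral"]:
    cache.get(key, origin)
```

In this run the "viral" item is still cached at the end, while the rarely requested items "a" and "b" have been evicted.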
Great to hear!
Great content, Pratiksha, and I loved your structured approach to the overall design. May I point out a small mistake in the storage estimation? 100 million posts per day with an average post size of 10 MB would be 100 * 10^6 * 10 * 10^6 B (taking 1 MB = 10^6 B) = 1000 * 10^12 B = 1000 TB.
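The arithmetic in the comment above can be checked with a few lines, assuming decimal units (1 MB = 10^6 B, 1 TB = 10^12 B, 1 PB = 10^15 B). The variable names are illustrative.

```python
POSTS_PER_DAY = 100 * 10**6   # 100 million posts per day
AVG_POST_BYTES = 10 * 10**6   # 10 MB per post, with 1 MB = 10^6 bytes

daily_bytes = POSTS_PER_DAY * AVG_POST_BYTES
daily_tb = daily_bytes / 10**12   # terabytes per day
daily_pb = daily_bytes / 10**15   # petabytes per day

print(f"{daily_bytes:.0e} B/day = {daily_tb:.0f} TB/day = {daily_pb:.0f} PB/day")
```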
Literally you made system design so simple, thank you much
Thank you so much! This made my day :)
The best system design interview I have seen and this gives me confidence for the interviews
Thank you so much for sharing that! Gives me encouragement to do more of these :)
Overall it gives context on how to present a system design. Great work. There are many grey areas, like how fanout pushes updates, but that is understandable for a 10-minute video.
Without any question, great content for a 12-minute video. Thank you!
Finally, a properly structured sys design video with all the pieces in the right order.
One piece of feedback: you went a little fast on a few parts without proper reasoning for the choices you are making.
For example, why the fanout service is needed to update users' feeds and how it will work, or how multiple containers will be able to talk to multiple containers of the upload media service before you put in the message queue.
How will the CDN work only for celebrities? Why not for everyone?
One of the most important questions in these system designs is how to speed up feed generation by pre-generating the feed instead of generating it at runtime and putting the load on the DB. That's the part you should have spent some time on.
Overall, a great structured video. 10x better than the videos already available on YouTube. Subscribed.
My question exactly
Very simple and easy to understand! Looking forward to more design videos. Thanks!
Clear, concise, and structured explanation. Thank you so much.
Glad it was helpful!
Awesome 😮 information mam . It's my first step learning system design.. i think it's a great start ... ❤
Awesome content out of all watched so far..simple and relatable
Glad you liked it!
Thank you for this content hope to see more system design interview questions covered by you
More to come soon!
You're simply the best in these system design tutorials
Thank you, iSaac! Appreciate the feedback! Will upload more videos soon!
Excellent, you made the system design so simple. Thank you so much. Keep posting good content.
Thank you so much :)
Good one yet again. One suggestion: can you please make a dedicated video on fault tolerance? Most videos just present server and database replication as the default fault tolerance strategy; diving deep into how the system recovers and rolls back when a distributed transaction spans microservices would help viewers a lot. Also, the naming of the microservices could be a little more intuitive; if it was kept this way intentionally for simplicity, kindly ignore this. Also, please justify the tradeoff of choosing NoSQL for storing the posts data, as many NoSQL stores offer only limited transactional guarantees.
The best system design videos around. I really like the method of starting small and then dealing with high throughput and availability.
Amazing Video! What's the tool that you're using?
I find your system design videos to be very pragmatic. Can you do a video for technical retrospective as well. Would love to hear how you deep dive into a previous project
That's a great idea! Once I have enough system design videos, I will consider this a next topic. Thanks
Clear😊 thanks
Your videos are very helpful..please continue doing more videos...please post videos on microservices and kubernetes
Thank you so much! More videos to come:)
Very useful thank you
Thank you for the video! Could you please share which libraries you use with Excalidraw for system design?
Hey. Your system design video is really helpful and very simple to understand. The approach and sequence seem great. Just a suggestion: please give more details about using Elasticsearch, Kafka, or any messaging queue, and also Spark or Hadoop if they are needed anywhere in the design.
Thank you Pratiksha for quick and informative contents, please make videos on different category of system design questions
Thank you 😊
Hands down the best Instagram system design video. Would you also be able to do a system design video on trading system or a position keeping system ?
Thanks Pratiksha for always delivering informative contents.
Thank you so much @machinelearning6726 for sharing the feedback!
Very helpful
what libraries have you used
Ah can i ask you which tool you used to draw design during the video, thanks
It's the free version of Excalidraw.
Really helpful!
Glad you think so!
You should have talked about fan out service, how it will pre create user feed
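For readers wondering what pre-creating a feed could look like, here is a minimal sketch of fan-out on write, assuming in-memory dictionaries in place of the relationships store and a feed cache. All names (`follow`, `publish`, `get_feed`, `FEED_LIMIT`) are hypothetical, and this is not necessarily how the video's fanout service works.

```python
from collections import defaultdict, deque

FEED_LIMIT = 3  # hypothetical cap on how many post IDs each feed keeps

followers = defaultdict(set)                           # author -> follower IDs
feeds = defaultdict(lambda: deque(maxlen=FEED_LIMIT))  # user -> newest-first post IDs

def follow(follower, author):
    followers[author].add(follower)

def publish(author, post_id):
    """Fan-out on write: push the new post ID into every follower's feed."""
    for follower in followers[author]:
        feeds[follower].appendleft(post_id)  # the oldest entry drops off when full

def get_feed(user):
    """Reading the feed is a cheap lookup with no joins at request time."""
    return list(feeds[user])

follow("bob", "alice")
follow("carol", "alice")
for post_id in ["p1", "p2", "p3", "p4"]:
    publish("alice", post_id)
```

The trade-off is more work at write time (one insert per follower) in exchange for fast reads, which is why accounts with very large follower counts are often handled differently.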
Very good
What does "create post service simply adds media to the queue" mean? Pushing binary data directly to the queue is a bad idea, so we would need to save the data in temporary storage (disk, maybe) and push a link to its location as a message in the queue. Or what did you mean?
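A minimal sketch of the approach this comment describes, assuming a local temp directory as the staging area and Python's in-process `queue.Queue` standing in for a real message broker. The function names and message fields are hypothetical.

```python
import queue
import tempfile
from pathlib import Path

media_queue = queue.Queue()                           # stand-in for a message broker
staging_dir = Path(tempfile.mkdtemp(prefix="uploads-"))  # stand-in for temp storage

def create_post(post_id, media_bytes):
    """Stage the media on disk and enqueue only a small reference to it."""
    path = staging_dir / f"{post_id}.bin"
    path.write_bytes(media_bytes)
    media_queue.put({"post_id": post_id, "media_path": str(path)})

def process_next_upload():
    """Worker side: read the reference, then load the bytes from staging."""
    message = media_queue.get()
    data = Path(message["media_path"]).read_bytes()
    media_queue.task_done()
    return message["post_id"], len(data)

create_post("p1", b"\x00" * 1024)
result = process_next_upload()
```

The message stays a few dozen bytes no matter how large the upload is, so the queue is not used as blob storage.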
This is excellent and to the point! BTW, which drawing tool do you use/suggest?
Excalidraw
Thanks for making this video. Was the ending abrupt? Is there a part 2 of this ?
I have covered all the content, there is no part 2. Thank you for pointing that out! It’s good feedback, I will do proper closure in next videos!
I found the information you provided to be very helpful and informative, especially considering it only took 12 minutes. Would it be possible for you to share a PDF version of this material? That would be greatly appreciated.
Hello Ronish,
Thank you for sharing the feedback ! I would be happy to create pdf for the new videos I make. For past videos, I will check and see if it’s easy to make pdfs out of what I have.
Your channel is underrated..
What tool do you use for designing systems in your videos
One question: why do we create a separate database for each feature (user, post, interactions, etc.)? I got the idea behind the relationships store (graph DB). But if we maintain separate databases for each type of object, we need to make 3 additional DB calls to fetch the data.
very helpful, thanks
Thank you!
4. We pick NoSQL because of the nested structure of comments, but then you provided a flat schema for the Interactions database. I am missing something.
The problem with graph storage is that solving distributed queries is going to be a big mess.
Why didn't you use NoSQL for posts, since we are OK with eventual consistency and it also scales well?
For storage, shouldn't it be 1000 TB, or 1 PB, per day? 100M * 10 MB
very nice SD videos you are doing :)
Thank you 🙏🏼
Please put more effort into researching and reading about existing social media applications before coming up with a design. Fetching everything via the get feed service, even media files, seems very wrong, as that data can be fetched independently (via CDN or some other read post service) once the postIds are retrieved. The same goes for the interaction DB data, which can also be fetched independently. If all of this were returned by a single API, it would take forever.
Correction: the storage requirement would be 1 petabyte.
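A minimal sketch of the split this comment proposes, assuming in-memory dictionaries in place of the real services. The CDN host, function names, and data are all hypothetical. The feed call returns only post IDs; media URLs and interaction counts are resolved in separate, independent calls.

```python
CDN_BASE = "https://cdn.example.com/media"  # hypothetical CDN host

FEEDS = {"u1": ["p3", "p2", "p1"]}          # stand-in for the feed store
LIKES = {"p3": 12, "p2": 4}                 # stand-in for the interactions store

def get_feed(user_id):
    """Feed service: returns post IDs only, never media bytes."""
    return {"post_ids": FEEDS.get(user_id, [])}

def media_url(post_id):
    """The client loads media straight from the CDN using the post ID."""
    return f"{CDN_BASE}/{post_id}.jpg"

def get_interactions(post_ids):
    """Interaction data is fetched separately and can load after the feed renders."""
    return {pid: {"likes": LIKES.get(pid, 0)} for pid in post_ids}

feed = get_feed("u1")
urls = [media_url(pid) for pid in feed["post_ids"]]
counts = get_interactions(feed["post_ids"])
```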
I don't understand the Interaction Database design part. You are storing the PostID and the UserID along with the comment, and you mention nesting. But how will you support nesting this way?
Also, if you're posting a comment, then the Post DB would get updated, right? So why would you need another interaction DB to begin with... hmmm
I would have made the Post DB NoSQL and added nesting there for comments. You could make the POST API call update the Post DB given the same parameters and the index of the new comment on the post.
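One common way to get nesting out of a flat schema is a parent reference on each comment row. The sketch below assumes a hypothetical `parent_id` column, which is not part of the schema shown in the video, and builds the nested thread at read time.

```python
from collections import defaultdict

# Flat rows as they might sit in an interactions table; parent_id is an assumption.
comments = [
    {"comment_id": 1, "post_id": 10, "user_id": "a", "parent_id": None, "text": "Nice"},
    {"comment_id": 2, "post_id": 10, "user_id": "b", "parent_id": 1, "text": "Agreed"},
    {"comment_id": 3, "post_id": 10, "user_id": "c", "parent_id": 2, "text": "+1"},
    {"comment_id": 4, "post_id": 10, "user_id": "d", "parent_id": None, "text": "Cool"},
]

def build_thread(rows, post_id):
    """Group rows by parent, then recursively attach replies to each comment."""
    children = defaultdict(list)
    for row in rows:
        if row["post_id"] == post_id:
            children[row["parent_id"]].append(row)

    def attach(parent_id):
        return [
            {
                "comment_id": row["comment_id"],
                "text": row["text"],
                "replies": attach(row["comment_id"]),
            }
            for row in children[parent_id]
        ]

    return attach(None)  # top-level comments have no parent

thread = build_thread(comments, post_id=10)
```

Keeping comments in their own rows also avoids rewriting one ever-growing post document on every new comment, which is one argument for a separate interactions store.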
Zerodha grow and any fintech design please
Also, can posts and interaction data reside in a single database? Say we need to retrieve a post and all its interactions: a network call from the post service to the interaction service is really costly. If instead both are stored locally in the same database, retrieval is much more convenient. We can also add a caching layer on top of the posts and interactions data for frequently accessed celebrity posts, which are updated least frequently.
It seems you treat user as a DB and not a table. Why so? Why is every entity treated as a different DB and not a table? If this is microservices, we need to talk about the overhead of the services talking to each other and add a gateway. Am I missing something?
How to learn system design? Do you recommend any book?
Hello @vanvothe4817,
You will find all the resources in "How to Ace a System Design Interview" video. In the beginning of the video, I have shared important concepts that are useful to learn but you can skip over that and directly go to the resources mentioned later in the video.
Hope this helps :) All the best!
where did you get your accent?
ha ha ! I am not sure! I think I pick up accent pretty quickly!
Really well informative and structured video! Also, which tool you are using for high-level design?
Thank you, Divya! Are you asking for the editor I am using? It’s Excalidraw. If you want to know more about the interview preparation tools then watch “crack System design interview” video. Towards the end there are some great resources!
@pratikshabakrola Sure, will definitely watch it. Thanks for the suggestion Pratiksha.
What is the tool that you are using to draw?
I am using Excalidraw. It's a great tool for practicing interviews or any realtime collaborations. It also has tons of built-in libraries of graphics.
If the data in the Post DB is archived (for example, every 6 months, as you mentioned), how can older data (> 6 months) be accessed when a user tries to open an older post?
We can have an on-demand retrieval mechanism that restores archived content when a user tries to access it. The user will experience a slight delay while the image loads.
Ex: Amazon S3 Standard-Infrequent Access (S3 Standard-IA) could be an ideal candidate. Please read more here: aws.amazon.com/s3/storage-classes/
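A minimal sketch of on-demand retrieval, assuming two in-memory dictionaries in place of the hot store and the archive tier. The names and data are hypothetical, and a real archive read would be slower than a dictionary lookup.

```python
hot_store = {"post-new": b"recent image"}        # recent posts, fast storage
archive_store = {"post-old": b"archived image"}  # posts past the archive cutoff

def get_media(post_id):
    """Serve from the hot store when possible; otherwise restore from the archive."""
    if post_id in hot_store:
        return hot_store[post_id], "hot"
    if post_id in archive_store:
        data = archive_store[post_id]
        hot_store[post_id] = data  # restore so the next read is fast
        return data, "archive"
    raise KeyError(post_id)

first = get_media("post-old")   # first access pays the archive delay
second = get_media("post-old")  # later accesses are served from the hot store
```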
Why are all the microservices tightly coupled, talking to each other's databases? It's a very basic design.
Hi, which app/website are you using to create this diagram ?
She has a video named "Ace the system design interview" (or something similar), where she shows at the end that she is using excalidraw. She also shows which shape libraries she is using.
I am using Excalidraw
@jelenamarusic3641
Thank you for helping others with these questions :)
This has a lot of problems:
1- This design is for a monolith; big systems use microservices, and system design for microservices is different.
2- In the database, the image is in another table, and it's a very bad example of system design.
3- The API requests must have pagination; there is no limit on the result size.
And ...
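On the pagination point, here is a minimal sketch of a paginated feed read, assuming a simple offset-style cursor over an in-memory list. `get_feed_page`, `cursor`, and `limit` are hypothetical names; production APIs often use opaque cursors keyed on post ID or timestamp instead.

```python
def get_feed_page(post_ids, cursor=0, limit=2):
    """Return one bounded page plus the cursor for the next page (None at the end)."""
    page = post_ids[cursor:cursor + limit]
    end = cursor + limit
    next_cursor = end if end < len(post_ids) else None
    return {"items": page, "next_cursor": next_cursor}

feed = ["p5", "p4", "p3", "p2", "p1"]

first = get_feed_page(feed)
second = get_feed_page(feed, cursor=first["next_cursor"])
third = get_feed_page(feed, cursor=second["next_cursor"])
```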
Something wrong when calculating storage of posts per day.
are you a human or robot ?
Do I sound like a robot? lol
You've lost one 10 in your calculation, it's actually 1000TB/day
Thank you for the video! Could you please share which libraries you use in Excalidraw for system design?