Thank you all for being here! More system design interview videos to come! Make sure to sub for those videos coming in the next few weeks
I find it amazing how this topic is most often approached as an "interview question", without realising that eventually you'll HAVE TO implement that system (or a part of it, in most cases). I find it much more interesting and useful to treat these as questions in their own right. A question like "How do you scale to 1M users?" on a real system will make you scratch your head for quite a while.
I feel like this section needs to be a bigger part of the interview process. Thanks for the deep dive. Algos aren't always more important than understanding the system.
Yeah, I completely agree. I think system design skills will become more and more important as systems get more complex and more abstract. Knowing how to code some algorithm matters less than being able to put together a complex distributed system.
Thanks, currently two hours away from the first system design interview I've had to do in ten years and this is a great overview :)
Update! I failed
@DavidXNewton hey!! What happened? What scenario did you get?
Another question: when choosing a database, should we consider transaction isolation or QPS first, and relational vs. non-relational later?
I ask because the Discord table looks very relational, but then they went with all the NoSQL stuff.
And the difference between how Figma and Discord approached their database constraints makes me rethink how to choose a database.
Sometimes I feel like it comes down to choosing a tech you're comfortable with and then working around the issues. These issues/tradeoffs will always be there, whatever you choose.
I mean, if one sets up two MongoDB instances in a cluster, the immediate problem of ID generation comes into the picture, where you now have to generate the ObjectId.
But I am not sure if that thought process is valid.
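A minimal sketch of the ID-generation point above, assuming an ObjectId-style layout (a 4-byte timestamp, a 5-byte random value picked once per process, and a 3-byte counter). With that layout each node can mint IDs locally without asking the other node. The class name `LocalIdGenerator` is hypothetical, and this is an illustration rather than the MongoDB driver's actual code.

```python
import os
import threading
import time


class LocalIdGenerator:
    """Hypothetical ObjectId-style generator: 12 bytes, no cross-node coordination."""

    def __init__(self):
        # 5 random bytes chosen once per process tell generators apart.
        self._process_random = os.urandom(5)
        # 3-byte counter that starts at a random value and wraps around.
        self._counter = int.from_bytes(os.urandom(3), "big")
        self._lock = threading.Lock()

    def next_id(self) -> str:
        with self._lock:
            self._counter = (self._counter + 1) % (1 << 24)
            counter = self._counter
        # Seconds since the epoch, truncated to 4 bytes.
        timestamp = int(time.time()) & 0xFFFFFFFF
        raw = (
            timestamp.to_bytes(4, "big")
            + self._process_random
            + counter.to_bytes(3, "big")
        )
        return raw.hex()  # 24 hex characters


# Two independent "nodes" generating IDs without talking to each other.
node_a = LocalIdGenerator()
node_b = LocalIdGenerator()
ids = [node_a.next_id() for _ in range(1000)] + [node_b.next_id() for _ in range(1000)]
```

Uniqueness across nodes here rests on the per-process random bytes differing, so it is probabilistic rather than guaranteed. That is one of the tradeoffs the comment is pointing at.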
watching this 30m before my interview, feeling very confident now and fully ready to bomb it
Finally a good channel for SD
Finally a decent system design channel! Btw you still write like a medical pro :D
Nice clean video! I'd like to add that horizontal scaling does have some disadvantages, such as data consistency challenges and the logic required for partitioning. However, the value often outweighs the overhead!
Very nicely explained @John - Thanks for the deep analysis on Scaling
It's a really good one, John. Greatly appreciate it.
Great topic and presentation, thank you! I'm looking forward to the next episodes in the series. By the way, I love the chill background music - many people overdo it but yours is just right! :)
Ha thanks! A low-pass on the music helps a lot so it's not too loud. More to come soon!!
Hi John. Amazing video. Are there any books on system design that you recommend reading?
I enjoyed your video. Will you continue the series?
Man, I'm already becoming a fanboy of your channel 😂
Keep up with that great content!
One of us! One of us!!
Thanks for being here :D So happy you're enjoying it
Millions of concurrent users: shouldn't we question the feasibility of such a system? I mean, maybe the internet's root DNS could see that, during some DDoS attack or something!
I generally tend to use Google's RPS as a benchmark, at 100k RPS.
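A back-of-the-envelope sketch for the feasibility question above. Concurrent users and requests per second are different quantities, and the conversion depends on how often each user actually sends a request. The 10-second interval and the 2,000 RPS per server below are assumed numbers for illustration, not measurements.

```python
import math


def requests_per_second(concurrent_users: int, seconds_between_requests: float) -> float:
    """Average RPS if every connected user sends one request per interval."""
    return concurrent_users / seconds_between_requests


def servers_needed(total_rps: float, rps_per_server: float) -> int:
    """Servers required to absorb the load, rounded up, with no headroom."""
    return math.ceil(total_rps / rps_per_server)


# Assumption: 1M connected users, each sending one request every 10 seconds.
rps = requests_per_second(1_000_000, 10)   # 100000.0
# Assumption: one app server handles about 2,000 RPS.
servers = servers_needed(rps, 2_000)       # 50
```

Under these assumptions, 1M concurrent users lands right at the 100k RPS figure mentioned above. A chattier client (one request per second) would be ten times that.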
Thanks for the video! Very concise and explained with nice diagrams.
No, teacher, that woman probably never even thinks to herself, "I got an opportunity like this only because I came here to America." That is how ungrateful this lot is.
Great vid, thanks! Wish there was a little more detail, like answering a "how would you design [x]?" question.
Great video. You have a very cool way of talking, understood every bit of this video.
Ahhh thanks so much and thanks for watching!!
John, I have a question about multiple data centers. When you say multiple data centers, I assume each one has its own database instances and app instances. How will data be synced between two data centers if one of them goes down?
Good question - you would need eventual consistency across the instances in all of your datacenters. So you'd have to build some kind of system that updates the database instances outside of a given datacenter. There are lots of tradeoffs to different approaches, but in theory you'd want eventual consistency across all instances as quickly as possible to prevent data loss.
@JohnCodes Wouldn't the real answer be something like load balancing and replication? The only real difference from a VM instance is the distance/routes. Data consistency is a given: you can't have data out of sync across the systems that are available. The unavailable systems should be synced as soon as they come back online, and brought back into production once they are back in sync, usually not before.
Hello! May I ask what "stateless" means? Also, I didn't quite understand the advantage of having an independent DB for user sessions. Thanks for your videos!
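The rejoin rule described above (catch up first, go back into production afterwards) can be sketched with an append-only write log. This is a toy in-memory model; the names `Primary`, `Replica`, and `try_rejoin` are hypothetical, and real databases handle this with their own replication protocols.

```python
class Primary:
    """Toy primary: current data plus an append-only log of writes."""

    def __init__(self):
        self.log = []
        self.data = {}

    def write(self, key, value):
        self.log.append((key, value))
        self.data[key] = value


class Replica:
    """Toy replica in another datacenter."""

    def __init__(self):
        self.data = {}
        self.applied = 0          # number of log entries applied so far
        self.in_rotation = False  # whether the load balancer may route to it

    def go_offline(self):
        self.in_rotation = False

    def catch_up(self, primary):
        # Replay only the log entries this replica has not seen yet.
        for key, value in primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(primary.log)

    def try_rejoin(self, primary):
        # Only a fully caught-up replica is put back into production.
        self.in_rotation = self.applied == len(primary.log)
        return self.in_rotation


primary = Primary()
replica = Replica()
primary.write("user:1", "alice")
replica.catch_up(primary)
replica.try_rejoin(primary)

# The replica's datacenter goes down while writes continue on the primary.
replica.go_offline()
primary.write("user:1", "alice2")
primary.write("user:2", "bob")

stale_rejoin = replica.try_rejoin(primary)   # refused: two writes behind
replica.catch_up(primary)
synced_rejoin = replica.try_rejoin(primary)  # accepted: back in sync
```

The sketch assumes a single writer. With writes accepted in both datacenters you also need conflict resolution, which is where the eventual-consistency tradeoffs from the earlier reply come in.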
This is really amazing.
Great video!
Nice
Hi from Ireland
Helloooo!! Thanks for stopping by :D
Why not just use cloud infrastructure and let it handle a million or a billion users? Wait... the follow-up question would be how to design such a cloud infrastructure!
cloud is not a magic pill dude
@telnet8674 It is, if you want high scalability and availability.
Money, I guess. Also, maybe at some point the cloud's hardware doesn't align with your needs and you need your own custom hardware. But the tradeoff is the pain of maintaining it. Maybe some sort of hybrid setup.
So many ads
👍👍
Don't add music