Jordan knows what he is talking about; I like his confidence and the way he explains the scenario. Thanks, Gaurav, for inviting Jordan.
This was a realistic interview in the sense that the interviewer realizes that the scale they are asking to design for is completely insane (500K drivers, 10M orders a day) but still proceeds as a thought experiment. I had an interview recently for a game platform and the interviewer said I should design for 50M concurrent users. LMAO.
This guy randomly showed up on my feed and I have been following him ever since. His channel is the most no-BS system design walkthrough channel.
Thanks Hardik!
Great video! It's very helpful to see a mock where both the interviewer and the interviewee know what they're talking about.
These are some thoughts about WebRTC from my experience. I've been using and implementing WebRTC for 3 years. WebRTC uses UDP under the hood; for data channels it runs the SCTP protocol on top of UDP (probably to be replaced by QUIC eventually) to create a TCP-like experience. The developer can configure each data channel for either guaranteed delivery (TCP-like) or lossy delivery (UDP-like). WebRTC requires a signaling server to build a root of trust and to exchange connection information for both the user and the driver. WebRTC and WebSocket are stateful, meaning the server needs to keep the socket connection open for the entire session; otherwise the clients must reconnect. That statefulness makes the solution harder to scale and more complicated.
As you guys mentioned, most users likely don't track their food delivery as often as they would an Uber ride. I think WebRTC is overkill for this use case. The WebRTC connection will likely get broken many times during a delivery-tracking session (every time the app is closed), causing connection churn on the server. I personally prefer HTTP polling over HTTP/3 (which is also UDP-based and supports 0-RTT connection establishment). I think it is the simplest solution, with reasonably low load on the server, and it is easier to load balance horizontally.
But at the end of the day, we should look at metrics to decide on the best solution for the problem. Strategically, it's a good idea to start with the simplest solution and quickly gather the important metrics. So I would start with HTTP polling, collect the data, then re-evaluate whether WebRTC is needed, and finally compare the data for HTTP polling and WebRTC. HTTP polling might already be the optimal solution; simpler solutions can do wonders sometimes 😀. There are also ways to reduce state in WebRTC.
I have been watching Jordan's content for quite a while now, and dude, you are just killing it. Thanks, Gaurav, for inviting him!
Thanks Geeky! I appreciate it! 😙
Jordan knows what he's talking about. Every answer sounds calculated and well reasoned. Gaurav + Jordan is a killer combo. Looking forward to more such collabs
This is one of the best videos I have come across for SD. Excellent interview candidate.
I have seen your videos separately, and now together! you both are great
Thank you!
Digging deeper into a single component is such a good idea. Rightly said that when discussing complex systems as a whole, we tend to oversimplify. Enjoyed the video thoroughly. Thanks.
It's amazing watching both of you! Learned a lot! Thanks Gaurav! Def need more of these :)
I appreciate it!
This collab was worth the wait. Nice discussion.
Great content. Jordan is very articulate and able to seamlessly translate theoretical knowledge into concrete solutions.
Quick note: a consistently hashed cluster is great when keys are evenly distributed across the hash space. In this case, if we use the `geo-hash` directly as the key, we get an uneven distribution, since a small number of prefixes will hold most of the data (e.g. geo-hashes representing Manhattan, Seattle, etc.), i.e. a small number of keys carry most of the load.
We could use more dynamic range-based sharding (which involves a separate manager service, etc.) to split key ranges based on load. Alternatively, we can hash the geo-hash itself to get an even spread, but then we lose colocation of nearby grid cells, which could be important as we zoom out to find drivers.
Either works, but the tradeoff (fan-out vs. fan-in) is worth discussing, especially since latency is critical. It also wasn't clear which approach was proposed.
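To make the tradeoff above concrete, here is a minimal sketch, not the design from the video. The shard count, the driver distribution, and the helper names `shard_by_prefix` and `shard_by_hash` are all made up for illustration, and plain modulo hashing stands in for a real consistent-hash ring.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical cluster size, not from the video


def shard_by_prefix(geohash, prefix_len=2):
    """Range-style key: every cell under one prefix lands on the same shard."""
    return geohash[:prefix_len]


def shard_by_hash(geohash, num_shards=NUM_SHARDS):
    """Hash the geohash itself; plain modulo stands in for a consistent-hash ring."""
    digest = hashlib.md5(geohash.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Made-up load: 16 neighbouring cells in one dense area (50 drivers each),
# plus a single driver in each of three quiet cells.
dense_cells = ["dr5ru" + c for c in "0123456789bcdefg"]
drivers = dense_cells * 50 + ["9q8yy0", "c23nb0", "u4pru0"]

by_prefix = Counter(shard_by_prefix(g) for g in drivers)
by_hash = Counter(shard_by_hash(g) for g in drivers)

print("prefix sharding:", dict(by_prefix))  # one hot key holds almost everything
print("hashed sharding:", dict(by_hash))    # load per shard after hashing
print("shards a dense-area query must touch:",
      len({shard_by_hash(g) for g in dense_cells}))
```

With prefix keys the dense area is a single hot key but a nearby-driver query stays on one shard; with hashed keys the load evens out, but the same query has to fan out to several shards.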
Also, Jordan touched on the idea of using pure peer-to-peer communication for tracking (the user and driver communicating directly).
A few reasons why this generally isn't practical:
1> Security & firewalls. Exposed open ports on users' phones are a prime target for DoS attacks. Most routers and network infrastructure will actually block any new incoming external connection by default and require a user override to whitelist or forward a port. On a public router there may be no way at all to receive connection requests.
2> Relaying through a central service gives a more reliable network path in terms of throughput (AWS/GCP have interconnects with major ISPs and dedicated network infrastructure), whereas direct connections have to go through congested public interconnects. This is less of a problem when people are geographically close, so it is insignificant in this case, but it is relevant for WhatsApp, etc.
Understandably, not every aspect can be discussed in detail, and the interviewee isn't even expected to be aware of every aspect.
Hence FAANG (at least F, G, U) gives the interviewee leeway to pick one aspect of their choice for a deep dive, besides covering the overall high-level design.
Side note: it is OK to say "I don't know the internals of X or Y" (say WebSockets, load balancers, or storage engines) and steer the interview toward what you know, i.e. play to your strengths. Not knowing a few things deeply is expected and doesn't cost you points, but answering vaguely or incorrectly will cost points and invite more follow-ups on the same topic, eventually leading to a botched interview.
In the USA, fixed addresses can be geohashed by ZIP+4 code; the post office has already done all the work. A more sophisticated analysis of verified address locations means that you aren't sharding for a lake or other uninhabited area, which saves tables and helps flag possible false orders and incorrect data.
Jordan is a funny dude and not ugly!
Hi Gaurav! Thank you for the amazing content you are providing. I have binge-watched your entire content over the past few months. On-point explanations... your teaching skills are really good. Just a suggestion, if I may: you should collaborate with people who can amplify your reach, like Aman Dhattarwal, or maybe launch some live courses on their platform. His reach is MASSIVE. It would also help you increase sales for your startup.
Idk if the suggestion I gave is even possible 😅 but I just felt like this precious content and your courses should be reaching more and more people!
Hahaha. The introduction is really funny
Somewhere after 30 mins, the discussion went astray... the first 30 mins were too good, but after that I felt the discussion deviated.
Thx for the content. At 15:06, why not a B-tree with Redis instead of a sorted set?
That's great content! One thing that notoriously slips by without anyone noticing is units: 5Mb is 5 megabits (lowercase "b"), not 5 megabytes (uppercase "B").
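For anyone redoing the estimates, the conversion is just a factor of 8. A tiny sketch (the function name is made up for illustration):

```python
BITS_PER_BYTE = 8


def megabits_to_megabytes(megabits):
    """Convert megabits (Mb) to megabytes (MB)."""
    return megabits / BITS_PER_BYTE


print(megabits_to_megabytes(5))  # 5 Mb is 0.625 MB
```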
Excellent explanation of geohashing, thanks guys!
Nice work! First example is nice - Sri Lanka to India Food delivery 😀 ha ha.. Anyways, I like your channel Gaurav.😊
Food delivery system is a complex architecture. What I see as an architecture is Client, Server and Delivery. You should have allowed Jordan to at least create some connections.
Very nice effort. It felt like Gaurav was missing some stuff on geo-sharding, and the off-topic questions were a bit confusing for the interviewee.
Would have been great to see details about the DB/tables design, and then going for the data storage estimations based on that. Is that something generally expected, or is the high level data storage estimation without getting into details about schema design, as done by Jordan here, also acceptable in an interview?
Brother, I just passed class 12, and I want to get into programming (ML/AI), but I have never been good at problem solving. How can I improve my problem solving from now on? Can you suggest ways, please?
jordan is genius
Hi Gaurav, for every video can you also please mention the prerequisite concepts? I felt a few of the things went over my head.
Learned a lot from this collab.. Thanks a lot Gaurav! 🔥
Why can't this video be found on Gaurav's main page under home->system design tag?
Thanks Gaurav for this great video. Just one question: since Redis is being used here as a geo-sharded database, how will concurrency be handled with Redis?
Hi Gaurav, this interview feature by Interview Ready looks nice. How did you guys build it? WebRTC? Have you written any tech deep dive on implementing it?
We wrote the code in Golang, and accept http requests only.
My next video is a deep dive on the architecture, stay tuned!
Very nice 👌
I appreciate 🙏 💛 your efforts
I would have used PostGIS definitely, store the data in geometry and use all the spatial functions, indexing and what not
I've worked at a company that solved a similar problem but at a smaller scale (it had 200k users and more than half weren't active). Instead of geohashing, they did it with MongoDB geospatial queries. Why is Redis geohashing better in this case? When do geospatial queries start to become a problem?
MongoDB supports both a 2d index, which uses a B-tree, and a 2dsphere index, which uses an approach similar to geohashing. I'm curious why I would have to implement that from the get-go when it's already implemented.
Why do we need to do range queries within the Redis instance in order to find dashers? If each Redis shard handles a range of geohashes, then can each shard just have a hashmap which maps a geohash -> [list of dasher ids]? You can use consistent hashing to locate the shard, then just do a hashmap lookup within that shard.
As long as, given a geohash, I can calculate the 8 surrounding box geohashes, this should be sufficient, right? I think most geohash libraries can calculate this quickly.
I think it really just depends on the size of the geoshard that you use. If they are big, then you'll have to do a range query to quickly find a nearby driver to a given geohash. If they are the perfect size, then you could just select any of the dashers located in that node.
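The hashmap idea in this thread can be sketched roughly as below. This is only an illustration under assumptions: a plain in-memory dict stands in for one Redis shard, the dasher ids are made up, and the helpers `to_grid`, `from_grid`, `cell_and_neighbors` and `nearby_dashers` are hypothetical names, not any library's API. Neighbours are found by splitting the geohash's interleaved bits into integer grid indices, stepping by one cell, and re-interleaving.

```python
from collections import defaultdict

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def to_grid(geohash):
    """Split the interleaved bits into integer (lat_idx, lon_idx) grid indices."""
    lat_idx = lon_idx = 0
    is_lon = True  # bits alternate, starting with longitude
    for ch in geohash:
        value = BASE32.index(ch)
        for shift in range(4, -1, -1):
            bit = (value >> shift) & 1
            if is_lon:
                lon_idx = (lon_idx << 1) | bit
            else:
                lat_idx = (lat_idx << 1) | bit
            is_lon = not is_lon
    return lat_idx, lon_idx


def from_grid(lat_idx, lon_idx, length):
    """Re-interleave grid indices into a geohash of the given length."""
    total_bits = 5 * length
    lon_bits = (total_bits + 1) // 2
    lat_bits = total_bits // 2
    chars, value, is_lon = [], 0, True
    for i in range(total_bits):
        if is_lon:
            lon_bits -= 1
            bit = (lon_idx >> lon_bits) & 1
        else:
            lat_bits -= 1
            bit = (lat_idx >> lat_bits) & 1
        value = (value << 1) | bit
        is_lon = not is_lon
        if i % 5 == 4:  # every 5 bits make one base32 character
            chars.append(BASE32[value])
            value = 0
    return "".join(chars)


def cell_and_neighbors(geohash):
    """Return the cell itself plus up to 8 surrounding cells of the same size."""
    lat_idx, lon_idx = to_grid(geohash)
    total_bits = 5 * len(geohash)
    lon_cells = 1 << ((total_bits + 1) // 2)
    lat_cells = 1 << (total_bits // 2)
    cells = []
    for dlat in (-1, 0, 1):
        for dlon in (-1, 0, 1):
            lat = lat_idx + dlat
            if not 0 <= lat < lat_cells:
                continue  # nothing beyond the poles
            lon = (lon_idx + dlon) % lon_cells  # wrap around the antimeridian
            cells.append(from_grid(lat, lon, len(geohash)))
    return cells


def nearby_dashers(index, geohash):
    """Plain hashmap lookups over the 3x3 block of cells, no range query."""
    found = []
    for cell in cell_and_neighbors(geohash):
        found.extend(index.get(cell, []))
    return found


# In-memory stand-in for one shard: geohash -> list of dasher ids (made up).
index = defaultdict(list)
index["s0000"].append("dasher-1")
index["ebpbp"].append("dasher-2")  # adjacent to s0000 despite no shared prefix
index["u4pru"].append("dasher-3")  # far away

print(nearby_dashers(index, "s0000"))
```

Note that the cells neighbouring `s0000` include `ebpbp`, `7zzzz` and `kpbpb`, so with consistent hashing on the cell key the nine lookups may land on different shards. Whether nine cells are enough also depends on the cell size, as the reply above points out.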
is it secure to have peer-to-peer connections between dashers and clients?
Do I need to know web development before studying system design?
Which tool has been used for drawing?
Nice one. Learnt a lot. Keep up the good work
Isn't MySQL harder to shard? I believe NoSQL is easier to shard, so I would choose it in that case.
It has to be sharded manually, yes.
Hi Gaurav,
This question is not related to doordash system design,
Please can you explain the difference between a message bus, a message queue, and a message broker?
A message broker can be a distributed system which provides a reliable and fault-tolerant way of delivering messages between multiple producers and consumers.
A message broker can handle a many-to-many mapping between producers and consumers.
Gaurav was talking about UDP and his voice went berserk. Talk about coincidence!!!
When the interviewer makes your life miserable 😂
FYI, lat/lng to geohash is simple arithmetic. It would be ridiculous to call a service to do that.
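For reference, the arithmetic is just repeated interval halving. A minimal sketch of the standard geohash encoding (the function name `geohash_encode` is only for illustration):

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat, lng, length=6):
    """Encode a lat/lng pair by halving the lng and lat intervals in turn."""
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    chars, value = [], 0
    use_lng = True  # bits alternate, starting with longitude
    for i in range(length * 5):
        if use_lng:
            mid = (lng_lo + lng_hi) / 2
            if lng >= mid:
                value = (value << 1) | 1
                lng_lo = mid
            else:
                value <<= 1
                lng_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        use_lng = not use_lng
        if i % 5 == 4:  # every 5 bits make one base32 character
            chars.append(BASE32[value])
            value = 0
    return "".join(chars)


print(geohash_encode(57.64911, 10.40744, 5))  # u4pru
```

Since it is a few dozen comparisons per point, it can run on the client or inside whichever service already handles the location update.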
Interview Ready UX could do with some improvement
Yes. Thanks for the feedback.
Any suggestions for improving the UX?
thanks for sharing it was awesome
The comment about geohashes being ordered and doing binary search is a bit flawed. Remember that geohash has this property:
'abc' is ALWAYS close to 'abd' (they share the prefix 'ab', so they sit in the same parent cell), but 'def', for example, might actually be close to 'abc' as well if the two cells lie on either side of a cell boundary. So a shared prefix is a sufficient condition for closeness but not a necessary one.
Read the System Design book by Alex Xu for more info on how this could happen.
th-cam.com/video/OcUKFIjhKu0/w-d-xo.html
Thank you! Looks like you've already covered the topic :)
Lots of good stuff here but should've at the very least *mentioned* storing the Restaurant's data, menus, building the order, etc.
Why can't you just have all the actors (drivers, customers, businesses) subscribe (websockets) to the system and ping their location (lat/long) in real-time?
Then when a customer / business needs a driver, they are getting the real-time picture of who's active in the system and where they are.
So at the end of the day you don't care where drivers / customers are in the world, you just care about their availability.
So you can have one table called Availability that you match everybody through.
now i know why swiggy was not able to find a delivery partner for my order.
great video
Thanks for the vid
The content was great, the tool needs to be improved.
I can see both of them struggling with the interface.
@Gaurav - As the interviewer in a real world scenario, the interviewer won't really be drawing as much as you were in this case.
Watched. 😀-
Poor performance, I would say, in terms of the technology picked and in terms of planning as well...
💕💕💕💕
I think I gained 30 IQ points just listening to this
Damn!
I read this as system design of Doordarshan 🤦♀️🤦♀️
Lol
Aye shoutout to J. Epstein