Corrections: In the geohash length to grid size mapping table at 15:30 and 20:01, the correct values for 7, 8, 9 and 10 are:
7 152.9m × 152.4m
8 38.2m × 19m
9 4.8m × 4.8m
10 1.2m × 59.5cm
No wonder, that is why I was thinking why the values are considered too small when in fact they are in the ballpark of 4, 5, 6.
Yeah, that was confusing I had to double check
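The corrected values for lengths 7 to 10 quoted in the correction can be sanity-checked with a little arithmetic. The sketch below is only an approximation: it assumes a spherical Earth with a mean radius of 6,371,008.8 m and measures cells at the equator, so results differ slightly from published tables that use other constants. The function name `geohash_cell_size_m` is made up for illustration.

```python
import math

# Assumption: spherical Earth with the mean radius; published tables may use
# slightly different constants, so expect small differences.
EARTH_RADIUS_M = 6_371_008.8


def geohash_cell_size_m(length: int) -> tuple[float, float]:
    """Approximate (width, height) in metres of a geohash cell at the equator."""
    total_bits = 5 * length            # each base32 character carries 5 bits
    lon_bits = (total_bits + 1) // 2   # longitude gets the extra bit when the total is odd
    lat_bits = total_bits // 2
    circumference = 2 * math.pi * EARTH_RADIUS_M
    width = circumference / 2 ** lon_bits           # 360 degrees of longitude
    height = (circumference / 2) / 2 ** lat_bits    # 180 degrees of latitude
    return width, height


if __name__ == "__main__":
    for n in range(7, 11):
        w, h = geohash_cell_size_m(n)
        print(f"length {n}: {w:.2f} m x {h:.2f} m")
```

The results land within about a metre of the corrected table, which is consistent with the units for lengths 7 to 10 being metres rather than kilometres.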
In the video Sahn says that a database in the terabyte range is on the borderline where sharding might make sense. He also says that our read qps of 5,000 is quite high. I was wondering how he came to these conclusions and if there are specific numbers he looks for to determine if a number is high enough to warrant a design change?
Even though it's been quite some time since I dealt with geoinformatics, I think you forgot or mixed up some points here. First of all: how do we retrieve the data and how is it stored? When it comes to OSM you can get the data in a lot of different ways, but you always end up using quadtree/k-nearest/brute force. That leads us to the next question: are the search algorithms client-side? And if not, why don't we replace those search algorithms altogether by storing the data in a different way? Third: what kind of transformation are we using here? Most likely datums, I guess, but using the one with the letters in the rows would make the answer go in a completely different direction than equatorial and easting, or datums without eastings.
Hey YouTube algorithm, if you’re reading this, I just want to say, this is the type of video you should be recommending to software people. K thanks
It worked
Best design video ever. I am so happy that Alex decided to make videos.
Alex sir *
@@ratanlambha2602 😒
Amen
I dont think that this guys name is Alex tho
this video is a FLOW. Could not stop watching... Beautiful animation and narration. Perfect
So much effort spent in how to make this video so informative, well structured, precisely explained and amazingly illustrated. Thank you for sharing this with us!
Holy crap... finally a guy that actually knows what he's talking about. I've been doing this for 20 + years and this is how it's done. Kids you don't need to make it overly complex, just build it so it can scale not at scale.
Hey, Alex. I learn x100 times more from this video than from my past year in IT. Totally awesome content!
he is sahn lam
Notes:
- The SQL query at 20:27 would not work for the same reasons you mentioned previously in the video (prime meridian + equator).
- You assume that the long/lat is the centre of the business/place, if users are able to add their own businesses this will probably not be the case
- Businesses/places can span multiple grids (think of shopping malls)
- Businesses/places can be bigger than a grid (think of airports)
- At the very end you say it would "use the long/lat to rank the businesses and return to the client" (sorting). You also need to check that the business is still within the radius the user specified (filter); just because the business is in the same grid or a neighbouring grid doesn't mean it's within the radius they asked for.
The first thought might be to make the geohash a list (of all the grids that it covers), but how do you calculate that? You would need a polygon (a long/lat for each vertex) that covers the (rough?) area of the business/place, then you get onto the trouble of validating that :)
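The filter-then-rank step from the last bullet can be sketched as follows. This is a minimal illustration, not the video's implementation: it assumes each candidate is a single point given as a `(business_id, lat, lon)` tuple (so it does not address the multi-grid and polygon concerns), uses the haversine formula on a spherical Earth, and the names `haversine_m` and `filter_and_rank` are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # assumption: mean radius of a spherical Earth


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def filter_and_rank(user_lat, user_lon, candidates, radius_m):
    """Drop candidates outside the radius, then sort the rest by distance.

    `candidates` is an iterable of (business_id, lat, lon) tuples, for example
    the rows fetched from the user's geohash cell and its neighbours.
    """
    within = []
    for business_id, lat, lon in candidates:
        distance = haversine_m(user_lat, user_lon, lat, lon)
        if distance <= radius_m:          # the filter step
            within.append((distance, business_id))
    within.sort()                         # the ranking step
    return [business_id for _, business_id in within]
```

For example, with the user at `(0.0, 0.0)` and a 500 m radius, a candidate about 1.1 km away that happens to share the cell is dropped, and the remaining ones come back nearest first.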
This is by far the BEST VIDEO on questions like "Design Yelp". Pure Gold !! Thanks, Alex, and ByteByteGo team for your outstanding work.
Wow, I'm an enterprise software engineer and from my work experience and knowledge, this is one of the best channel that digs into the ways of working and architecturing an enterprise software. Kudos 👏
This video should be taught at universities and bootcamps. Not only for the interesting topic but also to learn how to think about this kind of problem and system design in general. Thank you!
I have been following a number of channels related to system design in last few months. While many of them are brilliant, Sahn has a very unique and effective way of communicating complex technical topics. Thanks Sahn and team for this content. Hope to see more of these.
Dude, your content is gold! I've been coding almost 20 years and prepared for interviews countless times. This has been one of the best content so far!
Thank you for the encouragement. This is the first chapter-length video we made, and it was a lot of work. Your feedback is much appreciated.
@@ByteByteGo wonderful job.
We need more full system design videos like this one from you!
By far the best system design channel ever. Crisp presentation and fluid animations to easily showcase complex topics in a simple manner.
This has to be the highest quality system design explainer video on YouTube. * take a bow *
he explains like a teacher you would find in a school that everyone loves
I got this question on my Meta on-site interview. Absolutely bombed it.
Thank you so much for starting to make these videos. Your step by step approach is so clear and crisp. Best System Design material ever!!!
I bought Vol 2 but I find video format slightly easier to digest. I really appreciate you making these videos.
This is hands down the best design video I’ve ever seen
Please continue making videos. The speaker did an amazing job, clear with a nice tone. The visual presentation is also nicely executed. 10/10, subscribed.
I had exactly this problem on my system design interview @ faang, failed it miserably)) Excellent quality of material here, will help a lot in future!
In early 2000s we designed such a system for a popular real estate site very similar. We thought we had nailed the algorithm, but started getting complaints from agents and consumers.
It turns out in that domain “distance” is almost always driving distance which is an entirely different problem and was much more complex in the days before quick road route planning 😀
Damn where do you work now?
@@kumarsamaksha7207 Nothing real estate related since 2008!
@@user-yr1uq1qe6y Cool
I think then you can increase the radius by a percentage, run a second query for driving distance on that reduced set, and re-rank.
literally the traveling salesman problem :p
i hope u make 10 videos everyday so i can learn forever about system design. Thanks so much
This is such an underrated channel - very good visuals, concise speech pattern, and extremely well thought out approach to each topic
Beautiful poetry just opens up my eye. Overwhelmingly grateful. 🙏 and ❤from Chennai 🇮🇳
This was super fun to watch and got me interested again in algorithms and systems in general.
I decided to go ahead and buy the two books straight ahead. So having a youtube channel definitely helps getting the word out :))
Please continue to make such wonderful videos. It’s pure gold, glad I discovered the channel.Thank you
If we can have some sort of video course also like the book you published, I would buy the course right away. The content and the explanation you deliver is really simple to understand and that's the beauty of a good Teacher. "Explain me like I am 5th grade" This really goes for you!
I've watched like most of the SD videos from YT, i could tell you guys this one is the best. Thanks Alex
I really like this video, the presenter is very friendly and calm. Please extend this, I would like to learn more.
You got +1 sub from Tanzania 🇹🇿
Greetings from Tanzania 🇹🇿
Your approach is fascinating. It kept me watching the whole video and I find this very rare. Thank you.
I am currently going through the System Design Course offered by Design Gurus on Educative which covers similar concepts but in a text based format. I have heard a lot about ByteByteGo's courses and books as well and I'm glad that I looked up this video. Thanks Alex and the team!
As humble as he is, his videos are super awesome too. Go alex..!!
wow - what a wonderful video. So well thought out - clearly lays down the ideas using very easy to understand visuals.Thank you so much Alex.
A clear and in-depth explanation of the proximity service design. Specifically, I liked the detailed ways to index the geospatial database.
I'm really lucky to see this video released. It's been about a year since I work in the area of location based service. There is not much of related information that is up to date.
Thanks a lot for this video! Keep on!
Are you saying the video information is out of date, or up to date?
@@ChrisCox-wv7oo it's up to date
I worked on a weather alert system that used S2, you more or less described its design and all the rationale for our decisions (we indexed by S2 cell id). Excellent video.
And for the record, because of this I bought your vol 2 of sys design
This channel gonna blowup. The graphic designs are the game changer
I just started reading volume 2 of System Design Interview and I'm really enjoying the content so far. Even more so when I realized you created videos to further solidify the readings! Keep them coming!
I have tried different videos but this one definitely stands out for understanding location based system design.
Functional vs non-functional
Functional: start with the user personas and what they can do with the app. This determines the API design.
Non-functional: think in terms of latency, throughput, storage. This determines the architecture, the data storage and retrieval implementations.
You sir are a natural for teaching, please dont stop, you're doing world a favor! Cheers !
Hi Sam, you mention using (geohash, business id) as a compound key so we can remove a business from the table efficiently. I don't think that's the right thing to do. Instead of a compound key, we should just add another index on the business id. Removing a business would then only need to check that index. With a compound key we'd need to calculate the geohash for that business, walk the geohash index first, then look for the business id.
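The two options in this comment are not mutually exclusive, and a small experiment makes the trade-off easier to see. The sketch below uses an in-memory SQLite database as a stand-in; the table name `geo_index`, the index name, and the helper functions are all hypothetical, and a production database may plan these queries differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE geo_index (
        geohash     TEXT    NOT NULL,
        business_id INTEGER NOT NULL,
        PRIMARY KEY (geohash, business_id)   -- compound key: one row per cell/business pair
    );
    -- Secondary index: lets a delete find rows by business_id alone.
    CREATE INDEX idx_geo_business ON geo_index (business_id);
    """
)
conn.executemany(
    "INSERT INTO geo_index (geohash, business_id) VALUES (?, ?)",
    [("9q8zn", 1), ("9q8zn", 2), ("9q8yy", 3)],
)


def businesses_in_cell(conn, prefix):
    """Return business ids whose geohash starts with `prefix`."""
    rows = conn.execute(
        "SELECT business_id FROM geo_index WHERE geohash LIKE ? ORDER BY business_id",
        (prefix + "%",),
    )
    return [row[0] for row in rows]


def remove_business(conn, business_id):
    """Delete every row for a business without needing its geohash."""
    cur = conn.execute("DELETE FROM geo_index WHERE business_id = ?", (business_id,))
    return cur.rowcount
```

With only the compound key, a delete has to supply the geohash (or scan); with the extra index it can go by `business_id` alone, at the cost of maintaining a second index on every write.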
This is one of the best technology videos I've ever watched! You explained everything very well and with great, informative graphics. Only small suggestion - maybe use a mono spaced font for the queries etc. to improve readability. Thanks for this!
In this particular example the load balancer (8:23) is not presented well. The idea of a load balancer is to distribute the same kind of request across instances of the same service, so it would be better to say that there is an API gateway, plus a load balancer in front of each service's instances!!
Would definitely love to get more full systems design interview videos like this from you. Great video!
As usual amazing video with great details. Thank you Alex. Wish to see more videos of length 20+ mins.
Really good content, as always. You can also add Kafka or some other streaming service to the business service, so that writes send events to the streaming system. You can then connect that streaming system to the write database as a sink, which will allow you to distribute the write load evenly throughout the day and handle peak loads without consuming extra resources. Streaming systems like Kafka are so heavily optimized for such use cases that even a relatively small cluster of 3 nodes (4 GB RAM, 2 CPUs, 100 GB disk each) should easily be able to handle these loads and have a lot of headroom if you use z-compression on the topic. As an added benefit, you can perform change management on the database transparently to the user, because the streaming system will buffer all write operations automatically while the sink is offline. Streaming systems generally pair quite well with a heavy-read/low-write system.
Do we really need to add streaming services? I mean, as we know that write traffic is really very low and also we can compromise with the consistency(will be eventually consistent), so wouldn't Kafka or any streaming service be inefficient to use?
@@aniketshukla9568 Not with that kind of scale, no.
As he mentioned in the video, the number of write operations is very low hence no need for Kafka.
Amazing video. Only thing I'd add is timestamps on the video. But the content itself is priceless
This is an awesome video. I liked how you were able to break down complex concepts so that even beginners could understand them at a high level. I was able to learn a lot about the complete picture of system design in under half an hour.
15:37 for 7-10, units would be in metre not km
This content is unbelievable. Will be checking out your books and newsletters!
This is gold! Toss in a Geospatial DB on your interview and it’s sure to impress them
One of the best technical video i have ever watched
This is the best system design video I've seen so far. Great job mister
Loved it, well made and highly educational. Looking forward for more of those.
I have your System Design Interview Volume I book... Never knew you were the author :) I learned CAP theorem, eventual consistency, and many others from it... I wasn't aware that there is volume 2.. will be adding it to my cart... Great explanation once again!
Currently the search API will need to query a few hundred businesses after the geoindex read, so the business table will see 10x or more the query load of the geoindex table.
This video is so good, i need to watch it a few more times, there is a lot of valuable information here
Really great explanation. The thoughtful visuals helped me to understand much better.
Another great video with clear explanation and informative graphs for software engineers
you opened my mind about the system design. unbelievable
Alex. I am your great admirer. I would humbly request you to make more System design videos.
That geohash part was superb... Damn, so many things I didn't know.
2 minutes in the video and liked it already 👍. What an amazing way of explaining stuff, simple yet effective slides. Thanks man appreciate your efforts.
Excellent content. I liked your structured way of thinking
excellent video ... I have one basic question, at 19:45, why do you use a compound key combining geohash and business id? if the business id is a unique key, then you are just concatenating some extra junk onto something which is already unique. or did I miss something, why does it make removal more efficient? good stuff!
Hi Alex, I think you need to correct the definition, at 14:00 onwards you are talking about quadtree, not Geohash, Geohash divides the grid initially into 32 smaller grids since the base here is 32. You may like to correct it.
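To make the base-32 point concrete, here is a minimal geohash encoder sketch. It follows the commonly documented scheme (alternating longitude and latitude bisections, starting with longitude, with five bits packed into each base32 character); the function name `geohash_encode` is just for illustration. Each bisection pair looks like a quadrant split, but since one character covers five bisections, each added character selects one of 32 sub-cells.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a (lat, lon) pair in degrees as a geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0          # accumulates the 5 bits of the current character
    bit_count = 0
    use_lon = True    # bisections alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:            # 5 bits -> one base32 character (32 sub-cells)
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)
```

A longer hash always extends the shorter one, which is what makes prefix queries on the geohash column work.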
I would index the city column. This would prevent table scans. You can then further filter using geo data.
Mindblowing content and way of presenting it! Thank you so much! Keep it up!
Thanks for this video! Cleared most of my problems in my upcoming project!! 😭😭
Another eye-opening video. Thank you very much for these high quality, insightful videos. Great work!!👏
Fantastic system design content. Really appreciate the clear explanation of logic used to estimate the system requirements and then determine the approach to the design.
Literally was designing a location based service and wanted advice, its criminal that you don't have more subs and views.
Absolutely love this video! 15 year old me craved this information but got it after 8 years
Absolutely love how succinct and on point the explanation is.
Great video!
Question please:
Usually these services will have filters like business type, pricing etc.
Which table would you add it to?
Nice video!! A lot of good information in just 24 mins, very nice... Thank you.
The design elements are at another level! Great work! Keep it up!
You are the best person to explain that content. Thanks to share it with us!
Fantastic explanation. Great job. Waiting to see more system design videos from you. Good luck
Great video with a good explanation at a high level!
Thanks!
This is amazing, what a gem of a youtube channel !
Excellent video! The core algorithm is from 11:56
better than anything that I have found for this topic! Thanks!
Awesome! The best system design video. Keep doing your great work because it helps people a lot.
Very nice video. I really like the information flow along with the visual presentation.
A small correction though: the query at 11:20 does not give you all businesses in the circle, but in a square (it's the max-norm (∞-norm) and not the Euclidean 2-norm).
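The difference between the square and the circle is easy to see on a flat plane. The sketch below is a simplified illustration that ignores Earth curvature and treats offsets as plain x/y distances; the function names are hypothetical.

```python
import math


def in_box(dx: float, dy: float, r: float) -> bool:
    """Range test on each axis separately, like a BETWEEN on latitude and longitude."""
    return abs(dx) <= r and abs(dy) <= r


def in_circle(dx: float, dy: float, r: float) -> bool:
    """Euclidean test: the straight-line distance must not exceed r."""
    return math.hypot(dx, dy) <= r


# A point near the corner of the box passes the box test but is too far away.
corner = (0.9, 0.9)
print(in_box(*corner, 1.0), in_circle(*corner, 1.0))
```

Since the circle covers π/4 of the square's area, roughly a fifth of the box lies outside the requested radius, so a box query needs a distance filter afterwards.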
This is by far the greatest video, thanks a lot for putting this together ❤
This is neat and effective. I only wish i saw this 3 years back when i was asked same in interview🙈😇
Great explanation with concise and well mapped out diagrams. Amazing!
So glad I discovered your channel. These are unparalleled videos on system design. Amazing work.
Amazing!
Beautiful!
Super high quality content!
So eager for the upcoming videos!
Bought your book on system design. Great experience, thanks!
Great video where everything was clearly explained. Thank you!
This is a god tier video. Please keep them coming.
Amazing, the style of the video makes it very easy to grasp the concept!
business info does not change very often so it would be a really good case for caching.
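As a rough illustration of that idea, here is a minimal read-through cache sketch with a time-to-live. Everything in it is an assumption for illustration: the `TTLCache` class, the `loader` callback standing in for a business-table lookup, and the choice of TTL. A real deployment would more likely use a shared cache service than an in-process dictionary.

```python
import time


class TTLCache:
    """Tiny in-process read-through cache with per-entry expiry."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._store = {}             # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value for `key`, calling `loader(key)` on a miss or expiry."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # fresh hit: no database read
        value = loader(key)          # miss or stale: read from the source
        self._store[key] = (now + self._ttl, value)
        return value
```

Because business details change rarely, a fairly long TTL trades a bounded amount of staleness for far fewer reads against the business table.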
Amazing video, thank you for sharing knowledge in such well presented way.