00:00 Intro
00:41 Consistent Hashing
01:47 Geohash
02:02 Quadtree
02:33 Leaky Bucket
03:05 Token Bucket
03:19 Trie
04:30 Bloom Filter
05:18 Consensus algorithms
Thanks for the video!
BTW, while the video tells you to focus on high-level algorithmic concepts, I'd say don't underestimate the power of diving into the code. Those little details can make or break a system. The video praises virtual nodes in consistent hashing, but you should know that adding these can complicate things, and they are not always needed. Consistent hashing is great for Big Data, but it won't solve every problem, like hot-spotting, where some nodes get overloaded.
The Leaky Bucket algorithm sounds pretty cool on paper, but have you considered request starvation or the need for a separate queuing system?
Tries are efficient for string searches, but they can eat up memory like there's no tomorrow. And let's not forget Bloom filters; they're quick for caching and analytics, but what about false positives? The video mentions Kafka and SCD, but trust me, implementing these algorithms in a real-world, large-scale system is a whole different ball game, yep, dude!
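To make the virtual-node point concrete, here's a minimal consistent-hash ring in Python (a sketch; the node names and vnode count are made up for illustration). Every physical node is hashed onto the ring many times, which smooths the distribution but is also the extra bookkeeping that can complicate things:

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # Any stable hash works; md5 is used here only for illustration.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, nodes, vnodes=100):
        # Each physical node appears vnodes times on the ring; more
        # virtual nodes -> smoother key spread, but a bigger ring to manage.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._ring]

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first virtual node at or past the key's hash.
        idx = bisect.bisect(self._hashes, _hash(key)) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
print(ring.node_for("user:42"))  # always maps to the same node
```

Note that even with virtual nodes, a single hot key still lands on one node, which is exactly the hot-spotting problem mentioned above.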
What alternatives do you suggest to address the challenges posed by the above options?
@@kedikeba sure
### Consistent Hashing
1. HRW Hashing: HRW (Highest Random Weight, also known as Rendezvous) hashing is stateless and provides an even distribution when you lose or gain nodes. It's easier for you to explain and understand compared to Consistent Hashing (see the sketch below).
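A minimal HRW sketch in Python (node names are hypothetical), just to show how little state it needs; every key independently picks the node with the highest score:

```python
import hashlib

def _weight(node: str, key: str) -> int:
    # Combined hash of (node, key); any stable hash function works.
    return int(hashlib.md5(f"{node}:{key}".encode()).hexdigest(), 16)

def hrw_owner(nodes, key: str) -> str:
    # The owner is the node with the highest random weight for this key.
    # If a node disappears, only the keys it owned get remapped.
    return max(nodes, key=lambda node: _weight(node, key))

print(hrw_owner(["node-a", "node-b", "node-c"], "user:42"))
```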
### Leaky Bucket Algorithm
1. Token Bucket Algorithm: Unlike the Leaky Bucket you might use, the Token Bucket fills up with tokens at a steady rate. You can only send data packets if a token is available, allowing you to handle bursty data better (see the sketch after this list).
2. Adaptive Leaky Bucket: This is a dynamic version that adjusts the leak rate based on your network conditions, which could be more efficient for your needs.
3. Priority Queuing: If you use a priority queue alongside the Leaky Bucket, you can manage different types of traffic more efficiently. High-priority packets would get processed faster.
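For item 1, here's a bare-bones token bucket in Python (the rate and capacity values are illustrative). The capacity cap is what permits bounded bursts, unlike a fixed-drain leaky bucket:

```python
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1  # spend one token per request
            return True
        return False          # out of tokens: reject or queue the request

bucket = TokenBucket(rate=5, capacity=10)  # 5 req/s steady, bursts up to 10
print(bucket.allow())  # True while tokens remain
```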
### Tries
1. Hash Tables: If you're dealing with languages that have large alphabets, hash tables could be more memory-efficient (see the sketch after this list).
2. Radix Trees: Also known as compressed tries, these trees store multiple characters in a single node. This could save you some memory.
3. Variable-Length Arrays in Nodes: You could use variable-length arrays in each node, sorted by frequency, to save space. This is particularly useful if your trie is sparse.
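To make the memory trade-offs concrete, here's a toy trie with hash-map children in Python (a sketch, not a tuned implementation); sparse nodes only pay for the edges they actually have, instead of a full fixed-size child array:

```python
class TrieNode:
    __slots__ = ("children", "terminal")  # keeps per-node overhead small

    def __init__(self):
        self.children = {}    # dict children: only existing edges cost memory
        self.terminal = False # True if a stored string ends here

def insert(root: TrieNode, word: str) -> None:
    node = root
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
    node.terminal = True

def contains(root: TrieNode, word: str) -> bool:
    node = root
    for ch in word:
        node = node.children.get(ch)
        if node is None:
            return False
    return node.terminal

root = TrieNode()
insert(root, "cat")
insert(root, "car")
print(contains(root, "cat"), contains(root, "cab"))  # True False
```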
### Bloom Filters
1. Cuckoo Filters: These filters allow you to remove items dynamically and offer better space efficiency.
2. Quotient Filters: These are more memory-efficient and allow you to perform more advanced set operations like union and intersection.
3. Counting Filters: If you need to remove elements, Counting Filters use counters instead of bits. This could offer you more flexibility (see the sketch after this list).
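For item 3, a toy counting filter in Python (the size and hash count are arbitrary); swapping bits for counters is what makes removal possible, at the cost of extra memory:

```python
import hashlib

class CountingFilter:
    def __init__(self, size: int = 1024, hashes: int = 3):
        self.size = size
        self.hashes = hashes
        self.counters = [0] * size

    def _positions(self, item: str):
        # Derive several independent positions from salted hashes.
        for i in range(self.hashes):
            digest = hashlib.md5(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.counters[pos] += 1

    def remove(self, item: str) -> None:
        # Only safe for items that were actually added.
        for pos in self._positions(item):
            if self.counters[pos] > 0:
                self.counters[pos] -= 1

    def might_contain(self, item: str) -> bool:
        # False positives remain possible, as with a plain Bloom filter.
        return all(self.counters[pos] > 0 for pos in self._positions(item))

cf = CountingFilter()
cf.add("alice")
print(cf.might_contain("alice"))  # True
cf.remove("alice")
print(cf.might_contain("alice"))  # False (counters back to zero)
```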
I agree; for system design, the SOLID principles (separation of concerns), interfaces/dependency injection, and unit testability are far more important than specific algorithms. These are, up to a point, implementation details and can always be replaced/fine-tuned to the data at hand (correctness first). I am not saying they are not important, but they should come second.
How will a trie eat up memory like no tomorrow? It's never going to take more space than just storing the entire set of strings by itself
@@JoseCarlosVM yup, but you have to look at the total memory footprint, including the additional overhead of the trie structure: nodes and links to child nodes, not just the strings stored themselves. Just storing 1 million unique strings of average length (say 10 characters) would occupy approximately 10 MB on their own. Regardless of whether these 1 million strings have overlapping prefixes or not, the memory overhead for the trie structure alone would be around 2 GB. Even with a 50% overlap due to common prefixes, the overhead still stands at more than 1 GB. Additionally, each node could have an array or hash map to represent transitions to child nodes, which could be mostly empty but would still take up space.
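A quick back-of-envelope version of that estimate in Python (the per-node byte counts are assumptions for a pointer-based trie with 26-slot child arrays):

```python
strings = 1_000_000
avg_len = 10                          # assumed average string length
string_bytes = strings * avg_len      # raw string data

node_bytes = 26 * 8 + 16              # child-pointer array + flags (assumed)
nodes_worst = strings * avg_len       # one node per character, no sharing

print(string_bytes / 1e6, "MB raw strings")        # 10.0 MB
print(nodes_worst * node_bytes / 1e9, "GB trie")   # ~2.2 GB overhead
print(nodes_worst * 0.5 * node_bytes / 1e9, "GB")  # ~1.1 GB at 50% sharing
```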
I found this channel by chance. Very good. It's gold
I found this video hard to follow compared to the rest. Too much packed in it. I feel suddenly incredibly dumb lol
I feel exactly the same
Me too
me too
me too
Yeah, I’m gonna go looking for overly-simplistic examples of these in action, in a language I can follow…like JavaScript. I need to see implementations to really internalize the concept.
Thank you so much, it is hard to follow - but it is not because it’s not well structured. It’s several very difficult concepts all in one video. I will watch this several times. Thank you so much.
Perhaps if all these algos were covered one by one, in separate detailed videos, that would be appreciated.
They have covered them here, beginning with consistent hashing: th-cam.com/video/UF9Iqmg94tk/w-d-xo.htmlsi=8rpEiCTd3s3_V1fS
I'd appreciate anything this man is offering you, to be honest.
I would love to see videos like this covering R-trees and KD-trees. 👨💻
Another worthy algorithm is Hash Array Mapped Trie. It is a cornerstone of all modern functional programming languages.
Those algorithms needed their own videos, with pseudo-code and scenario-based examples.
I don't know any of them; that means there is a lot to learn. Thanks for the video.
Love BIG brain.
There is a lot to learn here.
Thank you, I have to subscribe.
amazing video! gave me some homework to do which is awesome
You need a course in system design, and I would buy it)))
Good stuff 👍 Thank you 💜
Claims: Algorithms
Reality: Muddy conflation of Data Structures and Algorithms, most of which are special cases of the more important generic ones
My man - the entire premise of building applications is picking special cases. This isn’t for your gas station website, these are concepts beyond the generics. Keep the hate out of the comments, I’m sorry you don’t understand the purpose of the video and you feel inferior to others. Poor guy.
@@bntheyoutube The entire premise of teaching is not going into special cases. The entire purpose of teaching a single thing is not mixing it with other stuff. Btw, if there is anybody spreading hateful stuff in the comments, it is your attitude. Don't project your inferiority complex on me, dude.
Yawn
@@lucemiserlohn wtf? You seem so dumb
What was this video made with?
A video about multi-region systems that abide by data localization laws would be nice.
Have you guys made a video about things to know before system design?
I'm still developing (C/C++) a generalization of kd-trees and quad-, oct-, …-trees, special trees. An all-in-one structure.
Thank you...
Can you explain the difference between leaky bucket and token bucket? They seem exactly the same to me... unless the token bucket can hold an unlimited number of tokens...
What tool/editor do you use for animation, can you please share?
After Effects.
So in a system design interview, I'm expected to know and utilize these algorithms? Honestly, it seems like if I start talking about some of these, my interviewer will not be familiar with them and will just be confused.
Depends on the company. If you're gonna be working on managed systems offered by Google, AWS, Azure, then yes, you need to know these.
They will know. Trust me
@@morenoh149 can you expand on why you think they are important if you use managed systems?
I honestly think that knowing them superficially means nothing at all. You can read how much you want but the proof's in the pudding.
@@MarcoLenzo it's about being capable of going beyond the framework you are given. Just as we expect mid-level+ Java candidates to know low-level Java and how Spring works behind the neat annotations, we expect seniors and architects to know what kinds of load balancing algos there are and their pros/cons, the different kinds of rate limiting, etc. If an engineer is unaware of these, they will take incredibly more time to design a performant and reliable system, because they are thinking only at a higher level of abstraction. Tbh, if a person is only developing monoliths with a load of no more than 100 RPM, then these algos are likely useless for them.
Pure gold!!!
How do you edit your videos?
What is the name of the app?
Where is the Merkle tree? It is very important in distributed systems.
I think people are getting confused because they think this is advice for programming interviews. It's really not.
As the title says, it's System Design, which is relevant in 0.01% of coding interviews (and that's being generous).
If you're on track to be designing, say, systems that run server balancing, then you'll (hopefully) know these already (if not, you'll need more than a basic YT summary!). But if you're, say, a web dev, games dev, making IoT things, doing data analytics, most AI, etc. etc., then the majority of this is unlikely to be relevant (you'll use these most days, but it won't be you determining the design).
I've programmed quadtrees for games; now game engines (e.g. Unreal, Unity) do that for you.
I've programmed quadtrees for fire simulation software; now libraries do that for me.
I've developed consistent hashing for server balancing; now AWS, Azure, etc. do this for you.
Yes, system design concepts are great, and understanding them will only make you a better programmer; but only a small minority will need to know them intimately.
Exactly! While I was watching this video I felt like maybe in the 80s or 90s you'd need to know all this as a programmer, and you'd have read all of Knuth's books, but nowadays, libraries and existing systems do that for you.
It is true, though, that understanding these will help you make the right decisions. Just like understanding a Von Neumann architecture helps you write more efficient code.
@@Misteribel well said.
Difficult to follow. Didn't catch the algo names used in the video. Also, the presentation is blurry.
For those interested, each algorithm has been covered in detail as well th-cam.com/video/UF9Iqmg94tk/w-d-xo.htmlsi=8rpEiCTd3s3_V1fS
4:20 For C and C++ only?
Nice, now I feel completely dumb since everything was new to me.
@ByteByteGo pls work on the audio of the video!
Great, thanks
I wish they charged a fee based on the time the game runs.
You have listed Geohash but not explained it.
Quadtrees
Is it just me? I've never needed any of these algorithms in interviews.
sick
This script feels like it was written by chatgpt
In today's oversaturated job market for software developers, this good stuff is only the very basic 101 on algorithms and data structures, because there are already millions who know all this. My bet is that you will need much more to stand out from hundreds of applicants for normal positions, or thousands for the big ones.
Either you don't have SWE experience or you are a Googler, right? Because most devs don't even know these; it's just f*cking anecdotes, and generally not very good advice regarding what you should actually learn. People should start with understanding heaps, for example, not bullshit "leaky bucket" (you can derive it in 5 minutes yourself).
@@heyman620 the game has changed. LeetCode-style interviews are now standard, even in small companies. With five-stage interviews becoming the norm, you have to be quick and flawless in the behavioral BS as well, just to stand a chance. What's in the video would surely have given applicants an edge some years ago, but now it's plain technical crap when you're up against hundreds of candidates, most of whom are thoroughly LeetCode-prepared.
@TheJacrespo I understand it man, just grind LeetCode - it's hard for all of us. These algos are mostly not the basics, you need to know common stuff very well. I generally think this video is bad advice. By the way, it also sucked 7+ years ago.
@@TheJacrespo I think heyman620 needs to work on their communication... but... I'm also confident they're right.
Boom Blackbox algorithm
Also, in practice, how consistent are these algorithms?
this is hard
Algorithms are not core to system design; interfaces and decomposition are.
I don't think this goes deep enough to be useful.
You speak very fast.
@ByteByteGo